Understanding Agents

Learn how LLMs and agent frameworks work together to create autonomous coding agents - from the execution loop to context engineering.

In Lesson 1, we established that LLMs are brains (token prediction engines) and agent frameworks are bodies (execution layers). Now let’s understand how these components work together to create autonomous coding agents that can complete complex tasks.

The Agent Execution Loop

An agent isn’t just an LLM responding to prompts. It’s a feedback loop that combines reasoning with action, allowing the LLM to iteratively work toward a goal.

Basic Loop: Perceive → Reason → Act → Observe → Verify → Iterate

The agent execution follows this pattern:

  1. Perceive: Read context - files, state, history
  2. Reason: LLM generates plan and next action
  3. Act: Execute tool - Read, Edit, Bash, etc.
  4. Observe: Process tool output
  5. Verify: Goal achieved?
  6. Iterate: If not, loop back to Perceive
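
In code, the whole loop fits in a dozen lines. Here is a minimal TypeScript sketch - callLLM, executeTool, and the message shape are hypothetical stand-ins, not any real framework's API:

// Hypothetical stand-ins; a real agent framework supplies these.
type Message = { role: "system" | "user" | "assistant" | "tool"; content: string };
type ToolCall = { name: string; args: unknown };
declare function callLLM(ctx: Message[]): Promise<{ text: string; toolCall?: ToolCall }>;
declare function executeTool(call: ToolCall): Promise<string>;

async function runAgent(task: string): Promise<void> {
  const context: Message[] = [
    { role: "system", content: "You are a coding agent with Read/Edit/Bash tools." },
    { role: "user", content: task },                   // Perceive: initial context
  ];
  while (true) {
    const reply = await callLLM(context);              // Reason: plan the next action
    context.push({ role: "assistant", content: reply.text });
    if (!reply.toolCall) break;                        // Verify: no tool requested => done
    const result = await executeTool(reply.toolCall);  // Act: run the tool
    context.push({ role: "tool", content: result });   // Observe: output becomes context text
  }                                                    // Iterate: loop with enriched context
}

Note that every step appends text to context - that single array is the agent's entire world, which the next section makes concrete.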

Key distinction: A chat interface requires you to manually execute actions between prompts. An agent autonomously loops through this cycle.

Example: Implementing a Feature

Chat interface workflow:

  1. You: “How should I add GDPR consent tracking to this intake form?”
  2. LLM: “Here’s the code…”
  3. You manually edit files
  4. You: “I got this error…”
  5. LLM: “Try this fix…”
  6. You manually edit again

Agent workflow:

  1. You: “Add GDPR consent tracking to this intake form”
  2. Agent: [Perceive] Reads form files → [Reason] Plans implementation → [Act] Edits files → [Observe] Runs tests → [Verify] Tests fail → [Reason] Analyzes error → [Act] Fixes code → [Observe] Runs tests → [Verify] Tests pass → Done

The agent closes the loop automatically, executing the full cycle without requiring manual intervention at each step.

Under the Hood: It’s All Just Text

Here’s the fundamental truth that demystifies AI coding agents: everything is just text flowing through a context window.

No magic, no separate reasoning engine, no hidden state. When you interact with an agent, you’re watching a conversation unfold in a single, large text buffer.

The Textual Flow

Every agent interaction follows this pattern:

  1. System Prompt (sets up agent behavior, defines tools)
  2. User Prompt (“Add input validation to client intake form”)
  3. Agent decides to use a tool
  4. Tool Result added to context as text
  5. Agent reasons and decides next action
  6. Repeat until task complete

Reasoning Is Just Text

The agent doesn’t “think” separately from its output. When you see the agent reasoning (“I should check the validation logic…”), that’s not internal thought - it’s text being generated in the context, visible to both you and the LLM itself.

Concrete Example: What the Context Actually Looks Like

When you ask an agent to “Add KvK number validation to the client intake endpoint,” the context window contains a single, continuous text stream:

SYSTEM: You are an AI coding assistant. You have access to these tools:
- Read(file_path): Read a file
- Edit(file_path, old_string, new_string): Edit a file
- Bash(command): Run a shell command
- Grep(pattern, path): Search for pattern in files

USER: Add KvK number validation to the client intake endpoint

ASSISTANT: I'll search for the intake endpoint first.
<tool_use>
  <tool_name>Grep</tool_name>
  <parameters>
    <pattern>intake</pattern>
  </parameters>
</tool_use>

TOOL_RESULT: Found in:
- src/routes/intake.ts:12: router.post('/client', intakeHandler.createClient)

ASSISTANT: Found the endpoint. Let me read the handler.
<tool_use>
  <tool_name>Read</tool_name>
  <parameters>
    <file_path>src/handlers/intake.ts</file_path>
  </parameters>
</tool_use>

[... continues ...]

This entire conversation exists as one continuous text stream in the LLM’s context window.
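
The API you call may present this as a structured message list, but before the model sees it, everything is flattened into a single token sequence. A minimal sketch, reusing the hypothetical Message type from the loop sketch above:

// What the model actually consumes: one continuous string.
function flatten(context: Message[]): string {
  return context.map((m) => `${m.role.toUpperCase()}: ${m.content}`).join("\n\n");
}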

Why This Matters

Understanding the textual nature of agents helps you:

  1. Anticipate behavior - The agent only knows what’s in the context
  2. Debug confusion - If the agent forgets something, it probably scrolled out of context
  3. Structure better prompts - You’re adding text to a conversation, not issuing commands
  4. Recognize limitations - Context windows are finite; complex tasks may lose details

The Stateless Advantage

Here’s a crucial insight: The LLM is completely stateless. Its only “world” is the current context window.

The LLM doesn’t “remember” previous conversations. It has no hidden internal state. Each response is generated solely from the text currently in the context.

This is a massive advantage, not a limitation. You control what the agent knows by controlling what’s in the context.

  • Clean-slate exploration: Start a new conversation, and the agent has no bias from previous decisions
  • Unbiased code review: The agent can critically audit its own work in a fresh context

The same code that gets a “looks sound overall” in one context can trigger “critical security vulnerability” findings in a fresh context. This enables Generate → Review → Iterate workflows.
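
Reusing the hypothetical callLLM from the loop sketch above, the fresh-context review pattern is just two independent calls - the second context contains only what you choose to put in it:

// Context 1: generate. The model's entire "memory" is this message list.
const generated = await callLLM([
  { role: "system", content: "You are a coding agent." },
  { role: "user", content: "Add GDPR consent tracking to the intake form." },
]);

// Context 2: review, in a brand-new context. Nothing tells the model it
// wrote this code, so it audits without attachment to its own decisions.
const review = await callLLM([
  { role: "system", content: "You are a strict security reviewer." },
  { role: "user", content: `Audit this change for vulnerabilities:\n${generated.text}` },
]);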

Tools: Built-In vs External

Agents become useful through tools - functions the LLM can call to interact with the world.

Built-In Tools: Optimized for Speed

CLI coding agents ship with purpose-built tools for common workflows:

Read, Edit, Bash, Grep, Write, Glob - these aren’t just wrappers around shell commands. They’re engineered with edge-case handling, LLM-friendly output formats, safety guardrails, and token efficiency.
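
To make “LLM-friendly output” concrete, here is a hypothetical sketch of what a built-in Read tool might add on top of plain cat - line numbers the model can cite in edits, and an explicit truncation notice that protects the token budget. This is illustrative, not any specific agent’s implementation:

import { readFileSync } from "node:fs";

// Hypothetical Read tool, illustrating "engineered for LLMs".
function readTool(filePath: string, maxLines = 2000): string {
  const lines = readFileSync(filePath, "utf8").split("\n");
  const numbered = lines
    .slice(0, maxLines)
    .map((line, i) => `${String(i + 1).padStart(5)}| ${line}`) // numbers let the LLM cite exact locations
    .join("\n");
  const omitted = lines.length - maxLines;
  // Explicit truncation beats silent loss: the model knows data is missing.
  return omitted > 0 ? `${numbered}\n[truncated: ${omitted} more lines]` : numbered;
}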

External Tools: MCP Protocol

MCP (Model Context Protocol) is a standardized plugin system for adding custom tools. Use it to connect your agent to external systems:

  • Database clients (Postgres, MongoDB)
  • API integrations (GitHub, Legal Intelligence, Rechtspraak.nl)
  • Cloud platforms (AWS, GCP, Azure)
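
As a sketch of what this looks like in practice, here is a minimal server exposing one custom tool, modeled on the official TypeScript SDK (@modelcontextprotocol/sdk); the server name, tool name, and the 8-digit KvK validation rule are illustrative assumptions:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical MCP server exposing a single custom tool to the agent.
const server = new McpServer({ name: "kvk-tools", version: "1.0.0" });

server.tool(
  "validate_kvk",
  { kvkNumber: z.string() }, // input schema the agent sees
  async ({ kvkNumber }) => ({
    content: [
      {
        type: "text",
        // Assumption for illustration: KvK numbers are 8 digits.
        text: /^\d{8}$/.test(kvkNumber) ? "valid" : "invalid",
      },
    ],
  }),
);

// The agent connects over stdio and can now call validate_kvk like a built-in tool.
await server.connect(new StdioServerTransport());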

CLI Coding Agents: Why They Win

While chat interfaces excel at answering questions and brainstorming, CLI coding agents deliver superior developer experience for actual implementation work.

The Concurrent Work Advantage

Multiple terminal tabs = multiple agents working on different projects simultaneously.

Open three tabs, run agents on different projects (refactoring in contract-analyzer, debugging in case-timeline, implementing in client-portal). Context-switch freely. Each agent keeps working independently.

IDE agents are tightly coupled to a single window and project. You’re blocked until the agent completes.

Chat interfaces reset context with each conversation. You manually copy-paste code and execute changes.

CLI agents unlock parallelism without managing conversation threads or multiple IDE instances.

Context Engineering and Steering

Now that you understand agents as textual systems and LLMs as stateless, the core truth emerges: effective AI-assisted coding is about engineering context to steer behavior.

The context window is the agent’s entire world - everything it knows comes from the text flowing through it. You control that text: system prompts, your instructions, tool results, conversation history.

  • Vague context produces wandering behavior
  • Precise, scoped context steers the agent exactly where you need it
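
For a concrete contrast, using the intake example from earlier (the specific validation rule is illustrative):

Vague:   "Fix the intake form."

Scoped:  "In src/handlers/intake.ts, the createClient handler accepts an
         empty kvkNumber. Add validation that rejects anything that is
         not an 8-digit string, and update the existing tests."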

You can steer upfront with focused prompts, or dynamically mid-conversation when the agent drifts. The stateless nature means you can even steer the agent to objectively review its own code in a fresh conversation.

This is system design thinking applied to text - you’re already good at designing interfaces and contracts. The rest of this course teaches how to apply those skills to engineer context and steer agents across real coding scenarios.