Track

Agent Loop

Reason · Act · Observe · Repeat

You ask an LLM a question, it answers — that's a single turn. Now imagine you want the LLM to do something more involved: look up information, run a calculation, check a database, then synthesize everything into a final answer. One call won't cut it. You need a loop.

The ReAct (Reasoning + Acting) loop is one of the simplest patterns that makes this work. The agent thinks about what to do next, picks an action, executes it, observes what happened, and repeats until it has enough information to answer. Most agentic systems are built on some version of this loop.

The Cycle

At its core, a ReAct loop has three phases that repeat: Thought → Action → Observation.

Thought: The LLM reasons about the current state: what it knows, what it needs, and what to do next.

Action: It produces a structured command: a tool name and arguments.

Observation: The tool runs, and its result comes back and gets appended to the conversation.

This loop continues until the agent decides it has enough information and emits a final answer instead of an action.

Why Loops Matter

Without a loop, an LLM can only answer from its training data and whatever text is already in its prompt. It can't look things up, run code, or interact with APIs on its own. A loop with tools turns the LLM from a static knowledge base into an active problem-solver.

Agent frameworks such as LangChain, CrewAI, AutoGPT, and OpenAI Assistants run some variant of this loop. The details differ, but the core pattern is the same: think, act, observe, repeat.

Concrete Example

Minimal ReAct Loop

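import re

MAX_STEPS = 6  # hard cap on iterations so the loop always terminates
# Matches e.g. "Action: search(python docs)" and captures the tool name and raw argument string
ACTION_RE = re.compile(r"Action:\s*(\w+)\((.*)\)")
# Matches e.g. "Final Answer: 42" and captures the rest of that line
FINAL_RE = re.compile(r"Final Answer:\s*(.*)")

def run_agent(question, llm, tools):
    """Run a ReAct loop. `llm` maps a prompt string to a completion string;
    `tools` maps tool names to functions that take one argument string."""
    scratchpad = f"Question: {question}\n"
    for _ in range(MAX_STEPS):
        output = llm(scratchpad)
        scratchpad += output + "\n"

        # A final answer ends the loop immediately.
        m = FINAL_RE.search(output)
        if m:
            return m.group(1)

        # Otherwise run the requested tool and record what it returned.
        m = ACTION_RE.search(output)
        if m:
            tool_name, args = m.group(1), m.group(2)
            result = tools.get(tool_name, lambda _: "Unknown tool")(args)
            scratchpad += f"Observation: {result}\n"

    # Step budget exhausted without a final answer.
    return "I could not find an answer."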

The scratchpad is a single growing string that holds the entire conversation so far. On each step, we send it to the LLM and get back more text. We check for a Final Answer first. If none, we look for an Action line, extract the tool name and arguments, call the tool, and append the Observation. After MAX_STEPS with no answer, we bail.
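To see the loop run end to end without a real model, here is a small sketch that drives `run_agent` with a scripted stand-in for the LLM. The helper `make_scripted_llm` and the `add` tool are hypothetical names invented for this illustration; `run_agent` is repeated from the example so the block runs on its own.

```python
import re

MAX_STEPS = 6
ACTION_RE = re.compile(r"Action:\s*(\w+)\((.*)\)")
FINAL_RE = re.compile(r"Final Answer:\s*(.*)")

def run_agent(question, llm, tools):
    scratchpad = f"Question: {question}\n"
    for _ in range(MAX_STEPS):
        output = llm(scratchpad)
        scratchpad += output + "\n"

        m = FINAL_RE.search(output)
        if m:
            return m.group(1)

        m = ACTION_RE.search(output)
        if m:
            tool_name, args = m.group(1), m.group(2)
            result = tools.get(tool_name, lambda _: "Unknown tool")(args)
            scratchpad += f"Observation: {result}\n"

    return "I could not find an answer."

def make_scripted_llm(replies):
    """Hypothetical fake LLM: ignores the prompt and replays canned replies in order."""
    remaining = iter(replies)
    return lambda scratchpad: next(remaining)

def add(args):
    """Hypothetical tool: takes a string like "2, 3" and returns the sum as a string."""
    a, b = (int(x) for x in args.split(","))
    return str(a + b)

llm = make_scripted_llm([
    "Thought: I should add the two numbers.\nAction: add(2, 3)",
    "Thought: The tool returned 5.\nFinal Answer: 5",
])

answer = run_agent("What is 2 + 3?", llm, {"add": add})
print(answer)  # prints 5
```

The first scripted reply triggers the tool call and an Observation; the second ends the loop. Swapping the fake for a real model call should only require a function with the same string-in, string-out shape.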

Key Ideas

Scratchpad

A single growing transcript the LLM reads every step. No separate memory store needed.
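Because the transcript grows on every step, it can eventually outgrow the model's context window. One possible way to compact it is sketched below, under some loud assumptions: a character count stands in for a token count, and `compact_scratchpad`, `summarize`, `max_chars`, and `keep_last` are all hypothetical names. In practice `summarize` would likely be another LLM call.

```python
def compact_scratchpad(scratchpad, max_chars, summarize, keep_last=2):
    """Replace older lines with a summary once the scratchpad is too long.

    The last `keep_last` lines (assumed >= 1) are kept verbatim so the
    most recent action and observation survive.
    """
    if len(scratchpad) <= max_chars:
        return scratchpad  # still within budget, nothing to do

    lines = scratchpad.splitlines()
    head, tail = lines[:-keep_last], lines[-keep_last:]
    if not head:
        return scratchpad  # nothing older than the kept lines

    summary = summarize("\n".join(head))
    return "Summary: " + summary + "\n" + "\n".join(tail) + "\n"

# Toy summarizer for demonstration only: it just counts the lines it was given.
def count_lines(text):
    return f"{len(text.splitlines())} earlier lines"

pad = (
    "Question: q\n"
    "Thought: a\n"
    "Action: search(x)\n"
    "Observation: r1\n"
    "Action: search(y)\n"
    "Observation: r2\n"
)
compacted = compact_scratchpad(pad, max_chars=40, summarize=count_lines)
print(compacted)
```

Here the four oldest lines collapse into one summary line, while the latest Action and Observation are preserved word for word.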

Action Parsing

Extract structured commands from free-form text using simple regex patterns.
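As an illustration, the parsing step can be pulled out into its own function. This sketch reuses the `Action: name(args)` pattern from the example; `parse_action` is a hypothetical helper name, and splitting the arguments on commas is a simplifying assumption that breaks if an argument itself contains a comma.

```python
import re

ACTION_RE = re.compile(r"Action:\s*(\w+)\((.*)\)")

def parse_action(text):
    """Return (tool_name, [args]) for the first Action line, or None if there is none."""
    m = ACTION_RE.search(text)
    if m is None:
        return None
    name, raw_args = m.group(1), m.group(2)
    # Naive split: fine for simple arguments, wrong for arguments containing commas.
    args = [a.strip() for a in raw_args.split(",")] if raw_args.strip() else []
    return name, args

print(parse_action("Thought: look it up\nAction: search(python, regex)"))
print(parse_action("Action: now()"))
print(parse_action("Thought: I already know the answer."))
```

Returning None for text without an Action line lets the caller decide what to do next, for example prompting the model again or ending the run.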

Tool Dispatch

Route parsed actions to functions via a dictionary lookup.
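A slightly more defensive version of the dictionary lookup might look like the sketch below. It also turns tool exceptions into observation text so the loop can keep going, which is one way to approach the error-recovery idea. The names `dispatch`, `divide`, and `TOOLS` are hypothetical, as is the exact wording of the messages.

```python
def dispatch(tool_name, args, tools):
    """Call the named tool and always return a string the agent can read."""
    tool = tools.get(tool_name)
    if tool is None:
        # Tell the model which tools exist so it can correct itself.
        return f"Unknown tool: {tool_name}. Available: {', '.join(sorted(tools))}"
    try:
        return str(tool(args))
    except Exception as exc:
        # A failing tool becomes an observation instead of crashing the loop.
        return f"Error: {type(exc).__name__}: {exc}"

def divide(args):
    """Hypothetical tool: takes a string like "6, 3" and returns the quotient."""
    a, b = (float(x) for x in args.split(","))
    return a / b

TOOLS = {"divide": divide}

print(dispatch("divide", "6, 3", TOOLS))
print(dispatch("divide", "1, 0", TOOLS))   # reported as an Error observation
print(dispatch("search", "cats", TOOLS))   # reported as an unknown tool
```

Catching every `Exception` is a deliberate trade-off for a sketch: it keeps the loop alive, but a production agent would probably want to log the failure and limit retries.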

Observation Feedback

Tool results go back into the scratchpad so the LLM can use them on the next step.

Budgeting

Always cap the loop so the agent can't run forever.

Problems in this track

6 problems.

Implement a Minimal ReAct Loop
Build a Thought → Action → Observation loop that terminates on a Final Answer.
Difficulty: Easy · Acceptance: 78% · Est.: 20m

Route Tool Calls from Agent Output
Parse agent output and route tool calls to the right handler with a fallback for unknown tools.
Difficulty: Easy · Acceptance: 75% · Est.: 15m

Recover from Tool Errors in the Loop
Wrap tool calls in try/except so that runtime errors become observations and the loop continues.
Difficulty: Medium · Acceptance: 65% · Est.: 25m

Stop a Loop with a Step Budget
Track step count and return a fallback result when the agent exceeds its allowed step budget.
Difficulty: Medium · Acceptance: 68% · Est.: 20m

Summarize a Growing Scratchpad
When the scratchpad exceeds a token limit, summarize the older portion while preserving the latest action and observation.
Difficulty: Medium · Acceptance: 55% · Est.: 30m

Execute a Multi-Tool Plan
Execute a list of ordered plan steps by dispatching tools in sequence, accumulating observations, and returning a final answer.
Difficulty: Hard · Acceptance: 45% · Est.: 35m
