Prompt Engineering for Agent Chains: Not the Same as Chatbot Prompts
Mentiko Team
Writing a prompt for ChatGPT is easy. Writing a prompt for an agent in a multi-agent chain is a different discipline. The constraints are different, the failure modes are different, and the optimization criteria are different.
Here's what we've learned from thousands of agent chain runs.
Why chain prompts are different
A chatbot prompt has one job: produce a helpful response to a human. The human reads it, decides if it's good, and asks a follow-up if needed.
A chain prompt has a different job: produce output that the next agent can consume reliably. No human in the middle. No chance to ask a follow-up. The output must be correct, structured, and consistent every time.
This changes everything about how you write prompts.
The four rules of chain prompts
1. Specify the output format explicitly
Bad: "Analyze this code and share your findings."
Good: "Analyze this code. Output a JSON object with keys: issues (array of {severity, line, description}), summary (string, max 200 words), score (number 0-100)."
The next agent needs to parse the output. If the format varies between runs, the chain breaks. Specify exactly what the output should look like.
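To see why this matters downstream, here is a minimal sketch (in Python, with a hypothetical `validate_review` helper) of what the consuming agent's glue code can do once the format is pinned down: parse the output and fail fast on schema drift instead of passing garbage along the chain.

```python
import json

# Required keys match the "Good" prompt above; the range check on
# score is an assumption about how strict you want to be.
REQUIRED_KEYS = {"issues", "summary", "score"}

def validate_review(raw: str) -> dict:
    """Parse the reviewer agent's output; raise on any schema drift."""
    data = json.loads(raw)  # raises if the model emitted prose, not JSON
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"reviewer output missing keys: {missing}")
    if not 0 <= data["score"] <= 100:
        raise ValueError(f"score out of range: {data['score']}")
    return data

output = '{"issues": [], "summary": "Clean module.", "score": 92}'
review = validate_review(output)
```

A prompt without the explicit format makes this kind of validation impossible: there is nothing stable to validate against.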
2. Define the scope, not just the task
Bad: "Research cloud computing trends."
Good: "Research cloud computing pricing trends from the last 90 days. Focus on: AWS, GCP, Azure. Find: price changes, new tier announcements, analyst commentary. Limit: 10 sources maximum. Output: structured brief with source URLs."
Unbounded prompts produce inconsistent output. An agent that sometimes writes 200 words and sometimes writes 2,000 words makes the downstream agent's job unpredictable. Define boundaries.
3. Include the context of the chain
Bad: "Edit this article for grammar and clarity."
Good: "You are the editor in a content pipeline. The previous agent (Writer) produced this draft from research provided by the Researcher agent. Edit for: grammar, clarity, factual consistency with the research brief (attached). Maintain the author's voice. Output the edited article with a separate list of changes made."
Agents work better when they understand their role in the chain. "You are the editor" is more effective than "edit this" because it activates role-specific behavior in the model.
4. Handle failure states in the prompt
Bad: "Classify this support ticket."
Good: "Classify this support ticket into one of: bug, feature-request, question, billing, other. If the ticket is ambiguous or contains multiple categories, classify as the primary category and add a note field explaining the ambiguity. If the ticket is empty or unreadable, classify as 'unprocessable' and explain why."
Agents encounter edge cases. Prompts that don't account for failure produce unpredictable output that breaks downstream agents. Define what the agent should do when things are weird.
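Even with failure states spelled out in the prompt, the consuming code should enforce the closed label set. A small guard like this (a sketch; the `normalize_label` name is ours) turns any unexpected label into the 'unprocessable' fallback the prompt already defines:

```python
# Label set matches the ticket-classifier prompt above, including the
# 'unprocessable' fallback it defines for broken input.
ALLOWED = {"bug", "feature-request", "question", "billing", "other", "unprocessable"}

def normalize_label(raw: str) -> str:
    """Coerce the agent's label to the closed set; anything else falls back."""
    label = raw.strip().lower()
    return label if label in ALLOWED else "unprocessable"
```

The prompt handles the edge cases the model can see; the guard handles the edge cases the model invents.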
Common prompt patterns for chains
The structured extractor
Extract the following from the input document:
- title: string (the main topic)
- entities: array of {name, type, relevance_score}
- key_claims: array of strings (max 5)
- sentiment: "positive" | "negative" | "neutral"
Output as valid JSON. No markdown, no explanation, just the JSON object.
Use when: The agent's job is to parse unstructured input into structured data for downstream agents.
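A downstream agent consuming this pattern can check the contract in a few lines. This sketch (field names come from the prompt above; the checks themselves are our assumptions) rejects output that drifts from the schema:

```python
import json

def check_extraction(raw: str) -> dict:
    """Validate the extractor's JSON against the prompt's schema."""
    data = json.loads(raw)
    assert isinstance(data["title"], str)
    assert all({"name", "type", "relevance_score"} <= e.keys()
               for e in data["entities"])
    assert len(data["key_claims"]) <= 5  # the prompt caps claims at 5
    assert data["sentiment"] in {"positive", "negative", "neutral"}
    return data

sample = ('{"title": "Cloud pricing", '
          '"entities": [{"name": "AWS", "type": "company", "relevance_score": 0.9}], '
          '"key_claims": ["Prices fell in Q3"], "sentiment": "neutral"}')
doc = check_extraction(sample)
```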
The quality gate
Review the following draft against these criteria:
1. Factual accuracy (are claims supported by the research brief?)
2. Completeness (does it cover all required topics?)
3. Readability (Flesch-Kincaid grade level 8-10?)
4. Brand voice (matches the style guide?)
Output:
- passed: boolean
- score: number 0-100
- issues: array of {criterion, description, severity}
- recommendation: "publish" | "revise" | "reject"
Use when: The agent decides whether the chain continues, loops back for revision, or escalates to a human.
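The point of the gate is control flow. Here is one way an orchestrator might act on the gate's output (a sketch; the step names and the revision cap are assumptions, not part of the pattern):

```python
def next_step(gate: dict, revisions_so_far: int, max_revisions: int = 2) -> str:
    """Decide the chain's next step from the quality gate's JSON output."""
    if gate["recommendation"] == "publish" and gate["passed"]:
        return "publish"
    if gate["recommendation"] == "revise" and revisions_so_far < max_revisions:
        return "loop-back-to-writer"
    # 'reject', or too many revision loops: stop burning tokens
    return "escalate-to-human"
```

Note the revision cap: without it, a writer and a gate that disagree will loop forever.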
The transformer
Transform the input research brief into a blog post:
- Length: {WORD_COUNT} words
- Tone: {TONE}
- Structure: introduction, 3-5 sections with headers, conclusion
- Include: at least 2 statistics from the research, 1 expert quote
- Exclude: jargon, first person, promotional language
Output the complete blog post. No meta-commentary.
Use when: The agent transforms one format into another, with specific constraints.
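The {WORD_COUNT} and {TONE} placeholders are filled in by the orchestrator at run time. In Python, a plain `str.format` is enough as long as the template contains no other braces (a sketch; the values here are examples, not recommendations):

```python
# Template mirrors the transformer prompt above; placeholder names are
# lowercased to match str.format conventions.
TEMPLATE = (
    "Transform the input research brief into a blog post:\n"
    "- Length: {word_count} words\n"
    "- Tone: {tone}\n"
    "- Structure: introduction, 3-5 sections with headers, conclusion\n"
)

prompt = TEMPLATE.format(word_count=800, tone="conversational")
```

If your prompts contain literal braces (JSON examples, for instance), escape them as `{{` and `}}` or use a templating library instead.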
The router
Read the input and determine the appropriate next step:
- If this is a critical security issue: output {"route": "security-team", "priority": "P0"}
- If this is a bug with reproduction steps: output {"route": "bug-triage", "priority": "P2"}
- If this is a feature request: output {"route": "backlog", "priority": "P3"}
- If this is a question answered in docs: output {"route": "auto-respond", "priority": "P4"}
- If unclear: output {"route": "human-review", "priority": "P2"}
Output only the JSON routing object. No explanation.
Use when: The agent determines which path the chain takes based on input analysis.
Debugging bad prompts
When a chain produces bad output, the prompt is usually the problem. Debug it in this order:
- Run the agent in isolation with the same input. Is the output correct when it's not in a chain?
- Check the handoff. Is the previous agent's output what this agent expects? Print the input.
- Tighten the format. If output varies, add more format constraints.
- Add examples. Include one example of expected output in the prompt.
- Reduce scope. If the agent is doing too much, split it into two agents.
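The first check, running the agent in isolation, can be automated. A tiny harness like this (a sketch; `call_agent` is a stand-in for whatever model call your stack uses) runs the same prompt and input several times and flags format drift:

```python
def call_agent(prompt: str, input_text: str) -> str:
    """Stub for a real model call; replace with your stack's client."""
    return '{"score": 90}'

def is_consistent(prompt: str, input_text: str, runs: int = 3) -> bool:
    """Run the agent alone on identical input; varying output means
    the format constraints need tightening."""
    outputs = {call_agent(prompt, input_text) for _ in range(runs)}
    return len(outputs) == 1
```

With a real model behind `call_agent`, exact-match comparison is strict; comparing parsed JSON keys instead of raw strings is a reasonable relaxation.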
The meta-lesson
Prompts for agent chains are closer to API contracts than creative writing. They define interfaces between components. The clearer the interface, the more reliable the chain.
Write prompts like you write function signatures: explicit inputs, explicit outputs, explicit error handling.
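Taking the analogy literally helps: you can write the contract down as a type before you write the prompt. Here is what that might look like for the quality-gate output (a hypothetical dataclass, not an API from any library):

```python
from dataclasses import dataclass, field

@dataclass
class GateResult:
    """The quality gate's output contract: explicit fields, explicit types."""
    passed: bool
    score: int                        # 0-100
    recommendation: str               # "publish" | "revise" | "reject"
    issues: list = field(default_factory=list)  # {criterion, description, severity}

result = GateResult(passed=True, score=95, recommendation="publish")
```

If you can't write the type, the prompt isn't specific enough yet.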
Building your first chain? Start with the tutorial or learn the 5 chain patterns.