
Why Event-Driven Agents Beat Step-by-Step Workflows

Mentiko Team

Most agent frameworks start with the same pitch: define your steps, wire them together, hit run. It's intuitive. It maps to how we think about work -- step one, step two, step three. But once you push sequential agent workflows into production, the cracks show up fast.

Agents aren't shell scripts. They don't produce the same output every time. They discover things at runtime that change what should happen next. They fail in ways that have nothing to do with the step they're on. Sequential workflows fight this reality. Event-driven orchestration works with it.

The sequential workflow problem

A sequential pipeline looks clean on a whiteboard:

[Researcher] → [Writer] → [Editor] → [Publisher]

Four boxes, three arrows. Simple. Now add reality:

  • The researcher finds three subtopics instead of one. You need three writers, not one.
  • The editor flags a factual error. You need to route back to the researcher, but only for section 2.
  • The publisher's API is rate-limited. You need to buffer and retry, but only for this step.
  • A new requirement says legal review is needed for posts mentioning compliance.

In a sequential framework, every one of these is a structural change. You're editing the pipeline definition, adding conditional branches, inserting retry wrappers, building routing logic. Your clean four-step pipeline becomes a 12-node graph with six conditionals and a retry loop.

The deeper problem: sequential pipelines couple the shape of the work to the definition of the workflow. When the work changes shape at runtime -- which it does constantly with LLM-powered agents -- the definition can't keep up.

What event-driven actually means

In an event-driven system, agents don't know about each other. They know about events.

An agent has two responsibilities: do its work, and emit an event when it's done. What happens next isn't its problem. Other agents declare what events they care about, and the orchestrator handles the wiring.

Here's a concrete example. A researcher agent finishes and writes this event file:

{
  "event": "research:complete",
  "agent": "researcher",
  "status": "success",
  "output": "/workspace/research-findings.md",
  "topics_found": 3,
  "timestamp": "2026-03-19T09:14:22Z"
}

The writer agent's config says it triggers on research:complete:

name: writer
triggers:
  - event: "research:complete"
    condition: status == "success"
prompt: "Write a draft based on the research at {output}."
emits: "writing:complete"

The researcher doesn't know the writer exists. The writer doesn't know the researcher exists. They're connected through events, not through a pipeline definition.

This means you can change what happens after research without touching the researcher. Add a writer? New YAML file. Add a legal reviewer that only triggers when the research mentions compliance? New YAML file with a condition on the event payload. Remove the editor step on Tuesdays? Condition on timestamp.
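Under the hood, the dispatch step can be tiny. Here's a minimal Python sketch of how an orchestrator might match an event against trigger declarations -- the AGENTS table and condition lambdas are hypothetical stand-ins for the YAML configs, not Mentiko's actual internals:

```python
import json

# Hypothetical agent registry mirroring the YAML configs above: each entry
# declares the event it triggers on and a condition over the event payload.
AGENTS = [
    {"name": "writer", "trigger": "research:complete",
     "condition": lambda ev: ev.get("status") == "success"},
    {"name": "legal-reviewer", "trigger": "research:complete",
     "condition": lambda ev: ev.get("topics_found", 0) > 0},
]

def dispatch(event: dict) -> list[str]:
    """Return the names of agents whose trigger and condition match this event."""
    return [
        a["name"]
        for a in AGENTS
        if a["trigger"] == event["event"] and a["condition"](event)
    ]

event = json.loads(
    '{"event": "research:complete", "status": "success", "topics_found": 3}'
)
print(dispatch(event))  # → ['writer', 'legal-reviewer']
```

Note that neither agent appears anywhere in the other's entry -- the coupling lives entirely in the event name.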

File-based events: Mentiko's approach

Mentiko stores events as plain files in a .events/ directory inside the workspace. This is a deliberate choice over message queues, in-memory state, or database rows.

After a chain runs, the events directory looks like this:

$ ls .events/
001-research-complete.event
002-writing-complete.event
003-editing-complete.event
004-publishing-complete.event

Each file is readable JSON. No special tooling required:

$ cat .events/003-editing-complete.event
{
  "event": "editing:complete",
  "agent": "editor",
  "status": "success",
  "output": "/workspace/draft-v2.md",
  "changes": [
    {"type": "factual_correction", "section": 2, "severity": "minor"},
    {"type": "tone_adjustment", "section": 4, "severity": "trivial"}
  ],
  "input_tokens": 4200,
  "output_tokens": 3800,
  "model": "claude-sonnet-4-20250514",
  "duration_ms": 8420,
  "timestamp": "2026-03-19T09:15:44Z"
}

Three properties set file-based events apart from other event transports:

Greppable. Finding every failed agent across the last 50 runs is grep "\"status\": \"error\"" runs/*/events/*.event. Finding which agent consumed the most tokens is grep "input_tokens" runs/*/events/*.event | sort -t: -k3 -rn. No dashboards, no query languages. Just grep.

Diffable. Comparing today's run against yesterday's is diff runs/2026-03-18/events/ runs/2026-03-19/events/. You see exactly what changed: different outputs, different token counts, different error messages. Version control tools work on event files out of the box because they're just text.

Committable. You can check event files into git. This means your CI can assert that a chain's event output matches expected patterns. Regression testing for agent pipelines becomes git diff --exit-code expected-events/ actual-events/.
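For that CI assertion to be stable, volatile fields like timestamps and token counts usually need stripping before comparison. A hedged sketch -- the VOLATILE set and normalize helper are illustrative, not part of Mentiko:

```python
import json

# Fields that differ run-to-run even when the chain's behavior is unchanged.
VOLATILE = {"timestamp", "duration_ms", "input_tokens", "output_tokens"}

def normalize(raw: str) -> dict:
    """Parse an event file's JSON and drop fields that vary between identical runs."""
    ev = json.loads(raw)
    return {k: v for k, v in ev.items() if k not in VOLATILE}

a = '{"event": "editing:complete", "status": "success", "duration_ms": 8420, "timestamp": "2026-03-19T09:15:44Z"}'
b = '{"event": "editing:complete", "status": "success", "duration_ms": 9103, "timestamp": "2026-03-20T08:01:02Z"}'
print(normalize(a) == normalize(b))  # → True
```

Run the expected and actual event files through this before diffing and the regression check only fires on behavioral changes.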

Fan-out and fan-in with events

The pattern that breaks most sequential frameworks is fan-out/fan-in: split work across N parallel agents, then combine results.

In a sequential pipeline, you need explicit parallel execution constructs, barrier synchronization, and merge logic. It's the most complex part of any DAG framework.

With events, fan-out is just multiple agents watching the same event:

# agents/analyst-financial.yaml
name: analyst-financial
triggers:
  - event: "data-collection:complete"
emits: "analysis-financial:complete"

# agents/analyst-legal.yaml
name: analyst-legal
triggers:
  - event: "data-collection:complete"
emits: "analysis-legal:complete"

# agents/analyst-technical.yaml
name: analyst-technical
triggers:
  - event: "data-collection:complete"
emits: "analysis-technical:complete"

All three trigger when data collection finishes. They run in parallel automatically -- no parallel execution directive, no thread pool configuration, no async/await ceremony.
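The orchestrator's side of this can be as simple as mapping every matched agent over a worker pool. A sketch under the assumption that agents are plain callables -- run_agent here is a stand-in for invoking a real LLM-backed agent:

```python
import concurrent.futures

# Hypothetical stand-in for the three analyst agents; a real one would call
# an LLM and write its own event file to .events/.
def run_agent(name: str) -> str:
    return f"analysis-{name}:complete"

# Every agent triggered by the same event launches at once; no parallelism
# directive appears anywhere in the chain definition.
with concurrent.futures.ThreadPoolExecutor() as pool:
    emitted = sorted(pool.map(run_agent, ["financial", "legal", "technical"]))
print(emitted)
```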

Fan-in is an agent that watches for multiple events:

# agents/synthesizer.yaml
name: synthesizer
triggers:
  - event: "analysis-financial:complete"
  - event: "analysis-legal:complete"
  - event: "analysis-technical:complete"
wait: all
prompt: "Synthesize findings from all three analyses into a report."
emits: "synthesis:complete"

The wait: all directive tells the orchestrator to hold until every trigger event exists. Adding a fourth analyst means creating one YAML file and adding one trigger line to the synthesizer; removing one means deleting a file and a line. Either way, no other agent's config changes and nothing about the chain's structure needs rethinking.

Compare this to restructuring a DAG graph every time the number of parallel branches changes.
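One plausible reading of wait: all is a set check over the events observed so far -- the ready helper below is an illustrative sketch, not Mentiko's API:

```python
def ready(observed: list[dict], required: set[str]) -> bool:
    """wait: all, sketched: the fan-in agent fires only once every one of
    its declared trigger events has been observed."""
    seen = {ev["event"] for ev in observed}
    return required <= seen

required = {
    "analysis-financial:complete",
    "analysis-legal:complete",
    "analysis-technical:complete",
}
partial = [{"event": "analysis-financial:complete"}]
print(ready(partial, required))   # → False

full = partial + [
    {"event": "analysis-legal:complete"},
    {"event": "analysis-technical:complete"},
]
print(ready(full, required))      # → True
```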

Error recovery: retry the agent, not the chain

Sequential pipelines have a brutal failure model. Step 4 of 7 fails, so you either retry from step 4 (which assumes all the intermediate state is still intact and valid) or restart from step 1 (throwing away the work from steps 1-3).

Event-driven error recovery is surgical. When an agent fails, it emits an error event:

{
  "event": "writing:error",
  "agent": "writer",
  "status": "error",
  "error_type": "context_length_exceeded",
  "input_tokens": 142000,
  "max_tokens": 128000,
  "model": "gpt-5.4",
  "timestamp": "2026-03-19T09:15:02Z"
}

A recovery agent watches for error events and decides what to do:

name: error-recovery
triggers:
  - event: "*:error"
prompt: |
  An agent failed. Read the error event and decide the best recovery:
  - context_length_exceeded: route to summarizer, then retry original
  - rate_limited: schedule retry with backoff
  - malformed_output: retry with stricter format instructions
  - api_unavailable: try fallback model
emits: "recovery:action"

The key insight: recovery decisions are made per-agent, based on the specific failure. The researcher's completed work is untouched. The editor hasn't run yet and doesn't need to know anything went wrong. You retry exactly the part that failed, with full context about why it failed.

This is impossible in a flat sequential pipeline without building a custom retry framework on top. With events, it falls out naturally from the architecture.

Debugging: ls, cat, done

When a chain produces unexpected output, the debugging workflow with file-based events is:

# 1. See what happened
$ ls .events/
001-research-complete.event
002-writing-error.event
003-recovery-action.event
004-writing-complete.event
005-editing-complete.event

# 2. Read the error
$ cat .events/002-writing-error.event
{
  "event": "writing:error",
  "error_type": "context_length_exceeded",
  "input_tokens": 142000
}

# 3. See what recovery did
$ cat .events/003-recovery-action.event
{
  "event": "recovery:action",
  "action": "summarize_input_and_retry",
  "original_agent": "writer",
  "new_input_tokens": 38000
}

# 4. Verify the retry worked
$ cat .events/004-writing-complete.event
{
  "event": "writing:complete",
  "status": "success"
}

Four commands. No log aggregation service. No tracing infrastructure. No correlation IDs. The event files are the complete, ordered history of everything that happened. You can email them to a teammate, paste them into a GitHub issue, or pipe them through jq for structured analysis.

At 2am when a production chain is failing, this is the difference between a 5-minute fix and a 2-hour investigation.
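The same history is trivially scriptable. A minimal Python sketch that prints a run timeline -- the timeline helper is illustrative, and the demo builds a throwaway directory rather than reading a real workspace:

```python
import json
import pathlib
import tempfile

# Throwaway events directory for the demo; in practice you would point this
# at a real run's .events/ folder.
events_dir = pathlib.Path(tempfile.mkdtemp())
(events_dir / "001-research-complete.event").write_text(
    '{"event": "research:complete", "status": "success"}')
(events_dir / "002-writing-error.event").write_text(
    '{"event": "writing:error", "status": "error"}')

def timeline(d: pathlib.Path) -> list[str]:
    """One summary line per event, ordered by the numeric filename prefix."""
    return [
        f'{ev["event"]} [{ev["status"]}]'
        for ev in (json.loads(p.read_text()) for p in sorted(d.glob("*.event")))
    ]

print("\n".join(timeline(events_dir)))
```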

When sequential is fine

Event-driven isn't always the right call. For simple, linear chains where the shape of the work genuinely doesn't change, sequential execution is clearer:

  • A 2-agent chain (extract data, format report) doesn't benefit from event indirection.
  • A strict pipeline where step order is contractual (validate, sign, submit) is more readable as a sequence.
  • Prototyping. When you're figuring out what agents you even need, wiring them sequentially is faster than designing event contracts.

The inflection point is around 4+ agents, or any chain that needs branching, parallelism, or conditional routing. Once you hit that complexity, the overhead of event-driven architecture pays for itself immediately in debuggability and flexibility.

The bottom line

Sequential workflows describe how work flows through a system. Event-driven architectures describe what happened and let the system figure out the flow. When your agents are deterministic and the workflow is fixed, either approach works. When your agents are LLM-powered, non-deterministic, and the workflow adapts at runtime -- events match the nature of the work.

File-based events add a layer on top of that: full transparency, zero infrastructure, and debugging that starts with ls. It's not the most sophisticated event transport. It's the most useful one.


Dive deeper: File-Based Events vs Message Queues explains the infrastructure tradeoffs. 5 Agent Chain Patterns shows these patterns as portable JSON definitions. Or build your first chain in five minutes.
