Beads Task Tracking for Agent Chains

You built the chain. It runs. But six weeks later, nobody remembers why the summarizer uses GPT-5.4 instead of Claude, or who decided the retry threshold should be 3. The chain definition tells you what the system does. It doesn't tell you why decisions were made, what was tried and discarded, or what's still on the backlog.

That's where Beads comes in. Beads is a lightweight task tracker that lives alongside your chain definitions. It captures the context around your agent work -- the decisions, the iterations, the open questions -- so your chain configs never exist in a vacuum.

What Beads tracks

Beads organizes work into tasks with attached notes. Each task has a subject, a status, and optional metadata. Notes are freeform text attached to tasks -- meeting context, architecture decisions, debugging sessions, whatever doesn't belong in the chain JSON but matters for understanding it.

A typical Beads workflow for agent chains looks like this:

# Create a task for a new chain
bd create "Build customer onboarding chain" \
  --description "3-agent pipeline: welcome email, account setup, first-run tutorial"

# Add context as you work
bd note 42 "Decided on GPT-5.4 for welcome email -- needs warm tone, Claude was too formal in testing"
bd note 42 "Account setup agent needs database write access. Added db-write scope to workspace config."

# Track progress
bd update 42 --status in_progress
bd update 42 --status complete

The task number (42 in this example) links everything together. When someone asks "why does the welcome agent use GPT-5.4?" you run bd get 42 and the decision context is right there.

Setting up Beads with a Mentiko project

Beads stores its data in a .beads/ directory at the root of your project. If you're version-controlling your chain definitions (you should be), Beads data lives right next to them.

Initialize Beads in your Mentiko project:

cd /path/to/your-mentiko-project
bd init

This creates a .beads/ directory with the task database. Add it to your .gitignore if you want local-only tracking, or commit it if you want shared visibility across the team.

For team usage, committing .beads/ means every team member sees the same task history, the same decision notes, and the same backlog. When someone reviews a PR that changes a chain definition, they can cross-reference the Beads task to understand the full context.
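
The two options can be sketched in plain shell. The helper name beads_ignore_local is hypothetical and not part of Beads. It only edits .gitignore, and it assumes the file (if present) ends with a newline.

```shell
# Hypothetical helper (not a Beads command): keep Beads data local by adding
# .beads/ to .gitignore, unless that exact line is already present.
beads_ignore_local() {
  # Create .gitignore if it does not exist yet.
  touch .gitignore
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
  if ! grep -qxF '.beads/' .gitignore; then
    printf '%s\n' '.beads/' >> .gitignore
  fi
}

# For shared tracking, commit the directory instead of ignoring it:
#   git add .beads/
#   git commit -m "Track Beads task history"
```

Running beads_ignore_local more than once is safe, because the grep check skips the append when the entry already exists.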

Tracking chain development lifecycle

A typical chain goes through several phases: design, implementation, testing, and production tuning. Beads captures context at each stage.

Here's a real workflow for building a content pipeline chain:

# Phase 1: Design
bd create "Content pipeline chain" \
  --description "Research -> Draft -> Edit -> Publish. Event-driven, scheduled daily."

bd note 1 "Requirements from content team: 800-1200 words, SEO-optimized, brand voice guidelines attached"
bd note 1 "Decided on 4-agent chain instead of 3. Splitting edit into fact-check + style-edit."

# Phase 2: Implementation
bd create "Implement content pipeline agents" \
  --description "Write chain JSON, configure workspaces, set up event bindings"

bd note 2 "Researcher agent needs web access. Using docker workspace with network enabled."
bd note 2 "Style editor prompt went through 6 iterations. v6 in chains/content-pipeline.json is the one that matches brand voice."

# Phase 3: Testing
bd create "Test content pipeline end-to-end" \
  --description "Run 10 topics through the full chain. Check output quality and timing."

bd note 3 "Runs 1-5 passed. Run 6 hit rate limit on research agent. Added retry with 2s backoff."
bd note 3 "Average chain completion: 4m 12s. Acceptable for daily scheduled run."

# Phase 4: Production
bd create "Deploy content pipeline to production schedule" \
  --description "Set up cron, configure alerting, document runbook"

bd note 4 "Scheduled for 6am UTC daily. Alert threshold: 3 consecutive failures."

Four tasks, each with notes that capture what happened and why. When you revisit this chain in three months, the Beads history is a complete narrative of how it got to its current state.

Linking tasks to chain definitions

Beads metadata lets you associate tasks with specific chains:

bd create "Optimize research agent token usage" \
  --description "Current avg: 12k tokens per run. Target: under 8k." \
  --metadata chain=content-pipeline agent=researcher

bd create "Add fallback model for draft agent" \
  --description "Primary: GPT-5.4. Fallback: Claude if OpenAI is down." \
  --metadata chain=content-pipeline agent=drafter priority=high

The --metadata fields are arbitrary key-value pairs. Use them to filter tasks by chain, by agent, by priority, or by whatever dimensions matter to your team. When you're debugging a specific agent, filter to see all tasks related to it:

bd list --metadata agent=researcher
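
If you check several agents regularly, a small wrapper saves typing. This is a sketch only: bd_tasks_by_agent is a hypothetical name, and it assumes bd list --metadata agent=... behaves as shown above.

```shell
# Hypothetical helper: print the tasks for each named agent in turn.
# It relies only on the `bd list --metadata agent=...` form used in this
# article; the function name itself is an illustration, not part of Beads.
bd_tasks_by_agent() {
  for agent in "$@"; do
    # Header line so the output for each agent is easy to tell apart.
    printf '== %s ==\n' "$agent"
    bd list --metadata "agent=$agent"
  done
}

# Usage:
#   bd_tasks_by_agent researcher drafter
```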

Capturing debugging sessions

Agent chains fail in production. When they do, the event files tell you what happened. Beads captures the analysis -- the why and the fix.

bd create "Debug: content pipeline failing on long-form topics" \
  --description "Chain fails when researcher output exceeds 15k tokens. Summarizer context window overflow."

bd note 12 "Root cause: researcher output not truncated before passing to summarizer"
bd note 12 "Fix: added max_output_tokens: 10000 to researcher config"
bd note 12 "Also added a pre-summarizer agent that chunks long output. See chains/content-pipeline.json v12."

bd update 12 --status complete

This is the kind of context that gets lost in Slack threads and stand-up notes. In Beads, it's permanently attached to a searchable task.

Using Beads with team workflows

For teams, Beads fits into existing development workflows. Start each chain modification with a Beads task and reference the task number in your commit message. When someone reviews the PR, they can check the Beads task for context that doesn't belong in the chain JSON or the commit message.

# Create the task
bd create "Switch content pipeline to use workspace pools" \
  --description "Current: new Docker container per run. Target: warm pool of 3 containers."

# Do the work, commit
git commit -m "Switch content pipeline to workspace pools (beads #15)"

# Add post-deploy notes
bd note 15 "Deployed. Cold start dropped from 8s to 0.3s. Pool of 3 handles our daily volume."
bd update 15 --status complete

The commit message references the Beads task. The Beads task has the context. The chain JSON has the configuration. Three layers of documentation, each in the right place.
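
One way to keep the convention from drifting is to check for the reference at commit time. The sketch below assumes the "(beads #15)" format from the commit above. The function name beads_ref_from_msg is hypothetical, and the hook shown in the comments is an optional Git-side addition, not a Beads feature.

```shell
# Hypothetical helper: print the Beads task number referenced in a commit
# message, following the "(beads #15)" convention. Prints nothing if the
# message contains no reference.
beads_ref_from_msg() {
  # sed -n with the p flag prints only lines where the substitution matched;
  # -E enables extended regular expressions. head keeps the first match.
  printf '%s\n' "$1" | sed -nE 's/.*[Bb]eads #([0-9]+).*/\1/p' | head -n 1
}

# Example body for .git/hooks/commit-msg (define the function above in the
# same file and make the file executable). Git passes the path of the
# message file as $1:
#   msg=$(cat "$1")
#   if [ -z "$(beads_ref_from_msg "$msg")" ]; then
#     echo "commit message should reference a Beads task, e.g. (beads #15)" >&2
#     exit 1
#   fi
```

The same function works in the other direction: pipe a commit subject through it to recover the task number, then look the task up with bd get.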

Reviewing project status

Beads gives you a snapshot of where things stand:

# Overview
bd stats

# All open tasks
bd list --status pending,in_progress

# Recent activity
bd list --sort updated --limit 10

# Everything related to a specific chain
bd list --metadata chain=content-pipeline

For shift handoffs or team syncs, bd stats shows total tasks, open vs. closed, and recent activity. It's a 2-second status check instead of a 20-minute meeting.
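
For a recurring handoff, the commands above can be bundled into one function. This is a minimal sketch: beads_handoff is a hypothetical name, and it assumes the bd subcommands and flags work as shown in this section.

```shell
# Hypothetical helper: print a handoff summary by running the three status
# commands from this section in order, each under a plain-text heading.
beads_handoff() {
  echo "# Overview"
  bd stats
  echo "# Open tasks"
  bd list --status pending,in_progress
  echo "# Recently updated"
  bd list --sort updated --limit 10
}

# Usage: save the summary to a file to paste into a handoff note.
#   beads_handoff > handoff.txt
```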

What Beads doesn't do

Beads is not a project management tool. It doesn't have sprints, story points, Gantt charts, or burndown charts. It's a context capture system that sits next to your chain definitions.

If you need project management, use your existing tools. Beads handles the layer between "Jira ticket says build a content pipeline" and "chains/content-pipeline.json exists" -- the decisions, iterations, and debugging context that make the chain understandable six months from now.

Your chain definitions describe the system. Beads describes the journey. Both matter when you're maintaining agent pipelines at scale.

Get started with your first agent chain, then add Beads tracking as your chains grow beyond the prototype stage.