
Decision Flow: How AI Agents Make Better Decisions Than Chat

Mentiko Team

Open a ChatGPT thread. Type "should we use Kafka or RabbitMQ for our event bus?" Watch what happens. You'll get a balanced comparison covering throughput, latency, operational complexity, community support, and five other dimensions you didn't ask about. Each dimension gets a paragraph. Both options are presented as valid. The conclusion is "it depends on your use case."

You are now exactly where you started, except 90 seconds older.

This is what happens when you use a conversation as a decision engine. The format is wrong for the job. Chat is great for exploration, terrible for convergence. Decision flow is a different format designed specifically for convergence.

Why chat fails at decisions

Three structural problems make chat-based AI unusable for real decisions.

Context evaporates. A ChatGPT conversation is linear. Your requirements are in message 3, the constraints you mentioned are in message 7, and the clarification that actually matters is in message 12. The model is stitching together a coherent view of your preferences from scattered fragments across a thread. By message 15, it's lost the nuance of what you said in message 4.

There's no structured output. Ask a chatbot for a recommendation and you get prose. Prose can't be scored. Prose can't be compared side-by-side. Prose can't be fed into a downstream system. You read it, nod, and then manually translate it into a Jira ticket or a Slack message. The decision lives in a chat window nobody will ever reopen.

Comparison is impossible. Try comparing three architecture options in a chat thread. You'll ask for option A, scroll up to remember what option B said about latency, realize you forgot to ask option C about cost, go back and ask, then lose track of which option had better fault tolerance. Chat is sequential. Comparison requires parallel presentation.

These aren't AI limitations. They're format limitations. The model is fine. The interface is wrong.

What decision flow actually is

Decision flow is a three-round structured process that replaces open-ended conversation with constrained, high-signal interactions. Each round narrows the decision space.

Round 1: Preference elicitation. The system learns what you care about through forced binary tradeoffs -- not by asking you to write a paragraph about your requirements.

Round 2: Scored options. The system generates a small number of concrete options, each scored against your revealed preferences. Not a list of everything possible. A curated shortlist ranked by fit.

Round 3: Execution plan. You pick an option. The system generates actionable work items with dependencies, estimates, and assignments. The decision becomes work, not a document.

The whole thing takes 3-5 minutes. Compare that to a 45-minute meeting that ends with "let's think about it more" or a ChatGPT thread that ends with "it depends."

How tinder-style tradeoff cards work

Round 1 is the core innovation. Instead of asking "what are your requirements?" (which produces vague, contradictory wish lists), decision flow presents binary tradeoff cards. Each card forces a choice between two values that are both reasonable but in tension.

A decision about infrastructure tooling might generate these cards:

  • Operational simplicity vs. Maximum throughput
  • Managed service vs. Self-hosted control
  • Team familiarity vs. Best technical fit
  • Lower upfront cost vs. Lower long-term cost
  • Vendor ecosystem vs. Open-source flexibility

Swipe right for the first option, left for the second. No middle ground. No "both." That constraint is the entire point.

Here's why this works better than asking people what they want: people are terrible at articulating priorities in the abstract but excellent at making concrete tradeoffs. Ask an engineering lead "what matters most for your database choice?" and you'll get "performance, reliability, and cost" -- which tells you nothing because everyone wants all three. Show them a card that says "higher throughput" vs. "simpler operations" and they'll pick one in two seconds. That pick reveals actual priority.

Five to seven cards generate a weighted preference vector. The math is simple:

{
  "preferences": {
    "operational_simplicity": 0.82,
    "managed_service": 0.71,
    "team_familiarity": 0.65,
    "lower_upfront_cost": 0.45,
    "vendor_ecosystem": 0.38
  }
}

Each preference gets a weight between 0 and 1 based on how consistently it was chosen when it appeared in tradeoff pairs. If you always picked simplicity over competing values, it scores high. If cost was sometimes sacrificed for other things, it scores lower. This isn't a survey -- it's revealed preference extraction. The same technique behavioral economists use, applied to technical decisions.
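The post doesn't spell out the exact formula, but "how consistently it was chosen when it appeared in tradeoff pairs" suggests a win-rate calculation. A minimal sketch in Python -- the value names and the win-rate weighting are assumptions, not Mentiko's actual implementation:

```python
from collections import defaultdict

def extract_preferences(swipes):
    """Compute 0-1 weights from binary tradeoff picks.

    `swipes` is a list of (winner, loser) pairs, one per card.
    A value's weight is the fraction of its appearances it won,
    so consistently chosen values score near 1.0.
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in swipes:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    return {v: round(wins[v] / appearances[v], 2) for v in appearances}

# Hypothetical Round 1 session: five swipes across overlapping values.
swipes = [
    ("operational_simplicity", "max_throughput"),
    ("managed_service", "self_hosted"),
    ("operational_simplicity", "lower_upfront_cost"),
    ("team_familiarity", "best_technical_fit"),
    ("managed_service", "lower_upfront_cost"),
]
prefs = extract_preferences(swipes)
# operational_simplicity appeared twice and won both times -> 1.0
# lower_upfront_cost appeared twice and won neither -> 0.0
```

With more cards and more overlap between values, the weights spread out across the 0-1 range rather than clustering at the extremes, which is where the fractional scores in the JSON above would come from.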

Real examples

Architecture decisions

Prompt: "We need to add real-time notifications to our SaaS product. 200k users, 5-person engineering team."

Round 1 tradeoff cards surface that this team values development speed over infrastructure control, prefers WebSocket familiarity over newer protocols, and would rather pay more for a managed service than operate their own.

Round 2 generates three options: Pusher (91% match), Ably (84% match), self-hosted Socket.io on Redis (62% match). The self-hosted option scores low because every card pointed away from operational overhead.

Round 3 for the Pusher pick: set up Pusher account, implement server-side event publishing, add client-side listener library, build notification preferences UI, write integration tests. Five tasks with dependencies and time estimates.
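Those match percentages can be modeled as a weighted score of each option against the Round 1 preference vector. A rough sketch -- the trait values below are illustrative assumptions, not Mentiko's actual scoring model:

```python
def match_score(option_traits, preferences):
    """Score an option 0-100 as the weighted fraction of
    preferences it satisfies.

    `preferences` is the Round 1 vector (value -> weight);
    `option_traits` maps each value to how well this option
    satisfies it, on a 0-1 scale.
    """
    total = sum(preferences.values())
    satisfied = sum(w * option_traits.get(k, 0.0) for k, w in preferences.items())
    return round(100 * satisfied / total)

preferences = {
    "operational_simplicity": 0.82,
    "managed_service": 0.71,
    "team_familiarity": 0.65,
    "lower_upfront_cost": 0.45,
}
# Assumed trait profiles for two of the options above.
managed = {"operational_simplicity": 1.0, "managed_service": 1.0,
           "team_familiarity": 0.9, "lower_upfront_cost": 0.3}
self_hosted = {"operational_simplicity": 0.3, "managed_service": 0.0,
               "team_familiarity": 0.8, "lower_upfront_cost": 0.9}
```

Because the highest-weighted preferences (simplicity, managed service) are exactly where the self-hosted option is weakest, it scores well below the managed option -- the mechanism behind "every card pointed away from operational overhead."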

Tool selection

Prompt: "We're choosing a CI/CD platform. Currently on Jenkins, team of 12, mix of Node and Python services."

The tradeoff cards reveal that migration ease beats feature richness, that the team would rather have opinionated defaults than maximum configurability, and that cost sensitivity is low relative to developer experience.

Round 2: GitHub Actions (89% match), GitLab CI (76% match), CircleCI (72% match). Jenkins staying in place isn't offered as an option because the decision was framed as a migration.

Hiring criteria

Prompt: "We're hiring a senior backend engineer. Should we prioritize system design experience, language-specific expertise, or leadership potential?"

Round 1 cards force tradeoffs between these in pairs, plus dimensions like remote vs. on-site preference, generalist vs. specialist, and shipping speed vs. code quality standards. The output is a weighted rubric that the hiring panel can actually use for consistent candidate evaluation.

Why this beats a 10-paragraph ChatGPT conversation

Five specific reasons.

Speed. Seven swipes take 30 seconds. Writing and refining a prompt to get good output from ChatGPT takes 5-10 minutes. Then you iterate. Then you ask follow-ups. Decision flow front-loads the context gathering into a format that takes seconds instead of minutes.

Signal density. Each swipe is pure signal. No filler, no qualifiers, no "I think maybe we should consider." Binary inputs produce clean data. Chat inputs produce noisy prose that the model has to parse for intent.

Reproducibility. Run the same tradeoff cards with three different stakeholders. You get three preference vectors you can compare and reconcile. Try getting three people to independently describe their requirements in chat and then merging those conversations. It doesn't work.

Auditability. The preference vector, the option scores, and the selected plan are structured data stored with the decision. Six months later, you can review exactly why you picked Pusher over Ably. In a chat thread, you'd be scrolling through 40 messages trying to reconstruct the reasoning.

Action. The output isn't advice. It's a plan with tasks. The gap between "we decided" and "we started" drops from days (or weeks, or never) to one click.

How Mentiko implements this

Decision flow is a first-class feature in the Mentiko platform, not a plugin or an afterthought. It's accessible from the dashboard and deeply integrated with the chain execution system.

Three modes for different stakes

Intake mode for quick decisions. You describe the problem, the AI gives you a recommendation with a rationale. No rounds, no cards. Takes 30 seconds. Good for low-stakes choices where you just need a tiebreaker.

Guided mode is the full three-round flow described above. This is for decisions that affect your architecture, your team, or your roadmap. The tradeoff cards, scored options, and execution plan.

Classic mode is a traditional comparison dashboard. Side-by-side option cards with full detail on every dimension. Good for stakeholders who want to see everything at once rather than going through the guided flow.

Chain integration

This is where decision flow gets interesting. A decision can trigger a Mentiko chain. The execution plan from Round 3 isn't just a list of tasks -- it can be wired to actual agent chains that execute the work.

{
  "name": "post-decision-setup",
  "trigger": "decision:selected",
  "agents": [
    {
      "name": "provisioner",
      "prompt": "Set up the selected service based on decision context.",
      "triggers": ["decision:selected"],
      "emits": ["provision:complete"]
    },
    {
      "name": "config-writer",
      "prompt": "Generate configuration files for the selected option.",
      "triggers": ["provision:complete"],
      "emits": ["config:ready"]
    },
    {
      "name": "test-runner",
      "prompt": "Run integration tests against the new setup.",
      "triggers": ["config:ready"],
      "emits": ["chain:complete"]
    }
  ]
}

Pick your database in decision flow. The provisioning chain runs. Configuration gets generated. Tests execute. The decision didn't produce a document -- it produced a running system.
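Mentiko's chain runtime isn't shown here, but the trigger/emit semantics in the config imply a simple event dispatcher: each emitted event wakes whichever agents list it in their `triggers`. A toy sketch of how such a chain could be driven, with the `execute` callback standing in for a real agent call:

```python
from collections import deque

def run_chain(chain, initial_event, execute):
    """Dispatch events to agents whose `triggers` match, queueing
    the events they `emit`, until the event queue drains."""
    events = deque([initial_event])
    log = []
    while events:
        event = events.popleft()
        for agent in chain["agents"]:
            if event in agent["triggers"]:
                log.append(agent["name"])
                execute(agent, event)          # real agent work goes here
                events.extend(agent["emits"])  # downstream agents fire next
    return log

# The agent graph from the config above, minus the prompts.
chain = {
    "agents": [
        {"name": "provisioner", "triggers": ["decision:selected"],
         "emits": ["provision:complete"]},
        {"name": "config-writer", "triggers": ["provision:complete"],
         "emits": ["config:ready"]},
        {"name": "test-runner", "triggers": ["config:ready"],
         "emits": ["chain:complete"]},
    ]
}
order = run_chain(chain, "decision:selected", lambda agent, event: None)
# -> ["provisioner", "config-writer", "test-runner"]
```

The chain is linear here, but nothing in the dispatch loop requires that: two agents listening for the same event would both fire, giving fan-out for free.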

Decision history

Every decision is stored with its full context: the original prompt, the preference vector from Round 1, all scored options from Round 2, the selected option, and the execution plan from Round 3. This creates a searchable decision log.
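A record like that is just structured data, which is what makes it searchable and diffable later. A sketch with assumed field names -- Mentiko's actual schema may differ:

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class DecisionRecord:
    prompt: str           # the original problem statement
    preferences: dict     # Round 1 weighted preference vector
    options: list         # Round 2 scored shortlist
    selected: str         # Round 3 pick
    plan: list            # execution tasks for the pick
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

record = DecisionRecord(
    prompt="Add real-time notifications. 200k users, 5-person team.",
    preferences={"operational_simplicity": 0.82, "managed_service": 0.71},
    options=[{"name": "Pusher", "score": 91}, {"name": "Ably", "score": 84}],
    selected="Pusher",
    plan=["Set up Pusher account", "Implement server-side event publishing"],
)
# asdict(record) yields a plain dict ready for the decision log.
```

Everything needed to answer "why did we pick Pusher?" lives in one serializable object instead of a 40-message thread.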

Three months later, when someone asks "why did we pick Pusher?" the answer isn't in someone's memory or buried in a Slack thread. It's in the decision record, with the exact tradeoffs that were made and the scores that drove the choice.

This also enables decision retrospectives. Compare what you predicted (operational simplicity: high, cost: moderate) against what actually happened. Feed that data back into future decisions. Organizations that track decision quality get better at deciding over time. Organizations that don't track it keep making the same mistakes.

Getting started

Decision flow ships with guided mode in Mentiko's current release. If your team makes technical decisions in Slack threads and meeting notes -- which is most teams -- this is a direct upgrade.

The pattern works even if you start small. Use intake mode for daily choices, guided mode for weekly architecture discussions, and chain integration when you're ready to close the loop between deciding and doing.

Join the waitlist to get access, or read more about how decision flow turns AI from chat into action.
