Most teams pick the wrong first AI agent. Not because they lack good options — because they pick the flashy one instead of the valuable one. Here is the framework we use to choose a first project that earns its keep and builds momentum for the next.
Why the first agent matters more than it looks
Your first agent does two jobs. The obvious one is to automate a piece of work. The quieter, more important one is to build organisational confidence. A first project that ships, behaves predictably and visibly saves time turns sceptics into allies. A first project that drags on, hallucinates, or solves a problem nobody had does the opposite — and can stall your whole AI roadmap for a year.
So the goal isn’t to pick the most impressive agent. It’s to pick the one most likely to succeed quickly and obviously.
Score candidates on two axes
List the workflows people grumble about. Then score each on a simple two-by-two: impact (how much time, money or risk it removes) and feasibility (how cleanly an agent can actually do it today). The sweet spot for a first project is high feasibility and meaningful — not necessarily maximal — impact.
Start in the top-right of the impact/feasibility grid, then bias toward feasibility. A medium-impact win that ships beats a high-impact project that stalls.
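To make "bias toward feasibility" concrete, here is a minimal sketch of how the ranking could be done in Python. The 1-to-5 scale, the 0.6/0.4 weights, the example scores and the "Meeting-notes summaries" candidate are all illustrative assumptions, not part of the framework itself; any weighting that favours feasibility would serve.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    impact: int       # assumed scale: 1 (low) to 5 (high)
    feasibility: int  # assumed scale: 1 (low) to 5 (high)


# Hypothetical weights: feasibility counts for more than impact.
FEASIBILITY_WEIGHT = 0.6
IMPACT_WEIGHT = 0.4


def priority(candidate: Candidate) -> float:
    """Weighted score that tilts the ranking toward feasibility."""
    return (FEASIBILITY_WEIGHT * candidate.feasibility
            + IMPACT_WEIGHT * candidate.impact)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from best to worst first project."""
    return sorted(candidates, key=priority, reverse=True)


# Illustrative scores only; replace with your own team's judgement.
candidates = [
    Candidate("Order-status email replies", impact=4, feasibility=5),
    Candidate("Pricing strategy", impact=5, feasibility=1),
    Candidate("Meeting-notes summaries", impact=2, feasibility=5),
]

for c in rank(candidates):
    print(f"{c.name}: {priority(c):.1f}")
```

With these made-up scores, the medium-impact, high-feasibility candidates outrank the high-impact, low-feasibility one, which is the behaviour the grid is meant to encourage.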
What raises feasibility
- Clear inputs and outputs. The task has a well-defined starting point (an email, a ticket, a document) and a recognisable “done”.
- Tolerance for review. A human can sign off the agent’s output before it has consequences, so early mistakes are caught cheaply.
- Available knowledge. The information the agent needs already exists in documents, a knowledge base or systems you can connect to.
- Repetition. The task happens often enough that automating it pays back fast.
What lowers it
- Irreversible actions with no review step (moving money, deleting records) on day one.
- Knowledge that lives only in people’s heads.
- Highly ambiguous goals where even two colleagues would disagree on the right answer.
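If you want a rough number rather than a gut feel, the factors above can be turned into a simple tally. This is only a sketch: the trait names are our own shorthand for the bullets, and treating every factor as worth exactly one point is an assumption you may well want to change.

```python
# Shorthand names for the factors listed above (hypothetical labels).
RAISES = (
    "clear_inputs_outputs",
    "human_review",
    "knowledge_available",
    "repetitive",
)
LOWERS = (
    "irreversible_unreviewed_actions",
    "knowledge_in_heads",
    "ambiguous_goal",
)


def feasibility_score(traits: set[str]) -> int:
    """Add one point per raising factor, subtract one per lowering factor.

    The result ranges from -3 to 4. Equal weighting is an assumption.
    """
    unknown = traits - set(RAISES) - set(LOWERS)
    if unknown:
        raise ValueError(f"Unknown traits: {sorted(unknown)}")
    raised = sum(1 for t in RAISES if t in traits)
    lowered = sum(1 for t in LOWERS if t in traits)
    return raised - lowered


# A task that ticks every raising factor and no lowering one.
well_suited = {"clear_inputs_outputs", "human_review",
               "knowledge_available", "repetitive"}
# A task with an ambiguous goal and unreviewed, irreversible actions.
poorly_suited = {"ambiguous_goal", "irreversible_unreviewed_actions"}

print(feasibility_score(well_suited))
print(feasibility_score(poorly_suited))
```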
A worked example
Suppose a support team is drowning in repetitive “where is my order?” emails. Impact: high, because those emails are a big chunk of volume. Feasibility: also high, because the inputs are clear (an email), the knowledge exists (order systems, a returns policy), and a human can review drafts before they are sent. That’s a textbook first agent: it triages and drafts replies, a person approves them, and you measure the result.
Contrast that with “an agent that sets our pricing strategy”. High impact, but feasibility is low: the goal is ambiguous, the data is messy, and the actions are consequential. That’s a third- or fourth-project problem, not a first one.
Define success before you build
Pick one or two metrics you’ll judge the agent on, and capture a baseline before you start. For a support agent that might be median first-response time and the share of tickets needing human edits. Without a baseline you’ll have a working agent and no way to prove it’s working — which makes funding the next one much harder.
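As a sketch of what capturing those two numbers might look like, the function below summarises a batch of tickets. The field names `first_response_minutes` and `needed_human_edit` and the sample data are hypothetical; your ticketing system will have its own export format. The same function can be run on a sample taken before launch and on one taken after, so the two results are directly comparable.

```python
from statistics import median


def summarise(tickets: list[dict]) -> dict:
    """Compute the two example metrics for a batch of tickets.

    Each ticket is assumed to be a dict with (hypothetical) keys:
      first_response_minutes: minutes until the first reply
      needed_human_edit: True if a person had to edit the reply
    """
    if not tickets:
        raise ValueError("Need at least one ticket to compute metrics")
    response_times = [t["first_response_minutes"] for t in tickets]
    edited = sum(1 for t in tickets if t["needed_human_edit"])
    return {
        "median_first_response_minutes": median(response_times),
        "share_needing_edits": edited / len(tickets),
    }


# Made-up sample data for illustration only.
sample = [
    {"first_response_minutes": 30, "needed_human_edit": True},
    {"first_response_minutes": 45, "needed_human_edit": False},
    {"first_response_minutes": 120, "needed_human_edit": True},
    {"first_response_minutes": 60, "needed_human_edit": False},
]

print(summarise(sample))
```

The median is used rather than the mean so that a handful of very slow tickets does not distort the picture.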
Keep the scope embarrassingly small
The most common failure we see isn’t a bad idea — it’s a good idea with the scope cranked to maximum. Resist it. Ship the narrowest useful version: one workflow, one team, one clear review step. You can always widen the agent’s remit once it has earned trust in production.
A quick checklist
- Is the task frequent and repetitive?
- Are the inputs and outputs clear?
- Does the knowledge already exist somewhere we can reach?
- Can a human review the output before it has consequences?
- Have we written down the baseline metric we’ll improve?
- Is the scope small enough to ship in weeks, not quarters?
If you can answer yes to most of these, you’ve probably found your first agent. If you can’t, keep looking — the right starting point is out there, and it’s usually less glamorous and more useful than the one that first comes to mind.
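For teams that like to keep the checklist next to their backlog, here is a minimal sketch of it as code. The item names are our own shorthand for the six questions, and reading "most" as at least five yes answers out of six is an assumption; adjust `min_yes` to taste.

```python
# Shorthand for the six checklist questions (hypothetical labels).
CHECKLIST = (
    "frequent_and_repetitive",
    "clear_inputs_and_outputs",
    "knowledge_reachable",
    "human_review_before_consequences",
    "baseline_metric_written_down",
    "ships_in_weeks",
)


def passes_checklist(answers: dict[str, bool], min_yes: int = 5) -> bool:
    """Return True if enough checklist questions are answered yes.

    Missing answers count as no. The default threshold of 5 is an
    assumed reading of "most of these".
    """
    yes_count = sum(1 for item in CHECKLIST if answers.get(item, False))
    return yes_count >= min_yes


# Illustrative answers for the two candidates from the worked example.
order_status_replies = {item: True for item in CHECKLIST}
pricing_strategy = {"baseline_metric_written_down": True}

print(passes_checklist(order_status_replies))
print(passes_checklist(pricing_strategy))
```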


