I Asked My Own Tool Whether It Should Exist

02 June 2026 · 7 min read

Delphibuilder-logdecision-makingAI

I Asked My Own Tool Whether It Should Exist

I built a tool that argues with itself so I can’t get certain too early. Then I asked it whether it should exist, and it took apart my favourite version of the idea before catching itself making the exact mistake it was built to catch.

Let me back up.

A few weeks ago I built a thing I’ve been calling Delphi. The name is borrowed from the old Delphi method, where you put a question to a panel of experts, collect their views in rounds, feed the disagreement back in, and let positions move. The classic version is slow and expensive and people drop out halfway through. Mine runs the panel as AI personas instead. Five experts, each with a different background and bias. A contrarian whose only job is to attack the emerging answer. It does its own web research, runs multiple rounds, and at the end it does not hand you a confident answer. It hands you the structured disagreement. Where the experts converged, where they split, what would have to be true for the whole thing to flip.

The point of it is to stop you converging on certainty too early. That is the failure mode I keep watching in transformation work. A room agrees too fast, mistakes the speed of agreement for the quality of it, and ships the first idea that felt clever. Delphi is built to slow that exact reflex down.

I was proud of it. Genuinely. And that is precisely the moment I have learned to distrust myself.

Pride is a blind spot wearing a nice jacket. When I am proud of something I stop being able to see it straight. So I did the only honest thing I could think of. I pointed the tool at itself and asked it the one question I least wanted to ask. Is this even a viable product, and if so what is the real business, and if not what is its highest-value use.

I did not brief it to be kind.

It came back in about 33 minutes and £3.16 of compute. Five expert personas, two rounds. And it did not flatter me.

Here is the first thing that stung. The obvious build, the one every founder reaches for, is a self-serve software product. Sign up, run your decisions, pay per seat, scale. The tool argued against it. Not weakly. It said the real value of the thing is not decision efficiency for commercial buyers at all. It is legitimacy. The buyers who will actually pay are the ones who are already forced to show their working. Health-guideline bodies. Financial risk committees. Government foresight units. Places where a documented, defensible deliberation is not a nice-to-have, it is the job. For everyone else, the verdict was brutal, and I am going to quote it because softening it would be cheating:

“Buyers without a regulator pointing a gun at them will not actually pay for better thinking.”

Read that again. That is my own tool telling me that better thinking does not sell. That people pay for cover, for audit trails, for the ability to say we deliberated properly when it goes wrong. Not for being more right. I have spent twenty years in and around organisational change telling people that better thinking is the prize. The machine I built to protect good thinking just told me the market mostly does not want it unless someone is making them buy it.

I sat with that for a while. It is not wrong.

But that is not even the part that got me. The part that got me was the recursion.

Somewhere in the run, the tool turned on its own panel. It noticed that its five experts had mildly converged in the first round. They had agreed too comfortably, too early, on the idea that structured dissent is valuable, without stress-testing the conditions under which it stops being valuable. And it flagged that. It said, in effect, look, here is live evidence of the exact problem this product exists to solve, happening inside this very analysis, among informed experts, in real time.

The tool caught itself doing the precise thing it was built to prevent.

I cannot tell you how that felt. I built a machine to catch premature certainty, and the most convincing proof that the problem is real and hard came from watching the machine nearly fall into it and then notice. It was not a clean demo. It was better than a clean demo. It was the thing actually working on itself, including on its own blind spots.

So what do I take from all this. A few things, and none of them are tidy.

The first is that the comfortable version of an idea is usually the one you should interrogate hardest, because comfort is doing the thinking for you. My favourite framing, self-serve software for smart teams, was the framing I least examined. The tool went straight for it.

The second is that pride is a signal, not a reward. When I feel proud of something I have made, that is now my cue to go looking for the part of it I have stopped being able to see. Pride means I have stopped looking. That is the dangerous bit.

The third is more uncomfortable and I am still chewing on it. If better thinking genuinely does not sell on its own, then a lot of what I do for a living rests on persuading people to want something the market does not naturally reward. That is either a problem with my offer or the whole point of it. I have not decided which yet. The tool did not decide it for me, and I notice it explicitly refused to. It said, in as many words, this is your decision, the system will not make it for you. Which is exactly what it should say.

There is one place I have already started using it, and it is not as a product. It is in client work, on the questions that are too important to answer fast. When a leadership team is staring at a decision they cannot easily reverse, I run it through the panel first. Not to get the answer. To get the disagreement on the record, to surface the assumptions nobody said out loud, and to catch where my own bias is quietly steering the diagnosis before I walk into the room with it. It turns a hard question into something you can audit. You can go back afterwards and see what was contested, what moved, and why it moved. That is the part I keep coming back to. The tool told me legitimacy is what the market pays for, and the way I actually use it is to build an audit trail around deep thinking. Those are the same thing pointed at different people.

I am not telling you whether Delphi becomes a product. I genuinely do not know yet, and the honest answer is that the run made me more uncertain, not less, which is precisely what it was supposed to do.

What I will tell you is this. The most useful thing I have built recently is not a tool that gives me answers. It is a tool that takes away my false ones. And the most useful thing it did was turn that on its maker.

If you have got something you are proud of right now, that is your cue. Go and ask it the question you least want to ask.

Tim Robinson, 2 June 2026

If this is your problem

AI made tasks faster, then progress levelled off?

I run a fixed-price diagnosis: one to two days, £4,500, and you get your level measured and the constraint named in a document written for your board, not your backlog. There is also a 90-minute AI Maturity Review at £750 if you want a smaller first step.

Read about the AI Maturity Diagnosis Or take the 3-minute survey

Tim Robinson

Transformation Consultant & AI Practitioner

20+ years fixing how organisations work. I help leadership teams redesign operating models and apply AI where it actually matters.

Book a quick chat

07 February 2026

Making Better Decisions Faster: AI-Powered Prioritisation and Strategic Alignment

Product teams struggle with decision-making not due to lacking frameworks, but because important decisions involve competing priorities and incomplete information. AI transforms this from high-stakes events into continuous, evidence-based exploration.

12 July 2026

The 2 August panic was wrong. Your real AI Act deadline is December 2027.

The EU has formally adopted the Digital Omnibus: the high-risk AI obligations most organisations were bracing for have moved from August 2026 to December 2027. That is not a reprieve to ignore. It is exactly enough runway to do this properly — but only if you start now.

27 June 2026

Your transformation doesn't need a new strategy. It needs a diagnosis.

Medicine has a rigorous, teachable method for reasoning under uncertainty when being wrong is expensive. It's called differential diagnosis, and almost all of it transfers to why transformation stalls, including the biases that make most change diagnosis fail.

I Asked My Own Tool Whether It Should Exist

I Asked My Own Tool Whether It Should Exist

AI made tasks faster, then progress levelled off?

Tim Robinson

Related posts

Making Better Decisions Faster: AI-Powered Prioritisation and Strategic Alignment

The 2 August panic was wrong. Your real AI Act deadline is December 2027.

Your transformation doesn't need a new strategy. It needs a diagnosis.