I Asked My Own Tool Whether It Should Exist
I built a tool that argues with itself so I can’t get certain too early. Then I asked it whether it should exist, and it took apart my favourite version of the idea before catching itself making the exact mistake it was built to catch.
Let me back up.
A few weeks ago I built a thing I’ve been calling Delphi. The name is borrowed from the old Delphi method, where you put a question to a panel of experts, collect their views in rounds, feed the disagreement back in, and let positions move. The classic version is slow and expensive and people drop out halfway through. Mine runs the panel as AI personas instead. Five experts, each with a different background and bias. A contrarian whose only job is to attack the emerging answer. It does its own web research, runs multiple rounds, and at the end it does not hand you a confident answer. It hands you the structured disagreement. Where the experts converged, where they split, what would have to be true for the whole thing to flip.
The point of it is to stop you converging on certainty too early. That is the failure mode I keep watching in transformation work. A room agrees too fast, mistakes the speed of agreement for the quality of it, and ships the first idea that felt clever. Delphi is built to slow that exact reflex down.
I was proud of it. Genuinely. And that is precisely the moment I have learned to distrust myself.
Pride is a blind spot wearing a nice jacket. When I am proud of something I stop being able to see it straight. So I did the only honest thing I could think of. I pointed the tool at itself and asked it the one question I least wanted to ask. Is this even a viable product, and if so what is the real business, and if not what is its highest-value use.
I did not brief it to be kind.
It came back in about 33 minutes and £3.16 of compute. Five expert personas, two rounds. And it did not flatter me.
Here is the first thing that stung. The obvious build, the one every founder reaches for, is a self-serve software product. Sign up, run your decisions, pay per seat, scale. The tool argued against it. Not weakly. It said the real value of the thing is not decision efficiency for commercial buyers at all. It is legitimacy. The buyers who will actually pay are the ones who are already forced to show their working. Health-guideline bodies. Financial risk committees. Government foresight units. Places where a documented, defensible deliberation is not a nice-to-have, it is the job. For everyone else, the verdict was brutal, and I am going to quote it because softening it would be cheating:
“Buyers without a regulator pointing a gun at them will not actually pay for better thinking.”
Read that again. That is my own tool telling me that better thinking does not sell. That people pay for cover, for audit trails, for the ability to say we deliberated properly when it goes wrong. Not for being more right. I have spent twenty years in and around organisational change telling people that better thinking is the prize. The machine I built to protect good thinking just told me the market mostly does not want it unless someone is making them buy it.
I sat with that for a while. It is not wrong.
But that is not even the part that got me. The part that got me was the recursion.
Somewhere in the run, the tool turned on its own panel. It noticed that its five experts had mildly converged in the first round. They had agreed too comfortably, too early, on the idea that structured dissent is valuable, without stress-testing the conditions under which it stops being valuable. And it flagged that. It said, in effect, look, here is live evidence of the exact problem this product exists to solve, happening inside this very analysis, among informed experts, in real time.
The tool caught itself doing the precise thing it was built to prevent.
I cannot tell you how that felt. I built a machine to catch premature certainty, and the most convincing proof that the problem is real and hard came from watching the machine nearly fall into it and then notice. It was not a clean demo. It was better than a clean demo. It was the thing actually working on itself, including on its own blind spots.
So what do I take from all this. A few things, and none of them are tidy.
The first is that the comfortable version of an idea is usually the one you should interrogate hardest, because comfort is doing the thinking for you. My favourite framing, self-serve software for smart teams, was the framing I least examined. The tool went straight for it.
The second is that pride is a signal, not a reward. When I feel proud of something I have made, that is now my cue to go looking for the part of it I have stopped being able to see. Pride means I have stopped looking. That is the dangerous bit.
The third is more uncomfortable and I am still chewing on it. If better thinking genuinely does not sell on its own, then a lot of what I do for a living rests on persuading people to want something the market does not naturally reward. That is either a problem with my offer or the whole point of it. I have not decided which yet. The tool did not decide it for me, and I notice it explicitly refused to. It said, in as many words, this is your decision, the system will not make it for you. Which is exactly what it should say.
There is one place I have already started using it, and it is not as a product. It is in client work, on the questions that are too important to answer fast. When a leadership team is staring at a decision they cannot easily reverse, I run it through the panel first. Not to get the answer. To get the disagreement on the record, to surface the assumptions nobody said out loud, and to catch where my own bias is quietly steering the diagnosis before I walk into the room with it. It turns a hard question into something you can audit. You can go back afterwards and see what was contested, what moved, and why it moved. That is the part I keep coming back to. The tool told me legitimacy is what the market pays for, and the way I actually use it is to build an audit trail around deep thinking. Those are the same thing pointed at different people.
I am not telling you whether Delphi becomes a product. I genuinely do not know yet, and the honest answer is that the run made me more uncertain, not less, which is precisely what it was supposed to do.
What I will tell you is this. The most useful thing I have built recently is not a tool that gives me answers. It is a tool that takes away my false ones. And the most useful thing it did was turn that on its maker.
If you have got something you are proud of right now, that is your cue. Go and ask it the question you least want to ask.
Tim Robinson, 2 June 2026