A working journal, written, argued, sketched in the open

001 — Premise

The Journal is written by the people who deliver the work — not by a content team and not by a model. Field notes from engagements we are mid-flight on, opinions formed by being wrong about things, and the occasional argument with a fashionable idea. We publish when there is something useful to say. Sometimes that is monthly. Sometimes it is not.

● Latest

The most recent piece, written this month.

Capability tells you what an agent can do. Authorisation tells you what it is permitted to do. In six production incidents across our estate this year, only one of those questions had been asked before deployment.

Strategy — July 2026

10 min

The approval meeting for an agentic deployment almost always answers one question: can this agent complete the task? The question that produces incidents is different — what is this agent permitted to do when the task specification runs out? That boundary condition, absent from evaluations and invisible in demos, is responsible for a class of production failure that is substantially harder to remediate than to define in advance.

Julian R. Mountford

Founder & Chairman

Read the piece →

● Earlier in the archive

Pieces worth going back to.

Long-context inference replaced the retrieval layer in seven of nine production deployments we reviewed this year. Five of those seven are rebuilding — not because accuracy was wrong, but because nobody had multiplied the per-query token cost by the daily query volume before the architecture was committed.

N° 02 — Field notes

10 min

The 1M-token context window is a genuine engineering advance. But in nine production deployments where we have reviewed its use as a retrieval substitute, the two that held value shared a structural property the other seven did not: the task required reasoning across information that chunk-based retrieval would have destroyed. The seven that are rebuilding failed on cost and latency at production volume — failure modes that were calculable before the architecture was committed and visible in the first month's infrastructure bill.
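The multiplication in question is short enough to sketch. The figures below are hypothetical placeholders, not numbers from any deployment we reviewed, and the function name is our own; the sketch counts input tokens only and ignores output tokens, caching, and latency.

```python
def daily_input_cost(tokens_per_query: int,
                     price_per_million_tokens: float,
                     queries_per_day: int) -> float:
    """Daily input-token spend: per-query token cost times daily query volume."""
    return tokens_per_query * price_per_million_tokens * queries_per_day / 1_000_000


# Hypothetical inputs, chosen only to illustrate the arithmetic.
PRICE_PER_MILLION = 3.0      # assumed price per million input tokens
QUERIES_PER_DAY = 5_000      # assumed production volume

# Long-context: most of the corpus is sent with every query.
long_context = daily_input_cost(800_000, PRICE_PER_MILLION, QUERIES_PER_DAY)

# Retrieval: only a handful of retrieved chunks are sent with every query.
retrieval = daily_input_cost(8_000, PRICE_PER_MILLION, QUERIES_PER_DAY)

print(f"long-context: {long_context:,.0f} per day")
print(f"retrieval:    {retrieval:,.0f} per day")
print(f"ratio:        {long_context / retrieval:.0f}x")
```

Under these assumed inputs the gap is two orders of magnitude per day, which is the kind of result that can be computed before the architecture is committed rather than read off the first infrastructure bill.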

Eleven of the fourteen production AI systems we currently manage are running on a different model version than the one they launched with. In six of the eleven, no engineer has formally approved the change.

Fine-tuning earned its cost in three of the eleven enterprise AI programmes we have reviewed this year. The other eight are in retraining cycles they did not budget for, or being converted to retrieval architectures.

The MCP server is not your authorisation layer, and treating it as one is the mistake we are fixing in five production deployments.

The EU AI Act's high-risk system provisions arrive in August. Most organisations have not yet formally classified their deployed systems — and classification is the easier half of the problem.

Extended thinking earns its cost on a fraction of enterprise queries. The rest is latency you are paying by accident.

Most multi-agent orchestration deployments are solving coordination problems the architecture introduced.

The AI programmes authorised in 2024 are reaching their renewals. Most cannot demonstrate what they produced.

The context window is not a retrieval architecture.

The latency budget is back, and the systems we built while it was gone are showing it.

After month twelve, your AI evaluation is measuring the world your system created.

Enterprise procurement has found its questions. Most AI vendors are still rehearsing their answers.

What breaks in agentic systems at month seven, and the three structural gates we now require before any deployment.

Why most enterprise AI pilots never reach production — and the four conditions that change that.

The case for retrieval-first architectures over fine-tuning, in seven failed projects.

Boards have stopped asking what AI is. They are starting to ask better questions.
