Engineering · June 2026 · 9 min

The MCP server is not your authorisation layer, and treating it as one is the mistake we are fixing in five production deployments.

Model Context Protocol has standardised how agents discover and invoke tools across enterprise systems — a genuine step forward that compressed weeks of integration work to hours. What it does not standardise is who the agent is when it makes those calls, whether that specific operation is authorised in this workflow context, or what audit record connects a service account write to the human instruction that caused it. We are correcting this in five production deployments right now, and in three of them we built the original architecture.

Sher Ghan

— Principal AI Engineer

The integration pitch has acquired a standard shape. A diagram — usually on the second slide, sometimes the first — shows an agent in the centre, surrounded by labelled MCP servers: one for the CRM, one for the document store, one for the internal knowledge base, one for calendar and email. Arrows radiate outward. The agent can reach everything it needs. The demo, run against a curated dataset and a carefully scoped tool set, performs exactly as shown. The question nobody asks at this stage — the one I have started requiring an answer to before the diagram appears — is on whose authority the agent makes those calls, and what record exists when it does.

Model Context Protocol has done something genuinely valuable. It standardised how agents discover and invoke tools — a problem that every team in 2024 was solving differently, almost always worse. The specification is clean, the ecosystem is substantial, and the time from 'I need this agent to talk to our CRM' to 'the agent is talking to our CRM' has compressed from weeks of bespoke integration work to hours of configuration. Firms still building proprietary integration layers in 2025 were doing unnecessary work at higher cost and lower reliability. The protocol belongs in enterprise AI deployments, and it is increasingly going into them.

The problem is that MCP standardises capability discovery and invocation. It does not standardise identity, authorisation, or audit. Those gaps were largely invisible in the protocol's early deployments — developer tooling, read-only lookups against code repositories, constrained environments where the blast radius of a misdirected call was low. They are not invisible in 2026, when enterprise agents are making write calls into production systems through MCP connections carrying service account credentials that nobody has reviewed in months. We are correcting this in five deployments right now. In three of those five, we built the original architecture. That is the honest context for the rest of this.

#02 What the protocol specifies and what sits above it

The MCP specification defines how a client discovers a server's capabilities and invokes them. The server exposes a set of tools; the agent selects a tool and calls it with parameters; the server executes against the underlying system and returns a result. The message format is defined. The capability discovery handshake is defined. The error codes are defined. What is not defined is who the agent is when it makes a call, whether the operation requested is appropriate in this context, or what accountability record should exist when the operation is committed to the downstream system.
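To make that boundary concrete, here is a simplified sketch of the two request shapes involved, written as Python dictionaries. MCP messages follow JSON-RPC 2.0, with `tools/list` for discovery and `tools/call` for invocation. The tool name `crm_update_contact`, its arguments, and the `IDENTITY_FIELDS` set are hypothetical and exist only for illustration.

```python
import json

# Discovery: the client asks the server which tools it exposes.
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Invocation: the client calls one tool by name with arguments.
# The tool name and argument values below are hypothetical.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "crm_update_contact",
        "arguments": {"contact_id": "C-1042", "stage": "qualified"},
    },
}

# Hypothetical names for the kind of fields an audit trail would want.
IDENTITY_FIELDS = {"user", "session", "on_behalf_of", "workflow"}


def fields_present(message: dict) -> set:
    """Return the top-level keys of a request plus the keys of its params."""
    return set(message) | set(message.get("params", {}))


print(json.dumps(call_request, indent=2))
print(fields_present(call_request) & IDENTITY_FIELDS)
```

In this simplified request, nothing says on whose behalf the call is made or which workflow step it belongs to. That information has to be added and enforced by the layers around the protocol.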

This is not a deficiency in the specification. Authorisation is correctly not a transport protocol concern: HTTP does not define what a given user is permitted to do on a given service, and HTTP is not criticised for that gap. The difference in practice is that application-layer authorisation over HTTP is well understood and routinely implemented. Developers building web applications know they need an auth layer and have decades of established patterns to reach for. Developers building MCP integrations in 2025 and 2026 are working in a relatively young ecosystem, and often do not recognise the gap between 'the agent can call this tool' and 'the agent is authorised to perform this operation' as a distinct architecture problem that has to be solved separately.

What fills the gap in most deployments is the MCP server's service account. The agent calls the server; the server calls the underlying system using whatever credentials it was provisioned with. In the five deployments I am describing, those credentials ranged from appropriately scoped read-only access to a service account with write access to the full database schema. That account had been provisioned during initial configuration, when an engineer needed write access to test a feature, and was never re-scoped before the system went into production.

#03 The three gaps that appear in production

Three failure modes show up consistently in MCP deployments that went into production without an explicit authorisation architecture. None of them appear in a demo. All three were present in at least two of the five deployments I am describing.

The first is credential scope. An MCP server's service account is typically provisioned for what the server might ever need — the union of permissions required across all of its potential uses, all of its agents, and all of the workflows it might serve. The agent inherits whatever the server is authorised to do. A specific agent performing a specific workflow task inherits write access to records it will never touch in that workflow, delete rights included out of caution, and permissions granted for an edge case that has not arisen in six months of production operation. In the four deployments where we conducted a retrospective scope audit, the operations the agent actually invoked in production represented between 18 and 34 per cent of what the service account was technically authorised to perform. The remainder was blast radius.

The second gap is identity threading. When an agent makes a write call through MCP, the resulting record in the downstream system is attributed to the service account. The CRM audit log shows service account 'ai-agent-prod' at 14:37:22. It does not show which agent session triggered the call, which user instruction initiated the task, or what reasoning the agent was executing when it chose this tool. In March, an agent with MCP access to a financial services CRM made forty-seven write operations over approximately two hours — updating contact records, modifying deal stages, adding notes. Every operation was logged. Every operation was attributed to the service account. Reconstructing the connection between those forty-seven operations and the user interactions that caused them required four hours of correlation work against a separate observability log, and that was only possible because we had included structured agent session logging in the delivery package. Without that logging, forty-seven committed writes would have been permanently unattributable to any instruction or intent.
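The correlation work described above looks roughly like the following sketch: with no shared identifier, the only join keys between the downstream audit log and the agent session log are the operation name and a time window. The record shapes, the session IDs, and the two-second window are assumptions for illustration, not the format of any particular CRM or observability tool.

```python
from datetime import datetime, timedelta


def at(hms: str) -> datetime:
    """Parse a wall-clock time; the date part is irrelevant for this sketch."""
    return datetime.strptime(hms, "%H:%M:%S")


def candidate_sessions(audit_row, session_events, window=timedelta(seconds=2)):
    """Return session IDs whose logged tool call could explain this audit row."""
    return sorted({
        event["session_id"]
        for event in session_events
        if event["operation"] == audit_row["operation"]
        and abs(event["at"] - audit_row["at"]) <= window
    })


# What the downstream system records: the service account and a timestamp.
audit_rows = [
    {"actor": "ai-agent-prod", "operation": "update_contact", "at": at("14:37:22")},
    {"actor": "ai-agent-prod", "operation": "add_note", "at": at("14:40:05")},
]

# What a separate agent session log records (hypothetical entries).
session_events = [
    {"session_id": "s-17", "operation": "update_contact", "at": at("14:37:21")},
    {"session_id": "s-22", "operation": "update_contact", "at": at("14:37:23")},
    {"session_id": "s-17", "operation": "add_note", "at": at("14:40:05")},
]

attribution = [candidate_sessions(row, session_events) for row in audit_rows]
print(attribution)
```

With these sample records the first write has two plausible sessions and the second has one. Time-window matching narrows the field but does not settle it, which is why the reconstruction is slow and why it is impossible without the session log.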

The third gap is tool surface. An MCP server exposing thirty or more distinct operations creates an environment where tool selection becomes a non-trivial inference task at every agent step, and the consequences of wrong selection scale with the breadth of available operations. In a constrained tool set of four or five, wrong selection is obvious and the recovery path is short. In a set of thirty-seven tools, an agent can route a write operation through a tool that was not designed for the input it receives — because that tool's name and description matched the agent's intermediate goal more closely than the correct tool at that particular step in the task. Three incidents in our production estate involved operations committed through tools with no input validation for that input type, because the tool authors had not anticipated receiving it. The correct tools had validation. The agent did not select them.
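One mitigation that follows from those incidents is to have every tool reject input it was not designed for, so that a wrong selection fails loudly instead of committing. The sketch below is a minimal illustration using only the standard library; the tool name `crm_add_note`, its schema, and the `validate` helper are hypothetical.

```python
def validate(arguments: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the input is acceptable."""
    problems = []
    for field, expected_type in schema.items():
        if field not in arguments:
            problems.append(f"missing field: {field}")
        elif not isinstance(arguments[field], expected_type):
            problems.append(f"{field}: expected {expected_type.__name__}")
    # Reject anything the tool author did not anticipate.
    for field in sorted(arguments.keys() - schema.keys()):
        problems.append(f"unexpected field: {field}")
    return problems


# Hypothetical schema for a note-adding tool.
ADD_NOTE_SCHEMA = {"contact_id": str, "text": str}

# The agent picked the note tool but sent a deal-stage update.
misrouted = {"contact_id": "C-1042", "stage": "closed_won"}
print(validate(misrouted, ADD_NOTE_SCHEMA))

# A well-formed call passes.
print(validate({"contact_id": "C-1042", "text": "Called back"}, ADD_NOTE_SCHEMA))
```

Validation at the tool boundary does not make the agent select correctly. It limits what a wrong selection can commit.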

“The audit log records the service account write at 14:37:22. It does not record the instruction that caused it. Technically complete; operationally useless.”

#04 What we build in before any write endpoint is touched

The requirements are not technically demanding. They are organisationally demanding — which is exactly why they do not appear in initial MCP architectures unless they are written into the delivery brief before a line of server configuration is committed.

The first is minimum-viable credential scope, established before the server reaches a production environment. This means running the target workflows in a staging environment, observing the specific operations the agent actually invokes — not inferring them from the tool list — and scoping the service account to that set. The cataloguing exercise has, in four recent deployments, reduced the effective write surface by 60 to 80 per cent relative to initial provisioning. The residual scope is what the agent needs. The rest is unnecessary exposure that no longer exists.
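The cataloguing exercise is set arithmetic over two inputs: what the service account is provisioned to do, and what the agent was observed doing in staging. The sketch below assumes permissions can be expressed as simple `resource:action` strings; the permission names and the observed calls are invented for illustration.

```python
def scope_from_observation(provisioned: set, observed_calls: list):
    """Split provisioned permissions into those to keep and those to drop."""
    used = set(observed_calls)
    keep = used & provisioned
    drop = provisioned - used
    # Observed calls outside the provisioned set should not succeed at all;
    # surface them for investigation instead of silently ignoring them.
    unexpected = used - provisioned
    reduction_pct = round(100 * len(drop) / len(provisioned))
    return keep, drop, unexpected, reduction_pct


# Hypothetical initial provisioning: ten permissions.
provisioned = {
    "contact:read", "contact:write", "contact:delete",
    "deal:read", "deal:write", "deal:delete",
    "note:read", "note:write", "note:delete",
    "schema:admin",
}

# Hypothetical operations logged while running the target workflows in staging.
observed_calls = [
    "contact:read", "contact:write", "note:write",
    "contact:read", "note:write",
]

keep, drop, unexpected, reduction_pct = scope_from_observation(provisioned, observed_calls)
print(sorted(keep))
print(f"{reduction_pct}% of provisioned permissions can be removed")
```

The important design choice is the input: the `observed_calls` list comes from logs of real workflow runs, not from reading the tool list and guessing.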

The second is agent session identity threaded into every downstream audit record. Every tool invocation through MCP should carry a correlation identifier that connects the MCP call to the agent session and, above that, to the user interaction or scheduled job that initiated the work. The mechanism varies by system: a trace ID propagated through the MCP client and included in the headers of requests to the downstream system; an enrichment step at the server layer before the underlying API call; a structured log record keyed to the session identifier. The engineering cost in our recent deployments has been approximately one working day per MCP server. That is the kind of number that gets deprioritised until a specific incident demands it, and the kind that looks different after the first audit request the system cannot answer.
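A minimal sketch of the server-side enrichment variant follows, using Python's `contextvars` to carry the identifier through a session. The `X-Correlation-ID` header name, the identifier format, and the function names are assumptions for illustration; the downstream system must be configured to record whatever header is chosen.

```python
import contextvars
import json
import uuid

# Holds the identifier for the current agent session; None means "not set".
correlation_id = contextvars.ContextVar("correlation_id", default=None)


def start_session(instruction_id: str) -> str:
    """Bind a new correlation ID that ties the session to the initiating instruction."""
    cid = f"{instruction_id}:{uuid.uuid4().hex[:8]}"
    correlation_id.set(cid)
    return cid


def enrich_request(headers: dict, tool: str, audit_log: list) -> dict:
    """Attach the correlation ID to outgoing headers and write a structured log record."""
    cid = correlation_id.get()
    if cid is None:
        # Fail closed: an unattributable write is the outcome we are avoiding.
        raise RuntimeError("no correlation ID bound; refusing downstream call")
    audit_log.append(json.dumps({"correlation_id": cid, "tool": tool}))
    return {**headers, "X-Correlation-ID": cid}


audit_log = []
cid = start_session("instr-881")  # hypothetical instruction identifier
outgoing = enrich_request({"Accept": "application/json"}, "crm_update_contact", audit_log)
print(outgoing["X-Correlation-ID"] == cid)
print(audit_log[0])
```

Failing closed when no identifier is bound is a deliberate choice in this sketch. Some teams may prefer to log and continue during a migration period.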

The third is an operation-level authorisation check, separate from session-level authentication. Session authentication establishes who the agent is; it does not establish whether this specific agent is permitted to perform this specific operation in this workflow context at this step. A per-invocation policy check — as simple as a role-based rule evaluated before execution, or as contextual as an authorisation service that understands the current workflow state — constrains the agent to what it is meant to do in this deployment, rather than what the service account is able to do across all deployments. In our production systems, the latency added by this check has ranged from 12 to 45 milliseconds per invocation. More practically, writing the authorisation policy is the only point in the architecture at which someone must specify, in language a non-engineer can audit, what the system is permitted to do and under what conditions. That document tends to be more useful at the first compliance review than anything else the programme produced.
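At the simple end of that range, the check can be a lookup evaluated before the tool runs. The sketch below assumes a policy keyed on workflow and step; the workflow names, step names, tool names, and the `invoke` wrapper are hypothetical. A contextual authorisation service would replace `is_permitted` without changing the shape of the wrapper.

```python
# Policy as data: which tools may run at which step of which workflow.
# Written so that a non-engineer can read and audit it.
POLICY = {
    ("lead_qualification", "enrich"): {"crm_read_contact"},
    ("lead_qualification", "record_outcome"): {"crm_read_contact", "crm_add_note"},
}


def is_permitted(workflow: str, step: str, tool: str) -> bool:
    """Deny by default: anything not listed for this workflow step is refused."""
    return tool in POLICY.get((workflow, step), set())


def invoke(workflow: str, step: str, tool: str, arguments: dict, execute):
    """Run the policy check before the tool touches the downstream system."""
    if not is_permitted(workflow, step, tool):
        raise PermissionError(f"{tool} is not permitted in {workflow}/{step}")
    return execute(tool, arguments)


def fake_execute(tool: str, arguments: dict) -> str:
    """Stand-in for the real downstream call."""
    return f"executed {tool}"


print(invoke("lead_qualification", "record_outcome", "crm_add_note", {}, fake_execute))
print(is_permitted("lead_qualification", "enrich", "crm_delete_contact"))
```

The service account may well hold delete rights. In this sketch the agent cannot reach them from either workflow step, because the decision is made per invocation and not per credential.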

#05 What the ecosystem will provide, and why it will not arrive in time

The MCP specification community is aware of these gaps. Authorisation guidance is being developed; what the specification covers today concerns how a client obtains access to a server over HTTP transports, not which operations an agent may perform in a given workflow. The specification's authors have been consistent in treating that operation-level authorisation as an application-layer concern that sits above the protocol rather than inside it. Reference architectures for enterprise MCP authorisation will, in due course, materially lower the cost of satisfying these requirements.

The agents going to production in June 2026 will not wait for that work to complete. They are being deployed now, against production systems, with service accounts that have not been scoped and audit trails that cannot reconstruct the decisions they record. The correction path — when the first attribution request arrives, or when the first compliance audit reaches the MCP layer — runs through the three requirements above, applied retrospectively, against a running system. That work has averaged five weeks of engineering per deployment in our experience. We have done it five times. We would rather others find these five weeks before they need them.

● About the author

Sher Ghan

Principal AI Engineer

Every piece in the Journal is written personally by a senior practitioner, drawing on the engagement that motivated it. No ghostwriters, no content team, no models. If a paragraph here resonates with a problem you are looking at, the author is the person to reply to — direct lines beat anonymous inboxes.
