Agent-ready APIs need a contract audit before MCP rollout

By Greg Nowak. Last updated 2026-06-29.

Many internal APIs are perfectly useful inside the business because people know how to work around their rough edges. A developer remembers which field is really required. A support engineer knows that one endpoint name is historical. An integration specialist can read an unclear error, ask around, and try again.

Agents do not have that workplace context. Once an API is exposed through MCP, the contract is no longer just a method, path, payload, and status code. The model may discover the tool, compare it with similar tools, prepare the request, interpret the response, and decide what to do next. Small ambiguities become runtime decisions.

That is why an MCP rollout should start with a contract audit, not only a server adapter. The practical question is simple: can an agent-driven caller understand the operation, stay within the right boundary, recover from failure, and avoid doing more than the user intended?

MCP makes API wording operational

The MCP Tools specification is clear about the shift. Servers expose tools that language models can invoke, and those tools are identified through names and metadata describing their schemas. The specification also describes tools as model-controlled, meaning the model can discover and invoke them based on context and user prompts.

That puts pressure on details teams often treat as documentation polish. A tool named update_record with a thin description might pass a technical check, but it gives the model very little to judge. Which record? What kind of update? Is it reversible? Is it safe to call after a partial failure?
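To make the contrast concrete, here is a minimal sketch of a description check. The two tool definitions are hypothetical and only shaped like MCP tool metadata (name, description, inputSchema). The generic-name list and the twelve-word threshold are arbitrary assumptions, not part of any specification.

```python
# Hypothetical tool definitions shaped like MCP tool metadata.
VAGUE_TOOL = {
    "name": "update_record",
    "description": "Updates a record.",
    "inputSchema": {
        "type": "object",
        "properties": {"id": {"type": "string"}, "data": {"type": "object"}},
    },
}

NARROW_TOOL = {
    "name": "update_customer_billing_email",
    "description": (
        "Replace the billing email on one existing customer account. "
        "Reversible by calling again with the previous address. "
        "Safe to retry: repeating the same call has no further effect."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string", "description": "Internal customer ID."},
            "billing_email": {
                "type": "string",
                "format": "email",
                "description": "New billing address for invoices.",
            },
        },
        "required": ["customer_id", "billing_email"],
    },
}

# Assumed list of names that say nothing about the operation.
GENERIC_NAMES = {"update_record", "get_data", "do_action", "process"}
MIN_DESCRIPTION_WORDS = 12  # arbitrary threshold for this sketch


def audit_tool(tool: dict) -> list[str]:
    """Return human-readable findings for one tool definition."""
    findings = []
    if tool["name"] in GENERIC_NAMES:
        findings.append("name is generic")
    if len(tool.get("description", "").split()) < MIN_DESCRIPTION_WORDS:
        findings.append("description is too short to judge scope or safety")
    schema = tool.get("inputSchema", {})
    if not schema.get("required"):
        findings.append("no required fields declared")
    for prop_name, prop in schema.get("properties", {}).items():
        if "description" not in prop:
            findings.append(f"parameter '{prop_name}' has no description")
    return findings
```

Running audit_tool on the vague definition returns several findings, while the narrower one returns none. A check like this cannot judge whether a description is true, only whether there is enough material for a reviewer or a model to work with.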

Input and output schemas matter for the same reason. Input schemas define the parameters the caller should provide. Output schemas can help clients and LLMs validate and handle structured results. Error handling is part of the contract too, because MCP distinguishes protocol errors from tool execution errors. For an agent, a vague failure is not just inconvenient. It can lead to a bad retry, a poor fallback choice, or a confident answer built on a failed action.
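The distinction between the two error kinds can be sketched in a few lines. In MCP, a protocol error arrives as a JSON-RPC error object, while a tool execution error arrives inside a normal result with isError set to true. The function name and the sample messages below are illustrative assumptions.

```python
def classify_tool_response(message: dict) -> str:
    """Classify one JSON-RPC response to a tools/call request."""
    if "error" in message:
        # The request itself was rejected, e.g. unknown tool or bad arguments.
        return "protocol_error"
    result = message.get("result", {})
    if result.get("isError"):
        # The tool ran but reported a failure in its result content.
        return "tool_execution_error"
    return "success"


# Illustrative messages, not captured from a real server.
UNKNOWN_TOOL = {
    "jsonrpc": "2.0",
    "id": 3,
    "error": {"code": -32602, "message": "Unknown tool: invalid_tool_name"},
}
RATE_LIMITED = {
    "jsonrpc": "2.0",
    "id": 4,
    "result": {
        "content": [{"type": "text", "text": "Rate limit exceeded, retry later."}],
        "isError": True,
    },
}
OK_RESULT = {
    "jsonrpc": "2.0",
    "id": 5,
    "result": {"content": [{"type": "text", "text": "Email updated."}]},
}
```

Keeping the two cases apart lets a client decide separately whether to fix the request, retry, or escalate, instead of treating every failure as the same event.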

The OpenAI Agents SDK MCP documentation shows how close this is to real implementation work. It covers hosted MCP tools, Streamable HTTP servers, legacy SSE, and stdio servers, along with approval flows, tool filtering, per-call metadata, caching, and tracing. Those capabilities make pilots practical. They also mean policy decisions need to be made before live CRM, finance, customer, file, or operations data is connected.

OpenAPI is the starting point, not the finish line

The OpenAPI Initiative describes the OpenAPI Specification (OAS) as a language-agnostic way to carry API information through the lifecycle. It helps consumers understand capabilities, generate clients, configure infrastructure, and create tests. OpenAPI documents can also support contract testing and security testing. So for most teams, OpenAPI is the right artifact to review before endpoints become MCP tools.

But structural validity is not the same as agent-readiness. The paper "Making OpenAPI Documentation Agent-Ready" reports on an industrial setting with 16 production APIs and roughly 600 endpoints. Those APIs were stable and widely used inside a microservice architecture. Yet early MCP-based experiments still ran into repeated failures in task planning, tool selection, and payload construction. The researchers found 2,450 documentation and REST-related smells across the endpoints.

The lesson is not that the APIs were broken. It is that APIs can be good enough for internal service teams and still be too ambiguous for agents.

Audit area	What to check	Business implication
Tool purpose	Can the model tell this operation apart from nearby operations?	Reduces the chance of the wrong tool being selected for a customer or operations task.
Inputs	Are required fields, enums, formats, limits, and examples explicit?	Prevents payload construction from depending on undocumented team knowledge.
Outputs	Are success responses structured and predictable?	Makes downstream agent steps easier to validate and less likely to misread.
Errors	Can the caller distinguish bad input, business rule failure, auth failure, and server failure?	Supports safer retries, escalation, and user-facing explanations.
Authorization	Which user, tenant, client, resource, and downstream system are involved?	Keeps agent access aligned with the real business boundary.
Scope	Can risky actions require narrower permission or fresh approval?	Limits blast radius when a workflow combines several tool calls.
Gateway	Can the pilot run through a logged, staged MCP gateway first?	Gives the team a place to test filtering, approvals, tracing, and failures before production.

A useful MCP contract audit connects documentation quality with tool behavior, authorization, and operational control.
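Some of the checks in the table can be automated as a first pass over an OpenAPI document. The sketch below inspects a single operation object that has already been parsed into a dictionary. The rules and the sample operations are assumptions chosen for illustration, and they cover only the purpose, inputs, outputs, and errors rows.

```python
def audit_operation(operation: dict) -> list[str]:
    """Return findings for one parsed OpenAPI operation object."""
    findings = []
    if not operation.get("summary"):
        findings.append("missing summary")
    if not operation.get("description"):
        findings.append("missing description")

    for param in operation.get("parameters", []):
        name = param.get("name", "<unnamed>")
        schema = param.get("schema", {})
        if not param.get("description"):
            findings.append(f"parameter '{name}': no description")
        constrained = any(key in schema for key in ("enum", "format", "pattern"))
        if schema.get("type") == "string" and not constrained and "example" not in param:
            findings.append(
                f"parameter '{name}': free-form string with no enum, format, or example"
            )

    # Status codes may be parsed as ints from YAML, so normalise to strings.
    responses = {str(code): body for code, body in operation.get("responses", {}).items()}
    for code, body in responses.items():
        if code.startswith("2") and code != "204" and not body.get("content"):
            findings.append(f"response {code}: no structured body")
    if not any(code.startswith("4") for code in responses):
        findings.append("no documented 4xx error response")
    return findings


# Hypothetical operations for illustration.
LOOSE_OPERATION = {
    "summary": "Update",
    "parameters": [{"name": "status", "in": "query", "schema": {"type": "string"}}],
    "responses": {"200": {"description": "OK"}},
}
TIGHT_OPERATION = {
    "summary": "Set ticket status",
    "description": "Move one support ticket to a new workflow status.",
    "parameters": [
        {
            "name": "status",
            "in": "query",
            "description": "Target workflow status.",
            "schema": {"type": "string", "enum": ["open", "pending", "closed"]},
        }
    ],
    "responses": {
        "200": {"description": "Updated", "content": {"application/json": {}}},
        "422": {"description": "Invalid status", "content": {"application/json": {}}},
    },
}
```

A script like this only finds gaps. Whether a description actually separates one operation from its neighbours still needs a human reading it next to the other tools.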

Authorization needs to be mapped before exposure

The MCP Authorization specification defines authorization at the transport layer for HTTP-based transports. In that model, a protected MCP server acts as a resource server, the MCP client makes protected requests on behalf of a resource owner, and the authorization server issues access tokens for use at the MCP server.

For a business system, those roles should be written down before an agent gets access. Which identity is acting? Which tenant or workspace is in scope? Which downstream API receives the request? Which operation needs fresh consent or elevation?

This becomes especially important when an MCP server proxies third-party or internal APIs. The MCP Security Best Practices warn against token passthrough, where a server accepts tokens not issued for that MCP server and forwards them downstream. The same guidance also calls out confused deputy risks, SSRF during OAuth metadata discovery, local MCP server compromise, risky authorization URL handling, and over-broad scopes.
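One part of avoiding token passthrough is refusing tokens that were issued for a different audience. The sketch below assumes the token's signature, issuer, and expiry have already been verified elsewhere and that the claims are available as a dictionary. The server URL is hypothetical.

```python
# Hypothetical identifier of this MCP server as a protected resource.
MCP_SERVER_AUDIENCE = "https://mcp.example.com"


def token_is_for_this_server(
    claims: dict, expected_audience: str = MCP_SERVER_AUDIENCE
) -> bool:
    """Check the JWT 'aud' claim, which may be a single string or a list."""
    audience = claims.get("aud")
    if isinstance(audience, str):
        return audience == expected_audience
    if isinstance(audience, (list, tuple)):
        return expected_audience in audience
    # A missing or malformed audience is treated as a rejection.
    return False
```

A token that fails this check should be rejected rather than forwarded. Calls to the downstream API would then use a separate credential obtained for that API.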

Scope design is often where an audit adds the most practical value. The security guidance recommends progressive, least-privilege scope models and highlights common mistakes such as wildcard scopes, omnibus access, and publishing the full scope catalog. In a normal integration, broad access is already risky. In an agent workflow, it is worse because several small steps may be chained together. If the first token can read everything, write everywhere, and perform administrative actions, one mistaken or compromised tool call has too much room to move.
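A least-privilege model can be expressed as a small mapping from tools to the single scope each one needs. The tool names, scope names, and the list of broad scopes below are all hypothetical. The sketch deliberately does not honour wildcard grants.

```python
# Hypothetical mapping from tool name to the one scope it requires.
TOOL_SCOPES = {
    "list_invoices": "invoices:read",
    "refund_invoice": "invoices:refund",
}

# Assumed names for omnibus scopes that should not appear in a grant.
BROAD_SCOPES = {"admin", "all", "full_access"}


def flag_broad_scopes(granted: set[str]) -> list[str]:
    """Return granted scopes that are wildcards or known omnibus scopes."""
    return sorted(s for s in granted if "*" in s or s in BROAD_SCOPES)


def may_call(tool: str, granted: set[str]) -> bool:
    """Allow a call only when the exact required scope was granted."""
    required = TOOL_SCOPES.get(tool)
    return required is not None and required in granted
```

With this shape, a token carrying only invoices:read can list invoices but cannot issue a refund, and a grant such as invoices:* is flagged for review instead of silently matching every tool.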

What to review before rollout

For GrN.dk clients, this can be a short, focused agent-readiness review rather than a long architecture programme. Start by inventorying candidate endpoints and marking each one as read-only, write, destructive, financial, identity-related, or administrative. That immediately separates low-risk discovery work from operations that need tighter consent, logging, and approval.
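The inventory step can start from a rough automatic pass that a person then corrects. The sketch below labels an endpoint from its HTTP method and from keywords in its path. The keyword lists and example paths are assumptions, and the result is only a starting point for manual review.

```python
# Assumed keyword lists. A real inventory would be tuned per API.
KEYWORDS = {
    "financial": ("invoice", "payment", "refund"),
    "identity-related": ("user", "login", "password", "token"),
    "administrative": ("admin", "settings"),
}


def classify(method: str, path: str) -> list[str]:
    """Return sorted risk labels for one endpoint."""
    labels = set()
    verb = method.upper()
    if verb in ("GET", "HEAD"):
        labels.add("read-only")
    elif verb == "DELETE":
        labels.add("destructive")
    else:
        labels.add("write")

    lowered = path.lower()
    for label, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            labels.add(label)
    return sorted(labels)
```

The method-based labels are the weakest part of this heuristic: a POST endpoint may be a pure search, and a PUT may overwrite data destructively, so every label should be confirmed by someone who knows the API.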

Next, tighten the OpenAPI material where ambiguity would affect tool choice or payload construction: summaries, descriptions, parameter names, enums, examples, response schemas, and error objects. Then translate only the smallest useful set of endpoints into MCP tools, with narrow names and descriptions. A smaller tool surface is easier to test, easier to explain, and easier to govern.

The authorization boundary should be explicit: resource owner, client, tenant, scopes, consent prompts, redirect URI handling, token audience, and elevation moments. Failure testing should happen in staging, and it should include more than happy-path calls. Test missing fields, wrong enum values, stale IDs, unauthorized tenants, denied elevation, rate limits, downstream API failures, and malformed tool outputs.
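Two of the failure cases above, missing fields and wrong enum values, can be written down as a small table of negative cases. The schema, field names, and checker below are hypothetical. The checker handles only required fields, unknown fields, and enums, so it is not a full JSON Schema validator.

```python
# Hypothetical input schema for a ticket-priority tool.
INPUT_SCHEMA = {
    "required": ["ticket_id", "priority"],
    "properties": {
        "ticket_id": {"type": "string"},
        "priority": {"type": "string", "enum": ["low", "normal", "urgent"]},
    },
}


def check_payload(payload: dict, schema: dict) -> list[str]:
    """Return problems for missing, unexpected, or out-of-enum fields."""
    problems = []
    for field in schema.get("required", []):
        if field not in payload:
            problems.append(f"missing required field: {field}")
    for field, value in payload.items():
        spec = schema.get("properties", {}).get(field)
        if spec is None:
            problems.append(f"unexpected field: {field}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"invalid value for {field}: {value!r}")
    return problems


# Each case: label, payload, and the problem the checker must report.
NEGATIVE_CASES = [
    ("missing field", {"ticket_id": "T-1"}, "missing required field: priority"),
    (
        "wrong enum",
        {"ticket_id": "T-1", "priority": "critical"},
        "invalid value for priority: 'critical'",
    ),
    (
        "unexpected field",
        {"ticket_id": "T-1", "priority": "low", "owner": "x"},
        "unexpected field: owner",
    ),
]


def run_negative_cases() -> list[str]:
    """Return the labels of cases whose expected problem was not reported."""
    return [
        label
        for label, payload, expected in NEGATIVE_CASES
        if expected not in check_payload(payload, INPUT_SCHEMA)
    ]
```

The same table format extends to the other cases in the list, such as stale IDs or unauthorized tenants, although those need a staging system to answer rather than a local check.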

The first production-facing pilot should run through a staged MCP gateway with logging, tracing, approval policies, and tool filtering enabled. This is not about slowing adoption. It is about making the API surface clear enough for agents to use and constrained enough for the business to trust.

MCP gives teams a standard way to connect models to tools. OpenAPI gives them a lifecycle artifact to improve. The contract audit sits between the two, turning existing API knowledge into a safer agent interface before it becomes a production dependency.