Blog

88% of Companies Have Already Seen AI Agent Security Failures

Jorge Ruiz — Fri, 27 Mar 2026 05:06:08 GMT

What Real-World Failures Reveal About the Hidden Risks of AI Agents

An overwhelming 88% of organizations report either confirmed or suspected AI agent security or privacy incidents within the last year.

MCP Authorization: How to Manage Permissions for AI Agents & Services

Kay James — Mon, 23 Feb 2026 19:05:56 GMT

Traditional IAM was built for humans and servers. Agentic AI introduces a third actor: the semi-autonomous agent that explores and interacts with tools dynamically. While our previous discussion on MCP Authentication focused on verifying who an agent is, that identity is useless without a robust framework for MCP Authorization to control what that agent is actually allowed to do.

MCP Authentication: The Complete Guide to Modern Credential Flow in AI Systems

Kay James — Mon, 23 Feb 2026 19:05:54 GMT

Static credentials are a liability in an autonomous world. When you hand an AI agent a "keys to the kingdom" API key, you’re granting access but also losing control. Modern architectures break when ownership and control are unclear.

How AI Changes Authentication & Authorization Models

Kay James — Fri, 13 Feb 2026 14:28:30 GMT

Static API keys and human-only passwords cannot secure a world where AI agents act autonomously. Traditional authentication and authorization models assume a human is at the keyboard, but 2026 architectures rely on machine-to-machine (M2M) intent.

Gravitee in Gartner Market Guide for AI Gateways

linus.hakansson@graviteesource.com (Linus Håkansson) — Wed, 11 Feb 2026 18:38:45 GMT

At the end of last year, Gartner released the 2025 Market Guide for AI Gateways, where Gravitee is recognized as a Representative Vendor in the category of API Management Platforms Adding AI Extensions.

State of AI Agent Security 2026 Report: When Adoption Outpaces Control

Jorge Ruiz — Wed, 04 Feb 2026 15:39:13 GMT

We recently surveyed over 900 executives and technical practitioners to understand how organizations are managing the move toward autonomous systems. Today, we are releasing the results in The State of AI Agent Security 2026 Report.

Centralize MCP Authentication with MCP Server Application Types

Kay James — Thu, 22 Jan 2026 17:09:59 GMT

Gravitee 4.10 introduces MCP Server applications

MCP servers are moving into real systems. Most teams still onboard them like experiments.

Gravitee 4.10: One Control Point to Secure & Govern AI Agents, MCP, and LLMs

Jorge Ruiz — Thu, 22 Jan 2026 16:29:40 GMT

AI agents are already wired into real systems. They call LLMs, discover tools, and take actions that used to be locked behind human workflows. That shifts the problem from “Can we build an agent?” to “Can we control what the agent can see and do?”

MCP Proxy: Unified Governance for Agents Tools

prachi.jamdade@graviteesource.com (Prachi Jamdade) — Thu, 22 Jan 2026 16:07:52 GMT

AI agents are moving fast. Governance is not.

Teams are wiring agents to MCP servers to access tools, APIs, and data. That works, until every agent talks directly to every server. At that point, you lose control. You cannot see which tools are being called. You cannot restrict access cleanly. You cannot enforce authentication consistently. And you cannot explain what happens when something goes wrong.

Gravitee 4.10 fixes that.

In LLM Proxy release blog, we discussed how agents use three paths to connect and interact with the world around them. One of the three paths is connecting to tools, normally through an MCP server. This release introduces the MCP proxy, a new AI gateway capability designed to secure, govern, and observe MCP traffic without changing how agents or MCP servers work.

What is the MCP Proxy?

An MCP proxy is a component that sits between MCP clients and MCP servers and mediates all communication between them.

The proxy understands the MCP protocol and inspects requests at the method level, including tool discovery, tool execution, and prompt access.

Because MCP is an RPC-based protocol, a generic HTTP proxy is not sufficient. An MCP proxy must parse and interpret the MCP payload to determine which operation is being invoked and apply controls based on that context.

This design creates a single control point for MCP traffic. It avoids direct, point-to-point integrations between clients and servers, and enables centralized enforcement of authentication, authorization, policy evaluation, and observability across all MCP interactions.

To make this concrete, the rest of this post uses a simple example. A hotel booking agent that helps users search hotels, view bookings, and manage reservations. The agent talks to LLMs through the LLM Proxy, and calls backend booking APIs through MCP.

Each MCP policy below prevents a real problem that shows up when agents start calling booking tools in production.

What Ships in Gravitee 4.10

Gravitee 4.10 introduces three core capabilities for MCP.

1. A new MCP proxy API type

The MCP proxy is a new API type, purpose-built for MCP servers.

It proxies upstream MCP servers, whether they are custom-built, third-party, or generated using Gravitee’s MCP Tool Server. Because it understands the MCP protocol, it can apply gateway capabilities at the MCP operation level, not just at the connection level.

That includes:

MCP Analytics: Native analytics on tool calls, prompts, and errors

MCP Proxy tracks and logs MCP-specific events such as tool calls, prompt requests, and errors. This gives teams visibility into how agents are using MCP servers in practice.

You can answer questions like:

Which tools are being called most often?
Which prompts fail or error out?
Which agents are generating the most MCP traffic?

A booking agent can search hotels, view bookings, and cancel reservations. Without visibility, teams cannot tell which tools the agent is actually using.

Analytics show which booking tools are called most often and where failures happen.

Caching: At the MCP Method Level

Some MCP operations return repeatable results, such as metadata or tool listings.

Because the proxy understands which MCP method is being invoked, the proxy layer can cache responses safely. This reduces unnecessary calls to MCP servers and improves response times for agents without changing server code. Caching happens at the MCP operation level, not at the HTTP layer.

Agents often ask for the same information, like available booking tools or hotel metadata. Without caching, every agent hits the backend for the same answers.

Caching avoids repeated calls and keeps booking tools responsive even during peak hours.

Rate limiting and retries based on MCP method behavior

MCP traffic is not predictable. Some operations are lightweight. Others are expensive or sensitive. The MCP proxy applies rate limiting and retry logic with full awareness of MCP methods. Teams can protect MCP servers from overload.

If an agent gets stuck and repeatedly calls a booking tool, it can overload the system. Rate limiting stops runaway calls before they affect real bookings. Other users can keep searching and booking without disruption.

A short outage should not break a booking flow. If a tool fails, the proxy retries automatically. The user keeps going instead of starting over.

Transform: Payload-aware transformations when needed

The MCP proxy can transform MCP requests and responses based on the invoked method.

This allows teams to adapt inputs or outputs without modifying agents or MCP servers. Transformations apply only where they make sense, because the gateway knows exactly which MCP operation is in play.

Booking tools often expose internal details that users or external agents should not see. The proxy removes that internal metadata before returning results. Agents get only what they need to complete the booking.

2. MCP ACL policy for fine-grained access control

Gravitee 4.10 adds a dedicated ACL policy for MCP proxy APIs. This policy lets teams define access rules per MCP method. That includes protocol methods such as:

tools/list
tool/call
prompts/list
resources/subscribe
And other MCP-native operations

You can decide which users or agents are allowed to discover tools, which tools they can call, and which MCP servers they can interact with at all.

Not every agent should access every booking tool. Some tools are public, like searching hotels. Others are private, like viewing or canceling bookings only for authenticated users.

ACLs ensure agents only see and call the tools they are allowed to use.

3. MCP Authorization, handled by the gateway

MCP includes a formal authorization specification. Implementing it correctly is non-trivial, especially for server developers who just want to expose tools.

Gravitee 4.10 offloads this work.

The MCP proxy is compliant with the MCP authorization specification. When an MCP client connects without an access token, the gateway handles the flow. It redirects the client to the configured authorization server, where the end user can authenticate and grant consent.

This is exactly how MCP clients expect secured servers to behave.

For developers, this means MCP servers no longer need to implement the authorization spec themselves. They delegate authentication and consent handling to the gateway, just like microservices delegate security concerns to an API gateway.

Viewing bookings requires knowing who the user is. If an agent connects without a token, the gateway handles login and consent. Booking tools only run after the user is authenticated.

How the MCP Proxy Helps Moving from Prototype to Production Faster

Gravitee 4.10 treats MCP as a first-class citizen, not an edge case.

Developers ship faster by offloading auth, consent, and access control to the gateway.
Platform teams get one place to govern MCP traffic instead of maintaining point-to-point agent integrations.
Operations teams see tool calls, failures, and retries in real time, not after incidents.
Security teams control which agents can discover and call MCP tools, down to the method level.

Start Controlling MCP Before Agents Control You!

MCP turns tools into runtime capabilities. And that power needs control.

The MCP proxy gives you visibility, access control, and standards-compliant authorization without changing how agents or servers are built. If you cannot control how agents use tools, you do not control your system.

Explore the Gravitee 4.10 release, head to the MCP proxy documentation and start proxying your MCP servers today.

Ready to control and secure your MCP servers? Don’t hold back; set up a call with one of our experts today!

LLM Proxy: One Front Door to Multiple LLM Providers

prachi.jamdade@graviteesource.com (Prachi Jamdade) — Thu, 22 Jan 2026 16:04:15 GMT

As organizations move from simple generative AI to more advanced agentic systems, their infrastructure often starts to break. The problem isn’t just technical. It’s about ownership and control. When teams deploy AI models and agents without a clear central authority, they lose the ability to audit, secure, or even shut them down. That quickly becomes a serious risk.

Gravitee 4.10 introduces the AI Gateway as a core pillar of the AI Agent Management Platform (AMP), designed to control how agents interact with the world around them. Agents follow three critical paths: talking to other agents, calling tools, and invoking LLMs. This release marks a significant evolution from our previous Agent Mesh to a comprehensive platform designed to govern the entire lifecycle of AI agents in one place.

With 4.10, Gravitee brings these three paths under one gateway. The release ships LLM Proxy and MCP Proxy, giving teams a controlled front door to LLM providers and agent tools. This builds on the A2A Proxy, introduced in 4.8, which already governs agent-to-agent communication. Together, these proxies form the AI Gateway, a single control plane to secure, govern, and observe every interaction agents make, before sprawl and risk take over.

But… Why Does LLM Access Breaks Down at Scale?

Early GenAI integrations are simple. One app. One LLM provider. One API key.

That model collapses as soon as AI becomes shared infrastructure.

Teams connect agents directly to providers. Each integration becomes point-to-point. There is no global visibility into which models are used, how often, or at what cost. Switching providers means refactoring code. Enforcing security or compliance rules for every individual team.

This is the same failure pattern APIs went through a decade ago. AI needs the same gateway discipline.

What is the LLM Proxy?

This release introduces a new LLM Proxy API type, built to sit between your AI consumers, such as agents or applications, and your LLM providers. It gives enterprises one control point for model access, security, routing, and cost management, without forcing developers to write business logic for all this.

The LLM Proxy acts as an intelligent middleware layer. It abstracts the complexity of multiple providers such as OpenAI, Gemini, Bedrock and OpenAI-compatible APIs like Ollama, Together AI, Local AI and Mistral AI into a single, unified interface. With additional providers added over time without requiring changes to consumer integrations.

To make this concrete, the rest of this post uses a simple example. A hotel booking agent that helps users search hotels, view bookings, and manage reservations. The agent talks to LLMs through the LLM Proxy, and calls backend booking APIs through MCP.
Each policy below exists to prevent a specific failure that shows up when a hotel booking agent runs in production.

What Ships in Gravitee 4.10

Gravitee 4.10 lays the foundation for enterprise-grade LLM governance with a focused feature set.

1. LLM Analytics:

Provides out-of-the-box analytics in Elasticsearch showing which models are being consumed, token usage, and associated costs, assuming cost metrics are configured. An in-app analytics dashboard will follow in the next release.

A hotel booking agent handles search queries, booking confirmations, and customer support questions. Without analytics, teams cannot see which interactions consume the most tokens or which models drive cost.

LLM analytics expose exactly where spend comes from and which agent flows are responsible.

2. Token rate limiting:

Enforces quotas based on input and output tokens per LLM invocation. When limits are reached, requests fail, protecting budgets and ensuring fair usage and service quality.

For example - Hotel search traffic spikes during peak travel periods. If one agent starts generating long responses or looping on retries, it can consume the entire token budget.

Token rate limits ensure search, booking, and support workflows all get fair access to LLM capacity.

Here, we set the limit of 10000 tokens every 5 minutes.

3. Role Based Access Control:

Controls which teams, agents, or applications can access the LLM proxy and which models they are allowed to use, enforcing consistent access policies across all LLM traffic.

Not every agent needs the same models. Customer-facing chat requires high-quality responses, while internal booking automation does not need high-quality models.

RBAC ensures each agent uses only the models it actually needs, keeping costs predictable and access controlled.

4. Provider & model routing: Automatically routes requests to the correct LLM provider and model based on consumer requests, without requiring changes to client code.

Most hotel searches tolerate lower-cost models. Booking confirmations and cancellations do not.

Model routing automatically sends critical booking steps to the most reliable model, without changing agent logic.

5. Guardrails: Prevent agents or consumers from sending unsafe, non-compliant, or policy-violating prompts to LLMs by enforcing guardrails at the gateway.

When an agent submits a prompt containing harmful, obscene, or exploitative language, the LLM Proxy detects it at runtime and rejects the request before forwarding it to the provider.

A public hotel booking interface accepts natural language input from anyone. Without guardrails, abusive or unsafe prompts reach the LLM directly.

Guardrails block these requests before they ever reach a model, protecting both users and the brand reputation.

6. API key sharing: Centralizes and abstracts provider API keys at the gateway level so consumers never embed or manage provider credentials directly.

Instead of dozens of OpenAI keys embedded across codebases, agents authenticate through a single gateway-managed API key, protected by Gravitee API keys or OAuth for an added security layer.

A hotel platform often runs multiple agents across environments. Embedding provider API keys in each agent quickly becomes unmanageable and insecure.

Centralized and shared key management keeps credentials secure, out-of-code and allows rotation without redeploying agents.

7. Transform: Automatically maps OpenAI compatible requests to provider-specific formats and transforms responses back to a consistent interface for consumers.

An agent sends an OpenAI-style request, and the gateway automatically converts it to Bedrock or Gemini format and normalizes the response back.

Agents speak a single, consistent LLM interface. But LLM providers do not. Request and response transformation lets the booking agent stay provider-agnostic while the gateway adapts traffic behind the scenes.

8. Retry: Automatically retries failed LLM requests when a provider has a temporary issue, so agents do not need to handle retries themselves

A short LLM outage should not break a hotel booking workflow. That would affect the user experience. If something fails briefly, the gateway retries automatically and the user keeps going.

9. Model governance: Defines which LLM providers and models are available within the organization, enabling controlled rollout, model approval, and easier provider switching to reduce vendor lock-in.

Booking confirmations should not change behavior overnight. Model governance lets teams test new models on search questions first, then use them for bookings only when they are proven.

Key Benefits for the Whole Organization

Engineers and developers: No provider-specific code. Easy model switches.
Platform and IT teams: Centralized access control, rate limits, and API key management.
Security teams: Full visibility into LLM usage. Enforce which models agents can access.
Data and AI teams: One entry point to multiple LLMs. Compare and change models without rewrites.
Business leaders: Clear insight into LLM usage and costs. Predictable spend and future pricing control.

Start Controlling Your LLM Traffic Today!

The LLM Proxy in Gravitee 4.10 lets teams scale AI through a single, governed entry point to multiple LLM providers, giving developers speed while restoring visibility, control, and cost clarity.

This is not about adding another AI abstraction. It is about applying proven gateway principles to the most sensitive part of the AI stack.

As Gartner highlights, “34% of top-performing organizations in building AI-powered solutions use AI gateways compared to just 8% of lower performers.” in Gartner’s 2025 AI in Software Engineering Survey.

If AI is becoming core infrastructure, it deserves infrastructure-grade controls. Gravitee 4.10 delivers exactly that.

Want to start managing your LLMs? Don’t hold back; set up a call with one of our experts today to see how Gravitee's AI Gateway help you achieve this.