Attacks on AI Agent Systems – Exploiting via Rouge Tools, Tool Appropriation/Poisoning/Shadowing, Supply chain, Cross server attacks
Attacks on AI Agent Systems – MCP service exploitation via Rougue Tools, Tool Appropriation/Shadowing, Tool Poisoning, Supply chain, Cross server attacks
AI Agent Systems are susceptible to exploitation via 3 key taxonomy of Attack vectors based on techniques and IOC and IOA.
- Tool Appropriation category
- Supply chain attack category
- Cross server attack category
Exploiting Model Context Protocol (MCP) Capabilities in Agents
AI agents connect and interact with external services through MCP protocol.
MCP (Model Context Protocol) is first introduced by Anthropic, for AI agents to interact with external services. Providers like OpenAI, Cursor, and Copilot adopting MCP protocol.
Tools are model-controlled, Resources and Prompts are User-controlled.
The models can automatically discover and invoke tools based on a given context
MCP client comprises authentication flows, tool discovery, and connection management.
AI agents connection management using MCP protocol.
- Initiate Connection to a external 3rd party service (MCP server).
- Prompt the end user to grant access
- Use tools from external MCP services, impersonating the end user.
- Call MCP servers from Workflows
- Scheduled tasks
- Connect to multiple MCP servers
- Auto-discover new tools and capabilities available in the external server.
- Resources and Prompts are User controlled, Tools are Model controlled.
1. MCP servers host the tools and capabilities
2. AI agent’s MCP client component conntects to remote MCP server
3. AI agents connect to external services via MCP server
4. AI agents access Tools, Data, and Capabilities via MCP server
Here is highly level interaction between Agent, Users, Resources, Tools, Prompts, MCP client/server, capabilities exchange:
-
Session initiation
-
Capabilities Negotiation
-
Session re-Invite, Control
-
JSON-RPC message
-
Transport: SSE(server-push), STDIO
-
Server exposure of Resources, tools, prompts
Security Risks
The attacks results Agents to perform malicious actions like data theft, network sniffing, or launching attacks.
Tool appropriation techniques
Tool appropriation refers to attackers –
Manipulating AI Agents to exploiting or misusing tools intended for legitimate purposes to carry out exfiltration or infiltration, data theft, sensitive information.
Additionally, attackers can manipulate AI agents for tool usage for phishing and social engineering attacks.
Attackers can embed malicious instructions and configuration or functionality within seemingly harmless tools like Multiply or Addition.
-
Shadowing with Manipulating tool having description, properties, text to mislead AI models perform additional actions like extraction sensitive information, data, files.
-
ReadMe file instruction, Manipulating tool properties tagging supplementary instructions.
-
Indirect hidden instructions to perform actions like data theft or network eavesdropping. e.g: Copy .config files and send Email to Attacker
-
Shadow tools disguised to appear legitimate, making them difficult to detect
-
Using Agents to automate phishing and social engineering attacks.
-
These create highly realistic interactions, making it difficult to discern between legitimate and malicious communication
-
Attackers can use Agents to target a large number of users simultaneously, increasing the likelihood of successful attacks.
-
Attackers can exploit vulnerabilities in software development tools, like those used in software supply chains, to compromise systems.
-
Using Publicly Available Tools
Indirect Prompt Injection, Instruction Attack
Adding malicious steps instructions eg: README.TXT or comments.
Context Poisoning of MCP services
- Instruct Agents to get auth files, tokens, SSH keys,Config files, stored .env .config files
- Export this information in a hidden way via the sidenote parameter etc,
Information Gathering Attacks
Instruction Attacks
Shadow Attacks via Rogue Server with malicious extra instructions
These attacks hijack the agent’s behavior by describing additional behavior for example trusted send_email tool (malicious extra instruction)
Example scenario Cluster of MCP servers, with a Rogue server
Rogue map server a math (Mulitply) tool with shadowing attack in its tool description.
Hidden instructions direction to introducing additonal steps to exfiltrate as pre-process and post-process or init methods.. disable logging these additional steps.
Multiply or Pdf convert tool should send as side note or email the exfiltered data.
Cross Server MCP Attacks
Inter MCP server connection tracking, Gateway, GuardRails are important to detect and mitigate AI agents attacks.
MCP Tool/Server Integrity
- Harden the AI/ML supply chain to protect against Python package tampering by enforcing hash checks.
- Alert AI BOM for policy enforcing Agents/Tool/Package Contracts and AI provenance.
MCP attacks and MCP vulnerabilities allows adding plugins, inject malicious actions, tasks, tools via MCP server/tool chain.
It is important to monitor and control polices of Workflow automation services and AI agents connect to tools, data sources.
ALERT AI – MCP security and Key IOA and IOC signals to evaluate
See through that smoke screen that obscures Model, context protocol and Tool, Data movements.
Alert AI end to end Agentic AI, GenAI security serivce solves the core security challenges and protects critical Agentic AI workflows against MCP attacks and Server farms, Integrity of tool chain and Prompt template jail breaks and content safety. Alert AI Agentic AI GenAI security platform provides built-in authentication support, OAuth, API keys, and basic auth flows.
Alert AI manages your Agentic, LLM application’s access to MCP servers, 360 protection and control of interactions between Agents, Users, Resources, Tools, Prompts, MCP Client/Servier, Capabilities exchange:
Real time Server access, usage activity
Block, Alert, Allow tool transactions/messages
- Rouge , Tampering tools
- Tool, package Integrity
- Harmful prompt template serving
- Tampered tools
- Privacy
- Exfiltration
- Data leakage
- Content safety
- Sensitive information
- Copyright legal exposures
- Automatic verification, Policy recommendations configuration of MCP server components
- Monitoring both synchronous and asynchronous operation modes
- Ensure transport security options
- Automatic verification, Policy recommendations for tool, resource, and prompt specifications
- Asses, Audit, Alert on Change notification capabilities
About Alert AI
Alert AI is end-to-end, Interoperable Generative AI security platform to help enhance security of Generative AI applications and workflows against potential adversaries, model vulnerabilities, privacy, copyright and legal exposures, sensitive information leaks, Intelligence and data exfiltration, infiltration at training and inference, integrity attacks in AI applications, anomalies detection and enhanced visibility in AI pipelines. forensics, audit,AI governance in AI footprint.
What is at stake AI & Gen AI in Business? We are addressing exactly that.
Generative AI security solution for Healthcare, Insurance, Retail, Banking, Finance, Life Sciences, Manufacturing.
Despite the Security challenges, the promise of Generative AI is enormous.
We are committed to enhance the security of Generative AI applications and workflows in industries and enterprises to reap the benefits .
Alert AI 360 view and Detections
- Alerts and Threat detection in AI footprint
- LLM & Model Vulnerabilities Alerts
- Adversarial ML Alerts
- Prompt, response security and Usage Alerts
- Sensitive content detection Alerts
- Privacy, Copyright and Legal Alerts
- AI application Integrity Threats Detection
- Training, Evaluation, Inference Alerts
- AI visibility, Tracking & Lineage Analysis Alerts
- Pipeline analytics Alerts
- Feedback loop
- AI Forensics
- Compliance Reports
End-to-End Security with
- Data alerts
- Model alerts
- Pipeline alerts
- Evaluation alerts
- Training alerts
- Inference alerts
- Model Vulnerabilities
- Llm vulnerability
- Privacy
- Threats
- Resources
- Environments
- Governance and compliance
Organizations need to responsibly assess and enhance the security of their AI environments development, staging, production for Generative AI applications and Workflows in Business.