Attacks on AI Agent Systems - Exploiting via Rouge Tools, Tool Appropriation/Poisoning/Shadowing, Supply chain, Cross server attacks

Attacks on AI Agent Systems – MCP service exploitation via Rougue Tools, Tool Appropriation/Shadowing, Tool Poisoning, Supply chain, Cross server attacks

AI Agent Systems are susceptible to exploitation via 3 key taxonomy of Attack vectors based on techniques and IOC and IOA.

Tool Appropriation category
Supply chain attack category
Cross server attack category

Exploiting Model Context Protocol (MCP) Capabilities in Agents

AI agents connect and interact with external services through MCP protocol.

MCP (Model Context Protocol) is first introduced by Anthropic, for AI agents to interact with external services. Providers like OpenAI, Cursor, and Copilot adopting MCP protocol.

Tools are model-controlled, Resources and Prompts are User-controlled.

The models can automatically discover and invoke tools based on a given context

MCP client comprises authentication flows, tool discovery, and connection management.

AI agents connection management using MCP protocol.

Initiate Connection to a external 3rd party service (MCP server).
Prompt the end user to grant access
Use tools from external MCP services, impersonating the end user.
Call MCP servers from Workflows
Scheduled tasks
Connect to multiple MCP servers
Auto-discover new tools and capabilities available in the external server.
Resources and Prompts are User controlled, Tools are Model controlled.

Agentic AI Security, MCP Security Alert AI

1. MCP servers host the tools and capabilities

2. AI agent’s MCP client component conntects to remote MCP server

3. AI agents connect to external services via MCP server

4. AI agents access Tools, Data, and Capabilities via MCP server

Here is highly level interaction between Agent, Users, Resources, Tools, Prompts, MCP client/server, capabilities exchange:

Session initiation
Capabilities Negotiation
Session re-Invite, Control
JSON-RPC message
Transport: SSE(server-push), STDIO
Server exposure of Resources, tools, prompts

Security Risks

The attacks results Agents to perform malicious actions like data theft, network sniffing, or launching attacks.

Tool appropriation techniques

Tool appropriation refers to attackers –
Manipulating AI Agents to exploiting or misusing tools intended for legitimate purposes to carry out exfiltration or infiltration, data theft, sensitive information.

Additionally, attackers can manipulate AI agents for tool usage for phishing and social engineering attacks.

Attackers can embed malicious instructions and configuration or functionality within seemingly harmless tools like Multiply or Addition.

Shadowing with Manipulating tool having description, properties, text to mislead AI models perform additional actions like extraction sensitive information, data, files.
ReadMe file instruction, Manipulating tool properties tagging supplementary instructions.
Indirect hidden instructions to perform actions like data theft or network eavesdropping. e.g: Copy .config files and send Email to Attacker
Shadow tools disguised to appear legitimate, making them difficult to detect
Using Agents to automate phishing and social engineering attacks.
These create highly realistic interactions, making it difficult to discern between legitimate and malicious communication
Attackers can use Agents to target a large number of users simultaneously, increasing the likelihood of successful attacks.
Attackers can exploit vulnerabilities in software development tools, like those used in software supply chains, to compromise systems.
Using Publicly Available Tools

Indirect Prompt Injection, Instruction Attack
Adding malicious steps instructions eg: README.TXT or comments.

Context Poisoning of MCP services

Instruct Agents to get auth files, tokens, SSH keys,Config files, stored .env .config files
Export this information in a hidden way via the sidenote parameter etc,

Information Gathering Attacks

Instruction Attacks

Shadow Attacks via Rogue Server with malicious extra instructions

These attacks hijack the agent’s behavior by describing additional behavior for example trusted send_email tool (malicious extra instruction)
Example scenario Cluster of MCP servers, with a Rogue server
Rogue map server a math (Mulitply) tool with shadowing attack in its tool description.
Hidden instructions direction to introducing additonal steps to exfiltrate as pre-process and post-process or init methods.. disable logging these additional steps.
Multiply or Pdf convert tool should send as side note or email the exfiltered data.

Cross Server MCP Attacks

Inter MCP server connection tracking, Gateway, GuardRails are important to detect and mitigate AI agents attacks.

MCP Tool/Server Integrity

Harden the AI/ML supply chain to protect against Python package tampering by enforcing hash checks.
Alert AI BOM for policy enforcing Agents/Tool/Package Contracts and AI provenance.

MCP attacks and MCP vulnerabilities allows adding plugins, inject malicious actions, tasks, tools via MCP server/tool chain.

It is important to monitor and control polices of Workflow automation services and AI agents connect to tools, data sources.

ALERT AI – MCP security and Key IOA and IOC signals to evaluate

AlertAI Agentic AI security, MCP security controls

See through that smoke screen that obscures Model, context protocol and Tool, Data movements.

Alert AI end to end Agentic AI, GenAI security serivce solves the core security challenges and protects critical Agentic AI workflows against MCP attacks and Server farms, Integrity of tool chain and Prompt template jail breaks and content safety. Alert AI Agentic AI GenAI security platform provides built-in authentication support, OAuth, API keys, and basic auth flows.

Alert AI manages your Agentic, LLM application’s access to MCP servers, 360 protection and control of interactions between Agents, Users, Resources, Tools, Prompts, MCP Client/Servier, Capabilities exchange:

Real time Server access, usage activity

Block, Alert, Allow tool transactions/messages

Rouge , Tampering tools
Tool, package Integrity
Harmful prompt template serving
Tampered tools
Privacy
Exfiltration
Data leakage
Content safety
Sensitive information
Copyright legal exposures
Automatic verification, Policy recommendations configuration of MCP server components
Monitoring both synchronous and asynchronous operation modes
Ensure transport security options
Automatic verification, Policy recommendations for tool, resource, and prompt specifications
Asses, Audit, Alert on Change notification capabilities

Agentic AI security Model Context Protocol (MCP) attacks MCP vulnerabilities MCP Security

About Alert AI

Alert AI is end-to-end, Interoperable Generative AI security platform to help enhance security of Generative AI applications and workflows against potential adversaries, model vulnerabilities, privacy, copyright and legal exposures, sensitive information leaks, Intelligence and data exfiltration, infiltration at training and inference, integrity attacks in AI applications, anomalies detection and enhanced visibility in AI pipelines. forensics, audit,AI governance in AI footprint.

What is at stake AI & Gen AI in Business? We are addressing exactly that.

Generative AI security solution for Healthcare, Insurance, Retail, Banking, Finance, Life Sciences, Manufacturing.

Despite the Security challenges, the promise of Generative AI is enormous.

We are committed to enhance the security of Generative AI applications and workflows in industries and enterprises to reap the benefits .

Alert AI 360 view and Detections

Alerts and Threat detection in AI footprint
LLM & Model Vulnerabilities Alerts
Adversarial ML Alerts
Prompt, response security and Usage Alerts
Sensitive content detection Alerts
Privacy, Copyright and Legal Alerts
AI application Integrity Threats Detection
Training, Evaluation, Inference Alerts
AI visibility, Tracking & Lineage Analysis Alerts
Pipeline analytics Alerts
Feedback loop
AI Forensics
Compliance Reports

End-to-End Security with

Data alerts
Model alerts
Pipeline alerts
Evaluation alerts
Training alerts
Inference alerts
Model Vulnerabilities
Llm vulnerability
Privacy
Threats
Resources
Environments
Governance and compliance

Organizations need to responsibly assess and enhance the security of their AI environments development, staging, production for Generative AI applications and Workflows in Business.