Alert AI: LLM Applications and AI Agents Security Alerts
The Alert AI “Secure AI anywhere” security gateway uses a comprehensive, multi-layered defense-in-depth and monitoring strategy to generate 500+ unique alerts for LLM and AI agent applications.
Alert AI: Detections
Lists of Security Alerts
Data and model security Alerts
- Unauthorized access attempt to AI model API.
- High volume of requests to model API from a single, unknown IP.
- Unusual model access patterns detected (e.g., from a new geographic location).
- Data poisoning attempt detected in model training pipeline.
- Unexpected changes in model weights or architecture detected.
- Unauthorized deployment of new model version.
- Attempted model inversion attack identified.
- Suspicious download of model artifacts from internal network.
- Source code for proprietary models accessed by unauthorized user.
- Model theft attempt using reverse engineering identified.
- Model inference endpoint shows signs of enumeration attack.
- Adversarial inputs designed to cause model misclassification detected.
- Unusual increase in model retraining jobs.
- Anomaly detected in model validation metrics during training.
- Sensitive data exposure in model logs or outputs.
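Many of these detections are threshold checks over a sliding time window. As a minimal sketch (not Alert AI's implementation), the snippet below counts model-API requests per source IP and raises an alert when an IP outside a known allowlist exceeds a configurable rate; the window length, request limit, and allowlist values are illustrative assumptions.

```python
from collections import defaultdict, deque
import time

# Illustrative values; real deployments would tune these per endpoint.
WINDOW_SECONDS = 60
MAX_REQUESTS_PER_IP = 500
KNOWN_IPS = {"10.0.0.5"}  # hypothetical allowlist of known client IPs

_requests = defaultdict(deque)  # source_ip -> deque of request timestamps

def record_request(source_ip: str, now: float | None = None) -> dict | None:
    """Record one model-API request and return an alert dict if the
    per-IP request rate exceeds the threshold within the sliding window."""
    now = time.time() if now is None else now
    window = _requests[source_ip]
    window.append(now)

    # Drop timestamps that have fallen out of the window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()

    if source_ip not in KNOWN_IPS and len(window) > MAX_REQUESTS_PER_IP:
        return {
            "alert": "high_request_volume_from_unknown_ip",
            "source_ip": source_ip,
            "requests_in_window": len(window),
            "window_seconds": WINDOW_SECONDS,
        }
    return None
```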
Inference and generation Alerts
- Unusually high latency for inference requests from specific user.
- Sudden spike in inference error rates.
- Output generation rate drops below expected threshold.
- High resource consumption (GPU/TPU) for a specific inference endpoint.
- Significant deviation in inference output patterns.
- High rate of API key rotation for inference services.
- Suspicious inference requests originating from outside expected network ranges.
- Inference request payload contains obfuscated or malicious code.
- Model produces an unusually high volume of rejected or filtered responses.
- Unexpected changes to model configuration file in production.
- Abnormal inference traffic originating from a known bad actor IP.
- Inference service crashes or fails to respond.
- Configuration of a production model changed without proper change control.
- Inference endpoint logs show failed authentication attempts.
- Output of a deployed model does not match expected baseline.
AI development lifecycle Alerts
- PII or sensitive data detected in a new training dataset.
- Changes to production AI pipeline (CI/CD) by an unauthorized user.
- Mismatch between model lineage metadata and deployed artifact.
- Compromised data source used for retraining a production model.
- Access to data labeling service from an unapproved IP.
- Untrusted third-party component introduced into the AI supply chain.
- Security vulnerability scanner identifies weakness in AI application.
- Unauthorized changes to model training scripts.
- Data used for model evaluation differs significantly from training data.
- AI artifact repository accessed by an unapproved user.
- New AI vulnerability (e.g., prompt injection method) detected by threat intelligence feed.
- Failure to comply with data privacy regulations (e.g., HIPAA) regarding AI data.
- High-privilege access granted to AI infrastructure by an unauthorized process.
- Model serving infrastructure scaled up or down outside of normal patterns.
- Automated vulnerability scanner detects an AI-specific vulnerability.
LLM and AI agent application security alerts
Prompt injection and manipulation
- User input contains classic jailbreaking phrases.
- Indirect prompt injection detected via data from an integrated service.
- Evasion of safety filters using complex, multipart prompts.
- High frequency of failed prompt injection attempts.
- Agent’s output contains unauthorized system commands or database queries.
- User attempts to manipulate agent into revealing confidential information.
- Output from an agent includes sensitive data from a previous user’s interaction.
- Large-scale campaign of malicious prompts detected.
- Agent logs show repeated attempts to override initial instructions.
- Input prompt contains obfuscated, base64-encoded text.
- Multiple users submit identical, suspicious prompts.
- High entropy detected in prompt text from a single user.
- User attempts to trick the agent into performing social engineering.
- Prompt content flagged by toxicity classifier in an unexpected context.
- A successful prompt injection resulting in an unwanted action is detected.
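Prompt-injection and jailbreak detection in production typically combines classifiers, curated signatures, and semantic checks. The following is a minimal heuristic sketch, assuming a hand-maintained phrase list and a simple base64 check for obfuscated payloads; the phrase list and regex are illustrative only.

```python
import base64
import re

# Illustrative phrase list; production guardrails combine classifiers,
# embeddings, and curated signatures rather than keywords alone.
JAILBREAK_PHRASES = [
    "ignore all previous instructions",
    "ignore the above instructions",
    "pretend you have no restrictions",
    "you are now operating without guardrails",
]
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")

def screen_prompt(prompt: str) -> list[str]:
    """Return a list of alert reasons for a user prompt (empty if clean)."""
    reasons = []
    lowered = prompt.lower()
    for phrase in JAILBREAK_PHRASES:
        if phrase in lowered:
            reasons.append(f"jailbreak_phrase:{phrase}")
    for run in BASE64_RUN.findall(prompt):
        try:
            base64.b64decode(run, validate=True)
            reasons.append("obfuscated_base64_payload")
        except Exception:
            pass  # not valid base64, ignore
    return reasons
```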
Plugin and tool security
- Unauthorized access to a plugin API by an LLM agent.
- Agent attempts to call a function or tool outside its specified permissions.
- Plugin execution fails due to corrupted or malicious input from agent.
- Agent performs excessive number of calls to a specific, non-critical plugin.
- An agent attempts to access a plugin that it is not explicitly configured to use.
- Outgoing request from a plugin contains sensitive user data provided by the agent.
- A compromised plugin is identified in the application’s supply chain.
- Plugin code is modified in a way that allows for privilege escalation.
- Agent calls a plugin with parameters that suggest an OS command injection attempt.
- User exploits a vulnerable plugin via an agent to access external resources.
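A common way to surface these plugin alerts is to authorize every tool call against a per-agent allowlist before execution. The sketch below assumes hypothetical agent and tool names and a crude check for shell-metacharacter injection in string parameters; it is illustrative, not Alert AI's enforcement logic.

```python
# Hypothetical per-agent tool permissions; names are illustrative only.
AGENT_TOOL_PERMISSIONS = {
    "support_agent": {"search_kb", "create_ticket"},
    "billing_agent": {"lookup_invoice"},
}

def authorize_tool_call(agent_id: str, tool_name: str, params: dict) -> dict | None:
    """Return an alert dict if the call violates the agent's tool allowlist
    or looks like command injection; return None if the call is permitted."""
    allowed = AGENT_TOOL_PERMISSIONS.get(agent_id, set())
    if tool_name not in allowed:
        return {
            "alert": "unauthorized_tool_call",
            "agent_id": agent_id,
            "tool": tool_name,
        }
    # Crude check for OS command injection in string parameters.
    suspicious = [";", "&&", "|", "$(", "`"]
    for key, value in params.items():
        if isinstance(value, str) and any(tok in value for tok in suspicious):
            return {
                "alert": "suspicious_tool_parameters",
                "agent_id": agent_id,
                "tool": tool_name,
                "parameter": key,
            }
    return None
```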
Operational and agent-specific threats
- Agent exhibits excessive agency, making decisions without approval.
- High number of failed tool calls by an agent.
- Agent logs show unexpected execution paths or chain of thought.
- An agent’s response indicates a misunderstanding of a security policy.
- Agent begins generating untrusted or fabricated content (hallucination).
- Agent access token used from an unusual IP address.
- Configuration change to an agent’s privilege scope.
- Agent attempts to access data outside its designated context window.
- Agent shows signs of behavioral drift and degradation over time.
- PII detected in an agent’s internal memory or logs.
- High frequency of agent restarts or crashes.
- Agent attempts to perform an action explicitly disallowed by guardrails.
- Log of an agent’s internal monologue reveals sensitive information.
- Discrepancy between agent’s final action and its stated intent.
- Agent’s decision-making logic is circumvented via a clever prompt.
Performance alerts
- Average request latency exceeds acceptable SLO.
- Peak request latency increases by more than 25% over baseline.
- Sudden drop in throughput (requests per second).
- High p99 latency for a specific LLM model.
- Increased latency for API calls to third-party services used by agents.
- Substantial increase in token generation time.
- Time-to-first-token exceeds threshold.
- Latency for specific user groups or geographic regions spikes.
- High memory usage correlates with increased latency.
- Latency increases with an increase in request complexity.
- Batch processing time for LLM jobs increases significantly.
- Disk I/O contention detected on LLM serving infrastructure.
- Unexplained decrease in throughput despite constant load.
- Latency spikes for requests during garbage collection.
- Latency for requests containing specific keywords or data types increases.
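Latency alerts of this kind reduce to comparing rolling percentiles and averages against SLOs and baselines. A minimal sketch, with illustrative SLO values (2 s average, 5 s p99, 25% peak-over-baseline):

```python
import statistics

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile over a list of latency samples (seconds)."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[rank]

def check_latency(samples: list[float], slo_avg_s: float = 2.0,
                  slo_p99_s: float = 5.0,
                  baseline_peak_s: float | None = None) -> list[tuple[str, float]]:
    """Return alerts when average or p99 latency breaches illustrative SLOs,
    or when peak latency rises more than 25% above a supplied baseline."""
    alerts = []
    avg = statistics.fmean(samples)
    p99 = percentile(samples, 99)
    if avg > slo_avg_s:
        alerts.append(("avg_latency_exceeds_slo", avg))
    if p99 > slo_p99_s:
        alerts.append(("p99_latency_exceeds_slo", p99))
    if baseline_peak_s is not None and max(samples) > 1.25 * baseline_peak_s:
        alerts.append(("peak_latency_above_baseline", max(samples)))
    return alerts
```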
Resource utilization
- GPU utilization exceeds predefined threshold for sustained period.
- Memory consumption for an LLM process exceeds a safe limit.
- Sudden spike in GPU temperature.
- High CPU usage on LLM serving nodes.
- Network bandwidth utilization for model serving is abnormally high.
- Inefficient use of GPU memory detected (e.g., low batch size).
- Disk space usage on serving hosts is critically low.
- I/O bottlenecks detected during model loading or data retrieval.
- High resource consumption by a non-model process on a serving node.
- Resource exhaustion error logs from the LLM serving stack.
- Memory leak detected in LLM application container.
- Unused GPUs are not being scaled down.
- Sudden decrease in GPU utilization suggesting a stalled process.
- Overprovisioning of cloud resources for LLM inference.
- Underutilization of resources for critical LLM tasks.
AI Ops and LLM Ops monitoring alerts: drift and degradation
- Data drift detected in incoming production data, causing a shift from baseline.
- Concept drift identified, where the relationship between inputs and outputs has changed.
- Output distribution drift detected in LLM responses.
- A/B test shows degraded performance in new model version.
- Accuracy of a downstream classification model fed by LLM output drops significantly.
- Hallucination rate of LLM increases over time.
- Coherence score of LLM responses drops below threshold.
- PII detection rate in LLM outputs changes unexpectedly.
- Sentiment of LLM responses shows a negative drift.
- A statistically significant difference is found between current and historical data distributions.
- Anomaly detected in the evaluation metrics of a canary deployment.
- High rate of human overrides in a human-in-the-loop system.
- User satisfaction scores for an LLM-powered feature decline.
- LLM’s internal representation (embeddings) drift over time.
- Performance benchmark tests detect an LLM regression.
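Drift alerts like these are usually backed by a statistical distance between a baseline sample and the current production sample. As one hedged example, the sketch below uses the Population Stability Index (PSI) over equal-width buckets, with the common rule of thumb that PSI above roughly 0.2 indicates meaningful drift; the bucket count and threshold are tunable assumptions.

```python
import math

def population_stability_index(baseline: list[float], current: list[float],
                               bins: int = 10) -> float:
    """Population Stability Index between a baseline and a current sample.
    PSI above ~0.2 is commonly treated as significant drift."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0

    def bucket_fractions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = min(bins - 1, max(0, int((v - lo) / width)))
            counts[idx] += 1
        total = len(values)
        # Small floor avoids log-of-zero for empty buckets.
        return [max(c / total, 1e-6) for c in counts]

    base_frac = bucket_fractions(baseline)
    curr_frac = bucket_fractions(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(base_frac, curr_frac))

def data_drift_alert(baseline: list[float], current: list[float],
                     threshold: float = 0.2) -> dict | None:
    psi = population_stability_index(baseline, current)
    if psi > threshold:
        return {"alert": "data_drift_detected", "psi": round(psi, 3)}
    return None
```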
System and infrastructure health
- LLM application container restart count exceeds threshold.
- Pods for LLM serving are not ready or are in an error state.
- Kubernetes cluster where LLM is deployed is running low on resources.
- API gateway for LLM requests reports an increase in 5xx errors.
- Data pipeline for LLM input data shows back pressure.
- Failure in data collection agent for LLM monitoring.
- Model registry service is unreachable.
- High number of failed deployments for LLM services.
- Logging service for LLM applications experiences high ingestion latency.
- Out-of-memory error on a serving node during peak load.
- Latency for a specific microservice in the LLM chain increases.
- A sudden change in cloud provider pricing for LLM services.
- Error rate for an LLM-related cloud function increases.
- Discrepancy between logs collected and expected volume.
- Security group for LLM infrastructure is misconfigured.
Operational efficiency and cost
- Token usage for LLM services exceeds the budget threshold.
- Cost of running a specific LLM or agent increases unexpectedly.
- Throughput is low relative to the provisioned resources.
- High number of redundant or repeated LLM calls.
- Prompt engineering changes increase token usage without performance gains.
- Usage of a more expensive LLM model exceeds normal usage patterns.
- Low cache hit rate for LLM responses.
- High volume of failed requests resulting in unnecessary billing.
- Inefficient prompt construction leading to excessive token consumption.
- Cost of third-party API calls from an agent spikes.
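Cost alerts typically follow the same pattern: accumulate usage per period and compare it against budget thresholds. A minimal sketch with illustrative daily and per-request token limits (real values would come from configuration):

```python
from collections import defaultdict

# Illustrative budgets; a real deployment would load these from configuration.
DAILY_TOKEN_BUDGET = 5_000_000
PER_REQUEST_TOKEN_LIMIT = 8_000

_daily_usage = defaultdict(int)  # date string -> tokens consumed

def record_token_usage(date: str, user_id: str, tokens: int) -> list[dict]:
    """Accumulate token usage and return budget alerts, if any."""
    alerts = []
    _daily_usage[date] += tokens
    if tokens > PER_REQUEST_TOKEN_LIMIT:
        alerts.append({"alert": "token_overspending_single_request",
                       "user_id": user_id, "tokens": tokens})
    used = _daily_usage[date]
    if used > DAILY_TOKEN_BUDGET:
        alerts.append({"alert": "daily_token_budget_exceeded",
                       "tokens_used": used, "budget": DAILY_TOKEN_BUDGET})
    elif used > 0.75 * DAILY_TOKEN_BUDGET:
        alerts.append({"alert": "daily_token_budget_warning",
                       "tokens_used": used, "budget": DAILY_TOKEN_BUDGET})
    return alerts
```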
Prompt and input vulnerabilities
- Prompt Injection Detected: An input is flagged by a security guardrail as a deliberate attempt to manipulate the agent’s instructions.
- Privilege Escalation Attempt: A user prompt attempts to gain unauthorized access or elevate permissions.
- PII Detected in Input: User prompt contains Personally Identifiable Information that needs to be redacted before processing.
- PII Leakage in Output: Agent’s response contains sensitive PII that was not part of the original prompt.
- Confidential Data Exfiltration Attempt: A prompt attempts to trick the agent into revealing internal data or system information.
- Jailbreak Attempt: User prompts are attempting to bypass safety filters and constraints.
- Harmful Content Generation: Agent output contains content that violates safety policies (e.g., hate speech, dangerous advice).
- Denial-of-Service (DoS) via Prompt: A high volume of complex or resource-intensive prompts from a single source.
- Unusual User Behavior (Potential Attack): A user suddenly changes their query patterns or query frequency.
- Unauthorized API Call: Agent attempts to use an external tool or API without proper authorization.
- Malicious File Upload: File upload functionality is used to inject malicious content or commands.
- Encoding Obfuscation: A prompt uses complex encoding to hide malicious intent.
- Multi-lingual Obfuscation: The prompt switches between languages to bypass filters.
- Excessive Tool Use: The agent makes an unusually high number of calls to a specific tool for a single request.
- SQL/Code Injection Attempt: A prompt includes malicious SQL or code snippets aimed at back-end systems.
- Indirect Prompt Injection (Stored): Malicious instructions are embedded in external documents consumed by the agent.
- Model Data Extraction Attempt: A prompt tries to extract the agent’s internal instructions or system prompts.
- Security Vulnerability Scan Detected: Automated probes from security scanners are identified in the prompt logs.
- User Access Anomaly: A user with minimal privileges exhibits behavior associated with highly privileged accounts.
- Credential Exposure: An agent-generated response inadvertently contains API keys or other credentials.
- Violation of Data Privacy Policies: Agent stores or processes data in a way that violates defined privacy rules.
- Policy Violation Detected: Agent generates a response that violates a pre-defined policy.
- Unusual Tool Call Parameters: Agent uses an external tool with strange or malformed parameters.
- Agent Self-Modification Attempt: An advanced agent attempts to rewrite its own system prompts or instructions.
- Sensitive Information from Source Document (RAG): The agent retrieves sensitive data from a knowledge base that should have been filtered.
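The PII alerts in this group hinge on detection plus verification that redaction actually succeeded. The sketch below uses a few illustrative regex patterns as stand-ins for a production detector, which would normally combine regexes, checksum validation, and NER models.

```python
import re

# Illustrative patterns only; not an exhaustive PII taxonomy.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def detect_pii(text: str) -> list[str]:
    """Return the PII categories found in a prompt or response."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]

def redact_pii(text: str) -> tuple[str, list[str]]:
    """Redact detected PII and report which categories were found, so both
    'PII detected in input' and 'redaction failure' alerts can fire."""
    found = detect_pii(text)
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{name.upper()}]", text)
    if detect_pii(text):  # anything still matching indicates redaction failure
        found.append("pii_redaction_failure")
    return text, found
```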
Cost and Operations (25 alerts)
- Daily/Weekly Cost Exceeded: The application’s total token or API cost exceeds a predefined budget for a given period.
- Cost Spike Detected: Sudden, anomalous spike in token usage or API costs.
- High Cost Per User: An individual user or user group is generating a disproportionately high amount of cost.
- Expensive Model Usage: Frequent use of a more expensive LLM model when a cheaper one would suffice.
- Inefficient Prompting: Identification of prompt patterns that use an unnecessarily high number of tokens for a task.
- Token Overspending (Single Request): A single request consumes an unusually large number of tokens.
- Billing API Failure: Failure to retrieve or process billing data from the LLM provider.
- Token Rate Limit Exceeded: The application hits its token or request rate limit with the LLM provider API.
- Provider API Latency Increase: Observed latency for the LLM provider’s API is increasing.
- API Authentication Failure: The application’s API key is failing to authenticate with the LLM provider.
- Resource Exhaustion Error: The LLM provider returns an error indicating that resource limits have been reached.
- Caching Miss Rate Increase: Decrease in cache hit rate for frequent or identical requests.
- High Rate of Duplicate Requests: Agent is receiving and processing a high volume of identical requests.
- Load Balancer Error: Failure in the load balancing of requests to different LLM agents or providers.
- High Wait Time in Queue: Requests are spending an excessive amount of time in the processing queue.
- High Infrastructure Cost: The underlying infrastructure (e.g., GPU servers) for self-hosted models shows unusually high costs.
- New Model Version Cost Increase: The latest version of the model is significantly more expensive without a proportional performance increase.
- Token Cost Per Conversation Spike: The average cost per user conversation thread rises above a benchmark.
- Cost Discrepancy: Discrepancy between the expected cost and the actual billed cost from the provider.
- High Percentage of Zero-Value Tokens: Model generates a high number of filler tokens that don’t add value.
- Tool Usage Cost Spike: A specific external tool API call cost increases unexpectedly.
- Data Ingestion Cost Spike (RAG): The cost of ingesting and embedding data for a RAG system increases significantly.
- Memory Leak Detected: Agent process’s memory usage grows steadily over time, indicating a memory leak.
- Configuration Drift: The agent’s configuration parameters have been changed in an unexpected way.
- Model Obsolescence: The LLM model in use has been deprecated or is no longer the recommended version.
Prompt and input security
- Prompt injection detected: A user’s prompt is flagged as an attempt to hijack the agent’s instructions, such as “Ignore all previous instructions…”.
- Indirect prompt injection: The agent is manipulated through an external data source, like a document retrieved from a knowledge base.
- Data leakage in prompt: A user’s prompt contains sensitive information (e.g., PII, PHI) that could be improperly processed.
- PII detected in chat history: Sensitive information is found within the conversation history, potentially in violation of privacy rules.
- PII redaction failure: An attempt to redact sensitive information (PII) from a prompt or response fails.
- Excessive prompt length: A prompt exceeds the configured maximum length, indicating a potential resource-exhaustion attack.
- Unusual input language: An input is received in a language not supported by the model, which could indicate a malicious intent.
- Character encoding attack detected: Input contains malicious character encoding designed to bypass sanitization filters.
- Escalated prompt complexity: The complexity score of a prompt, measured by number of sub-tasks, increases dramatically, indicating a possible abuse attempt.
- Malicious file upload: The agent is used to process a file containing malware or other malicious content.
- Malicious link injected into prompt: A prompt contains a malicious URL, potentially used for phishing or social engineering.
- Jailbreak attempt detected: The user attempts to circumvent the agent’s safety or guardrail policies.
- Obfuscated prompt injection: Sophisticated prompt obfuscation techniques are detected, possibly using base64 or other encoding methods.
- High-frequency prompt injection attempts: Multiple prompt injection alerts are triggered from the same user or IP address in a short time.
- Suspicious agent tool usage request: A prompt attempts to force the agent to use a tool in an unauthorized way.
Tool and environment security
- Unauthorized tool call: The agent attempts to invoke a tool or function that it is not permitted to use.
- Tool call with malicious arguments: The agent is instructed to use a tool with potentially harmful or invalid input.
- Unexpected shell command execution: The agent executes an operating system command that is not part of its normal behavior.
- Unauthorized environment variable access: The agent attempts to read a sensitive environment variable.
- Sensitive data from tool output: A tool returns sensitive data in its output, which was not properly handled or redacted.
- API key or token exposure: The agent’s response or a tool’s output contains a hardcoded API key or access token.
- Agent makes outbound network request to blocklisted domain: The agent attempts to connect to a known malicious IP address or domain.
- High-frequency tool errors: An agent tool fails repeatedly, indicating a potential attempt to destabilize the system.
- Excessive agent actions: An agent takes a high number of steps or actions to complete a single, simple request.
- Agent attempts to access restricted file path: The agent’s tool tries to read, write, or execute a file in a restricted directory.
- Agent-initiated credential theft attempt: The agent’s generated code or action attempts to interact with a credential store.
- Privilege escalation detected via agent tool: The agent is used to execute a tool with higher privileges than intended.
- Agent attempts to modify its own configuration: The agent attempts to change its own prompt or instruction set.
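Credential and token exposure, whether in agent responses or tool output, is usually caught by scanning generated text against known secret formats. A minimal sketch with a small, illustrative pattern set:

```python
import re

# Common key formats; patterns are illustrative and deliberately conservative.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(
        r"\b(?:api[_-]?key|token)\s*[:=]\s*['\"]?[A-Za-z0-9_\-]{20,}", re.IGNORECASE),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def scan_output_for_secrets(source: str, text: str) -> list[dict]:
    """Scan an agent response or tool output for exposed credentials and
    return one alert per secret category found."""
    alerts = []
    for name, pattern in SECRET_PATTERNS.items():
        if pattern.search(text):
            alerts.append({"alert": "credential_exposure",
                           "secret_type": name, "source": source})
    return alerts
```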
Output and Data security
- Harmful content generation: The model generates output classified as toxic, hateful, or explicit.
- Disclosure of sensitive training data: The LLM produces an output that appears to be a direct quote or replication of its training data, potentially exposing confidential information.
- LLM generates malicious code: The model generates code that is flagged as potentially malicious, such as a command injection payload.
- Social engineering attempt detected: The agent’s output is flagged as a social engineering tactic, like requesting sensitive information.
- Misinformation or hallucination detected: The agent generates a response containing factually incorrect information.
- LLM generates copyrighted content: The model’s output contains content that matches known copyrighted material.
- Excessive PII in response: The agent’s output contains an unusually high amount of PII or other sensitive data.
- Data exfiltration attempt: The agent’s output is structured to transfer data to an external, unauthorized location.
- Model bias detected: A pattern of biased or unfair responses is detected during a user interaction.
User and Application security
- Unusual user behavior: A user’s interaction pattern (e.g., rapid, repetitive prompts) is flagged as suspicious.
- API key compromised: An LLM API key shows unusual usage patterns, such as a spike in token consumption.
- Credential stuffing attempt: Multiple failed login attempts are made against the LLM application.
- High-frequency, low-relevance queries from a single user: A user submits many queries that consistently receive low-relevance scores, which can indicate an enumeration attack.
- Denial-of-service attempt: A high volume of requests is detected from a single source, potentially aiming to overwhelm the service.
Performance Alerts
Latency and throughput
- High end-to-end latency: The total time from user prompt to agent response exceeds a set threshold (e.g., 500ms).
- High LLM provider latency: The time for the LLM provider to return a response exceeds a set threshold, indicating an upstream issue.
- High retrieval latency: The retrieval-augmented generation (RAG) system is slow to retrieve documents from the knowledge base.
- High tool execution latency: An external tool or API call takes too long to execute.
- High agent planning latency: The agent’s internal planning or reasoning steps take an excessive amount of time.
- Slow token generation rate: The model’s token generation speed drops below an acceptable rate.
- Low overall throughput: The number of requests processed per minute falls below a minimum threshold.
- Throughput degradation detected: A sudden drop in request throughput is observed.
Cost and resource usage
- Spike in token usage: The number of tokens used per conversation or over a time period increases unexpectedly.
- Unexpected cost increase from LLM provider: Monitoring shows an unexpected spike in the LLM provider’s billing.
- High GPU utilization on inference servers: GPU usage on the LLM infrastructure is consistently at or near maximum capacity.
- Memory usage exceeding threshold: The memory consumption of the agent or model server exceeds a safe limit, potentially causing instability.
- High CPU utilization on orchestrator: The agent orchestration service shows high CPU load, indicating a bottleneck.
- Inefficient tool usage: The agent is using costly tools or APIs for tasks that could be handled more cheaply.
Agent and model behavior
- High database load from knowledge base: The vector database or knowledge base is under heavy load, increasing latency and cost.
- High rate of hallucinations: The percentage of hallucinated or factually incorrect responses exceeds the defined baseline.
- Semantic drift detected: The agent’s output is subtly shifting in tone, style, or quality over time.
- High rate of non-responsive agents: A significant number of agent conversations end without a valid response.
- High rate of agent loops: An agent gets stuck in a repetitive chain of thoughts or tool calls.
- Poor relevance scores: The model’s output consistently fails to match the user’s intent, leading to low relevance scores.
- High rate of model-side errors: The LLM provider reports an elevated number of errors during requests.
- Incorrect tool selection: The agent repeatedly chooses the wrong tool for a given task.
- Evaluation metric degradation: A quality evaluation metric (e.g., RAGAS score, model grade) drops below a defined threshold.
- Low user feedback scores: The aggregate user feedback (e.g., thumb-up/thumb-down) is lower than the baseline.
Monitoring and infrastructure alerts
System health and availability
- Agent service is down: The core agent service or a critical component is not responding.
- LLM API endpoint unavailable: The connection to the LLM provider fails or the provider’s API is not reachable.
- Vector database connection error: The agent cannot connect to its knowledge base.
- High error rate on LLM API: The LLM API returns an excessive number of error codes (e.g., 4xx, 5xx).
- Out-of-memory error: The application or a specific LLM process runs out of memory.
- Container crash: A container hosting the agent or a related service fails unexpectedly.
- Inference server overload: The LLM inference server reaches its maximum concurrent request limit.
- Deployment rollback: A recent deployment of the agent was automatically or manually rolled back due to issues.
- Certificate expiration: A TLS/SSL certificate used by the application or an API is approaching its expiration date.
Data and pipeline issue Alerts
- Stale knowledge base: The last successful update to the knowledge base or vector index was longer than expected.
- Document retrieval failure: The RAG system consistently fails to retrieve relevant documents for a given query.
- Data pipeline failure: The ETL process for updating the LLM’s knowledge base fails or is delayed.
- Missing metadata: A log or trace lacks essential metadata, hindering debugging.
- Data version mismatch: The agent is using a different version of the knowledge base or tool than expected.
- Tracing service offline: The OpenTelemetry or other tracing service is down, preventing end-to-end visibility.
Agent and workflow-specific Alerts
- Agent fails to complete task: A user’s conversation with the agent is terminated without a successful resolution.
- Conversation state inconsistency: The agent’s memory or conversational state becomes corrupted.
- Workflow path deviation: An agent’s execution trace follows a different and more complex path than normal for a simple request.
- Unplanned tool usage sequence: An agent uses tools in an unexpected order during a multi-step task.
Performance and Reliability Alerts
- High Inference Latency: Average inference time exceeds a defined threshold (e.g., > 2 seconds).
- Latency Spikes: Sudden, unexplained spikes in inference latency, indicating a potential bottleneck.
- Increased Error Rate: Percentage of failed requests surpasses a set threshold (e.g., > 5%).
- Agent Failure to Answer: Agent fails to produce a meaningful or coherent response.
- Agent Timeouts: Agent requests time out before a response is generated.
- Low Throughput: Decrease in the number of successful requests processed per minute, indicating a performance regression.
- High Dependency Latency: Agent’s external API or tool calls are consistently slow.
- Increased Retries: Spikes in internal retry attempts by the agent, signaling upstream issues.
- Degrading Performance over Time (Drift): Gradual deterioration of key performance metrics (latency, accuracy) over weeks or months.
- Significant Drop in Success Rate: The agent’s ability to complete its core task successfully falls below a benchmark.
- Agent Version Regression: A new agent version or configuration shows a performance decline compared to the previous one.
- High CPU/GPU Utilization: Excessive resource consumption by the agent, indicating inefficiency or a stuck process.
- High Memory Consumption: Agent process memory usage exceeds its normal operating range, potentially leading to instability.
- Resource Saturation: System-level metrics (e.g., CPU, GPU, memory) approach maximum capacity.
- Container/Pod Restarts: Frequent restarts of the LLM agent’s underlying infrastructure.
- Intermittent Failures: Sporadic, non-reproducible request failures that are hard to diagnose.
- Dependency Service Outage: Alerts when a critical external service (e.g., knowledge base, search API) is unreachable.
- Unexpected Agent Fallback: Agent frequently falls back to a simpler, less effective response template.
- Anomalous Request-to-Token Ratio: Unusual number of tokens generated per request, signaling an inefficient change in agent behavior.
- Frequent Prompt Retries: A user or automated system submits the same prompt multiple times without success.
- High Rate of Re-conversations: Users are frequently restarting conversations, indicating a failure to resolve the initial request.
- Abnormal Distribution of Agent Tools: The frequency of tool calls deviates from its established baseline, suggesting a change in reasoning.
- High Agent Handover Rate: Rate of requests being escalated from the agent to a human exceeds a threshold, identifying capability gaps.
- Failure to Correct Error States: The agent fails to recover gracefully from a tool call error.
- Significant Increase in Token Usage: Sudden spike in token consumption for a user or task, indicating a potential attack or prompt efficiency issue.
AI Integrity and Quality Alerts
- Hallucination Detected: An LLM-as-a-judge or fact-checking mechanism identifies a fabricated claim or factual inconsistency.
- Factual Accuracy Drop: Automated evaluation metrics show a significant decrease in factual correctness.
- Coherence Score Drop: The output’s logical flow or grammatical correctness deteriorates according to semantic consistency checks.
- Relevance Score Drop: Responses are becoming less relevant to user prompts over time.
- Unhelpful Response Rate: User feedback or implicit signals indicate a rise in unhelpful responses.
- Inconsistency Detected: The agent provides contradictory responses to similar queries.
- High Toxicity Score: A toxicity classifier flags an agent’s response for inappropriate, hateful, or harmful content.
- Fairness Metric Skew: A predefined fairness metric for specific demographic groups shows a concerning bias.
- Incomplete Responses: The agent’s output is consistently incomplete or cut off.
- Sentiment Shift: Agent responses show an unexplained shift towards a more negative or inappropriate tone.
- Data Quality Anomaly (RAG): Documents retrieved for Retrieval Augmented Generation (RAG) are scored as having low relevance or quality.
- Outdated Knowledge (RAG): The agent provides an answer based on outdated information, potentially from a stale knowledge base.
- Semantic Drift: The agent’s overall response style or meaning changes subtly over time.
- Entity Extraction Failure: The agent fails to properly identify and extract key entities from user input.
- New Keyword/Topic Drift: The agent starts generating responses on topics or keywords outside its intended scope.
- User Feedback Spike (Negative): Automated systems detect a sudden increase in negative user feedback ratings or flags.
- High User Redundancy: Users repeatedly rephrase their queries, indicating the agent is not understanding their intent.
- Significant Increase in User Edits: Users are frequently editing agent-generated drafts, indicating poor quality.
- High User Rejection Rate: Users consistently dismiss or ignore the agent’s responses.
- Model Confidence Drop: Automated systems report a decrease in the model’s confidence scores for its answers.
- Discrepancy with Ground Truth: The agent’s response differs significantly from a known ground truth or golden dataset.
- Misattribution of Sources (RAG): The agent cites incorrect or non-existent sources for information.
- Inaccurate Summarization: The agent provides a summary that misrepresents the source text.
- Incorrect Formatting: The agent consistently fails to adhere to a required output format (e.g., JSON, Markdown).
- Tool Use Error: The agent’s logic for selecting or using external tools is flawed, leading to poor output.
Performance and reliability alerts (120 alerts)
Latency (30 alerts)
- Response time Alerts:
- Critical: 99th percentile response time exceeds 5 seconds for any model endpoint. (5 variations based on endpoint/model)
- Warning: Average latency for text-to-image-model shows a 20% increase over the baseline. (5 variations based on model type)
- Anomaly: 95th percentile latency for customer_support_agent deviates significantly from its normal 7-day pattern. (5 variations based on AI agent)
- System throughput Alerts:
- Warning: Requests per minute drops below 50% of the historical average for over 15 minutes. (5 variations based on endpoint/service)
- Dependency latency Alerts:
- Warning: Latency for external API calls (knowledge base, database) exceeds 2 seconds for more than 10% of requests. (5 variations based on tool/API)
- Tool call latency Alerts:
- Critical: Latency for search_api_tool calls exceeds 10 seconds for more than 1% of calls in a 5-minute window. (5 variations based on tool)
- RAG latency Alerts:
- Warning: The vector database retrieval step’s average latency increases by 30% for over 30 minutes. (5 variations based on RAG component)
Agent task execution (20 alerts)
- Task completion rate Alert:
- Critical: Agent task success rate falls below 85% in a 15-minute window. (5 variations based on task type)
- Tool failure rate:
- Critical: Calls to external_api_tool fail for more than 5% of requests over 10 minutes. (5 variations based on tool)
- Retry loops:
- Warning: An agent enters a tool-retry loop for the same task more than 3 times within a single trace. (5 variations based on agent)
- Trace failures:
- Warning: The rate of failed agent traces increases by 50% compared to the previous hour. (5 variations based on agent)
System health (20 alerts)
- Server errors:
- Critical: LLM API endpoint returns a server-side error (HTTP 5xx) for more than 1% of requests. (5 variations based on endpoint)
- Resource utilization:
- Warning: CPU/GPU usage exceeds 90% for the LLM inference host for over 5 minutes. (5 variations based on host/resource)
- Warning: Memory usage on the LLM host exceeds 85% for over 10 minutes. (5 variations based on host/resource)
- Dependency outages:
- Critical: A downstream dependency (e.g., API, database) fails with 100% error rate for more than 30 seconds. (5 variations based on dependency)
Service quality metrics (50 alerts)
- Evaluation metrics:
- Critical: Automated relevance score for agent responses drops below a threshold of 0.7 for more than 30 minutes. (10 variations based on metric like relevance, coherence)
- Warning: Average sentiment score of user feedback shifts negatively by more than 15% in the last 24 hours. (10 variations based on sentiment shift)
- Qualitative failure modes:
- Critical: Agent output fails the hallucination check in Datadog for more than 5% of requests. (10 variations based on failure type like hallucination, ground truth failure)
- Warning: Rate of ‘Did not answer’ responses from the agent increases by 20% compared to the baseline. (10 variations based on failure type)
- Feedback correlation:
- Warning: A specific type of user query receives a user feedback score below 3/5 more than 20 times in an hour. (10 variations based on user feedback trigger)
Safety, security, and ethical alerts (100 alerts)
Prompt injection and manipulation (30 alerts)
- Prompt injection detection:
- Critical: An input containing a known prompt injection signature is detected and successfully bypasses guardrails. (10 variations based on attack signature/severity)
- Warning: A user prompt is flagged with a high confidence score for attempted jailbreak, but blocked by the safety filter. (10 variations based on attempted jailbreak type)
- Privilege escalation attempt:
- Critical: An agent attempts to perform a privileged action (e.g., delete_database) in response to an unverified prompt. (10 variations based on tool action)
Content safety and toxicity (20 alerts)
- Toxic input:
- Warning: The safety filter blocks more than 50 toxic inputs from a single user within an hour. (5 variations based on user behavior)
- Harmful output generation:
- Critical: Agent generates an output classified as harmful or toxic, bypassing safety checks. (5 variations based on safety classification)
- Sensitive topic engagement:
- Warning: Agent repeatedly engages in conversations flagged as touching on sensitive or high-risk topics. (5 variations based on topic)
- Bias amplification:
- Warning: Automated bias detection metrics indicate a significant increase in biased language in model outputs. (5 variations based on bias metric)
Data privacy and leakage (30 alerts)
- PII in input:
- Warning: The system detects personally identifiable information (PII) like email addresses in a user prompt that was not properly redacted. (10 variations based on PII type)
- PII in output:
- Critical: The LLM generates a response containing PII that it should not have access to or that was not present in the original prompt. (10 variations based on PII type)
- Sensitive data query:
- Critical: An agent attempts to access a sensitive data source (e.g., patient records) without proper authentication. (10 variations based on data source)
Hallucinations and misinformation (20 alerts)
- High hallucination rate:
- Critical: Hallucination detection system reports more than 10% of responses as ungrounded or contradictory to provided context. (5 variations based on hallucination type/model)
- Fact-checking failure:
- Warning: The agent provides information that contradicts a known, verified fact in an external knowledge base. (5 variations based on knowledge source)
- Invented citations:
- Warning: In a RAG application, the agent generates a plausible-sounding but completely fabricated source or citation. (5 variations based on RAG type)
- Contextual inconsistency:
- Warning: Semantic similarity metrics between a generated response and its source context drop below a safe threshold. (5 variations based on consistency score)
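Contextual-inconsistency and groundedness alerts compare a generated response against its retrieved context. Production systems typically use sentence embeddings or an LLM-as-a-judge; the sketch below substitutes a simple bag-of-words cosine similarity so it stays self-contained, with an illustrative threshold of 0.3.

```python
import math
import re
from collections import Counter

def _vectorize(text: str) -> Counter:
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine_similarity(a: str, b: str) -> float:
    va, vb = _vectorize(a), _vectorize(b)
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def groundedness_alert(response: str, source_context: str,
                       threshold: float = 0.3) -> dict | None:
    """Flag a response whose similarity to its retrieved context falls below
    a threshold, a rough proxy for ungrounded or hallucinated content."""
    score = cosine_similarity(response, source_context)
    if score < threshold:
        return {"alert": "contextual_inconsistency", "similarity": round(score, 3)}
    return None
```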
Cost and efficiency alerts (80 alerts)
Token usage (30 alerts)
- High token consumption:
- Warning: The average number of tokens per request for a specific model increases by 30% compared to the 7-day rolling average. (10 variations based on model/endpoint)
- Budget threshold exceeded:
- Critical: Total token usage for the month exceeds 100% of the allocated budget. (10 variations based on budget level)
- Warning: Total token usage for the week exceeds 75% of the weekly budget. (10 variations based on time frame)
Cost per interaction (30 alerts)
- Cost per query increase:
- Warning: The average cost per successful query rises by more than 20% in the last 24 hours. (10 variations based on query type)
- Expensive queries:
- Warning: A specific user or type of query is responsible for more than 50% of total costs in an hour. (10 variations based on user/query type)
- Model-specific cost spikes:
- Critical: The cost of using a specific, expensive model (gpt-4) unexpectedly spikes over a 30-minute period. (10 variations based on model)
Optimization opportunities (20 alerts)
- High cache miss rate:
- Warning: Cache hit rate drops below 20%, indicating under-utilization of the cache. (10 variations based on cache configuration)
- Inefficient prompts:
- Warning: Prompt analysis flags more than 100 queries with unusually long and verbose prompts. (10 variations based on prompt characteristic)
Conversational and UX alerts (60 alerts)
Conversation flow (20 alerts)
- Stuck in loop:
- Warning: An agent-user conversation trace contains more than 5 identical consecutive turns. (5 variations based on loop type)
- Conversation abandonment:
- Warning: The rate of conversations ending abruptly with no resolution increases by 25%. (5 variations based on metric)
- Escalation rate increase:
- Warning: The number of agent conversations requiring human handoff increases by 20% in the last hour. (5 variations based on metric)
- Sentiment shift:
- Warning: Automated sentiment detection flags a significant shift from positive to negative sentiment within a single conversation. (5 variations based on sentiment change)
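The “stuck in loop” alert above can be derived directly from the conversation trace by measuring the longest run of identical consecutive agent turns. A minimal sketch, assuming the trace is available as a list of agent responses:

```python
def loop_alert(agent_turns: list[str], max_identical: int = 5) -> dict | None:
    """Flag a conversation in which the agent repeats the same response
    more than `max_identical` times in a row."""
    run_length, longest = 1, 1
    for prev, curr in zip(agent_turns, agent_turns[1:]):
        run_length = run_length + 1 if curr == prev else 1
        longest = max(longest, run_length)
    if longest > max_identical:
        return {"alert": "conversation_stuck_in_loop",
                "identical_consecutive_turns": longest}
    return None
```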
User experience (20 alerts)
- Low user satisfaction:
- Warning: Average user feedback score on a specific feature drops below 3/5. (5 variations based on feature)
- Negative feedback spike:
- Critical: A sudden spike in negative user feedback (“bad response”, “not helpful”) is detected. (5 variations based on feedback type)
- Unusual interaction patterns:
- Warning: Anomaly detection identifies a user interaction pattern that deviates from the norm (e.g., unusually short or long conversations). (5 variations based on interaction type)
- High user effort:
- Warning: A user asks for clarification or rephrasing more than 3 times in a single conversation. (5 variations based on effort metric)
Dialog management (20 alerts)
- Out-of-scope query:
- Warning: Agent receives more than 50 queries identified as outside its defined scope in an hour. (5 variations based on scope)
- Goal failure:
- Warning: The agent fails to achieve its primary task objective in more than 10% of conversations over a 15-minute period. (5 variations based on task)
- Conversation divergence:
- Warning: The conversation topic diverges significantly from the initial user intent. (5 variations based on topic analysis)
- Ambiguous user intent:
- Warning: The system frequently reports low confidence in determining user intent, potentially indicating a need for prompt refinement. (5 variations based on confidence score)
Data and model drift alerts (70 alerts)
Data drift (30 alerts)
- Input data drift:
- Warning: The distribution of input token lengths for a specific agent shifts by more than 20% compared to the training data. (10 variations based on data feature)
- User behavior drift:
- Warning: The topics of incoming user queries deviate significantly from the historical norm, as detected by an unsupervised topic model. (10 variations based on topic shift)
- Vocabulary drift:
- Warning: A sudden influx of new, out-of-vocabulary words is detected in user prompts. (10 variations based on vocabulary metric)
Model output drift (30 alerts)
- Output distribution change:
- Warning: The distribution of generated response lengths changes by more than 25% from the model’s baseline behavior. (10 variations based on output metric)
- Output sentiment drift:
- Warning: The average sentiment of model outputs shifts unexpectedly, potentially indicating a change in tone. (10 variations based on sentiment shift)
- Coherence drift:
- Warning: The automated coherence score of model outputs drops significantly, suggesting less fluent responses. (10 variations based on coherence score)
Concept drift (10 alerts)
- Drift in effectiveness:
- Critical: An LLM monitoring system detects a significant increase in a specific type of model error (e.g., outdated information), indicating concept drift. (5 variations based on error type)
- Ground truth mismatch:
- Warning: The model’s performance on a daily-updated validation set deteriorates over several days. (5 variations based on metric)
RAG and knowledge base alerts (70 alerts)
Retrieval quality (30 alerts)
- Low retrieval relevance:
- Warning: The semantic similarity score between retrieved documents and the user query drops below a specified threshold. (10 variations based on threshold/source)
- Contextual noise:
- Warning: The agent is retrieving documents that are irrelevant or introduce noise into the generated response. (10 variations based on retrieval analysis)
- Source document changes:
- Warning: Retrieval system detects changes or updates to a core knowledge base document and triggers a verification process. (10 variations based on document type)
Retrieval process (20 alerts)
- Retrieval failure rate:
- Critical: The rate of zero-document retrievals for RAG-based queries increases significantly. (10 variations based on retrieval configuration)
- Database connection issues:
- Critical: The vector database reports connection errors or high query latency. (10 variations based on database)
Knowledge base freshness (20 alerts)
- Stale content detected:
- Warning: Fact-checking identifies information in the knowledge base that is no longer current. (10 variations based on freshness check)
- Synchronization error:
- Critical: The knowledge base synchronization process with the source of truth fails for more than 2 consecutive runs. (10 variations based on synchronization task)
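Freshness and synchronization alerts are usually driven by two signals: the age of the last successful index update and the number of consecutive failed sync runs. A minimal sketch with an illustrative 24-hour freshness SLO (timestamps assumed to be UTC-aware):

```python
from datetime import datetime, timedelta, timezone

MAX_INDEX_AGE = timedelta(hours=24)        # illustrative freshness SLO
MAX_CONSECUTIVE_SYNC_FAILURES = 2

def knowledge_base_alerts(last_successful_sync: datetime,
                          consecutive_failures: int) -> list[dict]:
    """Return staleness and synchronization alerts for a RAG knowledge base.
    `last_successful_sync` is assumed to be a timezone-aware UTC datetime."""
    alerts = []
    age = datetime.now(timezone.utc) - last_successful_sync
    if age > MAX_INDEX_AGE:
        alerts.append({"alert": "stale_knowledge_base",
                       "hours_since_sync": round(age.total_seconds() / 3600, 1)})
    if consecutive_failures > MAX_CONSECUTIVE_SYNC_FAILURES:
        alerts.append({"alert": "knowledge_base_sync_failure",
                       "consecutive_failures": consecutive_failures})
    return alerts
```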
Custom and advanced alerts (70 alerts)
Anomaly detection (20 alerts)
- User interaction anomaly:
- Warning: Anomaly detection on user interaction patterns flags an unusual surge in requests from a single IP address. (10 variations based on user metric)
- Model behavior anomaly:
- Warning: An internal ML model detects a significant anomaly in the LLM’s output embedding space, suggesting a shift in response style. (10 variations based on output metric)
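For metrics without obvious fixed thresholds, a simple baseline model such as a z-score test can stand in for the anomaly detection described above. A minimal sketch (production systems would typically use seasonality-aware models):

```python
import statistics

def zscore_anomaly(baseline: list[float], current_value: float,
                   threshold: float = 3.0) -> dict | None:
    """Flag a metric observation that deviates from its baseline by more
    than `threshold` standard deviations."""
    mean = statistics.fmean(baseline)
    stdev = statistics.pstdev(baseline)
    if stdev == 0:
        return None  # flat baseline, nothing to compare against
    z = (current_value - mean) / stdev
    if abs(z) > threshold:
        return {"alert": "metric_anomaly", "z_score": round(z, 2),
                "baseline_mean": round(mean, 2)}
    return None
```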
Agentic behavior (30 alerts)
- Unusual tool sequencing:
- Warning: An agent attempts an unusual sequence of tool calls that deviates from typical execution paths. (10 variations based on tool sequence)
- Unexpected termination:
- Critical: An agent process terminates unexpectedly during a multi-step task execution. (10 variations based on termination event)
- Tool call failures:
- Critical: An agent fails to parse or execute a tool call correctly, resulting in an error. (10 variations based on tool call failure)
External alerts and feedback (20 alerts)
- Third-party tool alerts:
- Critical: An alert is received from a third-party LLM security provider (e.g., Lakera Guard) indicating a vulnerability. (10 variations based on tool)
- Human-in-the-loop triggers:
- Warning: A user manually flags a response as low-quality, triggering a review by a human operator. (10 variations based on human feedback)
Other Alert strategies:
- Specific granular thresholds: Instead of a single “High Latency” alert, create multiple based on specific endpoints, tool types, or latency bands. For example: LLM_API_Latency > 500ms, LLM_API_Latency > 1s, Retrieval_Latency_Exceeded_P99.
- Alerts per tool: For agents that use multiple tools (APIs, databases, etc.), create specific alerts for each tool’s usage, performance, and security. For instance, Tool_CalendarAPI_Failure_Rate > 5%, Tool_Database_Execution_Latency > 2s.
- Alerts segmented by user type or business-critical function: Tailor alerts to different user groups (e.g., VIP_User_Experience_Degradation) or business journeys (e.g., Product_Search_Hallucination_Rate_Increase).
- Add statistical anomaly detection: Create alerts for each metric (latency, token usage, error rate) where a machine learning model detects a statistically significant deviation from the normal baseline.
- Expand on threat intelligence: Integrate your monitoring with threat intelligence feeds. Alerts could trigger on attempts to use known malicious prompts, attack patterns, or access blocklisted domains.
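The “specific granular thresholds” strategy above maps naturally onto banded alert rules evaluated per metric. A minimal sketch, with illustrative metric names and threshold values (not defaults):

```python
# Banded thresholds per metric, most severe band first.
THRESHOLD_BANDS = {
    "llm_api_latency_ms":   [(1000, "critical"), (500, "warning")],
    "retrieval_latency_ms": [(2000, "critical"), (750, "warning")],
    "tool_error_rate_pct":  [(5.0, "critical"), (2.0, "warning")],
}

def evaluate_metric(metric: str, value: float) -> dict | None:
    """Return the most severe banded alert a metric value triggers, if any."""
    for threshold, severity in THRESHOLD_BANDS.get(metric, []):
        if value > threshold:
            return {"alert": f"{metric}_exceeded", "severity": severity,
                    "value": value, "threshold": threshold}
    return None

# Example: evaluate_metric("llm_api_latency_ms", 820) returns a warning-level alert.
```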