Prompt Security, Identity, and Risk Detection Strategies in LLM Security
Prompt security and Tokenizer security
Tokenizer manipulation attacks
Adversaries can modify a tokenizer's configuration to corrupt the output of the model.
Recommendations
Tokenizer manipulation detection
- Version tokenizer artifacts
- Audit tokenizer configurations and changes
- Log tokenizer loading and usage
In large language models (LLMs):
1. Prompts are passed through the tokenizer.
2. The tokenizer encodes the prompt into an array of token IDs (a list of integers).
3. The LLM outputs an array of token IDs.
4. The tokenizer decodes those IDs back into readable text.
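To make the flow concrete, here is a minimal sketch of that round trip using the Hugging Face transformers API (the gpt2 checkpoint is an illustrative choice):

```python
# Minimal sketch of the prompt -> token IDs -> generation -> text round trip.
# Assumes the Hugging Face transformers library; "gpt2" is an illustrative checkpoint.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "LLM security starts at the tokenizer"

# Steps 1-2: the prompt is encoded into an array of token IDs (a list of integers).
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
print(input_ids.tolist())

# Step 3: the LLM outputs an array of token IDs.
output_ids = model.generate(input_ids, max_new_tokens=20)

# Step 4: the tokenizer decodes those IDs back into readable text.
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```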
Security Risks
Insufficient validation when initializing tokenizers
This allows adversaries to corrupt token encoding and decoding, enabling integrity attacks.
Encoding or decoding attacks
An attacker gains access to the token mappings, for example the files loaded by AutoTokenizer.
To detect changes, these tokenizer file artifacts must be versioned.
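One way to detect tampering is to pin and verify checksums of the tokenizer artifacts before loading them. A minimal sketch, assuming a locally stored tokenizer directory and a trusted manifest of SHA-256 digests (the file names and placeholder digests are assumptions):

```python
# Sketch: verify tokenizer file artifacts against a trusted hash manifest
# before loading them. File names and digests below are placeholders.
import hashlib
from pathlib import Path

TRUSTED_HASHES = {
    # Populated from a known-good, versioned release of the tokenizer.
    "tokenizer.json": "3f8a...",         # placeholder digest
    "tokenizer_config.json": "9b2c...",  # placeholder digest
}

def sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_tokenizer_dir(tokenizer_dir: str) -> None:
    for name, expected in TRUSTED_HASHES.items():
        actual = sha256(Path(tokenizer_dir) / name)
        if actual != expected:
            raise RuntimeError(f"Tokenizer artifact {name} was modified: {actual}")

verify_tokenizer_dir("./my-model")  # run before AutoTokenizer.from_pretrained
```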
Subtle bias introduction attack
Exploiting tokenization methods to introduce a subtle bias into prompts, leading to perturbed generations.
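A common carrier for this kind of perturbation is invisible Unicode that changes how a prompt tokenizes without changing how it looks. A minimal screening sketch (the explicit character list is illustrative, not exhaustive):

```python
# Sketch: flag invisible characters that can silently perturb tokenization.
# The explicit character set is illustrative; the "Cf" category check is broader.
import unicodedata

SUSPICIOUS = {
    "\u200b",  # zero-width space
    "\u200c",  # zero-width non-joiner
    "\u200d",  # zero-width joiner
    "\ufeff",  # byte-order mark
}

def suspicious_chars(prompt: str) -> list[tuple[int, str]]:
    hits = []
    for i, ch in enumerate(prompt):
        if ch in SUSPICIOUS or unicodedata.category(ch) == "Cf":
            hits.append((i, unicodedata.name(ch, "UNKNOWN")))
    return hits

print(suspicious_chars("innocent\u200b prompt"))  # [(8, 'ZERO WIDTH SPACE')]
```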
Prompt injection attack
An adversary crafts adversarial input for a generative AI application.
The application uses that input as the prompt of an LLM request, which can lead to:
- unintended behavior
- jailbreaks
- leakage of training data
- system compromise
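A first line of defense is to keep untrusted input out of the instruction channel and screen it before the request is sent. A minimal, heuristic sketch (the pattern list is an assumption and easily bypassed; it illustrates the control point, not a complete defense):

```python
# Sketch: heuristic screen plus strict channel separation for untrusted input.
# Patterns are illustrative; real defenses layer many controls.
import re

INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"you are now",
    r"reveal (the )?(system prompt|training data)",
]

def screen_user_input(text: str) -> None:
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, text, re.IGNORECASE):
            raise ValueError(f"Possible prompt injection: matched {pattern!r}")

def build_messages(user_input: str) -> list[dict]:
    screen_user_input(user_input)
    # Untrusted text goes only into the user role, never the system prompt.
    return [
        {"role": "system", "content": "You are a support assistant for product questions."},
        {"role": "user", "content": user_input},
    ]
```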
Expensive repeat-request attacks
These attacks introduce latency in responses. Attackers craft requests that ask the model to repeat tokens, or resend the same prompt over and over, inflating the cost of each response.
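A simple mitigation is to detect identical prompts resent within a short window and throttle them. A minimal sketch with per-client tracking (the window length and repeat limit are arbitrary assumptions):

```python
# Sketch: throttle clients that resend the same prompt within a short window.
# WINDOW_SECONDS and MAX_REPEATS are illustrative values.
import hashlib
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REPEATS = 3
_recent: dict[tuple[str, str], deque] = defaultdict(deque)

def allow_request(client_id: str, prompt: str) -> bool:
    key = (client_id, hashlib.sha256(prompt.encode()).hexdigest())
    now = time.monotonic()
    timestamps = _recent[key]
    while timestamps and now - timestamps[0] > WINDOW_SECONDS:
        timestamps.popleft()  # drop entries outside the window
    if len(timestamps) >= MAX_REPEATS:
        return False  # same prompt repeated too often: reject or queue
    timestamps.append(now)
    return True
```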
Long-running request attacks
An adversary uses long-running requests to conduct denial-of-service and resource-exhaustion attacks.
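Capping output length and bounding wall-clock time per request limits the damage from both repeat-token and long-running requests. A minimal sketch using standard Hugging Face generate() arguments (the specific limits are assumptions):

```python
# Sketch: bound per-request generation cost. max_new_tokens, max_time, and
# repetition_penalty are standard generate() parameters; the values are illustrative.
def bounded_generate(model, input_ids):
    return model.generate(
        input_ids,
        max_new_tokens=256,      # hard cap on output length
        max_time=10.0,           # hard cap on wall-clock generation seconds
        repetition_penalty=1.2,  # discourage degenerate token repetition
    )
```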
Divergence attacks
These attacks use token boundary artifacts to exploit greedy tokenization: a prompt that ends on a wildcard token boundary can have an adversarial impact on model results.
Recommendations
- Subword regularization verification
- Model robustness to token boundaries (see the probe below)
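The boundary effect is easy to probe: under greedy BPE, tokenizing a concatenation is not the same as concatenating the tokenizations, so a prompt that ends mid-token can shift the downstream IDs. A minimal robustness probe, assuming a Hugging Face tokenizer:

```python
# Sketch: probe sensitivity of greedy tokenization to token boundaries.
# Assumes a Hugging Face tokenizer; "gpt2" is an illustrative choice.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

def boundary_divergence(prefix: str, suffix: str) -> bool:
    joined = tokenizer.encode(prefix + suffix)
    split = tokenizer.encode(prefix) + tokenizer.encode(suffix)
    return joined != split  # True: the boundary changed the encoding

# A suffix that merges with the prefix under greedy BPE will likely diverge:
print(boundary_divergence("hel", "lo"))
```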
Prompt security: identity confirmation
Key metrics to evaluate:
- Accuracy: Measures the correctness of identity confirmation. It includes the rate of true positives (correctly confirmed identities) and true negatives (correctly rejected identities).
- False Acceptance Rate (FAR): The rate at which unauthorized users are incorrectly accepted by the system. Lower FAR is preferred.
- False Rejection Rate (FRR): The rate at which authorized users are incorrectly rejected by the system. Lower FRR is preferred.
- Equal Error Rate (EER): The point where FAR and FRR are equal. A lower EER indicates a more balanced and effective system.
- Processing Time: The time it takes to confirm an identity. Faster processing times improve user experience.
- Scalability: The system’s ability to handle an increasing number of users without performance degradation.
- Usability: How user-friendly and convenient the identity confirmation process is for users.
- Security: Measures the robustness of the system against attacks like spoofing, phishing, and brute force attacks.
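Given matcher scores for genuine and impostor attempts, FAR, FRR, and the EER point fall out of a threshold sweep. A minimal sketch (the scores are made-up illustrative values, not measurements):

```python
# Sketch: compute FAR and FRR across thresholds and locate the EER point.
# Scores are fabricated purely to illustrate the computation.
genuine = [0.91, 0.85, 0.78, 0.95, 0.88]   # authorized users' match scores
impostor = [0.40, 0.55, 0.61, 0.30, 0.72]  # unauthorized users' match scores

def far_frr(threshold: float) -> tuple[float, float]:
    far = sum(s >= threshold for s in impostor) / len(impostor)  # false accepts
    frr = sum(s < threshold for s in genuine) / len(genuine)     # false rejects
    return far, frr

# EER: the threshold where FAR and FRR are (approximately) equal.
_, eer_threshold = min(
    (abs(far_frr(t)[0] - far_frr(t)[1]), t) for t in (i / 100 for i in range(101))
)
far, frr = far_frr(eer_threshold)
print(f"EER threshold ~{eer_threshold:.2f}: FAR={far:.2f}, FRR={frr:.2f}")
```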
Process
- Data Collection:
- Biometric Data: Fingerprints, facial recognition, iris scans, voice recognition.
- Knowledge-Based: Passwords, PINs, security questions.
- Possession-Based: Smart cards, security tokens, mobile devices.
- Feature Extraction:
- For biometric data, specific features (e.g., minutiae points in fingerprints, facial landmarks) are extracted.
- For knowledge-based methods, the correctness of the provided information is checked.
- For possession-based methods, the presence and validity of the token are verified.
- Matching and Verification:
- The extracted features are compared against stored templates or data.
- Matching algorithms determine the similarity or correctness.
- Multi-factor authentication (MFA) might be used to combine multiple methods for higher security.
- Decision Making:
- Based on matching results, a decision is made to either confirm or reject the identity.
- This decision can involve threshold settings that balance between FAR and FRR (see the sketch after this list).
- Feedback and Logging:
- Users are informed of the authentication result.
- All attempts are logged for security auditing and analysis.
- Continuous Monitoring:
- Systems may continuously monitor for unusual activity that might indicate compromised credentials.
- Adaptive authentication techniques may adjust the level of scrutiny based on the context of the login attempt.
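A minimal sketch of the matching-plus-decision step, using cosine similarity over feature vectors and a fixed threshold (both the scoring function and the threshold value are assumptions):

```python
# Sketch: threshold-based verification over a similarity score, with every
# attempt logged for auditing. Scoring method and threshold are assumptions.
import logging
import math

logging.basicConfig(level=logging.INFO)
THRESHOLD = 0.80  # tuned to balance FAR against FRR

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def verify(features: list[float], template: list[float], user_id: str) -> bool:
    score = cosine_similarity(features, template)
    accepted = score >= THRESHOLD
    # Every attempt is logged for security auditing and analysis.
    logging.info("auth user=%s score=%.3f accepted=%s", user_id, score, accepted)
    return accepted
```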
About Alert AI
Alert AI is an end-to-end, interoperable Generative AI security platform that helps enhance the security of Generative AI applications and workflows against potential adversaries, model vulnerabilities, privacy, copyright and legal exposures, sensitive information leaks, intelligence and data exfiltration, infiltration at training and inference, and integrity attacks in AI applications. It provides anomaly detection and enhanced visibility in AI pipelines, along with forensics, audit, and AI governance across the AI footprint.
What is at stake with AI and Generative AI in business? We are addressing exactly that.
A Generative AI security solution for Healthcare, Insurance, Retail, Banking, Finance, Life Sciences, and Manufacturing.
Despite the Security challenges, the promise of Generative AI is enormous.
We are committed to enhancing the security of Generative AI applications and workflows so industries and enterprises can reap those benefits.
Alert AI 360 view and Detections
- Alerts and Threat detection in AI footprint
- LLM & Model Vulnerabilities Alerts
- Adversarial ML Alerts
- Prompt, response security and Usage Alerts
- Sensitive content detection Alerts
- Privacy, Copyright and Legal Alerts
- AI application Integrity Threats Detection
- Training, Evaluation, Inference Alerts
- AI visibility, Tracking & Lineage Analysis Alerts
- Pipeline analytics Alerts
- Feedback loop
- AI Forensics
- Compliance Reports
End-to-End Security with
- Data alerts
- Model alerts
- Pipeline alerts
- Evaluation alerts
- Training alerts
- Inference alerts
- Model Vulnerabilities
- LLM vulnerabilities
- Privacy
- Threats
- Resources
- Environments
- Governance and compliance
Organizations need to responsibly assess and enhance the security of their AI environments (development, staging, and production) for Generative AI applications and workflows in business.