Prompt Security, Identity, and Risk Detection Strategies in LLM Security
Prompt security and Tokenizer security
Tokenizer manipulation attacks
Adversaries can modify a tokenizer's configuration to corrupt the output of the model.
Recommendations
Tokenizer manipulation detection
- Version tokenizer artifacts
- Audit tokenizer configurations and changes
- Log tokenizer loading and usage
In large language models (LLMs):
1. Prompts are passed through the tokenizer.
2. The tokenizer encodes the prompt into an array of token IDs (a list of integers).
3. The LLM outputs an array of token IDs.
4. The tokenizer decodes those IDs back into readable text.
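To make the flow concrete, here is a minimal sketch of that round trip using the Hugging Face transformers API (the gpt2 checkpoint is an illustrative choice):

```python
# Minimal sketch of the prompt -> token IDs -> generation -> text round trip.
# Assumes the Hugging Face transformers library; "gpt2" is an illustrative checkpoint.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "LLM security starts at the tokenizer"

# Steps 1-2: the prompt is encoded into an array of token IDs (a list of integers).
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
print(input_ids.tolist())

# Step 3: the LLM outputs an array of token IDs.
output_ids = model.generate(input_ids, max_new_tokens=20)

# Step 4: the tokenizer decodes those IDs back into readable text.
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```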
Security Risks
Insufficient validation when initializing tokenizers
This allows adversaries to corrupt token encoding and decoding, enabling integrity attacks.
Encoding or decoding attacks
An attacker gains access to the token mappings, for example the files loaded by AutoTokenizer.
To detect changes, these tokenizer file artifacts must be versioned.
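One way to detect tampering is to pin and verify checksums of the tokenizer artifacts before loading them. A minimal sketch, assuming a locally stored tokenizer directory and a trusted manifest of SHA-256 digests (the file names and placeholder digests are assumptions):

```python
# Sketch: verify tokenizer file artifacts against a trusted hash manifest
# before loading them. File names and digests below are placeholders.
import hashlib
from pathlib import Path

TRUSTED_HASHES = {
    # Populated from a known-good, versioned release of the tokenizer.
    "tokenizer.json": "3f8a...",         # placeholder digest
    "tokenizer_config.json": "9b2c...",  # placeholder digest
}

def sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_tokenizer_dir(tokenizer_dir: str) -> None:
    for name, expected in TRUSTED_HASHES.items():
        actual = sha256(Path(tokenizer_dir) / name)
        if actual != expected:
            raise RuntimeError(f"Tokenizer artifact {name} was modified: {actual}")

verify_tokenizer_dir("./my-model")  # run before AutoTokenizer.from_pretrained
```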
Subtle bias introduction attack
Exploiting tokenization methods to introduce a subtle bias into prompts, leading to perturbed generations.
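A common carrier for this kind of perturbation is invisible Unicode that changes how a prompt tokenizes without changing how it looks. A minimal screening sketch (the explicit character list is illustrative, not exhaustive):

```python
# Sketch: flag invisible characters that can silently perturb tokenization.
# The explicit character set is illustrative; the "Cf" category check is broader.
import unicodedata

SUSPICIOUS = {
    "\u200b",  # zero-width space
    "\u200c",  # zero-width non-joiner
    "\u200d",  # zero-width joiner
    "\ufeff",  # byte-order mark
}

def suspicious_chars(prompt: str) -> list[tuple[int, str]]:
    hits = []
    for i, ch in enumerate(prompt):
        if ch in SUSPICIOUS or unicodedata.category(ch) == "Cf":
            hits.append((i, unicodedata.name(ch, "UNKNOWN")))
    return hits

print(suspicious_chars("innocent\u200b prompt"))  # [(8, 'ZERO WIDTH SPACE')]
```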
Prompt injection attack
An adversary crafts adversarial input for a generative AI application.
The application uses that input as the prompt of an LLM request, which can lead to:
- unintended behavior
- jailbreaks
- leakage of training data
- system compromise
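A first line of defense is to keep untrusted input out of the instruction channel and screen it before the request is sent. A minimal, heuristic sketch (the pattern list is an assumption and easily bypassed; it illustrates the control point, not a complete defense):

```python
# Sketch: heuristic screen plus strict channel separation for untrusted input.
# Patterns are illustrative; real defenses layer many controls.
import re

INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"you are now",
    r"reveal (the )?(system prompt|training data)",
]

def screen_user_input(text: str) -> None:
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, text, re.IGNORECASE):
            raise ValueError(f"Possible prompt injection: matched {pattern!r}")

def build_messages(user_input: str) -> list[dict]:
    screen_user_input(user_input)
    # Untrusted text goes only into the user role, never the system prompt.
    return [
        {"role": "system", "content": "You are a support assistant for product questions."},
        {"role": "user", "content": user_input},
    ]
```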
Expensive repeat-request attacks
These attacks introduce latency in responses. Attackers craft requests that ask the model to repeat tokens, or resend the same prompt over and over, inflating the cost of each response.
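A simple mitigation is to detect identical prompts resent within a short window and throttle them. A minimal sketch with per-client tracking (the window length and repeat limit are arbitrary assumptions):

```python
# Sketch: throttle clients that resend the same prompt within a short window.
# WINDOW_SECONDS and MAX_REPEATS are illustrative values.
import hashlib
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REPEATS = 3
_recent: dict[tuple[str, str], deque] = defaultdict(deque)

def allow_request(client_id: str, prompt: str) -> bool:
    key = (client_id, hashlib.sha256(prompt.encode()).hexdigest())
    now = time.monotonic()
    timestamps = _recent[key]
    while timestamps and now - timestamps[0] > WINDOW_SECONDS:
        timestamps.popleft()  # drop entries outside the window
    if len(timestamps) >= MAX_REPEATS:
        return False  # same prompt repeated too often: reject or queue
    timestamps.append(now)
    return True
```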
Long-running request attacks
An adversary uses long-running requests to conduct denial-of-service and resource-exhaustion attacks.
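Capping output length and bounding wall-clock time per request limits the damage from both repeat-token and long-running requests. A minimal sketch using standard Hugging Face generate() arguments (the specific limits are assumptions):

```python
# Sketch: bound per-request generation cost. max_new_tokens, max_time, and
# repetition_penalty are standard generate() parameters; the values are illustrative.
def bounded_generate(model, input_ids):
    return model.generate(
        input_ids,
        max_new_tokens=256,      # hard cap on output length
        max_time=10.0,           # hard cap on wall-clock generation seconds
        repetition_penalty=1.2,  # discourage degenerate token repetition
    )
```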
Divergence attacks
These attacks use token boundary artifacts to exploit greedy tokenization: a prompt that ends on a wildcard token boundary can have an adversarial impact on model results.
Recommendations
- Subword regularization verification
- Model robustness to token boundaries (see the probe below)
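The boundary effect is easy to probe: under greedy BPE, tokenizing a concatenation is not the same as concatenating the tokenizations, so a prompt that ends mid-token can shift the downstream IDs. A minimal robustness probe, assuming a Hugging Face tokenizer:

```python
# Sketch: probe sensitivity of greedy tokenization to token boundaries.
# Assumes a Hugging Face tokenizer; "gpt2" is an illustrative choice.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

def boundary_divergence(prefix: str, suffix: str) -> bool:
    joined = tokenizer.encode(prefix + suffix)
    split = tokenizer.encode(prefix) + tokenizer.encode(suffix)
    return joined != split  # True: the boundary changed the encoding

# A suffix that merges with the prefix under greedy BPE will likely diverge:
print(boundary_divergence("hel", "lo"))
```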
Prompt security: identity confirmation
Key metrics to evaluate:
- Accuracy: Measures the correctness of identity confirmation. It includes the rate of true positives (correctly confirmed identities) and true negatives (correctly rejected identities).
- False Acceptance Rate (FAR): The rate at which unauthorized users are incorrectly accepted by the system. Lower FAR is preferred.
- False Rejection Rate (FRR): The rate at which authorized users are incorrectly rejected by the system. Lower FRR is preferred.
- Equal Error Rate (EER): The point where FAR and FRR are equal. A lower EER indicates a more balanced and effective system.
- Processing Time: The time it takes to confirm an identity. Faster processing times improve user experience.
- Scalability: The system’s ability to handle an increasing number of users without performance degradation.
- Usability: How user-friendly and convenient the identity confirmation process is for users.
- Security: Measures the robustness of the system against attacks like spoofing, phishing, and brute force attacks.
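Given matcher scores for genuine and impostor attempts, FAR, FRR, and the EER point fall out of a threshold sweep. A minimal sketch (the scores are made-up illustrative values, not measurements):

```python
# Sketch: compute FAR and FRR across thresholds and locate the EER point.
# Scores are fabricated purely to illustrate the computation.
genuine = [0.91, 0.85, 0.78, 0.95, 0.88]   # authorized users' match scores
impostor = [0.40, 0.55, 0.61, 0.30, 0.72]  # unauthorized users' match scores

def far_frr(threshold: float) -> tuple[float, float]:
    far = sum(s >= threshold for s in impostor) / len(impostor)  # false accepts
    frr = sum(s < threshold for s in genuine) / len(genuine)     # false rejects
    return far, frr

# EER: the threshold where FAR and FRR are (approximately) equal.
_, eer_threshold = min(
    (abs(far_frr(t)[0] - far_frr(t)[1]), t) for t in (i / 100 for i in range(101))
)
far, frr = far_frr(eer_threshold)
print(f"EER threshold ~{eer_threshold:.2f}: FAR={far:.2f}, FRR={frr:.2f}")
```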
Process
- Data Collection:
- Biometric Data: Fingerprints, facial recognition, iris scans, voice recognition.
- Knowledge-Based: Passwords, PINs, security questions.
- Possession-Based: Smart cards, security tokens, mobile devices.
- Feature Extraction:
- For biometric data, specific features (e.g., minutiae points in fingerprints, facial landmarks) are extracted.
- For knowledge-based methods, the correctness of the provided information is checked.
- For possession-based methods, the presence and validity of the token are verified.
- Matching and Verification:
- The extracted features are compared against stored templates or data.
- Matching algorithms determine the similarity or correctness.
- Multi-factor authentication (MFA) might be used to combine multiple methods for higher security.
- Decision Making:
- Based on matching results, a decision is made to either confirm or reject the identity.
- This decision can involve threshold settings that balance between FAR and FRR (see the sketch after this list).
- Feedback and Logging:
- Users are informed of the authentication result.
- All attempts are logged for security auditing and analysis.
- Continuous Monitoring:
- Systems may continuously monitor for unusual activity that might indicate compromised credentials.
- Adaptive authentication techniques may adjust the level of scrutiny based on the context of the login attempt.
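A minimal sketch of the matching-plus-decision step, using cosine similarity over feature vectors and a fixed threshold (both the scoring function and the threshold value are assumptions):

```python
# Sketch: threshold-based verification over a similarity score, with every
# attempt logged for auditing. Scoring method and threshold are assumptions.
import logging
import math

logging.basicConfig(level=logging.INFO)
THRESHOLD = 0.80  # tuned to balance FAR against FRR

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def verify(features: list[float], template: list[float], user_id: str) -> bool:
    score = cosine_similarity(features, template)
    accepted = score >= THRESHOLD
    # Every attempt is logged for security auditing and analysis.
    logging.info("auth user=%s score=%.3f accepted=%s", user_id, score, accepted)
    return accepted
```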
About Alert AI
Alert AI is an end-to-end, interoperable Generative AI security platform that helps enhance the security of Generative AI applications and workflows against potential adversaries, model vulnerabilities, privacy, copyright and legal exposures, sensitive information leaks, intelligence and data exfiltration, infiltration at training and inference, and integrity attacks in AI applications. It provides anomaly detection and enhanced visibility in AI pipelines, along with forensics, audit, and AI governance across the AI footprint.
What is at stake with AI and Generative AI in business? We are addressing exactly that.
A Generative AI security solution for Healthcare, Insurance, Retail, Banking, Finance, Life Sciences, and Manufacturing.
Despite the Security challenges, the promise of Generative AI is enormous.
We are committed to enhancing the security of Generative AI applications and workflows so industries and enterprises can reap those benefits.
Alert AI 360 view and Detections
- Alerts and Threat detection in AI footprint
- LLM & Model Vulnerabilities Alerts
- Adversarial ML Alerts
- Prompt, response security and Usage Alerts
- Sensitive content detection Alerts
- Privacy, Copyright and Legal Alerts
- AI application Integrity Threats Detection
- Training, Evaluation, Inference Alerts
- AI visibility, Tracking & Lineage Analysis Alerts
- Pipeline analytics Alerts
- Feedback loop
- AI Forensics
- Compliance Reports
End-to-End Security with
- Data alerts
- Model alerts
- Pipeline alerts
- Evaluation alerts
- Training alerts
- Inference alerts
- Model Vulnerabilities
- LLM vulnerabilities
- Privacy
- Threats
- Resources
- Environments
- Governance and compliance
Organizations need to responsibly assess and enhance the security of their AI environments (development, staging, and production) for Generative AI applications and workflows in business.