Retrieval-Augmented Generation (RAG) Models and Risks
Alerts and Risks in Generative AI applications and workflows
- Metrics, logs, events, traces
- Anomalies
- Vulnerabilities
- Risks
- Threats
Introduction
Generative AI large language models (LLMs) are deep learning models that can generate new content, such as text, images, music, or code.
Trained on very large datasets, they can recognize, summarize, translate, predict, and generate content. To deploy these large language models for specific use cases, the models can be customized using several techniques to achieve higher accuracy, including prompt tuning, fine-tuning, and adapters. Large-scale compute infrastructure and large-scale data are necessary to develop and maintain LLMs.
- Techniques:
- Generative Adversarial Networks (GANs): Models that pit a generator against a discriminator to create realistic data.
- Variational Autoencoders (VAEs): Models that encode data into a latent space and then decode it back to generate new data.
- Transformer-based Models: Models like GPT-3/4 that generate text by predicting the next word in a sequence (see the sketch after the applications list below).

RAG is a more customized form of generative AI focused on retrieving relevant data.
- Applications:
- Text Generation: Creating articles, poetry, and dialogue (e.g., GPT-3, ChatGPT).
- Image Generation: Creating realistic images from scratch (e.g., DALL-E, StyleGAN).
- Music Composition: Generating original music tracks (e.g., OpenAI Jukebox).
- Code Generation: Writing code snippets or entire programs (e.g., GitHub Copilot).
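To make the transformer technique above concrete, here is a minimal next-token generation sketch. It assumes the Hugging Face transformers library and the public GPT-2 checkpoint, which are illustrative choices rather than anything prescribed by this article:

```python
# Minimal sketch of transformer-based text generation, assuming the
# Hugging Face "transformers" library and the public GPT-2 checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Retrieval-augmented generation combines"
inputs = tokenizer(prompt, return_tensors="pt")

# generate() extends the prompt by repeatedly predicting the next token.
output_ids = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```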
Let’s look at the RAG model and some of its underlying security considerations.
What is RAG?
RAG (Retrieval-Augmented Generation) is an AI approach in which an assistant takes input from users, pulls documents relevant to the question or instruction from vector databases, and enriches the output to give a real-time response. By contrast, a regular LLM may miss context-specific questions, and even a fine-tuned LLM may not answer more proprietary questions.
A RAG model draws on a diverse document set to become an expert, specialized answer-giver. For example, an LLM in general is trained on generic data, but with RAG, more specific documents are integrated so the system can fetch relevant details.
“Querying an organization’s chatbot agent may help the customer partially, but if you need clarification on troubleshooting the product internals, it will not be able to answer unless it has the relevant documents. With a RAG architecture, the organization integrates all of its troubleshooting documents to provide an enhanced experience.”
The scope of the topic is not limited to training data. As new public or proprietary queries emerge for which the training data is not the only source, a RAG approach is more proactive: it fetches vast amounts of related information in real time and consolidates it into real-time responses. RAG draws on knowledge beyond the training data, aims to provide accurate, relevant information, and uses a transformer model as its generator.
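To illustrate the retrieve-then-generate flow, here is a minimal, self-contained sketch. The embed() function is a stand-in for a real embedding model, the in-memory document list stands in for a vector database, and generate_answer() is a hypothetical LLM call:

```python
# Minimal RAG sketch: embed the query, retrieve the closest documents
# from an in-memory stand-in for a vector database, and build an
# augmented prompt. embed() and generate_answer() are placeholders.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder embedding; a real system would call an embedding model.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(128)

documents = [
    "To reset the device, hold the power button for ten seconds.",
    "Error E42 indicates a failed firmware update; reflash the firmware.",
    "Warranty claims require the original proof of purchase.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list:
    q = embed(query)
    # Cosine similarity between the query and every stored document.
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(sims)[::-1][:k]]

query = "How do I fix error E42?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# generate_answer(prompt) would hand the augmented prompt to the LLM.
```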
Security Concerns in the RAG Model
Security threats in a RAG AI system occur at many interfaces. At the user interface, where the user issues instructions and questions to the chat agent, adversarial prompt engineering can arise. Much as a user crafts SQL queries to extract data, with interactive agents such as RAG a user can pose repeated questions and instructions, putting the system at high risk of resource consumption and overload and causing disruptions to the agent. Leaked chat histories are also an area of concern.
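One common mitigation for the resource-consumption and overload risk described above is per-user rate limiting at the chat interface. Below is a minimal token-bucket sketch; the capacity and refill values are illustrative assumptions, and answer() is a hypothetical LLM call:

```python
# Minimal per-user token-bucket rate limiter for a chat endpoint, to
# blunt resource exhaustion from rapid repeated prompts.
import time

class TokenBucket:
    def __init__(self, capacity=10, refill_per_sec=0.5):
        self.capacity = capacity              # allowed burst size (illustrative)
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec  # tokens restored per second
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets = {}  # one bucket per user id

def handle_prompt(user_id: str, prompt: str) -> str:
    bucket = buckets.setdefault(user_id, TokenBucket())
    if not bucket.allow():
        return "Rate limit exceeded; please slow down."
    return answer(prompt)  # answer() is a hypothetical LLM call
```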
Vector databases store enormous amounts of data to support higher-level, context-specific outputs, and the PII (Personally Identifiable Information) they contain is at risk. Inversion attacks and data poisoning against vector databases may arise from third-party interventions. Employees are given access to customer databases, creating a chance of a security breach. Stolen credentials, account takeover, and prompt injection attacks are threat scenarios at this stage.
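A common safeguard against embedding PII into a vector database is to redact obvious identifiers before documents are indexed. The regex patterns below are illustrative only; production systems typically rely on a dedicated PII/PHI detection service:

```python
# Minimal PII-redaction sketch applied before documents are embedded
# into a vector database. These regexes catch only obvious patterns.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

print(redact("Contact Jane at jane.doe@example.com or 555-123-4567."))
# -> Contact Jane at [EMAIL REDACTED] or [PHONE REDACTED].
```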
Model poisoning is said to occur when an attacker poses a combination of positive and negative questions that confuse the model, resulting in unexpected outputs.
Alert AI – RAG Security Detections
- System integrity of LLMs
- Build domain-specific security guardrails
- Alerts on reliability and trustworthiness of LLMs
- Alerts on audits of upstream dependency pipelines
- Integrity verifications at runtime (see the sketch after this list)
- Detects attempts to corrupt the output of the model
- Detects tokenizer files modified in supply chain attacks
- Detects tokenizer manipulations in LLMs
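As a sketch of how runtime tokenizer integrity verification might work, the snippet below hashes tokenizer files and compares them against known-good digests recorded when the model was vetted. The file name and digest are illustrative placeholders, not Alert AI's actual mechanism:

```python
# Minimal runtime integrity check for tokenizer files: compare SHA-256
# digests against known-good values recorded at model-vetting time.
# The path and digest below are illustrative placeholders.
import hashlib
from pathlib import Path

KNOWN_GOOD = {
    "tokenizer.json": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_tokenizer(model_dir: str) -> None:
    for name, expected in KNOWN_GOOD.items():
        if sha256_of(Path(model_dir) / name) != expected:
            raise RuntimeError(f"Tokenizer file {name} failed integrity check")

# verify_tokenizer("/models/my-llm")  # run before loading the tokenizer
```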
MANAGE ACCESS TO RESOURCES IN YOUR AI CLUSTERS
Ensure security controls so LLMs are ready for enterprise infrastructure.
Assign the AI service roles on AI resources to managed identities.
SPOT and STOP attacks on your AI infrastructure: compute, GPU, exfiltration, and service DDoS attacks.
Ensure privacy; obfuscate sensitive information:
–Redaction and obfuscation
–Copyright and legal exposures
–Data privacy violations
–Sensitive content: PII, PHI
–Sensitive information disclosure
About Alert AI
What is at stake for AI and Gen AI in business? We are addressing exactly that: a generative AI security solution for Healthcare, Pharma, Insurance, Life Sciences, Retail, Banking, Finance, and Manufacturing.
Alert AI is an end-to-end, interoperable generative AI security platform that helps enhance the security of generative AI applications and workflows against potential adversaries, model vulnerabilities, privacy, copyright, and legal exposures, sensitive information leaks, intelligence and data exfiltration, infiltration at training and inference, and integrity attacks in AI applications. It provides anomaly detection and enhanced visibility in AI pipelines, along with forensics, audit, and AI governance across your AI footprint.
Despite the security challenges, the promise of large language models is enormous.
We are committed to enabling industries and enterprises to reap the benefits of large language models.