Do you support RAG and fine-tuned models?

Yes. We assess RAG pipelines (retrieval, chunking, embedding, re-ranking), fine-tuned models, and production deployments across all major model providers, including self-hosted open-source models.

Can you test agentic AI systems?

Yes. Agentic systems — including multi-agent orchestration, function-calling chains, and autonomous decision loops — represent an expanded attack surface beyond standard LLM application security. We assess tool abuse, privilege escalation through agents, and indirect prompt injection in agentic contexts.

Can you align to NIST AI RMF or EU AI Act?

Yes. We provide governance support, risk registers, control mapping, and technical documentation for NIST AI RMF and EU AI Act readiness. For EU AI Act, we address risk classification, prohibited practice assessment, high-risk system obligations, and transparency requirements.

LLM red-teaming. Agentic blast-radius. Regulator-ready.

AI systems introduce attack surfaces that standard application security controls do not address. Prompt injection, model theft, indirect instruction hijacking via documents and tools, and agentic privilege escalation require purpose-built assessment methodologies. We assess, harden, and govern AI systems from the model layer through the application interface — and against regulatory obligations that are now enforceable.

OWASP LLM Top 10MITRE ATLASNIST AI RMFEU AI ActISO/IEC 42001

ATTACK SURFACE

Where AI breaks differently from web.

LLM Application Security

Direct prompt injection — user-supplied input overriding system instructions, role confusion, and safety bypass techniques
Indirect prompt injection — adversarial instructions embedded in documents, web content, emails, or tool outputs retrieved and processed by the model without user awareness
Data exfiltration via output — extraction of system prompts, RAG context, other users' data, and confidential configuration through crafted queries
Jailbreaks & safety bypass — multi-turn manipulation, roleplay exploitation, context window stuffing, and encoding evasion
Function / tool call abuse — injection of malicious tool arguments, tool selection manipulation, and chained tool misuse
Insecure output handling — LLM-generated content rendered in unsafe contexts (XSS, SQLi, path traversal downstream)

Agentic AI Security

Agent privilege escalation — agents acquiring capabilities or access beyond their intended scope through instruction manipulation or tool chaining
Multi-agent trust boundaries — in orchestrator/sub-agent architectures, assessment of trust assumptions between agents and the impact of one compromised agent on others
Persistent memory poisoning — injecting adversarial content into agent memory stores that influences future decisions or actions
Tool authorisation controls — whether agents can be induced to invoke tools with unintended arguments or outside authorised scope
Human-in-the-loop bypass — techniques to circumvent approval workflows and safety checkpoints in agentic pipelines
Blast radius assessment — mapping the real-world impact of a fully compromised agent given its tool access and permissions

RAG & Data Pipeline Security

Vector database security — access controls, tenant isolation, and the risk of cross-tenant data retrieval in multi-tenant RAG deployments
Document ingestion security — PII/PHI detection in preprocessing, metadata leakage, and malicious document content that survives into the index
Retrieval manipulation — adversarial documents crafted to be preferentially retrieved and override legitimate context
Embedding model integrity — validation that embedding models have not been substituted or tampered with in the pipeline
Query-time injection — user input crafted to manipulate the retrieval query and surface attacker-controlled content

Model & Supply Chain Security

Model provenance & integrity — cryptographic signing, provenance tracking (in-toto / SLSA for ML), and registry access controls
Data poisoning — assessment of training and fine-tuning data pipeline integrity and controls against adversarial data injection
Model theft & extraction — API-level model replication attacks and intellectual property exfiltration through repeated querying
Membership inference — determining whether specific data was present in training data, with privacy and compliance implications
Dependency & plugin security — SBOM for ML dependencies, scanning of LLM plugins and integrations, and third-party model risk

CONTROLS & RED TEAMING

Defensive engineering and continuous evaluation.

Controls & Guardrails

Gateway policies — input/output filtering, content classification, rate limits, and tool scope enforcement at the API gateway layer
Context governance — allowlists for retrievable document sets, chunk-level metadata access controls, and tenant isolation enforcement
PII / PHI redaction — pre-query sanitisation and post-response scrubbing for regulated data types
Prompt/response observability — telemetry, audit logging, and anomaly detection across all model interactions
Secrets hygiene — ensuring API keys, credentials, and system prompt contents cannot be extracted via model outputs

Red Teaming & Safety Evals

Adversarial prompt libraries — curated attack sets covering OWASP LLM Top 10 categories with application-specific adaptations
Automated safety evals — regression test suites integrated into CI/CD to detect guardrail regressions on model or prompt updates
Manual red team campaigns — chained, multi-turn attack scenarios including indirect injection via realistic document payloads
Model drift monitoring — detecting behavioural changes in updated models that weaken previously validated safety properties
Multimodal testing — vision, audio, and code inputs used as injection vectors where the application accepts non-text inputs

GOVERNANCE & REGULATION

Documentation regulators will accept.

NIST AI RMF

AI risk register and control catalogue mapped to Govern, Map, Measure, and Manage functions
Model cards and system cards documenting intended use, limitations, and known risks
AI risk profile and organisational risk tolerance documentation

EU AI Act

Risk classification — assessment of whether systems fall under prohibited, high-risk, limited-risk, or minimal-risk categories
High-risk system obligations — risk management system, data governance, technical documentation, human oversight, and accuracy/robustness requirements
GPAI model obligations — transparency, copyright policy, and systematic risk assessment for general-purpose AI models

ISO/IEC 42001 & Privacy

ISO/IEC 42001 AI management system gap assessment and implementation roadmap
Data retention and consent documentation for training and inference data
Privacy impact assessment for AI systems processing personal data
Export controls and cross-border data transfer obligations for model deployment

DELIVERABLES

What ships at the end.

Assessment

AI system threat model with MITRE ATLAS technique mapping
OWASP LLM Top 10 assessment report with findings, evidence, and remediation guidance
Agentic system blast radius analysis and privilege boundary assessment
RAG pipeline security review with data flow analysis

Engineering

Gateway and guardrail configuration with policy-as-code rules
Automated safety eval suite with CI/CD integration instructions
Observability configuration: prompt/response logging and anomaly detection rules
SBOM for ML dependencies with vulnerability assessment

Governance

AI risk register mapped to NIST AI RMF or EU AI Act as applicable
Model card and system card templates populated for assessed systems
EU AI Act risk classification memo with obligations analysis
Executive summary and remediation roadmap

Get a written proposal

Send scope + timeline. Detailed SoW within 1 business day.

Open the form →

Email a senior practitioner

Direct line for scoping questions. NDA available on request before you share details.

hello@grillisecurity.com →

Active incident?

24/7 incident line. Triage call + retainer set-up inside the hour for new engagements.

+372 5610 1641 →