LLM red-teaming. Agentic blast-radius. Regulator-ready.

AI systems introduce attack surfaces that standard application security controls do not address. Prompt injection, model theft, indirect instruction hijacking via documents and tools, and agentic privilege escalation require purpose-built assessment methodologies. We assess, harden, and govern AI systems from the model layer through the application interface — and against regulatory obligations that are now enforceable.

OWASP LLM Top 10MITRE ATLASNIST AI RMFEU AI ActISO/IEC 42001
ATTACK SURFACE

Where AI breaks differently from web.

LLM Application Security

  • Direct prompt injection user-supplied input overriding system instructions, role confusion, and safety bypass techniques
  • Indirect prompt injection adversarial instructions embedded in documents, web content, emails, or tool outputs retrieved and processed by the model without user awareness
  • Data exfiltration via output extraction of system prompts, RAG context, other users' data, and confidential configuration through crafted queries
  • Jailbreaks & safety bypass multi-turn manipulation, roleplay exploitation, context window stuffing, and encoding evasion
  • Function / tool call abuse injection of malicious tool arguments, tool selection manipulation, and chained tool misuse
  • Insecure output handling LLM-generated content rendered in unsafe contexts (XSS, SQLi, path traversal downstream)

Agentic AI Security

  • Agent privilege escalation agents acquiring capabilities or access beyond their intended scope through instruction manipulation or tool chaining
  • Multi-agent trust boundaries in orchestrator/sub-agent architectures, assessment of trust assumptions between agents and the impact of one compromised agent on others
  • Persistent memory poisoning injecting adversarial content into agent memory stores that influences future decisions or actions
  • Tool authorisation controls whether agents can be induced to invoke tools with unintended arguments or outside authorised scope
  • Human-in-the-loop bypass techniques to circumvent approval workflows and safety checkpoints in agentic pipelines
  • Blast radius assessment mapping the real-world impact of a fully compromised agent given its tool access and permissions

RAG & Data Pipeline Security

  • Vector database security access controls, tenant isolation, and the risk of cross-tenant data retrieval in multi-tenant RAG deployments
  • Document ingestion security PII/PHI detection in preprocessing, metadata leakage, and malicious document content that survives into the index
  • Retrieval manipulation adversarial documents crafted to be preferentially retrieved and override legitimate context
  • Embedding model integrity validation that embedding models have not been substituted or tampered with in the pipeline
  • Query-time injection user input crafted to manipulate the retrieval query and surface attacker-controlled content

Model & Supply Chain Security

  • Model provenance & integrity cryptographic signing, provenance tracking (in-toto / SLSA for ML), and registry access controls
  • Data poisoning assessment of training and fine-tuning data pipeline integrity and controls against adversarial data injection
  • Model theft & extraction API-level model replication attacks and intellectual property exfiltration through repeated querying
  • Membership inference determining whether specific data was present in training data, with privacy and compliance implications
  • Dependency & plugin security SBOM for ML dependencies, scanning of LLM plugins and integrations, and third-party model risk
CONTROLS & RED TEAMING

Defensive engineering and continuous evaluation.

Controls & Guardrails

  • Gateway policies input/output filtering, content classification, rate limits, and tool scope enforcement at the API gateway layer
  • Context governance allowlists for retrievable document sets, chunk-level metadata access controls, and tenant isolation enforcement
  • PII / PHI redaction pre-query sanitisation and post-response scrubbing for regulated data types
  • Prompt/response observability telemetry, audit logging, and anomaly detection across all model interactions
  • Secrets hygiene ensuring API keys, credentials, and system prompt contents cannot be extracted via model outputs

Red Teaming & Safety Evals

  • Adversarial prompt libraries curated attack sets covering OWASP LLM Top 10 categories with application-specific adaptations
  • Automated safety evals regression test suites integrated into CI/CD to detect guardrail regressions on model or prompt updates
  • Manual red team campaigns chained, multi-turn attack scenarios including indirect injection via realistic document payloads
  • Model drift monitoring detecting behavioural changes in updated models that weaken previously validated safety properties
  • Multimodal testing vision, audio, and code inputs used as injection vectors where the application accepts non-text inputs
GOVERNANCE & REGULATION

Documentation regulators will accept.

NIST AI RMF

  • AI risk register and control catalogue mapped to Govern, Map, Measure, and Manage functions
  • Model cards and system cards documenting intended use, limitations, and known risks
  • AI risk profile and organisational risk tolerance documentation

EU AI Act

  • Risk classification assessment of whether systems fall under prohibited, high-risk, limited-risk, or minimal-risk categories
  • High-risk system obligations risk management system, data governance, technical documentation, human oversight, and accuracy/robustness requirements
  • GPAI model obligations transparency, copyright policy, and systematic risk assessment for general-purpose AI models

ISO/IEC 42001 & Privacy

  • ISO/IEC 42001 AI management system gap assessment and implementation roadmap
  • Data retention and consent documentation for training and inference data
  • Privacy impact assessment for AI systems processing personal data
  • Export controls and cross-border data transfer obligations for model deployment
DELIVERABLES

What ships at the end.

Assessment

  • AI system threat model with MITRE ATLAS technique mapping
  • OWASP LLM Top 10 assessment report with findings, evidence, and remediation guidance
  • Agentic system blast radius analysis and privilege boundary assessment
  • RAG pipeline security review with data flow analysis

Engineering

  • Gateway and guardrail configuration with policy-as-code rules
  • Automated safety eval suite with CI/CD integration instructions
  • Observability configuration: prompt/response logging and anomaly detection rules
  • SBOM for ML dependencies with vulnerability assessment

Governance

  • AI risk register mapped to NIST AI RMF or EU AI Act as applicable
  • Model card and system card templates populated for assessed systems
  • EU AI Act risk classification memo with obligations analysis
  • Executive summary and remediation roadmap
ACTIVE INCIDENT?