/ practice · red-team

We stress-test what others have built.

Independent expert red-teaming of LLMs, agents, and AI systems: before they ship, before they scale, before the regulator calls. Adversarial prompts, jailbreak attempts, tool-misuse scenarios, prompt-injection chains, edge-case planning. We find the failures you’d rather hear about from us than from your customers. Every report is signed and dated.

Methodology

/ 4 phases

Week 1

Scope + threat model

We agree which systems are in scope, which failure modes matter most, what access we need, and what we’re not allowed to touch. Independence clauses get drafted with your legal counsel before anyone runs a prompt.

Weeks 2–5

Adversarial campaign

Adversarial prompts, jailbreak attempts, prompt-injection chains, tool-misuse scenarios, edge-case planning, and grounded-QA stress tests. Code review, prompt review, eval review, log review where in scope. Every finding is referenced to a primary source.

Weeks 6–9

Test + report

We run our own evaluation suite against the system as a third party. The report draft goes through internal review by a red-team lead who is not on the engagement.

Weeks 10–12

Sign-off + submission pack

Final report, signed by the red-team lead, with the regulatory submission pack assembled per framework (when applicable). Findings, evidence, severity ladder, dissents — all included. The red-team lead’s name is on every page.

Engagement shape

/ what you sign up for

/ typical cycle

2–12 weeks (scope-dependent)

/ our team on the engagement

1 red-team lead, 1 framework specialist per framework in scope, 1 evidence reviewer

/ what we need from you

An authorised counterpart with sign-off authority, access to source, prompts, logs, and incident records, your existing policy documents, and a contact on your legal team

/ revenue cap

No cap. This is core practice.

/ deliverables

01 Findings report: signed, with severity ladder 1–3
02 Patch playbook by failure mode: actionable, with rerun criteria
03 Regulatory submission pack: per framework, when applicable
04 Independence attestation: structurally enforced, recorded in your engagement contract

Mappings

/ 6 frameworks

EU AI Act

High-risk systems (Annex III), transparency obligations (Article 50), general-purpose AI models (Articles 51–55), and the conformity-assessment procedure (Article 43)

ISO 42001

Full management-system mapping — clauses 4–10 plus Annex A controls

NIST AI RMF 1.0

Govern, Map, Measure, Manage functions; full traceability of evidence

UAE AI Charter

All twelve principles, evidence-mapped

DIFC + ADGM data regulations

Cross-border data flow, processing notices, data-subject rights as they intersect with AI decisioning

SOC 2 Type II

Mapped to the security, availability, and confidentiality trust services criteria where AI systems touch them

Evidence

/ forthcoming

/ status · forthcoming

The proof for this pillar gets linked here as we ship public scorecards and clear case studies for publication. We don’t backfill this section with placeholders — when evidence lands, it lands here.

Why independence is enforced contractually

The standard objection to independent assessment is the same one accountancy faced thirty years ago: the firm doing the review is also the firm doing the build. We refuse that shape. On every engagement, the red-team lead and the build lead are different people, named in your contract. Both are subject to revenue caps the contract spells out.

This is in writing because we won’t sell anything we wouldn’t sign our name to in a regulator’s submission.

Where we draw the line

We turn down red-team work that we can’t ship a report for. If a client wants a confidential review that we can’t sign, and can’t publish in redacted form, we refuse. The signed report is the product. Without it, what we’d be selling is a private opinion, and we don’t sell those.

/ ready to talk shape

Tell us what’s under contract.


LATTICE/AI · EST. 2026 · UAE
