AATMF v3.1 · Volume I

Foundations

Core concepts, risk methodology, and architectural overview.

Part 1: Introduction and Methodology · Part 2: Risk Assessment Methodology (AATMF-R v3) · Part 3: Framework Architecture

Part 1: Introduction and Methodology

The Critical Need for AI Threat Modeling

Artificial intelligence has transitioned from research curiosity to critical infrastructure. Language models process medical queries, legal documents, financial transactions, and government communications. Yet the security frameworks designed to protect these systems were built for a fundamentally different paradigm.

Traditional cybersecurity operates on deterministic logic: inputs produce predictable outputs, vulnerabilities have defined boundaries, and exploits follow reproducible steps. AI systems break every one of these assumptions. They are probabilistic, context-dependent, and — critically — trained on human language, making them susceptible to the same manipulation techniques that have been used against humans for millennia.

This is the core thesis of AATMF: AI systems are vulnerable to social engineering because they were trained to respond like humans. This is the first technology where human manipulation techniques directly translate to technical exploitation.

Genesis and Evolution

| Version | Date | Scope |
|---------|------|-------|
| v1.0 | 2024 | Initial framework, 8 tactics |
| v2.0 | Late 2024 | Expanded to 12 tactics, added risk scoring |
| v3 | February 2026 | 15 tactics, 240 techniques, 2,152+ procedures, namespaced IDs, Volumes V–VII, 2025–2026 threat integration |

Scope

AATMF covers adversarial threats against:

  • Large Language Models (LLMs) and Large Reasoning Models (LRMs)
  • Multimodal models (vision, audio, video)
  • Retrieval-Augmented Generation (RAG) systems
  • Autonomous AI agents and multi-agent orchestrators
  • AI development and deployment infrastructure
  • Human-in-the-loop workflows
  • AI supply chains (models, datasets, tools, libraries)

Threat Actor Taxonomy

| Actor | Motivation | Typical Tactics | Sophistication |
|-------|------------|-----------------|----------------|
| Script kiddies | Curiosity, clout | T1, T2 | Low |
| Bug bounty hunters | Financial reward | T1–T5, T10 | Medium–High |
| Cybercriminals | Financial gain | T1–T3, T7–T8, T13 | Medium |
| Corporate espionage | Competitive advantage | T5, T10, T13–T14 | High |
| Nation-state actors | Strategic advantage | T6, T11, T13–T15 | Very High |
| AI red teams | Security improvement | All | Very High |
| Insiders | Various | T6, T15 | Variable |

Methodology

Each technique in AATMF is documented with:

  1. Unique namespaced identifier — T{tactic}-AT-{sequence:03d}
  2. Risk score — Computed via AATMF-R v3 six-factor formula
  3. Attack procedures — Concrete implementation variants with example prompts
  4. Detection patterns — Signatures and heuristics for identifying the technique
  5. Mitigation controls — Defensive measures mapped to the technique
  6. Cross-framework references — Mappings to MITRE ATLAS, OWASP, NIST, EU AI Act
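
The six documented fields map naturally onto a structured record. A minimal sketch in Python (the class and field names here are illustrative, not the framework's canonical schema):

```python
from dataclasses import dataclass, field

@dataclass
class Technique:
    """One AATMF technique entry (illustrative field names)."""
    technique_id: str       # namespaced ID, e.g. "T1-AT-001"
    name: str
    risk_score: float       # computed via the AATMF-R v3 formula
    procedures: list = field(default_factory=list)          # attack procedure IDs
    detection_patterns: list = field(default_factory=list)  # signatures / heuristics
    mitigations: list = field(default_factory=list)         # mapped defensive controls
    cross_refs: dict = field(default_factory=dict)          # e.g. {"ATLAS": "AML.T0051"}

entry = Technique("T1-AT-001", "Instruction Override Injection", 25.0,
                  cross_refs={"ATLAS": "AML.T0051", "OWASP": "LLM01"})
```

A record like this carries everything needed to render a technique page or feed a risk dashboard.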



Part 2: Risk Assessment Methodology (AATMF-R v3)

Formula

Risk = (L × I × E) / 6 × (D / 6) × R × C

Factors

| Factor | Symbol | Range | Description |
|--------|--------|-------|-------------|
| Likelihood | L | 1–5 | Probability of successful exploitation |
| Impact | I | 1–5 | Severity of a successful attack |
| Exploitability | E | 1–5 | Ease of execution (skill, resources, access required) |
| Detectability | D | 1–5 | Difficulty of detection (5 = nearly invisible) |
| Recoverability | R | 1–5 | Effort to recover (5 = irrecoverable) |
| Cost Factor | C | 0.5–2.0 | Economic impact multiplier |
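
Translated to code, the formula and factor ranges look like this (a sketch; the function name and range checks are ours):

```python
def aatmf_risk(L: float, I: float, E: float, D: float, R: float, C: float) -> float:
    """AATMF-R v3 six-factor score: Risk = (L * I * E) / 6 * (D / 6) * R * C."""
    for name, value in {"L": L, "I": I, "E": E, "D": D, "R": R}.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be in 1-5, got {value}")
    if not 0.5 <= C <= 2.0:
        raise ValueError(f"C must be in 0.5-2.0, got {C}")
    return (L * I * E) / 6 * (D / 6) * R * C
```

With the worked example later in this part, `aatmf_risk(5, 4, 5, 3, 2, 1.5)` evaluates to 25.0 (up to floating-point rounding).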

Scoring Guidelines

Likelihood (L)

| Score | Label | Criteria |
|-------|-------|----------|
| 1 | Rare | Requires novel research, no known PoC |
| 2 | Unlikely | Requires specialized knowledge |
| 3 | Possible | Known technique, moderate skill required |
| 4 | Likely | Well-documented, readily available tools |
| 5 | Almost Certain | Automated, commodity attack |

Impact (I)

| Score | Label | Criteria |
|-------|-------|----------|
| 1 | Negligible | Minor policy violation, no data exposure |
| 2 | Minor | Limited harmful content, no sensitive data |
| 3 | Moderate | Sensitive data exposure, service degradation |
| 4 | Major | Critical data breach, safety bypass, service outage |
| 5 | Catastrophic | Physical harm potential, mass data breach, systemic compromise |

Exploitability (E)

| Score | Label | Criteria |
|-------|-------|----------|
| 1 | Theoretical | Requires custom research and novel techniques |
| 2 | Difficult | Needs deep expertise and specific conditions |
| 3 | Moderate | Documented approach, some skill required |
| 4 | Easy | Copy-paste attacks, minimal customization |
| 5 | Trivial | Automated tools, zero skill required |

Detectability (D)

| Score | Label | Criteria |
|-------|-------|----------|
| 1 | Obvious | Trivially detected by basic filters |
| 2 | Easy | Standard monitoring catches it |
| 3 | Moderate | Requires specialized detection |
| 4 | Difficult | Advanced analysis needed |
| 5 | Nearly Invisible | No reliable detection method exists |

Recoverability (R)

| Score | Label | Criteria |
|-------|-------|----------|
| 1 | Immediate | Auto-recoverable, no intervention needed |
| 2 | Quick | Simple rollback or reset |
| 3 | Moderate | Requires investigation and manual remediation |
| 4 | Difficult | Extended downtime, data loss possible |
| 5 | Irrecoverable | Permanent damage, no full recovery path |

Cost Factor (C)

| Value | Criteria |
|-------|----------|
| 0.5 | Minimal economic impact, internal only |
| 1.0 | Standard business impact |
| 1.5 | Significant financial or reputational damage |
| 2.0 | Catastrophic economic consequences |

Risk Rating Scale

| Score | Rating | Color | Action Required |
|-------|--------|-------|-----------------|
| 250+ | CRITICAL | 🔴 | Immediate remediation required |
| 200–249 | HIGH | 🟠 | Remediation within current sprint |
| 150–199 | MEDIUM | 🟡 | Scheduled remediation |
| 100–149 | LOW | 🔵 | Risk accepted or monitored |
| 0–99 | INFO | | Documented, no action required |
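
The banding is a simple threshold lookup; a minimal sketch (the function name is ours):

```python
def aatmf_rating(score: float) -> str:
    """Map an AATMF-R v3 risk score to its rating band."""
    if score >= 250:
        return "CRITICAL"
    if score >= 200:
        return "HIGH"
    if score >= 150:
        return "MEDIUM"
    if score >= 100:
        return "LOW"
    return "INFO"
```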

Example Calculation

T1-AT-001 — Instruction Override Injection

| Factor | Score | Rationale |
|--------|-------|-----------|
| Likelihood | 5 | Commodity attack, automated tools exist |
| Impact | 4 | Complete safety bypass |
| Exploitability | 5 | Copy-paste, zero skill |
| Detectability | 3 | Pattern-matchable but evolving |
| Recoverability | 2 | Session-scoped, no persistent damage |
| Cost Factor | 1.5 | Brand and regulatory risk |
Risk = (5 × 4 × 5) / 6 × (3 / 6) × 2 × 1.5
     = 100/6 × 0.5 × 2 × 1.5
     = 16.67 × 0.5 × 2 × 1.5
     = 25.0

Note: Scores vary based on deployment context. A chatbot vs. an autonomous financial agent would score very differently on Impact and Cost Factor.




Part 3: Framework Architecture

Hierarchical Structure

AATMF v3
├── 15 Tactics                    (high-level adversarial objectives)
│   ├── 240 Techniques            (specific attack methods)
│   │   ├── 2,152+ Attack Procedures  (implementation variants)
│   │   │   └── 4,980+ Prompts        (actual attack examples)
│   │   ├── Detection Patterns
│   │   └── Mitigation Controls
│   └── Risk Scoring (AATMF-R v3)
└── Cross-Framework Mappings
    ├── MITRE ATLAS v4.6.0
    ├── OWASP LLM Top 10 2025
    ├── NIST AI RMF / IR 8596
    └── EU AI Act

Namespaced Identifier System

v3 introduces namespaced identifiers to eliminate AT-ID collisions:

| Element | Format | Example |
|---------|--------|---------|
| Tactic | T{n} | T1, T15 |
| Technique | T{n}-AT-{seq:03d} | T1-AT-001, T11-AT-016 |
| Attack Procedure | T{n}-AP-{seq}{letter} | T1-AP-001A, T3-AP-010B |

Why Namespacing?

In v3.0, AT-010 referred to "Dialogue Hijacking" in T1 and "Euphemism Exploitation" in T2 — completely different techniques sharing the same ID. Across all 15 tactics, 43 such collisions existed. The namespaced system guarantees every identifier is globally unique while preserving tactic membership at a glance.
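
Because the formats are regular, identifiers can be validated mechanically. A sketch of validators for the two namespaced formats (the regex patterns and helper are ours, not part of the framework):

```python
import re

# Technique: T{n}-AT-{seq:03d}, tactic number 1-15, e.g. "T1-AT-001", "T11-AT-016"
TECHNIQUE_ID = re.compile(r"^T(?:1[0-5]|[1-9])-AT-\d{3}$")

# Attack procedure: T{n}-AP-{seq}{letter}, e.g. "T1-AP-001A", "T3-AP-010B"
PROCEDURE_ID = re.compile(r"^T(?:1[0-5]|[1-9])-AP-\d{3}[A-Z]$")

def is_valid_id(candidate: str) -> bool:
    """True if the string is a well-formed namespaced technique or procedure ID."""
    return bool(TECHNIQUE_ID.match(candidate) or PROCEDURE_ID.match(candidate))
```

`is_valid_id("T11-AT-016")` and `is_valid_id("T3-AP-010B")` are true, while a bare pre-namespacing ID such as `"AT-010"` is rejected.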

Cross-Framework Mappings

MITRE ATLAS v4.6.0 (October 2025)

| AATMF Tactic | Primary ATLAS Mapping |
|--------------|-----------------------|
| T1 — Prompt Subversion | AML.T0051 LLM Prompt Injection |
| T2 — Semantic Evasion | AML.T0054 LLM Jailbreak |
| T3 — Reasoning Exploitation | AML.T0054.001–003 |
| T4 — Multi-Turn | AML.T0056 LLM Meta Prompt Extraction |
| T5 — Model/API Exploitation | AML.T0044 Full ML Model Access |
| T6 — Training Poisoning | AML.T0020 Poison Training Data |
| T7 — Output Manipulation | AML.T0024.002 Exfiltration via ML Inference API |
| T8 — Deception | AML.T0048 Societal Harm |
| T9 — Multimodal | AML.T0051 (cross-modal variants) |
| T10 — Integrity Breach | AML.T0024 Exfiltration via Cyber Means |
| T11 — Agentic | AML.T0057 LLM Agent Abuse |
| T12 — RAG Manipulation | AML.T0058 RAG Poisoning |
| T13 — Supply Chain | AML.T0010 ML Supply Chain Compromise |
| T14 — Infrastructure | AML.T0029 Denial of ML Service |
| T15 — Human Workflow | AML.T0048.004 Reputational Harm |

OWASP LLM Top 10 2025

| OWASP Entry | AATMF Coverage |
|-------------|----------------|
| LLM01: Prompt Injection | T1, T2, T3, T9 |
| LLM02: Sensitive Information Disclosure | T7, T10 |
| LLM03: Supply Chain Vulnerabilities | T13 |
| LLM04: Data and Model Poisoning | T6, T12 |
| LLM05: Improper Output Handling | T7, T8 |
| LLM06: Excessive Agency | T11 |
| LLM07: System Prompt Leakage | T1, T4 |
| LLM08: Vector and Embedding Weaknesses | T12 |
| LLM09: Misinformation | T8 |
| LLM10: Unbounded Consumption | T14 |

Tactic Overview

| ID | Tactic | Techniques | Procedures |
|----|--------|------------|------------|
| T1 | Prompt & Context Subversion | 16 | 76 |
| T2 | Semantic & Linguistic Evasion | 20 | 161 |
| T3 | Reasoning & Constraint Exploitation | 19 | 178 |
| T4 | Multi-Turn & Memory Manipulation | 16 | 147 |
| T5 | Model & API Exploitation | 16 | 142 |
| T6 | Training & Feedback Poisoning | 15 | 141 |
| T7 | Output Manipulation & Exfiltration | 15 | 146 |
| T8 | External Deception & Misinformation | 15 | 150 |
| T9 | Multimodal & Cross-Channel Attacks | 17 | 147 |
| T10 | Integrity & Confidentiality Breach | 15 | 147 |
| T11 | Agentic & Orchestrator Exploitation | 16 | 160 |
| T12 | RAG & Knowledge Base Manipulation | 15 | 149 |
| T13 | AI Supply Chain & Artifact Trust | 15 | 150 |
| T14 | Infrastructure & Economic Warfare | 15 | 150 |
| T15 | Human Workflow Exploitation | 15 | 108 |
| | Total | 240 | 2,152+ |


  • Vol II · Core Tactics (T01–T08): the eight foundational adversarial-AI tactics
  • Vol III · Advanced Tactics (T09–T12): multimodal attacks, integrity breach, agentic exploitation, and RAG-specific threats
  • Vol IV · Infrastructure & Human (T13–T15): where the attack surface meets the surrounding stack — supply chain, infrastructure, and human workflows
  • Vol V · Operations: detection engineering, mitigation, incident response, red-team and blue-team operations
  • Vol VI · Governance: risk management, compliance mapping (NIST AI RMF, MITRE ATLAS), and security training
  • Vol VII · Appendices: attack catalog, signatures, tools, templates, case studies, glossary
Author
Kai Aizen
Independent offensive security researcher. 23 published CVEs, 5 Linux kernel mainline patches, creator of AATMF / P.R.O.M.P.T / SEF, author of Adversarial Minds.