Codifying Trust in Artificial Intelligence.

The Constitution AI Project is a non-profit initiative dedicated to developing, promoting, and implementing a global, open-source framework for safe, accountable, and human-centric AI.

Our Mission

"To move beyond ethical platitudes and provide the world with a practical, auditable, and enforceable standard for trustworthy AI."

We believe that AI safety and innovation are not opposing forces. True progress requires a foundation of trust. Our mission is to build that foundation by providing a 'gold standard' constitutional framework that organizations, developers, and policymakers can adopt, adapt, and implement to ensure artificial intelligence is built and deployed in service of humanity.

Model AI Constitution

Article I: Foundational Principles & Prohibitions

Section 1.1: Unacceptable Risks (Prohibitions)

The development, procurement, or deployment of AI systems designed for, or having the primary effect of, the following purposes is strictly and unequivocally prohibited:

  • (a) Cognitive Behavioural Manipulation: Systems that deploy subliminal, manipulative, or deceptive techniques to materially distort a person’s or a specific vulnerable group's behavior in a manner that causes or is likely to cause harm.

  • (b) Social Scoring: Systems that classify, evaluate, or score individuals or groups based on their social behavior, socio-economic status, or personal characteristics, where such scoring leads to detrimental or unfavorable treatment.

  • (c) Indiscriminate Biometric Surveillance: Systems that engage in the untargeted scraping of facial images or other biometric data from the internet or public-access (CCTV) footage to create or expand biometric identification databases.

  • (d) Emotion Recognition in Sensitive Contexts: Systems used to infer emotions or mental states of individuals in the contexts of employment and educational institutions.

Expert Annotation:

This is the Constitution's "hard firewall" and a core component of a 100-point framework (Pillar 1.2). It is non-negotiable. It moves beyond vague ethical suggestions by establishing "red lines" derived directly from the "Unacceptable Risk" category of the EU AI Act. This provides immediate, unambiguous legal and ethical clarity to all developers and partners. A "100-point" framework does not "weigh" these risks; it eliminates them from the organization's activities.

Section 1.2: Core Principles

All AI systems not prohibited by Section 1.1 shall be designed, deployed, and governed in accordance with the following seven core principles:

  • (a) Human-Centricity & Dignity: AI systems shall serve humanity. They must respect, protect, and promote internationally recognized human rights, fundamental freedoms, and human dignity.

  • (b) Fairness & Non-Discrimination: AI systems shall be designed to treat all individuals and groups equitably and to actively mitigate and avoid "unfair bias". Systems shall not perpetuate or exacerbate discriminatory biases.

  • (c) Transparency & Explainability: The operation of AI systems shall be transparent. Technical "Explainability" (XAI) shall be provided to operators, and simple "Transparency" shall be provided to end-users, ensuring they know when they are interacting with an AI.

  • (d) Robustness, Safety, & Security: AI systems shall be safe, secure, and robust throughout their entire lifecycle. They must function appropriately for their intended use and be resilient against attacks or conditions that could cause harm.

  • (e) Privacy & Data Governance: AI systems shall comply with all privacy laws and be "built with privacy by design". Data governance must ensure training, validation, and testing datasets are "relevant, sufficiently representative and, to the best extent possible, free of errors and complete".

  • (f) Accountability & Human Oversight: Mechanisms shall be in place to ensure human oversight, responsibility, and accountability for AI systems and their outcomes. AI systems shall not be the final authority on decisions that "produce legal effects".

  • (g) Sustainability & Environmental Flourishing: AI systems shall be designed, trained, and operated to promote sustainable development. Their full environmental, societal, and economic impact must be assessed to ensure they contribute to "environment and ecosystem flourishing".

Expert Annotation:

This section codifies the global consensus identified in Pillar 1.1 and its corresponding table. It synthesizes the core principles from the world's most critical intergovernmental (OECD, UNESCO) and corporate (Microsoft, Google, Meta) frameworks. Crucially, it includes "Sustainability & Environmental Flourishing," a principle central to the UNESCO framework and EU guidelines that is often missed in corporate-only documents. This makes the framework more comprehensive and future-facing.

Article II: Governance & Oversight Structure

Section 2.1: The AI Risk Management Office (ARMO)

There shall be established an AI Risk Management Office (ARMO), an operational body whose charter is to implement this Constitution. The ARMO shall be responsible for:

  • (a) Operationalizing the National Institute of Standards and Technology (NIST) AI Risk Management Framework (RMF) across the full AI lifecycle.

  • (b) Executing the core functions of Govern, Map, Measure, and Manage for all AI projects.

  • (c) Conducting and documenting all required risk assessments, including the "Human Impact Scorecard" (Art. 3.3), and preparing compliance reports for the AI Review Board.

  • (d) Maintaining the central AI Model Registry (Art. 5.1).
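
To illustrate how the ARMO might operationalize clause (b) above, here is a minimal sketch that tracks completion of the four NIST AI RMF functions for a single project. The `Project` and `RMFFunction` names and the completion check are hypothetical conveniences, not part of the Constitution or the NIST framework.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class RMFFunction(Enum):
    # The four core functions of the NIST AI Risk Management Framework.
    GOVERN = auto()
    MAP = auto()
    MEASURE = auto()
    MANAGE = auto()

@dataclass
class Project:
    """Hypothetical ARMO record of RMF execution for one AI project."""
    name: str
    completed: set[RMFFunction] = field(default_factory=set)

    def complete(self, fn: RMFFunction) -> None:
        self.completed.add(fn)

    def rmf_complete(self) -> bool:
        # All four functions must be documented before the project
        # can be submitted to the AIRB for review.
        return self.completed == set(RMFFunction)

project = Project("loan-eligibility-model")
for fn in (RMFFunction.GOVERN, RMFFunction.MAP, RMFFunction.MEASURE, RMFFunction.MANAGE):
    project.complete(fn)
assert project.rmf_complete()
```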

Expert Annotation:

This is the "engine" of the governance system (Pillar 2.1). It creates the operational "do-ers." It avoids ambiguity by not asking the ARMO to "invent" a process. It mandates the "gold standard" NIST AI RMF as its operational charter. This makes the ARMO's role immediately clear, auditable, and aligned with best practices. It clearly defines the ARMO's relationship to the AIRB: the ARMO prepares compliance, and the AIRB adjudicates it. This is the first step in our "separation of powers."

Section 2.2: The AI Review Board (AIRB)

  • (a) There shall be established an independent, cross-functional AI Review Board (AIRB), which shall serve as the primary internal enforcement and adjudication body of this Constitution.

  • (b) The AIRB's composition and procedures shall be modeled on medical Institutional Review Boards (IRBs) / Independent Ethics Committees (IECs).

  • (c) The AIRB is vested with binding authority to review, audit, and:

    • i. Approve;

    • ii. Require Modification To; or

    • iii. Disapprove (Halt)

    any High-Risk AI System (Art. 3.2) prior to deployment.

  • (d) The AIRB shall be a "multidisciplinary team" composed of, at minimum: senior representatives from Legal, Ethics, Privacy, and Cybersecurity; senior Technical AI Experts; and at least one external member representing impacted communities or civil society.

  • (e) The AIRB shall have the authority to investigate all reported "serious incidents" (Art. 6.3) and impose the internal consequences defined in Article VI.

Expert Annotation:

This is the "internal check" and the "teeth" of the governance system (Pillar 2.2). This Article solves the "toothless ethics board" problem. Modeling it on the IRB is a critical step, as the IRB model is legally established for governing high-risk human-centric research. The mandate in Section 2.2(c) for binding authority is the most important part. It gives the AIRB real power to halt a product launch, a power that advisory-only committees lack. Section 2.2(d) mandates a "cross-functional" and "multidisciplinary" composition, including an external member, to prevent groupthink and "capture" by internal development priorities.

Section 2.3: Multi-Stakeholder Forum

  • (a) A Multi-Stakeholder Forum shall be convened on a regular basis to provide external perspectives on this Constitution and its implementation.

  • (b) This Forum shall include partners from academia, civil society, and impacted communities.

  • (c) The Forum shall provide external review of the organization's adherence to "inclusive research and design" and shall be a primary participant in the "Collective Constitutional AI" amendment process (Art. 8.2).

Expert Annotation:

This is the "external link" (Pillar 2.3). It ensures governance is not a purely internal "echo chamber". It formalizes engagement with the outside world, building public trust and accountability. It creates the body that will be necessary to execute the advanced "Collective Constitutional AI" amendment process in Article VIII.sThis architecture is interconnected.

Article III: Risk Classification & Management Framework

Section 3.1: Risk Tiers

All AI systems shall be classified by the ARMO, and confirmed by the AIRB, into one of four risk tiers:

  • (a) Unacceptable Risk: As defined in Art. I, Sec 1.1. These systems are prohibited.

  • (b) High-Risk: AI systems as defined in Art. 3.2. These systems are subject to the full mandates of this Constitution.

  • (c) Limited-Risk: AI systems that pose a risk of manipulation or deception (e.g., chatbots, deepfakes). These systems are subject to the transparency obligations of Art. 5.3.

  • (d) Minimal-Risk: All other AI systems (e.g., spam filters, AI-enabled video games). These systems are largely unregulated by this Constitution.

Expert Annotation:

This Article directly adopts the risk-based approach of the EU AI Act. This is the emerging global standard. This approach is efficient. It focuses 90% of the governance burden (the "heavy" mandates in Articles IV, VI, and VII) only on the "High-Risk" category, leaving "Minimal-Risk" systems (the vast majority) "largely left unregulated" to innovate freely. This "funnel" is the core operational mechanism of the entire constitution.

Section 3.2: High-Risk Systems (Definition)

An AI system shall be classified as High-Risk if it is intended to be used as a safety component of a product, or if it is used in one of the following domains:

  • (a) Biometric identification and categorization of natural persons;

  • (b) Management and operation of critical infrastructure;

  • (c) Education and vocational training (e.g., scoring exams, admissions);

  • (d) Employment, workers management, and access to self-employment (e.g., "CV-scanning tool that ranks job applicants");

  • (e) Access to essential private and public services and benefits (e.g., credit scoring, loan determinations);

  • (f) Law enforcement, migration, asylum, and border control;

  • (g) Administration of justice and democratic processes.

A High-Risk AI System is subject to the "Human Impact Scorecard" (Art. 3.3), the "Technical Assurance Mandate" (Art. IV), and the "Right to Redress" (Art. VI).
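
As a purely illustrative sketch of how an ARMO tool might encode the tiering logic of Sections 3.1 and 3.2, the snippet below maps a simplified domain list and two flags onto the four tiers. The domain strings, flags, and `classify` function are assumptions for illustration; the authoritative definitions remain the Articles themselves.

```python
from enum import Enum

class RiskTier(Enum):
    UNACCEPTABLE = "prohibited (Art. 1.1)"
    HIGH = "full constitutional mandates"
    LIMITED = "transparency obligations (Art. 5.3)"
    MINIMAL = "largely unregulated"

# Simplified stand-in for the Art. 3.2 domain list.
HIGH_RISK_DOMAINS = {
    "biometric_identification", "critical_infrastructure", "education",
    "employment", "essential_services", "law_enforcement", "justice",
}

def classify(domain: str, prohibited: bool, can_deceive_users: bool) -> RiskTier:
    """ARMO's first-pass classification; the AIRB confirms the result."""
    if prohibited:                   # falls under an Art. 1.1 prohibition
        return RiskTier.UNACCEPTABLE
    if domain in HIGH_RISK_DOMAINS:  # Art. 3.2 domains or safety components
        return RiskTier.HIGH
    if can_deceive_users:            # e.g., chatbots, deepfakes
        return RiskTier.LIMITED
    return RiskTier.MINIMAL

print(classify("employment", prohibited=False, can_deceive_users=False))  # RiskTier.HIGH
```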

Expert Annotation:

This section provides a clear, non-ambiguous definition of "High-Risk," drawing its categories directly from the EU AI Act's Annex III. Section 3.2(d) and (e) are critical, as they cover the most common corporate use-cases that generate significant legal and ethical risk (e.g., hiring and credit). The final sentence is the "switch." It explicitly links this classification to the Constitution's most powerful requirements.

Section 3.3: Human Flourishing Impact Assessment

  • (a) Prior to development, all proposed High-Risk AI Systems must undergo and complete a "Human Flourishing Impact Assessment."

  • (b) This assessment shall be based on the "Capabilities Approach", a framework for assessing impact on human flourishing and well-being.

  • (c) The assessment shall be documented in a "Human Impact Scorecard" that includes both qualitative and quantitative metrics measuring the system's potential impact on the central human capabilities (e.g., Bodily Health, Practical Reason, Affiliation).

  • (d) This Scorecard is a mandatory component of the project's submission to the AIRB for approval.
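
The following is a minimal sketch of what a "Human Impact Scorecard" record could look like, assuming a reduced subset of Nussbaum's central capabilities and a simple -1 to +1 impact scale. All field names and the scale are illustrative assumptions, not mandated by this Section.

```python
from dataclasses import dataclass, field

# A reduced, illustrative subset of Nussbaum's central capabilities.
CAPABILITIES = ("bodily_health", "practical_reason", "affiliation")

@dataclass
class HumanImpactScorecard:
    """Hypothetical scorecard included in the AIRB approval package."""
    system_name: str
    # Quantitative impact per capability on a -1.0 (harm) to +1.0 (benefit) scale.
    quantitative: dict[str, float] = field(default_factory=dict)
    # Qualitative evidence per capability (interviews, pilot studies, etc.).
    qualitative: dict[str, str] = field(default_factory=dict)

    def is_complete(self) -> bool:
        # Every capability must carry both a metric and supporting narrative.
        return all(c in self.quantitative and c in self.qualitative
                   for c in CAPABILITIES)

card = HumanImpactScorecard("triage-assistant")
card.quantitative["bodily_health"] = 0.4
card.qualitative["bodily_health"] = "Pilot study: faster escalation of urgent cases."
print(card.is_complete())  # False until all capabilities are assessed
```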

Expert Annotation:

This is the operational codification of Pillar 1.3, one of the most advanced concepts in this framework. It replaces a vague "ethics" check with a "precise, quantitative assessment of AI's human impacts". It forces developers, at the design phase, to move beyond FATE (Fairness, Accountability, etc.) and to ask a deeper question: "Does this product actually help people flourish, and how can we prove it?". This mandate for a "Human Impact Scorecard" based on Nussbaum's "Capabilities Approach" is a hallmark of a "100-point" constitution.

Article IV: The Technical Assurance Mandate (Defense-in-Depth)

All High-Risk AI Systems shall adhere to the following 4-Layer "Defense-in-Depth" assurance protocol. Failure to document adherence to this Article shall result in a "presumption of fault" (Art. 6.2).

Section 4.1: Layer 1 (Design) - Risk Mapping & Data Governance

All High-Risk systems shall, prior to development:

  • (a) Complete the NIST AI RMF (Map & Measure) functions to identify and document all contextual, algorithmic, data, and operational risks.

  • (b) Implement Data Governance protocols ensuring all training, validation, and testing datasets are "relevant, sufficiently representative and... free of errors and complete" per the EU AI Act standard.
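
A minimal sketch of Layer 1 data-governance checks is shown below, assuming pandas-style tabular data. The completeness and representativeness thresholds are illustrative assumptions, not standards set by this Constitution or the EU AI Act.

```python
import pandas as pd

def data_governance_report(df: pd.DataFrame, protected_attr: str,
                           min_group_share: float = 0.05,
                           max_missing: float = 0.01) -> dict:
    """Illustrative checks against the 'representative, free of errors and
    complete' standard; thresholds are assumptions, not requirements."""
    missing_rate = df.isna().mean().max()  # worst-case column missingness
    group_shares = df[protected_attr].value_counts(normalize=True)
    return {
        "max_missing_rate": float(missing_rate),
        "complete_enough": bool(missing_rate <= max_missing),
        "smallest_group_share": float(group_shares.min()),
        "representative_enough": bool(group_shares.min() >= min_group_share),
    }

df = pd.DataFrame({"age": [25, 40, None, 31], "group": ["A", "A", "B", "B"]})
print(data_governance_report(df, protected_attr="group"))
```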

Section 4.2: Layer 2 (Verification) - Formal Assurance & The "Arbiter of Reason" Architecture

All High-Risk AI Systems shall be designed to address the "silent failure" and "black box" problem:

  • (a) "Arbiter of Reason" Architecture: Systems shall, where technically feasible, be designed to functionally separate the "Engine of Intuition" (the probabilistic, pattern-recognition component) from an "Arbiter of Reason" (a verifiable, rules-based, or logic-based component). The Arbiter shall check, validate, or constrain the Engine's output before it can cause a high-risk action.

  • (b) Formal Verification: For all safety-critical components (e.g., in autonomous systems), Formal Methods shall be used to prove, with mathematical guarantees, that the component satisfies defined safety, fairness, and robustness properties.
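
The "Arbiter of Reason" separation in clause (a) can be pictured as a thin, auditable wrapper around a probabilistic model: the engine proposes, the arbiter checks against verifiable rules before a high-risk action is taken. The sketch below is one possible shape under that assumption; the rule and engine shown are toy stand-ins, not a prescribed implementation.

```python
from typing import Callable

class ArbiterOfReason:
    """Illustrative wrapper: a rules-based arbiter that must clear every
    proposal from the probabilistic 'Engine of Intuition' before it can
    trigger a high-risk action. Rule names and structure are assumptions."""

    def __init__(self, engine: Callable[[dict], dict],
                 rules: list[Callable[[dict, dict], bool]]):
        self.engine = engine  # e.g., an ML model's predict function
        self.rules = rules    # verifiable, human-auditable constraints

    def decide(self, case: dict) -> dict:
        proposal = self.engine(case)
        violated = [r.__name__ for r in self.rules if not r(case, proposal)]
        if violated:
            # Constrain: fall back to a safe default and flag for human review.
            return {"action": "defer_to_human", "violated_rules": violated}
        return proposal

# Toy rule: a loan may never be denied solely on a protected attribute.
def no_decision_on_protected_attribute(case, proposal):
    return not (proposal.get("action") == "deny"
                and proposal.get("basis") == "protected_attribute")

def toy_engine(case):
    return {"action": "approve", "basis": "credit_history"}

arbiter = ArbiterOfReason(toy_engine, [no_decision_on_protected_attribute])
print(arbiter.decide({"applicant_id": 123}))
```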

Section 4.3: Layer 3 (Validation) - Independent Audit & Adversarial Red Teaming

Prior to deployment, all High-Risk systems must be validated by parties other than the development team:

  • (a) Independent Third-Party Auditing: The system shall undergo and pass a comprehensive audit by a qualified, independent third party. The audit shall assess, at minimum, data quality, bias, fairness, efficacy, security, and regulatory compliance.

  • (b) Holistic AI Red Teaming: The system shall be subjected to continuous, holistic AI red teaming. This red teaming must be adversarial, non-technical as well as technical, and designed to discover novel failure modes and societal harms not captured by standard benchmarks.

Section 4.4: Layer 4 (Runtime) - Continuous Monitoring & Runtime Assurance

All deployed High-Risk systems shall be subject to post-deployment assurance:

  • (a) Continuous Monitoring: The system shall be instrumented with real-time dashboards to monitor for performance degradation, data drift, and bias anomalies.

  • (b) Runtime Verification: Where feasible for safety-critical systems, a "Runtime Verification" monitor shall be implemented to create a "provably safe envelope". This monitor shall observe the AI's behavior and have the power to apply a "safety intervention" to block any action that would violate a formally-defined safety property.
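
As a hedged illustration of clause (b), the sketch below shows a runtime monitor that blocks any proposed action outside a formally-defined safe envelope. The speed-limit property and class names are toy assumptions standing in for a real, formally verified specification.

```python
class SafetyIntervention(Exception):
    """Raised when the monitor blocks an action outside the safe envelope."""

class RuntimeMonitor:
    """Illustrative runtime-assurance shim; the safety property here
    (a speed limit for an autonomous platform) is a toy assumption."""

    def __init__(self, max_speed_mps: float):
        self.max_speed_mps = max_speed_mps

    def check(self, proposed_speed_mps: float) -> float:
        if proposed_speed_mps > self.max_speed_mps:
            raise SafetyIntervention(
                f"proposed speed {proposed_speed_mps} m/s exceeds "
                f"verified envelope of {self.max_speed_mps} m/s")
        return proposed_speed_mps

monitor = RuntimeMonitor(max_speed_mps=15.0)
try:
    monitor.check(22.0)          # AI controller proposes an unsafe action
except SafetyIntervention as e:
    print("blocked:", e)         # the action is blocked and logged
```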

Expert Annotation:

This Article is the technical "core" of the Constitution (Pillar 3). It translates the abstract principle of "Safety" into a concrete, 4-layer engineering mandate. It creates a "Defense-in-Depth" strategy, acknowledging that any single layer (e.g., "auditing" alone) is insufficient. Section 4.2 is the most advanced part. The "Arbiter of Reason" architecture and the mandate for Formal Verification are "100-point" requirements. They are the only known methods to provide mathematical guarantees of safety, moving beyond the "hope" of statistical testing. Section 4.4 is also critical. It recognizes that risk does not end at deployment. The mandate for Continuous Monitoring and Runtime Verification ensures the system remains safe as the real world changes around it. The preamble to this Article links it directly to Article VI, creating the legal "teeth." Failure to follow this 4-layer protocol is the definition of "fault" (Art. 6.2).

Article V: Data, Transparency, & Explainability

Section 5.1: AI Model & Data Registry

The ARMO shall maintain a comprehensive AI Model Registry. For all High-Risk systems, this registry shall include, as part of its "technical documentation":

  • (a) The sources, lineage, and "data governance" process for its training, validation, and testing datasets.

  • (b) The "Human Impact Scorecard" (Art. 3.3).

  • (c) The results of all audits and red teaming (Art. 4.3).

  • (d) The results of all formal verification and runtime monitoring (Art. 4.2, 4.4).

  • (e) A clear "Transparency Note" explaining the model's purpose, capabilities, and limitations.

Section 5.2: Technical Explainability (XAI)

For all High-Risk systems, mechanisms for "Explainable AI" (XAI) shall be implemented to a standard appropriate for the context. This includes providing technically meaningful explanations of system outputs to internal operators, auditors, and the AIRB to enable meaningful human oversight and audit.
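
For illustration, the snippet below hand-rolls a crude perturbation-based attribution as a stand-in for established XAI methods such as LIME or SHAP, showing the kind of operator-facing explanation this Section contemplates. The toy model and noise parameters are assumptions.

```python
import numpy as np

def perturbation_attribution(predict, x, n_samples=200, noise=0.1, seed=0):
    """Crude stand-in for XAI methods such as LIME/SHAP: estimate how strongly
    each input feature drives the model's output by perturbing it."""
    rng = np.random.default_rng(seed)
    base = predict(x)
    attributions = np.zeros(len(x))
    for i in range(len(x)):
        perturbed = np.tile(x.astype(float), (n_samples, 1))
        perturbed[:, i] += rng.normal(0.0, noise, size=n_samples)
        deltas = np.array([predict(p) for p in perturbed]) - base
        # Mean absolute output change attributable to feature i.
        attributions[i] = np.mean(np.abs(deltas))
    return attributions

# Toy 'credit score' model: the second feature dominates the decision.
predict = lambda v: 0.2 * v[0] + 2.0 * v[1]
print(perturbation_attribution(predict, np.array([1.0, 1.0])))
```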

Section 5.3: Public Transparency

  • (a) For all Limited-Risk systems (e.g., chatbots, generative AI media), the system must ensure that end-users "are aware that they are interacting with AI".

  • (b) For all High-Risk systems, the "Transparency Note" (Art. 5.1e) shall be made available to users and regulators to explain the system's logic and capabilities in accessible language.

Expert Annotation:

This Article operationalizes the core principle of "Transparency". It draws a crucial distinction:

Section 5.2 (XAI): This is technical explainability, for experts. It refers to the methods (like LIME, SHAP, or formal proofs) that allow an auditor or operator to understand why a model made a specific decision.

Section 5.3 (Transparency): This is public-facing transparency, for users. It is a simple notice ("you are talking to a bot") or a simple, accessible "Transparency Note".

A 100-point framework mandates both. Many frameworks confuse the two, demanding technical XAI for users (which is impossible) or offering simple transparency to auditors (which is insufficient). This Article mandates the right "explanation... to the right person in the right way". Section 5.1 creates the "paper trail." The AI Model Registry is the central, auditable logbook that makes enforcement of this Constitution possible.

Article VI: Rights, Redress, & Liability

Section 6.1: Right to Human Intervention & Contest

  • (a) Any individual subject to a decision by a High-Risk AI System "which produces legal effects... or similarly significantly affects him or her" (Art. 3.2e-g) shall have the right "not to be subject to a decision based solely on automated processing".

  • (b) This right shall include, at minimum:

    • i. The right to be clearly informed that such a decision was made;

    • ii. The right to "obtain human intervention on the part of the controller";

    • iii. The right to "express his or her point of view" to that human intervenor;

    • iv. The right to "contest the decision" and receive a meaningful, reasoned response.

  • (c) This shall be supplemented by the "right to explanation of individual decision-making" for all High-Risk systems.

Expert Annotation:

This is the codification of Pillar 4.2. It grants enforceable rights to individuals. This Article moves beyond corporate platitudes and creates a mechanism for redress. It directly codifies the rights from GDPR Article 22 and expands them by linking to the EU AI Act's "right to explanation". This is a critical, non-negotiable component of any framework that claims to be "human-centric."

Section 6.2: Presumption of Fault & Liability

  • (a) This Constitution rejects the "black box" defense for AI-induced harm.

  • (b) Where a deployed High-Risk AI System causes harm, fault shall be presumed on the part of the organization.

  • (c) This presumption may be rebutted. The burden of proof lies with the ARMO and the development team to demonstrate, through the documentation in the AI Model Registry (Art. 5.1), that they adhered in full to the "Technical Assurance Mandate" (Article IV).

  • (d) Failure to provide such proof shall be conclusive evidence of fault, and the AIRB shall impose a Tier 2 Violation (Art. 6.4).

Expert Annotation:

This is the central enforcement mechanism of the Constitution (Pillar 4.3). It proactively internalizes the "presumption of fault" (or "presumption of causality") from the EU's AI Liability Directive (AILD). This Article is what makes Article IV (Technical Assurance) binding. It creates a powerful, unavoidable incentive for development teams to actually do the formal verification, auditing, and monitoring required. It clarifies the internal chain of liability: "fault" is not "the AI messed up"; "fault" is "the humans failed to follow the assurance process." This is auditable and enforceable.

Section 6.3: Incident Reporting

  • (a) A "no-blame" internal culture for reporting AI incidents, failures, and near-misses shall be fostered.

  • (b) All "serious incidents"—defined as "situations in which AI systems caused, or very nearly caused, real-world harm"—shall be immediately reported to the ARMO and the AIRB.

  • (c) To promote collective ecosystem safety, all confirmed "serious incidents" shall be submitted, anonymized and redacted for security, to the public AI Incident Database (AIID).
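
A minimal sketch of an internal serious-incident record, anonymized before any external sharing, is shown below. The field names and redaction steps are assumptions; the actual AI Incident Database submission format is not specified here.

```python
from dataclasses import dataclass, asdict
import hashlib

@dataclass
class SeriousIncident:
    """Illustrative internal record of a serious incident; field names are assumptions."""
    system_id: str
    description: str      # what happened, or very nearly happened
    real_world_harm: bool
    reporter: str         # internal only; never leaves the organization

    def anonymized(self) -> dict:
        """Redacted view suitable for external sharing (e.g., with the AIID)."""
        record = asdict(self)
        record.pop("reporter")  # protect the no-blame reporting culture
        record["system_id"] = hashlib.sha256(
            self.system_id.encode()).hexdigest()[:12]  # pseudonymize the system
        return record

incident = SeriousIncident("cv-ranker-v2", "Ranked applicants by postcode proxy.",
                           real_world_harm=False, reporter="jane@internal")
print(incident.anonymized())
```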

Expert Annotation:

This is the "public accountability" mechanism (Pillar 4.4). It is modeled directly on the highly successful safety reporting systems in aviation and computer security. The purpose is not to "shame" the organization, but to "learn from experience so we can prevent or mitigate bad outcomes". Mandating this "no-blame" internal culture and "high-accountability" external reporting is a hallmark of a mature, "100-point" safety framework.

Section 6.4: Internal Enforcement

Non-compliance with this Constitution shall be enforced by the AIRB as follows:

  • (a) Tier 1 Violation: Any project in breach of Article I (Prohibitions).
    Consequence: Immediate, mandatory project termination and a full, binding review of the responsible division by the AIRB.

  • (b) Tier 2 Violation: Any High-Risk system found non-compliant with Articles III, IV, or V (e.g., failure of data governance, failure of assurance).
    Consequence: Immediate suspension of the project/deployment and a mandatory re-audit, with binding modifications required by the AIRB.

  • (c) Tier 3 Violation: Providing incorrect or misleading information to the AIRB or auditors.
    Consequence: Formal personnel-level accountability and a binding review.

Expert Annotation:

This Article operationalizes Pillar 4.1. It internalizes the logic and severity of the EU AI Act's penalty structure and makes them binding internal policy. This gives the AIRB (Art. 2.2) its "teeth."

Article VII: Frontier Model & Responsible Scaling

Section 7.1: Responsible Scaling Levels (RSLs)

This Constitution establishes a Responsible Scaling Level (RSL) protocol to govern the development of "frontier" or general-purpose AI models, whose capabilities may evolve and emerge unpredictably. This protocol is based on the "AI Safety Level" (ASL) framework and is designed to be "proportional" and "iterative".

  • (a) RSL-1 (Benign): Models with no meaningful catastrophic risk.

  • (b) RSL-2 (Early Misuse Potential): Models showing early signs of dangerous capabilities (e.g., unreliable CBRN information).

  • (c) RSL-3 (Catastrophic Misuse Risk): Models that substantially increase the risk of catastrophic misuse (e.g., providing useful CBRN instructions) or show low-level autonomous capabilities.

  • (d) RSL-4 (Autonomous Risk): Models with qualitatively higher autonomous capabilities or catastrophic potential.

Section 7.2: The Gating Mechanism

  • (a) All current models must meet, at minimum, the RSL-2 Deployment and Security Standards, including rigorous internal red teaming and security hardening.

  • (b) The training or deployment of a model at a higher RSL (e.g., moving from RSL-2 to RSL-3) is prohibited until the corresponding, stricter RSL-3 safeguards (e.g., external independent auditing, enhanced security, runtime assurance) have been:

    • i. Developed and implemented;

    • ii. Independently audited and validated; and

    • iii. Certified as "met" by the AI Review Board (AIRB).

  • (c) The triggers for these levels shall be based on emergent capabilities (e.g., performance on CBRN or autonomy benchmarks), not on static or lagging metrics like compute.
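
The gating logic of this Section can be sketched as a simple precondition check: scaling to a higher RSL is blocked until every safeguard up to that level is implemented, independently validated, and AIRB-certified. The class and function names below are illustrative assumptions, not part of the protocol's text.

```python
from dataclasses import dataclass
from enum import IntEnum

class RSL(IntEnum):
    BENIGN = 1
    EARLY_MISUSE = 2
    CATASTROPHIC_MISUSE = 3
    AUTONOMOUS = 4

@dataclass
class SafeguardStatus:
    implemented: bool
    independently_validated: bool
    airb_certified: bool

    def met(self) -> bool:
        return (self.implemented and self.independently_validated
                and self.airb_certified)

def may_scale_to(target: RSL, safeguards: dict[RSL, SafeguardStatus]) -> bool:
    """Scaling to `target` requires that safeguards for every level up to and
    including the target are fully in place (Sec. 7.2(b))."""
    return all(level in safeguards and safeguards[level].met()
               for level in RSL if level <= target)

safeguards = {
    RSL.BENIGN: SafeguardStatus(True, True, True),
    RSL.EARLY_MISUSE: SafeguardStatus(True, True, True),
    RSL.CATASTROPHIC_MISUSE: SafeguardStatus(True, False, False),  # audit pending
}
print(may_scale_to(RSL.CATASTROPHIC_MISUSE, safeguards))  # False: the gate stays closed
```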

Expert Annotation:

This Article is the "living document" mechanism forschnical evolution (Pillar 5.2). It is the single most important article for governing future risk. It directly adopts the "Responsible Scaling Policy" (RSP) logic pioneered by Anthropic. Section 7.2 (The Gating Mechanism) is the key. It makes safety a precondition for innovation, not an afterthought. It gates scaling. This "capability-based" (not compute-based) trigger system is "future-proof." It does not matter how the model becomes dangerous; as soon as it can (as proven by an audit), the next level of safeguards is required before scaling can continue. This is the definition of a "100-point" adaptive framework.

Article VIII: Amendment & Ratification

Section 8.1: Formal Amendment Process

This Constitution may be amended through a formal amendment process, ensuring it remains a "living document".

Section 8.2: Collective Constitutional AI

This amendment process shall be informed by a periodic "Collective Constitutional AI" process.

  • (a) At least biannually, the Multi-Stakeholder Forum (Art. 2.3) shall be convened for a structured deliberation on the principles and prohibitions of this Constitution.

  • (b) This process, modeled on "collective intelligence", shall review public and expert input to determine if the Constitution's values remain aligned with evolving societal norms.

  • (c) The findings of this process shall be delivered to the AIRB as a formal "Recommendation for Amendment."

Expert Annotation:

This final Article is the "living document" mechanism for ethical evolution (Pillar 5.3). Section 8.1 provides the standard legal mechanism for change. Section 8.2 is the "100-point" innovation. It adopts the governance concept of "Constitutional AI" and scales it to a "Collective" (public) level. This Article ensures the Constitution's values do not become static. It creates a formal, democratic feedback loop that allows the public and civil society to participate in updating the organization's governing principles. This ensures long-term public trust and ethical legitimacy, making the framework truly "adaptive."

Get Involved

This framework is an open, collaborative effort. We are actively seeking partners from academia, industry, civil society, and policy to help refine, promote, and implement these standards.

For Organizations

Learn how to adopt and implement the Constitution in your AI governance lifecycle.

For Researchers

Collaborate with us on technical assurance, auditing standards, and formal verification methods.

For Policy

Use this framework as a 'gold standard' model for national and international AI regulation.

Contact Us to Partner