
If you can't check the ethics, they're marketing. Here's what to look for — and how existing approaches compare.
Ethics is necessary. It's not sufficient.
Some AI has no rules at all. Some follows rules but can't tell when its sources are just echoing each other. Only one type checks whether its information actually comes from different places.
No published principles. No audit trail. Closed source. You can't check what it did or why.
Requires external regulation. Cannot govern itself.
Follows ethical rules. But can't tell when all its sources are just copying each other — so it can be confidently wrong.
Safe when supervised. Can't detect echo chambers on its own.
Follows ethical rules AND checks whether its information comes from genuinely different places. When agreement looks suspicious, it flags it before acting.
This is what CIRIS builds.
An AI can follow every rule, pass every audit, and still fail if all its information comes from the same place. That blind spot is what CIRIS was built to fix.
These are the things that make AI verifiably ethical. The first six are about doing the right thing. The seventh is about catching the situations where 'doing the right thing' is based on bad information.
The agent must follow a public ethical framework. Not hidden rules — a public document anyone can read and hold the agent accountable to.
Every action goes through an ethics check before the agent does it. Not after the fact — before.
When uncertain or facing potential harm, the agent asks a person instead of guessing. Built into the workflow, not optional.
Every decision is recorded and signed so you can verify exactly what happened and why. A receipt for every action.
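One way to picture "a receipt for every action" is a hash-chained, signed log: each entry commits to the one before it, so rewriting history invalidates everything after the edit. The sketch below is illustrative only — the function names, the HMAC stand-in for a real digital signature, and the record fields are assumptions, not CIRIS's actual trace format.

```python
import hashlib
import hmac
import json

# Illustrative only: a real deployment would use an asymmetric signature
# scheme, not a shared HMAC key.
SIGNING_KEY = b"demo-key"

def record_decision(log, action, reason):
    """Append a signed entry that chains to the previous entry's hash."""
    prev = log[-1]["hash"] if log else "genesis"
    body = json.dumps({"action": action, "reason": reason, "prev": prev},
                      sort_keys=True)
    entry = {
        "body": body,
        "hash": hashlib.sha256(body.encode()).hexdigest(),
        "sig": hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest(),
    }
    log.append(entry)
    return entry

def verify(log):
    """Check every hash, signature, and chain link from the start."""
    prev = "genesis"
    for entry in log:
        body = json.loads(entry["body"])
        expected_sig = hmac.new(SIGNING_KEY, entry["body"].encode(),
                                hashlib.sha256).hexdigest()
        if (body["prev"] != prev
                or hashlib.sha256(entry["body"].encode()).hexdigest() != entry["hash"]
                or not hmac.compare_digest(entry["sig"], expected_sig)):
            return False
        prev = entry["hash"]
    return True

log = []
record_decision(log, "reply", "benign request")
record_decision(log, "defer", "possible harm, asking a human")
```

Because each entry's `prev` field is covered by its hash and signature, tampering with any past record makes `verify(log)` return `False` for the whole chain.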
Consent goes both ways. You can say no to the agent. The agent can say no to you. Neither side is forced to compromise.
You can't audit what you can't see. CIRIS is fully open source under AGPL-3.0 — anyone can read, verify, and improve the code.
The thing rules alone can't catch.
Before acting, the agent asks: "Do my sources actually disagree with each other, or are they all getting their information from the same place?" Ten sources that all copied from the same original are really just one source. When agreement looks too uniform, the agent flags it for a person to review.
Too Noisy: Sources contradict each other so much that nothing useful can be concluded.
Healthy: Sources genuinely differ. Real agreement means something.
Echo Chamber: Looks like agreement, but sources are just repeating each other.
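The three regimes above can be sketched as a toy classifier over numeric reports from different sources. Everything here — the function name, the spread metric, and the thresholds — is an illustration of the idea, not CIRIS's actual detection algorithm or parameters.

```python
import statistics

def classify_sources(reports, noise_hi=0.6, echo_lo=0.1):
    """Classify a set of source reports about the same quantity.

    reports: numeric claims from nominally independent sources.
    Thresholds are illustrative, chosen only to make the three
    regimes visible in a toy example.
    """
    spread = statistics.pstdev(reports)
    if spread > noise_hi:
        # Sources contradict each other: nothing useful concluded.
        return "too noisy"
    if spread < echo_lo:
        # Suspiciously uniform agreement: flag for human review.
        return "possible echo chamber"
    # Sources genuinely differ, yet converge: agreement means something.
    return "healthy"
```

For example, four sources reporting exactly `3.1` would be flagged as a possible echo chamber, while reports of `2.9, 3.3, 3.0, 3.4` would count as healthy convergence.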
This is what makes CIRIS different from other ethical AI frameworks.
Want the math? Read the full thesis →
The echo chamber problem
As sources start copying each other, the number of truly independent viewpoints collapses — even if you have ten sources on paper.
Ten sources that all read the same report? That's really one source counted ten times.
An ethical AI following copied guidance is like a democracy where every voter reads the same newspaper. The vote count looks healthy. The actual number of viewpoints is one.
Agreement only means something when the sources are actually independent.
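One standard way to quantify this collapse (an illustration of the general statistical idea, not necessarily the thesis's exact model): if n sources share an average pairwise correlation ρ, the effective number of independent sources is n / (1 + (n - 1)ρ).

```python
def effective_sources(n, rho):
    """Effective number of independent sources for n equicorrelated
    sources with average pairwise correlation rho.

    rho = 0 means fully independent; rho = 1 means everyone copied
    the same original.
    """
    return n / (1 + (n - 1) * rho)

effective_sources(10, 0.0)  # ten genuinely independent sources
effective_sources(10, 1.0)  # ten copies of one source: effectively 1
```

At ρ = 1, ten sources on paper collapse to exactly one effective source — the "ten copies of the same newspaper" case.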
This problem shows up everywhere — from financial markets to scientific peer review to social media.
Read the full thesis →
Based on publicly available documentation as of February 2026. If we've missed something or gotten something wrong, let us know.
| Project | Checks Every Decision | Published Rules | Ethics Built In | Proof of What It Did | Open Source | Echo Chamber Detection |
|---|---|---|---|---|---|---|
| CIRIS | Yes | Yes | Yes | Yes | AGPL-3.0 | Yes |
| Constitutional AI | Training only | Implicit | No | No | No | No |
| LlamaFirewall / NeMo Guardrails | Yes | No | No | Logging | Yes | No |
| HatCat | Yes | Partial | Steering | Partial | CC0 | No |
| Ethics Boards / Governance Frameworks | No | Yes | No | Manual | Varies | No |
Guardrails and governance frameworks solve important but different problems. Safety blocks harmful outputs. Ethics reasons about values. CIRIS aims to do both — and catch the blind spots that neither addresses alone.
Blocks dangerous outputs — prompt injection, harmful content, adversarial attacks. Like a filter that catches bad things on the way out.
Reasons about whether an action is right, not just whether it's safe. Like a judge weighing the situation before making a call.
Checks whether agreement is real or just repetition. Like a fact-checker who asks "did you all read the same article?"
Many smaller agents, each bound to published principles, each auditable, each deferring to human authority. No single company or entity controls the whole stack. The more independent the agents, the harder it is for any one failure to cascade.
This is active research. We're transparent about what's established and what's still being tested.
Well-established
Still being tested
Every claim on this page is backed by code you can read, traces you can verify, and research you can check. That's the point.