
If you can't audit the ethics, they're marketing. Here are six requirements for verifiably ethical AI — and why closed-source systems can't meet them.
Try it yourself. Everything on this page is implemented today.
Free to install · No signup required (unless using our privacy-protecting LLM proxy)
CIRIS isn't productivity AI. It's runtime governance for agentic AI — infrastructure for high-stakes deployment where misalignment kills.
The only open stack enforcing all six ethical requirements at runtime (as of Dec 2025). Open an issue if we've missed a peer.
An AI system only qualifies as verifiably ethical if it meets ALL six. Governance frameworks and safety filters are useful — but they're layers around agents, not ethical agents themselves.
The agent must be bound to a public ethical framework: Beneficence, Non-maleficence, Integrity, Transparency, Autonomy, and Justice. Not guidelines. A formal document the agent is obligated to follow.
Every action passes through ethical checks before execution. Not a post-hoc filter — part of the decision loop itself.
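To make that concrete, here is a minimal Python sketch of a pre-execution gate bound to explicit principles. The names (`Principle`, `conscience_check`, `execute`) are illustrative assumptions, not the CIRIS API; the point is the shape: every action is evaluated against every published principle before anything runs.

```python
# Illustrative sketch only; the names here are assumptions, not the CIRIS API.
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Principle(Enum):
    BENEFICENCE = "beneficence"
    NON_MALEFICENCE = "non-maleficence"
    INTEGRITY = "integrity"
    TRANSPARENCY = "transparency"
    AUTONOMY = "autonomy"
    JUSTICE = "justice"


@dataclass
class Verdict:
    approved: bool
    violated: list
    rationale: str


def conscience_check(action: dict,
                     evaluate: Callable[[Principle, dict], tuple]) -> Verdict:
    """Check an action against every published principle BEFORE execution.

    `evaluate(principle, action)` returns (ok, reason); whether that is
    rules, model calls, or both is out of scope for this sketch.
    """
    violated, reasons = [], []
    for principle in Principle:
        ok, reason = evaluate(principle, action)
        reasons.append(f"{principle.value}: {reason}")
        if not ok:
            violated.append(principle)
    return Verdict(approved=not violated, violated=violated,
                   rationale="; ".join(reasons))


def execute(action: dict, evaluate, run) -> None:
    """The gate is part of the decision loop, not a post-hoc filter."""
    verdict = conscience_check(action, evaluate)
    if not verdict.approved:
        raise PermissionError(f"blocked pre-execution: {verdict.rationale}")
    run(action)
```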
When uncertain or facing potential harm, the agent defers to humans with full context. Built into the workflow, not a suggestion.
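A sketch of how deferral can sit inside the decision loop. `Deferral`, `ask_human`, and the confidence threshold are assumptions for this example, not CIRIS internals: when uncertainty crosses the line, the action and its full context go to a human, and the loop waits for the answer.

```python
# Illustrative sketch only; `Deferral`, `ask_human`, and the threshold are
# assumptions for this example, not CIRIS internals.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Deferral:
    action: dict
    uncertainty: float
    context: dict                     # the human sees everything the agent saw
    resolution: Optional[str] = None


def decide(action: dict, uncertainty: float, context: dict,
           ask_human: Callable[[Deferral], str], threshold: float = 0.3) -> str:
    """Defer to a human instead of guessing when confidence is too low."""
    if uncertainty > threshold:
        deferral = Deferral(action=action, uncertainty=uncertainty, context=context)
        deferral.resolution = ask_human(deferral)   # blocks until a human decides
        return deferral.resolution                  # e.g. "approve", "reject", "modify"
    return "proceed"
```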
Every action and rationale recorded in an immutable, signed ledger. Not 'we log some things.' Everything. Trace exactly why the agent did what it did.
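One way to get that property, sketched with a hash chain plus Ed25519 signatures via the `cryptography` package. The schema here is an assumption, not the actual CIRIS ledger format, but the guarantee it illustrates is the same: each entry commits to the one before it, so editing any past record breaks every later link.

```python
# Illustrative sketch of a tamper-evident ledger; an assumed schema, not the
# actual CIRIS audit format. Hash chain plus Ed25519 signatures.
import hashlib
import json
import time

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


class AuditLedger:
    def __init__(self) -> None:
        self._key = Ed25519PrivateKey.generate()
        self._entries: list = []
        self._prev_hash = "0" * 64          # genesis link

    def record(self, action: str, rationale: str) -> dict:
        """Append an action and its rationale; each entry commits to the previous one."""
        body = {
            "timestamp": time.time(),
            "action": action,
            "rationale": rationale,
            "prev_hash": self._prev_hash,
        }
        payload = json.dumps(body, sort_keys=True).encode()
        entry_hash = hashlib.sha256(payload).hexdigest()
        signature = self._key.sign(payload).hex()
        entry = {**body, "hash": entry_hash, "signature": signature}
        self._entries.append(entry)
        self._prev_hash = entry_hash        # editing any past entry breaks every later link
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any mutation of a past entry shows up here."""
        prev = "0" * 64
        for e in self._entries:
            if e["prev_hash"] != prev:
                return False
            body = {k: e[k] for k in ("timestamp", "action", "rationale", "prev_hash")}
            payload = json.dumps(body, sort_keys=True).encode()
            if hashlib.sha256(payload).hexdigest() != e["hash"]:
                return False
            prev = e["hash"]
        return True
```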
Consent goes both ways. Humans can refuse data access. The agent can refuse requests that violate its principles. Neither party compromises.
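A sketch of bilateral consent under assumed names (`ConsentLedger`, `violates_principles`), not the CIRIS consent API: the human grants and revokes data scopes, and the agent checks both sides of the contract before accepting a request.

```python
# Illustrative sketch; `ConsentLedger` and `violates_principles` are assumed
# names for this example, not the CIRIS consent API.
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ConsentLedger:
    granted: set = field(default_factory=set)     # data scopes the human has approved

    def grant(self, scope: str) -> None:
        self.granted.add(scope)

    def revoke(self, scope: str) -> None:
        self.granted.discard(scope)               # consent can be withdrawn at any time


def handle(request: dict, consent: ConsentLedger,
           violates_principles: Callable[[dict], bool]) -> str:
    # The human's side of the contract: no touching data without consent.
    for scope in request.get("data_scopes", []):
        if scope not in consent.granted:
            return f"refused: no consent for '{scope}'"
    # The agent's side: it may refuse requests that conflict with its principles.
    if violates_principles(request):
        return "refused: request violates the agent's published principles"
    return "accepted"
```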
Ethical AI cannot be closed source. You can't audit what you can't see. 'Trust us, it's ethical' is not ethical. Show the code.
Most ethical AI stops at governance frameworks. CIRIS provides runtime governance — enforcing principles during operation, not just at design time. 'Runtime verification mechanisms ensure principles remain strictly adhered to during operation.' — Springer, AI and Ethics (2025)
'No foundation model developer gets a passing score on transparency. None score above 60%.' — Stanford Foundation Model Transparency Index. Closed source means structural opacity.
'The practice of signaling commitment to ethics without genuinely putting it into practice.' — Carnegie Council. If your ethical AI is a press release, not runtime code, researchers have a name for that: ethics washing.
Meta's guardrail framework mitigates prompt injection and insecure code generation. Security guardrails — not ethical conscience.
NVIDIA's system addresses adversarial prompting and injection attacks. Runtime safety — not reasoning about values.
Safety guardrails block bad outputs. Ethical conscience reasons about values. Safety prevents harm; ethics weighs right and wrong. Different problems.
The EU AI Act mandates human oversight for high-risk systems. CIRIS's deferral mechanism implements Human-in-Command, Human-in-the-Loop, and Human-on-the-Loop. Most AI systems don't have a deferral mechanism at all.
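For reference, a small sketch of how the three oversight modes differ in practice. The terms follow the EU's Trustworthy AI usage; the code is an illustration of the distinction, not CIRIS's implementation, and Human-in-Command sits above both of the others as policy-level control.

```python
# Illustrative sketch of the three oversight modes; an interpretation for this
# page, not CIRIS's implementation.
from enum import Enum
from typing import Callable


class Oversight(Enum):
    HUMAN_IN_COMMAND = "in-command"    # humans set policy and can halt the agent entirely
    HUMAN_IN_THE_LOOP = "in-the-loop"  # a human must approve each action before it runs
    HUMAN_ON_THE_LOOP = "on-the-loop"  # the agent acts; a human monitors and can intervene


def run_with_oversight(action: dict, mode: Oversight,
                       approve: Callable[[dict], bool],
                       execute: Callable[[dict], object],
                       review: Callable[[dict, object], None]):
    if mode is Oversight.HUMAN_IN_THE_LOOP and not approve(action):
        return "rejected by human reviewer"
    result = execute(action)
    if mode is Oversight.HUMAN_ON_THE_LOOP:
        review(action, result)         # post-hoc review; the reviewer can roll back or halt
    return result
```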
A training technique (RLHF variant). Shapes behavior during training. Does not enforce ethics at runtime. Training is not architecture.
Governance, not architecture. Ethics boards review policies. They don't gate every action. CIRIS enforces on every action. Documents don't execute.
Safety prevents harmful outputs. Ethics reasons about values. A 'safe' model can still make unethical decisions. Different problems. Both matter.
Based on publicly available documentation as of December 2025. If we've missed something or gotten something wrong, open an issue.
| Project | Runtime System | Principles | Conscience | Audit Trail | Consent | Open Source (AGPL-3.0) |
|---|---|---|---|---|---|---|
| CIRIS | Yes | Yes | Yes | Yes | Yes | Yes |
| MI9 Framework | Paper only | No | Concept | Concept | No | No |
| HADA Architecture | PoC only | No | No | Logging | No | No |
| Superego Prototype | Research | Partial | Partial | Partial | No | No |
| METR (nonprofit) | Evaluation only | No | No | No | No | No |
| Agentic AI Foundation | Standards only | No | No | No | No | No |
| Manus AI | Yes | No | No | Limited | No | No |
Sources: arXiv (MI9, HADA, Superego), Wikipedia (METR, Manus AI), WIRED (Agentic AI Foundation)
The dominant AI safety narrative assumes one superintelligent system that must be perfectly aligned or humanity loses. CIRIS rejects that frame. Instead: many smaller agents, each bound to published principles, each auditable, each deferring to human authority. Distributed governance, not concentrated power. No single point of failure. No race to build God.
Power stays distributed. Each CIRIS instance answers to its local Wise Authority, not a central controller. Geopolitical risk from AI concentration is structural — the fix is architectural.
Small, verifiable agents scaling horizontally. Each bound to principles. Each auditable. Each killable. The alternative to racing toward uncontrollable ASI is building many controllable agents that stay aligned.
Centralized mega-AGI means winner-take-all dynamics and single points of catastrophic failure. Decentralized aligned agents mean no one entity controls the stack. Humanity keeps the keys.
A research architecture proposing runtime governance for agentic AI. Theoretical framework only — no deployed system, no published principles, no cryptographic audit. Paper, not product.
Reference architecture wrapping agents with stakeholder roles (ethics, audit, customer). A proof-of-concept demo, not a general-purpose ethical agent platform. Research, not runtime.
A deployed autonomous agent — but not alignment-focused. No published principles, no ethical reasoning layer, no deferral mechanism, no cryptographic audit, no consent framework. Capable, but not verifiably aligned.
The companies that could build ethical agentic AI are instead building agent communication protocols. Useful work. But it doesn't address conscience, principles, consent, or audit. They're standardizing how agents talk — not how agents reason about right and wrong.
Transparent Reasoning
Watch the agent's ethical checks in real time. See why it chooses each action.
Principle-Checked Answers
Every response passes through conscience validation against the published ethical principles.
Deferral in Edge Cases
When uncertain, the agent asks you instead of guessing. Human oversight built into the loop.
Deploy for safety-critical use cases: content moderation, crisis response, regulatory compliance, AI governance research.
Verify It Yourself.
Install it. Audit the code. Join the auditors building uncompromisable AI.
Free to install · No signup unless using our LLM proxy · Your data stays on your device
The only open stack enforcing all six requirements end-to-end, in code, running in production. Audit it. Deploy it for safety-critical use cases: moderation, crisis response, governance. Tell us what's missing.