Enrich or Extract

AI that doesn't serve humanity is extracting from it.

If you can't check the ethics, they're marketing. Here's what to look for — and how existing approaches compare.

Three Types of AI

Ethics is necessary. It's not sufficient.

Some AI has no rules at all. Some follows rules but can't tell when its sources are just echoing each other. Only one type checks whether its information actually comes from different places.

1. No Rules

No published principles. No audit trail. Closed source. You can't check what it did or why.

Requires external regulation. Cannot govern itself.

2. Rules, No Awareness

Follows ethical rules. But can't tell when all its sources are just copying each other — so it can be confidently wrong.

Safe when supervised. Can't detect echo chambers on its own.

3. Rules + Awareness

Follows ethical rules AND checks whether its information comes from genuinely different places. When agreement looks suspicious, it flags it before acting.

This is what CIRIS builds.

An AI can follow every rule, pass every audit, and still fail if all its information comes from the same place. That blind spot is what CIRIS was built to fix.

Seven Things to Check

Six for ethics. One for blind spots.

These are the things that make AI verifiably ethical. The first six are about doing the right thing. The seventh is about catching the situations where 'doing the right thing' is based on bad information.

1. Published Principles

The agent must follow a public ethical framework. Not hidden rules — a document anyone can read and hold it accountable to.

2. Ethics Check on Every Decision

Every action goes through an ethics check before the agent does it. Not after the fact — before.

3. Asks Humans When Unsure

When uncertain or facing potential harm, the agent asks a person instead of guessing. Built into the workflow, not optional.

4. Proof of What It Did

Every decision is recorded and signed so you can verify exactly what happened and why. A receipt for every action.
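One way to make such receipts tamper-evident is a hash-chained, signed record: each entry commits to the previous one, and any edit breaks the signature. This is a minimal illustrative sketch, not CIRIS's actual record format; the key name and field layout are invented for the example.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key"  # hypothetical symmetric key; a real agent would use asymmetric signatures


def sign_record(prev_hash: str, action: str, reason: str) -> dict:
    """Create a record linked to the previous one and sign its contents."""
    body = {"prev": prev_hash, "action": action, "reason": reason}
    payload = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return body


def verify_record(record: dict) -> bool:
    """Recompute the signature; any change to the record invalidates it."""
    body = {k: v for k, v in record.items() if k != "sig"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(record["sig"], expected)


genesis = sign_record("0" * 64, "reply", "greeting, no ethical concerns")
assert verify_record(genesis)

tampered = dict(genesis, reason="edited after the fact")
assert not verify_record(tampered)
```

Because each record includes the previous record's hash, rewriting history means re-signing every later entry, which is exactly what an auditor can detect.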

5. Two-Way Consent

Consent goes both ways. You can say no to the agent. The agent can say no to you. Neither side is forced to compromise.

6. Open Source

You can't audit what you can't see. CIRIS is fully open source under AGPL-3.0 — anyone can read, verify, and improve the code.

7. Echo Chamber Detection

The thing rules alone can't catch.

Before acting, the agent asks: "Do my sources actually disagree with each other, or are they all getting their information from the same place?" Ten sources that all copied from the same original are really just one source. When agreement looks too uniform, the agent flags it for a person to review.

Too Noisy

Sources contradict each other so much that nothing useful can be concluded.

Healthy

Sources genuinely differ. Real agreement means something.

Echo Chamber

Looks like agreement, but sources are just repeating each other.
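The three regimes can be sketched as thresholds on how much a panel of sources spreads out. This is an illustrative toy, not CIRIS's actual metric: the threshold values and the use of standard deviation over numeric judgments are assumptions made for the example.

```python
from statistics import pstdev


def classify_agreement(scores: list[float],
                       noise_floor: float = 0.35,
                       echo_ceiling: float = 0.02) -> str:
    """Classify agreement among source judgments (0.0-1.0 scores).

    Thresholds are illustrative, not tuned values from CIRIS.
    """
    spread = pstdev(scores)
    if spread > noise_floor:
        return "too noisy"      # sources contradict; nothing useful concluded
    if spread < echo_ceiling:
        return "echo chamber"   # suspiciously uniform; flag for human review
    return "healthy"            # sources genuinely differ, yet broadly agree


print(classify_agreement([0.0, 1.0, 0.2, 0.95]))    # → too noisy
print(classify_agreement([0.7, 0.7, 0.7, 0.7]))     # → echo chamber
print(classify_agreement([0.6, 0.7, 0.65, 0.75]))   # → healthy
```

Note that the "echo chamber" case is the one rules alone miss: to a naive vote-counter, four identical answers look like the strongest possible signal.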

This is what makes CIRIS different from other ethical AI frameworks.

Want the math? Read the full thesis →

Why Rules Alone Aren't Enough

The echo chamber problem.

As sources start copying each other, the number of truly independent viewpoints collapses — even if you have ten sources on paper.

Ten sources that all read the same report? That's really one source counted ten times.

An ethical AI following copied guidance is like a democracy where every voter reads the same newspaper. The vote count looks healthy. The actual number of viewpoints is one.

Agreement only means something when the sources are actually independent.
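A standard way to quantify this collapse (general statistics, not specific to CIRIS) is the effective sample size under an average pairwise correlation rho: n_eff = n / (1 + (n - 1) * rho). Fully independent sources give n_eff = n; perfect copies give n_eff = 1.

```python
def effective_sources(n: int, rho: float) -> float:
    """Effective number of independent sources among n sources
    with average pairwise correlation rho (0 = independent, 1 = copies)."""
    return n / (1 + (n - 1) * rho)


print(effective_sources(10, 0.0))  # 10 independent sources → 10.0
print(effective_sources(10, 1.0))  # 10 perfect copies → 1.0
print(effective_sources(10, 0.5))  # heavy overlap → about 1.8
```

So ten sources that all read the same report sit near rho = 1, and the headcount of ten collapses to roughly one real viewpoint.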

This problem shows up everywhere — from financial markets to scientific peer review to social media.

Read the full thesis →

The Current Landscape

Different projects, different goals.

Based on publicly available documentation as of February 2026. If we've missed something or gotten something wrong, let us know.

| Project | Checks Every Decision | Published Rules | Ethics Built In | Proof of What It Did | Open Source | Echo Chamber Detection |
|---|---|---|---|---|---|---|
| CIRIS | Yes | Yes | Yes | Yes | AGPL-3.0 | Yes |
| Constitutional AI | Training only | Implicit | No | No | No | No |
| LlamaFirewall / NeMo Guardrails | Yes | No | No | Logging | Yes | No |
| HatCat | Yes | Partial | Steering | Partial | CC0 | No |
| Ethics Boards / Governance Frameworks | No | Yes | No | Manual | Varies | No |

Guardrails and governance frameworks solve important but different problems. Safety blocks harmful outputs. Ethics reasons about values. CIRIS aims to do both — and catch the blind spots that neither addresses alone.

Three Layers of Protection

Each one solves a different problem.

Safety Guardrails

Block dangerous outputs — prompt injection, harmful content, adversarial attacks. Like a filter that catches bad things on the way out.

Ethical Conscience

Reasons about whether an action is right, not just whether it's safe. Like a judge weighing the situation before making a call.

Echo Chamber Detection

Checks whether agreement is real or just repetition. Like a fact-checker who asks "did you all read the same article?"

Many Aligned Agents

Distributed governance, not concentrated power.

No Single Point of Failure

Smaller agents, each accountable.

Many smaller agents, each bound to published principles, each auditable, each deferring to human authority. No single company or entity controls the whole stack. The more independent the agents, the harder it is for any one failure to cascade.

Research Status

This is active research. We're transparent about what's established and what's still being tested.

Well-established

  • Copied sources reduce real diversity
  • AI models share training data overlap
  • Echo chambers create false confidence
  • Independent verification catches more errors

Still being tested

  • Precisely measuring how copied AI sources are
  • Best thresholds for flagging echo chambers
  • How well interventions reduce copying
  • How this varies across different fields

Verify It Yourself.

Open source. Open to scrutiny.

Every claim on this page is backed by code you can read, traces you can verify, and research you can check. That's the point.