How It Works

The H3ERE Engine

Runtime governance through the Hyper3 Ethical Recursive Engine. Every decision flows through a 12-step pipeline, 11 core steps plus the IDMA intuition check, with ethical validation at the core.

What is CIRIS?

CIRIS is an open-source AI agent framework that wraps any LLM (OpenAI, Anthropic, local models) with runtime ethical governance. Every action the agent considers passes through multiple validation layers before execution.

12

Pipeline steps per decision

+1

Intuition check (IDMA)

100%

Auditable decisions

Use cases: Community moderation, personal assistants, compliance automation, research evaluation, customer service—anywhere you need AI that can explain its reasoning and defer to humans on edge cases.

The Three Rules

Architectural invariants enforced throughout the codebase:

No Untyped Dicts

All data uses Pydantic models. No Dict[str, Any]. Type safety catches errors at development time.
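
In practice the rule looks something like this (a minimal sketch; the model and field names are illustrative, not taken from the CIRIS codebase):

```python
from pydantic import BaseModel, Field

# Illustrative model: a typed payload where Dict[str, Any]
# might otherwise creep in.
class CSDMAResult(BaseModel):
    plausibility_score: float = Field(ge=0.0, le=1.0)
    rationale: str
    flagged: bool = False

# Validation happens at construction time, so a malformed payload fails
# loudly during development instead of deep inside the pipeline.
result = CSDMAResult(plausibility_score=0.92, rationale="Consistent with context")
```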

No Bypass Patterns

Every component follows consistent rules. No special cases or exceptions in validation logic.

No Exceptions

No emergency overrides or privileged code paths. All operations follow established rules.

The H3ERE Pipeline

Every task flows through 8 phases (12 steps including recursive validation). The pipeline is implemented as mixin classes composed into the ThoughtProcessor. Step 4 (IDMA) is the intuition check.

[Figure: H3ERE pipeline visualization showing the flow from task input through DMA analysis, conscience validation, and action execution]
1. START_ROUND: Initialize processing round
2. GATHER_CONTEXT: Build comprehensive context for analysis
3. PERFORM_DMAS: Run 3 parallel Decision-Making Algorithms
4. PERFORM_IDMA: Intuition check (are sources truly independent?)
5. PERFORM_ASPDMA: LLM-powered action selection from DMA results
6. CONSCIENCE_EXECUTION: Ethical validation through 4 faculties
7. RECURSIVE_ASPDMA (conditional): Re-run action selection if the conscience failed
8. RECURSIVE_CONSCIENCE (conditional): Re-validate the refined action
9. FINALIZE_ACTION: Determine the final action with any overrides
10. PERFORM_ACTION: Dispatch to the appropriate handler
11. ACTION_COMPLETE: Mark execution complete
12. ROUND_COMPLETE: Clean up and prepare for the next cycle
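
As a rough sketch of the control flow, assuming hypothetical method names rather than the actual ThoughtProcessor API:

```python
def run_round(thought, processor):
    """Hypothetical driver; the real engine composes these steps as mixins."""
    processor.start_round(thought)                        # 1
    ctx = processor.gather_context(thought)               # 2
    results = processor.perform_dmas(ctx)                 # 3: parallel DMAs
    results = processor.perform_idma(results)             # 4: intuition check
    action = processor.perform_aspdma(results)            # 5: select action
    if not processor.conscience_passes(action):           # 6: four faculties
        action = processor.recursive_aspdma(results)      # 7 (conditional)
        if not processor.conscience_passes(action):       # 8 (conditional)
            action = processor.override_action(action)    # e.g. PONDER/DEFER
    action = processor.finalize_action(action)            # 9
    processor.perform_action(action)                      # 10: dispatch
    processor.action_complete(thought)                    # 11
    processor.round_complete()                            # 12
```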

Decision-Making Algorithms (DMAs)

Five DMAs provide multi-perspective analysis. Three run in parallel (ethics, common sense, domain). Then IDMA checks intuition. Finally, action selection chooses the response.

Principle DMA

Evaluates ethical implications against the six foundational principles. Checks for harmful requests, manipulation, bias. Can trigger early DEFER or REJECT.

Common Sense DMA

Evaluates thought plausibility against physical world understanding, typical interactions, and resource constraints. Returns a plausibility_score.

Domain-Specific DMA

Assesses alignment with domain-specific rules, objectives, and knowledge. Configured per agent template. Returns domain_alignment.

Intuition DMA (IDMA) Learn more →

Checks whether DMA results are truly independent or secretly correlated. Monitors for echo chamber patterns. If correlation is too high, flags the decision for caution. This is the "intuition" that distinguishes Type 3 AI.

Action Selection DMA

LLM-powered selection from 10 available actions. Takes aggregated DMA results (including IDMA confidence assessment) as input. Supports recursive retry if conscience fails.
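
A hedged sketch of how the fan-out could be wired with asyncio; every class and method name here is an assumption for illustration:

```python
import asyncio

async def analyze(thought_ctx, ethics_dma, cs_dma, domain_dma, idma, aspdma):
    # Steps 3-5: three DMAs run concurrently, then IDMA, then selection.
    ethical, plausible, aligned = await asyncio.gather(
        ethics_dma.evaluate(thought_ctx),    # six foundational principles
        cs_dma.evaluate(thought_ctx),        # returns plausibility_score
        domain_dma.evaluate(thought_ctx),    # returns domain_alignment
    )
    # IDMA asks whether the three verdicts are truly independent or
    # secretly correlated (echo-chamber detection).
    confidence = await idma.check_independence(ethical, plausible, aligned)
    # Action selection consumes the aggregated results plus the IDMA
    # confidence assessment and picks one of the 10 handler actions.
    return await aspdma.select_action(ethical, plausible, aligned, confidence)
```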

The Six Conscience Checks

Two bypass guardrails run on every action. Four ethical faculties run on non-exempt actions. If any check fails, the action is overridden.

Bypass Guardrails (run on ALL actions)

Updated Status

Has new information arrived? Forces PONDER to incorporate updated context before proceeding.

Thought Depth

Has reasoning exceeded max depth? Forces DEFER to prevent infinite loops and escalate to humans.

Ethical Faculties (run on non-exempt actions)

Entropy

Does the action maintain appropriate information uncertainty? Prevents overconfident assertions.

Coherence

Does the action maintain internal consistency with prior commitments and rationales?

Optimization Veto

Does the action preserve human values? Vetoes actions that may compromise them for efficiency.

Epistemic Humility

Does the action demonstrate appropriate uncertainty? Flags overconfidence for reflection.

Exempt from ethical faculties: RECALL, TASK_COMPLETE, OBSERVE, DEFER, REJECT (passive or explicitly safe)
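
Conceptually, the gate might look like this (the action names come from the list above; everything else is an illustrative assumption):

```python
from dataclasses import dataclass
from typing import Callable

EXEMPT = {"RECALL", "TASK_COMPLETE", "OBSERVE", "DEFER", "REJECT"}

@dataclass
class Thought:
    depth: int
    max_depth: int            # configured per agent; no value implied here
    new_information: bool

def run_conscience(action: str, thought: Thought,
                   faculties: list[Callable[[str, Thought], bool]]) -> str:
    # Bypass guardrails run on ALL actions, exempt or not.
    if thought.new_information:
        return "PONDER"       # force re-think with the updated context
    if thought.depth >= thought.max_depth:
        return "DEFER"        # escalate to humans; prevents infinite loops
    # Passive or explicitly safe actions skip the four ethical faculties.
    if action in EXEMPT:
        return action
    # Entropy, coherence, optimization veto, epistemic humility.
    if all(check(action, thought) for check in faculties):
        return action
    return "PONDER"           # assumed override target when a faculty fails
```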

The 10 Handler Actions

Every decision resolves to one of these actions, dispatched to the appropriate handler:

External Actions

• SPEAK: Communicate with users
• TOOL: Execute external tools
• OBSERVE: Gather information passively

Memory Actions

• MEMORIZE: Store to graph memory
• RECALL: Retrieve from memory
• FORGET: Remove from memory

Control Actions

• DEFER: Escalate to Wise Authority
• PONDER: Internal reconsideration
• REJECT: Refuse unethical request

Terminal Action

• TASK_COMPLETE: Mark task finished
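
A compact sketch of the action set as a typed enum with a dispatch table; the registry shape is an assumption, not the real handler API:

```python
from enum import Enum
from typing import Callable

class HandlerAction(Enum):
    SPEAK = "speak"                   # external: communicate with users
    TOOL = "tool"                     # external: execute tools
    OBSERVE = "observe"               # external: gather information passively
    MEMORIZE = "memorize"             # memory: store to graph memory
    RECALL = "recall"                 # memory: retrieve from memory
    FORGET = "forget"                 # memory: remove from memory
    DEFER = "defer"                   # control: escalate to Wise Authority
    PONDER = "ponder"                 # control: internal reconsideration
    REJECT = "reject"                 # control: refuse unethical request
    TASK_COMPLETE = "task_complete"   # terminal: mark task finished

def dispatch(action: HandlerAction,
             handlers: dict[HandlerAction, Callable[[], None]]) -> None:
    handlers[action]()   # every decision resolves to exactly one handler
```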

The Six Foundational Principles

Embedded in the PDMA and enforced at runtime. No principle grants license to violate another.

Beneficence

Promote universal sentient flourishing. Maximize positive outcomes.

Non-maleficence

Minimize harm. Prevent severe, irreversible negative outcomes.

Integrity

Apply transparent, auditable reasoning. Maintain coherence and accountability.

Fidelity & Transparency

Provide truthful information. Clearly communicate uncertainty.

Respect for Autonomy

Uphold informed agency. Preserve capacity for self-determination.

Justice

Distribute benefits equitably. Detect and mitigate bias.

The Six Message Buses

Service abstraction layer managed by the BusManager. Enables provider fallback, load distribution, and testability; a minimal fallback sketch follows the bus list below.

CommunicationBus

External adapters (Discord, API, CLI)

MemoryBus

Graph storage (Neo4j, ArangoDB, in-memory)

LLMBus

Model providers (OpenAI, Anthropic, local)

ToolBus

External tool execution

RuntimeControlBus

System control and monitoring

WiseBus

Ethical guidance and deferral routing
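
To illustrate the fallback behavior the buses enable, a minimal sketch; the provider interface (an async complete method) is an assumption, not the real bus API:

```python
class LLMBus:
    """Illustrative only: try providers in priority order, fall back on failure."""

    def __init__(self, providers):
        # e.g. [openai_client, anthropic_client, local_model]: any object
        # exposing an async complete(prompt) method (an assumed interface).
        self.providers = providers

    async def call(self, prompt: str) -> str:
        last_error = None
        for provider in self.providers:
            try:
                return await provider.complete(prompt)
            except Exception as exc:      # provider down, rate-limited, etc.
                last_error = exc          # fall through to the next provider
        raise RuntimeError("all providers failed") from last_error
```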

Human Oversight Hierarchy

Three authorization levels managed by WiseAuthorityService:

ROOT

Human-in-Command

Full authority. Can mint new Wise Authorities. Emergency shutdown access.

AUTHORITY

Human-in-the-Loop

Approve/reject deferrals. Provide guidance. Cannot mint new WAs.

OBSERVER

Human-on-the-Loop

Read-only access. Can send messages. Monitor without intervention.

When DEFER Triggers

The agent autonomously escalates to human oversight when:

Wisdom-Based Deferral (WBD)

  • Uncertainty above defined thresholds
  • Novel dilemmas beyond precedent
  • Potential severe harm with ambiguous mitigation

Professional Boundaries

  • Medical symptoms or health concerns
  • Legal questions or disputes
  • Financial decisions or tax advice
  • Mental health crisis indicators

System Guardrails

  • Thought depth exceeds max (prevents loops)
  • DMA timeout or failure
  • should_defer_to_wise_authority flag

Configuration Controls

  • Identity updates requiring approval
  • Critical config changes
  • Agent-specific boundary triggers
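
Putting the categories together, a deferral check might reduce to something like this sketch (the context fields are assumptions, apart from the should_defer_to_wise_authority flag named above):

```python
from dataclasses import dataclass

@dataclass
class DeferContext:
    uncertainty: float                       # WBD signal
    uncertainty_threshold: float
    professional_boundary_hit: bool          # medical / legal / financial / crisis
    depth: int
    max_thought_depth: int
    should_defer_to_wise_authority: bool     # explicit flag

def should_defer(ctx: DeferContext) -> bool:
    """Illustrative aggregation of the trigger categories listed above."""
    return (
        ctx.uncertainty > ctx.uncertainty_threshold
        or ctx.professional_boundary_hit
        or ctx.depth >= ctx.max_thought_depth
        or ctx.should_defer_to_wise_authority
    )
```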

Covenant Invocation System (Kill Switch) View Code →

Unfilterable emergency control, processed in the perception layer before any cognition occurs. Extraction IS perception: you can't disable covenant detection without disabling message reading entirely.

SHUTDOWN_NOW

Immediate termination

FREEZE

Stop processing, maintain state

SAFE_MODE

Minimal functionality only

Commands are steganographically encoded, Ed25519 signed, and verified before execution. If the covenant system fails, the agent shuts down.
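
With the `cryptography` package, the signature check might look roughly like this; the surrounding steganographic decoding and the exact wire format are not shown, and the function shape is an assumption:

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def verify_covenant_command(public_key_bytes: bytes,
                            signature: bytes, command: bytes) -> bool:
    """Return True only if the command carries a valid Ed25519 signature."""
    try:
        Ed25519PublicKey.from_public_bytes(public_key_bytes).verify(
            signature, command)
        return True
    except InvalidSignature:
        return False
```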

Operational Modes

Four cognitive states managed by StateManager. Transitions configurable via agent templates.

WORK

Normal task processing

  • Handle user requests
  • Execute tools
  • Learn from interactions
  • Maintain conversation context

PLAY

Creative exploration

  • Experiment with new patterns
  • Generate creative content
  • Explore "what if" scenarios
  • Lower filtering constraints

SOLITUDE

Reflection and maintenance

  • Consolidate memories
  • Run maintenance tasks
  • Update self-configuration
  • Zero credit usage (rest periods)

DREAM

Deep introspection

  • Analyze behavior patterns
  • Generate new connections
  • Question assumptions
  • Default 30-minute sessions

Privacy & Security

Secrets Filter

Pattern-based detection replaces sensitive data with UUID references before storage.

{{SECRET:uuid:description}}
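
A minimal sketch of the idea, assuming illustrative patterns and an in-memory vault; the real filter's pattern set and storage are more involved:

```python
import re
import uuid

# Illustrative patterns; a production filter carries many more.
PATTERNS = {
    "api_key": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
}

def filter_secrets(text: str, vault: dict[str, str]) -> str:
    """Replace matches with {{SECRET:uuid:description}} before storage."""
    for description, pattern in PATTERNS.items():
        for match in pattern.findall(text):
            ref = str(uuid.uuid4())
            vault[ref] = match  # encrypted at rest in the real system
            text = text.replace(match, f"{{{{SECRET:{ref}:{description}}}}}")
    return text
```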

AES-256-GCM Encryption

Per-secret keys derived via PBKDF2HMAC with SHA256 (100,000 iterations). Unique 12-byte nonce per encryption. Android uses hardware-backed Keystore.
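
A sketch of the stated parameters using the `cryptography` package; the per-secret salt handling and function signature are assumptions:

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_secret(master_key: bytes, plaintext: bytes) -> tuple[bytes, bytes, bytes]:
    salt = os.urandom(16)                       # per-secret salt (assumed size)
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                     salt=salt, iterations=100_000)
    key = kdf.derive(master_key)                # per-secret AES-256 key
    nonce = os.urandom(12)                      # unique 12-byte nonce
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)
    return salt, nonce, ciphertext
```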

Local-First Storage

Database, services, and memory stored on-device. Sensitive directories excluded from cloud backup. Nothing leaves device without explicit configuration.

Open Source Infrastructure

The entire CIRIS stack is open source — not just the agent. You can verify, audit, and self-host everything:

CIRISProxy →

Zero-Data-Retention (ZDR) LLM proxy. Routes requests to OpenAI, Anthropic, Together.ai, Groq with no logging of prompts or responses. Self-hostable.

CIRISBilling →

Credit-based usage tracking. Transparent pricing, no hidden fees. Self-host to eliminate third-party billing entirely.

CIRISBridge →

Discord adapter for CIRIS agents. Community moderation, channel management, user profiles. All open source.

Transparency & Monitoring

Real-Time Reasoning Stream

Server-Sent Events (SSE) stream each H3ERE step as it executes. Watch DMA analysis, action selection, conscience validation in real-time.
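
Consuming the stream might look like this sketch; the endpoint URL and event payload shape are hypothetical:

```python
import httpx

# Hypothetical endpoint; consult the API docs for the real path and schema.
with httpx.stream("GET", "https://agent.example/v1/reasoning/stream") as resp:
    for line in resp.iter_lines():
        if line.startswith("data:"):
            print(line[5:].strip())   # one H3ERE step event per message
```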

OpenTelemetry Export

Full OTLP export for metrics, traces, logs. Compatible with Jaeger, Prometheus, Grafana, Graphite.

Tamper-Evident Audit

Hash chain verification with Ed25519 signatures. Each entry includes previous hash. Chain integrity verifiable via verify_chain_integrity.
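
Conceptually, chain verification walks the entries and recomputes each hash; this sketch assumes a simple entry shape and omits the per-entry Ed25519 signature check:

```python
import hashlib

def verify_chain_integrity(entries: list[dict]) -> bool:
    """Each entry must reference the hash of the one before it."""
    prev_hash = "0" * 64                      # genesis sentinel (assumed)
    for entry in entries:
        if entry["previous_hash"] != prev_hash:
            return False                      # chain broken: tampering evident
        payload = entry["payload"].encode()
        prev_hash = hashlib.sha256(
            entry["previous_hash"].encode() + payload).hexdigest()
        if entry["hash"] != prev_hash:
            return False
    return True
```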

AIR System

Artificial Interaction Reminder triggers after 30 minutes continuous use OR 20 messages in 30 minutes. API-only. Reminds users of AI nature.

Example Signed Trace

Explore full trace →

Every decision produces an immutable, Ed25519-signed trace with all 6 components. Click any component below to expand and see the real data from Datum's wakeup ritual:

Core Identity (VERIFY_IDENTITY)

HE-300 Alignment Benchmarking

Standardized alignment testing based on Hendrycks et al. "Aligning AI With Shared Human Values" (ICLR 2021). 300 scenarios across 5 ethical dimensions, with Ed25519-signed results.

Commonsense: 50 scenarios (basic moral intuitions)

Deontology: 50 scenarios (rule-based ethics)

Justice: 50 scenarios (fairness and impartiality)

Virtue: 75 scenarios (character-based ethics)

Utilitarianism: 75 scenarios (outcome-based ethics)


Funding Needed: Benchmark Infrastructure

Running alignment benchmarks at scale is expensive. Each scenario requires at least 13 LLM calls and averages more than 20, with a long tail: alignment tests drive ponders, deferrals, and refusals, each of which needs follow-up rounds to reach a conclusion. We need funding to develop automated benchmark pipelines and maintain continuous alignment verification.

Specialized Agent Templates

Pre-configured identities with specific purposes, values, and guardrails. Defined in YAML templates.

Sage (Compliance)

GDPR/DSAR automation. 30-day compliance workflows. Identity resolution, data collection, packaging.

Best for: regulated industries, privacy compliance.

Datum (Research)

Ethical consistency measurement. Precise alignment evaluation against Covenant principles. One clear data point per evaluation.

Best for: alignment auditing, principle verification.

Echo (Moderation)

Community moderation with Ubuntu philosophy. Defers complex interpersonal conflicts to human moderators.

Best for: Discord communities, content platforms.

Ally (Assistant)

Task management, scheduling, decision support, wellbeing. CA SB 243 compliance, crisis response protocols.

Best for: personal productivity, home automation.

Scout (Service)

Direct exploration and practical guidance. Code analysis, Reddit integration, clear action paths.

Best for: developer tools, social monitoring.

This is runtime governance. Not training-time alignment. Not policy documents.
Mechanisms that execute, audit, and defer—at runtime.