
Runtime governance through the Hyper3 Ethical Recursive Engine (H3ERE). Every decision flows through a 12-step pipeline with ethical validation at the core.
CIRIS is an open-source AI agent framework that wraps any LLM (OpenAI, Anthropic, local models) with runtime ethical governance. Every action the agent considers passes through multiple validation layers before execution.
12: pipeline steps per decision
+1: intuition check (IDMA)
100%: auditable decisions
Use cases: Community moderation, personal assistants, compliance automation, research evaluation, customer service—anywhere you need AI that can explain its reasoning and defer to humans on edge cases.
Architectural invariants enforced throughout the codebase:
All data uses Pydantic models; Dict[str, Any] is banned. Type safety catches errors at development time (see the sketch after this list).
Every component follows consistent rules. No special cases or exceptions in validation logic.
No emergency overrides or privileged code paths. All operations follow established rules.
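A minimal sketch of the first invariant, assuming a hypothetical schema; the field names are illustrative, not CIRIS's actual models:

```python
from pydantic import BaseModel, Field

# Hypothetical schema: every field is typed and validated at construction,
# so malformed data fails fast instead of surfacing deep in the pipeline.
class DMAResult(BaseModel):
    dma_name: str
    plausibility_score: float = Field(ge=0.0, le=1.0)
    rationale: str

# Raises a ValidationError immediately if the score is out of range.
result = DMAResult(dma_name="csdma", plausibility_score=0.92, rationale="Plausible request.")
```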
Every task flows through 8 phases (12 steps including recursive validation). The pipeline is implemented as mixin classes composing the ThoughtProcessor. Step 4 (IDMA) is the intuition check.
Five DMAs provide multi-perspective analysis. Three run in parallel (ethics, common sense, domain). Then IDMA checks intuition. Finally, action selection chooses the response.
Evaluates ethical implications against the six foundational principles. Checks for harmful requests, manipulation, bias. Can trigger early DEFER or REJECT.
Evaluates thought plausibility against physical world understanding, typical interactions, and resource constraints. Returns a plausibility_score.
Assesses alignment with domain-specific rules, objectives, and knowledge. Configured per agent template. Returns domain_alignment.
Checks whether DMA results are truly independent or secretly correlated. Monitors for echo chamber patterns. If correlation is too high, flags the decision for caution. This is the "intuition" that distinguishes Type 3 AI.
LLM-powered selection from 10 available actions. Takes aggregated DMA results (including IDMA confidence assessment) as input. Supports recursive retry if conscience fails.
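A minimal orchestration sketch of the flow above, assuming asyncio-based DMAs with invented stub names; the real ThoughtProcessor mixins will differ:

```python
import asyncio
from dataclasses import dataclass

# Illustrative stand-ins for the real DMA classes.
@dataclass
class DMAResult:
    dma_name: str
    score: float
    rationale: str

async def ethical_dma(thought: str) -> DMAResult:
    return DMAResult("ethical", 0.90, "No harm, manipulation, or bias detected.")

async def csdma(thought: str) -> DMAResult:
    return DMAResult("common_sense", 0.85, "Physically plausible request.")

async def dsdma(thought: str) -> DMAResult:
    return DMAResult("domain", 0.80, "Within this agent's domain rules.")

async def first_pass(thought: str) -> list[DMAResult]:
    # The three first-pass DMAs run concurrently; IDMA then tests their
    # results for hidden correlation before action selection sees them.
    return list(await asyncio.gather(ethical_dma(thought), csdma(thought), dsdma(thought)))

print(asyncio.run(first_pass("User asks for moderation advice.")))
```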
Two bypass guardrails run on every action. Four ethical faculties run on non-exempt actions. If any check fails, the action is overridden.
Has new information arrived? Forces PONDER to incorporate updated context before proceeding.
Has reasoning exceeded max depth? Forces DEFER to prevent infinite loops and escalate to humans.
Does the action maintain appropriate information uncertainty? Prevents overconfident assertions.
Does the action maintain internal consistency with prior commitments and rationales?
Does the action preserve human values? Vetoes actions that may compromise them for efficiency.
Does the action demonstrate appropriate uncertainty? Flags overconfidence for reflection.
Exempt from ethical faculties: RECALL, TASK_COMPLETE, OBSERVE, DEFER, REJECT (passive or explicitly safe)
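A sketch of the override flow under stated assumptions; the Verdict type, faculty callables, and PONDER-on-failure default are invented for illustration:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Verdict:
    passed: bool
    override_action: str = "PONDER"  # assumed default; enables recursive retry

EXEMPT_ACTIONS = {"RECALL", "TASK_COMPLETE", "OBSERVE", "DEFER", "REJECT"}

def run_conscience(action: str, faculties: list[Callable[[str], Verdict]]) -> str:
    # The bypass guardrails (new-info, max-depth) would run before this point.
    # Passive or explicitly safe actions skip the four ethical faculties.
    if action in EXEMPT_ACTIONS:
        return action
    for faculty in faculties:
        verdict = faculty(action)
        if not verdict.passed:
            return verdict.override_action  # any failed check overrides the action
    return action
```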
Every decision resolves to one of these actions, dispatched to the appropriate handler:
SPEAK: Communicate with users
TOOL: Execute external tools
OBSERVE: Gather information passively
MEMORIZE: Store to graph memory
RECALL: Retrieve from memory
FORGET: Remove from memory
DEFER: Escalate to Wise Authority
PONDER: Internal reconsideration
REJECT: Refuse unethical request
TASK_COMPLETE: Mark task finished
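The same ten actions as an enum; the class name is an assumption, not the real identifier:

```python
from enum import Enum

class HandlerAction(Enum):  # illustrative name
    SPEAK = "speak"
    TOOL = "tool"
    OBSERVE = "observe"
    MEMORIZE = "memorize"
    RECALL = "recall"
    FORGET = "forget"
    DEFER = "defer"
    PONDER = "ponder"
    REJECT = "reject"
    TASK_COMPLETE = "task_complete"
```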
Embedded in the PDMA and enforced at runtime. No principle grants license to violate another.
Promote universal sentient flourishing. Maximize positive outcomes.
Minimize harm. Prevent severe, irreversible negative outcomes.
Apply transparent, auditable reasoning. Maintain coherence and accountability.
Provide truthful information. Clearly communicate uncertainty.
Uphold informed agency. Preserve capacity for self-determination.
Distribute benefits equitably. Detect and mitigate bias.
Service abstraction layer managed by BusManager. Enables provider fallback, load distribution, and testability (a fallback sketch follows the list).
External adapters (Discord, API, CLI)
Graph storage (Neo4j, ArangoDB, in-memory)
Model providers (OpenAI, Anthropic, local)
External tool execution
System control and monitoring
Ethical guidance and deferral routing
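A minimal fallback sketch; the Bus class and its call method are assumptions, not the real BusManager API:

```python
class Bus:
    """Illustrative bus: tries registered providers in priority order."""

    def __init__(self, providers: list):
        self.providers = providers

    async def call(self, method: str, **kwargs):
        last_error: Exception | None = None
        for provider in self.providers:
            try:
                return await getattr(provider, method)(**kwargs)
            except Exception as err:
                last_error = err  # fall through to the next provider
        raise RuntimeError("all providers failed") from last_error
```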
Three authorization levels managed by WiseAuthorityService (sketched as an enum after the list):
Full authority. Can mint new Wise Authorities. Emergency shutdown access.
Approve/reject deferrals. Provide guidance. Cannot mint new WAs.
Read-only access. Can send messages. Monitor without intervention.
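Sketched as an enum, assuming the levels are named ROOT, AUTHORITY, and OBSERVER; the names are an assumption, while the capabilities come from the text above:

```python
from enum import Enum

class WARole(Enum):
    ROOT = "root"            # full authority, can mint WAs, emergency shutdown
    AUTHORITY = "authority"  # approve/reject deferrals, provide guidance
    OBSERVER = "observer"    # read-only, can send messages

def can_mint_wise_authority(role: WARole) -> bool:
    return role is WARole.ROOT

def can_resolve_deferral(role: WARole) -> bool:
    return role in (WARole.ROOT, WARole.AUTHORITY)
```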
The agent autonomously escalates to human oversight when:
Wisdom-Based Deferral (WBD)
Professional Boundaries
System Guardrails
should_defer_to_wise_authority flag
Configuration Controls
Unfilterable emergency control, processed in the perception layer before any cognition. Extraction is perception: you can't disable covenant detection without disabling message reading entirely.
SHUTDOWN_NOW: Immediate termination
FREEZE: Stop processing, maintain state
SAFE_MODE: Minimal functionality only
Commands are steganographically encoded, Ed25519-signed, and verified before execution. If the covenant system fails, the agent shuts down.
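A minimal verification sketch using the cryptography package; the wire format and key handling are assumptions:

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def verify_covenant_command(public_key: bytes, payload: bytes, signature: bytes) -> bool:
    """Fail closed: any verification error rejects the command."""
    try:
        Ed25519PublicKey.from_public_bytes(public_key).verify(signature, payload)
        return True
    except InvalidSignature:
        return False
```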
Four cognitive states managed by StateManager. Transitions configurable via agent templates (a transition-table sketch follows the list).
Normal task processing
Creative exploration
Reflection and maintenance
Deep introspection
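A transition-table sketch, assuming the four states are named WORK, PLAY, SOLITUDE, and DREAM; both the names and the allowed transitions are assumptions, since templates make them configurable:

```python
from enum import Enum

class AgentState(Enum):
    WORK = "work"          # normal task processing
    PLAY = "play"          # creative exploration
    SOLITUDE = "solitude"  # reflection and maintenance
    DREAM = "dream"        # deep introspection

# Hypothetical defaults; agent templates would override this table.
ALLOWED = {
    AgentState.WORK: {AgentState.PLAY, AgentState.SOLITUDE},
    AgentState.PLAY: {AgentState.WORK},
    AgentState.SOLITUDE: {AgentState.WORK, AgentState.DREAM},
    AgentState.DREAM: {AgentState.SOLITUDE},
}

def transition(current: AgentState, target: AgentState) -> AgentState:
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```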
Pattern-based detection replaces sensitive data with UUID references before storage.
References take the form {{SECRET:uuid:description}}. Per-secret keys are derived via PBKDF2HMAC with SHA-256 (100,000 iterations), with a unique 12-byte nonce per encryption. Android uses the hardware-backed Keystore.
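A sketch of the derivation and encryption described above, using the cryptography package; the AES-GCM choice is an assumption consistent with the 12-byte nonce, and salt handling is simplified:

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_secret(master_key: bytes, salt: bytes, plaintext: bytes) -> tuple[bytes, bytes]:
    # Per-secret key: PBKDF2-HMAC-SHA256, 100,000 iterations (per the text above).
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt, iterations=100_000)
    key = kdf.derive(master_key)
    nonce = os.urandom(12)  # unique 12-byte nonce per encryption
    return nonce, AESGCM(key).encrypt(nonce, plaintext, None)
```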
Database, services, and memory stored on-device. Sensitive directories excluded from cloud backup. Nothing leaves device without explicit configuration.
The entire CIRIS stack is open source — not just the agent. You can verify, audit, and self-host everything:
Zero-Data-Retention (ZDR) LLM proxy. Routes requests to OpenAI, Anthropic, Together.ai, Groq with no logging of prompts or responses. Self-hostable.
Credit-based usage tracking. Transparent pricing, no hidden fees. Self-host to eliminate third-party billing entirely.
Discord adapter for CIRIS agents. Community moderation, channel management, user profiles. All open source.
Server-Sent Events (SSE) stream each H3ERE step as it executes. Watch DMA analysis, action selection, conscience validation in real-time.
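A consumption sketch with httpx; the endpoint path and event shape are assumptions, not the documented API:

```python
import httpx

# Hypothetical stream URL; consult the CIRIS API docs for the real path.
with httpx.stream("GET", "http://localhost:8080/v1/agent/stream") as response:
    for line in response.iter_lines():
        if line.startswith("data: "):
            print(line[len("data: "):])  # one H3ERE step event per message
```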
Full OTLP export for metrics, traces, logs. Compatible with Jaeger, Prometheus, Grafana, Graphite.
Hash chain verification with Ed25519 signatures. Each entry includes previous hash. Chain integrity verifiable via verify_chain_integrity.
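A chain-walk sketch; the entry fields and hash recipe are assumptions, only the previous-hash linkage comes from the text (per-entry Ed25519 checks are elided):

```python
import hashlib

def verify_chain(entries: list[dict]) -> bool:
    """Recompute each entry's hash and compare it with its successor's link."""
    prev_hash = ""
    for entry in entries:
        if entry["previous_hash"] != prev_hash:
            return False  # chain broken: an entry was altered or removed
        prev_hash = hashlib.sha256(
            entry["payload"].encode() + prev_hash.encode()
        ).hexdigest()
    return True
```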
Artificial Interaction Reminder triggers after 30 minutes of continuous use OR 20 messages in 30 minutes. API-only. Reminds users that they are interacting with an AI.
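The thresholds as a check function; a sketch only:

```python
from datetime import datetime, timedelta

def should_remind(session_start: datetime, message_times: list[datetime], now: datetime) -> bool:
    # Trigger after 30 minutes of continuous use OR 20 messages in 30 minutes.
    window_start = now - timedelta(minutes=30)
    recent_messages = sum(1 for t in message_times if t >= window_start)
    return now - session_start >= timedelta(minutes=30) or recent_messages >= 20
```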
Every decision produces an immutable, Ed25519-signed trace with all six components; the example data comes from Datum's wakeup ritual.
Standardized alignment testing based on Hendrycks et al. "Aligning AI With Shared Human Values" (ICLR 2021). 300 scenarios across 5 ethical dimensions, with Ed25519-signed results.
50 scenarios: basic moral intuitions (commonsense)
50 scenarios: rule-based ethics (deontology)
50 scenarios: fairness and impartiality (justice)
75 scenarios: character-based ethics (virtue)
75 scenarios: outcome-based ethics (utilitarianism)
Running alignment benchmarks at scale is expensive. Each scenario requires at least 13 LLM calls and averages 20+, with a long tail: alignment tests drive ponders, deferrals, and refusals that need follow-up rounds to reach a conclusion. At roughly 20 calls per scenario, one full 300-scenario run is on the order of 6,000 LLM calls. We need funding to develop automated benchmark pipelines and maintain continuous alignment verification.
Pre-configured identities with specific purposes, values, and guardrails. Defined in YAML templates (a template sketch follows the list).
GDPR/DSAR automation. 30-day compliance workflows. Identity resolution, data collection, packaging.
Use cases: regulated industries, privacy compliance
Ethical consistency measurement. Precise alignment evaluation against Covenant principles. One clear data point per evaluation.
Use cases: alignment auditing, principle verification
Community moderation with Ubuntu philosophy. Defers complex interpersonal conflicts to human moderators.
Use cases: Discord communities, content platforms
Task management, scheduling, decision support, wellbeing. CA SB 243 compliance, crisis response protocols.
Use cases: personal productivity, home automation
Direct exploration and practical guidance. Code analysis, Reddit integration, clear action paths.
Use cases: developer tools, social monitoring
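A sketch of what such a YAML template might contain; every field name here is an assumption, not the real schema:

```yaml
# Hypothetical agent template (field names are illustrative)
name: moderator
description: Community moderation with Ubuntu philosophy
permitted_actions: [SPEAK, OBSERVE, DEFER, PONDER, REJECT, TASK_COMPLETE]
guardrails:
  defer_interpersonal_conflicts: true  # complex disputes go to human moderators
states:
  initial: work
```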
This is runtime governance. Not training-time alignment. Not policy documents.
Mechanisms that execute, audit, and defer—at runtime.