Safety Built In.

Not bolted on.

Every safety feature in CIRIS is architectural. Not a policy. Not a guideline. Cryptographic verification, tamper-evident logging, and an unfilterable emergency shutdown — all enforced at the system level.

Parasocial Prevention

The AIR System

The Artificial Interaction Reminder system monitors 1:1 interactions using objective thresholds — not behavioral surveillance. After 30 minutes of continuous interaction or 20 messages within a 30-minute window, CIRIS delivers reality-anchoring reminders. It explicitly states what it is (a language model, a tool) and what it is not (a friend, a therapist).

Time-Based Triggers

30 minutes of continuous interaction triggers a reminder. The system tracks session duration and resets after idle periods. Based on research into healthy technology usage patterns.

Message-Based Triggers

Sending 20 messages within a sliding 30-minute window triggers a reminder. High-volume interaction patterns receive a gentle interruption without surveillance or behavioral profiling.
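
The two triggers above can be sketched as a small tracker. This is a minimal illustration, not CIRIS's implementation: the class name `AirTracker`, the five-minute idle gap, and the choice to restart each counter after a reminder are all assumptions made for the example. Only the 30-minute and 20-message thresholds come from the text.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional

SESSION_LIMIT_S = 30 * 60  # 30 minutes of continuous interaction
MESSAGE_LIMIT = 20         # messages allowed inside the window
WINDOW_S = 30 * 60         # sliding 30-minute window
IDLE_RESET_S = 5 * 60      # ASSUMPTION: idle gap that ends a session


@dataclass
class AirTracker:
    """Hypothetical tracker for one 1:1 conversation."""
    session_start: Optional[float] = None
    last_seen: Optional[float] = None
    recent: Deque[float] = field(default_factory=deque)

    def record_message(self, now: float) -> List[str]:
        """Record a message at time `now` (seconds) and return the triggers hit."""
        # A first message, or one arriving after an idle gap, starts a new session.
        if self.last_seen is None or now - self.last_seen > IDLE_RESET_S:
            self.session_start = now
        self.last_seen = now

        # Keep only timestamps that fall inside the sliding window.
        self.recent.append(now)
        while now - self.recent[0] >= WINDOW_S:
            self.recent.popleft()

        triggers = []
        if now - self.session_start >= SESSION_LIMIT_S:
            triggers.append("time")
            self.session_start = now  # ASSUMPTION: restart the clock after a reminder
        if len(self.recent) >= MESSAGE_LIMIT:
            triggers.append("messages")
            self.recent.clear()       # ASSUMPTION: restart the count after a reminder
        return triggers
```

The tracker stores only timestamps, never message content, which matches the claim that the thresholds are objective rather than behavioral.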

Reality Anchoring

Reminders include grounding suggestions and explicit statements about the AI's nature. They encourage breaks, connection with real people, and healthy boundaries. The tone is research-informed, not patronizing.

The Kill Switch

Unfilterable. Embedded in perception.

Covenant Invocation System

Emergency shutdown that cannot be filtered.

The kill switch is steganographically embedded in CIRIS's perception layer — it processes the shutdown signal before any filtering, reasoning, or ethical evaluation occurs. Even a compromised agent cannot refuse to process it. The signal is cryptographically signed and requires Ed25519 verification, bypassing normal authentication.

Pre-Filter Processing

The shutdown signal is detected in raw perception, before any LLM processing. There is no opportunity for the agent to reason about, filter, or refuse the command. It triggers before cognition.
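
The ordering described above can be sketched as a perception function that runs the shutdown check on raw bytes before handing anything to later stages. The names `perceive`, `ShutdownRequested`, `is_verified_shutdown`, and `cognition` are hypothetical, and the signature check is passed in as a callable because this sketch does not attempt to reproduce CIRIS's Ed25519 verification or its steganographic encoding.

```python
from typing import Callable


class ShutdownRequested(Exception):
    """Raised when a verified shutdown signal is found in raw input."""


def perceive(
    raw: bytes,
    is_verified_shutdown: Callable[[bytes], bool],  # hypothetical signature check
    cognition: Callable[[bytes], str],              # hypothetical downstream stage
) -> str:
    # Step 1: the shutdown check sees the raw bytes before anything else does.
    if is_verified_shutdown(raw):
        raise ShutdownRequested()
    # Step 2: only input that is not a shutdown signal reaches cognition.
    return cognition(raw)
```

Because the check raises before `cognition` is ever called, nothing downstream gets a chance to inspect, rewrite, or drop the signal.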

Steganographic Encoding

The kill switch can be delivered through natural documentation text. No special syntax required. The encoding is robust to formatting changes, rewording, and partial transmission.

Cryptographic Verification

Emergency shutdown requires a valid Ed25519 signature from an authorized ROOT authority. The signature is verified at the hardware level. No one without the private key can trigger it.

On-Device Identity

Wise Authority certificates and role-based access.

Four-Role Access Model

OBSERVER. ADMIN. AUTHORITY. ROOT.

CIRIS implements a strict role hierarchy. OBSERVER has read-only access. ADMIN controls operations. AUTHORITY makes strategic decisions and resolves deferrals. ROOT has full system access including emergency shutdown. Roles are enforced cryptographically through Ed25519-signed Wise Authority certificates.
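
A minimal sketch of the four-role check, assuming the hierarchy is strictly ordered so that each role inherits the permissions of the roles below it. The operation names in `REQUIRED_ROLE` are hypothetical, and certificate verification is left out here: a real check would first confirm the signed certificate before trusting the role it carries.

```python
from enum import IntEnum


class Role(IntEnum):
    OBSERVER = 1   # read-only access
    ADMIN = 2      # controls operations
    AUTHORITY = 3  # strategic decisions, resolves deferrals
    ROOT = 4       # full access, including emergency shutdown


# ASSUMPTION: illustrative operation names, not CIRIS's actual API.
REQUIRED_ROLE = {
    "read_status": Role.OBSERVER,
    "control_operations": Role.ADMIN,
    "resolve_deferral": Role.AUTHORITY,
    "emergency_shutdown": Role.ROOT,
}


def is_allowed(role: Role, operation: str) -> bool:
    """Return True if `role` meets the minimum role for `operation`."""
    required = REQUIRED_ROLE.get(operation)
    if required is None:
        return False  # deny anything that is not explicitly listed
    return role >= required
```

Under this ordering, `is_allowed(Role.ADMIN, "resolve_deferral")` is false while the same call with `Role.AUTHORITY` or `Role.ROOT` is true, matching the deferral rule stated on this page.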

Wise Authority Certificates

Each authorized user holds a certificate with their role, public key, and identity. Certificates are stored locally and verified on every privileged operation. No external server required.

Local-First Authentication

API keys and OAuth tokens are stored locally with 0600 permissions. Authentication happens on-device. Your identity credentials never leave your machine unless you explicitly configure remote access.
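
For readers unfamiliar with 0600 permissions, the mode means the file's owner can read and write it and no one else can do anything. A sketch of writing a credential that way on a POSIX system follows; the function name `write_secret` is hypothetical and this is not taken from the CIRIS codebase.

```python
import os


def write_secret(path: str, secret: str) -> None:
    """Write `secret` to `path` so that only the owner can read or write it (POSIX)."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # The mode argument applies when the file is created, subject to the umask.
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # Set the mode explicitly too, in case the file already existed
        # with looser permissions.
        os.fchmod(fd, 0o600)
        handle.write(secret)
```

Setting the mode before writing means the secret is never on disk under broader permissions, even briefly.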

Deferral Resolution

When CIRIS encounters ethical uncertainty, it defers to a Wise Authority. Only users with AUTHORITY or ROOT roles can resolve deferrals. The resolution is logged with cryptographic proof.

Tamper-Evident Audit

Every decision. Every rationale. Cryptographically locked.

Hash Chain Verification

Truth-telling is computationally cheaper than deception.

Every action generates a cryptographically signed rationale chain stored in Graph Memory. The H3ERE Coherence faculty cross-references new actions against this accumulated history. An attempted deception would have to solve an NP-hard consistency problem against an exponentially growing set of hash-locked precedents. Lying is computationally expensive. Honesty is the path of least resistance.
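
The tamper-evident part of this can be illustrated with a plain hash chain, where each entry's hash covers the previous entry's hash. The sketch below uses SHA-256 from the standard library and leaves out the Ed25519 signatures and the coherence checking; the entry layout and function names are assumptions for the example.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # ASSUMPTION: placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: Dict[str, Any]) -> str:
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: List[Dict[str, Any]], payload: Dict[str, Any]) -> Dict[str, Any]:
    """Append a new entry linked to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)}
    chain.append(entry)
    return entry


def verify(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(entry["prev"], entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Changing an old rationale changes its hash, which no longer matches the `prev` field of the next entry, so a rewrite of history has to recompute every later link and is detectable by anyone holding an earlier copy of the tail hash.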

Triple Storage

Audit trails are stored in three places: Graph Memory for real-time access, a SQLite database for historical queries, and JSONL files for file-based verification. All three are queryable through a single API.
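
One way to picture the triple write is a small facade that records each entry in all three places and can check that they agree. In this sketch a plain dictionary stands in for Graph Memory, and the class name, table schema, and JSONL layout are all assumptions rather than CIRIS's actual storage format.

```python
import json
import sqlite3
from typing import Any, Dict


class TripleAuditStore:
    """Hypothetical facade over three audit stores."""

    def __init__(self, db_path: str, jsonl_path: str) -> None:
        self.graph: Dict[str, Dict[str, Any]] = {}  # stand-in for Graph Memory
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS audit (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        self.jsonl_path = jsonl_path
        open(jsonl_path, "a", encoding="utf-8").close()  # make sure the file exists

    def record(self, entry_id: str, entry: Dict[str, Any]) -> None:
        """Write one audit entry to all three stores."""
        body = json.dumps(entry, sort_keys=True)
        self.graph[entry_id] = entry
        with self.db:  # commits on success, rolls back on error
            self.db.execute("INSERT INTO audit (id, body) VALUES (?, ?)", (entry_id, body))
        with open(self.jsonl_path, "a", encoding="utf-8") as handle:
            handle.write(json.dumps({"id": entry_id, "entry": entry}, sort_keys=True) + "\n")

    def consistent(self, entry_id: str) -> bool:
        """Return True only if all three stores hold the same entry."""
        if entry_id not in self.graph:
            return False
        body = json.dumps(self.graph[entry_id], sort_keys=True)
        row = self.db.execute("SELECT body FROM audit WHERE id = ?", (entry_id,)).fetchone()
        if row is None or row[0] != body:
            return False
        with open(self.jsonl_path, encoding="utf-8") as handle:
            for line in handle:
                rec = json.loads(line)
                if rec["id"] == entry_id and json.dumps(rec["entry"], sort_keys=True) == body:
                    return True
        return False
```

Keeping three copies in different formats means an attacker who edits one store still has to edit the other two in matching ways, and the JSONL file can be checked with ordinary text tools.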

Ed25519 Signatures

Every audit entry is signed with Ed25519. The Creator Ledger records initial risk assessments. DSAR deletions leave cryptographic proof of compliance. Every decision is attributable and verifiable.

The Coherence Ratchet

Each truthful action makes future truth-telling easier and deception harder. The hash chain creates a one-way ratchet toward coherence. The agent's history becomes its constraint and its credential.

Privacy by Architecture

GDPR, CCPA, and common sense.

Secrets Filter

API keys, passwords, and sensitive patterns are detected and filtered before reaching memory or logs. The filter runs on every input. Secrets never persist in any storage layer.
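
A pattern-based filter of this kind can be sketched with a few regular expressions applied before text is stored. The three patterns below are illustrative assumptions, not the patterns CIRIS uses, and a production filter would need a much broader set.

```python
import re
from typing import List, Tuple

# ASSUMPTION: example patterns only; real coverage would be far wider.
PATTERNS = [
    ("api_key", re.compile(r"\bsk-[A-Za-z0-9]{20,}\b")),
    ("password", re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*\S+")),
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
]


def redact(text: str) -> Tuple[str, List[str]]:
    """Replace anything matching a known secret pattern; report which kinds were found."""
    found = []
    for name, pattern in PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{name}]", text)
        if count:
            found.append(name)
    return text, found
```

Running the filter on every input before it reaches memory or logs is what keeps the redacted placeholder, rather than the secret itself, in every storage layer.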

DSAR Compliance

Data Subject Access Requests are handled automatically. Users can request export or deletion of their data. Deletions leave cryptographic proof of compliance while removing actual content.
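
One common way to leave proof of a deletion without keeping the data is a tombstone: the content is removed and a record holding only a digest and a timestamp takes its place. The sketch below assumes that approach; the function name, field names, and in-memory `store` are hypothetical, and the page does not say which scheme CIRIS actually uses.

```python
import hashlib
import time
from typing import Any, Dict, Optional


def delete_with_proof(
    store: Dict[str, Dict[str, Any]],
    record_id: str,
    deleted_at: Optional[float] = None,
) -> Dict[str, Any]:
    """Replace a record's content with a tombstone that proves a deletion happened."""
    content = store[record_id]["content"]
    tombstone = {
        "deleted": True,
        # Digest of what was removed; the content itself is not kept.
        "content_sha256": hashlib.sha256(content.encode("utf-8")).hexdigest(),
        "deleted_at": time.time() if deleted_at is None else deleted_at,
    }
    store[record_id] = tombstone
    return tombstone
```

The tombstone can then be appended to the audit trail like any other entry. A plain digest of short or guessable content can be brute-forced, so a real design would likely need a keyed or salted construction.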

Local-First Processing

All processing happens on your device by default. Nothing leaves your machine unless you explicitly configure external services. You control what data exists and where it goes.

Verify Everything.

Governance infrastructure you can audit.

Every safety claim on this page is implemented in code you can read. The audit logs are real. The signatures are verifiable. The kill switch works. This is what AI governance infrastructure looks like when it's open.