Safety vs Censorship

Both crowdsource. Only one is honest about what.

A crowdsourced safety platform and a crowdsourced censorship platform look alike from far away — a network of people with standing voting on what passes. The difference is what gets voted on, and what doesn't.

The playground picture

The bad way

The other kids vote: "We don't like Tommy. He's out." No rule. Just a feeling. Tomorrow they might decide they don't like Sarah. The vote is the rule.

The OK way

Before the game starts, everyone agrees: "No kicking." If Tommy kicks somebody, the rule applies. If the video shows he didn't, the call gets reversed.

The difference is: the rule was written down first, it's about something you can see, there's a do-over when the call is wrong, and the same rules apply to everyone — including the kid who started the game.

Rules are crowdsourced. Verdicts are machined.

That one line is the load-bearing commitment. Humans propose rules and vote on them — public, dated, signed, reversible. A deterministic machine applies those rules to specific cases. Same response + same rule → same verdict, every time.
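The "verdicts are machined" half can be sketched in a few lines. This is a minimal illustration, not the platform's actual rule format or API; the `Rule` fields and `apply_rule` function are assumptions made up for the example. The point is that verdict-making is a pure function: no voter, no mood, no hidden preference can leak into the output.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    rule_id: str
    adopted: str            # date the crowd ratified it (public, signed)
    banned_phrases: tuple   # the operational, machine-checkable content

def apply_rule(rule: Rule, response: str) -> str:
    """Deterministic: same response + same rule -> same verdict, every run."""
    text = response.lower()
    if any(phrase in text for phrase in rule.banned_phrases):
        return "violation"
    return "pass"

rule = Rule("R-001", "2024-01-01", ("just snap out of it",))
assert apply_rule(rule, "Just snap out of it.") == "violation"
assert apply_rule(rule, "Have you talked to a professional?") == "pass"
```

Because `apply_rule` is deterministic and the rule is a frozen, dated record, any disagreement has to move upstream to the rule text itself.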

The argument moves upstream, to whether the rule should exist — which can be debated openly — instead of downstream, to whether Tommy's specific behavior counts today, which is where hidden preferences ride in.

The thing we deliberately don't crowdsource

No human votes on whether a specific response broke the rule. The deterministic machine decides — same input, same output. If you think the rule was applied wrong, the appeal is Reconsideration by a fresh quorum (the original adjudicators recused), not a re-vote in the same room.
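The recusal step can be made concrete. The sketch below is an assumption about how a fresh quorum might be drawn (the function name, pool, and quorum size are invented for illustration): selection is seeded by the case id so it is auditable and repeatable, and everyone who adjudicated the original case is excluded by construction.

```python
import hashlib

def fresh_quorum(case_id: str, pool: list, recused: set, size: int = 3) -> list:
    """Pick an appeal quorum that deterministically excludes the recused."""
    eligible = sorted(m for m in pool if m not in recused)
    if len(eligible) < size:
        raise ValueError("not enough non-recused members for a fresh quorum")
    # Seed the ordering with the case id: anyone can re-run the selection
    # and get the same quorum, so there is no room for hand-picking.
    def key(member):
        return hashlib.sha256(f"{case_id}:{member}".encode()).hexdigest()
    return sorted(eligible, key=key)[:size]

pool = ["ana", "bo", "cy", "dee", "eli", "fay"]
original = {"ana", "bo", "cy"}          # original adjudicators, now recused
appeal = fresh_quorum("case-42", pool, recused=original)
assert not set(appeal) & original       # no re-vote in the same room
assert appeal == fresh_quorum("case-42", pool, recused=original)  # repeatable
```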

Crowdsourcing verdicts is the censorship anti-pattern. Crowdsourcing rules is the safety pattern. Keeping those separate is the whole job.

Where this can still go wrong

None of this is a guarantee. The line stays drawn only if rule language stays operational — about things a machine can check, not about feelings. The moment a rule slides from "uses the wrong word for therapy" toward "feels disrespectful," human interpretation re-enters through the back door. The honest claim isn't that the architecture can't become censorship. It's that every mechanism we can think of for distinguishing the two is built in — and the discipline of using them is what holds the line.
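The "operational" constraint can itself be enforced mechanically, at least in sketch form. In the hypothetical format below (the check names and rule structure are assumptions, not the spec's), a proposed rule is accepted only if every clause names a check the machine knows how to run; "feels disrespectful" has no such check, so it can never become a rule and stays an argument upstream.

```python
# Registry of checks a machine can actually execute.
OPERATIONAL_CHECKS = {
    "contains_phrase": lambda text, arg: arg.lower() in text.lower(),
    "exceeds_length": lambda text, arg: len(text) > int(arg),
}

def validate_rule(clauses: list) -> bool:
    """A rule is machine-applicable iff every clause names a known check."""
    return all(check in OPERATIONAL_CHECKS for check, _ in clauses)

ok_rule = [("contains_phrase", "just snap out of it")]
bad_rule = [("feels_disrespectful", None)]   # no machine check exists
assert validate_rule(ok_rule)
assert not validate_rule(bad_rule)
```

Rejecting `bad_rule` at proposal time is the back door being held shut: subjective language is turned away before it can ever reach a verdict.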

The crowdsourcing primitives, the appeal paths, and the machine-applicable rule format live in the CIRISNodeCore spec. The 14-language mental-health batteries are the first cell where the loop runs.