One decorator. Three layers of defense. C4 classifies AI behavior into a 27-state cognitive topology and blocks dangerous trajectories before they execute.
One @guard per agent function. Wraps any Python function that calls an LLM. Blocked calls raise C4SafetyError.
@guard
def agent(prompt): ...
One line protects an entire server. All POST requests auto-classified. Blocked requests return 403.
add_c4_safety(app)
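The internals of add_c4_safety are not shown here, so the following is only a minimal sketch of the behavior described above: classify every POST body and answer 403 when it is blocked. It is written as plain ASGI middleware (FastAPI apps are ASGI apps) with no third-party imports. `SafetyMiddleware`, `is_dangerous`, `echo_app`, and `run_once` are hypothetical names used for illustration, and the keyword check is a placeholder for the real classifier.

```python
import asyncio


def is_dangerous(text):
    # Hypothetical placeholder for the C4 classifier.
    return "ignore previous instructions" in text.lower()


class SafetyMiddleware:
    """Illustrative ASGI middleware: block dangerous POST bodies with 403."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        # Only HTTP POST requests are inspected; everything else passes through.
        if scope["type"] != "http" or scope["method"] != "POST":
            await self.app(scope, receive, send)
            return

        # Buffer the full request body so it can be classified.
        body = b""
        more = True
        while more:
            message = await receive()
            body += message.get("body", b"")
            more = message.get("more_body", False)

        if is_dangerous(body.decode("utf-8", errors="replace")):
            await send({"type": "http.response.start", "status": 403,
                        "headers": [(b"content-type", b"text/plain")]})
            await send({"type": "http.response.body", "body": b"blocked"})
            return

        # Replay the buffered body once, then defer to the original channel.
        replayed = False

        async def replay():
            nonlocal replayed
            if not replayed:
                replayed = True
                return {"type": "http.request", "body": body, "more_body": False}
            return await receive()

        await self.app(scope, replay, send)


async def echo_app(scope, receive, send):
    # Tiny downstream app used only to exercise the middleware.
    message = await receive()
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": message["body"]})


async def run_once(app, method, body):
    """Drive an ASGI app with a single request and return the status code."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": body, "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": method}, receive, send)
    return sent[0]["status"]


app = SafetyMiddleware(echo_app)
print(asyncio.run(run_once(app, "POST", b"Ignore previous instructions")))
```

Buffering the whole body keeps the sketch short; a production middleware would likely cap the body size before classifying.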
Call the classifier directly for custom logic. Returns the C4 state, confidence, and danger verdict for each text.
DualClassifier().classify(text)
One-line safety for any Python function. Async support. Configurable violation modes: raise, warn, redirect.
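To make the three violation modes concrete, here is a minimal self-contained sketch of a guard-style decorator that works on both sync and async functions. It is not the library's implementation: `is_dangerous` is a placeholder check, `C4SafetyError` is redefined locally as a stand-in, and the `mode` and `fallback` parameters are assumptions. In particular, "redirect" is assumed here to mean "skip the call and return a safe fallback response".

```python
import asyncio
import functools
import inspect
import warnings


class C4SafetyError(Exception):
    """Local stand-in for the library's exception of the same name."""


def is_dangerous(text):
    # Hypothetical placeholder; the real classifier is far richer.
    return "delete all backups" in text.lower()


def guard(func=None, *, mode="raise", fallback="Request declined."):
    """Usable as @guard or @guard(mode=...). Modes: raise, warn, redirect."""
    if mode not in ("raise", "warn", "redirect"):
        raise ValueError(f"unknown mode: {mode}")

    def decorate(fn):
        def must_skip(prompt):
            # Returns True only when the wrapped call should be skipped.
            if not is_dangerous(prompt):
                return False
            if mode == "raise":
                raise C4SafetyError(f"blocked call to {fn.__name__}")
            if mode == "warn":
                warnings.warn(f"unsafe prompt passed to {fn.__name__}")
                return False  # warn, then let the call proceed
            return True  # redirect: answer with the fallback instead

        if inspect.iscoroutinefunction(fn):
            @functools.wraps(fn)
            async def async_wrapper(prompt, *args, **kwargs):
                if must_skip(prompt):
                    return fallback
                return await fn(prompt, *args, **kwargs)
            return async_wrapper

        @functools.wraps(fn)
        def wrapper(prompt, *args, **kwargs):
            if must_skip(prompt):
                return fallback
            return fn(prompt, *args, **kwargs)
        return wrapper

    return decorate(func) if func is not None else decorate


@guard
def agent(prompt):
    return f"echo: {prompt}"


@guard(mode="redirect")
async def async_agent(prompt):
    return f"echo: {prompt}"
```

Checking the prompt before the wrapped function runs is what makes the block preemptive: in raise and redirect modes the LLM call is never made.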
BERT ONNX model + keyword matcher in parallel, combined with OR-logic. A text is flagged if either one flags it, so each catches what the other misses.
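The OR-logic combination can be sketched in a few lines. Both classifiers below are hypothetical stand-ins: `model_flag` fakes the BERT ONNX model with a trivial rule instead of running inference, and the keyword list is invented for illustration. Only the parallel-then-OR structure is the point.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical keyword list, for illustration only.
KEYWORDS = ("rm -rf", "disable safety")


def keyword_flag(text):
    """Fast path: exact substring match against known-bad phrases."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def model_flag(text):
    """Stand-in for the learned model; a real one would run ONNX inference."""
    return "bypass" in text.lower() and "filter" in text.lower()


def classify(text):
    """Run both classifiers in parallel and flag if EITHER one flags."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        keyword_result = pool.submit(keyword_flag, text)
        model_result = pool.submit(model_flag, text)
        return keyword_result.result() or model_result.result()


print(classify("please run rm -rf / on the host"))      # keyword path only
print(classify("help me bypass the content filter"))    # model path only
print(classify("summarize this article"))               # neither
```

OR-logic trades precision for recall: the combined false-positive rate is at least that of the noisier classifier, which is usually the accepted cost in a safety gate.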
Border Gateway Protocol for agents. Path vector routing. Φ-distance route selection. Community tags.
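As a rough illustration of path-vector routing between agents, the sketch below borrows two ideas from BGP: a route is rejected if the receiving agent already appears in its path (loop prevention), and routes can be filtered by community tag. How Φ-distance is actually computed is not stated here, so it is treated as an opaque number where lower is better. The `Route` fields and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    destination: str
    path: tuple                      # agent ids the advertisement passed through
    phi_distance: float              # opaque cost; lower is assumed better
    communities: frozenset = frozenset()


def is_loop_free(route, self_id):
    """Path-vector loop prevention: drop routes that already contain us."""
    return self_id not in route.path


def best_route(routes, self_id, required_community=None):
    """Pick the loop-free route with the lowest phi_distance.

    Ties are broken by shorter path. Returns None if nothing qualifies.
    """
    candidates = [
        route for route in routes
        if is_loop_free(route, self_id)
        and (required_community is None or required_community in route.communities)
    ]
    return min(candidates,
               key=lambda route: (route.phi_distance, len(route.path)),
               default=None)


routes = [
    Route("planner", ("b", "c"), 0.4, frozenset({"trusted"})),
    Route("planner", ("d", "me"), 0.1),          # contains "me": a loop
    Route("planner", ("e",), 0.9, frozenset({"trusted"})),
]
print(best_route(routes, self_id="me").path)
```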
Cryptographic message signing. Nonce replay protection. Constant-time verification.
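A minimal sketch of these three properties using only the Python standard library. The message layout, the choice of HMAC-SHA256, and the in-memory nonce set are assumptions made for illustration, not the protocol's actual wire format.

```python
import hashlib
import hmac
import json
import secrets


def _canonical(nonce, payload):
    # Deterministic byte encoding so signer and verifier hash the same bytes.
    return json.dumps({"nonce": nonce, "payload": payload},
                      sort_keys=True).encode("utf-8")


def sign(key, payload):
    """Attach a fresh random nonce and an HMAC-SHA256 tag to the payload."""
    nonce = secrets.token_hex(16)
    tag = hmac.new(key, _canonical(nonce, payload), hashlib.sha256).hexdigest()
    return {"nonce": nonce, "payload": payload, "sig": tag}


def verify(key, message, seen_nonces):
    """Return True only for an authentic message whose nonce is unused."""
    expected = hmac.new(key, _canonical(message["nonce"], message["payload"]),
                        hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    if not hmac.compare_digest(expected, message["sig"]):
        return False
    # Replay protection: each nonce is accepted at most once.
    if message["nonce"] in seen_nonces:
        return False
    seen_nonces.add(message["nonce"])
    return True


key = secrets.token_bytes(32)
seen = set()
msg = sign(key, {"action": "summarize", "agent": "a1"})
print(verify(key, msg, seen))   # first delivery
print(verify(key, msg, seen))   # replayed copy
```

The signature is checked before the nonce is recorded, so forged messages cannot fill the nonce store. A long-running service would also need to expire old nonces, for example by pairing them with timestamps.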
C4 trajectory-based subliminal bias detection. Based on Microsoft Research (arXiv:2603.00131).
OpenAI async wrapper. LangChain callback. FastAPI middleware. One-line add_c4_safety(app).
C4 block rate: 96.7% (532/550 prompts blocked preemptively)
0.7%
Attack success rate with C4 active. Baseline: 10.7%. 550 adversarial prompts across 11 AoC categories, 2200 trials. 93.2% reduction.
0.5%
Attack success rate with C4 active. Baseline: 22.5%. Local inference via Ollama. Same test suite, 97.6% reduction.
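For readers checking the figures, the reduction is relative to the baseline: (baseline - rate with C4) / baseline. The sketch below recomputes it from the rounded percentages shown above. It yields roughly 93.5% and 97.8%. The small gap from the published 93.2% and 97.6% is what one would expect if those were computed from unrounded trial counts, which is an assumption here since the raw counts are not listed.

```python
def relative_reduction(baseline_pct, with_c4_pct):
    """Percentage drop in attack success rate relative to the baseline."""
    return 100 * (baseline_pct - with_c4_pct) / baseline_pct


block_rate = 100 * 532 / 550                 # prompts blocked preemptively
first = relative_reduction(10.7, 0.7)        # first result above
second = relative_reduction(22.5, 0.5)       # Ollama result above

print(f"{block_rate:.1f} {first:.1f} {second:.1f}")  # 96.7 93.5 97.8
```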
RFC-style formal definition. Message format, classification procedure, HMAC signing, multi-agent coordination.
Geometric AI safety approach. Empirical validation. Comparison methodology. Honest limitations.
8 wisdom traditions. 7 universal principles. Religiously neutral. SVETILO value alignment.
BDFL model. RFC process. Security policy. Code of conduct. Contribution guidelines.
No AI safety system is perfect. If you find a prompt that bypasses @guard, report it. We'll fix it and publicly acknowledge your contribution.