principles

not slogans. testable rules. if a system cannot meet these, it should not claim reliability.

1. name the compression (visibility)

every useful system compresses. list every step where it happens: retrieve, rank, summarize, generate, redact, review.

quick test: can a new engineer draw the compression path for a single request without guessing?

antipattern: hidden heuristics and silent truncation.
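one way to make the path drawable is to have every step declare what it dropped and why. the sketch below is a minimal illustration; `CompressionSite`, `render_path`, and the example counts and rules are hypothetical names and values, not part of any existing tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionSite:
    """One step that reduces information (hypothetical record shape)."""
    step: str       # e.g. "retrieve", "rank", "summarize"
    items_in: int   # units entering the step (documents, tokens, ...)
    items_out: int  # units leaving the step
    rule: str       # the explicit rule that decided what was dropped


def render_path(sites):
    """Return a plain-text trace of the path; reject undocumented steps."""
    lines = []
    for site in sites:
        if not site.rule.strip():
            # An empty rule is the "hidden heuristic" antipattern.
            raise ValueError(f"{site.step}: compression rule is not documented")
        lines.append(f"{site.step}: {site.items_in} -> {site.items_out} ({site.rule})")
    return "\n".join(lines)


# Illustrative values only.
example_path = [
    CompressionSite("retrieve", 10000, 50, "top 50 by BM25 score"),
    CompressionSite("rank", 50, 5, "top 5 by reranker score"),
    CompressionSite("summarize", 5, 1, "summary capped at 200 tokens"),
]
print(render_path(example_path))
```

a step that cannot state its rule fails loudly instead of truncating silently.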

2. score the loss (measurement)

compute a per-site tension score τ that combines information loss, task degradation, provenance gaps, and uncertainty, then aggregate the site scores along the path.

quick test: does τ increase when sources conflict or when summarization is lossy?

antipattern: confidence without a cost model.

3. provenance first (accountability)

every claim carries receipts. attach sources and step metadata so reviewers can retrace without digging.

quick test: can a third party reproduce the claim with the provided sources?

antipattern: fluent output with no citations.

4. gate with abstention (safety)

block unsupported claims. if entailment fails or τ exceeds the threshold τ*, abstain or request clarification.

quick test: do abstentions cluster where τ spikes?

antipattern: answer anyway.

5. map contradictions (coherence)

treat conflicts as signals, not noise. detect and tag contradictions in inputs, tools, and policy so tension is visible.

quick test: can the system highlight the exact tokens or records that disagree?

antipattern: silent tie breaking.

6. audit by default (replay)

make replay cheap. logs should reconstruct the path, sources, scores, and gates for any output.

quick test: can you rebuild yesterday’s answer bit for bit from stored artifacts?

antipattern: ephemeral steps with no trace.
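a bit-for-bit check can be sketched as a digest comparison between the logged artifact and the replayed one. this is a minimal sketch, assuming artifacts are JSON-serializable dicts; `artifact_digest`, `verify_replay`, and the record fields are hypothetical.

```python
import hashlib
import json


def artifact_digest(artifact: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(
        artifact, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_replay(logged: dict, replayed: dict) -> bool:
    """True only if the replayed artifact matches the logged one exactly."""
    return artifact_digest(logged) == artifact_digest(replayed)


# Illustrative record: path, sources, scores, and gates for one output.
logged = {
    "path": ["retrieve", "rank", "summarize"],
    "sources": ["doc-17", "doc-42"],
    "scores": {"retrieve": 0.10, "rank": 0.20, "summarize": 0.35},
    "gates": {"entailed": True},
    "output": "answer text",
}
replayed = dict(reversed(list(logged.items())))  # same content, other key order
print(verify_replay(logged, replayed))
```

storing the digest next to each output makes the daily replay check a single comparison.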

7. privacy by design (governance)

log tension, not secrets. support local scoring, redaction at source, and tiered provenance views based on role.

quick test: can you run CAI with sensitive text withheld while still computing τ?

antipattern: raw data in every log.
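"log tension, not secrets" can be sketched as a log entry that keeps the score and a keyed digest of the text but never the text. this is a minimal sketch under assumptions: `tension_log_entry`, the field names, and the key handling are hypothetical, and a keyed digest is only one option for linking entries without storing content.

```python
import hashlib
import hmac


def tension_log_entry(step: str, text: str, tau: float, key: bytes) -> dict:
    """Build a log entry holding the score and a keyed digest, not the text."""
    # HMAC-SHA256 lets two entries for the same text be matched by anyone
    # holding the key, while the raw text never reaches the log.
    digest = hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"step": step, "text_hmac_sha256": digest, "tau": tau}


# Illustrative values only; a real key would come from a secret store.
entry = tension_log_entry(
    step="summarize",
    text="patient has condition X",
    tau=0.42,
    key=b"example-key-not-for-production",
)
print(entry)
```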

8. optimize for outcomes (impact)

judge CAI by business or safety outcomes: fewer unsupported claims at equal or better accuracy, calibrated abstention, faster review.

quick test: does the unsupported claim rate fall at fixed quality when CAI is on?

antipattern: metric theater that ignores harm.

self audit checklist

use this before shipping or auditing a pipeline
item                          | status | owner | notes
compression sites listed      |        |       |
τ per site computed           |        |       |
provenance attached           |        |       |
τ* abstention thresholds set  |        |       |
contradiction detector wired  |        |       |
replay log verified           |        |       |
privacy controls enforced     |        |       |
outcome metrics tracked       |        |       |
cai_audit:
  compression_sites_listed: false
  tension_score_tau_per_site: false
  provenance_attached: false
  abstention_threshold_tau_star: false
  contradiction_detector: false
  replay_log_verified: false
  privacy_controls_enforced: false
  outcome_metrics_tracked: false
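the checklist can double as a release gate. the sketch below assumes the `cai_audit` block has already been parsed into a dict (the parsing step is omitted); `unmet_items` and `ready_to_ship` are hypothetical helper names.

```python
# Mirrors the cai_audit block above, as it would look after parsing.
cai_audit = {
    "compression_sites_listed": False,
    "tension_score_tau_per_site": False,
    "provenance_attached": False,
    "abstention_threshold_tau_star": False,
    "contradiction_detector": False,
    "replay_log_verified": False,
    "privacy_controls_enforced": False,
    "outcome_metrics_tracked": False,
}


def unmet_items(audit: dict) -> list:
    """Names of checklist items that are not explicitly true."""
    return [name for name, done in audit.items() if done is not True]


def ready_to_ship(audit: dict) -> bool:
    """A pipeline passes only when every item is checked."""
    return not unmet_items(audit)


print(unmet_items(cai_audit))
print(ready_to_ship(cai_audit))
```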

foundations snapshot

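compact definitions and equations used by the principles. the per-site tension score τ in principle 2 is the CTS defined below, and the threshold τ* in principle 4 is the CTS threshold in the abstention rule.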

compression tension score (CTS)

per site: CTS_i = w_I·I_i + w_T·T_i + w_P·P_i + w_U·U_i

  • I: information loss (e.g., KL divergence or rate drop), T: task degradation, P: provenance gap, U: uncertainty amplification
  • normalize each term to [0,1]; choose weights w that sum to 1
  • path score: CTS(𝓟) = Σ_i α_i·CTS_i with step weights α_i
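the per-site and path formulas translate directly into code. this is a minimal sketch: the weight values, `SiteTerms`, `site_cts`, and `path_cts` are assumptions chosen for illustration, and how each term is measured and normalized is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteTerms:
    """Terms for one compression site, each already normalized to [0, 1]."""
    info_loss: float         # I
    task_degradation: float  # T
    provenance_gap: float    # P
    uncertainty: float       # U


# Hypothetical weights (w_I, w_T, w_P, w_U); they must sum to 1.
WEIGHTS = (0.4, 0.3, 0.2, 0.1)


def site_cts(terms: SiteTerms, weights=WEIGHTS) -> float:
    """Weighted sum of the four terms for a single site."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    values = (
        terms.info_loss,
        terms.task_degradation,
        terms.provenance_gap,
        terms.uncertainty,
    )
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("terms must be normalized to [0, 1]")
    return sum(w * v for w, v in zip(weights, values))


def path_cts(sites, alphas, weights=WEIGHTS) -> float:
    """Path score: sum over sites of alpha_i * CTS_i."""
    if len(sites) != len(alphas):
        raise ValueError("need one step weight per site")
    return sum(a * site_cts(s, weights) for a, s in zip(alphas, sites))
```

if the step weights also sum to 1, the path score stays in [0,1] and can be compared against a single threshold.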

contradiction score

for claims c₁, c₂ with NLI probabilities p_E, p_C, p_N (entailment, contradiction, neutral):

CS(c₁,c₂) = p_C − p_E ∈ [−1,1]

attach pointers to evidence spans for both claims.
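a contradiction tag can carry the score together with the evidence pointers. this is a minimal sketch: the probabilities are assumed to come from some NLI model that is not shown, and `Span`, `ContradictionTag`, and the identifiers in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Pointer to an evidence span: source id plus character offsets."""
    source_id: str
    start: int
    end: int


@dataclass(frozen=True)
class ContradictionTag:
    score: float   # CS in [-1, 1]; positive means contradiction dominates
    span_a: Span   # evidence for the first claim
    span_b: Span   # evidence for the second claim


def contradiction_score(p_entail: float, p_contra: float, p_neutral: float) -> float:
    """CS = p_C - p_E, after checking the three probabilities sum to 1."""
    if abs(p_entail + p_contra + p_neutral - 1.0) > 1e-6:
        raise ValueError("NLI probabilities must sum to 1")
    return p_contra - p_entail


# Illustrative values only.
tag = ContradictionTag(
    score=contradiction_score(p_entail=0.1, p_contra=0.8, p_neutral=0.1),
    span_a=Span("doc-17", 120, 164),
    span_b=Span("doc-42", 8, 51),
)
print(tag)
```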

provenance entropy

H_prov = −Σ_k p_k log p_k, Conf_prov = 1 − H_prov / log K

confidence is high when a small set of strong, verifiable sources dominates.
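the entropy formula can be sketched as follows. two assumptions are made here that the definition above does not state: p_k is taken to be source k's normalized share of support, and K is the number of sources. the single-source case (log K = 0) is handled by returning 1, which is also an assumption.

```python
import math


def provenance_confidence(source_weights) -> float:
    """Conf_prov = 1 - H_prov / log K for non-negative source weights."""
    total = sum(source_weights)
    if total <= 0 or any(w < 0 for w in source_weights):
        raise ValueError("weights must be non-negative with a positive sum")
    k = len(source_weights)
    if k == 1:
        # Assumption: one source means zero entropy, so confidence is 1.
        return 1.0
    probs = [w / total for w in source_weights]
    # Terms with p = 0 contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(k)


print(provenance_confidence([8, 1, 1]))  # one dominant source
print(provenance_confidence([1, 1, 1]))  # evenly spread support
```

evenly spread support drives the value toward 0, and a single dominant source drives it toward 1. the formula alone does not check that the dominant source is strong or verifiable.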

auditability

Audit(𝓟) = (# of steps with replayable artifacts) / n, where n is the number of steps in the path

targets: ≥ 0.9 for public claims, ≥ 0.7 for internal.
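the ratio and its targets can be checked in a few lines. this is a minimal sketch: `audit_score`, `meets_target`, and the `TARGETS` keys are hypothetical names, and each step is reduced to a boolean saying whether it left a replayable artifact.

```python
# Targets from the text above: 0.9 for public claims, 0.7 for internal ones.
TARGETS = {"public": 0.9, "internal": 0.7}


def audit_score(has_artifact) -> float:
    """Fraction of steps in the path that have a replayable artifact."""
    steps = list(has_artifact)
    if not steps:
        raise ValueError("path has no steps")
    return sum(1 for ok in steps if ok) / len(steps)


def meets_target(score: float, audience: str) -> bool:
    """Compare a score against the target for 'public' or 'internal' use."""
    return score >= TARGETS[audience]


# Illustrative path: three of four steps can be replayed.
score = audit_score([True, True, True, False])
print(score, meets_target(score, "internal"), meets_target(score, "public"))
```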

abstention rule

emit an answer only if all are true:

  • V(E, ŷ) = 1 (verifier entailment)
  • CTS(𝓟) ≤ τ_cts
  • Audit(𝓟) ≥ τ_audit

otherwise produce a short, specific abstain message.
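the three conditions combine into one gate. this is a minimal sketch: the verifier result, path score, and audit score are assumed to be computed elsewhere, the threshold values in the example are hypothetical, and the message wording is only one possible form of a short, specific abstain message.

```python
def gate(entailed: bool, cts_path: float, audit: float,
         tau_cts: float, tau_audit: float):
    """Return (emit, message); emit is True only if all three checks pass."""
    reasons = []
    if not entailed:
        reasons.append("evidence does not entail the answer")
    if cts_path > tau_cts:
        reasons.append(f"path tension {cts_path:.2f} exceeds {tau_cts:.2f}")
    if audit < tau_audit:
        reasons.append(f"auditability {audit:.2f} is below {tau_audit:.2f}")
    if reasons:
        # Name every failed check so the abstention is specific.
        return False, "abstaining: " + "; ".join(reasons)
    return True, ""


# Illustrative thresholds only.
print(gate(True, 0.20, 0.95, tau_cts=0.30, tau_audit=0.90))
print(gate(True, 0.45, 0.95, tau_cts=0.30, tau_audit=0.90))
```

listing every failed check, rather than stopping at the first, tells a reviewer what would have to change for the answer to be emitted.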
