Verified-backup health endpoint and canary auto-rollback on SLO breach
Two reliability additions for self-hosted operators.
The new /health/deep endpoint returns HTTP 503 when no SHA-256-verified backup has completed within a configurable window. Verification checks the SHA-256 checksum, attempts a sample restore, and confirms destination disk headroom before marking the backup as good. Operators can wire /health/deep into their existing uptime monitor as a leading indicator of backup drift. The original /health liveness probe is unchanged and remains the container health check.
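As a rough illustration of the decision /health/deep makes, here is a minimal Python sketch. The BackupRecord type, its field names, and the deep_health_status function are hypothetical and are not taken from the product; only the three verification steps and the 503-outside-the-window rule come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable


@dataclass
class BackupRecord:
    # Hypothetical schema: the release note does not define one.
    completed_at: datetime
    checksum_ok: bool        # SHA-256 checksum matched
    sample_restore_ok: bool  # sample restore succeeded
    disk_headroom_ok: bool   # destination disk has enough free space

    def verified(self) -> bool:
        # A backup counts as good only if all three checks passed.
        return self.checksum_ok and self.sample_restore_ok and self.disk_headroom_ok


def deep_health_status(
    backups: Iterable[BackupRecord], window: timedelta, now: datetime
) -> int:
    """Return 200 if any verified backup completed within the window, else 503."""
    cutoff = now - window
    for backup in backups:
        if backup.verified() and backup.completed_at >= cutoff:
            return 200
    return 503
```

Under these assumptions, a backup that passed its checksum but failed the sample restore does not count toward the window, so the endpoint would report 503 even if that backup is recent.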
Canary deployments are now evaluated against a declarative set of SLOs defined in config/canary/slos.yaml. A breach triggers a reversible rollback action with a per-SLO cooldown. The default set covers backup pass rate, kernel verifier rejections on the egress guard, and PII pipeline latency. SLO evaluation is off by default; populate the file to opt in.
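The breach-and-cooldown behavior can be sketched in Python as follows. This is an illustration under stated assumptions, not the shipped evaluator: the Slo fields, the CanaryEvaluator class, and the metric name used in the usage note are all hypothetical, and the actual schema of config/canary/slos.yaml is not shown here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Slo:
    # Hypothetical fields; the real slos.yaml schema may differ.
    name: str
    threshold: float
    higher_is_better: bool  # True for rates, False for latencies and counts
    cooldown_s: float       # minimum seconds between rollbacks for this SLO

    def breached(self, value: float) -> bool:
        if self.higher_is_better:
            return value < self.threshold
        return value > self.threshold


class CanaryEvaluator:
    def __init__(self, slos: Iterable[Slo]) -> None:
        self.slos = list(slos)
        # Timestamp of the last rollback triggered by each SLO.
        self._last_rollback: Dict[str, float] = {}

    def evaluate(self, metrics: Dict[str, float], now: float) -> List[str]:
        """Return the names of SLOs whose breach should trigger a rollback now."""
        triggered = []
        for slo in self.slos:
            value = metrics.get(slo.name)
            if value is None or not slo.breached(value):
                continue
            last = self._last_rollback.get(slo.name)
            if last is not None and now - last < slo.cooldown_s:
                continue  # breach observed, but this SLO is still cooling down
            self._last_rollback[slo.name] = now
            triggered.append(slo.name)
        return triggered
```

In this sketch an evaluator built from an empty SLO list never triggers a rollback, which mirrors the opt-in default. With a single hypothetical SLO such as Slo("backup_pass_rate", 0.99, True, 600.0), a breach at time 0 triggers a rollback, a second breach at 300 seconds is suppressed by the cooldown, and a breach at 600 seconds triggers again.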
No action required for SaaS tenants.