Bundled on-premise model container is now opt-in

15 May 2026 v1.10.1 improvement

deployment
performance

Self-hosted deployments without a GPU can now leave the bundled local-inference container out entirely, saving roughly 8 GB of resident memory. The container is gated behind a compose profile and is included by default for backward compatibility. The bootstrap script preserves the profile across upgrades, and a new install flag opts out at first setup.

The admin page for the secure routing lane gains a status pill that reports which rail is actually serving flagged content, with a one-click link to swap the priority. The connection-test modal now mirrors the real chat exchange, badge included, so administrators see exactly what their users will see.

No action required.