What it is
A kill switch is an automated or manual mechanism that immediately halts an agent's execution when a critical policy breach is detected. It terminates the entire session, stopping the agent from taking any further actions. Unlike rate limiting (which throttles), a kill switch is binary: either the agent continues or it stops.
Kill switches can be triggered by policy breach detection (agent attempted a denied action), anomaly scoring (agent behavior deviated significantly from baseline), or manual SOC intervention (a human reviewer observes suspicious activity and activates the kill switch manually).
Why it matters
AI agents are non-deterministic. Given a vague goal and tool access, an agent can reason its way into unexpected actions. In one incident, an agent tasked with "clean up stale files" escalated to deleting critical production directories. By the time the incident was detected, five minutes of damage had occurred.
A kill switch prevents continued escalation. Once the agent attempts an action that crosses a policy boundary, the session terminates immediately. The agent cannot continue reasoning toward more destructive actions.
How it works
The proxy continuously monitors the agent's actions. When a critical-severity policy violation is detected, or when a human reviewer signals termination, the proxy stops accepting tool calls from the agent. The agent receives a structured error response indicating that the session has been terminated and why.
The agent cannot override or bypass the kill switch — the proxy is outside its control. This ensures that no amount of prompt injection or agent reasoning can convince the proxy to continue.
How Intercis implements it
Intercis auto-terminates an agent session when a critical-severity policy breach is detected. The severity scoring for each of the 78 deny list patterns is configurable, allowing you to determine which violations trigger auto-kill versus escalation.
We also provide manual kill switch capability through the SOC dashboard: a human reviewer can terminate any active session with a single click. The termination event is logged and timestamped in the immutable audit trail.
Related terms
- AI Agent Policy Enforcement — The rules that trigger the kill switch.
- Human-in-the-Loop — Manual kill switch activation by SOC teams.