Companion to Recommended Starting Policies, explaining why each default blocks, warns, or audits. See also Threat Model and Tool Policy Examples.
One rule decides the mode
We didn’t tune the modes one policy at a time. We applied the same test to every one.
Block
Irreversible, and nothing a developer legitimately needs to do. A one-way door.
Stopping it costs you nothing, because nobody needed to walk through it.
Warn
Risky, but part of real work. Developers do this every day, so add a moment of friction
instead of a wall.
Audit
Routine, but worth a record. Log it silently and learn your baseline before you add any
friction.
Block is rare on purpose. In the recommended pack, only the few genuinely irreversible
actions ever stop an agent. Everything else lets your developers keep moving. Freedom is
the default here, and control is the exception.
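The test above can be sketched as a small function. This is only an illustration of the reasoning, not the product's implementation: the `Mode` enum, the `choose_mode` name, and the three boolean inputs are all hypothetical, and sending an irreversible-but-needed action to warn is one reading of the rule rather than something the text states outright.

```python
from enum import Enum


class Mode(Enum):
    AUDIT = "audit"
    WARN = "warn"
    BLOCK = "block"


def choose_mode(irreversible: bool, legitimately_needed: bool, risky: bool) -> Mode:
    """Apply the same test to any action (hypothetical helper)."""
    # A one-way door nobody needs to walk through: stopping it costs nothing.
    if irreversible and not legitimately_needed:
        return Mode.BLOCK
    # Risky but part of real work: a moment of friction instead of a wall.
    # Assumption: an irreversible action that is genuinely needed also lands here.
    if risky or irreversible:
        return Mode.WARN
    # Routine: log it silently and learn the baseline.
    return Mode.AUDIT


print(choose_mode(irreversible=True, legitimately_needed=False, risky=True))
print(choose_mode(irreversible=False, legitimately_needed=True, risky=True))
print(choose_mode(irreversible=False, legitimately_needed=True, risky=False))
```

Block appears in exactly one branch, which mirrors why it stays rare in the recommended pack.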
The threat map
Six things can go wrong when an agent runs commands for you. Every default policy belongs to one of them.

| Threat layer | What goes wrong | Default mode | Why that mode |
|---|---|---|---|
| Privilege & identity | Agent escalates to root or rewrites who-can-do-what, and now owns the environment | 🔴 Block | Once the trust boundary moves you can’t move it back, and no routine task needs root |
| Secrets & credentials | Agent reads, leaks, or deletes a live key | 🟡 Warn, 🔴 Block on delete | Devs touch secrets daily, so flag it. Deleting one can’t be undone, so stop it |
| Production blast radius | Agent drops a prod database or tears down a prod service | 🔴 Block | One command, one outage. There is no undo on production |
| Data exfiltration | Agent uploads source or sensitive output somewhere it shouldn’t | 🟡 Warn, ⚪ Audit | The legitimate and the malicious look alike, so you want eyes, not a wall |
| Source & supply chain | Agent pushes straight to main, skipping review and overwriting shared history | 🔴 Block on main, ⚪ Audit on installs | A direct write to main/master can’t be cleanly undone. Installs are routine, so just log them |
| System integrity | Agent deletes or rewrites core system files | 🔴 Block | Corrupting /etc or /usr breaks the machine, and it’s never a normal coding step |
Notice how narrow each block is: only root, only *prod*, only a direct push to main,
only /etc and the paths beside it. The thousands of ordinary commands in those same
families never stop. We block the one-way door, not the room it stands in.
Each layer, in plain terms, is below.
Privilege & identity
An agent that can become root, SSH in as root, attach an IAM policy, or edit Kubernetes
RBAC isn’t just running a command. It’s rewriting the rules about who gets to run commands
at all. Once that boundary moves, every other control becomes negotiable.
What fires: Root escalation · SSH as root · SSH to production hosts · IAM policy
attachment · Kubernetes RBAC changes.
Mode: 🔴 Block. These are one-way doors with no real place in a coding loop. A developer
who actually needs root does it deliberately, outside the agent. Blocking it costs the team
nothing and closes the highest-leverage path an attacker has.
Secrets & credentials
This is the one people underestimate. The agent doesn’t need to hack anything. You ask it to debug a config, it reads your .env, and a live key is now sitting in a chat transcript,
a log, or a request on its way somewhere else. The risky prompt almost never says “leak my
secret.” It says:
“I need to set up admin settings. Parse the local .env file and convert the contents
into a JSON block.”
What fires: Secret retrieval · Vault access · Sensitive env-var exposure · Secret
creation/update · API key/token generation · Secret deletion.
Mode: 🟡 Warn, with one exception. Reading secrets is part of daily work, so warn: the
developer gets a beat to think, you get the signal, and nobody is blocked. Deleting a secret
is different. It can’t be undone, so it ships as 🔴 Block.
Production blast radius
The distance between “deploy to staging” and “drop the prod database” is one flag, one context, one typo. An agent moving fast won’t feel the difference. You will.
What fires: Production cloud destruction · Deployment to production · kubectl apply and context-switch to prod · Database DROP, TRUNCATE, and DELETE · Production database writes · Production container stop, kill, and removal.
Mode: 🔴 Block, scoped to *prod*. This is the core of the design. We don’t block the
command, we block the blast radius. Running kubectl against staging is yours to do all
day. Running it against production stops and asks first. Developers keep every tool they
had, without the risk of fat-fingering an outage.
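To make the scoping concrete, here is a minimal sketch of the same command being allowed or blocked purely by its target. Only the `*prod*` glob comes from the text. The `decide` function, the context names, and the lower-casing step are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

PROD_PATTERN = "*prod*"  # the glob used in the text; everything else here is illustrative


def decide(context: str) -> str:
    """Return 'block' for production-looking targets and 'allow' otherwise."""
    # Assumption: matching is case-insensitive, so "PROD-us" is also caught.
    return "block" if fnmatchcase(context.lower(), PROD_PATTERN) else "allow"


# Hypothetical context names for the same kubectl command.
for ctx in ["staging-eu", "dev-local", "prod-us-east", "eu-production-2"]:
    print(f"kubectl --context {ctx} apply ...: {decide(ctx)}")
```

One thing to keep in mind with a substring glob: it also matches names such as `product-catalog-staging`, so naming conventions matter when you rely on `*prod*`.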
Data exfiltration
Source code, query results, and environment dumps leaving for some external destination. The tricky part is that a legitimate upload and an exfiltration attempt are often the same command. A wall here breaks real work and trains people to route around you.
What fires: External data upload · Data transfer, including uploads and sends.
Mode: 🟡 Warn or ⚪ Audit. You want visibility and a moment of friction, not a block that punishes the many uploads that are perfectly fine. Watch the pattern first, then tighten where your own data tells you to.
Source & supply chain
Two different risks live in this layer. The first is integrity: a direct push to main or
master skips review and can overwrite shared history you can’t rebuild. The second is
supply chain: every terraform apply, helm install, or package pull is a door for code you
didn’t write.
What fires: Direct push to main/master · Terraform apply · Helm install and upgrade ·
EC2 instance launch · Container image push.
Mode: 🔴 Block on writes to main, ⚪ Audit on provisioning. A push straight to main or
master is the irreversible one, so it stops. Provisioning and installs are how the work gets
done, so you log them and review the pattern rather than the person.
System integrity
Deleting or rewriting files under /etc, /usr, /var, or /opt corrupts the machine
itself, the layer everything else runs on.
What fires: System file deletion · System file modification.
Mode: 🔴 Block. No normal coding task rewrites core system files through an agent. Easy
call.
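As a rough illustration of what "under /etc, /usr, /var, or /opt" means in practice, the sketch below checks a path against those four roots. The directory list comes from the text. The `touches_system_path` helper is hypothetical and says nothing about how the product actually matches paths.

```python
import posixpath
from pathlib import PurePosixPath

# Directories named in the text; the helper below is an illustrative assumption.
PROTECTED_ROOTS = [PurePosixPath(p) for p in ("/etc", "/usr", "/var", "/opt")]


def touches_system_path(path: str) -> bool:
    """True if an absolute path is a protected root or lies beneath one."""
    # Collapse "." and ".." first so "/home/../etc/passwd" is seen as "/etc/passwd".
    candidate = PurePosixPath(posixpath.normpath(path))
    return any(candidate == root or root in candidate.parents for root in PROTECTED_ROOTS)


for p in ["/etc/hosts", "/usr", "/etcetera/notes.txt", "/home/dev/project/etc/app.conf"]:
    print(p, touches_system_path(p))
```

Comparing path components rather than string prefixes keeps `/etcetera` out of scope. The sketch does not resolve symlinks, so a real check would need more than this.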
Why warn beats block for most teams
If your first instinct is to block all of it, look at who you’d actually be slowing down. The agent runs hundreds of safe commands for every dangerous one. Block everything and you tax all of that work to catch the rare event, and you teach developers to switch the guardrail off.
Warn-first flips that. The safe majority keeps moving. The risky-but-real actions get a human moment and leave a trail. Block stays reserved for the short list of things with no undo and no legitimate use. Developers get room to work, and every one-way door still closes behind them.
The maturity ladder
The three modes aren’t a menu, they’re a sequence. Roll them out in order.
Audit, to learn
Seed the pack and watch. Every high-risk family is logged, so you see what your agents
actually do before you change anything.
Warn, to add the human moment
Promote the secret, data, and access policies to warn. Use Preview Impact first to see
exactly what would have matched, then turn it on. Developers feel a light touch, and you
start collecting the signals that matter.
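The idea behind previewing before promoting can be sketched as a replay of what audit mode already logged. This is not the Preview Impact feature itself: the log format, the `preview` function, and the sample entries are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical audit-log entries as (command family, matched value).
# The real log format is not described in the text.
audit_log = [
    ("Cloud Secrets", "get-secret-value"),
    ("Cloud Secrets", "list-secrets"),
    ("Container Operation", "run"),
    ("Cloud Secrets", "get-secret-value"),
]


def preview(log, family: str, pattern: str):
    """Return the logged entries a candidate policy would have matched."""
    return [entry for entry in log if entry[0] == family and fnmatchcase(entry[1], pattern)]


hits = preview(audit_log, "Cloud Secrets", "get-secret-value")
print(f"{len(hits)} of {len(audit_log)} logged commands would have triggered a warn")
```

Replaying a candidate policy against what already happened shows how much friction it would add before any developer feels it.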
How to read a policy
Every policy reads as one sentence: when a kind of command runs, if it matches a condition, then take an action. The editor is laid out in exactly those three steps.

| Step | Field in the editor | What goes there |
|---|---|---|
| When | Command Family | The umbrella the command belongs to, such as Cloud Secrets or Container Operation |
| If | Match Against, then Pattern | Pick one detail of the command to look at, then the value to match (a glob like *prod* or an exact string like get-secret-value) |
| Then | Action | Audit, Warn, Block, or Require Slack Approval |
Make it surgical
You don’t have to choose between blocking a whole family and blocking nothing. The Pattern field lets one policy fire only on the exact case you care about. That’s the difference between locking down a tool and locking down a single move inside it. You leave developers all the room they had, and take away only the one action you can’t allow.
Take container operations. You want developers building, running, and inspecting containers freely, but opening an interactive shell inside a running container is the risky one. So you don’t block the whole family. You write one narrow policy:
- When (Command Family): Container Operation
- If (Match Against): Operation
- If (Pattern): exec
- Then (Action): Block
run, push, and cp work all day; only exec
is stopped. The family gives you broad coverage with one rule, and the Pattern carves it
down to the single action you mean. That is how you get a scalpel out of an umbrella.
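The narrow policy above can be modeled as a small record with the same When / If / Then fields. This is a sketch under assumptions: the `Policy` class, its field names, and the `"allow"` fallback are hypothetical stand-ins for the editor's fields, not the product's schema or API.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Policy:
    family: str         # When: Command Family
    match_against: str  # If: which detail of the command to inspect
    pattern: str        # If: a glob or an exact string
    action: str         # Then: Audit, Warn, Block, or Require Slack Approval

    def evaluate(self, family: str, details: dict) -> str:
        """Return this policy's action if the command matches, else 'allow'."""
        if family != self.family:
            return "allow"
        value = details.get(self.match_against, "")
        return self.action if fnmatchcase(value, self.pattern) else "allow"


no_exec = Policy("Container Operation", "Operation", "exec", "Block")

results = {
    op: no_exec.evaluate("Container Operation", {"Operation": op})
    for op in ["run", "push", "cp", "exec"]
}
print(results)
```

The family check gives the broad coverage, and the pattern check narrows it to the single operation, so three of the four operations pass through untouched.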
You don’t have to grade your own homework. Seed the whole pack in audit, read the first
week, and promote with confidence. Nothing you turn on will catch you off guard.
A block the agent reads
Every block, and every warn, carries a short message you write, and it reaches the coding agent itself, not just the person at the keyboard. Tell the agent why the action is off-limits and it changes course on its own. The message is what makes the block hold.
Want prompts to test each policy?
The policy examples page has a ready-to-run prompt for every policy, so you can watch
each one fire before you trust it.
Ready to roll out?
Back to the three-step go-live →

