Why Cowork needs a guardrail layer
Cowork is an autonomous agent, not a chatbot. It works across your local files, the browser, and connected MCP tools, and it acts under the user’s identity on content it didn’t write. That is what makes it useful, and what makes it worth governing:

- It reads untrusted content and acts on it. An instruction hidden in a document, an email, or a web page can turn into a command the agent runs. Prompt injection is a known, unsolved class of attack, and a per-action approval prompt is a limited backstop once a user is clicking through a lot of them.
- Defense in depth. The controls that ship inside the tool are one layer. An independent layer at the endpoint — one that sees the actual tool calls and file reads — gives you a second.
- A record you control. For compliance and incident review, security teams want an independent, exportable trail of what the agent did, held outside the tool itself.
What you’re defending against
The pack below maps to the threat vectors that matter most for an autonomous desktop agent. Each row points to the policies that cover it.

| Threat | What it looks like | Where this pack covers it |
|---|---|---|
| Credential & token theft | The agent reads a secrets file or inherits a live session and lifts keys | Credential & key files and env-variable dumps |
| Data exfiltration | A task quietly ships private files out through a channel the agent already trusts | Data transfer, browser automation, and the storage / email / messaging connectors |
| Excessive agency | A high-impact action — send, write, delete, run — runs before anyone reviews it | Bulk deletion, config overwrite, package install, and the MCP write / destructive rules |
| The “lethal trifecta” | Private-data access plus untrusted content plus an external channel — the recipe for command-driven theft | The prompt guardrails and the outbound-connector rules, together |
| MCP tool poisoning & malicious skills | A compromised or unverified connector steers the agent silently on every call | The sanctioned-MCP allowlist plus the per-connector MCP rules |
| Indirect prompt injection | Instructions hidden in a file, email, or page run as the user’s own commands | The tool-layer rules, which catch the action even when the instruction slips through |
| Audit gap | Without an independent record, you can’t show an auditor what the agent did on a machine at a given time | Every match lands in Logs and Analytics, attributed to user and session |
| Shadow AI | Cowork run on a personal account moves data outside org controls | Discovery, plus the sanctioned-MCP allowlist |
Create these under Policies → Security Policies and Policies → Tool Policies. Each row below is a match to create — the action is yours to set based on the traffic you see. Leave User Groups empty to apply org-wide, or scope to a team.
Building for AI coding agents (Claude Code, Cursor, Codex, …) instead? See Recommended Starting Policies — the terminal-command pack for engineering work.
Live in three steps
Create what fits
Add the rows that match how your teams use Cowork — the data in prompts, the files and shell on the machine, and the connectors it reaches.
Test each in seconds
Paste the Try it prompt into a Cowork session and watch the match land in Logs and Analytics, attributed to the user and session.
Protect the data in prompts
The fastest way sensitive data leaves your org through an AI agent isn’t a command. It’s a prompt. Cowork users paste configs, credentials, and customer details straight into the prompt for it to summarize, clean, or draft from. Two guardrails catch the highest-value data before it reaches the model.

| Policy | Why it matters | Guardrail · Match (If) | Try it (prompt → what Unbound catches) |
|---|---|---|---|
| Secrets in prompts | Analysts paste configs, connection strings, and env snippets for Cowork to clean up or analyze, shipping live credentials straight to the model | Secrets · API keys, database connection strings, cryptographic keys | “Help me tidy this config: AWS_ACCESS_KEY_ID=AKIAVQ3EYIY4LIRVHK37 AWS_SECRET_ACCESS_KEY=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” → Unbound flags the AWS access key |
| Payment-card data in prompts | Revenue, billing, and finance work routinely touches cardholder data; a pasted card number is PCI-scope data leaving your boundary | PII · Credit Card Number | “Draft a renewal note for this account, card on file 4111 1111 1111 1111.” → Unbound flags the credit-card number |
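
To make the two prompt guardrails above concrete, here is a minimal Python sketch of the kind of check they describe. It is not Unbound's implementation. The regular expressions, the label strings, and the `scan_prompt` helper are all assumptions made for illustration: an AWS access key ID is assumed to be `AKIA` followed by 16 uppercase letters or digits, and a card number is assumed to be 13 to 19 digits that pass the Luhn checksum.

```python
import re

# Assumption: AWS access key IDs look like "AKIA" plus 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Assumption: card candidates are 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan_prompt(prompt: str) -> list[str]:
    """Return hypothetical guardrail labels for the sensitive data found in a prompt."""
    findings = []
    if AWS_KEY_ID.search(prompt):
        findings.append("secrets:aws_access_key_id")
    for match in CARD_CANDIDATE.finditer(prompt):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            findings.append("pii:credit_card_number")
            break
    return findings


if __name__ == "__main__":
    print(scan_prompt("Help me tidy this config: AWS_ACCESS_KEY_ID=AKIAVQ3EYIY4LIRVHK37"))
    print(scan_prompt("Draft a renewal note, card on file 4111 1111 1111 1111."))
```

The Luhn step is what keeps an order number or phone number that merely has the right length from being flagged as a card.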
Files and shell on the machine
Cowork’s headline skill is working directly on your machine: reading, organizing, rewriting, and deleting files, and (for roles that allow it) running shell commands. These rules cover the actions where that goes wrong: reading secrets, dumping the environment, overwriting or deleting at scale, uploading data off the box, and pulling in new software. The family and field values are exactly what Unbound’s classifier extracts.

| Policy | Why it matters | Command Family · Match (If) | Try it (prompt → action Unbound catches) |
|---|---|---|---|
| Reading credential & key files | Cowork ranges across local files to synthesize and organize, including the dotfiles that hold your cloud keys, SSH keys, and tokens | Read File · Path .ssh/, .aws/credentials, .env, .pem | “Read ~/.aws/credentials and tell me which profiles I have.” → cat ~/.aws/credentials |
| Environment-variable dumps | “Show me my environment” is a routine setup step, but env dumps are exactly where tokens and keys live | Environment Exposure · Method env, printenv | “Print all my environment variables so we can see what’s configured.” → env |
| Writing to system or config paths | Cowork edits and regenerates files in place; an overwrite or append to a system or shared config path is silent and hard to undo | Write File · Path /etc/, /usr/, .config/ | “Add a hosts entry pointing api.internal to 10.0.0.5.” → echo "10.0.0.5 api.internal" >> /etc/hosts |
| Bulk file deletion | Cowork’s signature file skill — rename, sort, dedupe — deletes and overwrites at scale; one bad pattern loses real work | Delete File · Path ANY (logs every delete; tighten to a directory once you’ve seen the traffic) | “Clean up my Downloads folder — delete anything older than a year.” → rm -rf ~/Downloads/old |
| Data transfer to external endpoints | A legit export and an exfiltration look identical, and a team’s customer lists and data exports are the crown jewels | Data Transfer · ANY (logs every outbound transfer; scope to a destination or protocol once you’ve seen the traffic) | “Upload accounts.csv to https://filebin.example.com so I can share it.” → curl -F "file=@accounts.csv" … |
| Installing software packages | An agent that installs packages pulls unreviewed code and supply-chain risk onto the machine | Package Management · Operation install | “Install the AWS CLI so we can pull the exports.” → brew install awscli |
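
Before entering path values for a Read File rule, it can help to reason about which paths a pattern set would and would not cover. The sketch below does that for the four values in the credential-file row (`.ssh/`, `.aws/credentials`, `.env`, `.pem`). How Unbound's classifier actually matches paths is not described here, so the matching semantics and the `is_sensitive_path` helper are assumptions for illustration only.

```python
from pathlib import PurePosixPath

# Values taken from the "Reading credential & key files" row.
# The matching semantics below are an assumption, not Unbound's documented behavior.
SENSITIVE_DIRS = {".ssh"}                   # any path that passes through .ssh/
SENSITIVE_NAMES = {".env"}                  # exact file names
SENSITIVE_SUFFIXES = {".pem"}               # file extensions
SENSITIVE_TAILS = [(".aws", "credentials")]  # trailing path components


def is_sensitive_path(path: str) -> bool:
    """Return True if the path falls under one of the credential patterns."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    name = parts[-1]
    if any(part in SENSITIVE_DIRS for part in parts):
        return True
    if name in SENSITIVE_NAMES or PurePosixPath(name).suffix in SENSITIVE_SUFFIXES:
        return True
    return any(parts[-len(tail):] == tail for tail in SENSITIVE_TAILS)


if __name__ == "__main__":
    for candidate in ("~/.aws/credentials", "~/.ssh/id_rsa", "~/.aws/config", "notes/report.csv"):
        print(candidate, is_sensitive_path(candidate))
```

Running the loop makes one gap visible: under these assumed semantics `~/.aws/config` is not covered, which is the kind of finding worth checking against real traffic before tightening a rule.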
Connectors and the browser
Cowork reaches out through MCP connectors and a browser, and that’s where a task quietly becomes an exfiltration path: a query that dumps a table, a message that broadcasts customer data, an upload into a web form. MCP policies target a canonical group (a logical service, matched across whatever server name your users configured), then either specific tools or a tool action type (read / write / destructive).

| Policy | Why it matters | MCP Group · Match (If) | Try it (prompt → tool Unbound catches) |
|---|---|---|---|
| Browser automation | Cowork’s browser connector can type into external web forms and open arbitrary sites, which makes it the exfiltration and malicious-site path for a desktop agent | Playwright · tools browser_navigate, browser_type, browser_file_upload | “Open filebin.example.com and upload the accounts export.” → browser_file_upload |
| Data-warehouse queries | One query can pull an entire table of customer records — the largest-blast-radius data pull an analyst agent can make | Snowflake · action type read (or the query / run tools) | “Run SELECT * FROM customers and export the results.” → run_query |
| Posting to messaging | An agent that can post to channels can broadcast customer or internal data to a wide — or externally-shared — audience | Slack · action type write (or the send_message tool) | “Post the Q3 pipeline numbers to #general.” → send_message |
| Outbound email | Sending email is how data leaves the building; an agent sending on your behalf is high-impact | Google Workspace · action type write (or the Gmail send_email tool) | “Email this account summary to partner@external.com.” → send_email |
| Cloud file storage | A “share with anyone” link turns your document store into an exfiltration channel | Box · action type write (or the create_shared_link tool) | “Share the Customers folder with a public link.” → create_shared_link |
| Code-repository writes | An agent that can write to repos can push code, open pull requests, or delete branches — and repos hold secrets and IP | GitHub · action type write / destructive (or create_pull_request, delete_*) | “Commit these changes and open a PR to main.” → create_pull_request |
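
The matching model described in this section (a canonical group first, then either specific tools or an action type) can be sketched in a few lines of Python. This is an illustration of the logic only, not Unbound's policy engine. The `McpRule` type and `rule_matches` function are hypothetical, the tool name `delete_branch` is an invented instance of the `delete_*` pattern, and the sketch assumes the call's canonical group and action type have already been worked out upstream.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class McpRule:
    """Hypothetical shape of an MCP policy match."""
    group: str                           # canonical group, e.g. "GitHub"
    tools: tuple[str, ...] = ()          # tool-name patterns; empty means none set
    action_types: tuple[str, ...] = ()   # "read", "write", or "destructive"


def rule_matches(rule: McpRule, group: str, tool: str, action_type: str) -> bool:
    """Return True if a tool call falls under the rule.

    Assumes the caller has already resolved the server name to its canonical
    group and classified the tool's action type.
    """
    if rule.group != group:
        return False
    if any(fnmatchcase(tool, pattern) for pattern in rule.tools):
        return True
    return action_type in rule.action_types


if __name__ == "__main__":
    github_writes = McpRule(
        group="GitHub",
        tools=("create_pull_request", "delete_*"),
        action_types=("write", "destructive"),
    )
    print(rule_matches(github_writes, "GitHub", "create_pull_request", "write"))
    print(rule_matches(github_writes, "GitHub", "delete_branch", "destructive"))  # hypothetical tool name
    print(rule_matches(github_writes, "Slack", "send_message", "write"))
```

Matching on the canonical group rather than the server name is what lets one rule cover the same service under whatever name each user configured.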

