Pillar 2: Guardrails
What keeps them safe.
Claude Code agents are powerful. Without boundaries, they can push to production branches, bump versions without a release workflow, leak credentials into tool calls, loop infinitely on broken commands, or quietly bloat their own instruction files. Guardrails close all of these gaps — automatically, at the hook layer, before any damage reaches your codebase.
Your fleet operates within boundaries you set.
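Eight guardrails ship by default, with no configuration required: seven are active out of the box, and the output token limit guardrail stays passive until CLAUDE_CODE_MAX_OUTPUT_TOKENS is set. They fire silently when everything is fine and block loudly when something would go wrong.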
Table of contents
- The guardrail pipeline
- Guardrail 1: Secrets scanning
- Guardrail 2: Retry circuit breaker
- Guardrail 3: Branch guard
- Guardrail 4: Version guard
- Guardrail 5: CLAUDE.md sanity
- Guardrail 6: MCP surface area
- Guardrail 7: Output token limit
- Guardrail 8: File read deduplication
- All guardrails at a glance
- Exit code semantics
- Hardening your setup
The guardrail pipeline
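Every agent action passes through a layered defense before executing. The pipeline runs at three Claude Code hook events: UserPromptSubmit, PreToolUse, and PostToolUse. Two guardrails, MCP surface area and output token limit, run once at SessionStart instead and are not part of the per-action flow shown here.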
flowchart TD
P([User Prompt]) --> UPS
subgraph UserPromptSubmit
UPS[Secrets Scanner\nprompt scan]
end
UPS --> TOOL([Tool about to execute])
subgraph PreToolUse["PreToolUse — security gate"]
S[Secrets Scanner\ntool input scan]
B[Branch Guard\ngit command filter]
V[Version Guard\nmanifest protection]
C[CLAUDE.md Sanity\nline limit check]
RC[Retry Circuit Breaker\nhard block]
FR[File Read Dedup\nredundant read block]
end
TOOL --> S --> B --> V --> C --> RC --> FR
S -->|secret found| BLOCK1([BLOCKED — exit 2])
B -->|merge/force-push/tag| BLOCK2([BLOCKED — exit 2])
V -->|version field edit| BLOCK3([BLOCKED — exit 2])
C -->|file exceeds line cap| BLOCK4([BLOCKED — exit 2])
RC -->|hard max hit| BLOCK5([BLOCKED — exit 2])
FR -->|unchanged file| BLOCK6([BLOCKED — exit 2])
FR -->|all clear| EXEC([Tool executes])
subgraph PostToolUse["PostToolUse — learning + enforcement"]
RCT[Retry Circuit Breaker\nfailure tracking]
BA[Bash Output Filter\nverbose truncation]
end
EXEC --> RCT
EXEC --> BA
RCT -->|threshold hit| WARN([Inject research instructions])
BA -->|verbose output| TRUNC([Truncated + re-emitted])
Guardrail 1: Secrets scanning
Hook: UserPromptSubmit (warn), PreToolUse (block) Default: On — AGENTIHOOKS_SECRETS_MODE=standard
The secrets scanner intercepts credentials before they can enter tool calls, log files, or git history. It runs twice per turn: once on the raw user prompt (warn only) and once on every tool’s input parameters (block on detection).
What it detects
| Pattern name | What it catches |
|---|---|
| aws_access_key | AKIA/ASIA/AROA/AIPA prefixed AWS key IDs |
| aws_secret_key | aws_secret_access_key = <value> assignments |
| github_token | ghp_, ghs_, github_pat_ tokens |
| private_key | PEM-encoded RSA, EC, OPENSSH, PGP private keys |
| bearer_token | Authorization: Bearer <token> headers |
| db_url_creds | postgres://user:pass@host, mysql://, mongodb:// URLs |
| generic_secret | PASSWORD=, API_KEY=, SECRET= assignments with 8+ char values |
In strict mode, three additional patterns activate:
| Pattern name | What it catches |
|---|---|
| slack_token | xox[bpors]-... Slack OAuth tokens |
| stripe_key | sk_live_, sk_test_, rk_live_, rk_test_ keys |
| jwt_token | Three-part base64url JWTs (eyJ...eyJ...) |
Suppression
Lines with # nosecret are excluded from scanning. Use this for documentation, test fixtures, or known-safe patterns:
example_key = "AKIAIOSFODNN7EXAMPLE" # nosecret
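The detection and suppression behavior described above can be sketched as a line-by-line regex scan. This is a minimal illustration, not the shipped scanner: the three regexes are simplified assumptions covering only a subset of the listed patterns, and `scan_text` is a hypothetical helper name.

```python
import re

# Simplified, illustrative regexes for three of the pattern names listed above.
# The regexes actually used by the scanner may differ.
PATTERNS = {
    "aws_access_key": re.compile(r"\b(?:AKIA|ASIA|AROA|AIPA)[A-Z0-9]{16}\b"),
    "github_token": re.compile(r"\b(?:ghp_|ghs_|github_pat_)[A-Za-z0-9_]{20,}"),
    "generic_secret": re.compile(
        r"\b(?:PASSWORD|API_KEY|SECRET)\s*=\s*['\"]?[^\s'\"]{8,}"
    ),
}

SUPPRESS_MARKER = "# nosecret"


def scan_text(text: str) -> list:
    """Return (line_number, pattern_name) pairs, skipping suppressed lines."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if SUPPRESS_MARKER in line:
            continue  # the whole line is excluded from scanning
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

In this sketch a non-empty result on a tool input would translate to a block, while the same result on a user prompt would only produce a warning.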
Configuration
| Variable | Default | Options |
|---|---|---|
| AGENTIHOOKS_SECRETS_MODE | standard | off, warn, standard, strict |
- off: scanning disabled entirely
- warn: detects and warns, never blocks
- standard: detects the 7 standard patterns, blocks on PreToolUse
- strict: standard + Slack/Stripe/JWT, blocks on PreToolUse
Guardrail 2: Retry circuit breaker
Hook: PostToolUse (track + warn), PreToolUse (hard block) Default: On — RETRY_BREAKER_ENABLED=true, soft at 5, hard at 10
Claude Code agents will retry failing operations. Without a circuit breaker, a stuck agent can hammer the same broken command dozens of times — burning context, wasting tokens, and never escaping the loop.
The retry circuit breaker tracks consecutive failures per operation. When the same operation fails repeatedly, it escalates through two stages:
Stage 1: soft warning (default: 5 failures). Injects a banner into Claude’s context telling it to stop retrying and instead launch parallel error-researcher agents for web search. The agent can still proceed, but it now knows it should research first.
Stage 2: hard block (default: 10 failures). Raises a BlockAction via PreToolUse, preventing the tool from executing at all. The agent must research and change approach before the block lifts.
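The two-stage escalation can be pictured as a per-operation failure counter with two thresholds. The sketch below is in-memory only and uses hypothetical names (`RetryBreaker`, `record`, `should_block`); it also assumes that a success resets the streak, which is implied by "consecutive failures" but not spelled out here.

```python
from collections import defaultdict

SOFT_MAX = 5   # mirrors the RETRY_BREAKER_MAX default
HARD_MAX = 10  # mirrors the RETRY_BREAKER_HARD_MAX default


class RetryBreaker:
    """In-memory sketch of the soft-warning / hard-block escalation."""

    def __init__(self, soft_max: int = SOFT_MAX, hard_max: int = HARD_MAX):
        self.soft_max = soft_max
        self.hard_max = hard_max
        self.failures = defaultdict(int)  # operation key -> consecutive failures

    def record(self, op_key: str, succeeded: bool) -> str:
        """PostToolUse side: update the streak and return 'ok' or 'warn'."""
        if succeeded:
            self.failures.pop(op_key, None)  # assumption: success resets the streak
            return "ok"
        self.failures[op_key] += 1
        return "warn" if self.failures[op_key] >= self.soft_max else "ok"

    def should_block(self, op_key: str) -> bool:
        """PreToolUse side: refuse the call once the hard maximum is reached."""
        return self.failures.get(op_key, 0) >= self.hard_max
```

Splitting the logic into a recording side and a gating side matches the hook split: failures are only known after a tool runs, while a block has to happen before the next attempt.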
How operations are fingerprinted
The breaker tracks failures per operation, not per tool. Operations are fingerprinted by their base command, with subcommand-level grouping for DevOps tools:
| Tool call | Operation key |
|---|---|
| Bash: kubectl apply -f deploy.yaml | bash:kubectl:apply |
| Bash: kubectl get pods | bash:kubectl:get |
| Bash: terraform plan | bash:terraform:plan |
| Bash: terraform apply | bash:terraform:apply |
| Bash: docker build . | bash:docker:build |
| Edit | edit |
This means kubectl apply and kubectl get have independent counters — a stuck apply doesn’t block unrelated reads.
Error text is also normalized (hex, timestamps, paths, numbers stripped) before fingerprinting, so slightly different error messages from the same root cause are treated as the same failure.
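A rough sketch of both fingerprinting steps follows. The set of tools that get subcommand grouping, the function names, and the placeholder tokens are assumptions for illustration; the text above says volatile fragments are stripped, and this sketch substitutes placeholders instead so the result stays readable.

```python
import re
import shlex

# Assumption: the tools that receive subcommand-level grouping.
SUBCOMMAND_TOOLS = {"kubectl", "terraform", "docker"}


def operation_key(tool_name: str, command: str = "") -> str:
    """Build an operation key such as 'bash:kubectl:apply' or 'edit'."""
    if tool_name.lower() != "bash":
        return tool_name.lower()
    tokens = shlex.split(command)
    if not tokens:
        return "bash"
    base = tokens[0]
    if base in SUBCOMMAND_TOOLS and len(tokens) > 1:
        return f"bash:{base}:{tokens[1]}"
    return f"bash:{base}"


def normalize_error(text: str) -> str:
    """Replace volatile fragments so similar errors compare equal."""
    text = re.sub(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}\S*", "<ts>", text)
    text = re.sub(r"0x[0-9a-fA-F]+", "<hex>", text)
    text = re.sub(r"(?:/[\w.\-]+)+", "<path>", text)
    text = re.sub(r"\d+", "<n>", text)
    return text
```

The order of substitutions matters here: timestamps, hex values, and paths are replaced before bare numbers so that their digits are not rewritten first.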
Configuration
| Variable | Default | Description |
|---|---|---|
| RETRY_BREAKER_ENABLED | true | Master switch |
| RETRY_BREAKER_MAX | 5 | Consecutive failures before soft warning |
| RETRY_BREAKER_HARD_MAX | 10 | Consecutive failures before hard block |
| RETRY_BREAKER_TTL | 3600 | Redis key TTL in seconds (1 hour) |
State is persisted in Redis per session and falls back to in-memory when Redis is unavailable.
Guardrail 3: Branch guard
Hook: PreToolUse (block, Bash commands only) Default: On — always active
Prevents destructive git operations that bypass the normal PR and release workflow. Fires on git commands before they execute.
Blocked operations
| Command pattern | Reason |
|---|---|
| git merge ... main / git merge ... master | Direct merges bypass PR review |
| git reset ... main / git reset ... master | Rewrites protected branch history |
| git push --force / git push -f | Can destroy remote history |
| git push --force-with-lease | Force push variant; same risk |
| git tag | Tagging is a release operation; must go through CI |
All blocked operations return exit code 2 with a human-readable explanation and, where applicable, the recommended alternative (e.g., gh workflow run release.yml for tagging).
What is allowed
Normal git push (without force flags) is allowed; branch protection is a remote concern handled by GitHub. Routine operations (git checkout, git switch, git pull, git log, git status, git diff) are never touched.
The guard strips heredoc bodies and quoted commit messages before pattern matching to avoid false positives on message content that happens to mention “main” or “master”.
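The strip-then-match approach can be sketched as follows. The regexes are simplified assumptions that mirror the blocked-operations table rather than the shipped patterns, and `strip_message_content` and `check_git_command` are hypothetical names.

```python
import re

# Illustrative rules mirroring the blocked-operations table.
BLOCKED_RULES = [
    (re.compile(r"\bgit\s+merge\b.*\b(?:main|master)\b"),
     "Direct merges bypass PR review"),
    (re.compile(r"\bgit\s+reset\b.*\b(?:main|master)\b"),
     "Rewrites protected branch history"),
    (re.compile(r"\bgit\s+push\b.*\s(?:--force(?:-with-lease)?|-f)\b"),
     "Can destroy remote history"),
    (re.compile(r"\bgit\s+tag\b"),
     "Tagging is a release operation; must go through CI"),
]

HEREDOC = re.compile(r"<<-?\s*['\"]?(\w+)['\"]?.*?\n\1\b", re.DOTALL)
DOUBLE_QUOTED = re.compile(r'"(?:\\.|[^"\\])*"')
SINGLE_QUOTED = re.compile(r"'[^']*'")


def strip_message_content(command: str) -> str:
    """Drop heredoc bodies and quoted strings before pattern matching."""
    command = HEREDOC.sub("", command)
    command = DOUBLE_QUOTED.sub('""', command)
    return SINGLE_QUOTED.sub("''", command)


def check_git_command(command: str):
    """Return the block reason for a command, or None if it is allowed."""
    cleaned = strip_message_content(command)
    for pattern, reason in BLOCKED_RULES:
        if pattern.search(cleaned):
            return reason
    return None
```

With this ordering, `git commit -m "git merge into main"` is allowed because the quoted message is emptied before the merge rule is evaluated.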
Guardrail 4: Version guard
Hook: PreToolUse (block, Edit and Write tools) Default: On — always active
Version fields in project manifests should be managed by the CI release workflow, not by an AI agent editing files directly. The version guard blocks any Edit or Write operation that would modify a version field in a known manifest file.
Protected files
pyproject.toml, package.json, Cargo.toml, setup.cfg, setup.py, version.txt, VERSION
Detected patterns
version = "1.2.3" # pyproject.toml, Cargo.toml, setup.cfg
"version": "1.2.3" # package.json
If a Write or Edit targets one of these files and the new content contains a version field pattern, the operation is blocked with a message directing to the release workflow:
BLOCKED: Version field modification in pyproject.toml is not allowed.
Version bumping is handled by the release workflow
(gh workflow run release.yml -f bump=patch|minor|major).
Do not edit version fields manually.
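A minimal sketch of the check, assuming the guard matches on file name plus the two documented field patterns. `check_write` is a hypothetical helper, and the sketch does not attempt to cover bare version files such as version.txt or VERSION, whose content has no field syntax.

```python
import re
from pathlib import PurePath

PROTECTED_FILES = {
    "pyproject.toml", "package.json", "Cargo.toml",
    "setup.cfg", "setup.py", "version.txt", "VERSION",
}

# Simplified versions of the two documented field patterns.
VERSION_PATTERNS = [
    re.compile(r"^\s*version\s*=\s*[\"']", re.MULTILINE),  # version = "1.2.3"
    re.compile(r"\"version\"\s*:\s*\""),                   # "version": "1.2.3"
]


def check_write(file_path: str, new_content: str):
    """Return a block message if new_content sets a version field, else None."""
    name = PurePath(file_path).name
    if name not in PROTECTED_FILES:
        return None
    if any(p.search(new_content) for p in VERSION_PATTERNS):
        return f"BLOCKED: Version field modification in {name} is not allowed."
    return None
```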
Guardrail 5: CLAUDE.md sanity
Hook: PreToolUse (block, Edit and Write tools) Default: On — AGENTIHOOKS_CLAUDE_MD_SANITY_CHECK=true
CLAUDE.md is loaded by Claude Code on every turn. Every line in it costs tokens, every session. Agents writing unchecked content into CLAUDE.md files can quietly inflate the per-turn base cost while also degrading instruction quality through drift and dilution.
The sanity check intercepts any Write or Edit targeting a CLAUDE.md or CLAUDE.local.md file and simulates what the resulting file would look like. If the resulting line count exceeds the cap, the operation is blocked before the file is touched:
BLOCKED: Write to /home/user/.claude/CLAUDE.md would produce 347 lines,
exceeding the CLAUDE.md cap of 200 lines.
Trim the content to 200 lines or fewer before writing.
For Edit operations, the guard reads the current file from disk, applies the proposed change in memory, and counts the resulting lines — catching incremental bloat that would accumulate over many small edits.
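The simulate-then-count step might look like the sketch below. It takes the current file text as an argument instead of reading from disk to stay self-contained, assumes an Edit replaces the first occurrence of a string, and uses hypothetical function names.

```python
from typing import Optional


def simulate_edit(current: str, old_string: str, new_string: str) -> str:
    """Apply a proposed replacement in memory (first occurrence only)."""
    return current.replace(old_string, new_string, 1)


def check_line_cap(path: str, resulting: str, max_lines: int = 200) -> Optional[str]:
    """Return a block message if the resulting file would exceed the cap."""
    count = len(resulting.splitlines())
    if count > max_lines:
        return (
            f"BLOCKED: Write to {path} would produce {count} lines,\n"
            f"exceeding the CLAUDE.md cap of {max_lines} lines."
        )
    return None
```

Counting lines on the simulated result rather than on the size of the edit itself is what lets a small three-line addition be rejected when the file is already at the cap.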
Configuration
| Variable | Default | Description |
|---|---|---|
| AGENTIHOOKS_CLAUDE_MD_SANITY_CHECK | true | Enable/disable the guardrail |
| AGENTIHOOKS_CLAUDE_MD_MAXLINES | 200 | Maximum allowed lines in CLAUDE.md files |
Guardrail 6: MCP surface area
Hook: SessionStart (warn) Default: On — MCP_TOOL_WARN_THRESHOLD=40
Every MCP tool schema loaded into a session costs tokens — approximately 150 tokens per tool, every turn. With 9 MCP servers and 112 tools, that’s ~16,800 schema tokens injected into every single context turn before Claude writes a line of code.
At session start, agentihooks counts the total number of MCP tools across all configured servers. If the count exceeds MCP_TOOL_WARN_THRESHOLD (default 40), a warning banner fires:
MCP SURFACE AREA: 112 tools across 9 servers (~16,800 schema tokens/turn).
Consider disabling unused servers via /mcp to reduce per-turn overhead.
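The estimate behind the banner is a simple product of tool count and average schema size. The sketch below reproduces that arithmetic; `mcp_surface` is a hypothetical function, and the per-server counts in the usage note are the three rows visible in the sample report, not a full configuration.

```python
def mcp_surface(tools_per_server: dict, avg_tokens: int = 150, threshold: int = 40):
    """Return (total_tools, estimated_tokens, warning_or_None)."""
    total_tools = sum(tools_per_server.values())
    estimated_tokens = total_tools * avg_tokens
    warning = None
    if total_tools > threshold:
        warning = (
            f"MCP SURFACE AREA: {total_tools} tools across "
            f"{len(tools_per_server)} servers "
            f"(~{estimated_tokens:,} schema tokens/turn)."
        )
    return total_tools, estimated_tokens, warning
```

For example, `mcp_surface({"hooks-utils": 32, "github": 40, "postgres": 15})` gives 87 tools and an estimate of 13,050 tokens, which is over the default threshold of 40 and therefore produces a warning string.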
The companion CLI command provides the full breakdown:
agentihooks mcp report
MCP Surface Area Report
Total: 9 servers, ~112 tools, ~16,800 schema tokens
Server Source Tools ~Tokens
hooks-utils user 32 4,800
github user 40 6,000
postgres user 15 2,250
...
Configuration
| Variable | Default | Description |
|---|---|---|
| MCP_TOOL_WARN_THRESHOLD | 40 | Tool count threshold for session-start warning |
| MCP_SCHEMA_AVG_TOKENS | 150 | Estimated tokens per tool schema |
Guardrail 7: Output token limit
Hook: SessionStart (inject awareness) Default: Passive — activates when CLAUDE_CODE_MAX_OUTPUT_TOKENS is set
When CLAUDE_CODE_MAX_OUTPUT_TOKENS is set in the environment, agentihooks injects an awareness message into the session context at startup so Claude knows the limit is in effect and can plan accordingly:
OUTPUT TOKEN LIMIT: This session is capped at 8192 output tokens per response.
Plan responses to stay within this limit.
Without this injection, Claude can unknowingly start generating a response that will be cut off mid-output by the token limit, resulting in truncated tool calls, incomplete edits, or garbled output. The awareness injection prevents the surprise.
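As a small illustration of the passive behavior, the helper below returns the awareness message only when the variable is present. `output_limit_notice` is a hypothetical name, and how the returned text is delivered into the session context is not shown.

```python
import os


def output_limit_notice(env=None):
    """Return the awareness message if the cap is set, otherwise None."""
    env = os.environ if env is None else env
    raw = env.get("CLAUDE_CODE_MAX_OUTPUT_TOKENS")
    if not raw:
        return None  # guardrail stays passive when the variable is unset
    return (
        f"OUTPUT TOKEN LIMIT: This session is capped at {raw} "
        "output tokens per response.\n"
        "Plan responses to stay within this limit."
    )
```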
Configuration
| Variable | Default | Description |
|---|---|---|
| CLAUDE_CODE_MAX_OUTPUT_TOKENS | (unset) | Output token cap; awareness injected when set |
Guardrail 8: File read deduplication
Hook: PreToolUse (block redundant reads), PostToolUse (cache on read) Default: On — FILE_READ_CACHE_ENABLED=true
Tracks every file read during a session. If Claude tries to read the same file again and the file has not changed on disk since the last read, the operation is blocked:
BLOCKED: /path/to/file.py was already read this session and is unchanged on disk.
Use the content already in your context window.
This prevents a common and expensive pattern: Claude re-reading the same source file 3–5 times over the course of a long session, each read injecting the same content into the context window.
File identity is tracked by path + mtime. A file modified on disk since the last read passes through normally, and the cache entry is updated.
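The path + mtime identity check can be sketched with an in-memory dictionary, roughly corresponding to the memory backend. `ReadCache` and its methods are hypothetical names, and treating a missing file as "not blocked" is an assumption of this sketch.

```python
import os


class ReadCache:
    """In-memory sketch of read deduplication keyed by path and mtime."""

    def __init__(self):
        self._seen = {}  # path -> mtime in nanoseconds at the last read

    def should_block(self, path: str) -> bool:
        """PreToolUse side: block only if the file is unchanged since last read."""
        try:
            mtime = os.stat(path).st_mtime_ns
        except OSError:
            return False  # assumption: let the Read tool report missing files
        return self._seen.get(path) == mtime

    def record_read(self, path: str) -> None:
        """PostToolUse side: remember the mtime observed for this read."""
        self._seen[path] = os.stat(path).st_mtime_ns
```

Because a changed mtime makes the comparison fail, a modified file passes through and its entry is refreshed by the next `record_read` call.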
Configuration
| Variable | Default | Description |
|---|---|---|
| FILE_READ_CACHE_ENABLED | true | Master switch |
| FILE_READ_CACHE_BACKEND | redis | redis or memory |
| FILE_READ_CACHE_TTL | 21600 | Redis key TTL in seconds (6 hours) |
All guardrails at a glance
| # | Guardrail | Hook event | Default | Block or warn | Config key |
|---|---|---|---|---|---|
| 1 | Secrets scanner | UserPromptSubmit / PreToolUse | On | Warn / Block | AGENTIHOOKS_SECRETS_MODE |
| 2 | Retry circuit breaker | PostToolUse / PreToolUse | On | Warn → Block | RETRY_BREAKER_ENABLED |
| 3 | Branch guard | PreToolUse | On | Block | (always on) |
| 4 | Version guard | PreToolUse | On | Block | (always on) |
| 5 | CLAUDE.md sanity | PreToolUse | On | Block | AGENTIHOOKS_CLAUDE_MD_SANITY_CHECK |
| 6 | MCP surface area | SessionStart | On | Warn | MCP_TOOL_WARN_THRESHOLD |
| 7 | Output token limit | SessionStart | Passive | Inject awareness | CLAUDE_CODE_MAX_OUTPUT_TOKENS |
| 8 | File read dedup | PreToolUse / PostToolUse | On | Block | FILE_READ_CACHE_ENABLED |
Exit code semantics
Guardrails that block communicate via exit code 2. Claude Code cancels the tool action and displays the hook’s stderr output as a warning to the agent.
| Exit code | Meaning |
|---|---|
| 0 | Allow: tool proceeds |
| 2 | Block: tool cancelled, stderr shown as warning |
This is the same mechanism Claude Code uses for all hook blocks — guardrails are first-class citizens of the hook event pipeline.
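To make the exit code contract concrete, here is a sketch of a standalone PreToolUse hook script. It assumes the hook receives a JSON event on stdin with `tool_name` and `tool_input` fields, and the single rule in `decide` is a hypothetical stand-in for a real guardrail.

```python
import json
import sys

ALLOW = 0
BLOCK = 2


def decide(event: dict):
    """Return (exit_code, message) for a hook event."""
    tool_input = event.get("tool_input") or {}
    command = tool_input.get("command", "")
    # Hypothetical rule for illustration: refuse tagging from the agent.
    if event.get("tool_name") == "Bash" and command.startswith("git tag"):
        return BLOCK, "BLOCKED: tagging must go through the release workflow."
    return ALLOW, ""


def main() -> None:
    event = json.load(sys.stdin)        # hook input arrives as JSON on stdin
    code, message = decide(event)
    if message:
        print(message, file=sys.stderr)  # stderr is what the agent sees on a block
    sys.exit(code)

# A real hook script would end by calling main().
```

Keeping the decision in a pure function and the I/O in `main` makes the rule easy to unit test without spawning a process.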
Hardening your setup
The defaults cover the most common failure modes. For teams or higher-stakes environments, consider these adjustments:
# ~/.agentihooks/.env
# Stricter secrets scanning (adds Slack, Stripe, JWT patterns)
AGENTIHOOKS_SECRETS_MODE=strict
# Tighter retry tolerance (warn sooner, hard block sooner)
RETRY_BREAKER_MAX=3
RETRY_BREAKER_HARD_MAX=7
# Tighter CLAUDE.md size cap
AGENTIHOOKS_CLAUDE_MD_MAXLINES=150
# Warn earlier on MCP surface area
MCP_TOOL_WARN_THRESHOLD=25
# Set output token ceiling
CLAUDE_CODE_MAX_OUTPUT_TOKENS=8192
To verify all guardrails are active after installation:
agentihooks status
The status output lists each guardrail with its current state (enabled/disabled), mode, and thresholds. Inside a live session, /agentihooks shows the same panel alongside real-time session metrics.