Context Preprocessor

Table of contents

  1. Overview
    1. The problem
    2. The insight
    3. Scope
  2. Compression Levels
  3. The Compression Pipeline
    1. Level 1: Markdown Formatting Removal
    2. Level 2: Filler Words and Abbreviations
    3. Level 3: Internal Vowel Removal
  4. Safety Rules — Protected Tokens
    1. Protected categories
  5. Algorithm: Protection Mask
  6. The Abbreviation Dictionary
    1. Extending the dictionary
  7. Configuration
    1. Scope
  8. Full Compression Example
    1. Level 0 (off) — 811 chars
    2. Level 1 (light) — ~620 chars
    3. Level 2 (standard) — ~480 chars
    4. Level 3 (aggressive) — ~410 chars
  9. Limitations
  10. Future Scope

Overview

The Context Preprocessor compresses rule files and CLAUDE.md content before mid-session re-injection by the context refresh system.

The problem

Context refresh injects rules every N turns to combat attention decay. The injection budget (CONTEXT_REFRESH_MAX_CHARS) defaults to 8,000 characters (~2,000 tokens). A typical operator profile with 10-13 rule files totals 12,000-16,000 characters, which is 1.5x to 2x the budget. Rules that exceed the cap are silently dropped.

The insight

LLMs predict over subword tokens, not characters. A BPE tokenizer may split “authentication” into tokens like ["auth", "ent", "ication"] (3 tokens). Writing “auth” instead typically costs 1 token, and the model resolves it to much the same meaning: the surrounding tokens (“credentials”, “secrets”, “env vars”) provide enough signal for the attention heads to reconstruct the full sense.

This property means we can compress injected content by 30-55% while preserving LLM comprehension, as long as we protect tokens that carry critical operational semantics (negation, action verbs, identifiers).

Scope

By default (CONTEXT_COMPRESSION_SCOPE=refresh), compression applies to context refresh injections only. Set CONTEXT_COMPRESSION_SCOPE=all to extend compression to all inject_context() / inject_banner() calls (session start banners, secrets warnings, tool memory, circuit breaker messages, context threshold warnings) and bash output filter additionalContext results.

Rules are sorted by frontmatter priority: N (lower = higher priority, default 5) before the size cap is applied, so high-priority rules are always compressed and injected first.
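The ordering described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the `Rule` class and `select_rules` function are hypothetical names, and the source does not say whether an oversized rule is skipped or ends the loop, so skipping is an assumption here.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    text: str
    priority: int = 5  # frontmatter default stated in the text above


def select_rules(rules, max_chars=8000, compress=lambda s: s):
    """Sort by priority (lower first), compress each rule, keep those that fit."""
    selected, used = [], 0
    # sorted() is stable, so rules with equal priority keep their input order.
    for rule in sorted(rules, key=lambda r: r.priority):
        body = compress(rule.text)
        if used + len(body) > max_chars:
            continue  # assumption: an over-budget rule is skipped, not fatal
        selected.append((rule.name, body))
        used += len(body)
    return selected
```

Because compression runs before the size check, a stronger compression level lets more low-priority rules fit under the same cap.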


Compression Levels

| Level | Name | Transforms | Token Savings | Per 100-Turn Session |
|---|---|---|---|---|
| 0 | off | None (passthrough) | 0% | 0 tokens |
| 1 | light | Strip markdown formatting | ~5-10% | ~200-500 tokens |
| 2 | standard | Level 1 + remove filler words + apply abbreviation dictionary | ~10-20% | ~2,000-4,000 tokens |
| 3 | aggressive | Level 2 + internal vowel removal on long common words | ~20-35% | ~4,000-8,000 tokens |

The default level is standard, set via CONTEXT_REFRESH_COMPRESSION. To extend compression beyond refresh to all hook injections and tool output, set CONTEXT_COMPRESSION_SCOPE=all in ~/.agentihooks/.env.

Session savings scale with: number of rules, CLAUDE.md size, session length, and how many injections fire. With scope=all, savings compound across every inject_context, inject_banner, and additionalContext call in the hook system.


The Compression Pipeline

Each level is additive — level N applies everything from levels below it. The safety protection mask runs first, before any transform.
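The additive layering can be pictured with a small sketch. The `run_pipeline` function and the `TOY` transforms are hypothetical stand-ins chosen only to show how each level includes the ones below it; the real transforms are described in the following sections, and the protection mask is left out here.

```python
LEVELS = {"off": 0, "light": 1, "standard": 2, "aggressive": 3}


def run_pipeline(text, level, transforms):
    """Apply each (min_level, fn) transform whose min_level <= the chosen level."""
    chosen = LEVELS[level]
    for min_level, fn in transforms:
        if min_level <= chosen:
            text = fn(text)
    return text


# Toy stand-ins for the real transforms, only to illustrate the layering.
TOY = [
    (1, lambda t: t.replace("**", "")),                # stands in for markdown stripping
    (2, lambda t: t.replace("the ", "")),              # stands in for filler removal
    (3, lambda t: t.replace("important", "imprtnt")),  # stands in for vowel removal
]
```

Keeping the transforms in one ordered list makes the "level N includes everything below" rule a single comparison rather than per-level branching.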

Level 1: Markdown Formatting Removal

Strips structural markdown that carries zero semantic weight for the LLM:

| Transform | Before | After |
|---|---|---|
| Headers | ## Delegation Map | [Delegation Map] |
| Tables | \| Key \| Value \| (multi-row) | Key: Value (flat per row) |
| Mermaid blocks | ```mermaid ... ``` | [diagram removed] |
| Bold/italic | **important** | important |
| Horizontal rules | --- | (removed) |
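A rough sketch of how most of these Level 1 transforms could be written with regular expressions. The function name `strip_markdown` and the exact patterns are assumptions for illustration; table flattening and the protection mask are omitted to keep it short.

```python
import re

FENCE = "`" * 3  # built this way to keep a literal fence out of the example


def strip_markdown(text: str) -> str:
    """Level 1 sketch: drop markdown structure, keep the words."""
    # Mermaid blocks are replaced by a placeholder.
    mermaid = re.compile(re.escape(FENCE) + r"mermaid.*?" + re.escape(FENCE), re.DOTALL)
    text = mermaid.sub("[diagram removed]", text)
    # Headers: "## Delegation Map" becomes "[Delegation Map]".
    text = re.sub(r"^#{1,6}[ \t]+(.+?)[ \t]*$", r"[\1]", text, flags=re.MULTILINE)
    # Horizontal rules: a line of three or more dashes is dropped.
    text = re.sub(r"^[ \t]*-{3,}[ \t]*$\n?", "", text, flags=re.MULTILINE)
    # Bold first, then italic, so "**x**" is not half-consumed as italic.
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)
    text = re.sub(r"\*(\S(?:.*?\S)?)\*", r"\1", text)
    return text
```

The italic pattern requires a non-space character right after the opening asterisk, so `* item` bullet lines are left alone.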

Level 2: Filler Words and Abbreviations

Filler word removal — removes low-information function words:

| Before | After |
|---|---|
| The system is configured to use Redis | system configured use Redis |
| All of the deployment operations | All deploy operations |
| This is a hard rule that applies | hard rule applies |

Target words: a, an, the, is, are, was, were, be, been, being, in, on, at, to, of, for, that (conjunction), which, with (when not part of a command).
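Filler removal over this word list might look like the sketch below. It is a simplification under stated assumptions: `remove_fillers` is a hypothetical name, the pattern cannot tell whether "that" is a conjunction or whether "with" belongs to a command, and the protection mask is not applied.

```python
import re

# Target words as listed above.
FILLERS = ["a", "an", "the", "is", "are", "was", "were", "be", "been", "being",
           "in", "on", "at", "to", "of", "for", "that", "which", "with"]

# Whole-word match; the lookarounds keep hyphenated terms like "built-in" intact.
FILLER_RE = re.compile(
    r"(?<![\w'-])(?:" + "|".join(FILLERS) + r")(?![\w'-])[ \t]*",
    re.IGNORECASE,
)


def remove_fillers(text: str) -> str:
    out = FILLER_RE.sub("", text)
    # Collapse any doubled spaces left behind by removals.
    return re.sub(r"[ \t]{2,}", " ", out).strip()
```

For example, `remove_fillers("The system is configured to use Redis")` yields `"system configured use Redis"`, matching the first row of the table.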

Abbreviation substitution — replaces common DevOps terms using a dictionary:

| Full term | Abbreviation |
|---|---|
| authentication | auth |
| kubernetes | k8s |
| configuration | cfg |
| environment | env |
| production | prod |
| deployment | deploy |
| infrastructure | infra |
| repository | repo |
| namespace | ns |
| application | app |
| database | db |

Full dictionary in hooks/context/data/abbreviations.json (~50 entries).

Level 3: Internal Vowel Removal

Removes vowels that are flanked by consonants on both sides, in words of 7+ characters:

| Before | After |
|---|---|
| instruction | instrction |
| protection | prtctn |
| collaborative | collbrtve |
| mandatory | mndtry |

Leading vowels are preserved (they anchor word shape for the tokenizer). Short words and exclusion-set words (like error, issue, order) are never disemvoweled.
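The stated rule (drop a vowel only when a consonant sits on both sides, in words of 7+ characters) can be sketched literally as below. The names are hypothetical and the exclusion set contains only the three words mentioned above. Note that a literal reading gives "prtction" for "protection", which differs from the table, so the real implementation presumably applies further rules not described here.

```python
import re

EXCLUDE = {"error", "issue", "order"}  # only the examples named in the text
CONS = "bcdfghjklmnpqrstvwxyz"         # assumption: "y" counts as a consonant

# A vowel with a consonant immediately before and after it.
INTERNAL = re.compile(rf"(?<=[{CONS}])[aeiou](?=[{CONS}])", re.IGNORECASE)


def disemvowel_word(word: str, min_len: int = 7) -> str:
    if len(word) < min_len or word.lower() in EXCLUDE:
        return word
    # A leading vowel has no consonant before it, so it is never matched.
    return INTERNAL.sub("", word)


def disemvowel(text: str) -> str:
    return re.sub(r"[A-Za-z]+", lambda m: disemvowel_word(m.group()), text)
```

The sketch ignores the protection mask, so protected words would need to be filtered out before this step runs.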


Safety Rules — Protected Tokens

The preprocessor NEVER modifies tokens in protected categories. This is enforced by a span-based protection mask that is computed before any transform runs.

Protected categories

1. Code blocks — fenced (```) and inline (`):

`kubectl delete pod` → preserved exactly

This is the most important protection. Commands, paths, env var names, and identifiers are almost always inside code spans in well-authored rule files.

2. Negation words — matched as whole words:

never, don't, not, no, without, cannot, can't, won't, shouldn't, must not, do not

Compressing a negation risks flipping the meaning of a rule.

3. Assertion words — operational imperatives:

always, must, required, mandatory, only, exactly, strictly

4. Action verbs — high-stakes operations:

push, delete, commit, deploy, block, destroy, drop, truncate, kill, terminate, rollback, revert, reset, force, override, disable, remove, purge, wipe

5. ALL_CAPS identifiers — env var names:

CONTEXT_REFRESH_MAX_CHARS, KUBECTL_NAMESPACE, AWS_REGION, etc. Pattern: [A-Z][A-Z0-9_]{2,}

6. Numbers and thresholds:

8000, 20, 3600, 80%, 512MiB — any numeric literal including byte sizes and percentages.

7. File paths and CLI commands:

~/.agentihooks/.env, /home/user/.claude/rules/, kubectl delete, helm upgrade, git push --force
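One way to turn these categories into (start, end) spans is sketched below. The patterns and the `build_mask` name are assumptions, not the project's actual regexes. The source does not say how bare CLI commands outside code spans are recognised, so that part of category 7 is left out.

```python
import re

FENCE = "`" * 3
NEGATIONS = ["never", "don't", "not", "no", "without", "cannot", "can't",
             "won't", "shouldn't", "must not", "do not"]
ASSERTIONS = ["always", "must", "required", "mandatory", "only", "exactly", "strictly"]
ACTIONS = ["push", "delete", "commit", "deploy", "block", "destroy", "drop",
           "truncate", "kill", "terminate", "rollback", "revert", "reset",
           "force", "override", "disable", "remove", "purge", "wipe"]


def _whole_words(words):
    # Longest first so "must not" wins over "must".
    alt = "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))
    return re.compile(rf"(?<![\w'])(?:{alt})(?![\w'])", re.IGNORECASE)


PATTERNS = [
    re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL),  # fenced code
    re.compile(r"`[^`\n]+`"),                                             # inline code
    _whole_words(NEGATIONS + ASSERTIONS + ACTIONS),
    re.compile(r"\b[A-Z][A-Z0-9_]{2,}\b"),                                # ALL_CAPS names
    re.compile(r"\b\d+(?:[.,]\d+)*(?:%|\s?[KMGT]i?B\b)?"),                # numbers, sizes
    re.compile(r"(?<!\w)(?:~|\.{1,2})?/[\w.\-/]+"),                       # file paths
]


def build_mask(text):
    """Return sorted, merged (start, end) spans that transforms must not touch."""
    spans = sorted(m.span() for pat in PATTERNS for m in pat.finditer(text))
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Merging overlapping spans (for instance an action verb inside a code span) keeps the mask small and makes later overlap checks cheaper.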


Algorithm: Protection Mask

The protection mask is a list of (start, end) character-offset spans computed from the raw text. Each transform function uses _apply_masked() which:

  1. Finds all regex matches for the transform
  2. Checks each match span against the protection mask
  3. Skips any match that overlaps a protected span
  4. Applies non-overlapping matches only

The mask is rebuilt after each transform because text modifications shift character offsets. This is O(n*m) per transform (n=matches, m=protected spans) but acceptable given rule files are a few KB.

text = "Never run `kubectl delete` in production"

Protection mask:
  [0, 5)    = "Never"            (negation)
  [10, 26)  = "`kubectl delete`" (code span)

Unprotected:
  [30, 40)  = "production"       (not in any protected category, so Level 2 may abbreviate it)

Level 2 filler removal:
  "run" → not protected, but it's a verb not in filler list → kept
  "in"  → filler word, not in protected span → removed

Level 2 abbreviation:
  "production" → dictionary entry, not in protected span → "prod"

Result: "Never run `kubectl delete` prod"
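The four steps listed for _apply_masked() can be sketched as follows. This is an illustrative reimplementation under assumptions, not the project's code: the name `apply_masked`, its signature, and the half-open overlap test are all hypothetical.

```python
import re


def overlaps(span, mask):
    """True if the half-open span intersects any protected span."""
    start, end = span
    return any(start < p_end and p_start < end for p_start, p_end in mask)


def apply_masked(text, pattern, repl, mask):
    """Replace pattern matches with repl, skipping matches that touch the mask."""
    out, last = [], 0
    for m in pattern.finditer(text):
        if overlaps(m.span(), mask):
            continue  # protected: leave the original text untouched
        out.append(text[last:m.start()])
        out.append(m.expand(repl))
        last = m.end()
    out.append(text[last:])
    return "".join(out)


text = "Never run `kubectl delete` in production"
mask = [(0, 5), (10, 26)]
step1 = apply_masked(text, re.compile(r"\bin\b ?"), "", mask)
# Both edits fall after the protected spans, so the same mask is still valid here;
# in general it would be rebuilt from step1 before the next transform.
step2 = apply_masked(step1, re.compile(r"\bproduction\b"), "prod", mask)
```

Here `step2` is `"Never run `kubectl delete` prod"`, the same result as the walkthrough above.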

The Abbreviation Dictionary

Location: hooks/context/data/abbreviations.json

Structure:

{
  "_version": 1,
  "entries": {
    "authentication": "auth",
    "kubernetes": "k8s",
    "configuration": "cfg"
  }
}

Entries are applied longest-match first to avoid partial collisions (e.g., “authentication” before “auth”).
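A small sketch of longest-match-first substitution, assuming lowercase dictionary keys and whole-word matching. The function names are hypothetical, and the multi-word entry in the usage note is an invented example used only to show why ordering matters.

```python
import re


def build_abbrev_pattern(entries):
    # Longest keys first, so a longer phrase wins over a shorter key it contains.
    keys = sorted(entries, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in keys)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


def abbreviate(text, entries):
    pattern = build_abbrev_pattern(entries)
    # Keys are assumed lowercase, so the matched text is lowercased for lookup.
    return pattern.sub(lambda m: entries[m.group().lower()], text)
```

With the hypothetical entries `{"production": "prod", "production environment": "prod-env"}`, the input `"Production environment ready"` becomes `"prod-env ready"`; trying the shorter key first would instead leave `"prod environment ready"`.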

Extending the dictionary

Set CONTEXT_REFRESH_ABBREV_FILE to the path of a custom JSON file with the same structure. Your entries are shallow-merged on top of the built-in dictionary (your entries win on collision).

# In ~/.agentihooks/.env
CONTEXT_REFRESH_ABBREV_FILE=/home/user/.agentihooks/custom-abbrevs.json
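The shallow merge could be sketched like this. `BUILTIN` holds only the three entries shown in the structure above, the function names are hypothetical, and error handling for a missing or malformed file is omitted.

```python
import json
import os

# Stand-in for the built-in dictionary (the real file has ~50 entries).
BUILTIN = {"authentication": "auth", "kubernetes": "k8s", "configuration": "cfg"}


def merge_entries(builtin, custom_doc):
    """Shallow merge: user entries override built-in ones on collision."""
    merged = dict(builtin)
    merged.update(custom_doc.get("entries", {}))
    return merged


def load_abbreviations(builtin=BUILTIN, env=None):
    env = os.environ if env is None else env
    path = env.get("CONTEXT_REFRESH_ABBREV_FILE", "")
    if not path:
        return dict(builtin)  # no custom file configured
    with open(os.path.expanduser(path), encoding="utf-8") as fh:
        return merge_entries(builtin, json.load(fh))
```

Copying the built-in dictionary before updating it keeps the defaults unchanged if the merged result is later modified.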

Configuration

| Variable | Default | Description |
|---|---|---|
| CONTEXT_REFRESH_COMPRESSION | standard | Compression level: off, light, standard, aggressive |
| CONTEXT_COMPRESSION_SCOPE | refresh | Where compression applies: refresh (context refresh only) or all (all injections + tool output) |
| CONTEXT_REFRESH_ABBREV_FILE | (empty) | Path to user-supplied abbreviation dictionary (JSON). Merged on top of built-in. |

Scope

By default, compression only applies to context refresh injections. Set CONTEXT_COMPRESSION_SCOPE=all to compress everything that flows through the hook system:

  • All inject_context() / inject_banner() calls (session start banners, secrets warnings, tool memory, circuit breaker messages, context threshold warnings)
  • Bash output filter additionalContext results (kubectl describe, docker logs, git diffs)
  • Context refresh rules and CLAUDE.md (already compressed per-rule for priority truncation)

# In ~/.agentihooks/.env
CONTEXT_REFRESH_COMPRESSION=standard
CONTEXT_COMPRESSION_SCOPE=all

Not compressed (by design): user prompts, stderr block messages, log file output.
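The scope decision can be summarised in a short sketch. The variable names and defaults come from the table above; the source labels such as "inject_banner" or "user_prompt", the function names, and the fallback to defaults on unrecognised values are assumptions made for illustration.

```python
import os

LEVELS = ("off", "light", "standard", "aggressive")
SCOPES = ("refresh", "all")
# Hypothetical labels for the sources listed as never compressed.
NEVER_COMPRESSED = {"user_prompt", "stderr_block", "log_file"}


def read_config(env=None):
    env = os.environ if env is None else env
    level = env.get("CONTEXT_REFRESH_COMPRESSION", "standard").strip().lower()
    scope = env.get("CONTEXT_COMPRESSION_SCOPE", "refresh").strip().lower()
    # Assumption: unknown values fall back to the documented defaults.
    if level not in LEVELS:
        level = "standard"
    if scope not in SCOPES:
        scope = "refresh"
    return level, scope


def should_compress(source, level, scope):
    if level == "off" or source in NEVER_COMPRESSED:
        return False
    # Context refresh is always in scope; everything else needs scope=all.
    return source == "refresh" or scope == "all"
```

Checking the never-compressed set before the scope keeps user prompts and block messages untouched even when scope is set to all.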


Full Compression Example

Source: a typical clearance rule file (811 chars original).

Level 0 (off) — 811 chars

# Operator Clearance — Full by Default

## Default: Full Clearance

You have full clearance at all times unless the operator explicitly restricts it.

Full clearance means:
- Push to any branch including main
- Force push when needed
- Destructive operations (rm, kubectl delete, docker rm) — just do it
- Production operations — just do it
- All git operations — just do it

## The One Absolute: Secrets

No clearance level — not even full — permits handling credentials, API keys,
tokens, or passwords in plaintext. Reference via env vars only.

## Restricting and Restoring

- "restrict clearance" / "careful mode" → ask before destructive/production ops
- "full clearance" / "back to normal" → default behavior restored
- Restriction is per-task, reverts automatically after task completion

Level 1 (light) — ~620 chars

[Operator Clearance — Full by Default]

[Default: Full Clearance]

You have full clearance at all times unless the operator explicitly restricts it.

Full clearance means:
- Push to any branch including main
- Force push when needed
- Destructive operations (rm, kubectl delete, docker rm) — just do it
- Production operations — just do it
- All git operations — just do it

[The One Absolute: Secrets]

No clearance level — not even full — permits handling credentials, API keys,
tokens, or passwords in plaintext. Reference via env vars only.

[Restricting and Restoring]

- "restrict clearance" / "careful mode" → ask before destructive/production ops
- "full clearance" / "back to normal" → default behavior restored
- Restriction is per-task, reverts automatically after task completion

Level 2 (standard) — ~480 chars

[Operator Clearance — Full by Default]

[Default: Full Clearance]

You have full clearance unless operator explicitly restricts it.

Full clearance means:
- Push any branch including main
- Force push when needed
- Destructive ops (rm, kubectl delete, docker rm) — just do it
- prod ops — just do it
- All git ops — just do it

[One Absolute: Secrets]

No clearance level — not even full — permits handling credentials, API keys, tokens, or passwords plaintext. Reference via env vars only.

[Restricting and Restoring]

- "restrict clearance" / "careful mode" → ask before destructive/prod ops
- "full clearance" / "back to normal" → default behavior restored
- Restriction per-task, reverts automatically after task completion

Level 3 (aggressive) — ~410 chars

[Opertr Clearance — Full by Default]

[Default: Full Clearance]

You have full clearance unless opertr explctly rstrcts it.

Full clearance means:
- Push any branch inclding main
- Force push when needed
- Destructive ops (rm, kubectl delete, docker rm) — just do it
- prod ops — just do it
- All git ops — just do it

[One Absolte: Secrets]

No clearance level — not even full — permits handlng credntls, API keys, tokens, or passwords plaintext. Reference via env vars only.

[Rstrcting and Restrng]

- "restrict clearance" / "careful mode" → ask before destructive/prod ops
- "full clearance" / "back to normal" → default behavr restored
- Restriction per-task, reverts autmtcly after task completn

Limitations

  • All-command rules: If a rule file is entirely code blocks and identifiers, the protection mask covers the whole document and no compression occurs. This is correct behavior.
  • No semantic validation: The preprocessor cannot detect if compression changes the operational meaning of a rule in edge cases. It relies on the protection categories to prevent this.
  • Dictionary maintenance: The abbreviation dictionary is manually curated. New DevOps terms need to be added as they emerge.
  • Level 3 readability: Aggressive vowel removal produces text that is harder for humans to read in logs. The design assumes it remains comprehensible to the LLM, but, as noted above, the preprocessor does not validate this.

Future Scope

The Context Preprocessor is designed to grow into a standalone service:

  • User message preprocessing: compress verbose user inputs before they consume context budget
  • Tool output compression: apply abbreviation and formatting reduction to large tool outputs (complementing bash_output_filter.py)
  • Adaptive compression: dynamically increase compression level as context usage approaches the window limit (integrating with context audit data)
  • Custom compression profiles: per-project or per-domain compression dictionaries
  • Compression analytics: track compression ratios and token savings across sessions