How I Made Claude Code Safer (And You Can Too)

I’ve been running Claude Code on real projects for months. It’s great at writing code — but it doesn’t always understand the consequences of what it writes.

Claude Code validates which tools can run. It doesn’t validate what they write. That gap cost me a crashed project and a malformed config file. So I built a plugin that fixes it — and, unexpectedly, teaches Claude to stop making the same mistakes.

Here’s what I learned about applying defense-in-depth to AI tooling.

The Gap Nobody Talks About

Claude Code has permission rules. You can control which tools run. You can block shell commands with tools like Safety Net. But once an Edit or Write is approved, nothing validates what gets written.

That’s the gap. Not access control — content validation.

Think about it in security terms: permission rules are your firewall. They control who gets in. But once traffic is allowed, you still need inspection. You need something that looks at what is being written and decides whether it should land.

I found this gap the hard way.

How I Found It

I was working with Claude on a project when it modified one of my primary config files mid-session. The project crashed. Claude had tried to add data to a JSON config file and broke the formatting in the process. When I reviewed the damage, I found it had also stuffed in a bunch of data that didn’t belong there.

That wasn’t the only time. Claude would rewrite CLAUDE.md and silently drop entire sections. It would “fix” a config file by deleting top-level YAML keys. It would “clean up” a script and strip the shebang that CI depends on.

Claude didn’t mean to break anything. It was trying to be helpful. But it had no guardrails on what it could write — only on whether it could write. I knew I could solve this, and that’s where Document Guard started.

The Security Principle: Defense in Depth for AI Tooling

If you’ve spent any time in cybersecurity, you know defense in depth: don’t rely on one control. Layer them.

The same principle applies to AI coding assistants. Here’s the stack I run:

| Layer | What It Does | Tool |
|---|---|---|
| Access Control | Controls which tools Claude can use | Claude Code Permission Rules |
| Command Protection | Blocks dangerous shell commands | Safety Net |
| Content Validation | Inspects file edits before they land | Document Guard |
| Audit Trail | Logs every action for review | Document Guard audit log + custom hooks |

No single layer is sufficient. Permission rules don’t inspect content. Safety Net doesn’t watch file edits. Document Guard doesn’t control tool access. Together, they cover the surface.

What I Built

Document Guard is a Claude Code plugin that intercepts every Edit and Write operation and validates it against configurable rules. It runs as a PreToolUse hook, before the edit hits disk.

The Four-Tier Model

Not everything deserves the same response. A credential leak and a missing shebang are different severity levels. Document Guard uses four tiers:

| Tier | Response | Example |
|---|---|---|
| Critical | Block the edit. Require explicit user approval to override. | Writing an AWS key into source code |
| High | Block the edit. Require explicit override. | Removing sections from CLAUDE.md |
| Medium | Warn Claude (inject context). Allow the edit. | Stripping a shebang from a shell script |
| Low | Log it. No friction. | Informational audit trail |

This isn’t binary “allow or deny.” It’s graduated response — the same principle behind alert fatigue management in a SOC. If everything is critical, nothing is.
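As a sketch, the tier table reduces to a severity-ordered lookup where the worst violation present decides the outcome. The field names here are illustrative, not the plugin's real internals:

```javascript
// Each tier maps to what the guard does with a violation, mirroring the table above.
const TIER_RESPONSES = {
  critical: { block: true,  overridable: true,  feedback: true  },
  high:     { block: true,  overridable: true,  feedback: true  },
  medium:   { block: false, overridable: false, feedback: true  }, // warn, then allow
  low:      { block: false, overridable: false, feedback: false }, // log only
};

// Resolve the overall response for a set of violations: the most severe
// tier present decides whether the edit lands.
function respond(violations) {
  const order = ['critical', 'high', 'medium', 'low'];
  const worst = order.find((tier) => violations.some((v) => v.tier === tier));
  return worst ? { tier: worst, ...TIER_RESPONSES[worst] } : { tier: null, block: false };
}
```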

Seven Checks Out of the Box

The plugin ships with seven structural checks that cover the most common failure modes:

  1. Total write block — Some files (.env, .credentials/) should never be touched by an AI. Period.

Document Guard blocking a .env file write — Claude explains why and tells the user how to add the keys manually

  2. Credential scanning — 13 regex patterns catch AWS keys, GitHub tokens, Stripe keys, JWTs, private key blocks, database connection strings, and more. Built-in placeholder detection prevents false positives on your_api_key_here.

Document Guard catching credentials being written into source code

  3. Key deletion protection — When Claude does a full-file rewrite of a YAML or config file, the check compares the old and new versions and flags any removed top-level keys.
  4. Section preservation — Same idea for markdown: detects when ## Heading sections disappear during a rewrite.
  5. Heading structure — Catches removal of any heading level (# through ######).
  6. Frontmatter preservation — Locks specific YAML frontmatter fields so skill identity, command routing, and metadata survive edits.
  7. Shebang preservation — Catches when #!/usr/bin/env node gets stripped from a script, which silently breaks execution in CI.
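Checks 4 and 5 amount to a diff over headings: extract them from the old and new contents, and flag any that vanished. A minimal sketch (not the plugin's actual code):

```javascript
// Pull every markdown heading (# through ######) out of a document.
function extractHeadings(markdown) {
  return markdown
    .split('\n')
    .filter((line) => /^#{1,6}\s/.test(line))
    .map((line) => line.trim());
}

// Headings present in the old content but missing from the rewrite.
function removedHeadings(oldContent, newContent) {
  const kept = new Set(extractHeadings(newContent));
  return extractHeadings(oldContent).filter((heading) => !kept.has(heading));
}
```

A non-empty result from `removedHeadings` is exactly the "sections silently dropped from CLAUDE.md" failure mode described earlier.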

Plus one opt-in semantic check that uses a local Ollama model to verify written content matches the file’s declared purpose. This one always warns, never blocks, and fails open if Ollama isn’t running.
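To make the credential scan concrete, here is a cut-down sketch with three representative patterns and placeholder suppression. The shipped plugin uses 13 patterns and more careful matching; everything below is illustrative:

```javascript
// A few representative secret patterns — the real plugin ships 13.
const CREDENTIAL_PATTERNS = [
  { name: 'AWS access key',    regex: /AKIA[0-9A-Z]{16}/ },
  { name: 'GitHub token',      regex: /ghp_[A-Za-z0-9]{36}/ },
  { name: 'Private key block', regex: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
];

// Obvious placeholders (your_api_key_here, example values) should not trip the scanner.
const PLACEHOLDER = /your[_-]?|example|placeholder|xxx+|<[^>]+>/i;

// Scan content line by line; skip placeholder lines, report real-looking hits.
function scanForCredentials(content) {
  const hits = [];
  for (const line of content.split('\n')) {
    if (PLACEHOLDER.test(line)) continue;
    for (const { name, regex } of CREDENTIAL_PATTERNS) {
      if (regex.test(line)) hits.push({ name, line: line.trim() });
    }
  }
  return hits;
}
```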

The Override Mechanism

Blocking is only useful if there’s a clean escape hatch. When Document Guard blocks an edit, it tells Claude exactly how to proceed:

  1. Ask the user for explicit approval
  2. Write a single-use override file with the approved path and an expiration timestamp
  3. Retry the edit
  4. The override is consumed and logged

No permanent bypasses. No toggle that stays on. Every override expires (default: 120 seconds) and every override is audited.

This is the same principle behind break-glass procedures in access management. You don’t remove the control — you create a documented, time-limited, auditable exception.

Document Guard blocking a CLAUDE.md rewrite and offering the override menu

The Part I Didn’t Expect: It Teaches Claude

Here’s what surprised me. When Document Guard blocks an edit, it doesn’t just return a “denied” signal — it injects context back to Claude explaining why the edit was blocked. What rule matched, what check failed, what the violation was.

Claude reads that feedback. And it adjusts its behavior for the rest of the session.

Block a credential leak once, and Claude stops writing credentials into tracked files. Block a section removal from CLAUDE.md, and Claude starts preserving structure in its rewrites. The guardrails are teaching the AI in real time.

That wasn’t the original plan. I built Document Guard as a safety net — catch the bad edit, prevent the damage. But the feedback loop turned it into something more: a way to shape Claude’s behavior within your project’s specific rules. The more rules you define, the more Claude learns what matters to you.

How It Works (Under the Hood)

Claude wants to edit a file
    |
Document Guard intercepts (PreToolUse hook)
    |
Extract file path, resolve to relative path
    |
Match against rules (glob patterns, most specific wins)
    |
Run applicable checks (credential scan, structural, semantic)
    |
No violations? --> Allow
    |
Critical/High violations? --> Block + log + provide override instructions
Medium violations? --> Warn (inject context) + allow
Low violations? --> Log only

The entire hook runs synchronously before the edit. If Document Guard crashes, it fails open — your workflow isn’t blocked by a broken guard.
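The rule-matching step can be sketched without a glob library (the plugin is dependency-free). Here "most specific wins" is approximated as longest pattern, which is my assumption about the ranking, not a confirmed detail:

```javascript
// Compile a glob like 'migrations/**' or 'openapi.yaml' into a RegExp:
// * matches within one path segment, ** matches across segments.
function globToRegExp(glob) {
  const escaped = glob
    .replace(/[.+^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters
    .replace(/\*\*/g, '\u0000')           // placeholder so * doesn't eat **
    .replace(/\*/g, '[^/]*')
    .replace(/\u0000/g, '.*');
  return new RegExp(`^${escaped}$`);
}

// Find every rule whose pattern matches the relative path, then pick the
// most specific one — approximated here as the longest pattern string.
function matchRule(relativePath, rules) {
  const matches = rules.filter((r) => globToRegExp(r.pattern).test(relativePath));
  return matches.sort((a, b) => b.pattern.length - a.pattern.length)[0] ?? null;
}
```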

Configuring It For Your Project

Document Guard ships with 11 universal rules that work out of the box. But the real power is customization.

Two-Tier Config

  1. Project override (.claude/hooks/document-guard.config.js) — highest priority
  2. Plugin default — bundled with the plugin, used as fallback

If you create a project config, it takes full precedence. No merging, no inheritance complexity.

Example: Protecting Database Migrations

```javascript
{
  name: 'Database migrations',
  pattern: 'migrations/**',
  tier: 'high',
  checks: ['no_write_allowed'],
  message: 'Migration files are immutable once created.',
}
```

Example: Locking API Schema Structure

```javascript
{
  name: 'API schema',
  pattern: 'openapi.yaml',
  tier: 'high',
  checks: ['key_deletion_protection', 'section_preservation'],
}
```

Example: Semantic Validation for Docs

```javascript
{
  name: 'API documentation',
  pattern: 'docs/api/**',
  tier: 'high',
  checks: ['section_preservation', 'semantic_relevance'],
  purpose: 'REST API endpoint documentation',
}
```

The config is plain JavaScript (not JSON), so you get comments, variables, and logic if you need them.

What This Means for Security Teams

If your team is adopting AI coding assistants — and you probably are, or will be soon — you need to think about this layer.

The risk isn’t malicious AI. The risk is a capable assistant optimizing for the task in front of it without awareness of the consequences. It rewrites a config file and drops a key. It generates example code with a real API key from context. It “cleans up” a script and removes the shebang.

These are the same kinds of mistakes junior developers make. The difference is AI assistants make them faster and at scale.

What you can do today:

  1. Install Document Guard — zero config, immediate protection for the most common failure modes
  2. Add project-specific rules — protect your migration files, your API schemas, your deployment configs
  3. Run the full stack — Permission Rules + Safety Net + Document Guard for defense in depth
  4. Review the audit log — understand what your AI assistant is actually doing with file access

Try It

Two commands:

```shell
claude plugin marketplace add davidmoneil/aifred-document-guard
claude plugin install document-guard@aifred-document-guard
```

That’s it. No config needed. 11 protection rules active immediately. ~700 lines of vanilla Node.js, no dependencies.

It’s open source (MIT) and works with any Claude Code project.

GitHub: github.com/davidmoneil/aifred-document-guard

Document Guard is part of the AIfred ecosystem — a configuration framework for Claude Code that includes hooks, skills, patterns, and automation.

I’d love to hear what you think — and what’s the worst thing an AI assistant has accidentally done to your codebase? I’m genuinely curious. Reach out on Twitter/X or open an issue on GitHub.

