Claude Code Optimization

Root Cause 1

🗄️

Cache Misses

If your prompt cache breaks mid-session, you pay the full input price on every turn instead of the cached rate of about 0.1×.

⚠ What breaks the cache

✕ Adding or removing a tool in the middle of a session

✕ Switching models in the middle of a session

✓ What to do

✓ Decide your tools and model BEFORE you start the session

✓ Never change them mid-session

✓ A healthy cache hit rate is 90% or higher
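To check a session against the 90% target, here is a minimal sketch that computes a hit rate from per-turn token counts. The field names mirror the `usage` object returned by the Anthropic API. The 1.25× cache-write multiplier is an assumption (the 5-minute cache tier), and the sample numbers are invented for illustration.

```python
from dataclasses import dataclass

# Multipliers relative to the base input-token price. The 0.1x read rate is
# from the text above; the 1.25x write rate is an assumption (5-minute cache).
CACHE_READ_MULTIPLIER = 0.1
CACHE_WRITE_MULTIPLIER = 1.25


@dataclass
class TurnUsage:
    """Token counts for one turn, named after the API usage fields."""
    input_tokens: int                 # uncached input, billed at 1.0x
    cache_creation_input_tokens: int  # tokens written to the cache
    cache_read_input_tokens: int      # tokens served from the cache


def _total_input(turns):
    return sum(
        t.input_tokens + t.cache_creation_input_tokens + t.cache_read_input_tokens
        for t in turns
    )


def cache_hit_rate(turns):
    """Share of all input tokens that were served from the cache."""
    total = _total_input(turns)
    read = sum(t.cache_read_input_tokens for t in turns)
    return read / total if total else 0.0


def relative_input_cost(turns):
    """Input cost as a fraction of what the same tokens would cost uncached."""
    total = _total_input(turns)
    cost = sum(
        t.input_tokens
        + t.cache_creation_input_tokens * CACHE_WRITE_MULTIPLIER
        + t.cache_read_input_tokens * CACHE_READ_MULTIPLIER
        for t in turns
    )
    return cost / total if total else 0.0


# Hypothetical three-turn session: the first turn writes the cache,
# later turns mostly read from it.
session = [
    TurnUsage(200, 50_000, 0),
    TurnUsage(300, 1_000, 50_000),
    TurnUsage(250, 1_200, 51_000),
]
print(f"hit rate: {cache_hit_rate(session):.1%}")
print(f"relative input cost: {relative_input_cost(session):.2f}x")
```

A short session like this one sits well below 90% because the first turn has to write the cache; the rate climbs as more turns reuse the same prefix.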

Root Cause 2

📚

Context Bloat

The longer your session runs, the more context Claude carries into every turn, and that costs tokens. Opus 4.6 defaults to a 1M context window, which is overkill for most work.

Settings to add

CLAUDE_CODE_DISABLE_1M_CONTEXT = 1
CLAUDE_AUTOCOMPACT_PCT_OVERRIDE = 80
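One way to apply the two settings is sketched below. It merges them into a settings dictionary under an `env` key and prints the result as JSON. Treating them as environment variables supplied through an `env` block is an assumption of this sketch, and the `EDITOR` entry is a hypothetical pre-existing setting.

```python
import json

# The two variables come from the text above; supplying them through an
# "env" block in a settings file is an assumption of this sketch.
CONTEXT_ENV = {
    "CLAUDE_CODE_DISABLE_1M_CONTEXT": "1",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "80",
}


def with_context_env(settings):
    """Return a copy of a settings dict with the context variables merged in."""
    merged = dict(settings)
    merged["env"] = {**settings.get("env", {}), **CONTEXT_ENV}
    return merged


existing = {"env": {"EDITOR": "vim"}}  # hypothetical existing settings
print(json.dumps(with_context_env(existing), indent=2))
```

The merge keeps any variables already present and leaves the input dictionary untouched, so it is safe to run against an existing configuration.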

5 habits to keep context clean

/compact: at 50% full or after every task; don't wait

/clear: between unrelated tasks, for a fresh session and a fresh cache

/rewind: if a turn went wrong; cheaper than working around bad context

Subagents: use for bulk work to keep your main session lean

@filename: tag files directly so Claude doesn't search for them
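The compaction habit can be expressed as a simple threshold check. This is a sketch only: the 50% and 80% thresholds come from the text above, while the 200,000-token window and the function names are assumptions made for illustration.

```python
def context_fill(used_tokens, window_tokens):
    """Fraction of the context window currently in use."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens


def compact_advice(used_tokens, window_tokens, task_done=False,
                   manual_at=0.5, auto_at=0.8):
    """Suggest an action based on how full the context window is."""
    fill = context_fill(used_tokens, window_tokens)
    if fill >= auto_at:
        return "auto-compact threshold reached"
    if task_done or fill >= manual_at:
        return "run /compact now"
    return "keep working"


WINDOW = 200_000  # assumed window size with the 1M context disabled
for used in (40_000, 100_000, 170_000):
    print(used, compact_advice(used, WINDOW))
```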

Subagent model guide

Haiku: boring mechanical work

Sonnet: research, code exploration

Opus: planning, complex decisions only
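The guide above amounts to a lookup table. A minimal sketch follows; the task labels are hypothetical, and falling back to Sonnet for unknown task kinds is an assumption, not something the guide states.

```python
from enum import Enum


class Tier(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Task labels are hypothetical; the tier assignments follow the guide above.
ROUTES = {
    "mechanical": Tier.HAIKU,
    "research": Tier.SONNET,
    "code_exploration": Tier.SONNET,
    "planning": Tier.OPUS,
    "complex_decision": Tier.OPUS,
}


def pick_subagent_model(task_kind):
    """Return the model tier for a task kind (assumed default: Sonnet)."""
    return ROUTES.get(task_kind, Tier.SONNET)


for kind in ("mechanical", "planning", "something_else"):
    print(kind, "->", pick_subagent_model(kind).value)
```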

Root Cause 3

⚙️

Wrong Model or Effort Level

Most people use Opus-level reasoning on tasks that only need Haiku. The default effort level uses ~2× the tokens of medium.

Effort levels to set per prompt

/effort low: quick fixes, simple tasks

/effort medium: most everyday problems

/effort high: complex reasoning

/effort xhigh: default for agentic tasks

/effort max: almost never worth it

Model routing strategy

→ Start a Sonnet session when work is simple — cheaper

→ Start an Opus session when planning is involved — delegate the rest to Sonnet/Haiku

→ If you keep hitting limits, route through OpenRouter — same Claude Code interface, ~1/12th the cost

Root Cause 4

📄

Wrong Input Format

Some input formats cost far more tokens than necessary by default.

Don't use: screenshots for web pages
Use instead: agent-browser
⚡ ~90% fewer tokens

Don't use: Claude's built-in PDF reader
Use instead: pdftotext
⚡ Avoids loading PDFs as images

Don't use: re-reading the whole repo each task
Use instead: code-review-graph
⚡ 6.8× fewer tokens on reviews, up to 49× on daily tasks
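The pdftotext swap can be scripted so that only plain text reaches the session. This sketch assumes the `pdftotext` command-line tool (part of Poppler) is installed and on the PATH; `report.pdf` is a hypothetical file name.

```python
import subprocess


def pdftotext_command(pdf_path, layout=True):
    """Build the pdftotext command; '-' sends the text to stdout."""
    cmd = ["pdftotext"]
    if layout:
        cmd.append("-layout")  # keep the original physical layout
    cmd += [str(pdf_path), "-"]
    return cmd


def pdf_to_text(pdf_path):
    """Run pdftotext and return the extracted text.

    Raises CalledProcessError if the tool exits with an error, and
    FileNotFoundError if pdftotext is not installed.
    """
    result = subprocess.run(
        pdftotext_command(pdf_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


print(" ".join(pdftotext_command("report.pdf")))
# Usage, once pdftotext is available:
#   text = pdf_to_text("report.pdf")
```

Saving the output to a `.txt` file and tagging that file with @filename keeps the PDF itself out of the context.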

Root Cause 5

👁️

Not Watching Your Usage

You can't fix what you can't see. Three tools to track it.

phuryn/claude-usage: historical view, breakdown by day/week/session (Pro/Max/Team)

Gronsten/claude-usage-monitor: real-time, shows your current 5-hour window live

platform.claude.com/usage/cache: Anthropic's cache dashboard (API users only)

✦ The One-Line Summary

Lock your tools and model before you start. Compact early and often. Use cheaper models for simple tasks. Use the right input format. Track your usage. That's it.

From $750/month to 12% of quota in 3 days
