Back to Resources
Claude Code Optimization

From $750/month to 12% of quota in 3 days

Same workflow. Same Claude. Just the right settings, cache habits, and a few small swaps that most people never make.

Root Cause 1

🗄️

Cache Misses

If your prompt cache breaks mid-session, you pay full price every turn instead of 0.1×.

⚠ What breaks the cache

Adding or removing a tool in the middle of a session
Switching models in the middle of a session

✓ What to do

Decide your tools and model BEFORE you start the session
Never change them mid-session
A healthy cache hit rate is around 90%+

Root Cause 2

📚

Context Bloat

The longer your session runs, the more Claude has to remember — and that costs tokens. Opus 4.6 defaults to a 1M context window which is overkill.

Settings to add

CLAUDE_CODE_DISABLE_1M_CONTEXT = 1
CLAUDE_AUTOCOMPACT_PCT_OVERRIDE = 80

5 habits to keep context clean

/compact at 50% full or after every task — don't wait
/clear between unrelated tasks — fresh session, fresh cache
/rewind if a turn went wrong — cheaper than working around bad context
Subagents use for bulk work — keeps your main session lean
@filename tag files directly so Claude doesn't search for them

Subagent model guide

Haiku

boring mechanical work

Sonnet

research, code exploration

Opus

planning, complex decisions only

Root Cause 3

⚙️

Wrong Model or Effort Level

Most people use Opus-level reasoning on tasks that only need Haiku. Default reasoning uses ~2× more tokens than medium.

Effort levels to set per prompt

/effort low quick fixes, simple tasks
/effort medium most everyday problems
/effort high complex reasoning
/effort xhigh default for agentic tasks
/effort max almost never worth it

Model routing strategy

Start a Sonnet session when work is simple — cheaper
Start an Opus session when planning is involved — delegate the rest to Sonnet/Haiku
If you keep hitting limits, route through OpenRouter — same Claude Code interface, ~1/12th the cost

Root Cause 4

📄

Wrong Input Format

Some file types cost way more tokens than necessary by default.

Don't use

Screenshots for web pages

Use instead

agent-browser

⚡ ~90% fewer tokens

Don't use

Claude's built-in PDF reader

Use instead

pdftotext

⚡ Avoids loading PDFs as images

Don't use

Re-reading whole repo each task

Use instead

code-review-graph

⚡ 6.8× fewer tokens on reviews, up to 49× on daily tasks

Root Cause 5

👁️

Not Watching Your Usage

You can't fix what you can't see. Three tools to track it.

phuryn/claude-usage Historical view, breakdown by day/week/session (Pro/Max/Team)
Gronsten/claude-usage-monitor Real-time, shows your current 5-hour window live
platform.claude.com/usage/cache Anthropic's cache dashboard — API users only

The One-Line Summary

Lock your tools and model before you start. Compact early and often. Use cheaper models for simple tasks. Use the right input format. Track your usage. That's it.

✓ Copied!