claude-code-proxy-patterns
$
npx mdskill add terrylica/cc-skills/claude-code-proxy-patterns<!-- # SSoT-OK: version references are documentation of binary analysis findings, not package versions -->
SKILL.md
.github/skills/claude-code-proxy-patternsView on GitHub ↗
---
name: claude-code-proxy-patterns
description: Claude Code OAuth proxy patterns and anti-patterns for multi-provider model routing.
allowed-tools: Read, Write, Edit, Bash, Grep, Glob
---
<!-- # SSoT-OK: version references are documentation of binary analysis findings, not package versions -->
# Claude Code Proxy Patterns
Multi-provider proxy that routes Claude Code model tiers to different backends. Haiku to MiniMax (cost/speed), Sonnet/Opus to Anthropic (native OAuth passthrough). Includes Go binary proxy with launchd auto-restart and failover wrapper for resilience.
**Scope**: Local reverse proxy for Claude Code with OAuth subscription (Max plan). Routes based on model name in request body.
**Reference implementations**:
- Go proxy binary: `/usr/local/bin/claude-proxy` (port 8082)
---
> **Self-Evolving Skill**: This skill improves through use. If instructions are wrong, parameters drifted, or a workaround was needed — fix this file immediately, don't defer. Only update for real, reproducible issues.
## When to Use This Skill
- Building or debugging a Claude Code multi-provider proxy
- Setting up `ANTHROPIC_BASE_URL` with OAuth subscription mode
- Integrating Anthropic-compatible providers (MiniMax, etc.)
- Diagnosing "OAuth not supported" or auth failures through a proxy
- Understanding how Claude Code stores and transmits OAuth tokens
**Do NOT use for**: Claude API key-only setups (no proxy needed), MCP server development, Claude Code hooks (operate at tool level, not API level), or corporate HTTPS proxy traversal.
---
## Architecture
```
Claude Code (OAuth/Max subscription)
|
| ANTHROPIC_BASE_URL=http://127.0.0.1:8082 (Go proxy)
| (unset ANTHROPIC_API_KEY to avoid auth conflict)
v
+----------------------------------+
| Go proxy (:8082) |
| launchd managed, auto-restart |
+----------------------------------+
|
| model =
| claude-haiku-
| 4-5-20251001
v
+-----------+
| MiniMax |
| highspeed |
+-----------+
```
**Port Configuration**:
- `:8082` - Go proxy (entry point, launchd-managed, auto-restart)
The Go proxy uses `cenkalti/backoff/v4` for built-in retry logic.
The proxy reads the `model` field from each `/v1/messages` request body. If it matches the configured Haiku model ID, the request goes to MiniMax. Everything else falls through to real Anthropic with OAuth passthrough.
---
## Working Patterns
### WP-01: Keychain OAuth Token Reading
Read OAuth tokens from macOS Keychain where Claude Code stores them.
**Service**: `"Claude Code-credentials"` (note the space before the hyphen)
**Account**: Current username via `getpass.getuser()`
```python
import subprocess, json, getpass
result = subprocess.run(
["security", "find-generic-password",
"-s", "Claude Code-credentials",
"-a", getpass.getuser(), "-w"],
capture_output=True, text=True, timeout=5, check=False,
)
if result.returncode == 0:
data = json.loads(result.stdout.strip())
oauth = data.get("claudeAiOauth")
```
See [references/oauth-internals.md](./references/oauth-internals.md) for the full deep dive.
### WP-02: Token JSON Structure
The Keychain stores a JSON envelope with the `claudeAiOauth` key.
```json
{
"claudeAiOauth": {
"accessToken": "eyJhbG...",
"refreshToken": "rt_...",
"expiresAt": 1740268800000,
"subscriptionType": "claude_pro_2025"
}
}
```
**Note**: `expiresAt` is in **milliseconds** (Unix epoch _ 1000). Compare with `time.time() _ 1000` or divide by 1000 for seconds.
### WP-03: OAuth Beta Header
The `anthropic-beta: oauth-2025-04-20` header is **required** for OAuth token authentication. Without it, Anthropic rejects the Bearer token.
**Critical**: APPEND to existing beta headers, do not replace them.
```python
# proxy.py:304-308
existing_beta = original_headers.get("anthropic-beta", "")
beta_parts = [b.strip() for b in existing_beta.split(",") if b.strip()] if existing_beta else []
if "oauth-2025-04-20" not in beta_parts:
beta_parts.append("oauth-2025-04-20")
target_headers["anthropic-beta"] = ",".join(beta_parts)
```
### WP-04: ANTHROPIC_API_KEY=proxy-managed
Setting `ANTHROPIC_BASE_URL` alone is insufficient in OAuth mode. Claude Code must also see `ANTHROPIC_API_KEY` set to switch from OAuth-only mode to API-key mode, which then honors `ANTHROPIC_BASE_URL`.
```bash
# In .zshenv (managed by proxy-toggle)
export ANTHROPIC_BASE_URL="http://127.0.0.1:8082"
export ANTHROPIC_API_KEY="proxy-managed"
```
The value `"proxy-managed"` is a dummy sentinel. The proxy intercepts it (line 324) and never forwards it to providers.
### WP-05: OAuth Token Cache with TTL
Avoid repeated Keychain subprocess calls by caching the token for 5 minutes.
```python
# proxy.py:117-118
_oauth_cache: dict = {"token": None, "expires_at": 0.0, "fetched_at": 0.0}
_OAUTH_CACHE_TTL = 300 # Re-read from Keychain every 5 minutes
```
Cache invalidation triggers:
- TTL expired (5 minutes since last fetch)
- Token's `expiresAt` has passed
- Proxy restart
### WP-06: Auth Priority Chain
The proxy tries multiple auth sources in order for Anthropic-bound requests.
```
1. REAL_ANTHROPIC_API_KEY env var -> x-api-key header (explicit config)
2. Keychain OAuth token -> Authorization: Bearer + anthropic-beta
3. ~/.claude/.credentials.json -> Authorization: Bearer (plaintext fallback)
4. Forward client Authorization -> Pass through whatever Claude Code sent
5. No auth -> Will 401 (expected)
```
See `proxy.py:293-314` for the implementation.
### WP-07: count_tokens Endpoint Auth
The `/v1/messages/count_tokens` endpoint needs the same auth as `/v1/messages`. Claude Code calls this for preflight token counting. Missing auth here causes silent failures. Returns 501 for non-Anthropic providers (MiniMax doesn't support it).
### WP-08: Anthropic-Compatible Provider URLs
Third-party providers that support the Anthropic `/v1/messages` API format.
| Provider | Base URL | Notes |
| ----------------- | ---------------------------------- | ------------------------------------------------- |
| MiniMax highspeed | `https://api.minimax.io/anthropic` | Returns `base_resp` field, extra `thinking` block |
See [references/provider-compatibility.md](./references/provider-compatibility.md) for the full matrix.
### WP-09: Concurrency Semaphore
Per-provider rate limiting prevents overwhelming third-party APIs. No semaphore for Anthropic (they handle their own rate limiting).
```python
# proxy.py:207-209
MAX_CONCURRENT_REQUESTS = int(os.getenv("MAX_CONCURRENT_REQUESTS", "5"))
haiku_semaphore = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)
opus_semaphore = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)
sonnet_semaphore = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)
```
### WP-10: proxy-toggle Enable/Disable
The `proxy-toggle` script manages `.zshenv` entries and a flag file atomically.
```bash
~/.claude/bin/proxy-toggle enable # Adds env vars, creates flag file, checks health
~/.claude/bin/proxy-toggle disable # Removes env vars, removes flag file
~/.claude/bin/proxy-toggle status # Shows routing flag, proxy process, .zshenv state
```
**Important**: Claude Code must be restarted after toggling because `ANTHROPIC_BASE_URL` is read at startup.
### WP-11: Health Endpoint
The `/health` endpoint returns provider configuration state for monitoring.
```bash
curl -s http://127.0.0.1:8082/health | jq .
```
### WP-12: Go Proxy with Retry
Go proxy with built-in retry using `cenkalti/backoff/v4` (exponential backoff: 500ms -> 1s -> 2s, max 5s elapsed).
```go
import "github.com/cenkalti/backoff/v4"
backoffConfig := backoff.NewExponentialBackOff(
backoff.WithInitialInterval(500 * time.Millisecond),
backoff.WithMultiplier(2),
backoff.WithMaxInterval(2 * time.Second),
backoff.WithMaxElapsedTime(5 * time.Second),
)
err := backoff.Retry(operation, backoffConfig)
```
**Location**: `/usr/local/bin/claude-proxy` | **Environment**: `ANTHROPIC_BASE_URL=http://127.0.0.1:8082` in `.zshenv`
### WP-13: Launchd Service Configuration
The Go proxy runs as a macOS launchd daemon for auto-restart on crash and boot persistence.
**Plist**: `/Library/LaunchDaemons/com.terryli.claude-proxy.plist`
Full plist configuration, commands, verification checklist, and debugging: [references/launchd-configuration.md](./references/launchd-configuration.md)
### WP-14: OAuth Token Auto-Refresh
Background goroutine refreshes OAuth tokens every 30 minutes, 5 minutes before expiry. Falls back to Keychain if API refresh fails.
Full implementation and refresh logic: [references/oauth-auto-refresh.md](./references/oauth-auto-refresh.md)
---
## Anti-Patterns Summary
Full details with code examples: [references/anti-patterns.md](./references/anti-patterns.md)
| ID | Severity | Gotcha | Fix |
| ------ | -------- | ------------------------------------------------------ | ---------------------------------------------------------------------- |
| CCP-01 | HIGH | ANTHROPIC_BASE_URL alone without ANTHROPIC_API_KEY | Set `ANTHROPIC_API_KEY=proxy-managed` |
| CCP-02 | HIGH | Missing `anthropic-beta: oauth-2025-04-20` header | Append to existing beta headers |
| CCP-03 | MEDIUM | Using `/api/oauth/claude_cli/create_api_key` endpoint | Requires `org:create_api_key` scope (users only have `user:inference`) |
| CCP-04 | HIGH | Lowercase keychain service `"claude-code-credentials"` | Actual name has space: `"Claude Code-credentials"` |
| CCP-05 | MEDIUM | Reading `~/.claude/.credentials.json` as primary | Keychain is SSoT; credential file is stale fallback |
| CCP-06 | HIGH | Hardcoding OAuth tokens | Tokens expire; read dynamically with cache |
| CCP-07 | HIGH | Using `gh auth token` in proxy/hooks | Causes process storms (recursive spawning) |
| CCP-08 | HIGH | ANTHROPIC_API_KEY set in env while having OAuth token | Auth conflict warning in Claude Code; unset it |
| CCP-09 | MEDIUM | cache_control param sent to MiniMax | MiniMax doesn't support it; remove from allowedParams |
| CCP-10 | MEDIUM | Setting ANTHROPIC_API_KEY to real key while proxy runs | Proxy forwards it to all providers, leaking key |
| CCP-11 | MEDIUM | Not handling `/v1/messages/count_tokens` | Causes auth failures on preflight requests |
| CCP-12 | LOW | Running proxy on 0.0.0.0 | Bind to 127.0.0.1 for security |
---
## TodoWrite Task Templates
Setup, provider addition, diagnostics, and disable templates: [references/task-templates.md](./references/task-templates.md)
---
## Reference Implementation
The working production deployment (Go proxy is primary):
| File | Purpose |
| ------------------------------------------------------- | -------------------------------- |
| `/usr/local/bin/claude-proxy` | Go proxy binary (~960 lines) |
| `~/.claude/tools/claude-code-proxy-go/main.go` | Go proxy source |
| `~/.claude/tools/claude-code-proxy-go/oauth_refresh.go` | OAuth auto-refresh (80 lines) |
| `~/.claude/tools/claude-code-proxy-go/.env` | Provider config (chmod 600) |
| `/Library/LaunchDaemons/com.terryli.claude-proxy.plist` | launchd config |
| `~/.zshenv` | Environment (ANTHROPIC_BASE_URL) |
---
## Post-Change Checklist
After modifying this skill:
1. [ ] Anti-patterns table matches [references/anti-patterns.md](./references/anti-patterns.md)
2. [ ] Working patterns verified against proxy.py source
3. [ ] No hardcoded OAuth tokens in examples
4. [ ] Beta header version current (`oauth-2025-04-20`)
5. [ ] All internal links use relative paths (`./references/...`)
6. [ ] Link validator passes
7. [ ] Skill validator passes
8. [ ] Append changes to [references/evolution-log.md](./references/evolution-log.md)
---
## Troubleshooting
| Issue | Cause | Solution |
| -------------------------------------- | ---------------------------------------------- | ------------------------------------------------ |
| Claude Code ignores ANTHROPIC_BASE_URL | Missing ANTHROPIC_API_KEY (CCP-01) | Set `ANTHROPIC_API_KEY=proxy-managed` in .zshenv |
| 401 Unauthorized from Anthropic | Missing anthropic-beta header (CCP-02) | Ensure proxy appends `oauth-2025-04-20` |
| Keychain read returns empty | Wrong service name (CCP-04) | Use `"Claude Code-credentials"` (with space) |
| Proxy forwards real API key | ANTHROPIC_API_KEY set to real key (CCP-10) | Use `proxy-managed` sentinel value |
| count_tokens auth failure | Missing endpoint handler (CCP-11) | Proxy must handle `/v1/messages/count_tokens` |
| Proxy accessible from network | Bound to 0.0.0.0 (CCP-12) | Bind to 127.0.0.1 only |
| Process storms on enable | gh auth token in hooks (CCP-07) | Never call gh CLI from hooks/credential helpers |
| MiniMax returns wrong model name | MiniMax quirk | Cosmetic only; Claude Code handles it |
| Token expired after 5 min | Cache TTL (WP-05) | Normal behavior; proxy re-reads from Keychain |
| Auth conflict warning in Claude Code | ANTHROPIC_API_KEY set (CCP-08) | Unset ANTHROPIC_API_KEY in .zshenv |
| cache_control.ephemeral.scope error | MiniMax doesn't support cache_control (CCP-09) | Remove cache_control from allowedParams |
## Post-Execution Reflection
After this skill completes, check before closing:
1. **Did the command succeed?** — If not, fix the instruction or error table that caused the failure.
2. **Did parameters or output change?** — If the underlying tool's interface drifted, update Usage examples and Parameters table to match.
3. **Was a workaround needed?** — If you had to improvise (different flags, extra steps), update this SKILL.md so the next invocation doesn't need the same workaround.
Only update if the issue is real and reproducible — not speculative.
More from terrylica/cc-skills
- academic-pdf-to-gfmConvert academic PDF papers to GitHub-renderable GFM markdown with math equations. TRIGGERS - PDF, GitHub markdown, math
- adaptive-wfo-epochAdaptive epoch selection for Walk-Forward Optimization. TRIGGERS - WFO epoch, epoch selection, WFE optimization, overfitting epochs.
- adr-code-traceabilityAdd ADR references to code for traceability. TRIGGERS - ADR traceability, code reference, document decision in code.
- adr-graph-easy-architectASCII architecture diagrams for ADRs via graph-easy. TRIGGERS - ADR diagram, architecture diagram, ASCII diagram.
- agent-reach>
- agentic-process-monitorMonitor background processes from Claude Code using sentinel files, heartbeat liveness, and subagent polling. Best practices and.
- alpha-forge-preshipAlpha Forge quality gates for PR review - RNG determinism, URL validation, parameter validation, manifest sync.
- article-extractorExtract MQL5 articles and documentation. TRIGGERS - MQL5 articles, MetaTrader docs, mql5.com resources.
- ascii-diagram-validatorValidate ASCII diagram alignment in markdown. TRIGGERS - diagram alignment, ASCII art, box-drawing diagrams.
- asciinema-analyzerSemantic analysis of asciinema recordings. TRIGGERS - analyze cast, keyword extraction, find patterns in recordings.