More Amplifier Stories
Amplifier Foundation

Per-Agent Model Routing

The right model for every agent, automatically.

Shipped February 24, 2026

The Problem

Every agent was burning frontier tokens

  • One model fits all: Every delegated agent inherited the parent's model — usually Opus or the most expensive option available
  • Wasteful spend: Git commit formatters, file readers, and shell commands all running on frontier-class models
  • Rate-limit collisions: Parallel instances competing for the same model's quota, causing cascading throttling
  • Unnecessary latency: Simple tasks waiting for frontier-class response times when a fast budget model would suffice

The Solution

Four tiers. Automatic routing.

Tier | Agents | Model Examples
Budget | git-ops, file-ops, smoke-test, post-task-cleanup | claude-haiku-*, gpt-4o-mini
Strong | explorer, ecosystem-expert, foundation-expert, session-analyst, web-research, integration-specialist, test-coverage, security-guardian | claude-sonnet-*, gpt-5.[0-9]
Strong-Code | bug-hunter, modular-builder | claude-sonnet-*, gpt-5.[0-9]
Frontier | zen-architect | claude-opus-*, gpt-5-opus-*

Downgrade (git-ops → Haiku) ← Your Session Model (any tier) → Upgrade (zen-architect → Opus)

Works both directions — budget agents get downgraded and planning agents get upgraded

Under the Hood

How it works

Agent-Level Preferences
Each agent's .md frontmatter declares provider_preferences — an ordered list of provider + model glob pairs that define the agent's tier.
Glob Resolution
Patterns like claude-haiku-* and gpt-5.[0-9] auto-resolve to the newest matching model at runtime. Character classes catch only base models, not variants.
4-Provider Fallback Chain
Every tier defines a full fallback chain:
Anthropic → OpenAI → Google → GitHub Copilot.
If one provider is down, the next takes over seamlessly.
Override & Inheritance
Callers can always pass explicit provider_preferences at delegation time to override defaults. Derived bundles like amplifier-dev inherit automatically.
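
The precedence described above can be sketched in a few lines of Python. This is an illustration only, not the amplifier-foundation implementation: the AGENT_DEFAULTS dictionary, the "provider" and "model" field names, the provider keys, and the effective_preferences helper are all assumptions made for the example. The fallback to the session model for agents without declared preferences is also an assumption.

```python
# Hypothetical in-memory form of the preferences an agent's frontmatter
# might declare. Field names and provider keys are assumptions.
AGENT_DEFAULTS = {
    "git-ops": [
        {"provider": "anthropic", "model": "claude-haiku-*"},
        {"provider": "openai", "model": "gpt-4o-mini"},
    ],
    "zen-architect": [
        {"provider": "anthropic", "model": "claude-opus-*"},
    ],
}


def effective_preferences(agent, caller_preferences=None, session_preference=None):
    """Pick the preference list for one delegation.

    Order assumed here: explicit caller override, then the agent's own
    defaults, then the parent session's model.
    """
    if caller_preferences:
        return caller_preferences
    if agent in AGENT_DEFAULTS:
        return AGENT_DEFAULTS[agent]
    return [session_preference] if session_preference else []


# A caller override replaces the agent's default tier entirely.
override = [{"provider": "anthropic", "model": "claude-sonnet-*"}]
print(effective_preferences("git-ops"))
print(effective_preferences("git-ops", caller_preferences=override))
```

Keeping the override as a whole-list replacement, rather than a merge, makes the caller's intent unambiguous in this sketch.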

The Impact

Faster, cheaper, more parallel

  • 5–50×: Cost reduction on budget delegations
  • 3–10×: Faster responses from budget models
  • Split: Rate limits spread across model families
  • Parallel scaling unlocked

Haiku rate limits are separate from Opus and Sonnet. Multiple parallel instances no longer bottleneck on a single model tier's quota — the system naturally spreads load across providers and model families.

Verified in Production

Session 3ba3c845

Agent | Expected Tier | Model Used
git-ops | Budget | claude-haiku-4-5-20251001
amplifier-smoke-test | Budget | claude-haiku-4-5-20251001
explorer | Strong | claude-sonnet-4-6
ecosystem-expert | Strong | claude-sonnet-4-6
bug-hunter | Strong-Code | claude-sonnet-4-6
zen-architect | Frontier | claude-opus-4-6
self | Inherit | claude-opus-4-6

7/7 correct · 0 errors · All globs resolved
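
One way to re-check a session like this is to match each observed model against the globs of its expected tier. The sketch below does that with Python's standard fnmatch module, using the tier globs and session rows from this deck. The TIER_GLOBS and OBSERVED names and the mismatches helper are made up for the example, and the "self" row is left out because an inherited model has no tier glob to check against.

```python
from fnmatch import fnmatchcase

# Tier globs as listed in the tier table.
TIER_GLOBS = {
    "Budget": ["claude-haiku-*", "gpt-4o-mini"],
    "Strong": ["claude-sonnet-*", "gpt-5.[0-9]"],
    "Strong-Code": ["claude-sonnet-*", "gpt-5.[0-9]"],
    "Frontier": ["claude-opus-*", "gpt-5-opus-*"],
}

# (agent, expected tier, model used) rows from the session above.
OBSERVED = [
    ("git-ops", "Budget", "claude-haiku-4-5-20251001"),
    ("amplifier-smoke-test", "Budget", "claude-haiku-4-5-20251001"),
    ("explorer", "Strong", "claude-sonnet-4-6"),
    ("ecosystem-expert", "Strong", "claude-sonnet-4-6"),
    ("bug-hunter", "Strong-Code", "claude-sonnet-4-6"),
    ("zen-architect", "Frontier", "claude-opus-4-6"),
]


def mismatches(rows):
    """Return the rows whose model matches none of the expected tier's globs."""
    return [
        (agent, tier, model)
        for agent, tier, model in rows
        if not any(fnmatchcase(model, glob) for glob in TIER_GLOBS[tier])
    ]


print(mismatches(OBSERVED))  # []
```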

Roadmap

What's next

  • ⚙️ User-level preference customization — Reorder providers, redefine tiers, set org-wide defaults
  • 📦 External bundle agents — recipes, design-intelligence, python-dev, superpowers, and more
  • 📊 Observability — Enriched delegation events with resolved model info, cost-per-agent tracking
  • 🚀 App-level routing — Custom apps can set routing today (available now)

For the Curious

Technical details

🔧 amplifier-foundation

  • Bundle loader extraction for reuse across bundles
  • tool-delegate agent now reads default preferences
  • 16 agent .md files updated with provider_preferences
  • Derived bundles inherit via shared loader

🖥️ amplifier-app-cli

  • Fixed glob resolution in session_spawner
  • Sync → async fix for fnmatch against provider.list_models()
  • Models sorted descending — newest version wins

🎯 Glob Pattern Design

  • claude-haiku-* — matches any Haiku variant
  • gpt-5.[0-9] — character class catches only base flagship models, skipping variants like gpt-5.0-mini
  • gemini-2.5-flash* — covers flash family
  • Patterns auto-resolve at delegation time
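
The pattern behavior above can be reproduced with Python's standard fnmatch module. The following is a minimal sketch, not the session_spawner code: the resolve_glob helper is hypothetical, and the model list is invented for the example (only claude-haiku-4-5-20251001 appears in this deck). A plain descending string sort stands in for "newest version wins"; a real implementation may order versions differently.

```python
from fnmatch import fnmatchcase


def resolve_glob(pattern, available_models):
    """Return the highest-sorting model id that matches pattern, or None."""
    matches = [m for m in available_models if fnmatchcase(m, pattern)]
    # Descending lexicographic order as a stand-in for "newest first".
    matches.sort(reverse=True)
    return matches[0] if matches else None


# Illustrative catalog; most of these ids are invented for the example.
models = [
    "gpt-5.0",
    "gpt-5.1",
    "gpt-5.0-mini",
    "gpt-5.1-mini",
    "claude-haiku-4-0-20250101",
    "claude-haiku-4-5-20251001",
]

# [0-9] matches exactly one digit and the pattern must match the whole
# string, so the "-mini" variants are skipped.
print(resolve_glob("gpt-5.[0-9]", models))     # gpt-5.1
print(resolve_glob("claude-haiku-*", models))  # claude-haiku-4-5-20251001
print(resolve_glob("gemini-2.5-flash*", models))  # None
```

fnmatchcase is used instead of fnmatch so that matching stays case-sensitive on every platform.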

🔗 Fallback Chain

  • 4 providers per tier: Anthropic → OpenAI → Google → GitHub Copilot
  • Provider unavailability handled transparently
  • Caller override always takes precedence
  • No config changes needed for existing users
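
A fallback chain of this shape can be sketched as an ordered walk over (provider, glob) pairs that skips any provider with no usable models. Everything named here is an assumption for illustration: the BUDGET_CHAIN contents beyond the Anthropic and OpenAI globs shown earlier, the provider keys, the catalog structure, and the pick_model helper.

```python
from fnmatch import fnmatchcase

# Ordered provider + glob pairs for a budget-style tier. The Google and
# GitHub Copilot entries are guesses, not taken from the shipped config.
BUDGET_CHAIN = [
    ("anthropic", "claude-haiku-*"),
    ("openai", "gpt-4o-mini"),
    ("google", "gemini-2.5-flash*"),
    ("github-copilot", "claude-haiku-*"),
]


def pick_model(chain, catalog):
    """Return (provider, model) for the first pair that resolves, else None.

    catalog maps a provider key to its list of model ids; a missing key,
    None, or an empty list is treated as "provider unavailable".
    """
    for provider, pattern in chain:
        models = catalog.get(provider)
        if not models:
            continue  # provider down or not configured: try the next one
        matches = sorted(
            (m for m in models if fnmatchcase(m, pattern)), reverse=True
        )
        if matches:
            return provider, matches[0]
    return None


# Anthropic is unavailable here, so the OpenAI entry takes over.
catalog = {
    "anthropic": None,
    "openai": ["gpt-4o", "gpt-4o-mini"],
    "google": ["gemini-2.5-flash"],
}
print(pick_model(BUDGET_CHAIN, catalog))  # ('openai', 'gpt-4o-mini')
```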