Amplifier Foundation

Per-Agent Model Routing

The right model for every agent, automatically.

Shipped February 24, 2026

The Problem

Every agent was burning frontier tokens

One model fits all: Every delegated agent inherited the parent's model — usually Opus or the most expensive option available
Wasteful spend: Git commit formatters, file readers, and shell commands all running on frontier-class models
Rate-limit collisions: Parallel instances competing for the same model's quota, causing cascading throttling
Unnecessary latency: Simple tasks waiting for frontier-class response times when a fast budget model would suffice

The Solution

Four tiers. Automatic routing.

Tier	Agents	Model Examples
Budget	git-ops, file-ops, smoke-test, post-task-cleanup	claude-haiku-*, gpt-4o-mini
Strong	explorer, ecosystem-expert, foundation-expert, session-analyst, web-research, integration-specialist, test-coverage, security-guardian	claude-sonnet-*, gpt-5.[0-9]
Strong-Code	bug-hunter, modular-builder	claude-sonnet-*, gpt-5.[0-9]
Frontier	zen-architect	claude-opus-*, gpt-5-opus-*

Downgrade: git-ops → Haiku  ⟵  Your Session Model (any tier)  ⟶  Upgrade: zen-architect → Opus

Works both directions — budget agents get downgraded and planning agents get upgraded

Under the Hood

How it works

Agent-Level Preferences

Each agent's .md frontmatter declares provider_preferences — an ordered list of provider + model glob pairs that define the agent's tier.
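As a rough illustration, the ordered list described above could be modeled as follows. The field names `provider` and `model`, the provider identifiers, and the helper `first_choice` are assumptions made for this sketch; the actual frontmatter schema is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderPreference:
    """One entry in an agent's ordered preference list (hypothetical shape)."""
    provider: str  # assumed provider identifier, e.g. "anthropic"
    model: str     # model glob, e.g. "claude-haiku-*"


# Hypothetical budget-tier preferences for an agent such as git-ops.
# Order matters: earlier entries are tried first.
GIT_OPS_PREFERENCES = [
    ProviderPreference("anthropic", "claude-haiku-*"),
    ProviderPreference("openai", "gpt-4o-mini"),
]


def first_choice(prefs):
    """Return the highest-priority preference, or None if the list is empty."""
    return prefs[0] if prefs else None


print(first_choice(GIT_OPS_PREFERENCES))
```

The globs are the budget-tier examples from the tier table; everything else is illustrative.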

Glob Resolution

Patterns like claude-haiku-* and gpt-5.[0-9] auto-resolve to the newest matching model at runtime. Character classes catch only base models, not variants.
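A minimal sketch of this resolution step, using Python's standard `fnmatch` module. The function name `resolve_glob` and the model list are hypothetical, and "newest" is approximated here by a descending string sort, which is an assumption about how versions compare.

```python
from fnmatch import fnmatchcase


def resolve_glob(pattern, available_models):
    """Return the newest model matching `pattern`, or None if nothing matches."""
    matches = [m for m in available_models if fnmatchcase(m, pattern)]
    # Assumption: a descending sort of the names puts the newest version first.
    return sorted(matches, reverse=True)[0] if matches else None


# Illustrative model IDs; only claude-haiku-4-5-20251001 and gpt-5.0-mini
# are names that appear in this document.
MODELS = [
    "claude-haiku-4-0-20250101",
    "claude-haiku-4-5-20251001",
    "gpt-5.0",
    "gpt-5.1",
    "gpt-5.0-mini",
]

print(resolve_glob("claude-haiku-*", MODELS))
print(resolve_glob("gpt-5.[0-9]", MODELS))
```

Because `[0-9]` matches exactly one character and the pattern must match the whole name, `gpt-5.0-mini` is never a candidate for `gpt-5.[0-9]`.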

4-Provider Fallback Chain

Every tier defines a full fallback chain:
Anthropic → OpenAI → Google → GitHub Copilot.
If one provider is down, the next takes over seamlessly.
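The fallback behavior could look roughly like the sketch below. The provider keys, the `catalog` structure, and `pick_model` are hypothetical; the GitHub Copilot entry is left out because no Copilot pattern is given in this document.

```python
from fnmatch import fnmatchcase

# Budget-tier chain in the provider order listed above (Copilot omitted,
# since its glob is not shown in the source).
BUDGET_CHAIN = [
    ("anthropic", "claude-haiku-*"),
    ("openai", "gpt-4o-mini"),
    ("google", "gemini-2.5-flash*"),
]


def pick_model(chain, catalog):
    """Walk the chain in order and return the first (provider, model) that resolves.

    `catalog` maps a provider name to its model list, or to None when the
    provider is unavailable (an assumption made for this sketch).
    """
    for provider, pattern in chain:
        models = catalog.get(provider)
        if not models:  # provider down, or it returned no models
            continue
        matches = sorted((m for m in models if fnmatchcase(m, pattern)), reverse=True)
        if matches:
            return provider, matches[0]
    return None


# Anthropic is unavailable here, so the next provider in the chain is used.
print(pick_model(BUDGET_CHAIN, {"anthropic": None, "openai": ["gpt-4o-mini"]}))
```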

Override & Inheritance

Callers can always pass explicit provider_preferences at delegation time to override defaults. Derived bundles like amplifier-dev inherit automatically.
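The precedence rule can be sketched in a few lines. The function `effective_preferences` and the tuple representation are assumptions for illustration, not the actual delegation API.

```python
def effective_preferences(agent_defaults, caller_override=None):
    """Caller-supplied preferences win; otherwise fall back to the agent's defaults."""
    return caller_override if caller_override is not None else agent_defaults


# Hypothetical defaults for a budget agent, as (provider, model glob) pairs.
DEFAULTS = [("anthropic", "claude-haiku-*"), ("openai", "gpt-4o-mini")]

# No override: the agent's declared tier applies.
print(effective_preferences(DEFAULTS))

# Explicit override at delegation time: the caller's list replaces the defaults.
print(effective_preferences(DEFAULTS, [("anthropic", "claude-sonnet-*")]))
```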

The Impact

Faster, cheaper, more parallel

5–50× cost reduction on budget delegations
3–10× faster responses from budget models
Split: rate limits spread across model families
∞ parallel scaling unlocked

Haiku rate limits are separate from Opus and Sonnet. Multiple parallel instances no longer bottleneck on a single model tier's quota — the system naturally spreads load across providers and model families.

Verified in Production

Session 3ba3c845

Agent	Expected Tier	Model Used
git-ops	Budget	claude-haiku-4-5-20251001
amplifier-smoke-test	Budget	claude-haiku-4-5-20251001
explorer	Strong	claude-sonnet-4-6
ecosystem-expert	Strong	claude-sonnet-4-6
bug-hunter	Strong-Code	claude-sonnet-4-6
zen-architect	Frontier	claude-opus-4-6
self	Inherit	claude-opus-4-6

✓ 7/7 correct · 0 errors · All globs resolved
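The tally above can be reproduced with a short check that matches each observed model against its tier's Anthropic glob. This is only a sketch: it assumes the Frontier pattern is `claude-opus-*` and that the session itself ran on `claude-opus-4-6`, as the `self` row suggests.

```python
from fnmatch import fnmatchcase

# Anthropic globs per tier, taken from the tier table. "claude-opus-*" assumes
# the Frontier pattern ends in a wildcard like the other tiers.
TIER_GLOBS = {
    "Budget": "claude-haiku-*",
    "Strong": "claude-sonnet-*",
    "Strong-Code": "claude-sonnet-*",
    "Frontier": "claude-opus-*",
}

# Assumption: the parent session's own model.
SESSION_MODEL = "claude-opus-4-6"

# Rows copied from the session table: (agent, expected tier, model used).
OBSERVED = [
    ("git-ops", "Budget", "claude-haiku-4-5-20251001"),
    ("amplifier-smoke-test", "Budget", "claude-haiku-4-5-20251001"),
    ("explorer", "Strong", "claude-sonnet-4-6"),
    ("ecosystem-expert", "Strong", "claude-sonnet-4-6"),
    ("bug-hunter", "Strong-Code", "claude-sonnet-4-6"),
    ("zen-architect", "Frontier", "claude-opus-4-6"),
    ("self", "Inherit", "claude-opus-4-6"),
]


def row_ok(tier, model):
    """True when the model used is consistent with the expected tier."""
    if tier == "Inherit":
        return model == SESSION_MODEL
    return fnmatchcase(model, TIER_GLOBS[tier])


correct = sum(row_ok(tier, model) for _, tier, model in OBSERVED)
print(f"{correct}/{len(OBSERVED)} correct")
```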

Roadmap

What's next

⚙️ User-level preference customization: Reorder providers, redefine tiers, set org-wide defaults
📦 External bundle agents: recipes, design-intelligence, python-dev, superpowers, and more
📊 Observability: Enriched delegation events with resolved model info, cost-per-agent tracking
🚀 App-level routing: Custom apps can set routing today (available now)

For the Curious

Technical details

🔧 amplifier-foundation

Bundle loader extraction for reuse across bundles
tool-delegate agent now reads default preferences
16 agent .md files updated with provider_preferences
Derived bundles inherit via shared loader

🖥️ amplifier-app-cli

Fixed glob resolution in session_spawner
Sync → async fix for fnmatch against provider.list_models()
Models sorted descending — newest version wins
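One plausible reading of the sync-to-async fix is that `provider.list_models()` returns a coroutine that has to be awaited before `fnmatch` can run over the results. The sketch below illustrates that reading only; `FakeProvider` and `resolve` are stand-ins, and the real `list_models()` signature is not shown in this document.

```python
import asyncio
from fnmatch import fnmatchcase


class FakeProvider:
    """Stand-in provider whose list_models() is a coroutine (an assumption)."""

    def __init__(self, models):
        self._models = models

    async def list_models(self):
        await asyncio.sleep(0)  # simulate an I/O round trip
        return list(self._models)


async def resolve(provider, pattern):
    """Await the model list, filter by glob, and return the newest match."""
    # Without `await`, `models` would be a coroutine object, not a list.
    models = await provider.list_models()
    # Descending sort so the newest version comes first (string ordering assumed).
    matches = sorted((m for m in models if fnmatchcase(m, pattern)), reverse=True)
    return matches[0] if matches else None


# "claude-sonnet-4-5" is an illustrative ID; "claude-sonnet-4-6" appears in the session table.
provider = FakeProvider(["claude-sonnet-4-5", "claude-sonnet-4-6", "claude-haiku-4-5-20251001"])
print(asyncio.run(resolve(provider, "claude-sonnet-*")))
```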

🎯 Glob Pattern Design

claude-haiku-* — matches any Haiku variant
gpt-5.[0-9] — character class catches only base flagship models, skipping variants like gpt-5.0-mini
gemini-2.5-flash* — covers flash family
Patterns auto-resolve at delegation time

🔗 Fallback Chain

4 providers per tier: Anthropic → OpenAI → Google → GitHub Copilot
Provider unavailability handled transparently
Caller override always takes precedence
No config changes needed for existing users
