● Active — Shipped 2026-03-13
Recipe Validation Pipeline
Catch broken recipes before they reach production.
Three PRs. One session. Zero regressions.
PR #49 · amplifier-bundle-recipes
PR #50 · amplifier-bundle-recipes
PR #126 · amplifier-foundation
The Problem
Broken recipes fail silently.
🔕
No pre-merge quality gate
Structural errors, undefined variables, bad agent namespaces — all undetectable without running a recipe end-to-end. Issues only surface in production.
🐛
Validator blind spot
The recipe engine's validator tracked output and collect as variable producers but missed output_exit_code — a gap since the original bash step implementation in Dec 2025.
📋
No consistent standards
No enforcement of naming conventions, timeout requirements, prompt quality standards, or versioning — leaving best practices as tribal knowledge.
Result: A correct recipe could trip a false-positive validation error while a genuinely broken one slipped through, and there was no systematic way to catch either before merge.
Today's Delivery
Three PRs. One complete system.
Validator Bug Fix
amplifier-bundle-recipes — Fixed the output_exit_code blind spot in validator.py. 3-line fix + 2 regression tests. Full TDD red-green cycle. 397 existing tests continue passing.
validate-recipes.yaml — The Pipeline
amplifier-bundle-recipes — 7-phase recipe validation pipeline. 1,671 lines of YAML + embedded Python. 117 new tests across 6 test files and 3 test fixtures. Parallel validation battery (Phases 2–4 run concurrently).
Foundation Integration
amplifier-foundation — Integrated into validate-bundle-repo.yaml v3.3.0 as new Phase 2.7. Opt-in via validate_recipes: "true" or validate_all: "true".
PR #49 — amplifier-bundle-recipes
Closing the output_exit_code gap
Root Cause
The validator tracked output and collect as variable-producing keys in bash steps — but never added output_exit_code, which was introduced alongside bash step branching in Dec 2025.
Any recipe using output_exit_code for exit-code branching would receive a false-positive "variable not defined" error on validation.
Fix
3-line addition to validator.py to include output_exit_code alongside the existing variable producers.
2 regression tests with full TDD red-green cycle — test written first (red), fix applied (green), confirmed no regressions in the 397-test suite.
# Before
variable_producers = [
    "output",
    "collect",
]
# After
variable_producers = [
    "output",
    "collect",
    "output_exit_code",
]
Impact
- Recipes using exit-code branching validate correctly
- 397 existing tests: all passing ✓
- 2 new regression tests added
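To make the false positive concrete, here is a minimal sketch of a producer-tracking check. It is not the code in validator.py: the function name, the step dictionaries, and the {{name}} reference syntax are all assumptions made for illustration. Only the three producer keys come from the fix itself.

```python
import re

# Keys on a step that define a new variable (names taken from the fix above).
VARIABLE_PRODUCERS = ["output", "collect", "output_exit_code"]

# Assumption: variables are referenced with a {{name}} template syntax.
_REFERENCE = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def find_undefined_variables(steps, producers=VARIABLE_PRODUCERS):
    """Return (step_index, variable_name) for each reference made before definition."""
    defined = set()
    problems = []
    for index, step in enumerate(steps):
        # Scan every string field that is not itself a producer key.
        for key, value in step.items():
            if key in producers or not isinstance(value, str):
                continue
            for name in _REFERENCE.findall(value):
                if name not in defined:
                    problems.append((index, name))
        # Variables produced by this step become visible to later steps.
        for key in producers:
            if isinstance(step.get(key), str):
                defined.add(step[key])
    return problems


# Hypothetical recipe: step 0 captures an exit code, step 1 branches on it.
recipe_steps = [
    {"type": "bash", "command": "make test", "output": "log", "output_exit_code": "rc"},
    {"type": "bash", "command": "echo done", "condition": "{{rc}} == 0"},
]

# Pre-fix producer list: `rc` is reported as undefined (the false positive).
before = find_undefined_variables(recipe_steps, producers=["output", "collect"])
# Post-fix producer list: the same recipe validates cleanly.
after = find_undefined_variables(recipe_steps)
```

A regression test in the red-green style would assert that `before` flags `rc` and that `after` is empty.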
PR #50 — validate-recipes.yaml
7-phase validation pipeline
Phase 0
Environment Check
Detect tooling availability, graceful degradation — works with or without optional deps
→
Phase 1
Recipe Discovery
Recursive scan, filter to actual recipe files, build inventory for subsequent phases
↓ FAN OUT — PARALLEL EXECUTION ↓
Phase 2
Structural Validation
Engine's validate_recipe() + 15 gap checks the engine misses
Phase 3
Best Practices
Naming conventions, timeouts, error handling, prompt quality, versioning
Phase 4
Semantic Validation
Agent namespace resolution, sub-recipe paths, variable flow, condition syntax, loop convergence, cross-recipe consistency
↓ FAN IN — RESULTS AGGREGATED ↓
Phase 5
Quality Classification
Deterministic: critical / needs_work / polish / good
→
Phase 6
Conditional Report
Bash quick-pass for clean repos · LLM synthesis gate for repos with issues
1,671 lines YAML + Python
117 tests · 6 test files · 3 fixtures
Phases 2–4 run in parallel
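Phase 1's "recursive scan, filter to actual recipe files" can be pictured with the sketch below. The filter rule is a stand-in: treating any YAML file with a top-level `steps:` key as a recipe is an assumption for illustration, and the real pipeline may use different criteria. The function names are hypothetical.

```python
import tempfile
from pathlib import Path

YAML_SUFFIXES = {".yaml", ".yml"}


def looks_like_recipe(path):
    """Assumed heuristic: a recipe is a YAML file with a top-level `steps:` key."""
    try:
        text = path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError):
        return False
    return any(line.startswith("steps:") for line in text.splitlines())


def discover_recipes(repo_path):
    """Recursively collect YAML files under repo_path that pass the recipe filter."""
    root = Path(repo_path)
    candidates = sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in YAML_SUFFIXES
    )
    return [p for p in candidates if looks_like_recipe(p)]


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "recipes" / "nested").mkdir(parents=True)
    (root / "recipes" / "nested" / "build.yaml").write_text(
        "name: build\nsteps:\n  - type: bash\n", encoding="utf-8"
    )
    # A YAML file that is not a recipe and should be filtered out.
    (root / "config.yaml").write_text("logging: debug\n", encoding="utf-8")
    found = [p.relative_to(root).as_posix() for p in discover_recipes(tmp)]
```

The resulting inventory is what the later phases would iterate over.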
Architecture
validate-recipes.yaml — Pipeline Flow
Parallel validation battery (Phases 2–4) fans out from discovery and fans in to quality classification. LLM synthesis is gated — only invoked when issues are detected.
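The fan-out/fan-in shape can be sketched in a few lines. This is an illustration of the pattern only, not how the recipe engine schedules parallel steps: the three validator functions are placeholders and their findings are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


# Placeholder validators: each takes the recipe inventory and returns findings.
def structural(recipes):
    return [f"{name}: missing required field" for name in recipes if "broken" in name]


def best_practices(recipes):
    return [f"{name}: no timeout set" for name in recipes if "slow" in name]


def semantic(recipes):
    return []


def run_battery(recipes, validators):
    """Fan out: run every validator concurrently. Fan in: collect results by name."""
    with ThreadPoolExecutor(max_workers=len(validators)) as pool:
        futures = {name: pool.submit(check, recipes) for name, check in validators.items()}
        return {name: future.result() for name, future in futures.items()}


results = run_battery(
    ["ok.yaml", "broken.yaml", "slow.yaml"],
    {"structural": structural, "best_practices": best_practices, "semantic": semantic},
)

# Aggregated findings are what the classification phase would consume.
all_findings = [finding for name in sorted(results) for finding in results[name]]
```

Because the validators share only the read-only inventory, they can run in any order without coordination, which is what makes the fan-out safe.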
The Parallel Validation Battery
Three validators. Running concurrently.
🏗️
Phase 2 — Structural
- Engine's built-in validate_recipe()
- +15 additional gap checks the engine misses
- Required field completeness
- Step type validity
- Variable producer/consumer consistency
- Loop structure integrity
✅
Phase 3 — Best Practices
- Naming conventions enforcement
- Timeout requirements on long ops
- Error handling coverage
- Prompt quality standards
- Versioning conventions
- Documentation completeness
🧠
Phase 4 — Semantic
- Agent namespace resolution (local + configurable external patterns)
- Sub-recipe path resolution
- Variable flow analysis
- Condition syntax parsing
- Loop convergence checks
- Cross-recipe consistency
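As one example from the semantic phase, agent namespace resolution against "local + configurable external patterns" might look like the sketch below. The `namespace:name` reference format, the glob-style pattern syntax, and every agent name except recipe-author are assumptions for illustration.

```python
from fnmatch import fnmatchcase


def resolve_agent(reference, local_agents, external_patterns):
    """Classify an agent reference as local, external, or unresolved."""
    if reference in local_agents:
        return "local"
    # External namespaces are accepted by pattern, not by exact listing.
    if any(fnmatchcase(reference, pattern) for pattern in external_patterns):
        return "external"
    return "unresolved"


# Hypothetical inventory and configuration.
local_agents = {"recipes:recipe-author"}
external_patterns = ["foundation:*"]

statuses = {
    ref: resolve_agent(ref, local_agents, external_patterns)
    for ref in ["recipes:recipe-author", "foundation:explorer", "recipies:recipe-author"]
}
```

The misspelled `recipies:` namespace comes back as unresolved, which is the kind of error that otherwise only appears when the recipe runs.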
Design principle: Recipe validation is fundamentally more automatable than agent validation because most checks are deterministic bash steps. The LLM is invoked only for report synthesis when issues are detected, keeping the happy path fast and cheap.
Design Decisions
Built with domain expertise baked in.
🤝
Recipe-Author Agent as Domain Expert
The recipe-author agent was consulted as authoritative source and provided a 50+ item validation checklist — the foundation for everything Phases 2–4 check. Domain expertise encoded into deterministic rules, not ad-hoc prompting.
⚡
LLM as Synthesis Gate, Not Primary Validator
Phase 6 forks: bash quick-pass for clean repos (fast, zero LLM cost), LLM synthesis only when Phases 2–4 surface issues. Clean repos never pay the LLM tax.
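The gate reduces to a single branch. In this sketch the report structure and the `synthesize` callable are hypothetical stand-ins. In the actual recipe the two paths are a bash step and an LLM step.

```python
def build_report(findings, synthesize):
    """Return a quick-pass report for clean repos; call the synthesizer otherwise."""
    if not findings:
        # Clean repo: no model call is made on this path.
        return {"mode": "quick-pass", "body": "All recipes passed validation."}
    return {"mode": "llm-synthesis", "body": synthesize(findings)}


calls = []


def fake_synthesize(findings):
    """Stand-in for the LLM step; records that it was invoked."""
    calls.append(len(findings))
    return f"{len(findings)} issue(s) need attention."


clean_report = build_report([], fake_synthesize)
calls_after_clean = list(calls)  # snapshot: still empty, the gate stayed closed
issue_report = build_report(["bad agent namespace"], fake_synthesize)
```

Keeping the decision outside the synthesizer means the cost of the expensive path is only paid when there is something to explain.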
🔧
Graceful Degradation by Design
Phase 0 detects available tooling before any work begins. Missing optional dependencies trigger clean fallbacks — not failures. Runs in minimal environments without modification.
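A tooling probe of this kind can be written with the standard library alone. The sketch below is an assumption about the general approach, not the Phase 0 implementation, and the command and module names passed in are examples only.

```python
import importlib.util
import shutil


def detect_environment(commands, modules):
    """Report which optional commands and top-level Python modules are available."""
    return {
        "commands": {name: shutil.which(name) is not None for name in commands},
        "modules": {name: importlib.util.find_spec(name) is not None for name in modules},
    }


def choose_mode(environment):
    """Fall back to a reduced check set instead of failing when something is missing."""
    available = list(environment["commands"].values()) + list(environment["modules"].values())
    return "full" if all(available) else "degraded"


env = detect_environment(
    commands=["definitely-not-a-real-command-xyz"],
    modules=["json", "definitely_not_a_real_module_xyz"],
)
mode = choose_mode(env)
```

Later phases can then consult the probe result and skip only the checks that depend on a missing tool.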
📊
Deterministic Classification
Phase 5 quality tiers — critical / needs_work / polish / good — are computed algorithmically from structured findings, not LLM opinion. Reproducible, auditable, CI-friendly.
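A rule-based classifier in this spirit fits in a dozen lines. The four tier names come from the pipeline. The severity labels and the "worst finding wins" rule are assumptions for illustration, since the actual thresholds are not described here.

```python
# Tiers from worst to best, as named by the pipeline.
TIER_ORDER = ["critical", "needs_work", "polish", "good"]

# Hypothetical mapping from a finding's severity to the tier it forces.
SEVERITY_TO_TIER = {"error": "critical", "warning": "needs_work", "suggestion": "polish"}


def classify(findings):
    """Return the worst tier implied by any finding, or 'good' when there are none."""
    tiers = {SEVERITY_TO_TIER[finding["severity"]] for finding in findings}
    for tier in TIER_ORDER:
        if tier in tiers:
            return tier
    return "good"


tier_clean = classify([])
tier_mixed = classify(
    [
        {"severity": "suggestion", "message": "add a description"},
        {"severity": "warning", "message": "no timeout on long step"},
    ]
)
```

The same findings always yield the same tier, which is what makes the result usable as a CI gate.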
PR #126 — amplifier-foundation
Integrated into validate-bundle-repo v3.3.0
New Phase 2.7
validate-bundle-repo.yaml now calls validate-recipes.yaml as a sub-recipe at Phase 2.7 — slotted between agent validation and quality classification.
Results feed into the downstream quality classification and synthesize-report phases automatically.
Opt-In Flags
validate_recipes: "true"
validate_all: "true"
Both flags default to "false" — existing validate-bundle-repo runs are unaffected until teams opt in.
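The opt-in logic amounts to the following check. This is a sketch with a hypothetical function name. The real gate is a condition inside validate-bundle-repo.yaml, and the only details taken from the source are the two flag names and their string-valued "true"/"false" convention.

```python
def should_validate_recipes(context):
    """Phase 2.7 runs only when either flag is the string "true"."""
    return (
        context.get("validate_recipes", "false") == "true"
        or context.get("validate_all", "false") == "true"
    )


default_run = should_validate_recipes({"repo_path": "/path/to/repo"})
explicit_run = should_validate_recipes({"repo_path": "/path/to/repo", "validate_recipes": "true"})
umbrella_run = should_validate_recipes({"repo_path": "/path/to/repo", "validate_all": "true"})
```

Note that the flags are strings, so a boolean `True` would not satisfy this particular comparison.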
Updated Components
- Quality classification — incorporates recipe validation findings alongside agent validation results
- Synthesize-report — unified output covering both agents and recipes in one report
- Phase ordering — recipe validation slotted at 2.7, before final synthesis
Zero Breaking Changes
v3.3.0 is fully backward-compatible. Teams using validate-bundle-repo today see no change to behavior or output until they explicitly set validate_recipes or validate_all to "true".
The Full Picture
Bundle Validation Ecosystem — All Three Recipes
validate-bundle-repo.yaml orchestrates both validate-agents.yaml (existing) and the new validate-recipes.yaml (Phase 2.7) as sub-recipes, producing a unified validation report.
Developer Experience
One command to validate everything.
Validate recipes in a repo directly
amplifier tool invoke recipes operation=execute \
recipe_path=recipes:recipes/validate-recipes.yaml \
context='{"repo_path": "/path/to/repo"}'
Full bundle validation — agents + recipes together
amplifier tool invoke recipes operation=execute \
recipe_path=foundation:recipes/validate-bundle-repo.yaml \
context='{"repo_path": "/path/to/repo", "validate_all": "true"}'
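For CI scripts, the same invocation can be assembled programmatically so the JSON context is never hand-quoted. The helper below is hypothetical. It only builds the argument list from the commands shown above and does not run anything.

```python
import json
import shlex


def build_validate_command(recipe_path, repo_path, validate_all=False):
    """Assemble the argv for the invocation shown above (nothing is executed)."""
    context = {"repo_path": repo_path}
    if validate_all:
        context["validate_all"] = "true"  # flags are passed as strings
    return [
        "amplifier", "tool", "invoke", "recipes",
        "operation=execute",
        f"recipe_path={recipe_path}",
        f"context={json.dumps(context)}",
    ]


argv = build_validate_command(
    "foundation:recipes/validate-bundle-repo.yaml", "/path/to/repo", validate_all=True
)
# A shell-safe string form, useful for logging or copy-paste.
command_line = shlex.join(argv)
```

Passing `argv` as a list to a process runner avoids shell quoting issues entirely, while `command_line` is convenient for display.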
Output: Clean repo
Bash quick-pass report. No LLM invoked. Fast, free, deterministic.
Output: Issues found
LLM synthesis gate activates. Structured findings with severity, file paths, recommended fixes.
Quality tiers
critical · needs_work · polish · good
Impact
Shipped in one session.
3
PRs Merged
#49 · #50 · #126
~4,200
Lines of New Code
YAML + Python
117
New Tests
validate-recipes · 6 files · 3 fixtures
397
Existing Tests
Recipe engine · all passing
0
Regressions
Clean merge across both repos
Full brainstorm → design → plan → implement → verify → merge lifecycle completed in a single Amplifier session.
Sources & Methodology
How this deck was built
Methodology Notes
- ~4,200 lines is the author's estimate; exact count not independently verified
- Test counts (117 new, 397 existing) are as reported by the test runner at merge time
- "One session" reflects the author's account — no independent clock measurement
- "3-line fix" refers to the core logic addition, not including surrounding whitespace or test lines
- Dec 2025 gap date is author-reported, not independently verified from git blame
Data as of: 2026-03-13 ·
Context provided by: PR author / session participant ·
External research: None (all data from direct session context) ·
Feature status: Active — PRs merged to main branches
Try It
Validate your recipes today.
One command. 7 phases. Zero guesswork.
recipes:recipes/validate-recipes.yaml
validate_all: "true"
amplifier-bundle-recipes · amplifier-foundation · Shipped 2026-03-13