A debugging journey: 5 recipe versions, 3 bugs found, 1 fundamental insight about AI-assisted development.
When regenerating documentation, users want to keep their existing content intact - only updating sections that need changes.
"When I provide an existing document, preserve sections that don't need changes. Only regenerate what's actually outdated."
11 carefully written sections with code examples, commands, and formatting - losing any of it means manual recovery.
The LLM "preserved" content by summarizing it. Code blocks were paraphrased. Commands were reformatted. Nothing was verbatim.
Lesson: Asking an LLM to "copy exactly" doesn't mean it will.
Don't ask the LLM to preserve - use bash/Python to copy sections directly.
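The idea can be sketched in a few lines of Python. This is a minimal illustration, not the recipe's actual code; the function and section names are hypothetical. Preserved sections are copied byte-for-byte from the original, and only regenerated sections come from the LLM:

```python
def merge_documents(original_sections: dict[str, str],
                    generated_sections: dict[str, str],
                    preserved: set[str]) -> str:
    """Rebuild the document: preserved sections come verbatim from the
    original; everything else comes from the LLM's output."""
    parts = []
    for name, body in generated_sections.items():
        if name in preserved:
            parts.append(original_sections[name])  # byte-for-byte copy
        else:
            parts.append(body)
    return "".join(parts)
```

Because the preserved bodies never pass through the model, they cannot be paraphrased.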
Original: 6,293 bytes
Output: 5,340 bytes
❌ Lost ~950 bytes somewhere
Content extraction was truncating sections. But why?
Time to investigate...
The markdown parser wasn't tracking whether it was inside a code fence. Every # comment inside a code block was treated as a new section heading, so sections were truncated at the first fenced comment.
Section 1.2.1 content:
Before fix: 99 characters (truncated)
After fix: 737 characters (complete)
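The fix amounts to one extra piece of state in the splitter. A minimal sketch (hypothetical names, not the recipe's actual parser): toggle in_code_block on fence lines and only treat # as a heading outside fences:

```python
def split_sections(markdown: str) -> dict[str, str]:
    """Split markdown into sections, ignoring '#' lines inside code fences."""
    sections: dict[str, str] = {}
    heading, body = "_preamble", []
    in_code_block = False  # the state the buggy parser was missing
    for line in markdown.splitlines(keepends=True):
        if line.lstrip().startswith("```"):
            in_code_block = not in_code_block
        if line.startswith("#") and not in_code_block:
            sections[heading] = "".join(body)  # close the previous section
            heading, body = line.strip(), []
        else:
            body.append(line)
    sections[heading] = "".join(body)
    return sections
```

Without the in_code_block flag, the "# not a heading" comment inside a fence would start a bogus new section, exactly the truncation seen above.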
Original: 6,293 bytes
Output: 7,211 bytes
✅ All content preserved + new sections added
Only 2 additions (intro paragraphs for empty parent sections).
Zero modifications to preserved content!
We told the LLM: "These sections are preserved - don't modify them during validation fixes."
The LLM: "Sure!" *proceeds to rewrite them anyway*
LLMs cannot reliably follow "skip these sections" instructions. They will still touch, modify, or "improve" content they were told to leave alone.
Don't trust the LLM to skip. Instead: let it do its thing, then restore preserved sections with code.
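A post-processing restore step might look like the sketch below (illustrative only; names are hypothetical). It overwrites the preserved sections with the originals and reports which ones the LLM touched despite being told not to:

```python
def restore_preserved(llm_sections: dict[str, str],
                      original_sections: dict[str, str],
                      preserved: set[str]) -> tuple[dict[str, str], list[str]]:
    """After the LLM pass, force preserved sections back to their original
    bytes and report which ones the LLM modified anyway."""
    violations = [name for name in preserved
                  if llm_sections.get(name) != original_sections[name]]
    restored = {**llm_sections,
                **{name: original_sections[name] for name in preserved}}
    return restored, violations
```

The violations list is useful telemetry: it shows how often the "skip these sections" instruction was actually ignored.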
Validated by recipe-results-validator:
All 11 preserved sections are byte-for-byte identical to the original.
LLMs are powerful for generation and analysis. But for tasks requiring exact reproduction, byte-level accuracy, or strict constraints - use deterministic code.
Wrap LLM operations with deterministic code. Pre-process inputs, post-process outputs.
Don't ask LLMs to skip things. Let them work, then restore what shouldn't change.
Keep track of context (like in_code_block) that changes how content should be parsed.
Don't trust "looks right" - verify with checksums, diffs, or exact byte comparisons.
Each fix reveals new issues. Budget time for multiple passes (5 versions in this case).
When determinism matters, a 10-line Python function beats a 100-word prompt.
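The "verify with checksums" lesson is cheap to act on. A minimal sketch, assuming sections are held as dicts (function names hypothetical): hash each preserved section before and after, and fail loudly on any mismatch:

```python
import hashlib

def section_digest(text: str) -> str:
    """SHA-256 of a section's exact bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def verify_preserved(original: dict[str, str],
                     final: dict[str, str],
                     preserved: set[str]) -> bool:
    """True only if every preserved section is byte-for-byte identical."""
    return all(section_digest(original[name]) == section_digest(final[name])
               for name in preserved)
```

"Looks right" can survive a skim; a digest mismatch cannot.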
This deck documents a debugging session on the document-generation-parallel.yaml recipe within the Amplifier ecosystem.
Data as of February 20, 2026
The next time you need an LLM to "preserve" or "skip" something, consider: can you enforce that with code instead?
document-generation-parallel.yaml v7.5.0
Available via session-analyst agent
The debugging session that inspired this deck:
~4 hours of iterative problem-solving