Adding expert observation to automated validation
We needed a way to measure the gap between automated validation and expert observation.
The DI bundle includes 7 specialized agents. Two were selected as observers for this experiment based on their relevance to document quality.
| Dimension | v0 | v3 | Change |
|---|---|---|---|
| Voice Consistency | 8 | 8 | — |
| Clarity | 9 | 8 | ↓1 |
| Readability | 8 | 7 | ↓1 |
| Information Architecture | 8 | 8 | — |
| Content Flow | 7 | 8 | ↑1 |
| Navigation | 7 | 9 | ↑2 |
Scores are from a single document-generation run and are not statistically validated across multiple documents; they illustrate the kind of insight DI feedback provides, and additional documents may surface different patterns. This is a promising direction, not a result proven at scale.
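The per-dimension changes in the table above can be computed mechanically. A minimal sketch (the dimension names and scores are taken from the table; the comparison code itself is illustrative, not part of the DI tooling):

```python
# Hypothetical sketch: per-dimension deltas between two validation passes.
# Scores mirror the v0/v3 table above.
v0 = {"Voice Consistency": 8, "Clarity": 9, "Readability": 8,
      "Information Architecture": 8, "Content Flow": 7, "Navigation": 7}
v3 = {"Voice Consistency": 8, "Clarity": 8, "Readability": 7,
      "Information Architecture": 8, "Content Flow": 8, "Navigation": 9}

# Positive delta means v3 scored higher on that dimension.
deltas = {dim: v3[dim] - v0[dim] for dim in v0}
net = sum(deltas.values())  # overall movement across all dimensions
```

With the table's values, Navigation gains the most (+2) while Clarity and Readability each drop one point, so the net movement is small even when individual dimensions shift noticeably.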
Data as of: February 20, 2026
Feature status: Experimental
Research performed: one document-generation run observed by two DI agents.
Gaps: scores come from a single run and are not statistically validated across multiple documents; results are illustrative, not definitive.
Primary contributor: Brian Krabach (100% of visible commits)
Consider incorporating DI insights into future outline refinement, addressing issues before generation rather than only observing them after.