More Amplifier Stories
amplifier-bundle-ui-vision

Teaching AI
to See

When an AI agent built a feature but couldn't see
what it built, it created its own eyes.

February 12-13, 2026

Act I

The Vision

The Canvas team set out to build an Artifact Viewer - a full-featured file browser for AI-generated code. The spec was meticulous.

1,021
Lines of Spec
25
Screenshots Studied
17
HTML Mockups
21
Spec Sections
CANVAS-ARTIFACT-VIEWER-SPEC.md
25 Kepler desktop screenshots analyzed. 17 interactive HTML mockups prototyped. 21 sections covering every detail from file tree styling to VS Code-inspired badges. A 10-Act acceptance test scripting the exact verification sequence.
Act II

The Autonomous Build

A 528-line implementation prompt was handed to an Amplifier session. What followed was 4 hours of fully autonomous development.

Component              Lines
ArtifactViewer.tsx       617
useArtifactStore.ts      462
CodeView.tsx             272
PreviewView.tsx          255
FileList.tsx             227
+ 5 more files           273
2,106
Total Lines Written
10
React Components
~4h
Autonomous Dev Time
The Output

A Complete Feature,
Built Without Human Touch

ArtifactViewer.tsx
617-line main popup shell with full-screen mode, keyboard shortcuts, and responsive layout
useArtifactStore.ts
462-line Zustand state management with file registry, version history, and selection logic
CodeView.tsx
PrismJS syntax highlighting with line numbers and support for 15+ languages
PreviewView.tsx
Multi-strategy HTML/image preview with iframe sandboxing and asset inlining
FileList.tsx
VS Code-style tree navigator with colored file-type badges and expand/collapse
SSE Detection
Real-time artifact detection from streaming tool_call events via Server-Sent Events
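The SSE detection step above can be sketched roughly as follows. This is an illustrative parser only: the `tool_call` event type and `arguments.path` field are hypothetical placeholders for whatever Canvas actually emits on the stream.

```python
import json

def detect_artifacts(sse_lines):
    """Yield file paths announced by streaming tool_call events.

    NOTE: the "tool_call" type and "arguments.path" field are
    assumptions for illustration, not Canvas's actual wire format.
    """
    for line in sse_lines:
        # Per the SSE spec, payload lines are prefixed with "data: "
        if not line.startswith("data: "):
            continue
        event = json.loads(line[len("data: "):])
        if event.get("type") == "tool_call":
            path = event.get("arguments", {}).get("path")
            if path:
                yield path
```

Each matching event surfaces a new artifact in real time, so the viewer can register files while the model is still streaming its response.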
Act III

The Crash

The AI built the feature flawlessly.
Then it tried to look at what it built.

The Failure

35+ Nested Recursion Loop

The session launched a browser-operator to visually verify the UI. The operator couldn't process what it saw. It spawned another. Then another. Then another.

// What the session tried to do:
delegate browser-operator "Check if the UI looks right"
  -> delegate browser-operator "Navigate and verify"
    -> delegate browser-operator "Take screenshot"
      -> delegate browser-operator "Analyze screenshot"
        -> delegate browser-operator "..."   // 35+ levels deep
// Session crashed: context window exhausted

One simple question killed the session: "Does this look right?"
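The simplest defense against this failure mode is a delegation depth guard. The sketch below is purely hypothetical; Amplifier's real delegation machinery and any actual limit it enforces are not described in this story.

```python
MAX_DEPTH = 3  # hypothetical cap, not an actual Amplifier setting

class DelegationDepthExceeded(RuntimeError):
    """Raised instead of letting delegation recurse until the context dies."""

def delegate(task, handler, depth=0):
    """Hand `task` to a sub-agent `handler`, refusing to nest past MAX_DEPTH."""
    if depth >= MAX_DEPTH:
        raise DelegationDepthExceeded(f"refusing {task!r} at depth {depth}")
    # The handler receives a redelegate callback that carries the depth along.
    return handler(task, lambda t: delegate(t, handler, depth + 1))

def always_redelegate(task, redelegate):
    """Models the crashed session: every sub-agent spawns another."""
    return redelegate("verify: " + task)
```

With the guard in place, `delegate("Does this look right?", always_redelegate)` fails fast with `DelegationDepthExceeded` after three levels instead of exhausting the context window 35+ levels down.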

The Evidence

SESSION-HANDOFF.md

The session died before updating its handoff document. Every feature: PENDING. Every acceptance test: NOT TESTED. Every checkpoint: NOT REACHED.

Feature Progress
Zustand Artifact Registry - PENDING
SSE Artifact Detection - PENDING
ArtifactViewer Shell - PENDING
File List UI - PENDING
Code Tab - PENDING
Preview Tab - PENDING
Version History - PENDING
...11 more features - PENDING
10-Act Acceptance Test
Act 1: Project Setup - NOT TESTED
Act 2: File Creation - NOT TESTED
Act 3: Preview Testing - NOT TESTED
Act 4: Iteration - NOT TESTED
Act 5: Adding Images - NOT TESTED
Act 6: Cross-Project - NOT TESTED
Act 7: Snapshot - NOT TESTED
Act 8-9: Error & Resume - NOT TESTED
The Insight

AI agents need
eyes, not just code.

Building UI without visual verification
is building blind.

Act IV

Pain Drives Innovation

ui-vision

A purpose-built Amplifier bundle born from a crash.
Built the next morning. Merged by lunch.

The Solution

Three Agents, Three Modes

Each agent answers a different question about your UI.

👁
Visual Auditor
"Does this look good?"
Quality
🔄
Regression Checker
"Did my changes break anything?"
Safety
Accessibility Scanner
"Can everyone use this?"
Inclusion

Plus 3 matching recipes for repeatable, automated testing workflows.

How It Works

Playwright MCP + Vision Mode

The core innovation: screenshots are returned as base64 images that the model can see and analyze directly.

1. Navigate (browser_navigate)
2. Screenshot (base64 image)
3. Snapshot (a11y tree)
4. Analyze (structured report)
# The magic flags
command: npx
args:
  - "@playwright/mcp@latest"
  - "--caps=vision"       # Enable vision mode
  - "--headless"          # No browser window
  - "--image-responses"
  - "allow"               # Return base64 images
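Under the hood, "the model can see" means the tool result carries the raw PNG bytes as a base64 string. A minimal sketch of packaging a screenshot into an MCP-style image content item (the field names follow the MCP content shape; the helper itself is hypothetical):

```python
import base64

def to_image_content(png_bytes: bytes) -> dict:
    """Wrap raw screenshot bytes as an MCP-style image content item."""
    return {
        "type": "image",
        "data": base64.b64encode(png_bytes).decode("ascii"),
        "mimeType": "image/png",
    }
```

With image responses allowed, the model receives the rendered pixels in this form rather than a file path it cannot open.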
Amplifier Composability

Drop-In Vision for Any Bundle

The entire capability is a composable behavior YAML. Add visual testing to any existing Amplifier bundle in two lines.

# As a full bundle
includes:
  - bundle: git+...amplifier-bundle-ui-vision@main

# As a behavior (compose into existing)
includes:
  - bundle: git+...ui-vision@main  # subdirectory=behaviors/ui-vision.yaml
What You Get
● Playwright MCP with vision mode
● 3 specialist agents auto-registered
● 3 recipes ready to execute
● Context docs for delegation routing
Bundle Size
1 behavior YAML (31 lines)
3 agent definitions
3 recipe definitions
2 context documents
Act V

First Sight

February 13, 11:00 AM - The AI opens its eyes.

// The very first visual verification
browser_navigate("http://localhost:5173")
browser_take_screenshot()
// -> Returns base64 PNG
// -> Model SEES the rendered pixels
// -> Saved: vision-test-screenshot.png

// "I can see the dashboard layout with a sidebar
// on the left, a header at the top, and the main
// content area showing the artifact viewer..."
30
Screenshots Captured
11:25
PR #31 Merged
The Transformation

Before & After

✕ Before ui-vision

AI builds UI code blindly, hoping it works
Visual verification requires a human in the loop
Generic browser-operator with no visual context
35+ recursion crash trying to self-verify
SESSION-HANDOFF: every item PENDING
No accessibility awareness during development

✓ After ui-vision

AI sees base64 screenshots of rendered UI
Autonomous visual verification with structured reports
Purpose-built specialist agents for each concern
Clean delegation, focused analysis, no recursion
Visual evidence attached to every verification
WCAG 2.1 AA scanning built into the workflow
The Pattern

2-Step Recipe Architecture

Every recipe follows the same pattern: specialist analysis, then executive summary.

# visual-audit.yaml
name: visual-audit
steps:
  - id: audit
    agent: ui-vision:visual-auditor
    instruction: |
      Navigate to {{ url }}
      Screenshot each page in {{ pages }}
      Analyze: layout, typography, color, spacing
      Rate issues: Critical / Major / Minor / Nitpick
  - id: summary
    instruction: |
      From {{ steps.audit.result }}:
      1. Overall quality score (1-10)
      2. Top 3 issues to fix
      3. Top 3 things done well
      4. Next steps by effort vs impact
Velocity

From Crash to Capability

Feb 12, Morning
The Vision
1,021-line UX spec completed with 17 mockups
Feb 12, Afternoon (~4 hours)
The Build
2,106 lines of React autonomously written across 10 components
Feb 12, Evening
The Crash
35+ recursion loop. Session dead. All features marked PENDING.
Feb 13, Morning
The Build (ui-vision)
3 agents, 3 recipes, 1 behavior YAML. Purpose-built from pain.
Feb 13, 11:00 AM
First Sight
30 screenshots captured. AI verifies its own UI. vision-test-screenshot.png saved.
Feb 13, 11:25 AM
PR #31 Merged
The AI can see. The loop is closed.
Lessons

What This Teaches Us

AI builds its own tools
When an AI agent hits a wall, the response isn't a workaround - it's a new capability. The crash wasn't a failure. It was a requirements document.
Composability is leverage
A 31-line behavior YAML gives any Amplifier bundle the power of visual testing. Agents, recipes, and MCP tools compose like LEGO bricks.
Pain drives purpose
The most useful tools emerge from real failures. A 35+-level recursion crash at 10 PM led to an elegant solution by 11 AM the next day.
Specialist > generalist
A generic browser-operator crashed. Three focused specialists - auditor, regression checker, accessibility scanner - each do one thing brilliantly.
Try It

The AI Can See Now.

Add visual testing to your Amplifier bundle today.

# In your session:
"Check how localhost:5173 looks"
"Did my CSS changes break anything?"
"Run an accessibility scan on the signup form"

amplifier-bundle-ui-vision  ·  Built Feb 13, 2026  ·  PR #31
