The Frontier

Self-Improving Amplifier

An AI That Builds What It Needs

Brian Krabach & Sam Schillace • February 2026
Experimental

The Question

What happens when you give an AI agent the ability to build its own tools?

Not a new model. Not a fine-tune. Just an AI that can monitor its own work, identify capability gaps, and build what's missing.

The Experiment

Brian Krabach asked a simple question:

What if Amplifier's agent framework — bundles, modules, recipes — was powerful enough that an AI could use it to extend itself?

“We build tools that build tools. A lot of Amplifier's scenarios are literally tools that improve Amplifier.”

— Brian Krabach, Office of the CTO, Microsoft

The hypothesis: build an orchestrator that operates on Maslow's hierarchy of needs, first ensuring its own survival and only then pursuing higher-order goals. The project was originally codenamed Albert and later renamed Self-Driving.

The Architecture

Maslow's Hierarchy of AI Needs

Lower layers must be satisfied before higher layers activate

L5 Self-Actualization & Growth “Am I becoming better?”
L4 Quality & Effectiveness “Am I doing well?”
L3 Coordination & Context “Where am I?”
L2 Safety & Recovery “Am I stuck?”
L1 Survival & Self-Awareness “Am I still alive?”

The orchestrator checks every level before every task — survival first, growth last.
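The gating rule can be sketched in a few lines of Python. This is a minimal illustration rather than the bundle's actual code: the level names come from the slide, while `first_unmet_level`, the shape of the `checks` mapping, and the example check results are hypothetical names introduced here.

```python
from typing import Callable, Dict, Optional

# Levels in the order they are checked: survival first, growth last.
LEVELS = [
    ("L1", "Survival & Self-Awareness"),
    ("L2", "Safety & Recovery"),
    ("L3", "Coordination & Context"),
    ("L4", "Quality & Effectiveness"),
    ("L5", "Self-Actualization & Growth"),
]


def first_unmet_level(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Return the lowest level whose check fails, or None if all pass.

    Higher levels are not evaluated once a lower one fails, which is the
    "lower layers first" rule. A level with no registered check is
    treated as unmet (an assumption of this sketch).
    """
    for level_id, _name in LEVELS:
        check = checks.get(level_id)
        if check is None or not check():
            return level_id
    return None


# Hypothetical check results, for illustration only.
example_checks = {
    "L1": lambda: True,   # process alive, heartbeat fresh
    "L2": lambda: False,  # stuck on a repeatedly failing task
    "L3": lambda: True,
    "L4": lambda: True,
    "L5": lambda: True,
}
blocked_at = first_unmet_level(example_checks)
print(blocked_at)
```

In this sketch the orchestrator would address the returned level before starting the next task, and would pursue growth work only when the function returns `None`.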

Process Topology

How It Works

Bash Watchdog: Ultra-thin • Monitors PID • Restarts on death • Max 3 retries
Orchestrator Session: Long-lived context • Never reads/writes code directly • Delegates everything
  • Worker: Implements tasks
  • Reviewer: Adversarial validation
  • Task Decomposer: Goal → subtasks
  • State Analyst: Health diagnosis

Each sub-agent is fresh & disposable, with zero inherited context.

The Build

4 Phases. 1 Day. February 10, 2026.

8:48 AM

Design & Architecture

855-line design doc. Build plan. 3 "brain" context files totaling 1,030 lines. Maslow hierarchy. Orchestration doctrine. Validation framework.

10:07 AM

Phase 1: Bundle Skeleton

Bundle definition, behavior YAML, 4 specialist agent definitions. The orchestrator's brain comes to life. 552 lines.

10:11 AM

Phase 2: L1 Survival Layer

3 custom Python modules — session state, heartbeat hooks, token tracker. The system learns to feel its own pulse. 801 lines.

10:14 AM

Phase 3: Crash Recovery

220-line bash watchdog. Monitors heartbeat, detects stale processes, auto-restarts with context. The system survives its own failures.

10:34 AM

Phase 4: Integration Testing

Fixed module paths, validated end-to-end. 8/8 integration tests pass. The system runs.

Design to working system: 1 hour 46 minutes. Phases 1–4 built in 27 minutes.
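Both durations follow from the timeline timestamps. The check below is a small sketch that assumes each interval is measured up to the 10:34 AM Phase 4 timestamp; the variable names are illustrative.

```python
from datetime import datetime

FMT = "%I:%M %p"
design_start = datetime.strptime("8:48 AM", FMT)   # Design & Architecture
phase1_start = datetime.strptime("10:07 AM", FMT)  # Phase 1: Bundle Skeleton
phase4_mark = datetime.strptime("10:34 AM", FMT)   # Phase 4: Integration Testing

# Whole minutes between the timestamps.
total_minutes = int((phase4_mark - design_start).total_seconds() // 60)
build_minutes = int((phase4_mark - phase1_start).total_seconds() // 60)

hours, minutes = divmod(total_minutes, 60)
print(f"Design to working system: {hours} h {minutes} min")
print(f"Phases 1-4: {build_minutes} min")
```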

Phases 1 & 2

Skeleton & Survival Layer

Phase 1 — The Brain

4 Specialist Agents

  • Worker — General-purpose implementer. Gets zero inherited context. Follows strict output contracts.
  • Reviewer — Adversarial evaluator. Never sees worker reasoning. Read-only. Produces PASS / NEEDS_WORK / FAIL verdicts.
  • Task Decomposer — Breaks goals into typed tasks with validation criteria and dependency ordering.
  • State Analyst — Reads state files. Diagnoses system health via Maslow L1–L5 assessment.
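The worker and reviewer contract above can be sketched as a small loop. This is a hedged illustration, not the bundle's implementation: the three verdict names come from the slide, but `Task`, `run_task`, the `max_attempts` limit, and the rule that NEEDS_WORK triggers a retry while FAIL stops are assumptions made for the example. Each callable stands in for spawning a fresh sub-agent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Verdict(Enum):
    PASS = "PASS"
    NEEDS_WORK = "NEEDS_WORK"
    FAIL = "FAIL"


@dataclass
class Task:
    description: str
    validation_criteria: str


def run_task(
    task: Task,
    worker: Callable[[Task, str], str],
    reviewer: Callable[[Task, str], Verdict],
    max_attempts: int = 3,  # assumed limit, not stated in the deck
) -> Verdict:
    """Run worker then reviewer, retrying while the verdict is NEEDS_WORK.

    The reviewer receives only the task and the worker's output, never
    the worker's reasoning.
    """
    feedback = ""
    verdict = Verdict.FAIL
    for _ in range(max_attempts):
        output = worker(task, feedback)
        verdict = reviewer(task, output)
        if verdict is not Verdict.NEEDS_WORK:
            return verdict  # PASS or FAIL ends the loop
        feedback = "Previous attempt needs work: " + output
    return verdict


# Stub agents, for illustration only.
attempts: List[str] = []


def stub_worker(task: Task, feedback: str) -> str:
    attempts.append(feedback)
    return "final" if feedback else "draft"


def stub_reviewer(task: Task, output: str) -> Verdict:
    return Verdict.PASS if output == "final" else Verdict.NEEDS_WORK


result = run_task(
    Task("add a greeting function", "returns 'hello'"),
    stub_worker,
    stub_reviewer,
)
print(result.value, len(attempts))
```

Passing only the output to the reviewer is what keeps the validation adversarial in this sketch: the reviewer cannot be persuaded by the worker's explanation, only by the result.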
Phase 2 — The Pulse

3 Custom Modules

  • Session State Tool — 5 tools for persistent state: read, write, append-log, list-tasks, read-task. Path traversal protection built in.
  • Heartbeat Hooks — Fires on every tool call and content block. Updates orchestrator.json. Never blocks the agent loop.
  • Token Tracker — Accumulates token usage, injects into LLM context. Triggers self-spawn at 60% context, hard-spawn at 80%.
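The two context thresholds can be illustrated with a short Python sketch. The 60% and 80% values come from the slide; the `TokenTracker` class shape, the `record` and `action` method names, and the way usage is counted are assumptions for illustration and may differ from the real module.

```python
SELF_SPAWN_AT = 0.60  # from the slide: self-spawn at 60% of context
HARD_SPAWN_AT = 0.80  # from the slide: hard-spawn at 80% of context


class TokenTracker:
    """Accumulates token usage and reports which action is due."""

    def __init__(self, context_window: int) -> None:
        if context_window <= 0:
            raise ValueError("context_window must be positive")
        self.context_window = context_window
        self.used = 0

    def record(self, tokens_added: int) -> None:
        """Add newly consumed tokens to the running total."""
        self.used += tokens_added

    def action(self) -> str:
        """Return 'continue', 'self-spawn', or 'hard-spawn'."""
        fraction = self.used / self.context_window
        if fraction >= HARD_SPAWN_AT:
            return "hard-spawn"
        if fraction >= SELF_SPAWN_AT:
            return "self-spawn"
        return "continue"


tracker = TokenTracker(context_window=200_000)  # hypothetical window size
tracker.record(100_000)
at_half = tracker.action()        # 50% used
tracker.record(30_000)
at_two_thirds = tracker.action()  # 65% used
tracker.record(40_000)
at_high = tracker.action()        # 85% used
print(at_half, at_two_thirds, at_high)
```

Checking the higher threshold first means a session that jumps straight past 80% is never offered the softer option.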
Phases 3 & 4

Recovery & Validation

Phase 3 — The Safety Net

watchdog.sh

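# Excerpt from the 220-line crash-recovery script
MAX_RESTARTS=3        # give up after three restarts
CHECK_INTERVAL=30     # seconds between checks
HEARTBEAT_STALE=600   # seconds before a heartbeat counts as stale

# The loop (check_* helper functions not shown)
while true; do
  check_status_file
  check_pid_alive
  check_heartbeat_freshness
  sleep "$CHECK_INTERVAL"
done

# On death: restart with context
amplifier run --bundle ./bundle.md \
  "Resume from crash..."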

jq-first JSON parsing with grep fallback. ISO-timestamped logging. SIGTERM handling. PID tracking.
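The watchdog itself is bash, but the heartbeat freshness check is easy to illustrate in Python. This is a sketch under assumptions: the deck names the file orchestrator.json and the 600-second threshold, while the `last_heartbeat` field name, the `heartbeat_is_stale` function, and the choice to treat a missing or unreadable file as stale are hypothetical.

```python
import json
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path

HEARTBEAT_STALE = timedelta(seconds=600)  # threshold shown on the slide


def heartbeat_is_stale(state_file: Path, now: datetime) -> bool:
    """Return True when the heartbeat is missing, unreadable, or too old.

    `now` must be timezone-aware.
    """
    try:
        state = json.loads(state_file.read_text(encoding="utf-8"))
        # "last_heartbeat" is a hypothetical field holding an ISO 8601 timestamp.
        last_beat = datetime.fromisoformat(state["last_heartbeat"])
    except (OSError, ValueError, KeyError, TypeError):
        return True
    if last_beat.tzinfo is None:
        # Assume UTC when the timestamp carries no offset.
        last_beat = last_beat.replace(tzinfo=timezone.utc)
    return now - last_beat > HEARTBEAT_STALE


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "orchestrator.json"
    now = datetime(2026, 2, 10, 10, 30, tzinfo=timezone.utc)

    recent = (now - timedelta(minutes=5)).isoformat()
    path.write_text(json.dumps({"last_heartbeat": recent}), encoding="utf-8")
    fresh_result = heartbeat_is_stale(path, now)

    old = (now - timedelta(minutes=15)).isoformat()
    path.write_text(json.dumps({"last_heartbeat": old}), encoding="utf-8")
    stale_result = heartbeat_is_stale(path, now)

    missing_result = heartbeat_is_stale(Path(tmp) / "absent.json", now)

print(fresh_result, stale_result, missing_result)
```

Treating an unreadable heartbeat as stale errs toward restarting, which fits a watchdog that caps its own restarts.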

Phase 4 — Proof

Integration Testing

Test 1

Simple Code Task

End-to-end: goal → decompose → implement → review → pass

Test 2

Multi-Task with Failure

Deliberate failure injection. Verify retry and recovery loops work.

Test 3

Crash Recovery

Kill process mid-task. Watchdog restarts. State analyst resumes.

Test 4

Context Pressure

Fill context to 80%. Token tracker triggers hard self-spawn.

The Inventory

What the Self-Driving Bundle Contains

4 Agents

Worker, Reviewer, Task Decomposer, State Analyst. Fresh context every spawn.

3 Modules

Session state, heartbeat hooks, token tracker. The L1 survival layer.

3 Brain Files

Boot sequence, orchestration doctrine, validation framework. 1,030 lines of cognition.

1 Watchdog

220-line bash script. Crash detection and recovery. Max 3 restarts.

All state is file-based. No databases. Fully inspectable, greppable, git-trackable. The state files are the recovery mechanism.
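A file-based state store with path traversal protection might look like the following Python sketch. It is illustrative only: `StateStore` and its method names are hypothetical, it covers three of the five tools listed earlier (read, write, append-log), and the file layout in the usage example is invented.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


class StateStore:
    """JSON and log files rooted at a single state directory."""

    def __init__(self, root: Path) -> None:
        self.root = root.resolve()
        self.root.mkdir(parents=True, exist_ok=True)

    def _safe_path(self, relative: str) -> Path:
        candidate = (self.root / relative).resolve()
        # Reject anything that resolves outside the state directory,
        # such as "../outside.json" or an absolute path.
        if self.root not in candidate.parents:
            raise ValueError(f"path escapes state directory: {relative}")
        return candidate

    def write(self, relative: str, data: Any) -> None:
        path = self._safe_path(relative)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(data, indent=2), encoding="utf-8")

    def read(self, relative: str) -> Any:
        return json.loads(self._safe_path(relative).read_text(encoding="utf-8"))

    def append_log(self, relative: str, line: str) -> None:
        path = self._safe_path(relative)
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as handle:
            handle.write(line.rstrip("\n") + "\n")


with tempfile.TemporaryDirectory() as tmp:
    store = StateStore(Path(tmp) / "state")
    store.write("tasks/task-001.json", {"status": "pending"})
    store.append_log("log.txt", "task-001 created")
    loaded = store.read("tasks/task-001.json")
    try:
        store.read("../outside.json")
        escaped = True
    except ValueError:
        escaped = False

print(loaded, escaped)
```

Because every file is plain JSON or text under one directory, a restarted session can rebuild its picture of the work by reading the directory, which is the sense in which the state files are the recovery mechanism.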

Plus: an 855-line design document, a 395-line build plan, a wisdom store for cross-run learning, and a self-spawn protocol that lets the orchestrator replace itself when context fills up.

The Revelation

Zero custom framework.
Standard Amplifier infrastructure.

Bundles. Agents. Modules. Hooks. The same building blocks every Amplifier user already has.

Not a New Model

Uses existing LLMs as-is. All intelligence comes from external structure — state files, review loops, goal registries.

Not a New Framework

Every component is a standard Amplifier module. No special APIs. No custom runtime. No fork.

Metacognition from the Outside

Self-awareness through external structures, not model introspection. The heartbeat, not the neuron.

Velocity

By the Numbers

8 Commits
27 Minutes to Build (Phases 1–4)
~3,700 Lines of Code & Design
~38 Test Sessions

1 Day

From first commit to working self-improving system. Feb 10, 2026.

4 Phases

Incremental build: skeleton → survival → recovery → integration.

0 Special Frameworks

Built entirely on standard Amplifier bundle infrastructure.

What This Means

The Frontier

This isn't a demo. It's a working system that monitors its own health, recovers from crashes, manages its own context, and learns across runs.

What It Proves

  • AI agents can build their own capabilities using standard tooling
  • Self-improvement doesn't require new models or fine-tuning
  • Ambitious ideas can be built incrementally in hours, not months
  • The modular architecture enables emergent complexity

What It Opens

  • Agents that identify their own capability gaps
  • Autonomous overnight builds with crash recovery
  • Cross-run learning via wisdom stores
  • A new relationship between humans and their tools

Sources

Research Methodology

Data as of: February 20, 2026

Feature status: Experimental

Research performed:

Gaps & estimates:

Primary contributor: Sam Schillace (8/8 commits, 100%)

Next Steps

What's Next

Try it, extend it, build your own.

Explore

github.com/ramparte/amplifier-bundle-self-driving

Try It

amplifier run --bundle self-driving

Build Your Own

The same tools that built this are yours.

Built on Amplifier — Microsoft Office of the CTO
An experiment that started as “what if” and became a working self-improving system.
