You have met every stage of the loop one at a time. Now watch the whole thing run on its own — pass after pass, with no human pressing go. Your job changes from running the passes to watching the passes run and reading the evidence. AFK: away from keyboard, fully observable.
So far you have learned the loop as five moves you make: LEARN the real state, ANALYZE the gap and pick one thing, EXECUTE that one bounded unit, VERIFY it at the real boundary, and DECIDE what happens next. Reading them like that, it sounds like a checklist you walk through by hand, one pass at a time.
It is not. Once the scope contract is written — the measurable "done" from lesson 2 — the loop runs AFK: away from keyboard. An agent does a pass. Then another. Then another. It keeps going until "done" is met or it hits something only a human can answer. Nobody clicks anything between passes.
That changes your role completely. You are not the one running each pass anymore. You are the one watching the run — reading a log that the loop writes as it goes, glancing at a status line, checking a checklist fill in. This is observability: you see what is happening, but you are not in the path. The system drives; you observe.
The reframe, in one line: the operator does not run the passes — the operator watches the passes run and reads the proof. The only time you are pulled back in is a genuine fork that only a human can resolve. Routine review is never your job.
Think of it like… a dishwasher. You load it, you choose the cycle, you press start — and then you walk away. You do not stand there rotating the arm or squeezing the soap each minute. You glance at the little light now and then. It runs wash → rinse → dry on its own and stops when the dishes are clean. The only time it pulls you back is a real problem it cannot solve itself — the drain is blocked, or it is out of salt. That blinking light is your observability; the cycle is the loop; "dishes clean" is your done-when. Where the analogy breaks: a dishwasher can fake "done" by just stopping, but the loop is forbidden to — every pass must prove done at a real boundary, never claim it.
AFK is not "fire and forget and hope." It is a tightly-bounded autonomous loop with three structural guarantees. One: every pass ends at a real Proof Gate — the change is exercised against the actual boundary (the test runner, the HTTP endpoint, the file on disk, the rendered page), never a claim or a mock. Two: the loop appends every pass to an observable artifact (LOOP-LOG.md) and keeps a live status, so the run is legible without interrupting it. Three: the loop blocks the human only on a genuine user-only fork — a decision that no amount of evidence-gathering can settle (a product call, an irreversible action, a missing credential). Everything else it resolves itself and records.
In operations terms, the human sits on the observability plane, not the control plane. You consume signals — LOOP-LOG.md, the done-when checklist, the per-pass status, the end-of-run review.md. You do not emit control actions into the loop between passes. The separation is deliberate: a human in the per-pass control path is the bottleneck that makes long-horizon autonomy impossible. Pull the human out of the path, leave them a clear window in, and the loop can run for hours.
This lesson is the hinge. Module 2 taught the five moves; this shows them running unattended. Module 3 (the Forge) wraps a full front-end around an AFK run — an Executor builds each ticket and an independent Validator proves it against GOAL.md (the Validator is never the builder), and a final AFK QA pass emits review.md as an observability report. The thread that ties it all together is exactly this lesson's claim: everything runs AFK; the human's only role is observability.
There is a tempting wrong model, and the right one. The wrong one feels productive because you are busy. The right one feels strange at first because you are not.
The tell: if stepping away from your keyboard stops progress, you are driving. If stepping away changes nothing except that you will read a longer log when you come back, you are observing — and that is the loop in motion.
Here is the whole thing. The five moves form a ring that the agent walks again and again. You are drawn outside that ring on purpose. You do not reach into it. You read what it writes — LOOP-LOG.md and the live status — through a one-way window.
The human reads LOOP-LOG.md and status through a read-only window, never acting inside.
The placement is the whole point of the diagram. Put the human inside the ring (making the DECIDE call each pass, or eyeballing the VERIFY) and the loop can only advance at human speed and human availability. Put the human outside, with a read-only window onto the artifacts, and the loop advances at machine speed while staying fully legible. The arrow from LOOP-LOG.md to the human points one way: information out, no control in.
The single most important arrow is the one from DECIDE back to the next pass. If VERIFY proves done, DECIDE stops the run. If VERIFY shows the gap narrowed but not closed, DECIDE feeds an improved plan back into LEARN and the ring turns again. If VERIFY surfaces a user-only fork, DECIDE pauses and raises a handoff. Of those three outcomes, only one involves you, and even then only as a decision-maker at a fork, not as the driver of routine passes.
Let us slow the loop right down and walk a single pass, stage by stage, with the running example from this course: RHG, the app we have been improving lesson after lesson. Pick a scenario, then press Next to light up each stage in turn. Watch where VERIFY sends the pass at the DECIDE branch — because a pass can end three different ways, and only one of them ever involves you.
A pass begins — and you are not in it
Press Next to walk a pass that converges. Switch the scenario above to see a pass that loops, or one that has to stop and ask you.
A pass is a function. LEARN, ANALYZE, EXECUTE, then VERIFY returns a verdict, and DECIDE is a three-way switch on that verdict. Note that nothing in this loop calls back to a human except the one explicit handoff branch — and that branch fires only on fork, never on a routine pass.
```js
async function pass(goal, log) {
  const state = await learn(goal);          // see the real state
  const unit = analyze(state, goal);        // classify gap, pick ONE
  await execute(unit);                      // one bounded change
  const proof = await verify(unit, goal);   // at the REAL boundary
  log.append(proof);                        // observability, every pass
  switch (proof.verdict) {
    case 'done':     return { stop: true };   // converged
    case 'progress': return { stop: false };  // iterate → next pass
    case 'fork':     return handoff(proof);   // the ONLY human pull-in
  }
}

while (!(await pass(goal, log)).stop) { /* AFK: no human between passes */ }
```
A naive loop has two outcomes: done, or try again. The third — fork — is what keeps autonomy safe. Without it, a loop that hits a genuinely human question either guesses (and produces confident garbage) or spins forever. With it, the loop does the honest thing: it stops, packages the decision, and surfaces it. The discipline is that fork is rare and specific — a product decision, an irreversible action, a missing secret — never "this is hard" or "please confirm this routine step."
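The "rare and specific" rule can be made mechanical. The sketch below is one possible way to do it, not the course's actual implementation: the field names `doneWhenMet` and `blocker`, and the kind strings, are assumptions chosen for illustration. Only the three blocker kinds named above can produce a fork; everything else falls through to another pass.

```javascript
// Hypothetical blocker kinds that justify a fork; the three come from the text above.
const FORK_KINDS = new Set(['product-decision', 'irreversible-action', 'missing-secret']);

// Decide the verdict for one pass.
// `proof.doneWhenMet` and `proof.blocker` are assumed field names for this sketch.
function decideVerdict(proof) {
  if (proof.doneWhenMet) return 'done';
  if (proof.blocker && FORK_KINDS.has(proof.blocker.kind)) return 'fork';
  return 'progress'; // "this is hard" or "please confirm" never escalates
}

const hard = decideVerdict({ doneWhenMet: false, blocker: { kind: 'hard-problem' } });
const copy = decideVerdict({ doneWhenMet: false, blocker: { kind: 'product-decision' } });
console.log(hard, copy); // progress fork
```

Keeping the allowed kinds in a closed set means a new reason to interrupt the human has to be added deliberately, rather than creeping in through a vague "needs review" state.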
The branch you just stepped through deserves a picture of its own, because it is the heart of "you observe, you do not drive." VERIFY produces a verdict; DECIDE acts on it. Two of the three paths keep the loop autonomous. Only the third — and only sometimes — reaches you.
Read the colours: green (converged) and blue (iterate) never touch you. Rust (handoff) is the only one that does — and it fires on a genuine fork, not on routine review. If you find yourself in the loop on a green or blue path, something is mis-configured: you are driving when you should be observing.
Now drive the dashboard the way you actually would in an AFK run — except "drive" is the wrong word, because the only buttons here are the loop's own moves, and your real job is to read the readout. Press a move to advance the pass; watch the highlighted state travel, the allowed next-moves update, and the log append. The point to feel: at every state the loop knows what comes next on its own — the buttons are showing you the machine's options, not asking you to choose.
Walk it to a finish. A converged pass dead-ends at CONVERGED with nothing left to press. A blocked pass dead-ends at BLOCKED — and the only thing that moves it is a human decision (the resolve move, marked in blue). That blue move is the one and only place a person belongs.
The clay node is the current state. Faint nodes are not reachable from here. resolve is the only human move.
Current state · what you observe
IDLE
The run is configured and waiting. Press start to begin pass 1. After that, the loop advances itself.
The dashboard is a finite-state machine — the same shape as the loop itself. There is one current state, a fixed set of moves, and a table mapping (state, move) → nextState. The buttons read the table to decide what to enable; you cannot trigger a move that is not legal from the current state, which is exactly why the loop cannot "skip" the Proof Gate. CONVERGED is terminal (the run is done). BLOCKED is a special state: the only move out of it is resolve, and resolve is the human's.
```js
const run = {
  IDLE: { start: 'RUN' },
  RUN:  { verify: 'VER' },               // always pass through proof
  VER:  { done: 'DONE', again: 'ITER',   // converge or loop…
          fork: 'BLK' },                 // …or raise a fork
  ITER: { start: 'RUN' },                // next pass, autonomously
  BLK:  { resolve: 'IDLE' },             // the ONLY human move
  DONE: {}                               // terminal: converged
};
```
When a move is not in the table for the current state, the button is disabled, not silently inert. That visible constraint is the teaching: the loop's options are finite and known at every step, so its behaviour is predictable and legible from the outside. A human watching the dashboard can always answer "what can happen next?" without reading any code — which is the essence of observability.
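To make "the buttons read the table" concrete, here is a minimal sketch of the two lookups a dashboard like this would need. The transition table is copied from the lesson; the helper names `allowedMoves` and `transition` are hypothetical and only illustrate the idea.

```javascript
// Transition table copied from the lesson's dashboard example.
const run = {
  IDLE: { start: 'RUN' },
  RUN:  { verify: 'VER' },
  VER:  { done: 'DONE', again: 'ITER', fork: 'BLK' },
  ITER: { start: 'RUN' },
  BLK:  { resolve: 'IDLE' },
  DONE: {},
};

// Moves that are legal from `state`; a UI would enable only these buttons.
function allowedMoves(state) {
  return Object.keys(run[state] ?? {});
}

// Apply a move, refusing anything the table does not list for this state.
function transition(state, move) {
  if (!allowedMoves(state).includes(move)) {
    throw new Error(`illegal move "${move}" from ${state}`);
  }
  return run[state][move];
}

console.log(allowedMoves('VER'));          // [ 'done', 'again', 'fork' ]
console.log(allowedMoves('DONE'));         // []
console.log(transition('BLK', 'resolve')); // IDLE
```

Because `RUN` lists only `verify`, a call such as `transition('RUN', 'done')` throws: there is no path to a finished run that bypasses the Proof Gate.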
A still diagram shows the stages; it cannot show the motion. Press Play and the loop runs itself — a pass token travels LEARN → ANALYZE → EXECUTE → VERIFY, the Proof Gate decides, and the run either converges or sends the token back round for another pass. Play auto-advances with no input from you (that is the AFK part); Step walks one beat; Reset reshuffles whether this run will converge or loop again. Notice you are pressing nothing while it runs — you are just watching.
The motion is a five-phase finite state machine in vanilla JS — learn, analyze, execute, verify, decide. Each beat tweens the token with requestAnimationFrame and an ease-in-out curve; Play auto-fires the next beat on a timer (the AFK behaviour), Pause clears it. There is no <video>, no GIF, no library — so it scrubs, steps, and resets deterministically. The always-orbiting green dot is a single declarative <animateMotion> with repeatCount="indefinite", signalling "the loop is alive" with zero JS.
Honouring prefers-reduced-motion: reduce is non-negotiable for a lesson that leans on animation. When it is set, the JS tweens collapse to instant jumps (the token still moves between stages, it just does not glide), and the whole sequence is narrated in the aria-live="polite" caption so a screen-reader user — or anyone who turned motion off — gets every beat in words. The declarative orbit is intentionally subtle and low-contrast so it does not distract; a stricter build could pause it under the same media query.
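The tween and its reduced-motion fallback can be sketched as two pure functions. This is an illustration under stated assumptions: the lesson only says "ease-in-out", so the quadratic curve here is one common choice, and `tokenPosition` is a hypothetical helper name. In a browser the `reducedMotion` flag would typically come from `window.matchMedia('(prefers-reduced-motion: reduce)').matches`.

```javascript
// Standard quadratic ease-in-out: t in [0, 1] maps to eased progress in [0, 1].
function easeInOutQuad(t) {
  return t < 0.5 ? 2 * t * t : 1 - Math.pow(-2 * t + 2, 2) / 2;
}

// Token position between two stage coordinates at progress t.
// With reduced motion the token jumps straight to the destination.
function tokenPosition(from, to, t, reducedMotion) {
  const clamped = Math.min(Math.max(t, 0), 1);
  const p = reducedMotion ? 1 : easeInOutQuad(clamped);
  return { x: from.x + (to.x - from.x) * p, y: from.y + (to.y - from.y) * p };
}

const learn = { x: 0, y: 0 };
const analyze = { x: 100, y: 40 };
console.log(tokenPosition(learn, analyze, 0.5, false)); // { x: 50, y: 20 }
console.log(tokenPosition(learn, analyze, 0.1, true));  // { x: 100, y: 40 }
```

Keeping the position a pure function of `t` is what lets the same code serve Play, Step, and Reset: each frame callback only has to supply the current progress.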
LOOP-LOG.md: advance time, watch it fill
This is the single most important thing you actually do as the observer: you read the log. So here is the log itself, live. Press Advance one pass to let the loop take another step, and watch two things at once: the done-when checklist on the left ticks closer to complete, and LOOP-LOG.md on the right gains a new entry. You are not executing the pass. You are watching the loop execute it and reading what it wrote.
Keep advancing. The run ends one of two ways: every done-when box turns green and the verdict reads converged, or the loop hits a fork and the verdict turns blue — decision-ready, waiting for you. That is the whole rhythm of an AFK run, compressed.
What a LOOP-LOG.md entry contains
Each pass appends a small, append-only record. The shape is deliberate: a timestamp, which unit was attempted, the verdict from the Proof Gate, and the evidence for that verdict (the actual command run and its actual result, never a paraphrase). That last part is what makes the log trustworthy: the human reads proof, not the loop's opinion of itself.
```text
## pass 3 · 2026-06-15T14:22:09Z
unit:    add /health endpoint + smoke test
verify:  curl -s -o /dev/null -w '%{http_code}' :3000/health
result:  200         # real boundary, not a claim
verdict: progress    # done-when: 3/5 met → iterate
next:    a11y sweep on the pricing view
```
The log is never rewritten, only appended. That gives the observer a complete, ordered history of the run — every pass, every verdict, every piece of evidence — so a human arriving late can reconstruct exactly what happened without having watched it live. It is the loop's flight recorder. When the run finally converges (or forks), the same artifact is the basis of the end-of-run review.md the QA pass emits.
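The append-only property is easy to state and easy to break by accident, so it helps to see it enforced. The sketch below uses an in-memory stand-in for LOOP-LOG.md; the class name `LoopLog` and the helper `formatEntry` are hypothetical, and a real run would append to the file on disk instead. The entry fields follow the shape described above.

```javascript
// In-memory stand-in for LOOP-LOG.md: entries can be added, never edited or removed.
class LoopLog {
  #entries = [];

  append(entry) {
    this.#entries.push(Object.freeze({ ...entry }));
    return this.#entries.length; // number of passes recorded so far
  }

  get entries() {
    return [...this.#entries]; // a copy, so callers cannot rewrite history
  }
}

// Render one pass in the field order used by the lesson's sample entry.
function formatEntry({ pass, at, unit, verify, result, verdict, next }) {
  return [
    `## pass ${pass} · ${at}`,
    `unit: ${unit}`,
    `verify: ${verify}`,
    `result: ${result}`,
    `verdict: ${verdict}`,
    `next: ${next}`,
  ].join('\n');
}

const log = new LoopLog();
log.append({
  pass: 3,
  at: '2026-06-15T14:22:09Z',
  unit: 'add /health endpoint + smoke test',
  verify: "curl -s -o /dev/null -w '%{http_code}' :3000/health",
  result: '200',
  verdict: 'progress',
  next: 'a11y sweep on the pricing view',
});
console.log(formatEntry(log.entries[0]));
```

The private field plus frozen entries means the only write path is `append`, which mirrors the flight-recorder guarantee: a late reader sees every pass, in order, exactly as it was recorded.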
Notice the last done-when item — "Pricing copy approved" — is tagged human-only. The loop can drive the other four to green entirely on its own. It physically cannot tick the fifth, because it is a product decision. That single item is what eventually turns the verdict blue and raises the handoff. Everything above it is the loop's job; that one line is yours.
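A done-when checklist with one human-only item can be modelled as plain data. The sketch below reconstructs five items from the RHG narrative in this lesson; the `check` strings are illustrative placeholders rather than the course's real contract, and the field names (`humanOnly`, `met`) are assumptions.

```javascript
// Five done-when items reconstructed from the lesson's RHG narrative.
// The `check` strings are illustrative placeholders, not real commands from the course.
const doneWhen = [
  { id: 'build',   check: 'build is green',            humanOnly: false, met: true  },
  { id: 'tests',   check: 'vitest run: 0 failing',     humanOnly: false, met: true  },
  { id: 'health',  check: ':3000/health returns 200',  humanOnly: false, met: true  },
  { id: 'a11y',    check: 'no critical a11y issues',   humanOnly: false, met: true  },
  { id: 'pricing', check: 'Pricing copy approved',     humanOnly: true,  met: false },
];

// Progress as shown in the log, for example "3/5".
function progress(items) {
  const met = items.filter(i => i.met).length;
  return `${met}/${items.length}`;
}

// A fork is due when items remain open and every open item is human-only.
function needsHuman(items) {
  const open = items.filter(i => !i.met);
  return open.length > 0 && open.every(i => i.humanOnly);
}

console.log(progress(doneWhen), needsHuman(doneWhen)); // 4/5 true
```

As long as any open item is provable by the loop, `needsHuman` stays false and the run keeps iterating; the handoff is raised only once the human-only line is all that is left.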
Sometimes you do not want to read the whole log — you want the vital signs at a glance, the way an on-call engineer scans a dashboard. That is the status readout. Four headline numbers tell you how the run is going; a table below shows each stage's health on the latest pass. Hit Refresh to pull a new reading, or turn on Live to watch it tick the way it would during an AFK run. Again: you are reading, not running.
Run health — RHG · loop in motion
AFK run · goal: GOAL.md · rolling, since pass 1
| Stage | Status | Last latency | Last note |
|---|---|---|---|
A single array of stage objects drives both the table and the rollup pill. Each tick perturbs the metrics within realistic bounds, recomputes each stage's status, then re-derives the banner: any proof-failed stage → red, any human-waiting stage → blue, else green/converging. The KPI deltas are coloured by meaning — "human pulls" trending down is good (green), even though both arrows can point the same way. The whole point of a status readout is that the eye reads the colour first.
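The banner rule described above is a short priority check. A minimal sketch, assuming hypothetical status strings (`'ok'`, `'proof-failed'`, `'human-waiting'`) since the lesson does not name them:

```javascript
// Roll per-stage statuses up into one banner colour, in the priority order described above.
// The status strings are assumed names for this sketch.
function rollup(stages) {
  if (stages.some(s => s.status === 'proof-failed')) return 'red';
  if (stages.some(s => s.status === 'human-waiting')) return 'blue';
  return 'green';
}

const stages = [
  { name: 'LEARN',   status: 'ok' },
  { name: 'ANALYZE', status: 'ok' },
  { name: 'EXECUTE', status: 'ok' },
  { name: 'VERIFY',  status: 'ok' },
  { name: 'DECIDE',  status: 'human-waiting' },
];
console.log(rollup(stages)); // blue
```

Checking for a failed proof first matters: if a stage has failed its gate and another is waiting on a human, the banner should surface the failure rather than hide it behind the handoff colour.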
For an AFK run, "healthy" is a specific thing: passes are completing, the proof fail rate is low or falling, done-when is climbing, and human pulls are at zero. A run that is iterating happily with nobody touching it is the ideal state — it means the loop is doing exactly its job. The one number you watch most is "human pulls": as long as it stays at zero, you can stay away from the keyboard.
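That definition of "healthy" can be expressed as a comparison between two consecutive readings. The sketch below is one possible reading of it, with assumed field names; it treats a flat fail rate and a flat checklist as acceptable, which is a simplification of "low or falling" and "climbing".

```javascript
// Compare two consecutive status readings; field names are assumptions for this sketch.
function isHealthy(prev, curr) {
  return curr.passes > prev.passes                 // passes are completing
    && curr.proofFailRate <= prev.proofFailRate    // fail rate is not rising
    && curr.doneWhenMet >= prev.doneWhenMet        // checklist is not regressing
    && curr.humanPulls === 0;                      // nobody has been pulled in
}

const prev = { passes: 3, proofFailRate: 0.33, doneWhenMet: 2, humanPulls: 0 };
const curr = { passes: 4, proofFailRate: 0.25, doneWhenMet: 3, humanPulls: 0 };
console.log(isHealthy(prev, curr)); // true
```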
When the loop is wrapped in the Forge front-end (next module), the work is a board of tickets — small bounded units, each with blocking relationships, arranged like a kanban. In an AFK run, you do not drag the cards. The loop does: it picks an unblocked ticket, builds it, proves it, and slides it to Done — then the next. Press Run one loop pass and watch a card advance on its own. The counts at the top stay honest the whole time. You can still grab a card to see it move, but in a real run you would just be watching.
One card is special: RHG-19 carries a human-only fork. The loop will move everything else to Done and leave that one parked in progress with a blocked tag — because it is the one decision the loop is not allowed to make. That parked card is your cue.
Every ticket is one object holding a col field that can only be triage, progress, or done. The columns on screen are not the source of truth — the array of ticket objects is. "Run one loop pass" finds the highest-priority unblocked ticket, advances it one column, and re-paints. A ticket with an unmet human flag can move into progress but cannot reach Done — exactly mirroring the loop: the Executor builds it, the Validator (a different agent, never the builder) tries to prove it, and the Proof Gate refuses to pass a human-only decision.
```js
function runOnePass() {
  const t = tickets
    .filter(x => x.col !== 'done' && !x.blocked)
    .sort(byPriority)[0];
  if (!t) return raiseHandoff();   // only blocked work left → you
  t.col = next(t.col);             // triage→progress→done
  if (t.human && t.col === 'done') {
    t.col = 'progress';            // can't auto-finish a fork
    t.blocked = true;              // park it with a blocked tag so later passes skip it
  }
  render();
}
```
In the Forge, tickets carry blocking edges — ticket B cannot start until ticket A is done and proven. The loop respects them automatically when it picks the next unblocked unit, which is why a long run does not need a human sequencing the work: the dependency graph plus the priority order is enough. The human's only sequencing role is the very first one (writing the goal) and the very last one (a genuine fork).
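Dependency-aware selection needs only the set of finished tickets and a priority order. A minimal sketch under stated assumptions: `blockedBy` (an array of ticket ids) and a numeric `priority` where lower runs first are hypothetical fields, and every ticket id except RHG-19 is invented for the example.

```javascript
// Pick the next ticket: not done, not parked on a human, and every dependency already done.
// `blockedBy` and `priority` are assumed fields for this sketch.
function nextTicket(tickets) {
  const done = new Set(tickets.filter(t => t.col === 'done').map(t => t.id));
  const ready = tickets.filter(t =>
    t.col !== 'done' && !t.blocked && t.blockedBy.every(id => done.has(id)));
  ready.sort((a, b) => a.priority - b.priority);
  return ready[0] ?? null; // null means only blocked work is left
}

const board = [
  { id: 'RHG-17', col: 'done',     priority: 1, blocked: false, blockedBy: [] },
  { id: 'RHG-18', col: 'triage',   priority: 3, blocked: false, blockedBy: ['RHG-17'] },
  { id: 'RHG-20', col: 'triage',   priority: 2, blocked: false, blockedBy: ['RHG-18'] },
  { id: 'RHG-19', col: 'progress', priority: 1, blocked: true,  blockedBy: [] },
];
console.log(nextTicket(board).id); // RHG-18
```

RHG-20 has the better priority but waits on RHG-18, and RHG-19 is parked on its human-only fork, so the loop picks RHG-18 without anyone sequencing the board.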
Here is the other way you consume an AFK run: not live, but afterward. You came back to your keyboard; the run finished hours ago. You read the record top to bottom — a timeline of every pass, why each verdict came out the way it did, the totals, and a checklist of any follow-ups. This is the same flight-recorder log from earlier, rendered as a report you can scan in two minutes. Olive dots are normal passes, clay is a pass that found a gap, blue is the one fork that pulled a human, green is convergence.
LEARN baseline · 1/5 done-when met
The loop read the real state of RHG: build green, but no /health endpoint, two failing tests, a11y unchecked. Picked the failing tests as the first unit.
Fixed failing tests · proof: vitest run → 0 failing
Executed one bounded fix, verified at the real boundary (ran the suite), recorded the pass. 2/5 met. Looped on its own.
progress
Added /health: first attempt failed proof
VERIFY hit :3000/health and got 404, not 200. The Proof Gate refused to pass it. The loop did not claim success — it logged the failure and improved the plan.
/health returns 200 · 3/5 met
Route registered correctly; curl returned 200 at the real boundary. Recorded and looped. Still nobody at the keyboard.
Hit a fork — pricing copy needs a human
4/5 met. The last item was a product decision the loop is not allowed to make. It paused, packaged the decision, and raised a handoff. This is the only moment a human was pulled in.
handoff · decision-ready
Decision applied · 5/5 met · converged
Once the human answered the one fork, the loop applied it, re-verified all five done-when items at their real boundaries, and stopped. Run complete, fully proven.
converged
What 39 unattended minutes bought, and the number that matters most at the end.
6 passes run
1 proof failure (caught)
1 human pull-in
5/5 done-when met
After reading the run, the only things left for you are light follow-ups, not re-doing the work. Check them off; the bar tracks you. Note that none of these is "run the passes again": the loop already proved every one.
LOOP-LOG.md and review.md are different artifacts
LOOP-LOG.md is written during the run, pass by pass, by the loop itself; it is the raw flight recorder. review.md is written after the run by a separate AFK QA pass whose only job is observability: it reads the finished state, re-checks the done-when independently, and writes a human-readable verdict. Crucially, the QA is itself AFK, so the human does not run it. The human reads its output. Even the review of the run is automated; your role stays pure observation.
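To illustrate the QA pass turning independent re-checks into a readable verdict, here is a small rendering sketch. The report layout, the function name `renderReview`, and the result fields (`id`, `ok`, `evidence`) are all hypothetical; the lesson does not specify what review.md looks like.

```javascript
// Render an end-of-run report from independently re-checked done-when results.
// Layout and field names are assumptions for this sketch.
function renderReview(results) {
  const met = results.filter(r => r.ok).length;
  const verdict = met === results.length ? 'converged' : 'not converged';
  const lines = results.map(r => `[${r.ok ? 'x' : ' '}] ${r.id}: ${r.evidence}`);
  return [
    `verdict: ${verdict} (${met}/${results.length} done-when met)`,
    ...lines,
  ].join('\n');
}

const review = renderReview([
  { id: 'tests',  ok: true, evidence: 'vitest run reported 0 failing' },
  { id: 'health', ok: true, evidence: 'curl :3000/health returned 200' },
]);
console.log(review);
```

The report carries evidence per item rather than a bare tick, so the reader of review.md is still reading proof, just as with the log.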
Pass 3 failing its Proof Gate, then being caught and retried, is the system working correctly. A run with zero proof failures across many passes is more suspicious, not less — it can mean the gate is too weak to catch anything. The presence of a caught-and-recovered failure in the log is evidence the boundary is real. The observer should be reassured by it, not alarmed.
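The end-of-run totals, and the "zero failures is suspicious" heuristic, can both be derived from the pass records. This is a sketch under stated assumptions: `proofPassed` and `verdict` are assumed field names, and the ten-pass threshold for flagging a possibly weak gate is an arbitrary illustration, not a rule from the course.

```javascript
// Summarise a finished run from its pass records.
// Field names and the `suspiciousAfter` threshold are assumptions for this sketch.
function summarize(passes, suspiciousAfter = 10) {
  const proofFailures = passes.filter(p => !p.proofPassed).length;
  const humanPulls = passes.filter(p => p.verdict === 'fork').length;
  return {
    passes: passes.length,
    proofFailures,
    humanPulls,
    gateLooksWeak: proofFailures === 0 && passes.length >= suspiciousAfter,
  };
}

// The six passes from the timeline above.
const timeline = [
  { verdict: 'progress', proofPassed: true  }, // LEARN baseline
  { verdict: 'progress', proofPassed: true  }, // failing tests fixed
  { verdict: 'progress', proofPassed: false }, // /health returned 404
  { verdict: 'progress', proofPassed: true  }, // /health returned 200
  { verdict: 'fork',     proofPassed: true  }, // pricing copy needs a human
  { verdict: 'done',     proofPassed: true  }, // converged
];
const s = summarize(timeline);
console.log(s.passes, s.proofFailures, s.humanPulls, s.gateLooksWeak); // 6 1 1 false
```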
Everything so far has said "you observe, you do not drive." Here is the single, bounded exception. When the loop hits a decision that no evidence can settle — a product call, an irreversible action, a missing secret — it does not guess and it does not spin. It stops at a fork, packages the decision, and hands it to you decision-ready. You make the one call. The loop takes it from there and goes back to running itself.
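"Decision-ready" implies the handoff already contains everything needed to answer it. The sketch below shows one hypothetical shape for that package; the function names, fields, and option labels are invented for illustration, and only the status text `blocked · decision-ready` comes from this lesson.

```javascript
// Package a fork so the human can answer it at a glance.
// Field names are hypothetical; the lesson only requires that the handoff be decision-ready.
function buildHandoff({ pass, question, options, context }) {
  if (options.length < 2) {
    throw new Error('a fork needs at least two options to choose between');
  }
  return Object.freeze({
    status: 'blocked · decision-ready',
    pass,
    question,
    options: [...options],
    context,
  });
}

// Apply the human's answer; anything outside the offered options is rejected.
function resolve(handoff, choice) {
  if (!handoff.options.includes(choice)) {
    throw new Error(`"${choice}" was not one of the offered options`);
  }
  return { ...handoff, status: 'resolved', choice };
}

const handoff = buildHandoff({
  pass: 5,
  question: 'Which pricing copy should ship?',
  options: ['Option A', 'Option B'],
  context: '4/5 done-when met; only the pricing copy remains',
});
console.log(resolve(handoff, 'Option A').status); // resolved
```

Requiring at least two concrete options forces the loop to do the preparation work before interrupting, which is what keeps the human's contribution down to a single choice.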
The whole model in one sentence: the loop runs every pass AFK and proves each at a real boundary; you live on the observability side reading LOOP-LOG.md / status / review.md; the only thing that ever crosses to your side is a genuine, decision-ready fork — and after you answer it, the loop goes right back to running itself.
The whole idea has a concrete home: a goal file you write once, an append-only log the loop writes as it runs, and a status you read. Each artifact opens with a single read-only command. The thing to notice is that none of them is a control surface for you: they are the loop's workspace and your read-only window onto it.
~/.claude/skills/loop-engineering/forge-flow.md: the AFK + observability contract
```text
# the human's ONLY role is observability:
#   read LOOP-LOG.md - every pass, with real proof
#   read status      - converging / iterating / blocked
#   read review.md   - the AFK QA's end-of-run report
# the human NEVER executes a pass, and NEVER runs the QA.
# the human blocks ONLY on a genuine user-only fork (handoff).
```
And the run itself, stripped to its spine — a loop that drives passes and writes the log, with the human nowhere in the per-pass path:
```js
while (!doneWhenMet(goal)) {          // the contract from lesson 2
  const proof = await pass(goal);     // learn→analyze→execute→verify
  log.append(proof);                  // observability, every pass
  if (proof.verdict === 'fork')       // the ONE human pull-in
    await handoff(proof);             // decision-ready, then resume
}
await qaReview(goal);                 // AFK QA → review.md (you READ it)
```
```sh
# read the AFK + observability flow that this lesson teaches
cat ~/.claude/skills/loop-engineering/forge-flow.md

# during a run, tail the flight recorder without touching the loop
tail -f LOOP-LOG.md

# after a run, read the AFK QA's observability report
cat review.md
```
From the project's own working rules: "TUDO roda AFK; o humano só tem observabilidade (lê LOOP-LOG.md/review.md/status, não executa nada — nem o QA)." In English: everything runs AFK; the human only has observability — reads the log, the review, and the status, and executes nothing, not even the QA. The Validator is never the one who built the unit, and the human blocks only on a genuine user-only fork via handoff. This lesson is that rule, made visible.
Let us put it all together with one concrete story, end to end, the way it actually happens.
5:58 PM. You write the goal for RHG: ship a /health endpoint, get the test suite green, clear critical a11y issues, and land the new pricing copy. You make the done-when measurable — exact commands and exact pass conditions — approve it, and start the run. Then you close the laptop and go make dinner. You are now AFK.
6:02–6:35 PM. The loop runs. Pass 1 reads the baseline. Pass 2 fixes the failing tests and proves it with the suite. Pass 3 adds /health but the Proof Gate catches a 404 and refuses to pass it — the loop logs the failure honestly and improves the plan. Pass 4 gets a real 200 from curl. Each pass appends to LOOP-LOG.md. You are eating; nobody is at the keyboard. The status quietly reads iterating.
6:36 PM. Pass 5 reaches the last item — the pricing copy. That is a product decision, not something proof can settle. The loop does the honest thing: it stops at a fork, writes a decision-ready handoff, and the status flips to blocked · decision-ready. Your phone shows one notification. This is the only moment the evening needed you.
6:37 PM. You glance at it: two copy options, with context. You pick one. That is your entire contribution — one decision, ten seconds. You put the phone down.
6:38–6:41 PM. The loop applies your choice, re-verifies all five done-when items at their real boundaries, hits 5/5, and stops. Pass 6: converged. An AFK QA pass then writes review.md. Nobody ran it.
8:15 PM. You wander back, open LOOP-LOG.md and review.md, and read the whole evening in two minutes: six passes, one caught proof failure, one fork you answered, everything proven. You did not run a single pass. You observed a run and made one decision. That is the loop in motion.
Count your actions: write the goal (once), answer one fork (ten seconds), read the log (after the fact). Three touches across two hours and seventeen minutes of elapsed time, of which the run itself took 39 minutes. Everything else (six passes, a caught failure, full re-verification, the QA report) ran without you.
Three questions, from memory — no peeking at the sections above. Pick one answer in each; you will find out immediately whether it is right and why. Recalling beats re-reading, so give each a real think first.
1 · During an AFK run, what is the human's role?
2 · A pass verifies, sees the gap shrank but is not closed. What happens next?
3 · Which event is the ONLY legitimate reason to pull a human into a run?
Try it: read LOOP-LOG.md from one of your own runs. Next up: the Forge front-end, where this exact AFK loop gets a full 7-step pipeline wrapped around it.