Step 14 · In Practice · Loop Engineering
Module 5 · In Practice · Lesson 14

In practice: one ask, end to end

One rough idea, driven all the way to shipped — Forge to set it up, the loop to build it, the toolbelt to prove it — and the whole time the human only watched. This is the previous thirteen lessons doing one real job, then handing you the keys.

Plain-language first; open any panel for the precise version.
1

The big idea: a rough idea, shipped, while you watched


Everything in this course has been one piece at a time: what a loop is, how to scope it, the five steps, the gates, the Forge front-end, the toolbelt, the course engine. This lesson runs all of it at once, on one real ask, from a sentence to a shipped change — and you do not touch the keyboard once it starts.

Here is the ask we will follow the whole way down: "RHG needs a health endpoint and a tiny status page, behind login, shipped safely." That is vague on purpose — it is how real asks arrive. By the end it becomes a measurable contract, a board of tickets, code that an agent wrote, a proof that the code actually works, and a write-up of what shipped. None of that is mocked: every "done" on this page is a real check that really passed.

The shape of the run is two halves. Forge is the front-end that turns the fog into a plan — seven steps: grill, research, prototype, PRD, issues, implement, review. The loop is the engine inside the "implement" step that builds each ticket and refuses to call it done until a real check passes. Wrapped around both is one rule that makes it safe to walk away: everything runs AFK (away from keyboard), and your only job is observability — you read the log, you do not drive a pass.

Think of it like… commissioning a kitchen renovation while you are at work. You don't lay tile. You hand over a brief, the crew turns it into a plan with a sign-off at each stage, they build it, an independent inspector checks each stage against the brief, and you get photos on your phone all day. You only step in for a decision only you can make — "which countertop?" — never to hold a hammer.

The loop (the engine)

Five steps, run over and over until the contract is met: LEARN (observe the real state) → ANALYZE (classify the gap, pick exactly ONE bounded unit) → EXECUTE (build that one unit) → VERIFY at the real boundary (the Proof Gate — run the actual check, never a claim and never a mock) → DECIDE (advance, retry, or escalate). The Proof Gate is the non-negotiable: a unit is "done" only when a command run against the real artifact returns the expected result.
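The five steps above can be sketched as a small driver. This is a minimal illustration, not the harness's implementation: `runLoop`, the `steps` callbacks, and the `maxPasses` retry budget are hypothetical names, and escalating when the budget runs out is an assumption about when DECIDE would choose "escalate".

```javascript
// Hypothetical sketch of the five-step loop; all names here are illustrative.
function runLoop(steps, maxPasses = 5) {
  const log = [];
  for (let pass = 1; pass <= maxPasses; pass++) {
    const state = steps.learn();            // LEARN: observe the real state
    const unit = steps.analyze(state);      // ANALYZE: pick exactly one bounded unit
    const artifact = steps.execute(unit);   // EXECUTE: build that one unit
    const passed = steps.verify(artifact);  // VERIFY: run the real check (the Proof Gate)
    const decision = passed ? 'advance' : 'retry'; // DECIDE
    log.push({ pass, unit, decision });
    if (passed) return { outcome: 'advance', log };
  }
  // Assumption: escalate to a human once the retry budget is spent.
  return { outcome: 'escalate', log };
}

// Usage with stubbed steps: the gate fails twice, then passes on the third build.
let attempts = 0;
const result = runLoop({
  learn: () => ({ attemptsSoFar: attempts }),
  analyze: () => 'add /health route',
  execute: (unit) => ({ unit, attempt: ++attempts }),
  verify: (artifact) => artifact.attempt >= 3,
});
console.log(result.outcome, result.log.length); // advance 3
```

The only way out of the loop with "advance" is a passing `verify`; nothing in `execute` can short-circuit it.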

Forge (the 7-step front-end)

For a raw or vague ask you run Forge first: 1 grill (self-debate that converges the scope), 2 research (optional — the Bright Data CLI pulls real facts into research.md), 3 prototype (optional — real evidence the approach works), 4 PRD (the product spec), 5 issues (tickets with BLOCKING relationships — a kanban) plus GOAL.md (the durable ultragoal contract), 6 implement (the AFK loop: an Executor builds each ticket, an independent Validator proves it against GOAL.md — the Validator is never the builder), 7 review (AFK QA that emits review.md as an observability report).

AFK + observability

Every step above runs unattended. The human's only role is observability: read LOOP-LOG.md, review.md, the status readout. You never execute anything — not even the QA. You are blocked only on a genuine user-only fork, surfaced as a decision-ready handoff. Cross-agent delegation is via headless cli -p; web evidence is always the Bright Data CLI (never WebSearch/WebFetch, never the Bright Data MCP); GUI verification via Computer Use is non-blocking and accessibility-only.

Control-plane patterns this run follows: steipete/agent-scripts. The durable-goal discipline behind GOAL.md: jxnl/dots (ultragoal).

2

The whole run in one picture


Read it left to right. The seven Forge steps run in order; step 6 (implement) is where the loop spins, one pass per ticket; step 7 (review) emits the report. Under the whole line runs the observability rail — the human reads it, never steps onto it, until the one decision fork pulls them up.

[Diagram: the seven Forge steps in a row: 1 grill → 2 research (brightdata) → 3 prototype → 4 PRD → 5 issues + GOAL.md → 6 implement (the loop spins) → 7 review (review.md). Beneath them runs the OBSERVABILITY rail, where the human reads LOOP-LOG.md · status · review.md and never executes; the handoff marks the one user-only fork.]
Seven steps across the top; the loop spins inside step 6; the observability rail is where the human lives. The only upward arrow is the handoff.

We will now walk every numbered beat of this picture as a live, clickable thing — a flowchart you step through, a plan you open phase by phase, a board that moves itself, the tiers that stayed locked, a replay of one pass, the recovery when a proof failed, the improve-the-prompt branch, and the shipped PR. Nine interactive widgets, one ask.

3

Walk the run end to end


Same run, now as a decision you can step through. Pick a path and press Next: the happy path runs all seven steps to a shipped result; the other paths show the two forks that can pull you off the autopilot — a proof that failed (the loop recovers itself) and the one genuine user-only decision (the loop stops and hands you a choice).

[Flowchart: a rough ask arrives → grill · research · prototype → PRD written → issues (kanban) + GOAL.md → EXECUTE one unit (Executor builds) → Proof Gate: 200? Pass: review.md · shipped. Fail: retry. User-only: decision-ready handoff.]
Read top → bottom. The Proof Gate is the diamond: a pass ships, a fail loops back to EXECUTE, and a genuine user-only choice branches to the handoff.

The gate is a command, not an opinion

In the loop, EXECUTE never gets to declare victory. VERIFY runs the check named in GOAL.md against the real artifact — here curl -s -o /dev/null -w "%{http_code}" localhost:8080/health — and only a literal 200 advances the pass. A failing gate does not stop the run; it feeds the failure back into ANALYZE and the loop tries again (that is the "fail → retry" arrow). The run halts for a human only at the handoff fork, which is reserved for a choice no agent should make alone.
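A minimal sketch of "the gate is a command", under stated assumptions: `proofGate` and the injected `runCommand` are hypothetical names, not the harness's API. The runner is injected so the sketch runs without a live service.

```javascript
// Hypothetical sketch: proofGate and runCommand are illustrative names.
function proofGate(check, runCommand) {
  // Compare the real command's output with the code the contract expects.
  const actual = String(runCommand(check.command)).trim();
  return { passed: actual === check.expected, actual, expected: check.expected };
}

const healthCheck = {
  command: 'curl -s -o /dev/null -w "%{http_code}" localhost:8080/health',
  expected: '200',
};

// Stubbed runner standing in for the real service. Against a live service the
// runner could be: (cmd) => require('node:child_process').execSync(cmd).toString()
const failingRunner = () => '500';
console.log(proofGate(healthCheck, failingRunner).passed); // false
```

The Executor's own report never enters `proofGate`; only the command's output does, which is the point of the gate.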

4

Anatomy of the ask: from a sentence to a contract


The first thing Forge does is grill the idea — it argues with itself until the fog turns into something measurable. "Shipped safely" is not testable; curl localhost:8080/health → 200 is. Below is each Forge step for our ask, and the concrete proof it leaves behind. Notice that every step hands the next one a real artifact, so nothing is taken on faith.

  1. grill · Converge the scope. Self-debate resolves "behind login" and "safely" into concrete behavior. → a sharp, two-sentence scope
  2. research · Pull real facts. The Bright Data CLI fetches the current health-check convention into research.md. → research.md (cited)
  3. prototype · Evidence it works. A throwaway route returns 200 locally, which is proof the approach is sound. → a running spike
  4. PRD · The spec. Problem, scope, non-goals, and the done-when, written down once. → PRD.md
  5. issues · The board + contract. Tickets with BLOCKING links become a kanban; GOAL.md records the durable done-when. → issues + GOAL.md
  6. implement · The AFK loop. Executor builds each ticket; an independent Validator proves it at the gate. → proven commits
  7. review · Observability QA. AFK QA reads the whole run and emits a report you read, not run. → review.md

The grill's output (the sharpened scope)

"Behind login" → the endpoint is public (load balancers must reach it unauthenticated) but the status page requires a session. "Safely" → ship behind a flag, default off, with a one-flip rollback. The done-when stops being a feeling and becomes a command.

The contract it wrote — GOAL.md

This is the ultragoal artifact: agent-, CLI-, and model-agnostic. Universal activation is simply running a durable GOAL.md under the loop (a vendor's native goal feature is one optional example, never required).

GOAL.md — the durable contract every Validator checks against
<goal>   Add a /health endpoint and a session-gated status page to RHG. </goal>
<context> repo: rhg-api · service runs on :8080 · ship behind flag status_page_v1 </context>
<constraints>
  - /health is unauthenticated (the load balancer probes it)
  - /status requires a valid session cookie
  - flag defaults off; rollback = flip the flag, no deploy
</constraints>
<verification>
  - curl -s -o /dev/null -w "%{http_code}" localhost:8080/health  → 200
  - curl -s -o /dev/null -w "%{http_code}" localhost:8080/status  → 302 (no session)
</verification>
<done-when> both verification commands return their codes on the real service </done-when>

ultragoal / durable-goal discipline: jxnl/dots.
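Because the contract is plain text, a Validator could read its checks mechanically. The sketch below is an assumption-heavy illustration: `parseVerification` is a hypothetical helper, and the "command → code" line format is assumed for the example rather than taken from a documented GOAL.md grammar.

```javascript
// Hypothetical sketch: the "command → code" line format is an assumption.
const goalText = `
<verification>
  - curl -s -o /dev/null -w "%{http_code}" localhost:8080/health → 200
  - curl -s -o /dev/null -w "%{http_code}" localhost:8080/status → 302 (no session)
</verification>
`;

function parseVerification(text) {
  // Pull out the <verification> section, then split each bullet into command and code.
  const section = text.match(/<verification>([\s\S]*?)<\/verification>/);
  if (!section) return [];
  return section[1]
    .split('\n')
    .map((line) => line.match(/^\s*-\s*(.+?)\s*→\s*(\d{3})\b/))
    .filter(Boolean)
    .map(([, command, expected]) => ({ command, expected }));
}

const checks = parseVerification(goalText);
console.log(checks.length, checks[1].expected); // 2 302
```

Each parsed entry pairs a command with the status code it must return, which is all the done-when needs.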

5

The plan: the PRD as phases with exit bars


The PRD is not a wall of prose — Forge shapes it as a sequence of phases that go in order, and a phase only advances when it clears its exit bar, the proof that it is safe to continue. The strip is the map of our run; click a phase to open its goal, its tasks, the exit criteria, and the risks the loop is watching for.

Think of it like… renovating room by room. You don't empty the whole house onto the lawn — you finish one room, check nothing is broken, then start the next, and you keep one working tap until the very end so you can always wash your hands.

Ask: RHG health endpoint + status page · Window: one AFK evening · Driver: Forge → the loop (Orchestrator delegates via cli -p)
Progress: 2 of 4 phases complete

Click a phase, or focus the bar and use the arrow keys, to open its card.

Phase 3 · In progress

Implement under the loop

the AFK loop

Goal: Build each ticket and prove it at the real boundary. The Orchestrator delegates a unit via cli -p; an Executor builds it; an independent Validator runs the gate. The Validator is never the builder.

Tasks
  • Executor implements the /health route behind the flag
  • Validator runs curl … /health and asserts 200
  • Computer Use verifies the status page renders — non-blocking, AX-only
  • Each green unit moves itself to Proven on the board
Exit criteria
  • Both GOAL.md commands return their codes on the real service
  • No unit marked done on a claim — only on a passing gate
  • Every retry and failure is in LOOP-LOG.md
Risks & mitigations
  • High · A proof fails mid-run. The route 500s under the flag. Mitigation: the gate catches it; the loop feeds the failure back to ANALYZE and retries (see §9).
  • Med · Builder grades its own work. Self-validation hides bugs. Mitigation: the Validator is a separate agent that never wrote the code.
  • Low · GUI check blocks the run. A modal steals focus. Mitigation: Computer Use is accessibility-only and never raises the app or moves the cursor.

An exit criterion is a gate, not a feeling

"Implement went well" is not a gate; "curl … /health returns 200 on the real service" is. A phase cannot advance until every box is a passed check — which is exactly the loop's Proof Gate applied at the phase scale. The plan and the loop are the same discipline at two zoom levels.

The milestone bar is a live state machine

Each segment carries done, active, or todo; selecting one swaps the visible role="tabpanel". In a real run these states are driven by the ticket board, so the bar reflects reality rather than the plan as written.
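The exit-bar rule and the done / active / todo states can be sketched together. This is an illustration under assumptions: the phase names come from the milestone bar, but the data shapes, the helper names, and the exit checks listed for phases 1, 2, and 4 are hypothetical.

```javascript
// Hypothetical sketch: phase and criterion shapes are assumptions for illustration.
const phases = [
  { name: 'Frame', exit: [{ check: 'scope and GOAL.md written', passed: true }] },
  { name: 'Decompose', exit: [{ check: 'tickets with BLOCKING links exist', passed: true }] },
  { name: 'Implement', exit: [
    { check: 'curl /health returns 200 on the real service', passed: true },
    { check: 'curl /status returns 302 without a session', passed: false },
  ] },
  { name: 'Review/ship', exit: [{ check: 'review.md emitted', passed: false }] },
];

// A phase clears its exit bar only when every criterion is a passed check.
const exitCleared = (phase) => phase.exit.every((c) => c.passed);

// Phases go in order: done until the first uncleared bar, which is active; the rest wait.
function phaseStates(list) {
  let activeSeen = false;
  return list.map((p) => {
    if (!activeSeen && exitCleared(p)) return 'done';
    if (!activeSeen) { activeSeen = true; return 'active'; }
    return 'todo';
  });
}

console.log(phaseStates(phases)); // [ 'done', 'done', 'active', 'todo' ]
```

One failing criterion is enough to hold a phase at "active", which is the Proof Gate applied at phase scale.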

[Milestone bar: 1 · Frame (done) → 2 · Decompose (done) → 3 · Implement (in progress) → 4 · Review/ship (planned), with an exit gate between phases. Flag stays OFF (rollback by a flip) … enforced on at ship.]
Each phase advances only through its exit gate. The flag is off through phases 1–3 so rollback is a single flip; ship is where it goes live.
6

The issue board moves itself


Forge's "issues" step turns the PRD into a board of tickets with four columns — Blocked (waiting on a dependency), Ready (no open blockers), Building (an Executor has it), and Proven (a Validator passed its gate). During the AFK run the loop advances cards itself; here you can drive them. A ticket always lives in exactly one column, and the counts stay honest the whole time.

Notice the locked card: a ticket with an open BLOCKING dependency cannot move until its blocker is Proven — the same rule the loop obeys, so it never builds something whose foundation isn't there yet.

Think of it like… sticky notes on a wall. A note never sits in two places; you peel it off "ready" and stick it under "building". Nothing is lost, and the wall always tells you how much is left — except the notes that are taped down until the one above them is finished.


One array is the source of truth

Every ticket is one object with a col field — blocked, ready, building, or proven — and an optional blockedBy. The columns on screen are not the truth; the array is. Both the arrow button and a drag-and-drop land in the same moveTo(), so the two interactions can never disagree, and a card physically cannot appear in two columns.

BLOCKING is enforced in one place

A move out of blocked is refused while the blocker isn't proven; when a blocker reaches proven, its dependents auto-promote to ready. That is the kanban rule from Forge's "issues" step, and it is why the loop never picks up a unit whose foundation is missing. No framework — one array, one render, native drag events.

moveTo — the only place a ticket's column changes (blocking enforced)
function moveTo(id, col) {
  const t = tickets.find(x => x.id === id);
  if (!t || t.col === col) return;
  if (t.blockedBy && !isProven(t.blockedBy)) return;  // BLOCKING: refuse the move
  t.col = col;                 // single source of truth
  if (col === 'proven') promoteDependents(id);  // unblock what waited on it
  render(id);                  // repaint everything from state
}
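To make the rule runnable outside the page, here is a self-contained sketch of the same `moveTo` with its helpers filled in. The ticket ids and the bodies of `isProven` and `promoteDependents` are assumptions; `render` is left out, and `moveTo` returns a boolean here only so the result can be checked.

```javascript
// Hypothetical sketch: ticket ids and helper bodies are assumptions.
const tickets = [
  { id: 'health', col: 'ready' },
  { id: 'status', col: 'blocked', blockedBy: 'health' },
];

const byId = (id) => tickets.find((t) => t.id === id);
const isProven = (id) => byId(id)?.col === 'proven';

// When a blocker is proven, every ticket waiting on it becomes ready.
function promoteDependents(id) {
  for (const t of tickets) {
    if (t.blockedBy === id && t.col === 'blocked') t.col = 'ready';
  }
}

function moveTo(id, col) {
  const t = byId(id);
  if (!t || t.col === col) return false;
  if (t.blockedBy && !isProven(t.blockedBy)) return false; // BLOCKING: refuse the move
  t.col = col;                                              // single source of truth
  if (col === 'proven') promoteDependents(id);              // unblock what waited on it
  return true;
}

console.log(moveTo('status', 'building')); // false: its blocker is not proven yet
moveTo('health', 'building');
moveTo('health', 'proven');
console.log(byId('status').col);           // ready
console.log(moveTo('status', 'building')); // true
```

Because `col` is a single field on each ticket, a card in two columns is not representable in this state at all.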
7

Authorization tiers in action: what stayed user-gated


"Runs AFK" does not mean "does anything it likes." Each capability the run can use sits in a tier: most are auto (the loop may use them unattended — read files, run the build, run the gate, drive a non-blocking GUI check), and a few are gated (they require a human, surfaced as a handoff). The panel below is the authorization map for our run. Flip a capability on and watch it either go live or raise a red warning that it can't run without its gate.

The teaching point: a gated capability turned "on" by the loop alone does not actually fire — it shows as blocked, exactly like a feature flag switched on while its dependency is off. That is what keeps an autonomous run safe.

Think of it like… the light switches in a building. The reading lamps are on a circuit you can flip freely. But the main breaker for the server room needs a key — flip its switch without the key and a tag lights up: "needs sign-off first."

read & build · auto

Read the repo, run the build, run the test suite. The loop's bread and butter — never needs a human.

tier: auto — no gate
run the Proof Gate · auto

Run the curl … /health check against the real service. The Validator does this every pass, unattended.

tier: auto — no gate
verify GUI (Computer Use) · auto

Read the status page via the accessibility tree to confirm it renders. Non-blocking, AX-only — never moves the cursor.

tier: auto — non-blocking
flip the prod flag to 100% · gated

Turn status_page_v1 on for all users. This is a launch decision — the loop must hand it to a human.

requires: human sign-off (handoff)
force-push to main · gated

Rewrite shared history on the default branch. Destructive and irreversible — always a human's call.

requires: human sign-off (handoff)


One map decides what is "actually live"

Each capability has a tier. The loop may switch on anything, but the effective set — what actually fires — excludes any gated capability that has not been authorized by a human. A gated switch flipped by the loop alone renders the inline warning and is reported as blocked, never delivered. The "Get sign-off" fix here stands in for the real handoff: it records the human authorization that clears the block.

const tier = { read_run:'auto', proof_gate:'auto', gui_verify:'auto',
               enable_flag:'gated', merge_main:'gated' };

function effective(on, authorized) {
  // on by the loop AND (auto OR a human authorized it) = actually fires
  return Object.keys(on).filter(id =>
    on[id] && (tier[id] === 'auto' || authorized[id]));
}
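A short usage sketch of the map above. The `tier` table and `effective` are repeated from the panel so the block runs on its own; `blocked` is a hypothetical helper added here to show what the warning would report.

```javascript
const tier = { read_run: 'auto', proof_gate: 'auto', gui_verify: 'auto',
               enable_flag: 'gated', merge_main: 'gated' };

function effective(on, authorized) {
  // on by the loop AND (auto OR a human authorized it) = actually fires
  return Object.keys(on).filter((id) =>
    on[id] && (tier[id] === 'auto' || authorized[id]));
}

// Hypothetical helper (not in the source): switched on, but held at the gate.
function blocked(on, authorized) {
  return Object.keys(on).filter((id) =>
    on[id] && tier[id] === 'gated' && !authorized[id]);
}

const on = { proof_gate: true, enable_flag: true, gui_verify: false };

console.log(effective(on, {}));                    // [ 'proof_gate' ]
console.log(blocked(on, {}));                      // [ 'enable_flag' ]
console.log(effective(on, { enable_flag: true })); // [ 'proof_gate', 'enable_flag' ]
```

A capability id that is missing from `tier` is never 'auto', so it fires only with authorization: the map fails closed.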

This is the safety rail under "everything runs AFK": autonomy over the auto tier, a hard stop on the gated tier. The human never executes the auto work and is pulled in only for the gated forks.

8

Replay one AFK pass, state by state


Zoom all the way in: a single pass of the loop on the /health ticket. Press the events and watch the loop move through its states. The whole point is that you can't skip around — you can't VERIFY before you EXECUTE, and once a pass DECIDEs to advance, that unit is done. Buttons grey out the moment a move isn't allowed from where you are.

Think of it like… a board game where only certain squares connect. You roll, you move — but the board won't let you jump to a square there's no path to. The greyed-out buttons are the squares you simply can't reach from where you stand.

[State diagram: LEARN → (analyze) → ANALYZE → (pick one) → EXECUTE → (verify) → VERIFY · gate → (advance) → DONE · final, with a retry edge from VERIFY back to ANALYZE.]

The clay-filled node is the current step. Faint nodes are unreachable from here.

Current step: LEARN

Observe the real state of rhg-api: no /health route exists yet, flag is off.

Allowed moves: analyze

The pass log is what LOOP-LOG.md records.

    The pass is a finite-state machine

    One current state, a fixed set of events, and a table mapping (state, event) → nextState. Events not in the table for the current state are disabled, so an illegal move (verifying before executing) is impossible by construction — the same reason the loop never claims done without running the gate. VERIFY is the only state with two exits: advance to DONE on a pass, or retry back to ANALYZE on a fail.

    const loop = {
      LEARN:   { analyze: 'ANALYZE' },
      ANALYZE: { execute: 'EXECUTE' },
      EXECUTE: { verify:  'VERIFY' },
      VERIFY:  { advance: 'DONE', retry: 'ANALYZE' },  // gate decides which
      DONE:    {}                                       // terminal
    };
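A runnable sketch of driving that table. The `loop` object is repeated from the panel; `allowed` and `send` are hypothetical helper names, and throwing on an illegal event is one possible choice (the page disables the button instead).

```javascript
const loop = {
  LEARN:   { analyze: 'ANALYZE' },
  ANALYZE: { execute: 'EXECUTE' },
  EXECUTE: { verify:  'VERIFY' },
  VERIFY:  { advance: 'DONE', retry: 'ANALYZE' },
  DONE:    {},
};

// Hypothetical helpers: the events a state accepts, and one transition.
const allowed = (state) => Object.keys(loop[state]);

function send(state, event) {
  const next = loop[state][event];
  if (!next) throw new Error(`illegal move: ${event} from ${state}`);
  return next;
}

// One failed gate, then a pass: the retry goes back through ANALYZE and EXECUTE.
let current = 'LEARN';
for (const event of ['analyze', 'execute', 'verify', 'retry', 'execute', 'verify', 'advance']) {
  current = send(current, event);
}
console.log(current);           // DONE
console.log(allowed('VERIFY')); // [ 'advance', 'retry' ]
```

There is no entry for `verify` under LEARN or ANALYZE, so a pass cannot reach the gate without going through EXECUTE first.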
    9

    When a proof failed: the recovery


    Halfway through the run, a gate went red. The Validator ran curl … /health and got a 500, not the 200 the contract demands. This is the moment the whole design is built for: nothing shipped, no one was woken up, and the loop recovered itself. Below is the report the run produced — a timeline of what happened, the root cause dug out with five whys, the blast radius, and the fixes the loop applied — and you only read it.

    Think of it like… a smoke alarm that goes off while the kitchen is still fine. The alarm is the win: it caught the problem before the fire, the sprinkler handled it, and the report tells you to move the toaster — not to rebuild the house.

    GATE-FAIL

    The pass where /health returned 500

    Caught at: Pass 4 · VERIFY
    Recovered: Pass 6 · VERIFY → 200
    Shipped broken? No, the gate blocked it
    Human woken? No, the loop self-recovered
    Reported in: LOOP-LOG.md
    9·a

    Timeline of the failed pass


    Read it top to bottom. Olive dots are routine, clay is a warning, red is the failed gate, green is recovery. The failure surfaced exactly where it should — at VERIFY, before anything shipped.

    1. Pass 4

      Executor builds the /health route

      An agent adds the handler behind status_page_v1 and reports it complete.

      execute
    2. Pass 4

      VERIFY: gate returns 500, not 200

      The Validator runs curl … /health. The handler throws on a nil config read. The claim "complete" is overruled by the boundary.

      gate fail
    3. Pass 5

      DECIDE: retry, not ship

      The failure feeds back into ANALYZE. The loop does not advance and does not escalate — a failed gate is in-scope for the loop to fix.

      retry
    4. Pass 5

      Root cause found in the log

      The handler read the flag config before it was loaded. ANALYZE narrows the fix to one bounded change.

      diagnosed
    5. Pass 6

      Fix executed and re-verified

      Executor guards the config read; the Validator re-runs the gate. curl … /health → 200 on the real service.

      recovered
    6. Pass 6

      Ticket moves itself to Proven

      Only now — on a real 200, not a claim — does the card advance. The run continues, untouched by a human.

      proven
    [Timeline: Pass 4 build → VERIFY 500, gate fails → Pass 5 retry · diagnose → Pass 6 fix → 200 → Pass 6 proven.]
    The failure surfaced at VERIFY and was contained there. Two passes later the same gate returned a real 200.
    9·b

    Root cause — five whys


    Keep asking "but why did that happen?" until you reach something you can actually fix. "The gate returned 500" is the symptom. The fifth answer is the one worth fixing — and it points at a missing test, not a person.

    1. Why did the gate fail?

      The /health handler returned a 500 instead of 200.

    2. Why did the handler 500?

      It threw reading a nil flag config.

    3. Why was the config nil?

      The handler read the flag before the config loader had run on cold start.

    4. Why wasn't that caught earlier?

      The ticket's done-when checked a warm process; nothing exercised the cold-start path.

    5. Root cause · the gate had a blind spot

      The verification command hit an already-initialized server, so the order-of-init bug was invisible to it. The fix is to add a cold-start case to the gate — the proof, not the code, was incomplete.

    The gate did exactly its job

    A claim of "complete" met a boundary that disagreed, and the boundary won. Nothing shipped on a lie because the loop never advances on a claim — only on a passing gate. The cost of the bug was two extra passes of compute, paid by the machine, with the human asleep.

    Blameless, and it hardens the gate

    The fix isn't "the agent wrote a bug"; it's "the contract's verification missed a path." Strengthening the gate (add the cold-start check) makes the system better, so the same class of failure can't slip past next time.

    9·c

    Blast radius & the fixes


    The damage in numbers, and the good news at the end. Then the action items — check them off as they ship; the bar tracks progress.

    • Extra loop passes: 2
    • Broken builds shipped: 0
    • Humans paged: 0
    • Caught at the gate: 100%

    Action items (0 of 3 done): P1, P1, P2
    10

    The improve-the-prompt branch: tune the contract, watch it rebuild


    The loop can improve two different things. Usually it improves the artifact — the code. But when the artifact keeps missing in the same way, the smarter move is to improve the prompt that drives the run: the instruction handed to the next agent. This tuner is that branch made tangible. Turn the knobs on the left — how strict the gate is, which boundary to verify, who the executor is, whether to demand a cited source — and the assembled instruction on the right rebuilds itself, word for word, so you see exactly how each lever rewrites the request before it is sent.

    Think of it like… a coffee machine with dials for strength, size, and milk. You don't re-plumb the machine each time — you turn a dial and the next cup changes. Here the "cup" is the instruction the loop sends, and every dial rewrites it instantly.

    Controls

    • Gate strictness: how hard the Proof Gate is to satisfy.
    • Boundary to verify (shown: live HTTP endpoint): slide from a unit test up to the real running service.
    • Executor: which agent the Orchestrator delegates this unit to via cli -p.

    Assembled instruction (live)

    One pure function maps knobs → instruction

    Each control updates a shared state and calls render(); the preview is whatever buildPrompt(state) returns — nothing writes to it directly. Because the builder is pure (same state → same string), the instruction is reproducible and the changed line simply flashes. This is the literal mechanism of the loop's "improve" step when it targets the prompt instead of the code: change the contract, regenerate the instruction, run again, keep it only if the gate result improves.
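A minimal sketch of the pure builder described above. Only the name `buildPrompt` comes from the text; the knob names, their values, and the wording of each instruction line are assumptions, not the tuner's real strings.

```javascript
// Hypothetical sketch: knob names, values, and wording are assumptions.
function buildPrompt(state) {
  const lines = [
    `Delegate to: ${state.executor}`,
    `Verify at: ${state.boundary}`,
    `Gate strictness: ${state.strictness}`,
  ];
  if (state.requireSource) lines.push('Cite a source for every external fact.');
  lines.push('Report done only after the gate command passes.');
  return lines.join('\n');
}

const knobs = {
  executor: 'executor-a',            // illustrative agent name
  boundary: 'live HTTP endpoint',
  strictness: 'strict',
  requireSource: false,
};

const before = buildPrompt(knobs);
const after = buildPrompt({ ...knobs, requireSource: true });

console.log(before === buildPrompt(knobs));                         // true: same state, same string
console.log(after.split('\n').length - before.split('\n').length);  // 1: one knob added one line
```

Because the builder reads nothing but its argument, two runs with the same knobs send the same instruction, so a change in the gate result can be attributed to the knob that moved.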

    Improve the artifact OR improve the prompt

    The loop converges by improving whichever surface is the bottleneck. A flaky build → improve the code. A run that keeps verifying the wrong thing (the §9 blind spot) → improve the prompt and the gate. Both branches end the same way: re-run, re-prove, decide. The human still only observes.

    11

    The shipped result: a PR write-up


    The run is done; here is what came out. A good result tells a story, not a diff dump. Before you read a line of code you should know the motivation (why we touched this), get a quick file tour (what moved and why), see the focus (the one subtle part worth a second look — the cold-start fix from §9), and trust the rollout (how it goes live, behind its flag, with a one-flip rollback). The four buttons are stops on that walk.

    Think of it like… a tour guide, not a map dump. A map shows every street at once; a guide walks you through, points at the one statue that matters, and tells you where the exit is.

    rhg/rhg-api · pull request #318
    Add /health endpoint and a session-gated status page
    Proven by the loop · +204 −12 · 5 files · flag: status_page_v1

    Why RHG needed this at all.

    The load balancer had no reliable way to tell whether an RHG instance was actually serving — it probed the home page, which could 200 while the app was wedged. We needed a cheap, unauthenticated /health the balancer can trust, plus a small session-gated /status page for operators to eyeball recent checks.

    The pain

    No trustworthy liveness signal; the balancer kept traffic on a wedged instance.

    The goal

    A public /health returning 200, and a /status page that 302s without a session.

    Why now

    The scope was sharp, the done-when was a command, and the whole thing fit one AFK evening.

    Stop 1 of 4 · Motivation

    The reviewer's real questions, front-loaded

    A diff answers "what changed?" but a reviewer asks "why?", "where do I look?", and "will this break prod?". The four-stage shape answers exactly those, so the right scrutiny lands on the load-bearing hunk (the cold-start guard) instead of spreading thin across renamed variables. The rollout earns trust because it's flag-gated, canaried on named metrics, with a no-deploy rollback — the reviewer approves a plan, not a leap.

    12

    How it was built: the suite map


    Everything you just watched runs on one harness and five distributed skills. The loop-engineering harness is the spine: it runs the loop and orchestrates the AFK crew. Around it sit five skills, each owning one job: Forge turns a raw ask into the run; ultragoal writes the durable GOAL.md; visual-teach builds the course (this page); brightdata-cli is the one path to real web evidence; computer-use-cli drives native macOS apps non-blocking. The same five are installed across a dozen agents, so any agent can pick up the work.

    [Suite map: loop-engineering (the harness · runs the loop, orchestrates the AFK crew) at the center, surrounded by Forge front-end (7 steps · grill→review), ultragoal (durable GOAL.md), visual-teach (builds this course), brightdata-cli (real web evidence), and computer-use-cli (macOS · AX-only).]
    One harness (clay, center), five skills around it. The Forge front-end and the toolbelt all plug into the same loop.

    The harness

    loop-engineering runs LEARN → ANALYZE → EXECUTE → VERIFY → DECIDE, drives the Forge 7 steps, and orchestrates the AFK crew (an Orchestrator delegating units via cli -p; an Executor that builds; a Validator that proves and is never the builder). It always ends a non-trivial job by producing a visual-teach course like this one.

    The five skills

    • ultragoal — the durable-goal discipline behind GOAL.md; agent/CLI/model-agnostic. Universal activation is a durable goal run under the loop.
    • visual-teach — the course engine; emits the self-contained EN + PT-BR lessons.
    • brightdata-cli — the one uniform path to web data (SERP, scrape, browser, 40+ datasets). Always this CLI; never WebSearch/WebFetch; never the Bright Data MCP.
    • computer-use-cli — native macOS automation through the accessibility API only; non-blocking, never moves the cursor or raises the app.
    • Forge — the 7-step front-end that turns a raw ask into the run above.

    Control-plane lineage: steipete/agent-scripts · ultragoal lineage: jxnl/dots.

    13

    The handoff: pick it up


    That is the whole suite doing one job. Now it is yours. You do not need to remember the seven steps or the five states — you need to remember one move: invoke /loop-engineering. Give it your rough ask. For anything vague it runs Forge first; for anything concrete it goes straight to the loop. It runs AFK, it proves its own work at the real boundary, and it ends — exactly like this — by handing you a visual-teach course so the next person can pick it up too.

    The only thing you keep doing is the thing you did on this whole page: observe. Read LOOP-LOG.md. Read review.md. Answer the one handoff fork when it comes. That is the job.

    [Diagram: You · a rough ask → /loop-engineering → the run · AFK (Forge → the loop → proof; Orchestrator delegates via cli -p) → shipped result + this visual-teach course (review.md · LOOP-LOG.md). You observe the whole way and never drive a pass; the course hands the next ask back to you.]
    One move to start (/loop-engineering), AFK in the middle, a shipped result and a course out — and the loop is ready to run again.
    Your teacher is one message away. This was the whole suite on one ask. Want to run it on your ask? Tell the agent the rough idea and say "drive it with /loop-engineering." Not sure if your ask is vague enough to need Forge, or concrete enough to go straight to the loop? Ask — that is exactly the kind of question to bring here.
    14

    In the code: where the run lives on disk


    The run is not magic — it is a few plain files an agent reads and writes, plus the one command you type. Here are the real artifacts the run leaves behind, and exactly how to open them.

    the artifacts of one run (under the repo it operates on)
    # the durable contract every Validator checks against
    GOAL.md            # goal · context · constraints · verification · done-when
    # the Forge outputs
    research.md        # facts pulled by the Bright Data CLI (cited)
    PRD.md             # problem · scope · non-goals · done-when
    issues/            # tickets with BLOCKING edges — the kanban
    # the run records (what you read; you never execute)
    LOOP-LOG.md        # every pass: LEARN→ANALYZE→EXECUTE→VERIFY→DECIDE
    review.md          # the AFK QA observability report

    Start the run

    One move, from any agent that has the skills installed:

    # in the repo you want changed:
    /loop-engineering  "RHG needs a /health endpoint and a status page, behind login, shipped safely"

    Watch it (observe, never drive)

    # tail the run log as passes land
    tail -f LOOP-LOG.md
    # read the QA report when phase 4 emits it
    cat review.md
    # the contract the whole run is held to
    cat GOAL.md

    Re-run the proof yourself (optional)

    The done-when is a command, so you can confirm it by hand — the same check the Validator ran:

    curl -s -o /dev/null -w "%{http_code}\n" localhost:8080/health   # → 200
    15

    Quick check


    Recall beats re-reading. Try each question from memory before you click — the answer reveals on click, with the why. No tells in the formatting; every option is the same length.

    Q1During the AFK run, what is the human's only job?

    B. Everything runs AFK; the human only has observability — reads LOOP-LOG.md / review.md / status, never executes, not even the QA. The single exception is a genuine user-only fork, surfaced as a handoff.

    Q2What makes a unit "done" in the loop?

    C. The Proof Gate is a command run against the real artifact — never a claim, never a mock. In our run that was curl … /health → 200 on the actual service. A claim of "complete" lost to the boundary in §9.

    Q3Who proves that an Executor's work meets the goal?

    A. The Validator is never the builder. Separating who builds from who proves is what stops an agent rubber-stamping its own work: the §5 "Med" risk made concrete.

    Q4When does Forge run before the loop?

    D. Forge is the front-end for a raw or vague ask — its grill converges the scope into a measurable done-when. A concrete ask can go straight to the loop with a GOAL.md.

    Q5Why did the failed gate in §9 not become an outage?

    B. Because the loop never advances on a claim, the 500 was caught at VERIFY before anything shipped. The failure fed back into ANALYZE and the loop fixed it itself — two extra passes, zero humans woken.

    Q6Which is the always-the-CLI path for web evidence?

    C. Web evidence is always the Bright Data CLI — the one uniform path every agent has through the shell. Never WebSearch/WebFetch, and never the Bright Data MCP.

    That is the course. You now have the model, the front-end, the toolbelt, and the one move that runs them all. Open any technical panel you skipped, then take it to your own work — your teacher is one message away.