Step 5 · The Loop · Loop Engineering

Module 2 · The Loop · Step 3 of the cycle

EXECUTE: one bounded unit, done right

LEARN looked. ANALYZE picked. Now you build — but exactly one thing, the single most valuable bounded unit, and nothing more. EXECUTE is not "type code until it seems done." It is a contract: read fully, find the root cause, do the simplest thing that meets the scope, add the check that proves it, run that proof, and log it. The discipline is in the bounding.


The big idea: build one thing, all the way

Every turn of the loop has five steps — LEARN → ANALYZE → EXECUTE one bounded unit → VERIFY at the real boundary → DECIDE. This lesson is the third step, the one where work actually happens. Everything before it was preparation; everything after it is judgment. EXECUTE is where you make a change.

The trap is obvious once named: when you finally get to build, it is tempting to fix five things while you are in there. You came to repair a leaking tap and you end up re-plumbing the bathroom. That feels productive. It is the most common way a loop turn goes wrong — because now nothing is small enough to prove, and if anything breaks you cannot tell which of the five changes did it.

So EXECUTE has one rule above all others: do the single most valuable bounded unit, and stop. "Bounded" means the change has an edge you can point at — these lines, this file, this one behavior. "Most valuable" means ANALYZE already ranked it; you do not re-litigate that here. You read the whole thing first, you find the real cause (not the symptom), you make the simplest change that satisfies the scope, you add a check that proves it, you run the proof, and you write down what you did. That sequence is the Unit Contract, and the rest of this lesson teaches it from six angles.

Think of it like… a surgeon with one item on the list. They do not open you up for an appendix and decide to also "tidy up" a knee while they are at it. They scope the cut to exactly what was agreed, they confirm the count of instruments before they close, and they write the operative note. Bounded, proven, logged. The skill is not cutting more — it is cutting only what was scoped, and proving you left the rest intact.

The whole point of bounding is attribution. If a turn changes one thing and verification fails, you know exactly what to undo. If it changes five things, a failure tells you almost nothing, and the loop loses its single best property: that every step is reversible and explainable.

Why "one bounded unit" is load-bearing for an autonomous loop

In an autonomous, AFK run the loop is executed without a human approving each diff. The only thing that keeps that safe is that every EXECUTE produces a change small enough for the VERIFY step to prove or reject at a real boundary. A diff that touches one behavior maps cleanly onto one Proof Gate; a diff that touches five behaviors needs five proofs and a much larger blast radius if it must be reverted. Bounding is what makes the next step — VERIFY — tractable.

EXECUTE never claims; it sets up the proof

EXECUTE does not get to say "done." It produces the change and the check that will judge the change, then runs that check. The verdict belongs to VERIFY (the next lesson), and in a crew it belongs to an independent Validator — never the builder. So the EXECUTE step's job is to leave behind something that is cheaply and honestly verifiable: a failing test that now passes, a command whose exit code flips, a boundary observation that changes. "I added the proving check and ran it" is the deliverable; "it works" is not a thing EXECUTE is allowed to assert.

Scope-creep is a scope violation, not a bonus

If, mid-build, you discover a second worthwhile change, the correct move is to log it as a new unit and return it to ANALYZE for ranking — not to fold it into the current diff. The scope was set in lesson 2; ANALYZE chose one unit from it in lesson 4. EXECUTE honors that choice. Quietly widening scope mid-turn breaks the contract that lets the loop run unattended.
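One way to picture "log it as a new unit" is a backlog that EXECUTE may append to but never pulls from. The sketch below is illustrative only: `Unit`, `Turn`, `deferToAnalyze`, and the `auth-rate-limit` id are hypothetical names, not part of any tool this lesson describes.

```typescript
// Hypothetical shapes for illustration; the lesson does not prescribe these names.
interface Unit {
  id: string;
  summary: string;
}

interface Turn {
  current: Unit;   // the unit ANALYZE chose for this turn
  backlog: Unit[]; // ideas waiting for ANALYZE to rank
}

// Records a newly discovered change as its own unit.
// The current unit is passed through untouched, so the turn's scope does not widen.
function deferToAnalyze(turn: Turn, discovered: Unit): Turn {
  return { current: turn.current, backlog: [...turn.backlog, discovered] };
}

const turn: Turn = {
  current: { id: "fix-login-500", summary: "Guard missing user before reading user.hash" },
  backlog: [],
};

// Mid-build discovery: worthwhile, but not this unit.
const afterDiscovery = deferToAnalyze(turn, {
  id: "auth-rate-limit", // made-up id for the example
  summary: "Add rate limiting to login",
});
```

The design choice worth noticing is that nothing in the function can change `current`: the only thing a discovery can do is queue work for a later ranking.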

The Unit Contract — six moves, in order

EXECUTE is not freeform. It is a fixed sequence of six moves, and skipping one is how turns go bad. Read it as a checklist you run every single time, no matter how small the change feels.

1 read fully → 2 root cause + bounded plan → 3 simplest thing in scope → 4 add the proving check → 5 run the proof → 6 log it

Read left → right: the six contract moves. A failed proof loops back to re-plan the same bounded unit — it never widens the scope.
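The order of the moves, including the back-edge from a failed proof, can be written down as a small transition function. This is only a sketch under assumptions: `Move` and `nextMove` are illustrative names, and the "handoff" value stands for passing the change to VERIFY.

```typescript
// Hypothetical names: Move and nextMove are illustrative, not from a real tool.
type Move = 1 | 2 | 3 | 4 | 5 | 6;

// Returns the move that follows `current`, or "handoff" once the log is written.
// `proofPassed` only matters at move 5: a failed proof goes back to move 2
// for the same unit. It never skips ahead and never opens a wider scope.
function nextMove(current: Move, proofPassed = true): Move | "handoff" {
  if (current === 5 && !proofPassed) return 2;
  if (current === 6) return "handoff";
  return (current + 1) as Move;
}

// Walk a turn in which the first proof fails and the second passes.
const path: Array<Move | "handoff"> = [1];
let proofAttempts = 0;
let at: Move | "handoff" = 1;
while (at !== "handoff") {
  const passed = at === 5 ? ++proofAttempts > 1 : true;
  at = nextMove(at, passed);
  path.push(at);
}
```

In this walk the path revisits moves 2 through 5 once before reaching 6, which is the whole back-edge: same unit, second plan.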

1 · Read fully

Read the entire relevant surface before editing a character — the function and its callers, the test that covers it, the issue/scope, the trusted sources from LEARN. Most bad fixes are bad because the builder edited the first plausible line without reading the second one that explained why it was written that way.

2 · Root cause + bounded plan

Name the actual cause, not the symptom. A 500 error is a symptom; "the handler dereferences user before the null check" is a cause. Then write the smallest plan that addresses that cause and draw its edge: which lines, which file, which one behavior changes. The edge is the bound.

3 · The simplest thing that meets scope

Among the plans that satisfy the scope, take the one with the fewest moving parts. Not the cleverest, not the most general, not the one that "also sets us up for" a future feature. Simplicity here is what keeps the diff inside its edge and keeps the proof cheap.

4 · Add the check that proves it

Write the check before you believe the change. A regression test that fails on the old code and passes on the new; an assertion; a command whose exit status flips. The check is the contract's spine — it is what turns "I think it works" into something VERIFY can confirm at a real boundary.

5 · Run the proof

Actually run it, at the real boundary — not in your head, not as a mock, not as a claim. If it passes, hand off to VERIFY. If it fails, you loop back to move 2 and re-plan the same unit; you do not absorb a new fix to make the failure go away.

6 · Log it

One line in LOOP-LOG.md: what unit, what root cause, what proof, pass/fail. This is what makes the run observable to a human who never touches the build — the durable record that the contract was honored.
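As a rough illustration of that one line, the four fields can be carried in a small record and formatted on the way out. The `LogEntry` shape, the `key=value` line format, and both function names are assumptions made for this sketch; the lesson fixes the fields, not the syntax. The log is treated as a plain string so the example has no file-system dependency.

```typescript
// Hypothetical entry shape and line format, chosen only for this sketch.
interface LogEntry {
  unit: string;
  cause: string;
  proof: string;
  result: "PASS" | "FAIL";
}

// Formats the four fields the contract asks for on a single line.
function formatLogLine(e: LogEntry): string {
  return `unit=${e.unit} | cause=${e.cause} | proof=${e.proof} | result=${e.result}`;
}

// Returns the log with one more line appended; the input string is not modified.
function appendToLog(log: string, e: LogEntry): string {
  const line = formatLogLine(e);
  const needsNewline = log !== "" && !log.endsWith("\n");
  return log + (needsNewline ? "\n" : "") + line + "\n";
}

const entry: LogEntry = {
  unit: "fix-login-500",
  cause: "null deref on missing user",
  proof: 'npm test -- -t "unknown email"',
  result: "PASS",
};

const updatedLog = appendToLog("", entry);
```

Note what the record has no field for: a claim that the bug is fixed. It can only say what was run and what the run returned.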

In one picture: where EXECUTE sits in the turn

EXECUTE is the middle of the cycle — fed by ANALYZE's chosen unit, handing a proven change to VERIFY. It is the only step that writes to the artifact, which is exactly why it must be the most bounded.

EXECUTE is the centre of the turn: it consumes one chosen unit and produces one proven change. It is the only step that mutates the artifact.

The unit, as a checklist you can step through

Here is the Unit Contract for one real change, laid out as a stepped plan. The strip is the contract; each card zooms into one move with its concrete tasks, the exit bar that lets it advance, and the risks of skipping it.

The running example for this lesson: a login endpoint that throws a 500 when the email is unknown. ANALYZE already ranked it the most valuable unit. We are now executing it.

Unit: Fix 500 on unknown-email login
Scope edge: src/routes/auth.ts · one handler
Chosen by: ANALYZE (rank 1)


Move 2 · Done

Root cause + bounded plan

name the cause

Goal: name the actual cause and draw the edge of the fix. Symptom: 500 on unknown email. Cause: the handler calls user.hash before checking that user exists.

Tasks

Write the cause in one line: null deref on missing user
Draw the edge: one handler, one guard
Decide the right status: 401, generic message
Confirm nothing else in scope needs to change

Exit bar

Cause is stated, not just the symptom
The change touches one file, one behavior
The plan is the smallest that fixes the cause

Skip-it risk

High · Scope blooms here: "While I'm in auth I'll also add rate limiting." Mitigation: log that as a new unit; keep this edge at one guard.

Move 3 · In progress

The simplest thing that meets scope

now

Goal: make the smallest change that fixes the named cause — a single guard that returns a generic 401 when the user is missing, before any property is read. No refactor, no new abstraction.

Tasks

Add one guard: if (!user) return 401
Use a generic message (no account enumeration)
Leave the surrounding code untouched
Resist every "while I'm here" temptation

Exit bar

The diff is a handful of lines, one file
No unrelated lines moved or reformatted
The change reads as obviously correct

Skip-it risk

Med · Over-engineering the fix: a "reusable auth-guard helper" for a one-line check. Mitigation: the simplest thing that meets scope wins; generalize later if a second case appears.

Move 4 · Planned

Add the check that proves it

before believing it

Goal: write a regression test that fails on the old code and passes on the new. POST an unknown email; assert the status is 401, not 500. The check is the spine of the whole contract.

Tasks

Add a test: unknown email → expect 401
Confirm it fails against the pre-fix code
Keep the existing happy-path test green
Name the test so the intent is obvious

Exit bar

A check exists that distinguishes fixed from broken
It failed on old code (proof it tests the right thing)
It runs at a real boundary, not a mock of one

Skip-it risk

High · A change with no proof: "Looks fixed" is not a verdict. Mitigation: no unit is done without a check that would catch the bug coming back.

Moves 5–6 · Planned

Run the proof, then log it

hand off to VERIFY

Goal: run the check at the real boundary and write one line of record. EXECUTE does not declare victory — it produces a proof and a log entry, then hands the verdict to VERIFY.

Tasks

Run the suite; watch the new test go green
If it fails, loop to Move 2 — same unit, no widening
Append one line to LOOP-LOG.md
Hand the proven change to VERIFY / the Validator

Exit bar

The proof was actually run (exit code seen)
The log entry names unit, cause, proof, result
EXECUTE makes no "it works" claim of its own

Skip-it risk

High · Claiming instead of proving: saying "done" without running the check. Mitigation: the Proof Gate is run at the real boundary, never simulated from memory.

The exit bar is a measurable gate, not a feeling

Each move advances only when its exit bar clears — and the bars are written as things you can check, not vibes. "The diff is a handful of lines, one file" is checkable; "the code feels clean" is not. This is the same gate discipline VERIFY uses, applied inside a single EXECUTE so the unit stays honest under time pressure.
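To make "checkable, not vibes" concrete, Move 3's first two exit bars could be expressed as a predicate over diff statistics. What follows is a sketch, not a prescribed tool: `DiffStats`, `checkMove3ExitBar`, the 10-line reading of "a handful", and the size and extra paths of the helper diff are all assumptions.

```typescript
// Hypothetical summary of a diff; a real run would derive this from version control.
interface DiffStats {
  filesChanged: string[];
  linesAdded: number;
  linesRemoved: number;
  reformattedLines: number; // lines changed only in whitespace or layout
}

// Assumed threshold: the lesson says "a handful of lines" without giving a number.
const HANDFUL = 10;

// Returns the list of failed bars; an empty list means the move may advance.
function checkMove3ExitBar(diff: DiffStats): string[] {
  const failures: string[] = [];
  if (diff.filesChanged.length !== 1) {
    failures.push("diff must touch exactly one file");
  }
  if (diff.linesAdded + diff.linesRemoved > HANDFUL) {
    failures.push("diff is larger than a handful of lines");
  }
  if (diff.reformattedLines > 0) {
    failures.push("unrelated lines were moved or reformatted");
  }
  return failures;
}

// The guard fix, using the +3 / -2 stat from this lesson's self-review.
const guardDiff: DiffStats = {
  filesChanged: ["src/routes/auth.ts"],
  linesAdded: 3,
  linesRemoved: 2,
  reformattedLines: 0,
};

// A helper extraction; the extra paths and line counts are made up.
const helperDiff: DiffStats = {
  filesChanged: ["src/routes/auth.ts", "src/routes/other-a.ts", "src/routes/other-b.ts"],
  linesAdded: 40,
  linesRemoved: 12,
  reformattedLines: 0,
};
```

Returning the failed bars as strings, instead of a bare boolean, gives the builder something specific to act on and gives the log something specific to record.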

A failed proof re-plans the same unit

The strip looks linear, but Move 5 has a back-edge: a failing proof returns you to Move 2 with the same scope. The temptation when a test won't pass is to "just also change" something adjacent. That widens the edge and breaks attribution. The contract says re-plan within the bound, or split off a new unit — never silently grow this one.

The strip is a tiny state machine

Each segment carries done / active / todo; selecting one swaps the visible role="tabpanel". In a live run these states come from the tracker so the strip reflects reality, not the plan as written.
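A minimal sketch of that state derivation, assuming the tracker reports only how many steps have cleared their exit bar. `SegmentState` and `stripStates` are hypothetical names for illustration.

```typescript
type SegmentState = "done" | "active" | "todo";

// Hypothetical tracker input: `completed` is the number of steps whose exit bar cleared.
// Everything before that index is done, the step at that index is active, the rest are todo.
function stripStates(totalSteps: number, completed: number): SegmentState[] {
  return Array.from({ length: totalSteps }, (_, i): SegmentState => {
    if (i < completed) return "done";
    if (i === completed) return "active";
    return "todo";
  });
}

// Five cards (moves 5 and 6 share one), two of them complete.
const states = stripStates(5, 2);
```

Because the states are computed from what the tracker reports, the strip cannot show a move as done merely because the plan said it would be.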

Explore the variants — then pick the simplest that meets scope

Move 3 says "the simplest thing that meets scope." But there is usually more than one way to fix the same cause. Before you commit, it is worth holding the candidates side by side and feeling their trade-offs against the bound.

All three fix the 500. They differ in how much they touch, how much risk they add, and how well they fit the one-unit scope. The contract chooses the one that fixes the cause with the fewest moving parts.

Fix A: one guard line inside the existing handler returns 401 when the user is missing; everything else is left exactly as it was.

One guard clause

Add a single line that returns a generic 401 when the user is missing, before any property is read. Fixes the named cause and nothing else.

+ Smallest possible diff: a few lines, one file.

+ Trivially provable: one test flips 500 → 401.

+ Stays exactly inside the unit's edge.

− Doesn't pre-build a helper for a future case (by design).

Scope touched: tiny

Added risk: low

Fit to scope: exact

Reusable auth helper

Extract a shared requireUser() utility and route this handler — plus two others — through it. Tidy in the abstract, but wider than the unit.

+ One place to change the guard rule later.

− Touches 3 files, so the edge is no longer one handler.

− The other 2 call-sites now need their own proofs.

− A failure can't be attributed to one change.

Scope touched: medium

Added risk: medium

Fit to scope: loose

Simplicity is a property of the contract, not taste

Fix B and Fix C might be "better engineering" in a vacuum. But the Unit Contract's move 3 is "simplest thing that meets scope" — and scope is one handler, one behavior. Fix A is the only candidate whose edge equals the unit's edge, so it is the only one a single Proof Gate can fully cover and a single revert can cleanly undo. The helper and the rewrite are real ideas — they just belong in their own units, logged and ranked by ANALYZE.

Exploring is not the same as widening

Holding three candidates side by side is good practice — it is how you confirm the simplest one actually fixes the cause. The discipline is that exploration ends in a choice, and the choice respects the bound. You compare in order to narrow, never to justify doing all three.
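The narrowing can be sketched as a filter followed by a minimum. Everything here is an assumption made for illustration: the `Candidate` and `ScopeEdge` shapes, the two extra file paths, the behavior counts for the helper and the rewrite, and the use of "files plus behaviors" as a crude proxy for moving parts.

```typescript
// Hypothetical description of a candidate fix; field names are illustrative.
interface Candidate {
  name: string;
  files: string[];          // files the fix would touch
  behaviorsChanged: number; // how many behaviors it alters
  fixesCause: boolean;      // does it address the named root cause?
}

interface ScopeEdge {
  files: string[];
  maxBehaviors: number;
}

// A candidate fits when every touched file is inside the edge and it changes no extra behavior.
function fitsScope(c: Candidate, edge: ScopeEdge): boolean {
  return c.behaviorsChanged <= edge.maxBehaviors && c.files.every((f) => edge.files.includes(f));
}

// Keeps only candidates that fix the cause inside the edge, then takes the lightest one.
function pickSimplest(candidates: Candidate[], edge: ScopeEdge): Candidate | undefined {
  const eligible = candidates.filter((c) => c.fixesCause && fitsScope(c, edge));
  const weight = (c: Candidate) => c.files.length + c.behaviorsChanged; // crude proxy
  return eligible.sort((a, b) => weight(a) - weight(b))[0];
}

const edge: ScopeEdge = { files: ["src/routes/auth.ts"], maxBehaviors: 1 };

const candidates: Candidate[] = [
  { name: "one guard clause", files: ["src/routes/auth.ts"], behaviorsChanged: 1, fixesCause: true },
  {
    name: "reusable auth helper",
    // The lesson says 3 files; these extra paths are invented.
    files: ["src/routes/auth.ts", "src/routes/other-a.ts", "src/routes/other-b.ts"],
    behaviorsChanged: 3,
    fixesCause: true,
  },
  { name: "rewrite the handler", files: ["src/routes/auth.ts"], behaviorsChanged: 3, fixesCause: true },
];

const chosen = pickSimplest(candidates, edge);
```

With these inputs only the guard clause survives the filter. The helper leaves the file edge and the rewrite changes more than one behavior, so both return to ANALYZE as their own units.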

The constraints the unit must honor

"Simplest thing that meets scope" has a quiet second half: it must also stay inside the project's constraints. Scope says what to change; constraints say how any change must behave — the security, style, and safety rules that hold across the whole codebase. A unit that fixes the bug but violates a constraint is not done.

Below are the constraints this lesson's project carries, shown the way a design system shows its tokens: a named set you can scan, a table that says exactly where each applies, and do / don't pairs for the one we are about to touch.

The project's constraints (named, like tokens)

SECURITY

no-enumeration: same reply for wrong-password and no-such-user

CONTRACT

status-codes: 401 unauthorized · 429 throttled · never 500 for auth

STYLE

no-console: structured logger only, never console.log

SCOPE

one-behavior: a unit changes one behavior in one file

SAFETY

proof-required: every change ships with a check that proves it

INPUT

validate-first: never read a field before it is checked

Where each constraint applies to THIS unit

Constraint	Rule	What it forces in this fix
no-enumeration	Generic auth replies	The 401 message must not reveal whether the email exists.
status-codes	Auth never returns `500`	The whole point: a missing user is a `401`, not a crash.
validate-first	Check before you read	Guard `user` before touching `user.hash`.
one-behavior	One behavior, one file	Only the unknown-user path changes; everything else is frozen.
proof-required	Ship a check	A test that fails on 500 and passes on 401 must accompany the diff.

Do & don’t — honoring `no-enumeration`

Do — honor the constraint

A generic reply for both failure modes. An attacker can't tell "no such account" from "wrong password", so they can't enumerate valid emails — and the status is the contract's 401, never a 500.
if (!user || !ok) return res.status(401).json({ error: 'bad_credentials' });

Don’t — fix the crash, break the rule

Returning a distinct 404 no_such_user stops the 500 — but it now leaks which emails are registered, violating no-enumeration. The bug is gone; the unit is still not done.
if (!user) return res.status(404).json({ error: 'no_such_user' });
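The "Do" variant can be exercised without a web framework by pulling the decision into a pure function. This is a sketch under assumptions: `loginReply`, `Reply`, and `toyVerify` are hypothetical names, the verifier is synchronous where the lesson's `verify` is awaited, and the toy hash scheme exists only so the example runs.

```typescript
// Minimal stand-ins so the sketch runs on its own.
interface User {
  id: string;
  hash: string;
}

interface Reply {
  status: number;
  error?: string;
}

// `verify` is passed in; it stands in for the lesson's verify(password, user.hash).
function loginReply(
  user: User | undefined,
  password: string,
  verify: (password: string, hash: string) => boolean,
): Reply {
  // validate-first: user.hash is only read once user is known to exist.
  const ok = user !== undefined && verify(password, user.hash);
  // no-enumeration: both failure modes produce the identical reply.
  if (!user || !ok) return { status: 401, error: "bad_credentials" };
  return { status: 200 };
}

// Toy verifier for the example only; real code compares against a password hash.
const toyVerify = (password: string, hash: string): boolean => hash === `hashed:${password}`;

const knownUser: User = { id: "u1", hash: "hashed:secret" };

const noSuchUser = loginReply(undefined, "x", toyVerify);
const wrongPassword = loginReply(knownUser, "x", toyVerify);
const success = loginReply(knownUser, "secret", toyVerify);
```

Comparing `noSuchUser` and `wrongPassword` field by field is the enumeration check in miniature: if any field differs, an attacker has a signal.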

Scope bounds the change; constraints bound the behavior

Scope is the edge of this unit — which lines you may touch. Constraints are global invariants every unit must respect no matter what it touches. They are independent fences: you can satisfy scope (one tiny diff) and still fail a constraint (a 404 that enumerates accounts), or honor every constraint while blowing scope (a constraint-clean full rewrite). The contract requires both: inside the edge and inside the rules.
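Because the two fences are independent, they can be checked by two separate predicates that must both hold. The sketch below assumes a hypothetical `UnitResult` record and covers only two of the constraints (`no-enumeration` and `status-codes`); the function names are illustrative.

```typescript
// Hypothetical record of what a finished unit did; field names are illustrative.
interface AuthReply {
  status: number;
  error: string;
}

interface UnitResult {
  filesTouched: string[];
  unknownUserReply: AuthReply;
  wrongPasswordReply: AuthReply;
}

// Fence 1: every touched file lies inside the unit's edge.
function insideScope(r: UnitResult, edge: string[]): boolean {
  return r.filesTouched.every((f) => edge.includes(f));
}

// Fence 2: the observable behavior respects the global rules.
function insideConstraints(r: UnitResult): boolean {
  const sameReply =
    r.unknownUserReply.status === r.wrongPasswordReply.status &&
    r.unknownUserReply.error === r.wrongPasswordReply.error; // no-enumeration
  const is401 = r.unknownUserReply.status === 401;            // status-codes
  return sameReply && is401;
}

function unitIsDone(r: UnitResult, edge: string[]): boolean {
  return insideScope(r, edge) && insideConstraints(r);
}

const edgeFiles = ["src/routes/auth.ts"];
const genericReply: AuthReply = { status: 401, error: "bad_credentials" };

// One tiny diff, generic 401: inside both fences.
const guardFix: UnitResult = {
  filesTouched: ["src/routes/auth.ts"],
  unknownUserReply: genericReply,
  wrongPasswordReply: genericReply,
};

// One tiny diff, distinct 404: inside scope, outside the rules.
const leaky404: UnitResult = {
  filesTouched: ["src/routes/auth.ts"],
  unknownUserReply: { status: 404, error: "no_such_user" },
  wrongPasswordReply: genericReply,
};
```

Keeping the predicates separate mirrors the argument: `leaky404` passes `insideScope` and fails `insideConstraints`, so a single combined score would hide which fence was crossed.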

Where the constraints live

In a real loop these come from the project's durable record — the GOAL.md constraints block, a CONTEXT/ADR doc, the linter config. EXECUTE reads them as part of move 1 ("read fully") so the simplest fix is chosen from the set that already satisfies them, not retrofitted after a reviewer catches a violation.

Bounded vs scope-creep, side by side

The single most important habit in EXECUTE is keeping the edge still. Here is the same starting point taken two ways — one stays inside the bound, the other quietly grows until nothing is provable.

Left: the edge holds, so one proof and one revert apply. Right: five "while I'm here" changes dissolve the edge — attribution and reversibility are gone.

A useful test mid-build: "could a reviewer undo my change in one step?" If yes, the unit is bounded. If undoing it means picking apart five intertwined edits, scope already crept — split it.

Patch vs clean bounded fix — two flavors of "small"

Not every small change is a good bounded unit. There is a difference between a quick patch that hides the symptom and a clean bounded fix that resolves the cause — both can be tiny. The matrix lays the same unit out three ways so you can see which "small" actually satisfies the contract.

Read it as a grid: each row is a property the contract cares about, each column is an approach. Then the cards say when each is the right call.

The same 500 bug, three ways — judged by the Unit Contract
property \ approach	symptom patch	clean bounded fix	big rewrite
addresses cause?	no — hides the 500	yes — guards the deref	yes (and much more)
diff size	tiny	small	large
stays in scope edge?	yes	yes	no
provable by one check?	only the symptom	yes — 500 → 401	no — too broad
contract verdict	fails (no root cause)	passes	fails (scope-creep)

symptom patch

- crash on user.hash
+ try { ... } catch { return 500 }
  // swallows the error, cause intact

Hides the symptom by wrapping the crash. The 500 may stop showing, but the null deref is still there waiting.

Contract verdict: fails — no root cause.

clean bounded fix

  // guard before the deref
+ if (!user) return res.status(401)
  // generic message · one file

Resolves the cause with the smallest change in scope, honoring every constraint, with a check that proves it.

Contract verdict: passes — this is the unit.

big rewrite

- whole handler
+ new flow, sessions, logging…
  // correct, but unbounded

Fixes everything and then some — but the edge is gone and one proof can't cover it. The good parts belong in their own units.

Contract verdict: fails — scope-creep.

A patch can be tiny and still wrong

The symptom patch is the smallest diff of the three — and it fails the contract hardest, because it does not address the named cause (move 2). "Small" is not the goal; "the simplest thing that fixes the cause within scope" is. A try/catch that turns a 500 into a different 500 is motion without progress: the next unknown-email request still hits the same broken path.
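The difference shows up when both versions are run against the same unknown email. This is a simplified sketch, not the lesson's handler: `findByEmail` is a stand-in lookup, the known address is invented, and password checking is left out so that only the dereference is on display.

```typescript
interface User {
  hash: string;
}

interface Reply {
  status: number;
}

// Stand-in lookup: returns undefined for unknown emails, like a missing row.
function findByEmail(email: string): User | undefined {
  return email === "known@example.com" ? { hash: "h" } : undefined;
}

// Symptom patch: the null dereference still happens; the catch only swallows it.
function patchedLogin(email: string): Reply {
  try {
    const user = findByEmail(email) as User; // cast mirrors the unchecked original
    return { status: user.hash.length > 0 ? 200 : 401 };
  } catch {
    return { status: 500 };
  }
}

// Clean bounded fix: guard before the dereference.
function guardedLogin(email: string): Reply {
  const user = findByEmail(email);
  if (!user) return { status: 401 };
  return { status: user.hash.length > 0 ? 200 : 401 };
}

const patched = patchedLogin("nobody@example.com");
const guarded = guardedLogin("nobody@example.com");
```

The patched version still answers 500 for the unknown email because the broken path still executes; the guarded version answers 401 because that path is never reached.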

The clean fix is the one with a real proof

Only the middle column has a check that distinguishes fixed from broken at a real boundary (POST unknown email → expect 401). The patch can only "prove" that an error was swallowed; the rewrite is too broad for any single check to cover. Provability-by-one-check is a sharp test for whether a change is genuinely one bounded unit.

The move people skip: add the proving check

Of the six moves, the one most often dropped is move 4 — adding the check that proves the change. It feels like overhead when the fix "obviously works." But a fix with no proof is just a claim, and the loop runs on proofs, not claims. The trick that makes the check trustworthy: it must fail on the old code first.

A trustworthy check fails on the broken code and passes on the fixed code. If it passes on both, it isn't testing the bug — rewrite it.
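That rule is mechanical enough to express as a function: run the check against both versions and require a fail followed by a pass. The handlers here are stubs standing in for the pre-fix and post-fix code, and `Handler`, `Check`, and `isTrustworthy` are hypothetical names for this sketch.

```typescript
type Handler = (email: string) => { status: number };
type Check = (handler: Handler) => boolean; // true means the check passes

// A check is trustworthy only if it fails on the old code and passes on the new.
function isTrustworthy(check: Check, oldHandler: Handler, newHandler: Handler): boolean {
  return !check(oldHandler) && check(newHandler);
}

// Stubs for how each version answers an unknown email.
const oldHandler: Handler = () => ({ status: 500 });
const newHandler: Handler = () => ({ status: 401 });

// Exercises the bug: the status must be exactly 401.
const expects401: Check = (h) => h("nobody@example.com").status === 401;

// Decoration: passes as long as any numeric status comes back.
const expectsAnyStatus: Check = (h) => typeof h("nobody@example.com").status === "number";

const realProof = isTrustworthy(expects401, oldHandler, newHandler);
const decoration = isTrustworthy(expectsAnyStatus, oldHandler, newHandler);
```

Here `realProof` is true and `decoration` is false: the second check passes on the broken code too, so it could never have caught the bug.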

src/routes/auth.ts — the proving check (a regression test)

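// Assumed setup: supertest for HTTP assertions and an `app` export.
// The import path is hypothetical; adjust it to your project.
import request from 'supertest';
import { app } from '../app';

// fails on the old handler (500), passes after the guard (401)
test('unknown email returns 401, not 500', async () => {
  const res = await request(app)
    .post('/auth/login')
    .send({ email: 'nobody@example.com', password: 'x' });
  expect(res.status).toBe(401);          // not 500
  expect(res.body.error).toBe('bad_credentials'); // generic, no enumeration
});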

Run it at the real boundary

The check hits the actual route through the app, not a stubbed function — that is what "real boundary" means. Run only this test while iterating:

# run just the new regression test
npm test -- -t "unknown email returns 401"

# expected: red on the pre-fix commit, green after the guard

This is EXECUTE's deliverable, not VERIFY's verdict

Adding and running this check is part of EXECUTE. The independent judgment — "yes, this genuinely meets the scope's done-when" — is the next step, VERIFY, and in a crew it is done by a Validator who did not write the fix. EXECUTE's job is to hand over a change that is cheap and honest to verify: here, a single command whose exit code tells the truth.

Self-review before handing it off

The last thing EXECUTE does before VERIFY takes over is read its own diff with a reviewer's eye. Below is the actual change for our unit, with added lines marked + and removed lines marked −, alongside the risk badges a careful builder would attach.

This is a self-review: catching the obvious problems before an independent Validator (or a human reading the log) ever sees them.

OPEN unit · fix-login-500 · one bounded change → main

Guard missing user before reading user.hash

Turns a 500 on unknown-email login into a generic 401 on POST /auth/login. 1 file changed · +3 −2

Crashes: 500 path removed · Latency: ~ flat (one check) · Enumeration: generic 401, no leak · New behavior: needs the 401 test

src/routes/auth.ts · +3 −2

@@ -11,7 +11,8 @@ router.post('/auth/login', async (req, res) => {
   const { email, password } = req.body;
   const user = await db.users.findByEmail(email);
+  if (!user) return res.status(401).json({ error: 'bad_credentials' });
-  const ok = await verify(password, user.hash);
+  const ok = await verify(password, user.hash);
   if (!ok) return res.status(401).json({ error: 'bad_credentials' });
   return res.json({ token: sign(user.id) });


Self-review catches the cheap mistakes early

Reading your own diff with risk badges and line notes is part of move 3–4 hygiene: did the change stay in scope, honor the constraints, and come with a proof? It is the cheapest place to catch an enumeration leak or a stray reformatted line. But it is not the verdict.

The builder never signs off on their own work

In the loop, the judgment that the unit truly meets done-when is VERIFY's — and in a crew it belongs to an independent Validator who did not write the code. Self-review makes the handoff clean; it does not replace the independent gate. That separation is exactly what keeps an AFK run honest: the one who built it is never the one who certifies it.

Worked example: one bounded fix, plan to proven

Putting the whole contract together on our running unit, move by move — exactly what an Executor produces in one EXECUTE step.

Moves 1–3 · build

Read → cause → simplest fix

Read fully: the handler reads user.hash on the line after findByEmail, with no null check. Cause: null deref when the email is unknown. Simplest fix in scope: one guard returning a generic 401 before the deref. No helper, no rewrite.

the bounded diff

  const user = await db.users.findByEmail(email);
+ if (!user) return res.status(401)
+     .json({ error: 'bad_credentials' });
  const ok = await verify(password, user.hash);

Moves 4–6 · prove + log

Check → run → record

Add the check: POST unknown email, expect 401 (fails on old code, passes on new). Run the proof: the suite goes green at the real boundary. Log it — one line, then hand to VERIFY.

LOOP-LOG.md — one entry

## turn 7 — EXECUTE
unit:   fix-login-500
cause:  null deref on missing user
change: +1 guard, src/routes/auth.ts
proof:  npm test -- -t "unknown email" → PASS
scope:  1 file · 1 behavior · in-bounds
next:   → VERIFY (Validator, not builder)

Notice what the entry does not say: it never claims "the bug is fixed." It records the change and the proof that was run, and explicitly hands the verdict to VERIFY. That restraint is the contract working as designed.

Quick check — would the contract pass it?

Five questions on EXECUTE, each followed by the contract's answer and the reason for it. Retrieval beats re-reading, so try each one before you read the answer.

Q1. What does "one bounded unit" mean in EXECUTE?

B. Bounded means the change has a visible edge — one behavior, one file's worth of intent — so a single proof can cover it and a single revert can undo it. Size of your day and "what's in the file" are not the bound.

Q2. Mid-build you spot a second worthwhile fix. What does the contract say?

C. A new fix is a new unit. Log it and let ANALYZE rank it; do not widen the current edge or abandon the chosen unit. Quietly absorbing it breaks attribution and the contract that lets the loop run unattended.

Q3. Why must the proving check fail on the old code first?

A. A check that passes on both old and new code proves nothing — it isn't exercising the bug. Seeing it fail on the broken code, then pass on the fix, is what makes it a real proof rather than decoration.

Q4. The fix stops the 500 by returning a distinct 404 "no_such_user". Verdict?

B. Scope and constraints are two fences. The diff fits scope, but a distinct 404 tells an attacker which emails exist — violating no-enumeration. Inside the edge and inside the rules, or it isn't done.

Q5. EXECUTE ran the proof and it passed. What does EXECUTE get to claim?

C. EXECUTE produces a change and a run proof, then logs it. The judgment that it truly meets done-when belongs to VERIFY — and in a crew, to an independent Validator who did not build it. EXECUTE never signs off on its own work.


Your agent is your teacher. Want to run a real EXECUTE pass on your own repo — pick one unit, bound it, add the proving check — or unsure whether a change of yours is genuinely "one unit"? Ask. Next, once the change is built and the proof is written, comes the step that judges it: VERIFY and the gates: prove it, not claim it.
