Step 4 · The Loop · Loop Engineering

Module 2 · The Loop · Step 2 of the cycle

ANALYZE: classify the gap, rate, pick ONE unit

LEARN handed you a grounded picture of reality. ANALYZE turns that pile of facts into a single decision: sort every gap into the right bucket, rate the handful that are actually yours to do now, and pick the one most valuable bounded unit to execute next. Not three. One.

Read the plain version, or open the technical layer on any section.

The big idea: one ranked move, not a to-do list

Every turn of the loop has five steps — LEARN → ANALYZE → EXECUTE one bounded unit → VERIFY at the real boundary → DECIDE. This lesson is the second step. ANALYZE is the moment between seeing reality and changing it: you take the gaps LEARN surfaced and decide, on purpose, the single next thing to do.

It has three small moves, in order. First you classify each gap into one of four buckets — is it on-scope work, a do-now blocker, something that needs the user, or plainly out of scope? Second, for the gaps that survive (the on-scope and do-now ones), you rate each candidate unit on five quick axes: Fit, Risk, Proof, Blocker, Next. Third, you pick exactly one — the most valuable bounded unit — and hand it to EXECUTE.

The discipline that makes this hard is the word one. When you can see ten things wrong, every instinct says "I'll just fix them all while I'm here". The loop forbids that. A pass executes a single bounded unit so that VERIFY can prove that one change at the real boundary, and so the human watching can tell exactly what moved. Batch five fixes together and a red test tells you nothing about which fix broke it.

Continuing the running example from the last lesson: in the RHG service, LEARN already confirmed the gap — /search returns every row on an empty query, the handler is in api.py:42, and there's a skipped test test_empty_query_returns_400 waiting. ANALYZE's job now is not to find more gaps. It's to decide which single unit to ship first.

Think of it like… a triage nurse in a full emergency room. Ten people are waiting and all of them want help. The nurse doesn't treat anyone yet — she sorts. Who is critical and must be seen now, who can wait, who is in the wrong department entirely, who needs a doctor's call she can't make herself. Only after the sort does one patient go through the door. Where the analogy breaks: the nurse runs many rooms at once; an ANALYZE pass commits to a single patient, because the loop wants one provable change per turn.

Why "one bounded unit" is structural, not stylistic

A bounded unit is the smallest change that is independently verifiable at the real boundary. The bound is what lets VERIFY (lesson 6) make a clean claim: this test went from skipped to green, this endpoint now returns 400, nothing else moved. If a unit bundles two behaviours, a failure is ambiguous — you can't tell which half regressed without unpicking it. Picking one is therefore not productivity advice; it's what keeps the proof gate meaningful and the loop debuggable.

ANALYZE consumes LEARN's output and nothing else

ANALYZE is allowed to reason only over grounded facts — the picture LEARN established. It does not go look again (that was LEARN) and it does not edit (that's EXECUTE). If a rating needs a fact you don't have, that's a signal to drop back to LEARN, not to guess. Keeping the steps separate is what stops "analysis" from quietly becoming "I started fixing it".

The output is a decision plus its reasons

A finished ANALYZE pass produces one chosen unit and a short, legible rationale: which bucket each gap landed in, how the top candidates rated, and why this one won. That rationale is observability — it lands in LOOP-LOG.md so the human can read the decision without re-deriving it (lesson 7).

The four buckets

Classifying means dropping each gap into exactly one of four boxes. The boxes are not about how big or hard the work is — they're about whether it's yours to do, and whether it's now. Here they are.

Four boxes, one rule: every gap goes in exactly one. The colour is the action — only the first two columns feed the rating step.

on-scope → do in turn
do-now → jump the queue
needs-user → hand off
out-of-scope → park it

The test for each bucket

On-scope: does the done-when contract (lesson 2) require it? If yes, and it isn't blocking anything urgent, it's normal queued work.
Do-now: would the current unit be impossible or untrustworthy without it? A broken test runner, a missing dependency, a red baseline you'd otherwise mistake for your own breakage: these jump ahead because they poison VERIFY.
Needs-user: is there a genuine fork only the human can settle, such as a product choice, an irreversible action, or an ambiguity scope didn't pin down? Then it's a handoff, decision-ready (lesson 9), not a guess.
Out-of-scope: is it real but outside this goal? Park it where it won't be lost, and move on.
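The four tests can be read as a small decision function. The sketch below is a minimal illustration, not the lesson's implementation: the `Gap` fields are hypothetical yes/no facts that LEARN would have grounded, and the order in which the tests are checked (blocker first, then human fork, then contract) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Bucket(Enum):
    ON_SCOPE = "on-scope"
    DO_NOW = "do-now"
    NEEDS_USER = "needs-user"
    OUT_OF_SCOPE = "out-of-scope"


@dataclass(frozen=True)
class Gap:
    # Hypothetical fields: each one is a grounded yes/no answer from LEARN.
    name: str
    required_by_done_when: bool = False
    poisons_verify: bool = False      # e.g. broken test runner, red baseline
    needs_human_call: bool = False    # e.g. product choice, irreversible action


def classify(gap: Gap) -> Bucket:
    """Return exactly one bucket per gap (the check order is an assumption)."""
    if gap.poisons_verify:
        return Bucket.DO_NOW
    if gap.needs_human_call:
        return Bucket.NEEDS_USER
    if gap.required_by_done_when:
        return Bucket.ON_SCOPE
    return Bucket.OUT_OF_SCOPE


gaps = [
    Gap("empty-q guard", required_by_done_when=True),
    Gap("400 vs 422 status code", needs_human_call=True),
    Gap("rewrite search index"),
]
for gap in gaps:
    print(f"{gap.name}: {classify(gap).value}")
```

Because `classify` returns a single enum member, a gap cannot land in two buckets, and anything that matches no test falls through to out-of-scope, where it still has to be recorded.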

Out-of-scope is recorded, never silently dropped

"Out of scope" doesn't mean "ignore". A parked item is written down (a backlog note, a follow-up ticket) so the observability trail stays honest and nothing real evaporates. The loop is allowed to not do something; it is not allowed to pretend it didn't see it.

Needs-user is the only bucket that can block the human

Everything else runs AFK. The whole suite's contract is that the human only observes — they read the log, they don't execute. The single exception is a needs-user fork: the loop pauses and hands a decision-ready question over (never a half-built guess). That handoff is the one place a human is in the path, and it's deliberately rare.

Triage the gaps into buckets

Now do the classifying yourself. Below is a triage board whose four columns are the four buckets. The cards are the exact gaps LEARN confirmed on the RHG task, plus a couple of tempting distractions. Drag each card — or press its arrow — into the bucket where it belongs. There's no "in progress" here; classifying is a sort, and every gap lands in precisely one bucket.

The counts at the top stay honest: how many gaps are actionable now (on-scope + do-now), and how many you've set aside (needs-user + out-of-scope). Only the actionable ones move on to rating.

Think of it like… sorting the morning mail into four trays before you open anything. Bills to pay, something urgent that can't wait, a letter you must ask your partner about, junk for the recycling. You don't act on any of it while sorting — you just make sure each piece is in the right tray. Acting comes after, one piece at a time.

On-scope: part of done-when → do in turn
Do-now: blocks the unit → jump queue
Needs-user: human call → hand off
Out-of-scope: real but not this goal → park

The board is a one-of-four classifier

Each gap is one object with a col field constrained to the four buckets. The columns on screen are not the source of truth — the array is — so a card physically cannot sit in two buckets and the counts can never drift. This mirrors the discipline exactly: classification is a function from a gap to a single category, not a vibe. The "actionable now" count sums the first two columns, because those are the only gaps that earn a rating in the next section.
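A minimal sketch of that data model, assuming plain dictionaries for cards. Only the `col` field and the four bucket names come from the lesson; `make_card` and `counts` are hypothetical helper names.

```python
from collections import Counter

BUCKETS = ("on-scope", "do-now", "needs-user", "out-of-scope")
ACTIONABLE = {"on-scope", "do-now"}


def make_card(title: str, col: str) -> dict:
    """Create a card; reject any column that is not one of the four buckets."""
    if col not in BUCKETS:
        raise ValueError(f"unknown bucket: {col!r}")
    return {"title": title, "col": col}


def counts(cards: list) -> dict:
    """Derive every count from the array, so the display cannot drift."""
    per_col = Counter(card["col"] for card in cards)
    actionable = sum(per_col[c] for c in ACTIONABLE)
    return {"actionable": actionable, "set_aside": len(cards) - actionable}


cards = [
    make_card("empty-q guard", "on-scope"),
    make_card("400 vs 422 status code", "needs-user"),
    make_card("rewrite search index", "out-of-scope"),
]
print(counts(cards))  # {'actionable': 1, 'set_aside': 2}
```

Since each card holds one `col` value and the counts are recomputed from the list, there is no second copy of the state that could disagree with it.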

Where this picks up from lesson 3: LEARN's board ended with observations confirmed; this board takes those same items and assigns each a bucket. Confirm first (LEARN), classify second (ANALYZE) — the two boards are the two steps, in order.

In one picture: classify → rate → pick

The whole step is a funnel. Many gaps go in; they get sorted into buckets; only the actionable ones get rated; and exactly one comes out the bottom as the chosen unit. Everything else is either parked, handed off, or queued for a later pass.

Read left → right. The funnel narrows on purpose: many in, one out. The dashed branches are real outcomes too — just not "do it now".

The one rule

ANALYZE always ends with a single chosen unit. If a pass ends with "and also these four", it didn't finish analyzing — it just made a to-do list, and to-do lists don't survive contact with the proof gate.

The rate rubric: five axes

Once a gap is in an actionable bucket, you rate it on five quick axes. They're chosen so a single glance tells you whether a unit is a good next move — not just whether it's worth doing eventually, but whether it's worth doing now, and whether you'll be able to prove it when you're done.

Five quick reads → one judgement. High Fit and Proof pull a unit up; high Risk and a live Blocker push it down.

The five axes

Fit: how squarely the unit serves the done-when. A unit that directly satisfies a contract line scores high; a tangential nice-to-have scores low even if it's tempting.
Risk: the chance it breaks something else or balloons past its bound (high risk is bad, so it counts against).
Proof: how cleanly VERIFY can confirm it at the real boundary. A unit with a waiting test scores high; one whose only proof is "looks right" scores low.
Blocker: is there anything that stops it starting right now (a missing fact, a needs-user fork, an unrun test suite)? A live blocker can veto an otherwise great unit.
Next: how much finishing it unblocks afterwards; a small unit that clears the path for three more punches above its weight.

Why Risk is the axis people misread

Beginners rate units by excitement ("this would be cool"). The rubric forces the unglamorous questions instead: can I prove it, will it break the build, is it even unblocked. A high-Fit unit with a live Blocker is not the next move — the blocker is. That re-ordering is most of the value of rating at all.

The rubric is a heuristic, not a formula

You're not adding the axes into a precise number; you're using them to make the trade-off legible. A unit with stellar Fit but no Proof should make you nervous, and the rubric is what surfaces that before you've sunk a pass into it. The scorecard in the next section makes the weighing tangible.

Rate the candidate units

Triage left three candidates on the RHG task, only one of which sits in an actionable bucket. Here they are as cards, each rated on the rubric. Think of them as variants of one decision (the same component, the next move, in different flavours) laid side by side so you can compare on purpose. Use the buttons to sort by an axis; the card that wins on that axis lights up. The one that wins overall carries a green badge.

Card A · on-scope · best next move
Add the empty-query guard
Return 400 when q is missing or empty; un-skip the waiting test.
Fit 92 · Risk 18 · Proof 95
Blocker: none. Fact and test both ready.
Next: closes the core done-when line directly.
Verdict: pick now

Card B · out-of-scope-ish
Rewrite the search index layer
Make index.lookup reject short queries at the data layer too.
Fit 40 · Risk 78 · Proof 35
Blocker: touches many callers; no waiting test.
Next: unblocks little the guard doesn't already.
Verdict: park it

Card C · needs-user
Pick the status code: 400 vs 422
A product call on which error code the API should return.
Fit 55 · Risk 22 · Proof 50
Blocker: human decision; can't be guessed.
Next: settling it firms the guard's exact spec.
Verdict: hand off

Two independent axes, like variant × size

Each card carries its raw ratings in data attributes (data-fit, data-risk, data-proof). Sorting re-orders the cards by the chosen axis and lights the leader — exactly the way a component gallery lets you compare one variant across a size axis. The "overall" sort uses a simple legible score: Fit and Proof add, Risk subtracts, and a live Blocker caps the result. Card A wins not because it's the most ambitious but because it's the most shippable next: high Fit, high Proof, zero Blocker.
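The sort-and-highlight behaviour can be sketched in a few lines. This is an illustration under assumptions: the ratings are the ones recorded in the lesson's LOOP-LOG.md example, the `blocked` flags follow the blocker notes there, and "caps the result" is modelled here as capping a blocked card's overall score at 0.

```python
cards = [
    {"id": "A", "title": "add guard", "fit": 92, "risk": 18, "proof": 95, "blocked": False},
    {"id": "B", "title": "index rewrite", "fit": 40, "risk": 78, "proof": 35, "blocked": True},
    {"id": "C", "title": "status-code call", "fit": 55, "risk": 22, "proof": 50, "blocked": True},
]


def overall(card: dict) -> int:
    """Fit and Proof add, Risk subtracts; a live blocker caps the score at 0."""
    raw = card["fit"] + card["proof"] - card["risk"]
    return 0 if card["blocked"] else max(raw, 0)


def leader(cards: list, axis: str) -> dict:
    """Return the card that wins on one axis; for Risk, lower is better."""
    if axis == "overall":
        return max(cards, key=overall)
    if axis == "risk":
        return min(cards, key=lambda c: c["risk"])
    return max(cards, key=lambda c: c[axis])


for axis in ("fit", "risk", "proof", "overall"):
    print(axis, "->", leader(cards, axis)["id"])
```

With these ratings card A leads on every axis, which is why it carries the badge regardless of which sort is active.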

The losing cards aren't wrong — they're elsewhere

Card B is the classic scope-creep trap: a real improvement that balloons risk and proves nothing the guard doesn't. Card C is genuinely valuable but lives in needs-user — it's a handoff, not this pass's unit. Rating doesn't just find the winner; it tells you why each loser is parked or handed off, which is the rationale that lands in the log.

Score one unit yourself

The cards showed fixed ratings. Now you turn the dials. Drag the three sliders to rate a unit on Fit, Risk, and Proof; the gauge and verdict update live. Try the presets to feel how a real candidate scores — and watch what a live Blocker does to even a high-Fit unit.

Think of it like… a sound desk. Push Fit and Proof up and the level rises into the green; push Risk up and it drops back toward red. A blocker is the mute switch — flip it and it doesn't matter how good the mix is, nothing comes out until you clear it.

Fit (serves done-when): 90/100
Risk (break / balloon): 20/100
Proof (clean to verify): 90/100
Blocker (stops it starting): clear
Gauge zones: defer · maybe · pick now

A legible weighting, not a black box

The score is deliberately simple so you can predict it: value = Fit + Proof − Risk, clamped to 0–200. At 120 or above it reads "pick now" (green), from 70 to 119 "maybe", and below 70 "defer". The Blocker switch is special: flip it on and the verdict is forced to defer regardless of the rest, because a live blocker means the unit literally cannot start. That hard veto encodes the rubric's sharpest rule: a blocker beats a high score every time.

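// Inputs are 0–100 ratings; blocked is the Blocker switch.
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

function score(fit, risk, proof, blocked) {
  if (blocked) return { v: 0, verdict: 'defer' };  // hard veto: a live blocker beats any score
  const v = clamp(fit + proof - risk, 0, 200);     // raw range is -100..200
  // 'go' is the verdict the gauge labels "pick now"
  return { v, verdict: v >= 120 ? 'go' : v >= 70 ? 'maybe' : 'defer' };
}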

Move the dials to the guard preset and you'll land deep in the green; move to the index-rewrite preset and high Risk plus low Proof drag it down to defer — the same verdicts the cards reached, now under your own hands.
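To check the arithmetic by hand, here is the same scoring rule ported to Python and run on two candidates. The preset values are not given in the lesson, so this sketch assumes they match the ratings recorded in the LOOP-LOG.md example (guard 92 / 18 / 95, index rewrite 40 / 78 / 35).

```python
def clamp(x: int, lo: int, hi: int) -> int:
    return max(lo, min(hi, x))


def score(fit: int, risk: int, proof: int, blocked: bool = False) -> tuple:
    """Return (value, verdict) using the lesson's thresholds of 120 and 70."""
    if blocked:
        return 0, "defer"  # hard veto
    v = clamp(fit + proof - risk, 0, 200)
    verdict = "go" if v >= 120 else "maybe" if v >= 70 else "defer"
    return v, verdict


print(score(92, 18, 95))                # guard: 92 + 95 - 18 = 169 -> go
print(score(40, 78, 35))                # index rewrite: 40 + 35 - 78 = -3, clamped to 0 -> defer
print(score(92, 18, 95, blocked=True))  # same ratings with a live blocker -> defer
```

The last call shows the veto: the ratings are identical to the first, yet the verdict flips to defer.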

Competing units, side by side

Sometimes the decision isn't "which gap" but "how big a bite". For the empty-query fix there are at least three reasonable units, each a different size of the same change. None is wrong; each makes a different bargain between how fast it ships, how much it proves, and how much risk it carries. Lay them side by side and choose the bound on purpose. Hover or focus a card to bring it forward.

Just the guard

Add one early return for empty q, un-skip the one waiting test. The smallest provable unit.

# api.py — the whole change
if not (q := request.args.get("q", "").strip()):
    return jsonify(error="empty query"), 400

Pros

+Tiny: one guard, one test un-skipped.
+Proves cleanly — the spec already exists.

Cons

–Leaves other endpoints unguarded.

Pick this when: you want the highest-value, lowest-risk move that VERIFY can prove this pass. The default.

Guard + shared helper

Extract a require_query() helper and call it from /search now, ready for siblings later.

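from flask import abort  # assumes Flask, matching the handler's request.args / jsonify

def require_query(args):
    """Return the stripped q parameter, or abort with 400 if it is missing or empty."""
    q = args.get("q", "").strip()
    if not q:
        abort(400, "empty query")
    return q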

Pros

+One reusable guard for future endpoints.
+Still verifiable by the same test.

Cons

–Bigger surface; abstraction before a 2nd caller.
–Mixes "fix" with "refactor" in one unit.

Pick this when: two or more endpoints already need the same guard today, so the reuse is real, not speculative.

Guard every endpoint

Sweep all six query endpoints and add validation to each in one pass.

# touches 6 handlers + 6 new tests
for ep in (search, suggest, facet,
           related, recent, popular):
    add_guard(ep)   # one big unit

Pros

+Fixes the whole class of bug at once.

Cons

–Unbounded: a red test won't say which one.
–Five of six have no spec yet — low Proof.

Pick this when: almost never as one unit. Split it: ship the guard, then queue each sibling as its own bounded pass.


Bound for provability, not for ambition

All three units would improve RHG. The tiebreaker is the proof gate: unit A maps one-to-one onto an existing test, so VERIFY can make an unambiguous claim. Unit C bundles six behaviours, so one failure is a mystery and the human watching can't tell what moved. The loop's preference for the smallest provable bite isn't timidity; it's what keeps every pass debuggable and every claim honest. Unit C isn't rejected; it's re-shaped into one bounded unit per endpoint: the /search guard now, and the five sibling endpoints queued as five future passes.

Abstraction has a "rule of two"

Unit B extracts a helper before a second caller exists, which is speculative generality — risk with no present payoff. If LEARN had shown two endpoints already needing the guard today, B's reuse would be real and its rating would climb. The rubric only credits reuse you can point at, not reuse you imagine.

The Fit / Risk matrix

Two of the five axes do most of the picking, so it helps to plot just them. Put Fit on the vertical axis and Risk on the horizontal and you get four quadrants. The one you want is the top-left: high fit and low risk, so high value and unlikely to bite. That's where the empty-query guard lands.

Plot Fit against Risk and the decision almost makes itself: the top-left circle is your next move.

What to do in each quadrant

Top-left (high fit, low risk): pick it now; this is the sweet spot.
Top-right (high fit, high risk): the work matters but the bite is too big or too dangerous; split it into smaller provable units and pick the safest slice.
Bottom-left (low fit, low risk): harmless filler; it's safe but it doesn't move the done-when, so it waits.
Bottom-right (low fit, high risk): avoid; lots of danger for little value.
The matrix is a fast first pass; Proof, Blocker and Next then break ties among the survivors.
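The quadrant rule is easy to express as a function. In this sketch the midpoint of 50 on a 0 to 100 scale is an assumption (the lesson does not say where "high" starts), and the example ratings are the ones from the LOOP-LOG.md entry.

```python
def quadrant(fit: int, risk: int, midpoint: int = 50) -> str:
    """Map a (Fit, Risk) pair on 0-100 scales to the matrix's advice."""
    high_fit = fit >= midpoint
    high_risk = risk >= midpoint
    if high_fit and not high_risk:
        return "top-left: pick it now"
    if high_fit and high_risk:
        return "top-right: split into smaller provable units"
    if not high_fit and not high_risk:
        return "bottom-left: safe filler, it waits"
    return "bottom-right: avoid"


print(quadrant(92, 18))  # empty-query guard -> top-left: pick it now
print(quadrant(40, 78))  # index rewrite -> bottom-right: avoid
```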

Why the index rewrite sits bottom-right

It scored low Fit (the guard already satisfies the contract line) and high Risk (it touches every caller with no waiting test). That's the avoid quadrant — not because it's a bad idea forever, but because as this pass's unit it's the worst trade on the board.

One unit, three approaches

You've picked the unit: add the empty-query guard. But even one unit can be built more than one way, and ANALYZE is also where you choose the approach. Switch between three ways to implement the guard; the diagram and the rating note update together so you can feel each trade-off before EXECUTE touches a line.

Read left → right: the request reaches the handler, which checks the query itself before the index.

Inline in the handler

The guard is three lines at the top of /search. The change lives exactly where its one effect is.

+Smallest diff; maps 1:1 onto the waiting test.

+Zero blast radius — touches one handler.

−If a sibling needs it later, you copy it.

Fit: high · Risk: low · Proof: clean

A decorator

A @require_query wrapper holds the check; you opt routes in one at a time. Reuse without a global blast radius.

+One guard, opt-in per route — clean reuse.

+Still provable by the same single test.

−An abstraction before a 2nd caller needs it.

Fit: good · Risk: medium · Proof: clean
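To make the opt-in property concrete, here is a framework-free sketch of the decorator approach. It assumes handlers take a plain dict of query arguments and return a (body, status) tuple; those conventions, and the `search` and `upload` stand-ins, are illustrative rather than the RHG service's real code.

```python
from functools import wraps


def require_query(handler):
    """Reject a missing or empty q before the wrapped handler runs."""
    @wraps(handler)
    def wrapper(args: dict):
        q = (args.get("q") or "").strip()
        if not q:
            return {"error": "empty query"}, 400
        return handler({**args, "q": q})
    return wrapper


@require_query
def search(args: dict):
    # Stand-in for the real /search handler.
    return {"results": [f"row matching {args['q']}"]}, 200


def upload(args: dict):
    # Not decorated: routes without a q are left alone.
    return {"ok": True}, 200


print(search({}))             # ({'error': 'empty query'}, 400)
print(search({"q": "  "}))    # ({'error': 'empty query'}, 400)
print(search({"q": "cats"}))  # ({'results': ['row matching cats']}, 200)
print(upload({}))             # ({'ok': True}, 200)
```

Only decorated handlers gain the check, which is the difference from global middleware: `upload` never sees it.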

Global middleware

Validation runs on every request before any route. Powerful — and exactly why it's risky: it touches endpoints that have no q at all.

+One place guards the whole app forever.

−Hits routes like /upload that shouldn't be touched.

−Big blast radius; the one test can't cover it.

Fit: overshoots · Risk: high · Proof: weak

Approach is part of ANALYZE, not EXECUTE

Choosing how to build the unit is still analysis — it changes Risk and Proof, so it belongs before any code is written. All three diagrams ship in one inline SVG; the tablist swaps which <g data-diagram> is shown and rewrites the caption and aria-label, with arrow-key roving focus per the WAI-ARIA tabs pattern. For RHG, approach A wins the same way unit A did: highest Fit and Proof, lowest Risk. The decorator only becomes the right call once a real second caller exists; the middleware overshoots the scope and drags in routes that have no query at all.

One unit at a time vs batch everything

This is the habit ANALYZE exists to enforce, so look at it head-on. On the left, the loop's way: pick one unit, prove it, then go again. On the right, the tempting way: grab everything at once. They feel similar at the start and diverge completely at the proof gate.

Same start, opposite ends. One unit gives VERIFY a clean yes/no; a batch gives it a mystery.

Why one wins

The bound marks the largest change VERIFY can still make an unambiguous claim about. Cross that line and a red result stops telling you anything, and the human watching the log can no longer see what moved. "Pick one" is the price of a meaningful proof gate.

The chosen unit, as a plan

The decision is made: ship the empty-query guard. ANALYZE's last act is to shape that one unit into a tiny plan EXECUTE can follow — what it does, the steps, the risks, and the exit bar that proves it's done. Click along the strip to read each phase. Notice the whole plan is for one unit; this is a bounded change, not a project.

Think of it like… a recipe card for a single dish, not the whole dinner. Ingredients, three steps, the one way to know it's cooked (the test goes green). You don't plan the entire menu — you plan the one plate going out next.

Unit: Empty-query guard on /search
Bucket: on-scope
Verdict: pick now (Fit 92 · Risk 18 · Proof 95)


Step 1 · do first

Reproduce the gap

~2 min

Goal: See the bug with your own eyes before fixing it, so EXECUTE starts from a confirmed failure, not a belief.

Tasks

Hit GET /search?q= and confirm it returns all rows
Run pytest -q to confirm the baseline is green (11 passed)
Locate the skipped spec test_empty_query_returns_400

Done when

The empty-query bug is reproduced on the running service
The waiting test is found and read

Risks & mitigations

Low · Can't run the service locally: env not set up. Mitigation: the skipped test already encodes the bug, so read it instead of curling.

Step 2 · next

Add the guard

~5 min

Goal: One early return in api.py that rejects a missing or empty q with a 400 — and nothing else.

Tasks

Read q with a default and .strip() — catch both None and ""
Return 400 with a short error body when it's empty
Leave the rest of the handler untouched (no scope creep)

Done when

The guard handles missing and empty-string queries
The diff touches only the one handler

Risks & mitigations

Med · Missing the empty-string case: a naive if q is None lets ?q= through. Mitigation: use the grounded fact from LEARN and strip, then test for falsy.

Low · Quietly widening the change: tempting to also tidy nearby code. Mitigation: that's a new unit, so note it and don't do it here.
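The empty-string risk is easy to see side by side. This sketch uses a plain dict in place of the request's query arguments (an assumption made to keep it runnable without a web framework); both function names are hypothetical.

```python
def naive_guard(args: dict) -> bool:
    """Rejects only a missing q; an empty ?q= slips through."""
    return args.get("q") is None


def strip_guard(args: dict) -> bool:
    """Rejects missing, empty, and whitespace-only q."""
    return not args.get("q", "").strip()


cases = {"missing": {}, "empty": {"q": ""}, "spaces": {"q": "   "}, "real": {"q": "cats"}}
for label, args in cases.items():
    print(f"{label:8} naive rejects={naive_guard(args)!s:5} strip rejects={strip_guard(args)}")
```

The two guards agree on a missing parameter and on a real query; they differ on the empty and whitespace-only cases, which is the gap the un-skipped test needs to cover.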

Step 3 · next

Un-skip the test

~2 min

Goal: Turn the waiting spec into a live check, so the change has a real boundary to be judged against.

Tasks

Remove the @skip from test_empty_query_returns_400
Add a case for ?q= (empty string) alongside the missing-param case

Done when

The test runs (no longer skipped) and asserts a 400
Both empty and missing q are covered

Risks & mitigations

Low · Test asserts the wrong code: 400 vs 422 was a needs-user fork. Mitigation: use the human's answer from the handoff; don't guess in the test.

Step 4 · prove

Clear the exit bar

~1 min

Goal: Hand a unit to VERIFY that proves itself at the real boundary — the measurable gate that says "done", not "looks done".

Tasks

Run the suite: the once-skipped test now passes
Confirm the other 11 tests still pass — no regression

Exit bar (the proof gate)

test_empty_query_returns_400 goes from skipped → green
Full suite green; diff limited to one handler + one test
/search?q= returns 400, not all rows

Risks & mitigations

High · "It looks right" instead of proof: claiming success without running the gate. Mitigation: VERIFY runs the real test, never a claim, never a mock (lesson 6).
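Because the exit bar is measurable, it can be written as a check over test outcomes and the diff. This is a sketch under assumptions: outcomes are given as a name-to-status mapping, the placeholder test names and the file name test_api.py are hypothetical, and the endpoint returning 400 is taken to be what the target test itself asserts.

```python
def exit_bar_cleared(before: dict, after: dict, target: str,
                     changed_files: set, allowed_files: set) -> bool:
    """Check the gate: target test flipped, suite green, diff bounded."""
    target_flipped = before.get(target) == "skipped" and after.get(target) == "passed"
    suite_green = all(outcome == "passed" for outcome in after.values())
    diff_bounded = changed_files <= allowed_files
    return target_flipped and suite_green and diff_bounded


target = "test_empty_query_returns_400"
# Baseline from the plan: 11 passing tests plus the one skipped spec.
before = {**{f"test_{i}": "passed" for i in range(11)}, target: "skipped"}
after = {name: "passed" for name in before}
allowed = {"api.py", "test_api.py"}

print(exit_bar_cleared(before, after, target, {"api.py", "test_api.py"}, allowed))  # True
print(exit_bar_cleared(before, after, target, {"api.py", "index.py"}, allowed))     # False
```

The second call fails only because the diff touched a file outside the bound, even though every test is green.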

The plan is shaped around the proof gate

Notice the last phase isn't "ship it" — it's "clear the exit bar". The exit bar is a measurable gate: a specific test going from skipped to green, the rest of the suite staying green, the endpoint returning 400. "It looks right" is not a gate. Shaping the unit this way at ANALYZE time means EXECUTE knows precisely what success is, and VERIFY (lesson 6) has an unambiguous thing to prove. The milestone bar is a small tablist driven by per-step state; in a real run those states would reflect the live tracker, not the plan as written.

One unit, four small steps — still one unit

Reproduce → add guard → un-skip test → prove is not four units; it's the internal shape of one bounded change. Each step is a read or a minimal edit that builds to a single provable outcome. That's the difference between a plan for a unit and a project plan.

Where ANALYZE sits in the loop

ANALYZE is the hinge between looking and doing. It only runs after LEARN has grounded the facts, and it must finish before EXECUTE touches anything — because EXECUTE needs exactly one chosen unit to act on. Here's the whole cycle with ANALYZE lit, and the one thing it must hand forward.

ANALYZE turns LEARN's facts into exactly one unit for EXECUTE. Its single output — the chosen unit — is the whole point of the step.

in: grounded facts
classify → rate → pick
out: ONE bounded unit
no editing here

In the code: the decision, captured

An ANALYZE pass doesn't change a single file — it produces a decision and writes it down. Here's what that looks like as the note that lands in the observability log: each gap's bucket, the ratings of the live candidates, and the one unit chosen, with its reason. The human can read this and know the next move without re-deriving it.

LOOP-LOG.md — an ANALYZE pass, fix/empty-query

# ANALYZE — classify, rate, pick

classify:
  empty-q guard ................. on-scope     # done-when line
  tests don't run? (they do) .... n/a          # baseline green
  400 vs 422 status code ........ needs-user   # product call → handoff
  rewrite search index .......... out-of-scope # parked: backlog#214

rate (Fit / Risk / Proof):
  A · add guard ................. 92 / 18 / 95   # blocker: none
  B · index rewrite ............ 40 / 78 / 35   # blocker: many callers
  C · status-code call ......... 55 / 22 / 50   # blocker: human-only

pick: A — add the empty-query guard (inline)
  # highest Fit + Proof, zero blocker, maps to the waiting test.
  # C handed off as decision-ready; B parked. no files changed.

Access it yourself

The ANALYZE rationale lives in the run's observability file — typically LOOP-LOG.md at the repo root. Read it with sed -n '1,40p' LOOP-LOG.md or jump to the latest pass with grep -n "ANALYZE" LOOP-LOG.md | tail -1. It is append-only: each pass adds its classify / rate / pick block so the decision history is auditable.
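The same lookup can be done programmatically. This sketch assumes every pass starts with a heading line beginning "# ANALYZE" and that any other line starting with "# " opens a different step's block; the sample log text is made up for illustration.

```python
def latest_analyze_block(log_text: str, header: str = "# ANALYZE") -> str:
    """Return the last block starting with the header, up to the next '# ' heading or EOF."""
    lines = log_text.splitlines()
    starts = [i for i, line in enumerate(lines) if line.startswith(header)]
    if not starts:
        return ""
    start = starts[-1]
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if lines[i].startswith("# "):
            end = i
            break
    return "\n".join(lines[start:end]).rstrip()


log = """# ANALYZE pass 1
pick: A
# EXECUTE pass 1
edited api.py
# ANALYZE pass 2
pick: suggest guard
"""
print(latest_analyze_block(log))
```

Indented trailing comments inside a block do not end it, because they start with spaces rather than "# ".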

Crucially there are no edit commands here — an ANALYZE pass is pure decision. If you ever see a file change attributed to ANALYZE, the step bled into EXECUTE; that's the boundary the loop keeps clean (lesson 5). And any external fact a rating leaned on is grounded the one allowed way — the Bright Data CLI (lesson 11), never memory, never WebSearch/WebFetch.

Worked example: from a pile of gaps to one unit

Tie it together on the RHG task, start to finish — the ANALYZE part only. LEARN handed over a grounded picture with several loose threads. Watch a careful ANALYZE pass turn that pile into a single, defensible next move.

classify · Sort every gap into a bucket

The empty-query guard is squarely on-scope — it's the done-when line itself. "Which status code, 400 or 422?" is a product call, so it's needs-user. "Rewrite the search index" is real but unrelated to this goal, so it's out-of-scope — parked as a backlog note, not dropped. The baseline being green means there's no do-now blocker. Four gaps, four boxes.

rate · Score the actionable candidates

Only the on-scope guard truly survives as a unit this pass. Rated: Fit 92 (it is the contract line), Risk 18 (one handler, no callers affected), Proof 95 (a waiting test makes verification trivial), Blocker none, Next high (closes the core gap). The index rewrite rates the opposite way and the status-code call is a handoff — neither is this pass's unit.

pick · Choose exactly one

The guard wins on every axis that matters for a next move: high value, low risk, trivially provable, unblocked. Pick it — and resist the pull to also fix the five sibling endpoints "while I'm here". Those become their own queued units. One unit goes to EXECUTE.

hand off · Route the rest, don't lose it

The status-code question goes to the human as a decision-ready handoff (400 vs 422, with the trade-off stated) so EXECUTE isn't blocked guessing. The index rewrite is parked in the backlog. Nothing real is dropped; everything is either chosen, handed off, or recorded.

What ANALYZE produced

One chosen unit (the inline empty-query guard), three routed gaps (one handed off, one parked, one already-fine baseline), and a one-paragraph rationale in the log. Zero files changed. EXECUTE now has a single, bounded, provable thing to do — and the human can see exactly why. That is an ANALYZE pass done right.

The pass, as it lands for the human to observe

The whole point is a legible decision the human reads without re-doing it (the observability the suite runs on — lesson 7). Decision only; every rating has a reason, every routed gap has a destination.

# ANALYZE result — fix/empty-query
chosen unit : add empty-query guard (inline, /search)
rationale   : Fit 92 · Risk 18 · Proof 95 · blocker none
handoff     : status code 400 vs 422 → decision-ready to user
parked      : index-layer rewrite → backlog#214
deferred    : 5 sibling endpoints → one unit each, later passes
files changed: 0   # ANALYZE never edits

Quick check: did it stick?

Recall beats re-reading. Answer each from memory before you peek — the option you pick grades instantly, with a note on why. No tells in the formatting; the answers are spread around on purpose.

Q1. What are the three moves of an ANALYZE pass, in order?

Q2. A gap is a real product decision only the human can make. Which bucket?

Q3. Why does an ANALYZE pass pick only one bounded unit?

Q4. A unit has high Fit but a live Blocker. What does the rubric say?

Q5. What does a finished ANALYZE pass change in the repo?


Your agent is your teacher. Want to run a real ANALYZE pass on your own backlog, or unsure how to bound a unit so VERIFY can prove it? Ask. Next — now that you've chosen exactly one unit — is doing it without scope creep: EXECUTE: one bounded unit, done right.