Phase 3
The build log for the third ship. Posture transitions, Disposition, Receipt, hook adapter, full CLI, and E2E coverage. Mirrors RECEIPTS.md with day-of-build context.
Headline
Shipped: v0.3.0a0, 2026-05-09. Six commits on claude/setup-project-structure-3YeiT ahead of main after the operator's Phase 2 ff-merge to main at e5e37bf. CI green across all 9 matrix cells.
Scope: Pure posture state machine (transition, evaluate), the closed Disposition enum, the content-addressable Receipt, the I/O hook adapter (run_hook, main), the full spine-lite CLI surface (validate-manifest, classify, hook), and an end-to-end test suite that exercises the same path Claude Code invokes when wiring the PreToolUse hook.
What's stable: Everything in spine_lite.__all__ after this phase. Phase 3 adds Disposition, Receipt, transition, evaluate to the surface. The full public API:
from spine_lite import (
    PRECEDENCE, Effect, most_restrictive,
    Manifest, ToolDefinition, parse_manifest,
    ToolCall, Decision, classify,
    Posture, Disposition, transition, evaluate,
    Receipt,
    SpineLiteError, ManifestError, ClassificationError,
    PostureError, HookError,
    __version__,
)
spine_lite.hook.run_hook and spine_lite.hook.main remain accessible via the submodule import for programmatic users; the canonical operator entry point is the spine-lite hook console script.
What's not in scope: Real-host integration (smoke against an actual Claude Code session) — the E2E tests run via subprocess against the installed spine-lite console script, which is the closest faithful test the build sandbox supports. PyPI publish remains a project-level decision.
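The subprocess-style E2E check described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's test code: the helper name `run_hook_subprocess`, the stand-in child process, and the shape of the decision JSON are all hypothetical. A real test would pass the `spine-lite hook` command with its `--manifest` flag instead of the stand-in.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for the real hook process, so the sketch runs without
# spine-lite installed: it reads a payload from stdin, prints a decision as
# JSON, and exits 1 (the DENY exit code).
STAND_IN = (
    "import json, sys; "
    "payload = json.load(sys.stdin); "
    "print(json.dumps({'tool': payload['tool'], 'disposition': 'deny'})); "
    "sys.exit(1)"
)


def run_hook_subprocess(
    command: list[str], payload: dict
) -> "subprocess.CompletedProcess[str]":
    """Send the payload on stdin and capture stdout, stderr and the exit code."""
    return subprocess.run(
        command,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is a policy outcome, not a test failure
    )


# A real test would use something like:
#   ["spine-lite", "hook", "--manifest", "<path to manifest>"]
result = run_hook_subprocess(
    [sys.executable, "-c", STAND_IN],
    {"tool": "Bash", "arguments": {}},
)
decision = json.loads(result.stdout)
```

Asserting on both `result.returncode` and the parsed stdout keeps the test at the same boundary the host sees: stdin in, stdout and an exit code out.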
Commit timeline
| # | SHA prefix | Subject |
|---|---|---|
| 1 | `6cdcde5` | chore: enable attr_list mkdocs extension |
| 2 | `29bfb63` | feat: posture state machine with Disposition and evaluate |
| 3 | `7cd329b` | feat: Receipt dataclass with deterministic serialization |
| 4 | `0d92074` | feat: PreToolUse hook adapter |
| 5 | `d3e6cb6` | feat: full CLI surface with integration and E2E tests |
| 6 | (this commit) | release: bump to v0.3.0a0 + phase 3 exit receipt |
Each commit independently passed the local verification gate before being staged.
Design choices recorded
- Disposition is closed at three members: `ALLOW`, `DENY`, `ESCALATE`. Adding a fourth (e.g. `LOG_ONLY`) would require updating every consumer that exhaustively matches, plus the exit-code contract; the same closed-taxonomy logic applies as for `Effect`.
- `evaluate` is layered explicitly. The order of checks matters: posture allow-list first (one tool, many postures), `LOCKED`/`DRY_RUN` posture-specific rules next, `require_confirmation` last. Moving the posture-specific checks ahead of the allow-list would let a deny-listed tool slip through under `LOCKED`.
- Exit codes use a wide range: `0` ALLOW / `1` DENY / `2` ESCALATE / `64` HOOK_ERROR / `65` MANIFEST_ERROR. Codes from `64` up fall in the sysexits.h range and are used here for internal failures, which keeps policy outcomes distinguishable from protocol errors at the host's exit-code layer.
- Receipt fields are content-addressable. `to_canonical_json` uses `sort_keys=True`, `ensure_ascii=False`, and compact separators. The hash is `sha256(canonical_json.encode("utf-8"))`. No timestamps, no UUIDs, no per-run metadata: anything that varies between runs lives in the hook's external metadata, not the receipt itself.
- Hook contract is intentionally minimal. The payload is a top-level JSON object with `tool` (required, non-empty string) and `arguments` (optional, object). Other fields are ignored. This lets the same hook adapt to multiple host hook formats; spine-lite doesn't need to be re-released when a host's payload schema evolves.
- Stdin ↔ stdout is the boundary, not a config file. The CLI's `--manifest` flag is the configuration; payload comes via stdin; decision goes to stdout. No log files are written by default. Whether to capture receipts to disk is the operator's choice (a `--receipt-dir` flag is reserved for a future minor release).
- CLI uses `Annotated`-style typer parameters. This avoids `B008` cleanly and matches typer's modern recommended idiom. The one carve-out: the `Path` import stays at module top level in `cli.py` (`TC003` ignored for this file only) because typer introspects the runtime annotation to do `exists=True` validation.
- E2E tests run `python -m spine_lite.cli` in a subprocess. True fresh-venv install is out of scope for the build sandbox, but invoking the installed entry point captures everything except the console-script shim.
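Three of the choices above (the minimal hook contract, the exit-code mapping, and the content-addressed receipt) are mechanical enough to sketch. The following is a minimal illustration under assumptions, not spine-lite's actual code: the function names, the `ExitCode` class, the enum values, and the use of `ValueError` are hypothetical. Only the field names, exit-code numbers, and `json.dumps` settings come from the notes above.

```python
import hashlib
import json
from enum import Enum, IntEnum


class Disposition(Enum):
    """The closed three-member outcome. The string values are assumptions."""
    ALLOW = "allow"
    DENY = "deny"
    ESCALATE = "escalate"


class ExitCode(IntEnum):
    """Exit codes as listed above. The class name is hypothetical."""
    ALLOW = 0
    DENY = 1
    ESCALATE = 2
    HOOK_ERROR = 64
    MANIFEST_ERROR = 65


def parse_payload(raw: str) -> tuple[str, dict]:
    """Validate the minimal contract: `tool` required, `arguments` optional."""
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    tool = payload.get("tool")
    if not isinstance(tool, str) or not tool:
        raise ValueError("'tool' must be a non-empty string")
    arguments = payload.get("arguments", {})
    if not isinstance(arguments, dict):
        raise ValueError("'arguments' must be an object")
    return tool, arguments  # every other top-level field is ignored


def canonical_json(fields: dict) -> str:
    """Serialize with sorted keys, raw Unicode, and compact separators."""
    return json.dumps(
        fields, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )


def content_hash(fields: dict) -> str:
    """SHA-256 hex digest of the UTF-8 encoded canonical JSON."""
    return hashlib.sha256(canonical_json(fields).encode("utf-8")).hexdigest()


def exit_code_for(disposition: Disposition) -> ExitCode:
    """Map a policy outcome to its exit code by member name."""
    return ExitCode[disposition.name]
```

Because the canonical form sorts keys and carries no per-run data, two receipts with the same fields hash identically regardless of the order in which the fields were supplied.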
Verification on the green run
- `ruff check`: clean
- `ruff format --check`: clean
- `mypy --strict src tests`: clean across 19 source files
- `pytest`: 209 / 209 passing
- Coverage: 100% on every runtime module
- `mkdocs build --strict`: clean
- Hypothesis: 9 properties × 1,000 examples each (six classifier + three receipt)
- E2E subprocess tests: 7 cases (5 posture × tool combinations + byte-stability + version)
Phase 3 exit gate
| # | Item | State |
|---|---|---|
| 1 | `posture.py` (transitions + evaluate) 100% coverage | ✓ (34 stmts, 14 branches, 0 miss) |
| 2 | `receipt.py` 100% coverage | ✓ (22 stmts, 0 miss) |
| 3 | `hook.py` 100% coverage | ✓ (54 stmts, 6 branches, 0 miss) |
| 4 | `cli.py` 100% coverage | ✓ (49 stmts, 0 miss) |
| 5 | Integration tests for every subcommand | ✓ |
| 6 | E2E smoke via installed entry point | ✓ (7 subprocess cases) |
| 7 | `mypy --strict` clean | ✓ |
| 8 | CI green on all 9 matrix cells | (pending push verification) |
| 9 | CHANGELOG entry for `v0.3.0a0` | ✓ |
| 10 | All commits in Conventional Commits format | ✓ |
| 11 | Receipt appended to `RECEIPTS.md` | ✓ (this commit) |
Lessons for after Phase 3
- Closed enums fold neatly into ordered transition tables. Once `Posture` was closed at four members, encoding the transition rules as `dict[Posture, frozenset[Posture]]` produced a fully-typed structure with zero special-cased code paths.
- Hypothesis on dataclasses with `st.builds(...)` is cheap. The `Receipt` strategy at 1,000 examples per property is ~3 seconds. Tightening the strategy bounds (max sizes for arguments, text fields) keeps it that way.
- Annotated-style typer is worth the upgrade. `B008` ignore is an anti-pattern; the Annotated form is cleaner and the typer docs now lead with it.
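The transition-table lesson can be illustrated with a small sketch. Only `LOCKED` and `DRY_RUN` are named in this log; the other two member names, the enum values, and every transition rule below are placeholders chosen for illustration, and the `transition` signature is an assumption rather than the shipped one. `PostureError` is defined locally as a stand-in for the exported exception.

```python
from enum import Enum


class Posture(Enum):
    # LOCKED and DRY_RUN appear in this log; NORMAL and ELEVATED are
    # hypothetical names standing in for the other two members.
    NORMAL = "normal"
    DRY_RUN = "dry_run"
    ELEVATED = "elevated"
    LOCKED = "locked"


class PostureError(Exception):
    """Local stand-in for the exception raised on an illegal transition."""


# Hypothetical rules: each posture maps to the set of postures it may move to.
ALLOWED: dict[Posture, frozenset[Posture]] = {
    Posture.NORMAL: frozenset({Posture.DRY_RUN, Posture.ELEVATED, Posture.LOCKED}),
    Posture.DRY_RUN: frozenset({Posture.NORMAL, Posture.LOCKED}),
    Posture.ELEVATED: frozenset({Posture.NORMAL, Posture.LOCKED}),
    Posture.LOCKED: frozenset({Posture.NORMAL}),
}


def transition(current: Posture, target: Posture) -> Posture:
    """Return the target posture if the table permits the move, else raise."""
    if target not in ALLOWED[current]:
        raise PostureError(f"illegal transition: {current.name} -> {target.name}")
    return target
```

With the rules held as data, `transition` is a single lookup with no per-posture branches, and a test can check that the table has an entry for every member of the enum.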
See also
- `RECEIPTS.md`: canonical phase-day receipts.
- `CHANGELOG.md`: what shipped in each version.
- Phase 2 history: what shipped immediately before.
- Wire into Claude Code: operator runbook now backed by a working hook.