From 734a6522c16e376bb0ded5a6409a44b15c482159 Mon Sep 17 00:00:00 2001
From: sjat
Date: Wed, 24 Jun 2026 15:00:07 +0200
Subject: [PATCH] docs(rack): Phase 3 network implementation plan

---
 notes/dev/plans/2026-06-24-rack-network.md | 593 +++++++++++++++++++++
 1 file changed, 593 insertions(+)
 create mode 100644 notes/dev/plans/2026-06-24-rack-network.md

diff --git a/notes/dev/plans/2026-06-24-rack-network.md b/notes/dev/plans/2026-06-24-rack-network.md
new file mode 100644
index 0000000..8f03cab
--- /dev/null
+++ b/notes/dev/plans/2026-06-24-rack-network.md
@@ -0,0 +1,593 @@
+# Rack Network (Phase 3) Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Add network-cabling data (`links:` feeds + switch/patch-panel peer files) to the rack pipeline, validate it (rule 4), and render a mermaid network graph on the generated rack page — reusing every Phase 1/2 mechanism.
+
+**Architecture:** Extend the existing `scripts/gen_rack.py` with `load_hardware_index` (global hostname→frontmatter map for peer resolution), `validate_links` (rule 4), and `render_network` (a `flowchart LR` with local interface, peer port, and speed on each edge label); insert a `## Network` section into `render_page` between Power and Occupancy. Switch/patch-panel files are normal placed items that Phase 1 already draws and `gen_overview.py` already lists. Mermaid is already enabled.
+
+**Tech Stack:** Python 3 (stdlib + PyYAML only), pytest, MkDocs Material, Forgejo Actions CI.
+
+**Spec:** `notes/dev/specs/2026-06-24-rack-network-design.md`.
+
+## Global Constraints
+
+- Scripts use **stdlib + PyYAML only**; deterministic and offline (copy existing `gen_rack.py` style). No randomness/time in generated output.
+- `re` and `yaml` are already imported in `scripts/gen_rack.py`; do not add new imports.
+- `_node_id` (Phase 2) is reused for mermaid node ids — do not redefine it.
+- Validation failures raise `SchemaError`; `generate` prints `ERROR: …` to stderr and returns `1`, **writing nothing** on failure (existing behaviour).
+- Generated files keep the existing `_Auto-generated … do not edit by hand_` banner (already emitted by `render_page`).
+- **Peer resolution is global** (against all `docs/hardware/*.md` hostnames), not per-rack — rule 4 says "resolves to a real file".
+- `peer_port` range is checked **only when the peer declares an integer `ports`**.
+- Edge label format: `{local} → p{peer_port} · {speed}G`, with the ` · {speed}G` suffix omitted when `speed_gbps` is absent. Use the unicode arrow `→` (not `->`) to avoid clashing with mermaid's `-->` syntax.
+- A node whose kind is `switch` or `patch-panel` renders as `{name}<br/>{kind}`; all other nodes render as the bare hostname.
+- Network data added here is **provisional placeholder data** (like the mfNN positions and the Phase 2 power data), not real values.
+- **No edits** to `mkdocs.yml`, `Makefile`, `.forgejo/workflows/docs.yml`, or `scripts/overview_config.yml` (`switch`/`patch-panel`/`ap` already in the enum; drift already covers `racks/`).
+- `mkdocs build --strict` must pass; `make docs-check` must exit 0 after regeneration.
+
+---
+
+### Task 1: `load_hardware_index` + `validate_links` — rule 4 (TDD)
+
+Add the global peer index and link validation, and wire `validate_links` into `generate`. Testable on validation alone.
+
+**Files:**
+- Modify: `scripts/gen_rack.py` (add `load_hardware_index`, `validate_links`; build the index and call `validate_links` in `generate`)
+- Modify: `tests/test_gen_rack.py` (append tests)
+
+**Interfaces:**
+- Consumes: `SchemaError`, `parse_frontmatter`, the `item()`/`_write_item` test helpers, `generate`.
+- Produces:
+  - `load_hardware_index(hardware_dir: Path) -> dict[str, dict]` — `{hostname: frontmatter}` for every `*.md` (excluding `index.md`).
+  - `validate_links(items: list[dict], hw_index: dict[str, dict]) -> None` — raises `SchemaError` on a malformed/dangling link.
+
+- [ ] **Step 1: Append failing tests to `tests/test_gen_rack.py`**
+
+```python
+def test_load_hardware_index_maps_all_hostnames(tmp_path):
+    hw = tmp_path / "hardware"
+    hw.mkdir()
+    _write_item(
+        hw, "sw01",
+        "---\nhostname: sw01\nkind: switch\nstatus: in-use\nports: 24\n---\n",
+    )
+    _write_item(
+        hw, "mf00",
+        "---\nhostname: mf00\nkind: server\nstatus: in-use\n"
+        "rack: rack01\nrack_u: 1\nu_height: 1\nrack_face: front\n---\n",
+    )
+    idx = gen_rack.load_hardware_index(hw)
+    assert set(idx) == {"sw01", "mf00"}
+    assert idx["sw01"]["ports"] == 24
+
+
+def test_validate_links_accepts_valid_link():
+    items = [item(hostname="mf01", rack_u=2, u_height=1, rack_face="front",
+                  links=[{"local": "eth0", "peer": "sw01",
+                          "peer_port": 1, "speed_gbps": 1}])]
+    hw_index = {"sw01": item(hostname="sw01", kind="switch", ports=24)}
+    gen_rack.validate_links(items, hw_index)
+
+
+def test_validate_links_rejects_unknown_peer():
+    items = [item(hostname="mf01", rack_u=2, u_height=1, rack_face="front",
+                  links=[{"local": "eth0", "peer": "ghost", "peer_port": 1}])]
+    with pytest.raises(gen_rack.SchemaError):
+        gen_rack.validate_links(items, {})
+
+
+def test_validate_links_rejects_peer_port_over_count():
+    items = [item(hostname="mf01", rack_u=2, u_height=1, rack_face="front",
+                  links=[{"local": "eth0", "peer": "sw01", "peer_port": 25}])]
+    hw_index = {"sw01": item(hostname="sw01", kind="switch", ports=24)}
+    with pytest.raises(gen_rack.SchemaError):
+        gen_rack.validate_links(items, hw_index)
+
+
+def test_validate_links_accepts_peer_without_ports():
+    items = [item(hostname="mf01", rack_u=2, u_height=1, rack_face="front",
+                  links=[{"local": "eth0", "peer": "rtr01", "peer_port": 99}])]
+    hw_index = {"rtr01": item(hostname="rtr01", kind="server")}
+    gen_rack.validate_links(items, hw_index)  # no ports -> range check skipped
+
+
+def test_validate_links_rejects_missing_local():
+    items = [item(hostname="mf01", rack_u=2, u_height=1, rack_face="front",
+                  links=[{"peer": "sw01", "peer_port": 1}])]
+    hw_index = {"sw01": item(hostname="sw01", kind="switch", ports=24)}
+    with pytest.raises(gen_rack.SchemaError):
+        gen_rack.validate_links(items, hw_index)
+
+
+def test_validate_links_rejects_malformed_entry():
+    items = [item(hostname="mf01", rack_u=2, u_height=1, rack_face="front",
+                  links=["sw01"])]
+    with pytest.raises(gen_rack.SchemaError):
+        gen_rack.validate_links(items, {})
+
+
+def test_generate_returns_1_on_bad_link_peer(tmp_path):
+    hw = tmp_path / "hardware"
+    out = tmp_path / "out"
+    hw.mkdir()
+    _write_item(
+        hw, "mf00",
+        "---\nhostname: mf00\nkind: server\nstatus: in-use\n"
+        "rack: rack01\nrack_u: 1\nu_height: 1\nrack_face: front\n"
+        "links:\n - { local: eth0, peer: ghost, peer_port: 1 }\n---\n",
+    )
+    rc = gen_rack.generate(hw, out)
+    assert rc == 1
+    assert not (out / "rack01.md").exists()
+```
+
+- [ ] **Step 2: Run to verify failure**
+
+Run: `pytest tests/test_gen_rack.py -q`
+Expected: FAIL — `AttributeError: module 'gen_rack' has no attribute 'load_hardware_index'`.
+
+- [ ] **Step 3: Add `load_hardware_index` and `validate_links` after `check_overlaps` in `scripts/gen_rack.py`**
+
+Add these two functions (place them just after `check_overlaps`, before `_pdu_index`):
+
+```python
+def load_hardware_index(hardware_dir: Path) -> dict[str, dict]:
+    """Map hostname -> frontmatter for every hardware file (global peer lookup)."""
+    index: dict[str, dict] = {}
+    for path in sorted(hardware_dir.glob("*.md")):
+        if path.name == "index.md":
+            continue
+        fm = parse_frontmatter(path)
+        if fm is None:
+            continue
+        name = fm.get("hostname")
+        if isinstance(name, str) and name:
+            index[name] = fm
+    return index
+
+
+def validate_links(items: list[dict], hw_index: dict[str, dict]) -> None:
+    """Validate `links` cable declarations (rule 4).
+
+    Every links[].peer must resolve to a real hardware file (global lookup via
+    hw_index); peer_port must fall within the peer's declared `ports` when it
+    declares an integer count.
+    """
+    for fm in items:
+        links = fm.get("links")
+        if links is None:
+            continue
+        name = fm.get("hostname", "?")
+        if not isinstance(links, list):
+            raise SchemaError(f"{name}: links must be a list")
+        for link in links:
+            if not isinstance(link, dict):
+                raise SchemaError(f"{name}: links entry must be a mapping")
+            local = link.get("local")
+            peer = link.get("peer")
+            peer_port = link.get("peer_port")
+            if not isinstance(local, str) or not local:
+                raise SchemaError(f"{name}: links entry needs a non-empty 'local'")
+            if not isinstance(peer, str) or not peer:
+                raise SchemaError(f"{name}: links entry needs a non-empty 'peer'")
+            if not isinstance(peer_port, int):
+                raise SchemaError(
+                    f"{name}: links entry for {peer} needs an integer 'peer_port'"
+                )
+            target = hw_index.get(peer)
+            if target is None:
+                raise SchemaError(
+                    f"{name}: links peer={peer!r} is not a known hardware file"
+                )
+            ports = target.get("ports")
+            if isinstance(ports, int) and (peer_port < 1 or peer_port > ports):
+                raise SchemaError(
+                    f"{name}: peer_port {peer_port} out of range 1..{ports} on {peer}"
+                )
+```
+
+- [ ] **Step 4: Wire `validate_links` into `generate` in `scripts/gen_rack.py`**
+
+`generate` currently begins:
+
+```python
+def generate(hardware_dir: Path, output_dir: Path) -> int:
+    items = load_rack_items(hardware_dir)
+
+    errors: list[str] = []
+```
+
+Add the global index right after `items` is loaded:
+
+```python
+def generate(hardware_dir: Path, output_dir: Path) -> int:
+    items = load_rack_items(hardware_dir)
+    hw_index = load_hardware_index(hardware_dir)
+
+    errors: list[str] = []
+```
+
+Then extend the per-rack validation loop. Replace:
+
+```python
+    if not errors:  # only check overlaps once placements are individually valid
+        for rack, ritems in racks.items():
+            try:
+                check_overlaps(ritems)
+                validate_power(ritems)
+            except SchemaError as e:
+                errors.append(f"{rack}: {e}")
+```
+
+with:
+
+```python
+    if not errors:  # only check overlaps once placements are individually valid
+        for rack, ritems in racks.items():
+            try:
+                check_overlaps(ritems)
+                validate_power(ritems)
+                validate_links(ritems, hw_index)
+            except SchemaError as e:
+                errors.append(f"{rack}: {e}")
+```
+
+- [ ] **Step 5: Run to verify pass**
+
+Run: `pytest tests/test_gen_rack.py -q`
+Expected: PASS (all prior tests + 8 new).
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add scripts/gen_rack.py tests/test_gen_rack.py
+git commit -m "feat(rack): validate network links against peer files and ports"
+```
+
+---
+
+### Task 2: `render_network` + page section (TDD)
+
+**Files:**
+- Modify: `scripts/gen_rack.py` (add `render_network`; edit `render_page`)
+- Modify: `tests/test_gen_rack.py` (append tests)
+
+**Interfaces:**
+- Consumes: `_node_id` (Phase 2), `render_page`, `generate`.
+- Produces: `render_network(rack: str, items: list[dict]) -> str` — a fenced `mermaid` `flowchart LR` ending in a newline, or `""` when no item has a `links` feed.
+
+- [ ] **Step 1: Append failing tests to `tests/test_gen_rack.py`**
+
+```python
+def test_render_network_has_nodes_and_edge_labels():
+    items = [
+        item(hostname="sw01", kind="switch", rack_u=10, u_height=1,
+             rack_face="front", ports=24),
+        item(hostname="mf00", rack_u=1, u_height=1, rack_face="front",
+             links=[{"local": "eth0", "peer": "sw01",
+                     "peer_port": 1, "speed_gbps": 1}]),
+    ]
+    out = gen_rack.render_network("rack01", items)
+    assert "```mermaid" in out
+    assert "flowchart LR" in out
+    assert "sw01<br/>switch" in out
+    assert "mf00" in out
+    assert "eth0" in out
+    assert "p1" in out
+    assert "1G" in out
+
+
+def test_render_network_patch_panel_subtitle():
+    items = [
+        item(hostname="pp01", kind="patch-panel", rack_u=24, u_height=1,
+             rack_face="front", ports=24),
+        item(hostname="mf01", rack_u=2, u_height=1, rack_face="front",
+             links=[{"local": "eth0", "peer": "pp01",
+                     "peer_port": 1, "speed_gbps": 1}]),
+    ]
+    out = gen_rack.render_network("rack01", items)
+    assert "pp01<br/>patch-panel" in out
+
+
+def test_render_network_empty_when_no_links():
+    items = [item(hostname="mf00", rack_u=1, u_height=1, rack_face="front")]
+    assert gen_rack.render_network("rack01", items) == ""
+
+
+def test_render_network_omits_speed_when_absent():
+    items = [
+        item(hostname="sw01", kind="switch", rack_u=10, u_height=1,
+             rack_face="front", ports=24),
+        item(hostname="mf00", rack_u=1, u_height=1, rack_face="front",
+             links=[{"local": "eth0", "peer": "sw01", "peer_port": 1}]),
+    ]
+    out = gen_rack.render_network("rack01", items)
+    assert "eth0" in out and "p1" in out
+    assert "·" not in out  # no speed suffix rendered
+
+
+def test_render_network_is_deterministic():
+    a = item(hostname="sw01", kind="switch", rack_u=10, u_height=1,
+             rack_face="front", ports=24)
+    b = item(hostname="mf01", rack_u=2, u_height=1, rack_face="front",
+             links=[{"local": "eth0", "peer": "sw01",
+                     "peer_port": 2, "speed_gbps": 1}])
+    c = item(hostname="mf00", rack_u=1, u_height=1, rack_face="front",
+             links=[{"local": "eth0", "peer": "sw01",
+                     "peer_port": 1, "speed_gbps": 1}])
+    assert gen_rack.render_network("rack01", [a, b, c]) == \
+        gen_rack.render_network("rack01", [c, b, a])
+
+
+def test_generate_includes_network_section(tmp_path):
+    hw = tmp_path / "hardware"
+    out = tmp_path / "out"
+    hw.mkdir()
+    _write_item(
+        hw, "sw01",
+        "---\nhostname: sw01\nkind: switch\nstatus: in-use\n"
+        "rack: rack01\nrack_u: 10\nu_height: 1\nrack_face: front\nports: 24\n---\n",
+    )
+    _write_item(
+        hw, "mf00",
+        "---\nhostname: mf00\nkind: server\nstatus: in-use\n"
+        "rack: rack01\nrack_u: 1\nu_height: 1\nrack_face: front\n"
+        "links:\n - { local: eth0, peer: sw01, peer_port: 1, speed_gbps: 1 }\n---\n",
+    )
+    rc = gen_rack.generate(hw, out)
+    assert rc == 0
+    page = (out / "rack01.md").read_text()
+    assert "## Network" in page
+    assert "```mermaid" in page
+    assert "eth0" in page
+```
+
+- [ ] **Step 2: Run to verify failure**
+
+Run: `pytest tests/test_gen_rack.py -q`
+Expected: FAIL — `AttributeError: module 'gen_rack' has no attribute 'render_network'`.
+
+- [ ] **Step 3: Add `render_network` after `render_power` in `scripts/gen_rack.py`**
+
+```python
+def render_network(rack: str, items: list[dict]) -> str:
+    """Return a mermaid network-cabling flowchart, or '' if no links.
+
+    Assumes `validate_links` has already passed: every link has a non-empty
+    `local`/`peer` and an integer `peer_port`, and `peer` resolves to a real
+    hardware file. `generate` validates before any render call.
+    """
+    linked = [fm for fm in items if fm.get("links")]
+    if not linked:
+        return ""
+
+    by_host = {fm.get("hostname"): fm for fm in items}
+
+    edges: list[tuple[str, str, str, int, object]] = []
+    nodes: set[str] = set()
+    for fm in linked:
+        source = fm.get("hostname", "?")
+        nodes.add(source)
+        for link in fm["links"]:
+            peer = link["peer"]
+            nodes.add(peer)
+            edges.append(
+                (source, link["local"], peer, link["peer_port"],
+                 link.get("speed_gbps"))
+            )
+    edges.sort(key=lambda e: (e[0], e[1], e[2], e[3]))
+
+    def node_label(name: str) -> str:
+        fm = by_host.get(name)
+        kind = fm.get("kind") if fm else None
+        if kind in ("switch", "patch-panel"):
+            return f"{name}<br/>{kind}"
+        return name
+
+    lines: list[str] = ["```mermaid", "flowchart LR"]
+    for name in sorted(nodes):
+        lines.append(f' {_node_id(name)}["{node_label(name)}"]')
+    for source, local, peer, peer_port, speed in edges:
+        label = f"{local} → p{peer_port}"
+        if speed is not None:
+            label += f" · {speed}G"
+        lines.append(f" {_node_id(source)} -->|{label}| {_node_id(peer)}")
+    lines.append("```")
+    return "\n".join(lines) + "\n"
+```
+
+- [ ] **Step 4: Insert the `## Network` section in `render_page` in `scripts/gen_rack.py`**
+
+`render_page` currently has this block (the Power section followed directly by Occupancy):
+
+```python
+    power = render_power(rack, items)
+    if power:
+        lines.append("## Power")
+        lines.append("")
+        lines.append(power.rstrip())
+        lines.append("")
+    lines.append("## Occupancy")
+```
+
+Insert the Network section between the Power block and the Occupancy line:
+
+```python
+    power = render_power(rack, items)
+    if power:
+        lines.append("## Power")
+        lines.append("")
+        lines.append(power.rstrip())
+        lines.append("")
+    network = render_network(rack, items)
+    if network:
+        lines.append("## Network")
+        lines.append("")
+        lines.append(network.rstrip())
+        lines.append("")
+    lines.append("## Occupancy")
+```
+
+- [ ] **Step 5: Run to verify pass**
+
+Run: `pytest tests/test_gen_rack.py -q`
+Expected: PASS (all prior tests + 6 new).
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add scripts/gen_rack.py tests/test_gen_rack.py
+git commit -m "feat(rack): render mermaid network graph into the rack page"
+```
+
+---
+
+### Task 3: Populate provisional network data, regenerate
+
+**Files:**
+- Create: `docs/hardware/sw01.md`, `docs/hardware/pp01.md`
+- Modify: `docs/hardware/mf00.md`..`mf04.md` (add `links:`)
+- Regenerate: `docs/hardware/index.md`, `docs/infrastructure/racks/rack01.md`, `docs/infrastructure/racks/rack01-elevation.svg`
+
+**Interfaces:**
+- Consumes: `python3 scripts/gen_rack.py` / `make docs-index`, `mkdocs build --strict`, `make docs-check`.
+
+> **Operator note — provisional data.** The switch/patch-panel placements and the cable assignments below are placeholders proving the feature, matching the existing fictional mfNN positions and Phase 2 power data. Replace with real values when known; `validate_links` rejects dangling peers and over-count ports loudly. sw01/pp01 deliberately get no `power:` feeds in this phase.
+
+- [ ] **Step 1: Create the switch and patch-panel files**
+
+Create `docs/hardware/sw01.md`:
+
+```markdown
+---
+hostname: sw01
+kind: switch
+status: in-use
+rack: rack01
+rack_u: 10
+u_height: 1
+rack_face: front
+ports: 24
+---
+
+## Notes
+
+- Provisional placeholder switch. Port assignments are not yet real.
+```
+
+Create `docs/hardware/pp01.md`:
+
+```markdown
+---
+hostname: pp01
+kind: patch-panel
+status: in-use
+rack: rack01
+rack_u: 24
+u_height: 1
+rack_face: front
+ports: 24
+links:
+  - { local: uplink, peer: sw01, peer_port: 24, speed_gbps: 1 }
+---
+
+## Notes
+
+- Provisional placeholder patch panel. Devices patch in here; rear uplink to sw01.
+```
+
+- [ ] **Step 2: Add `links:` to the five host files**
+
+These files already carry rack-placement and `power:` frontmatter. ADD a `links:` block to each (before the closing `---`); do not remove anything.
+
+In `docs/hardware/mf00.md` add:
+
+```yaml
+links:
+  - { local: eth0, peer: sw01, peer_port: 1, speed_gbps: 1 }
+```
+
+In `docs/hardware/mf01.md` add:
+
+```yaml
+links:
+  - { local: eth0, peer: pp01, peer_port: 1, speed_gbps: 1 }
+```
+
+In `docs/hardware/mf02.md` add:
+
+```yaml
+links:
+  - { local: eth0, peer: pp01, peer_port: 2, speed_gbps: 1 }
+```
+
+In `docs/hardware/mf03.md` add:
+
+```yaml
+links:
+  - { local: eth0, peer: pp01, peer_port: 3, speed_gbps: 1 }
+```
+
+In `docs/hardware/mf04.md` add:
+
+```yaml
+links:
+  - { local: eth0, peer: pp01, peer_port: 4, speed_gbps: 1 }
+```
+
+- [ ] **Step 3: Regenerate all indices and rack artifacts**
+
+Run: `make docs-index`
+Expected: `gen_overview.py` rewrites `docs/hardware/index.md` (now listing sw01 under "Switches" and pp01 under "Patch panels"); `gen_rack.py` prints `Wrote rack01.md + rack01-elevation.svg (9 item(s))`.
+
+- [ ] **Step 4: Confirm the generated page has a network graph and the new boxes**
+
+Run: `grep -c "→ p" docs/infrastructure/racks/rack01.md`
+Expected: `6` (one network edge per link: mf00→sw01, mf01..mf04→pp01, pp01→sw01).
+
+Run: `grep -q "sw01" docs/infrastructure/racks/rack01-elevation.svg && grep -q "pp01" docs/infrastructure/racks/rack01-elevation.svg && echo OK`
+Expected: `OK` (switch and patch-panel drawn as boxes in the elevation).
+
+- [ ] **Step 5: Run the full test suite**
+
+Run: `make test`
+Expected: PASS (all tests).
+
+- [ ] **Step 6: Build the site strictly**
+
+Run: `mkdocs build --strict` (if `mkdocs` is not on PATH, use `python3 -m mkdocs build --strict`)
+Expected: build succeeds with no warnings-as-errors.
+
+Verify: `grep -c "mermaid" site/infrastructure/racks/rack01/index.html`
+Expected: `≥ 2` (a power block and a network block both render as mermaid diagrams).
+
+- [ ] **Step 7: Confirm the drift guard is satisfied**
+
+Run: `make docs-check`
+Expected: exit 0 — committed artifacts match a fresh regeneration.
+
+- [ ] **Step 8: Commit**
+
+```bash
+git add docs/hardware/ docs/infrastructure/racks/
+git commit -m "feat(rack): populate provisional network topology (sw01, pp01, links)"
+```
+
+---
+
+## Self-Review
+
+**Spec coverage (`2026-06-24-rack-network-design.md`):**
+- `links:` frontmatter on devices/peers — Task 3 (populate); validated Task 1. ✔
+- Switch + patch-panel peer files (`ports`, placed 1U front) — Task 3; appear via Phase 1 SVG + gen_overview, no new code. ✔
+- Validation rule 4 (peer resolves to a real file globally; peer_port within `ports` when declared; malformed/missing fields) — Task 1 (`validate_links` + `load_hardware_index`), wired into `generate`. ✔
+- Global peer resolution (not per-rack) — Task 1 (`load_hardware_index` over all files; `generate` passes `hw_index`). ✔
+- Mermaid network graph, full edge label (local → port · speed), kind subtitle for switch/patch-panel, omit-when-empty, deterministic — Task 2 (`render_network`), inserted in `render_page` between Power and Occupancy. ✔
+- Node-id sanitization reused (`_node_id`) — Task 2. ✔
+- Speed omitted when absent; unicode `→` — Task 2 (label build), tested. ✔
+- No mkdocs/Makefile/CI/overview_config changes — honored (Global Constraints); drift covered by existing `racks/` diff — Task 3 Steps 3/7. ✔
+- Provisional data (mf01–mf04 → pp01 1–4; pp01 uplink → sw01:24; mf00 → sw01:1) — Task 3 Steps 1–2. ✔
+
+**Placeholder scan:** No "TBD"/"handle edge cases"/"similar to Task N". The only operator-judgement item is provisional network values, explicitly bounded and guarded by `validate_links`.
+
+**Type consistency:** `load_hardware_index` → `dict[str, dict]`; `validate_links(items, hw_index)`/`check_overlaps`/`validate_power` → `None` (raise `SchemaError`); `render_network`/`render_power`/`render_page`/`_node_id` → `str`; `generate` → `int` (0/1). `validate_links(ritems, hw_index)` is called per-rack alongside `check_overlaps`/`validate_power`, with `hw_index` built once at the top of `generate`. `render_network` consumes `_node_id` and feeds `render_page`. Names match across tasks and tests.
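Reviewer aid, kept outside the hunk so the 593-line count above stays accurate: a minimal standalone sketch of two rules the plan states in Global Constraints, namely the edge-label format and the "range is checked only when the peer declares an integer `ports`" rule. The helper names `edge_label` and `port_in_range` are hypothetical and are not part of `scripts/gen_rack.py`; in the plan this logic lives inline in `render_network` and `validate_links`.

```python
from typing import Optional


def edge_label(local: str, peer_port: int, speed_gbps: Optional[int] = None) -> str:
    """Hypothetical helper: build '{local} → p{peer_port} · {speed}G'.

    The ' · {speed}G' suffix is dropped when no speed is given, mirroring
    the label rule in Global Constraints.
    """
    label = f"{local} → p{peer_port}"
    if speed_gbps is not None:
        label += f" · {speed_gbps}G"
    return label


def port_in_range(peer_port: int, ports: object) -> bool:
    """Hypothetical helper: rule 4 range check.

    The check is skipped (treated as passing) unless the peer declares an
    integer port count.
    """
    if not isinstance(ports, int):
        return True  # peer has no integer `ports`, so nothing to check against
    return 1 <= peer_port <= ports


if __name__ == "__main__":
    print(edge_label("eth0", 1, 1))    # label with speed suffix
    print(edge_label("uplink", 24))    # label without speed suffix
    print(port_in_range(25, 24))       # over-count port on a 24-port peer
    print(port_in_range(99, None))     # peer without `ports`
```

Splitting the two rules into pure functions is only for illustration; it makes the intended behaviour easy to check by hand against the provisional data in Task 3 (for example `pp01`'s `uplink` on `sw01` port 24 sits exactly at the upper bound).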