From 8a42f5482f439fc07c77de853f1f22c30237eb01 Mon Sep 17 00:00:00 2001
From: sjat
Date: Tue, 9 Jun 2026 12:12:22 +0200
Subject: [PATCH] docs(spec): flat data path + isolated mgmt VLAN topology

ether1 copper uplink (SFP+ deferred), flat 10.2.30.0/24 data VLAN 30,
isolated mgmt VLAN 99 on ether8 with switch mgmt 192.168.88.1/24, no
gateway/NTP/DNS. Includes the lockout-safe on-site cutover runbook.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .../2026-06-09-crs310-flat-mgmtvlan-design.md | 97 +++++++++++++++++++
 1 file changed, 97 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-06-09-crs310-flat-mgmtvlan-design.md

diff --git a/docs/superpowers/specs/2026-06-09-crs310-flat-mgmtvlan-design.md b/docs/superpowers/specs/2026-06-09-crs310-flat-mgmtvlan-design.md
new file mode 100644
index 0000000..baca8c8
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-09-crs310-flat-mgmtvlan-design.md
@@ -0,0 +1,97 @@
+# CRS310 — flat data path + isolated management VLAN — Design
+
+**Date:** 2026-06-09
+**Status:** Approved (brainstorming complete)
+**Author:** sjat + Claude
+**Supersedes** the placeholder topology in `host_vars/crs310-maker.yml` (the
+`10.0.99.x` / SFP+-trunk example). Builds on
+`2026-06-07-mikrotik-crs310-ansible-design.md`.
+
+## Purpose
+
+Bring the makerspace CRS310 into service as a **flat L2 switch** on the existing
+`10.2.30.0/24` network, with its **management plane isolated on a dedicated VLAN**
+reached through one physical port. No SFP+ yet — the 10G uplink is deferred until the
+connectors arrive; **`ether1` is the (copper) uplink** for now.
+
+## Context (as found on 2026-06-09)
+
+- Switch on factory **defconf**: one flat `bridge` with all ports, mgmt IP
+  `192.168.88.1/24` sitting directly on `bridge`, `vlan-filtering=no`.
+- Upstream LAN is **flat**: DHCP/gateway at `10.2.30.1`, untagged. Verified by leasing
+  `10.2.30.227` to mamba *through* the switch's flat bridge.
+- mamba is the management station (patched into the switch, reached from fisi over a
+  `kuku` jump + port-forward tunnel to `192.168.88.1`).
+
+## Topology
+
+VLAN-aware bridge (`bridge`), `vlan-filtering=yes` enabled **last**. All ports are
+untagged access ports — **no trunks**.
+
+| Port | Mode | PVID | VLAN | Notes |
+|---|---|---|---|---|
+| `ether1` | access | 30 | DATA | copper uplink to `10.2.30.0/24` |
+| `ether2`–`ether7` | access | 30 | DATA | device access ports |
+| `sfp-sfpplus1/2` | access | 30 | DATA | unused until connectors arrive |
+| `ether8` | access | 99 | MGMT | dedicated management port (mamba lives here) |
+
+- **DATA VLAN 30** — internal-only id; ingress/egress on `ether1` is untagged, so the
+  upstream router sees a plain flat network. The switch CPU (`bridge`) is **not** a
+  member of VLAN 30 → no switch L3 presence on the user network.
+- **MGMT VLAN 99** — `vlan-mgmt` interface on the bridge, IP **`192.168.88.1/24`**, the
+  bridge/CPU is the only tagged member, `ether8` the only untagged member.
+  **No default gateway** — management is intentionally isolated.
+
+## Management & internet
+
+- Reachable only from `ether8` (plug the management laptop / mamba there, addressed
+  `192.168.88.2/24`). The switch does **no routing or DHCP**; `10.2.30.1` keeps both.
+- The control plane has **no internet** by design → **NTP/DNS disabled** (they would
+  only error on an isolated segment; clock won't sync, updates are done manually when
+  the switch is temporarily patched to the data network).
+
+## Required changes to the IaC
+
+1. `host_vars/crs310-maker.yml`: replace the placeholder topology with the table above;
+   `switch_mgmt_address: 192.168.88.1/24`, `switch_mgmt_vlan_id: 99`, **no gateway**;
+   drop the `10.0.99.x` DNS/NTP/gateway placeholders.
+2. Role `vlans.yml`: make the **default-route** task conditional on a gateway being set
+   (skip when isolated); **remove the legacy defconf IP** off the bare `bridge` so it
+   doesn't collide with the `vlan-mgmt` IP (`192.168.88.1` must live only on
+   `vlan-mgmt`).
+3. Role `identity.yml`: gate NTP (and DNS) behind a flag / empty-server check so an
+   isolated mgmt plane doesn't configure unreachable servers. Add
+   `switch_ntp_enabled: false` for this host.
+
+The existing `vlans.yml` membership Jinja already produces the correct sets for an
+all-access topology (DATA untagged = data ports, CPU tagged only on MGMT).
+
+## Cutover runbook (lockout-safe; operator on-site at `ether8`)
+
+1. **Restore mgmt path** (done): mamba `enp0s31f6` → `192.168.88.2/24` (profile
+   `crs310-bench`); fisi→mamba→switch tunnel up; Ansible reaches `192.168.88.1`.
+2. **Move the cable: switch port 5 → port 8.** (Bridge is still flat, so mamba stays
+   reachable on either port.) Re-confirm reachability.
+3. Apply config in order: bridge VLAN table → port PVIDs → create `vlan-mgmt` iface.
+   Verify the VLAN/PVID state with `vlan-filtering` still **off**. Then the **flip**, as
+   one ordered sequence (the address can't be on both interfaces at once): remove
+   `192.168.88.1` from `bridge`, add it to `vlan-mgmt`, set `vlan-filtering=yes`. mamba
+   (`ether8`, untagged VLAN 99, `.2`) ↔ switch (`.1`) is the canary; the SSH/tunnel may
+   blip during the flip but must come back. Pre-verifying PVID/membership before the
+   flip is what prevents a hard lockout.
+4. Verify: `/interface/bridge/vlan/print` membership correct, mgmt still reachable, a
+   device on `ether1`-fed ports still gets `10.2.30.x`.
+
+## Risks
+
+- **Lockout** on enabling `vlan-filtering` if `ether8`/VLAN 99/mgmt-IP aren't aligned.
+  Mitigated by ordering (filtering last), the live canary connection, and the operator
+  being on-site to re-cable. WinBox-MAC recovery is unavailable (broken under Wine);
+  worst case is a no-defaults reset, which we avoid.
+- **Removing the legacy bridge IP** is the delicate step — done while the new
+  `vlan-mgmt` IP is the same address, before filtering, with the connection watched.
+
+## Out of scope
+
+Real inter-VLAN segmentation, the SFP+ 10G uplink/trunk, and any upstream router VLAN
+work — revisited when the connectors and a real VLAN plan are ready.
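
Review note: the spec relies on a simple membership rule (DATA untagged on every data port, the bridge/CPU tagged only on MGMT) and on `ether8`, VLAN 99 and the mgmt interface being aligned before `vlan-filtering` is enabled. As a rough illustration of that rule, the table could be derived from the port PVIDs along these lines. This is a Python sketch, not the role's actual Jinja; `vlan_membership`, `mgmt_path_aligned` and the dictionary layout are hypothetical names chosen for illustration, and the port list is copied from the topology table.

```python
# Illustrative sketch only: function names and data layout are hypothetical,
# not taken from the Ansible role or from RouterOS.

DATA_VLAN = 30
MGMT_VLAN = 99
BRIDGE = "bridge"  # stands for the switch CPU port

# Port to PVID mapping, copied from the topology table in the spec.
PORT_PVID = {f"ether{n}": DATA_VLAN for n in range(1, 8)}
PORT_PVID.update({
    "sfp-sfpplus1": DATA_VLAN,
    "sfp-sfpplus2": DATA_VLAN,
    "ether8": MGMT_VLAN,
})


def vlan_membership(port_pvid, mgmt_vlan, bridge=BRIDGE):
    """Build {vlan_id: {"tagged": [...], "untagged": [...]}} for an all-access topology."""
    table = {}
    # Every access port is an untagged member of the VLAN named by its PVID.
    for port, pvid in port_pvid.items():
        entry = table.setdefault(pvid, {"tagged": [], "untagged": []})
        entry["untagged"].append(port)
    # Only the management VLAN gets the bridge (CPU) as a tagged member.
    mgmt = table.setdefault(mgmt_vlan, {"tagged": [], "untagged": []})
    mgmt["tagged"].append(bridge)
    return table


def mgmt_path_aligned(table, mgmt_vlan, mgmt_port, bridge=BRIDGE):
    """True when the bridge is the only tagged and mgmt_port the only untagged member."""
    entry = table.get(mgmt_vlan)
    if entry is None:
        return False
    return entry["tagged"] == [bridge] and entry["untagged"] == [mgmt_port]


if __name__ == "__main__":
    table = vlan_membership(PORT_PVID, MGMT_VLAN)
    for vlan_id in sorted(table):
        print(vlan_id, table[vlan_id])
    print("mgmt path aligned:", mgmt_path_aligned(table, MGMT_VLAN, "ether8"))
```

A check like this only validates the intended configuration data. The live state still has to be confirmed on the switch with `/interface/bridge/vlan/print` before the flip, as the runbook describes.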