docs: capture topology + operational learnings in CLAUDE.md/README
Bring the everyday guides up to the live state (flat data VLAN 30 + isolated mgmt VLAN 99 on ether8, DHCP + web UI experiment) and record the gotchas that cost time: the bench tunnel (paramiko ignores ProxyJump), mamba NM-profile stickiness on cable flap, the RouterOS find-by-address quirk, and the commit-confirmed detached-flip pattern for lockout-prone changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent 18de750507
commit 2796616d05
2 changed files with 42 additions and 14 deletions
CLAUDE.md (42 lines changed)
@@ -30,12 +30,35 @@ ansible-playbook play_switch.yml --tags vlans # one domain
 ansible-vault view group_vars/mikrotik.vault.yml # read a secret
 ```

+## Access (on-site / bench)
+
+The switch is reachable only via the makerspace laptop `mamba`. Ansible's `network_cli`
+uses paramiko, which **ignores ProxyJump**, so port-forward instead of double-hopping:
+
+```bash
+ssh -J kuku -p 7576 sjat@10.8.0.4 -L 2222:192.168.88.1:22 -N # tunnel to the switch
+ansible-playbook play_switch.yml -e ansible_host=127.0.0.1 -e ansible_port=2222 -e ansible_user=sjat
+ssh-keygen -R '[127.0.0.1]:2222' # if the tunnel host key changed
+```
+
+- `mamba` is the mgmt station on **switch port 8** (MGMT VLAN); it must be on port 8 to
+reach `192.168.88.1`. From a data port it gets `10.2.30.x` and **cannot** reach mgmt.
+- NM profiles on `mamba` `enp0s31f6`: `crs310-bench` (static `.2`) and `Wired connection 1`
+(DHCP). Moving the cable flaps the link and NM re-selects a profile — pin the intended
+one sticky (`autoconnect yes` + higher priority) and the other off, or it reverts.
+
 ## Rules

 - **Idempotency:** RouterOS tasks use `community.routeros.command` with `:if [find]`
 guards. Run every device-touching play **twice**; the second run must report no changes.
 - **Lockout safety:** keep an independent recovery channel (serial/WinBox-MAC) when
-touching mgmt/services/VLANs; enable `vlan-filtering` **last**.
+touching mgmt/services/VLANs; enable `vlan-filtering` **last**. For lockout-prone
+changes over the network (vlan-filtering, moving the mgmt IP), run them as a detached
+self-reverting job — `:execute { …; :delay 240s; :if ($mgmtok=false) do={ revert } }`,
+then `:global mgmtok true` once verified. (Auto-healed a hard lockout during the cutover.)
+- **RouterOS `find ... address=<prefix>` never matches** an ip/address or dhcp-network
+value (returns 0 even on an exact string) — match by `[find interface=X]` or
+`:foreach`+`/ip/address/get $a address`. Bit the mgmt-IP move (duplicated the IP).
 - **All real values go in `host_vars`;** the role holds only mechanism + placeholders.
 - **Secrets** go to the `makerfloss` vault, never plaintext. Encrypt with
 `ansible-vault encrypt --encrypt-vault-id makerfloss <file>`.
@@ -43,12 +66,13 @@ ansible-vault view group_vars/mikrotik.vault.yml # read a secret

 ## Status / next

-Bootstrap is done (user `sjat` + key + identity `crs310-maker`, RouterOS 7.19.6 pinned;
-default `admin` now disabled). All per-domain task files are **implemented**:
-`identity`, `users`, `backup`, `firmware` (opt-in) and `play_bootstrap` / `play_backup`
-are idempotency-verified against the device. `vlans` is implemented and Jinja-validated
-but its **device run is deferred** — the `host_vars` topology is still a placeholder.
+Live on the device (2026-06-09): flat L2 switch on `10.2.30.0/24` — **DATA VLAN 30**
+(`ether1` copper uplink + `ether2-7` + SFP+), **isolated MGMT VLAN 99 on `ether8`**
+(mgmt `192.168.88.1/24`, no gateway/NTP/DNS), `vlan-filtering` on. The mgmt port also
+serves DHCP (`192.168.88.10-.254`) + the web UI as a makerspace experiment (flags
+`switch_web_enabled`, `switch_mgmt_dhcp_enabled`). Default `admin` disabled; login as
+`sjat` (key, or vaulted password). All task files + `play_bootstrap`/`play_backup` are
+idempotency-verified. Design + cutover runbook:
+`docs/superpowers/specs/2026-06-09-crs310-flat-mgmtvlan-design.md`.

-Next, on-site with a recovery channel: drop the real VLAN/port map into `host_vars`,
-reconcile the legacy defconf IP (`192.168.88.1/24` lives directly on `bridge`), then run
-`--tags vlans` and confirm mgmt reachability before/after `vlan-filtering=yes`.
+Next: SFP+ 10G uplink and real VLAN segmentation once connectors + a VLAN plan are ready.
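
Note on the detached self-reverting job described in the Lockout safety rule: the control flow can be rehearsed locally before it is trusted on the switch. The following is a minimal shell simulation under stated assumptions, not RouterOS script and not the commands used in the cutover. The state file (standing in for the device setting), the confirm flag file (standing in for `:global mgmtok true`), the function name `apply_with_revert`, and the 1-second delay (standing in for `:delay 240s`) are all hypothetical.

```shell
#!/bin/sh
# Local simulation of a commit-confirmed change; nothing here talks to a device.
# STATE, CONFIRM and apply_with_revert are hypothetical names for this sketch.

STATE=$(mktemp)          # stands in for the setting being changed
CONFIRM="$STATE.ok"      # flag file; plays the role of `:global mgmtok true`

apply_with_revert() {
    new=$1; old=$2; delay=$3
    rm -f "$CONFIRM"                 # start unconfirmed
    echo "$new" > "$STATE"           # 1. make the risky change
    (                                # 2. background watchdog
        sleep "$delay"
        # 3. revert unless the operator confirmed in time
        [ -e "$CONFIRM" ] || echo "$old" > "$STATE"
    ) &
}

echo "safe" > "$STATE"

# Case 1: the operator never confirms (locked out), so the change is rolled back.
apply_with_revert "risky" "safe" 1
wait
unconfirmed=$(cat "$STATE")

# Case 2: the operator verifies access and confirms, so the change is kept.
apply_with_revert "risky" "safe" 1
touch "$CONFIRM"
wait
confirmed=$(cat "$STATE")

echo "unconfirmed=$unconfirmed confirmed=$confirmed"
rm -f "$STATE" "$CONFIRM"
```

The `wait` calls exist only so the simulation can read the final state; on the device the job is detached and the operator sets the confirm flag from a fresh session once management access is verified.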
README.md (14 lines changed)
@@ -13,12 +13,16 @@ rebuilt from this repo instead of by hand in WinBox.
 | Repo scaffolding, role skeleton, vault | ✅ done |
 | On-site device prep + **bootstrap** (named user + SSH key + identity) | ✅ done (2026-06-08) |
 | `identity` / `users` / `backup` / `firmware` + `play_bootstrap` / `play_backup` | ✅ implemented; idempotency-verified against the device (firmware is opt-in, lint/syntax only) |
-| `vlans` (VLAN-aware bridge, ports, mgmt iface) | ✅ implemented + Jinja-validated; **device run deferred** — needs the real VLAN/port plan and an on-site recovery channel before `vlan-filtering` is enabled |
+| `vlans` (VLAN-aware bridge, ports, mgmt iface) | ✅ **applied & live** — flat data VLAN + isolated mgmt VLAN, `vlan-filtering` on |

-The switch is reachable today by key auth as user `sjat`. All task files now carry their
-real RouterOS logic. The `vlans` topology in `host_vars` is still a **placeholder**:
-replace it with the real makerspace VLAN ids + per-port map before running `--tags vlans`
-on the live device, and do so on-site with a serial/WinBox-MAC recovery channel open.
+**Live topology (2026-06-09):** a flat L2 switch on the makerspace `10.2.30.0/24` —
+**DATA VLAN 30** (`ether1` copper uplink + `ether2-7` + SFP+) bridged through, and an
+**isolated MGMT VLAN 99 on `ether8`** (switch admin at `192.168.88.1`, no gateway/NTP/DNS).
+The mgmt port also serves DHCP + the web UI as an experiment (plug into `ether8`, get a
+lease, admin at `http://192.168.88.1`; login still required, default `admin` disabled).
+SFP+ 10G uplink and real VLAN segmentation are future work. See
+`docs/superpowers/specs/2026-06-09-crs310-flat-mgmtvlan-design.md` for the design + the
+lockout-safe cutover runbook.

 ## Layout

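
Note on "idempotency-verified": the CLAUDE.md rule to run every device-touching play twice and require no changes on the second run can be checked mechanically. This is a minimal sketch, assuming the usual `PLAY RECAP` line format printed by `ansible-playbook` (one `changed=N` field per host). The function name `recap_is_idempotent`, the host name `switch1`, and the sample recap lines are illustrative and were not captured from this device.

```shell
#!/bin/sh
# Succeeds (exit 0) only if no line in the given recap text reports changed>0.
recap_is_idempotent() {
    ! printf '%s\n' "$1" | grep -Eq 'changed=[1-9]'
}

# Made-up recap lines for this sketch; `switch1` is a hypothetical inventory name.
clean_recap='switch1 : ok=7 changed=0 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0'
dirty_recap='switch1 : ok=7 changed=1 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0'

if recap_is_idempotent "$clean_recap"; then
    echo "second run: idempotent"
fi
if ! recap_is_idempotent "$dirty_recap"; then
    echo "second run: NOT idempotent"
fi
```

In practice the second run's output could be captured with `ansible-playbook play_switch.yml | tee second-run.log` and passed in as `recap_is_idempotent "$(cat second-run.log)"`. The check looks only at `changed`; `failed` and `unreachable` still need a separate look.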