# MakerFLOSS_Mikrotik
Infrastructure-as-Code for the makerspace's **MikroTik CRS310-8G+2S+IN** switch
(8× 2.5GbE + 2× SFP+ 10G, RouterOS 7). Configuration is managed declaratively with
Ansible over SSH using the `community.routeros` collection — identity, management
access, users/keys, VLAN switching, backups, and firmware — so the switch can be
rebuilt from this repo instead of by hand in WinBox.
## Status
| Area | State |
|---|---|
| Repo scaffolding, role skeleton, vault | ✅ done |
| On-site device prep + **bootstrap** (named user + SSH key + identity) | ✅ done (2026-06-08) |
| `identity` / `users` / `backup` / `firmware` + `play_bootstrap` / `play_backup` | ✅ implemented; idempotency verified against the device (`firmware` is opt-in and only lint/syntax-checked) |
| `vlans` (VLAN-aware bridge, ports, mgmt iface) | ✅ **applied & live**: flat data VLAN + isolated mgmt VLAN, `vlan-filtering` on |

**Live topology (2026-06-09):** the switch runs as a flat L2 bridge for the makerspace
`10.2.30.0/24` network on **DATA VLAN 30** (`ether1` copper uplink, `ether2-7`, and SFP+),
plus an **isolated MGMT VLAN 99 on `ether8`** (switch admin at `192.168.88.1`; no
gateway, NTP, or DNS). As an experiment, the mgmt port also serves DHCP and the web UI:
plug into `ether8`, get a lease, and administer the switch at `http://192.168.88.1`
(login is still required, and the default `admin` account is disabled). An SFP+ 10G
uplink and real VLAN segmentation are future work. See
`docs/superpowers/specs/2026-06-09-crs310-flat-mgmtvlan-design.md` for the design and the
lockout-safe cutover runbook.
## Layout
```
inventories/prod/hosts.yml # group `mikrotik` -> the switch host
group_vars/mikrotik.yml # connection vars (network_cli + community.routeros) + enable-flags
group_vars/mikrotik.vault.yml # encrypted admin/user password (makerfloss vault id)
host_vars/crs310-maker.yml # device facts + real addressing + VLAN/port map
roles/makerfloss.mikrotik_switch/ # the role: defaults + per-domain task files
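play_bootstrap.yml # first contact on a fresh/reset device (password auth, one time)
play_switch.yml # day-2 run (key auth), applies all enabled domains
play_backup.yml # backs up the config into the repo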
docs/makerspace-switch-fieldguide.md # on-site, printable prep checklist
docs/superpowers/specs|plans/ # design spec + implementation plan
```
## Setup (control node)
```bash
direnv allow # or: python3 -m venv .venv && . .venv/bin/activate
pip install -r requirements.txt
ansible-galaxy collection install -r requirements.yml
```
**Vault:** secrets use a dedicated vault identity `makerfloss`, keyed by
`~/.ansible/vault-keys/makerfloss.txt` (referenced in `ansible.cfg`, kept outside the
repo). View a secret with `ansible-vault view group_vars/mikrotik.vault.yml`.
## Connectivity
The role connects with `ansible.netcommon.network_cli` + `ansible_network_os:
community.routeros.routeros`, authenticating with the operator SSH key
(`~/.ssh/id_ed25519`). Day-2 needs no password.
> **Bench note:** while the switch sits on an isolated bench reachable only through a
> jump host, Ansible's paramiko transport won't traverse `ProxyJump`. Run Ansible from a
> host on the switch's network, or forward the port:
> `ssh -J <jump> <user>@<jump-lan> -L 2222:192.168.88.1:22 -N` then set
> `ansible_host=127.0.0.1 ansible_port=2222`. In production (switch directly reachable)
> this is a non-issue.
## Usage
```bash
# Validate
yamllint . && ansible-lint && ansible-playbook play_switch.yml --syntax-check
# First contact on a fresh/reset device (password auth, one time)
ansible-playbook play_bootstrap.yml -e ansible_user=admin --ask-pass
# Day-2 configuration (key auth, idempotent)
ansible-playbook play_switch.yml
ansible-playbook play_switch.yml --tags identity,users # safe domains
ansible-playbook play_switch.yml --tags vlans # on-site only — see lockout note
ansible-playbook play_switch.yml --limit crs310-maker
# Backup config into the repo
ansible-playbook play_backup.yml
```
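
For the bench case described under Connectivity, the port-forward and the play run can be wrapped in one helper. This is only a sketch: `run_via_tunnel` and its two leading arguments are hypothetical names, not part of this repo. The forward target, the local port `2222`, and the `ansible_host`/`ansible_port` overrides are taken from the bench note. Host-key handling for `127.0.0.1:2222` is not covered.

```bash
#!/usr/bin/env bash
# Sketch only: open the bench port-forward, run a play through it, close it.
# run_via_tunnel and its jump/lan_host arguments are hypothetical; the forward
# target (192.168.88.1:22) and local port 2222 come from the bench note.
run_via_tunnel() {
  local jump="$1" lan_host="$2" sock rc=0
  shift 2
  sock="/tmp/crs310-tunnel.$$"

  # -f backgrounds ssh once the forward is up; -M/-S create a control socket
  # so the tunnel can be shut down cleanly afterwards.
  ssh -f -N -M -S "$sock" -o ExitOnForwardFailure=yes \
    -J "$jump" -L 2222:192.168.88.1:22 "$lan_host" || return 1

  # Remaining arguments go to ansible-playbook; the overrides point Ansible
  # at the local end of the forward.
  ansible-playbook "$@" -e ansible_host=127.0.0.1 -e ansible_port=2222 || rc=$?

  # Close the tunnel via the control socket and report the play's exit code.
  ssh -S "$sock" -O exit "$lan_host" 2>/dev/null
  return "$rc"
}
```

A call such as `run_via_tunnel <jump> <user>@<jump-lan> play_switch.yml --tags identity,users` would then behave like the plain command above, just routed through the forward.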
## ⚠️ Lockout safety
When changing management, services, or VLAN/bridge settings, keep an independent
recovery channel open (serial console, or WinBox MAC-telnet) and enable
`vlan-filtering` **last**, after the management path is proven. RouterOS config tasks
use `:if [find]` guards for idempotency; **run every device-touching play twice** and
confirm the second run reports no changes.
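
The run-twice check can be scripted along these lines. This is a minimal sketch, not part of the repo: `recap_has_changes` and `verify_idempotent` are hypothetical helper names, and the check assumes Ansible's default `PLAY RECAP` output, where each host line carries a `changed=N` counter.

```bash
#!/usr/bin/env bash
# Sketch: run a play twice and fail if the second run still reports changes.

# Reads ansible-playbook output on stdin; exits 0 when any line shows a
# non-zero `changed=` count (as in the default PLAY RECAP), 1 otherwise.
recap_has_changes() {
  grep -Eq 'changed=[1-9]'
}

# Hypothetical wrapper: all arguments are passed straight to ansible-playbook.
verify_idempotent() {
  local second
  ansible-playbook "$@" || return 1              # first run: apply
  second="$(ansible-playbook "$@")" || return 1  # second run: should be a no-op
  printf '%s\n' "$second"
  if printf '%s\n' "$second" | recap_has_changes; then
    echo "NOT idempotent: second run reported changes" >&2
    return 2
  fi
  echo "idempotent: second run reported no changes"
}
```

For example, `verify_idempotent play_switch.yml --tags identity,users` applies the safe domains and then confirms that a repeat run changes nothing.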
## Preparing a switch on-site
See **`docs/makerspace-switch-fieldguide.md`** — a printable checklist for what to do
physically at the makerspace before Ansible takes over.