# MakerFLOSS_Mikrotik - CRS310 Ansible Management - Design

**Date:** 2026-06-07
**Status:** Approved (brainstorming complete; pending implementation plan)
**Author:** sjat + Claude

## Purpose

Manage the makerspace's MikroTik **CRS310-8G+2S+IN** 10-port switch
(8× 2.5GbE + 2× SFP+ 10G, RouterOS) as Infrastructure-as-Code with Ansible.
Goal: deterministic, idempotent, version-controlled switch configuration
(identity, management access, users/keys, VLAN switching, backups, and firmware)
so the switch can be rebuilt from the repo with no manual WinBox clicking.

## Scope

**In scope (this iteration):** a single CRS310 switch, configured over SSH.

Configuration domains, each gated by an enable-flag:
1. **Identity + management + services**: hostname/identity, management IP/VLAN,
   NTP/DNS, enable SSH, disable unused services (telnet, ftp, www, api; winbox decision in Open Items).
2. **Users + SSH keys**: named admin user, import operator SSH public key,
   harden/disable the default `admin`.
3. **VLANs + bridge + ports**: bridge with hardware-offload VLAN filtering,
   access/trunk port assignments, SFP+ as upstream trunk. Ships with a
   **placeholder** example topology; real VLAN IDs/port map filled into `host_vars` later.
4. **Backups + firmware**: scheduled `/export` + `/system backup`, fetched into the
   repo; RouterOS/RouterBOOT upgrade flow to a pinned target version.

**Out of scope (for now):** additional MikroTik devices, APs, routers; the REST API
transport; CI/molecule testing; monitoring integration. Structure should not *prevent*
these later, but we build only the single-switch path.
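As a rough illustration of the enable-flag idiom and the placeholder topology described in this section, the role's `defaults/main.yml` might look something like the sketch below. Every variable name (`mikrotik_switch_manage_*`, `mikrotik_switch_vlans`, `mikrotik_switch_ports`) and every VLAN ID is a hypothetical placeholder rather than a decided value, and the interface names are assumed RouterOS defaults that should be checked on the device.

```yaml
# roles/makerfloss.mikrotik_switch/defaults/main.yml
# Sketch only: all variable names and values below are assumptions.

# One enable-flag per configuration domain.
mikrotik_switch_manage_identity: true
mikrotik_switch_manage_users: true
mikrotik_switch_manage_vlans: true
mikrotik_switch_manage_backup: true
mikrotik_switch_manage_firmware: false  # assumed opt-in; not decided in this design

# PLACEHOLDER topology; real values belong in host_vars/<switch>.yml.
mikrotik_switch_bridge_name: bridge1
mikrotik_switch_vlans:
  - id: 10
    name: example-mgmt
  - id: 20
    name: example-lan
mikrotik_switch_ports:
  - interface: ether1
    mode: access
    pvid: 10
  - interface: ether2
    mode: access
    pvid: 20
  - interface: sfp-sfpplus1  # assumed name of the first SFP+ port
    mode: trunk
    tagged_vlans: [10, 20]
```

Keeping the placeholders in `defaults/` means anything set in `host_vars/<switch>.yml` overrides them without touching the role.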

## Decisions (from brainstorming)

| Topic | Decision |
|---|---|
| Project / repo name | `MakerFLOSS_Mikrotik` (underscore; hyphen acceptable) |
| Repo host | New repo on `forgejo.makerfloss.eu`, remote `origin`, default branch `main` |
| Location | Sibling directory `~/Projects/MakerFLOSS_Mikrotik` |
| Transport | **SSH** via `network_cli` (`community.routeros`), **key auth** for day-2 |
| Role namespace | `makerfloss.*` → role `makerfloss.mikrotik_switch` |
| Vault | **Separate** identity `makerfloss` at `~/.ansible/vault-keys/makerfloss.txt`, NOT the home `prod` key |
| Config location | All real values in `host_vars/<switch>.yml`; connection vars in `group_vars/mikrotik.yml`; mechanism + placeholders in role `defaults/` |
| Base | Fresh repo in AnsibleBaobabV4 conventions; cherry-pick narrowin/ansible-mikrotik command sequences for backup/upgrade |
| Clean slate | Factory-reset switch to **no default configuration**; Ansible owns the entire config |
| Default admin | Create named admin user + import key; **disable** the default `admin` after key login is proven |

## What to bring over from AnsibleBaobabV4

Copy + trim (independent repos; do not symlink):

- `.envrc` + `.venv` direnv bootstrap: verbatim.
- `ansible.cfg`: adapted with `host_key_checking=False`, `vault_identity_list = makerfloss@~/.ansible/vault-keys/makerfloss.txt`, network-CLI-friendly defaults.
- `.ansible-lint` + yamllint config: verbatim.
- `requirements.txt`: trimmed to `ansible`, `ansible-lint`, `yamllint` (drop molecule/docker/snipe/kuma).
- `requirements.yml`: `community.routeros` (pulls in `ansible.netcommon`).
- Inventory cascade pattern: `inventories/prod/hosts.yml` with one host in group `mikrotik`.
- **Operator SSH public key** `~/.ssh/id_ed25519.pub` → imported onto the switch admin user.
- Forgejo push key `~/.ssh/id_ed25519_forgejo` already exists (used for `git push`).
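To make the two YAML files from this list concrete, a minimal sketch could look like the following. The host name `crs310-example` and the address are placeholders invented for illustration (the address is taken from the documentation range), and no collection version is pinned because none has been decided.

```yaml
# requirements.yml (sketch)
collections:
  - name: community.routeros
```

```yaml
# inventories/prod/hosts.yml (sketch; host name and address are placeholders)
all:
  children:
    mikrotik:
      hosts:
        crs310-example:
          ansible_host: 192.0.2.10  # documentation-range address, not the real one
```

The collection would then typically be installed with `ansible-galaxy collection install -r requirements.yml`.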

## Architecture

### Repo layout

```
MakerFLOSS_Mikrotik/
├── .envrc / .ansible-lint / .yamllint / ansible.cfg
├── requirements.txt / requirements.yml
├── inventories/
│   └── prod/hosts.yml            # group: mikrotik -> one switch host
├── group_vars/
│   └── mikrotik.yml              # connection/platform vars (network_cli, network_os, user, key)
├── host_vars/
│   └── <switch>.yml              # identity, mgmt IP/VLAN, VLAN+port map, firmware_target
├── roles/
│   └── makerfloss.mikrotik_switch/
│       ├── defaults/main.yml     # enable-flags, safe defaults, PLACEHOLDER vlan/port map
│       ├── tasks/main.yml        # imports domain task files, each gated by a flag
│       ├── tasks/identity.yml    # identity, mgmt IP, NTP/DNS, SSH on, unused services off
│       ├── tasks/users.yml       # named admin, import ssh pubkey, disable default admin
│       ├── tasks/vlans.yml       # bridge + hw VLAN filtering, access/trunk ports, SFP+ uplink
│       ├── tasks/backup.yml      # /export + /system backup save, fetch into repo
│       └── tasks/firmware.yml    # RouterOS + RouterBOOT upgrade to firmware_target
├── playbooks (or top-level):
│   ├── play_bootstrap.yml        # FIRST CONTACT: password auth -> create user, import key
│   ├── play_switch.yml           # day-2: key-only, applies all enabled domains
│   └── play_backup.yml           # on-demand/scheduled backup fetch
├── backups/<switch>/             # fetched config exports + .backup files
└── docs/superpowers/specs/       # this design doc
```
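The "imports domain task files, each gated by a flag" role of `tasks/main.yml` in the layout above could be sketched roughly as follows. The `mikrotik_switch_manage_*` flag names are hypothetical; only the task file names come from the layout.

```yaml
# roles/makerfloss.mikrotik_switch/tasks/main.yml
# Sketch only: the flag variable names are assumptions.
- name: Configure identity, management access and services
  ansible.builtin.import_tasks: identity.yml
  when: mikrotik_switch_manage_identity | bool

- name: Configure users and SSH keys
  ansible.builtin.import_tasks: users.yml
  when: mikrotik_switch_manage_users | bool

- name: Configure bridge, VLANs and ports
  ansible.builtin.import_tasks: vlans.yml
  when: mikrotik_switch_manage_vlans | bool

- name: Take backups
  ansible.builtin.import_tasks: backup.yml
  when: mikrotik_switch_manage_backup | bool

- name: Upgrade RouterOS and RouterBOOT
  ansible.builtin.import_tasks: firmware.yml
  when: mikrotik_switch_manage_firmware | bool
```

With `import_tasks`, the `when` condition is applied to every task in the imported file, so a disabled domain shows up as skipped tasks rather than disappearing from the run output.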

### Connection model

`group_vars/mikrotik.yml`:
- `ansible_connection: ansible.netcommon.network_cli`
- `ansible_network_os: community.routeros.routeros`
- `ansible_user: <admin user>`
- `ansible_ssh_private_key_file: ~/.ssh/id_ed25519` (day-2, key auth)

`play_bootstrap.yml` overrides with password auth (`--ask-pass`) for first contact only.
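Written out as a file, the day-2 connection settings listed above might look like this sketch. The username `netadmin` is a placeholder, since the real admin name is still an open item.

```yaml
# group_vars/mikrotik.yml (sketch of the day-2 connection settings)
ansible_connection: ansible.netcommon.network_cli
ansible_network_os: community.routeros.routeros
ansible_user: netadmin  # placeholder; the real username is not decided yet
ansible_ssh_private_key_file: ~/.ssh/id_ed25519
```

The `community.routeros` SSH guide, as far as I recall, suggests appending a terminal-options suffix such as `+cet1024w` to the username to disable colours and widen the terminal; whether that is needed here is worth confirming against the collection documentation during planning.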

### Idempotency strategy (key design challenge)

Over `network_cli`/SSH the primary module is `community.routeros.command` (RouterOS has
no rich declarative module set like `ios_*`). Idempotency is therefore the main risk and
must be deliberate:
- Prefer naturally-idempotent commands: `/.../ set` on known, named items.
- For `add`-style items, guard with RouterOS scripting: `:if ([find <selector>] = "") do={ add ... }`.
- Use `changed_when` based on command output where guards are impractical.
- Keep each domain's command set small and readable; one logical change per task.
- Cross-check against `community.routeros.facts` / `/export` output where useful.

This is explicitly called out so the implementation plan budgets for testing idempotency
(run twice, assert no changes on second run).
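A minimal sketch of two of these patterns follows, assuming a bridge named `bridge1` and a hypothetical variable `mikrotik_switch_identity`. The guard uses the `[:len [... find ...]] = 0` form, a common variant of the `[find <selector>] = ""` idiom above, and prints a marker string so that `changed_when` has something to match. How the RouterOS terminal echoes the output over `network_cli` has not been verified here.

```yaml
# Sketch only: names and marker strings are assumptions, not tested tasks.

# Guarded add: only creates the bridge when no item with that name exists.
- name: Ensure the bridge exists
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/interface bridge find name="bridge1"]] = 0)
        do={ /interface bridge add name=bridge1; :put "CHANGED" }
  register: bridge_add
  changed_when: "'CHANGED' in (bridge_add.stdout | join(' '))"

# Read-then-set: query the current value, then set it only when it differs.
- name: Read the current identity
  community.routeros.command:
    commands:
      - ':put [/system identity get name]'
  register: identity_now
  changed_when: false

- name: Set the identity only when it differs
  community.routeros.command:
    commands:
      - '/system identity set name="{{ mikrotik_switch_identity }}"'
  when: (identity_now.stdout[0] | trim) != mikrotik_switch_identity
```

Without an explicit `changed_when`, `community.routeros.command` reports every task as changed, so each command task needs one of these treatments for the "second run reports no changes" check to be meaningful.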

## Operational flows

### On-switch preparation (manual, before Ansible)

1. Confirm boot OS is **RouterOS** (not SwOS); VLAN filtering + `community.routeros` require it.
2. Upgrade RouterOS **and** RouterBOOT firmware to a known-good stable; record as `firmware_target`.
3. **Factory-reset to no default configuration** so Ansible owns the whole config.
4. First-contact connectivity: laptop on a port, reach the device, confirm SSH reachable.
5. Decide addressing (into `host_vars`): mgmt IP/mask, mgmt VLAN, gateway, and which
   port/SFP+ is the upstream **trunk/uplink** to OPNsense.
6. Record identity facts: serial, MAC, model, RouterOS version.
7. Physical: SFP+ module/DAC for the 10G uplink, PSU, mounting.

### Bootstrap (run once)

`play_bootstrap.yml`, SSH **password** auth (default/initial creds):
- create named admin user; set its password from vault;
- import `~/.ssh/id_ed25519.pub`, bind to the user;
- enable SSH service;
- verify key login works, then disable the default `admin`.
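The first two bootstrap steps could be sketched roughly as below. The names `netadmin` and `vault_switch_admin_password` are hypothetical, the upload of the public key file to the switch is deliberately not shown, and the verify-then-disable step for the default `admin` is left out because it needs a second connection as the new user.

```yaml
# play_bootstrap.yml
# Sketch only; run once with: ansible-playbook play_bootstrap.yml --ask-pass
- name: Bootstrap key-based access on a factory-reset switch
  hosts: mikrotik
  gather_facts: false
  vars:
    ansible_user: admin  # default account; play vars override group_vars
    bootstrap_admin_user: netadmin  # hypothetical name
    bootstrap_admin_password: "{{ vault_switch_admin_password }}"  # hypothetical vault var
  tasks:
    - name: Create the named admin user if it is missing
      community.routeros.command:
        commands:
          - >-
            :if ([:len [/user find name="{{ bootstrap_admin_user }}"]] = 0)
            do={ /user add name="{{ bootstrap_admin_user }}" group=full
            password="{{ bootstrap_admin_password }}" }
      no_log: true

    # Assumes id_ed25519.pub has already been copied to the switch's file store.
    - name: Import the operator public key for the new user
      community.routeros.command:
        commands:
          - '/user ssh-keys import public-key-file=id_ed25519.pub user={{ bootstrap_admin_user }}'
```

Setting `ansible_user` at play level keeps the password-based first contact confined to this playbook, while `group_vars/mikrotik.yml` continues to describe the key-only day-2 connection.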

### Day-2 (normal)

`play_switch.yml`, **key-only**, applies all enabled domains idempotently.
`play_backup.yml` exports config + binary backup into `backups/<switch>/`.
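As a sketch of what these two playbooks might contain: the first simply applies the role; the second shows only the text export, captured from command output and written on the control node. The file name `export.rsc` is an assumption, and fetching the binary `.backup` file is not shown.

```yaml
# play_switch.yml (sketch)
- name: Apply all enabled configuration domains
  hosts: mikrotik
  gather_facts: false
  roles:
    - role: makerfloss.mikrotik_switch
```

```yaml
# play_backup.yml (sketch; text export only)
- name: Save a text export into the repo
  hosts: mikrotik
  gather_facts: false
  tasks:
    - name: Run /export on the switch
      community.routeros.command:
        commands:
          - /export
      register: switch_export
      changed_when: false

    - name: Ensure the per-switch backup directory exists
      ansible.builtin.file:
        path: "{{ playbook_dir }}/backups/{{ inventory_hostname }}"
        state: directory
        mode: "0700"
      delegate_to: localhost

    - name: Write the export into backups/<switch>/
      ansible.builtin.copy:
        content: "{{ switch_export.stdout[0] }}\n"
        dest: "{{ playbook_dir }}/backups/{{ inventory_hostname }}/export.rsc"
        mode: "0600"
      delegate_to: localhost
```

RouterOS exports usually begin with a timestamp comment, so the written file is likely to differ on every run; stripping that header line before writing would be one way to keep the git history quiet.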

## Secrets

Vault identity `makerfloss` (`~/.ansible/vault-keys/makerfloss.txt`), referenced in
`ansible.cfg`. Initial contents: the switch admin password. SSH key auth means day-2
runs need no secret at runtime. (Vault-less start is possible but we create the identity
up front.)

## Success criteria

- `play_bootstrap.yml` takes a factory-reset switch to key-based SSH access.
- `play_switch.yml` applies identity + services + users + a placeholder VLAN/port
  topology, and is **idempotent** (second run reports no changes).
- `play_backup.yml` writes a usable `/export` and `.backup` into the repo.
- All real switch values live in `host_vars`; the role contains no makerspace specifics.
- `ansible-lint` and `yamllint` pass.

## Open items to confirm during planning

- Exact RouterOS `firmware_target` version to pin.
- Whether `winbox` service stays enabled (convenience) or is disabled (hardening).
- Named admin username (e.g. `sjat` vs a service account like `ansible`).
- Backup scheduling: Ansible-run on demand vs a RouterOS scheduler + fetch.

## Reference

- `narrowin/ansible-mikrotik` (GitHub): playbook-centric; mine its backup/upgrade
  command sequences. Not used as a dependency.
- `community.routeros` Ansible collection.
- AnsibleBaobabV4: conventions source (direnv, ansible.cfg, lint, inventory cascade,
  enable-flag role idiom).