docs: CRS310 Ansible implementation plan

Bite-sized, idempotency-verified plan: scaffolding -> vault/inventory ->
role skeleton -> bootstrap (key import) -> domain tasks (identity, users,
vlans, backup, firmware) -> docs/publish. Phase 0 gates device-dependent
work on physical switch prep + forgejo repo creation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sjat 2026-06-07 08:12:15 +02:00
parent f1d7b3059c
commit 7731f98f15


@ -0,0 +1,924 @@
# MakerFLOSS_Mikrotik — CRS310 Ansible Management — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Build a fresh Ansible repo that configures the makerspace MikroTik CRS310-8G+2S+IN switch over SSH (identity, services, users/keys, VLANs, backups, firmware), idempotently and version-controlled.
**Architecture:** One role `makerfloss.mikrotik_switch` with per-domain task files gated by enable-flags, driven by `community.routeros` over `network_cli`. Real values live in `host_vars`; connection vars in `group_vars/mikrotik.yml`; mechanism + placeholders in role defaults. A one-time `play_bootstrap.yml` (password auth) imports the operator SSH key; day-2 `play_switch.yml` runs key-only.
**Tech Stack:** Ansible 10.x / ansible-core 2.17, `community.routeros` (+ `ansible.netcommon`), RouterOS 7.x, SSH key auth, ansible-vault (`makerfloss` identity), ansible-lint + yamllint.
**Working directory:** `~/Projects/MakerFLOSS_Mikrotik` (git repo already initialised, `main`, with the design spec committed).
**Reference spec:** `docs/superpowers/specs/2026-06-07-mikrotik-crs310-ansible-design.md`
---
## How verification works in this plan
There is no unit-test framework for RouterOS config. "Tests" in this plan are:
- **Static:** `yamllint .`, `ansible-lint`, `ansible-playbook --syntax-check`.
- **Connectivity:** `ansible -m community.routeros.command -a "commands='/system/resource/print'" <switch>`.
- **Idempotency:** run the play twice; the **second run must report `changed=0`** for that domain.
Device-dependent tasks (Phase 3 onward) require the switch to be prepared and reachable
(Phase 0). Until then, only static checks pass, which is expected.
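The connectivity check listed above can also be kept as a small playbook so it is repeatable. This is a minimal sketch, not part of the original plan: the file name `play_smoke.yml` and the variable `resource_print` are hypothetical, and it assumes the inventory and connection vars from Task 2 are in place.

```yaml
---
# play_smoke.yml (hypothetical file name, not part of the original plan)
- name: Smoke-test SSH connectivity to the switch
  hosts: mikrotik
  gather_facts: false
  tasks:
    - name: Print system resource information
      community.routeros.command:
        commands:
          - /system/resource/print
      register: resource_print
      changed_when: false
    - name: Fail early if the device returned nothing
      ansible.builtin.assert:
        that:
          - resource_print.stdout[0] | length > 0
        fail_msg: "No output from /system/resource/print; check SSH reachability and credentials."
```

Running `ansible-playbook play_smoke.yml --limit crs310-maker` before any device-dependent phase should surface connection problems without touching the configuration.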
> ⚠️ **Lockout safety:** When applying VLAN/bridge or service changes, keep an independent
> recovery channel open (WinBox MAC-telnet, or serial console) so a mistake in management
> reachability doesn't strand the switch. Enable `vlan-filtering` **last**, after the
> management path is proven.
---
## Phase 0: Prerequisites (manual, out-of-band; do before Phase 3)
These are not code tasks; they gate the device-dependent phases.
- [ ] **0.1 Create the empty repo** `MakerFLOSS_Mikrotik` on `forgejo.makerfloss.eu` (no README/license, so the first push isn't rejected).
- [ ] **0.2 Confirm boot OS is RouterOS** (not SwOS) on the CRS310. Switch in RouterBOOT if needed.
- [ ] **0.3 Upgrade + pin firmware:** update RouterOS and RouterBOOT to a known-good stable (e.g. latest 7.x stable). Record the exact version string for `switch_firmware_target` (used in Task 9).
- [ ] **0.4 Factory-reset to NO default configuration** (`/system/reset-configuration no-defaults=yes` or Netinstall). Ansible will own the whole config.
- [ ] **0.5 First contact:** connect a laptop and reach the switch. After a no-defaults reset the switch has no IP address, so assign a temporary one via a WinBox MAC session, e.g. `/ip/address/add address=192.168.88.1/24 interface=ether1`, and run `/ip/service/enable ssh`. Confirm `ssh admin@<ip>` works.
- [ ] **0.6 Record identity facts** into a scratch note for Task 3 host_vars: serial, base MAC, model, RouterOS version, the temporary mgmt IP/port.
- [ ] **0.7 Physical:** fit the SFP+ module/DAC for the 10G uplink; confirm PSU/mounting.
---
## Phase 1: Repo scaffolding (no device required)
### Task 1: direnv, ansible.cfg, lint configs, requirements
**Files:**
- Create: `.envrc`, `ansible.cfg`, `.ansible-lint`, `.yamllint`, `requirements.txt`, `requirements.yml`, `.gitignore`
- [ ] **Step 1: Create `.gitignore`**
```gitignore
.venv/
*.retry
__pycache__/
*.pyc
.DS_Store
```
- [ ] **Step 2: Create `.envrc`** (verbatim from AnsibleBaobabV4)
```bash
# Create .venv automatically if it doesn't exist
if [ ! -d .venv ]; then
  python3 -m venv .venv
  .venv/bin/python -m pip install -U pip setuptools wheel
fi
# Activate the environment manually (avoids Python 3.13 deprecation warning)
export VIRTUAL_ENV=$PWD/.venv
PATH_add .venv/bin
```
- [ ] **Step 3: Create `ansible.cfg`**
```ini
[defaults]
inventory = inventories/prod/hosts.yml
roles_path = roles:~/.ansible/roles
collections_path = ~/.ansible/collections
host_key_checking = False
retry_files_enabled = False
interpreter_python = auto_silent
nocows = 1
timeout = 30
stdout_callback = yaml
bin_ansible_callbacks = True
vault_identity_list = makerfloss@~/.ansible/vault-keys/makerfloss.txt
[persistent_connection]
command_timeout = 60
connect_timeout = 60
[ssh_connection]
pipelining = True
```
- [ ] **Step 4: Create `requirements.txt`**
```text
# Core Ansible
ansible==10.3.0
# Linting & validation
ansible-lint==24.7.0
yamllint==1.35.1
# Network connection plugins / SCP for SSH key transfer to RouterOS
paramiko>=3.4.0
scp>=0.15.0
```
- [ ] **Step 5: Create `requirements.yml`**
```yaml
---
collections:
  - name: community.routeros
    version: ">=3.0.0,<4.0.0"
  - name: ansible.netcommon
    version: ">=6.0.0,<8.0.0"
```
- [ ] **Step 6: Create `.ansible-lint`**
```yaml
---
profile: production
skip_list:
  - line-length
  - no-changed-when
exclude_paths:
  - .venv/
  - backups/
```
- [ ] **Step 7: Create `.yamllint`**
```yaml
---
extends: default
rules:
  line-length: disable
  comments:
    min-spaces-from-content: 1
  truthy:
    allowed-values: ["true", "false", "yes", "no"]
ignore: |
  .venv/
  backups/
```
- [ ] **Step 8: Bootstrap the venv and install**
Run:
```bash
cd ~/Projects/MakerFLOSS_Mikrotik
direnv allow 2>/dev/null || python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
ansible-galaxy collection install -r requirements.yml
```
Expected: ansible, ansible-lint, yamllint installed; `community.routeros` + `ansible.netcommon` installed.
- [ ] **Step 9: Verify yamllint passes on the config files**
Run: `yamllint .`
Expected: no errors.
- [ ] **Step 10: Commit**
```bash
git add .gitignore .envrc ansible.cfg requirements.txt requirements.yml .ansible-lint .yamllint
git commit -m "chore: repo scaffolding (direnv, ansible.cfg, lint, requirements)"
```
---
### Task 2: Vault identity + inventory + connection group_vars
**Files:**
- Create: `~/.ansible/vault-keys/makerfloss.txt` (outside repo), `inventories/prod/hosts.yml`, `group_vars/mikrotik.yml`, `group_vars/all.yml`
- [ ] **Step 1: Create the vault key (outside the repo)**
Run:
```bash
mkdir -p ~/.ansible/vault-keys
( umask 077; openssl rand -base64 48 > ~/.ansible/vault-keys/makerfloss.txt )
chmod 600 ~/.ansible/vault-keys/makerfloss.txt
```
Expected: a 600-perm key file exists. (This is the `makerfloss` identity from `ansible.cfg`.)
- [ ] **Step 2: Create `inventories/prod/hosts.yml`**
> Replace `crs310-maker` and the `ansible_host` IP with the real values from Phase 0.6.
```yaml
---
all:
  children:
    mikrotik:
      hosts:
        crs310-maker:
          ansible_host: 192.168.88.1  # temp mgmt IP until Task 7 assigns the real one
```
- [ ] **Step 3: Create `group_vars/mikrotik.yml`** (connection/platform vars)
```yaml
---
ansible_connection: ansible.netcommon.network_cli
ansible_network_os: community.routeros.routeros
ansible_user: sjat  # named admin whose key is imported in Task 4; bootstrap overrides with -e ansible_user=admin
ansible_ssh_private_key_file: "~/.ssh/id_ed25519"
# Domain enable-flags (day-2 play). Override per-host if needed.
switch_identity_enabled: true
switch_users_enabled: true
switch_vlans_enabled: true
switch_backup_enabled: true
switch_firmware_enabled: false # opt-in; upgrades are disruptive
```
- [ ] **Step 4: Create `group_vars/all.yml`** (placeholder for shared, non-secret defaults)
```yaml
---
# Shared non-secret defaults across all hosts go here.
# Secrets live in the vault (see host_vars / a vaulted file), not in this file.
org_name: "MakerFLOSS"
```
- [ ] **Step 5: Verify inventory parses**
Run: `ansible-inventory --graph`
Expected: shows the `@mikrotik` group containing `crs310-maker`.
- [ ] **Step 6: Commit**
```bash
git add inventories group_vars
git commit -m "feat: inventory, connection group_vars, makerfloss vault identity"
```
---
## Phase 2: Role skeleton
### Task 3: Role skeleton + host_vars + meta
**Files:**
- Create: `roles/makerfloss.mikrotik_switch/defaults/main.yml`
- Create: `roles/makerfloss.mikrotik_switch/meta/main.yml`
- Create: `roles/makerfloss.mikrotik_switch/tasks/main.yml`
- Create: `host_vars/crs310-maker.yml`
- Create: `play_switch.yml`
- [ ] **Step 1: Create `roles/makerfloss.mikrotik_switch/meta/main.yml`**
```yaml
---
galaxy_info:
  role_name: mikrotik_switch
  namespace: makerfloss
  author: sjat
  description: Configure a MikroTik RouterOS switch (CRS310) over SSH.
  license: MIT
  min_ansible_version: "2.17"
  platforms: []
dependencies: []
```
- [ ] **Step 2: Create `roles/makerfloss.mikrotik_switch/defaults/main.yml`** (mechanism + PLACEHOLDER topology)
```yaml
---
# ----- Identity / management -----
switch_identity_name: "{{ inventory_hostname }}"
switch_mgmt_vlan_id: 99
switch_mgmt_address: "192.168.88.1/24" # PLACEHOLDER — override in host_vars
switch_mgmt_gateway: "192.168.88.254" # PLACEHOLDER — override in host_vars
switch_dns_servers: "192.168.88.254"
switch_ntp_servers: "192.168.88.254"
# Services to disable for hardening (winbox kept on by default for recovery)
switch_disabled_services:
  - telnet
  - ftp
  - www
  - www-ssl
  - api
  - api-ssl
switch_ssh_port: 22
# ----- Users -----
switch_admin_user: "sjat"
switch_admin_group: "full"
switch_admin_ssh_pubkey_file: "~/.ssh/id_ed25519.pub"
switch_disable_default_admin: true
# ----- VLAN / bridge / ports (PLACEHOLDER example) -----
# Real topology is defined in host_vars/<switch>.yml.
# Flow mappings carry no spaces inside the braces so yamllint's default `braces` rule passes.
switch_bridge_name: "bridge"
switch_vlans:
  - {id: 99, name: "mgmt"}
  - {id: 10, name: "members"}
switch_bridge_ports:
  # ether1..ether8 = 2.5GbE access ports; sfp-sfpplus1/2 = 10G uplinks
  - {interface: "ether1", pvid: 10, mode: access}
  - {interface: "sfp-sfpplus1", pvid: 1, mode: trunk, tagged_vlans: [99, 10]}
# ----- Firmware -----
switch_firmware_target: "" # set in host_vars when opting into upgrades
```
- [ ] **Step 3: Create `roles/makerfloss.mikrotik_switch/tasks/main.yml`** (domain dispatch)
```yaml
---
- name: Identity, management and services
  ansible.builtin.import_tasks: identity.yml
  when: switch_identity_enabled | bool
  tags: [identity]
- name: Users and SSH keys
  ansible.builtin.import_tasks: users.yml
  when: switch_users_enabled | bool
  tags: [users]
- name: VLANs, bridge and ports
  ansible.builtin.import_tasks: vlans.yml
  when: switch_vlans_enabled | bool
  tags: [vlans]
- name: Backup configuration
  ansible.builtin.import_tasks: backup.yml
  when: switch_backup_enabled | bool
  tags: [backup]
- name: Firmware upgrade
  ansible.builtin.import_tasks: firmware.yml
  when: switch_firmware_enabled | bool
  tags: [firmware]
```
- [ ] **Step 4: Create stub `identity.yml`, `users.yml`, `vlans.yml`, `backup.yml`, `firmware.yml`**
Each stub (replaced in later tasks) is just:
```yaml
---
- name: Placeholder
  ansible.builtin.debug:
    msg: "not yet implemented"
```
Create all five files in `roles/makerfloss.mikrotik_switch/tasks/` with that content.
- [ ] **Step 5: Create `host_vars/crs310-maker.yml`** (REAL values from Phase 0.6)
```yaml
---
# Identity facts recorded during Phase 0.6 (edit to match the device)
switch_identity_name: "crs310-maker"
switch_mgmt_vlan_id: 99
switch_mgmt_address: "10.0.99.2/24" # EDIT: real mgmt IP
switch_mgmt_gateway: "10.0.99.1" # EDIT: real gateway
switch_dns_servers: "10.0.99.1"
switch_ntp_servers: "10.0.99.1"
switch_admin_user: "sjat"
# Real VLAN/port topology (EDIT to the makerspace plan when known)
switch_vlans:
  - {id: 99, name: "mgmt"}
  - {id: 10, name: "members"}
switch_bridge_ports:
  - {interface: "ether1", pvid: 10, mode: access}
  - {interface: "ether2", pvid: 10, mode: access}
  - {interface: "sfp-sfpplus1", pvid: 1, mode: trunk, tagged_vlans: [99, 10]}
# Firmware (opt-in)
# switch_firmware_enabled: true
# switch_firmware_target: "7.x.y" # EDIT to the version pinned in Phase 0.3
```
- [ ] **Step 6: Create `play_switch.yml`**
```yaml
---
- name: Configure MikroTik switches (day-2, key auth)
  hosts: mikrotik
  gather_facts: false
  roles:
    - makerfloss.mikrotik_switch
```
- [ ] **Step 7: Verify syntax + lint**
Run:
```bash
ansible-playbook play_switch.yml --syntax-check
yamllint .
ansible-lint
```
Expected: syntax OK; yamllint clean; ansible-lint clean (fix any findings).
- [ ] **Step 8: Commit**
```bash
git add roles host_vars play_switch.yml
git commit -m "feat: role skeleton, host_vars, day-2 play (stubbed domains)"
```
---
## Phase 3: Bootstrap play (device-dependent — needs Phase 0 done)
### Task 4: First-contact bootstrap — create user, import SSH key
**Files:**
- Create: `play_bootstrap.yml`
- Create: `group_vars/mikrotik.vault.yml` (vaulted admin password)
- [ ] **Step 1: Create the vaulted admin password file**
Run:
```bash
ansible-vault create group_vars/mikrotik.vault.yml
```
Put in it:
```yaml
---
vault_switch_admin_password: "CHOOSE-A-STRONG-PASSWORD"
```
Expected: file is encrypted (`head -1` shows `$ANSIBLE_VAULT`).
- [ ] **Step 2: Create `play_bootstrap.yml`**
> Run with password auth the first time: `ansible-playbook play_bootstrap.yml -e ansible_user=admin --ask-pass`.
> `net_put` copies the public key file to the device over SCP, then RouterOS imports it.
```yaml
---
- name: Bootstrap MikroTik switch (first contact, password auth)
  hosts: mikrotik
  gather_facts: false
  vars_files:
    # Loaded explicitly: the file name "mikrotik.vault.yml" does not match the
    # group name "mikrotik", so Ansible does not auto-load it as group_vars.
    - group_vars/mikrotik.vault.yml
  vars:
    pubkey_local: "{{ switch_admin_ssh_pubkey_file | default('~/.ssh/id_ed25519.pub') }}"
    pubkey_remote: "id_ansible.pub"
  tasks:
    # Role defaults are not in scope in this play, hence the default('full') fallback.
    - name: Create named admin user (idempotent)
      community.routeros.command:
        commands:
          - >-
            :if ([:len [/user find name="{{ switch_admin_user }}"]] = 0) do={
            /user add name="{{ switch_admin_user }}" group="{{ switch_admin_group | default('full') }}"
            password="{{ vault_switch_admin_password }}" }
      register: user_create
      changed_when: true
    - name: Copy operator public key to the switch
      ansible.netcommon.net_put:
        src: "{{ pubkey_local }}"
        dest: "{{ pubkey_remote }}"
    - name: Import the SSH public key for the admin user
      community.routeros.command:
        commands:
          - /user/ssh-keys/import public-key-file="{{ pubkey_remote }}" user="{{ switch_admin_user }}"
      register: key_import
      changed_when: true
    - name: Ensure SSH service is enabled
      community.routeros.command:
        commands:
          - /ip/service/set ssh disabled=no port={{ switch_ssh_port | default(22) }}
      changed_when: true
```
- [ ] **Step 3: Syntax check**
Run: `ansible-playbook play_bootstrap.yml --syntax-check`
Expected: OK.
- [ ] **Step 4: Run bootstrap against the switch (password auth)**
Run:
```bash
ansible-playbook play_bootstrap.yml -e ansible_user=admin --ask-pass
```
Expected: user created, key file copied, key imported, SSH enabled. (Keep a WinBox MAC session open per the lockout note.)
- [ ] **Step 5: Prove key login works**
Run:
```bash
ansible -m community.routeros.command -a "commands='/user/print'" crs310-maker
```
Expected: succeeds using `~/.ssh/id_ed25519` (no password prompt), and lists your named user.
- [ ] **Step 6: Commit**
```bash
git add play_bootstrap.yml group_vars/mikrotik.vault.yml
git commit -m "feat: first-contact bootstrap play (named admin + SSH key import)"
```
---
## Phase 4: Domain tasks (device-dependent — idempotency-verified)
> For every task below: run `ansible-playbook play_switch.yml --tags <domain> --limit crs310-maker`
> **twice**; the second run must report `changed=0`. Tasks that set `changed_when: false` always report unchanged, so also confirm on the device that the second run altered nothing.
> RouterOS `:if ([:len [... find ...]] = 0)` guards make `add` idempotent.
### Task 5: identity.yml — identity, mgmt IP, DNS/NTP, service hardening
**Files:**
- Modify: `roles/makerfloss.mikrotik_switch/tasks/identity.yml`
- [ ] **Step 1: Replace `identity.yml` with the real implementation**
```yaml
---
- name: Set system identity
  community.routeros.command:
    commands:
      - /system/identity/set name="{{ switch_identity_name }}"
  changed_when: false
- name: Configure DNS
  community.routeros.command:
    commands:
      - /ip/dns/set servers="{{ switch_dns_servers }}" allow-remote-requests=no
  changed_when: false
- name: Configure NTP client
  community.routeros.command:
    commands:
      - /system/ntp/client/set enabled=yes servers="{{ switch_ntp_servers }}"
  changed_when: false
# The backreference is written as \\1 because Jinja processes backslash escapes in string literals.
- name: Disable unused services
  community.routeros.command:
    commands: >-
      {{ switch_disabled_services
      | map('regex_replace', '^(.*)$', '/ip/service/set \\1 disabled=yes')
      | list }}
  changed_when: false
- name: Set SSH service port
  community.routeros.command:
    commands:
      - /ip/service/set ssh disabled=no port={{ switch_ssh_port }}
  changed_when: false
```
- [ ] **Step 2: Run twice, assert idempotent**
Run (twice):
```bash
ansible-playbook play_switch.yml --tags identity --limit crs310-maker
```
Expected: completes cleanly both runs; `/system/identity/print` shows the new name; disabled services show `X` (disabled).
- [ ] **Step 3: Commit**
```bash
git add roles/makerfloss.mikrotik_switch/tasks/identity.yml
git commit -m "feat(identity): identity, DNS, NTP, service hardening"
```
---
### Task 6: users.yml — ensure admin user, key, disable default admin
**Files:**
- Modify: `roles/makerfloss.mikrotik_switch/tasks/users.yml`
- [ ] **Step 1: Replace `users.yml`**
```yaml
---
- name: Ensure named admin user exists
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/user find name="{{ switch_admin_user }}"]] = 0) do={
        /user add name="{{ switch_admin_user }}" group="{{ switch_admin_group }}" }
  changed_when: false
- name: Disable the default admin user
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/user find name="admin"]] > 0) do={
        /user/set admin disabled=yes }
  when: switch_disable_default_admin | bool
  changed_when: false
```
- [ ] **Step 2: Run twice, assert idempotent**
Run (twice):
```bash
ansible-playbook play_switch.yml --tags users --limit crs310-maker
```
Expected: clean both runs. **Before this lands, confirm key login as the named user works** (Task 4 Step 5), or disabling `admin` could lock you out.
- [ ] **Step 3: Commit**
```bash
git add roles/makerfloss.mikrotik_switch/tasks/users.yml
git commit -m "feat(users): ensure named admin, disable default admin"
```
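The lockout warning in Step 2 can be backed by a guard at the top of `users.yml`. The following is a sketch under one assumption: that the day-2 play connects as the named user, so `ansible_user` being `admin` means key login for the named user has not been adopted yet. It checks only the connecting user and does not replace the manual login test.

```yaml
---
# Hypothetical guard task, not part of the original plan.
- name: Refuse to disable the default admin while still connecting as admin
  ansible.builtin.assert:
    that:
      - ansible_user is defined
      - ansible_user != "admin"
    fail_msg: >-
      Still connecting as 'admin'. Prove key login as
      {{ switch_admin_user }} first (Task 4 Step 5).
  when: switch_disable_default_admin | bool
```

Placed before the "Disable the default admin user" task, this stops the run before the account that is carrying the current session gets disabled.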
---
### Task 7: vlans.yml — VLAN-aware bridge, ports, mgmt interface
**Files:**
- Modify: `roles/makerfloss.mikrotik_switch/tasks/vlans.yml`
> Ordering matters to avoid lockout: create bridge (filtering OFF) → add ports → define
> VLANs → add mgmt VLAN interface + IP → enable `vlan-filtering` LAST.
- [ ] **Step 1: Replace `vlans.yml`**
```yaml
---
- name: Create VLAN-aware bridge (filtering off initially)
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/interface/bridge/find name="{{ switch_bridge_name }}"]] = 0) do={
        /interface/bridge/add name="{{ switch_bridge_name }}" vlan-filtering=no }
  changed_when: false
- name: Add bridge ports with PVIDs
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/interface/bridge/port/find interface="{{ item.interface }}"]] = 0) do={
        /interface/bridge/port/add bridge="{{ switch_bridge_name }}"
        interface="{{ item.interface }}" pvid={{ item.pvid }} }
        else={ /interface/bridge/port/set
        [find interface="{{ item.interface }}"] pvid={{ item.pvid }} }
  loop: "{{ switch_bridge_ports }}"
  loop_control:
    label: "{{ item.interface }}"
  changed_when: false
- name: Define bridge VLANs (tagged/untagged membership)
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/interface/bridge/vlan/find vlan-ids={{ item.id }}]] = 0) do={
        /interface/bridge/vlan/add bridge="{{ switch_bridge_name }}" vlan-ids={{ item.id }}
        tagged="{{ ([switch_bridge_name] + (switch_bridge_ports
        | selectattr('mode','equalto','trunk')
        | selectattr('tagged_vlans','defined')
        | selectattr('tagged_vlans','contains', item.id)
        | map(attribute='interface') | list)) | join(',') }}"
        untagged="{{ switch_bridge_ports
        | selectattr('mode','equalto','access')
        | selectattr('pvid','equalto', item.id)
        | map(attribute='interface') | list | join(',') }}" }
  loop: "{{ switch_vlans }}"
  loop_control:
    label: "vlan {{ item.id }}"
  changed_when: false
- name: Create management VLAN interface
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/interface/vlan/find name="vlan-mgmt"]] = 0) do={
        /interface/vlan/add name="vlan-mgmt" interface="{{ switch_bridge_name }}"
        vlan-id={{ switch_mgmt_vlan_id }} }
  changed_when: false
- name: Assign management IP address
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/ip/address/find interface="vlan-mgmt"]] = 0) do={
        /ip/address/add address="{{ switch_mgmt_address }}" interface="vlan-mgmt" }
  changed_when: false
- name: Set default gateway route
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/ip/route/find dst-address="0.0.0.0/0"]] = 0) do={
        /ip/route/add dst-address=0.0.0.0/0 gateway="{{ switch_mgmt_gateway }}" }
  changed_when: false
- name: Enable VLAN filtering (LAST - verify mgmt reachability first)
  community.routeros.command:
    commands:
      - /interface/bridge/set "{{ switch_bridge_name }}" vlan-filtering=yes
  changed_when: false
```
- [ ] **Step 2: Run twice, assert idempotent — WITH a recovery channel open**
Run (twice), keeping WinBox MAC session open:
```bash
ansible-playbook play_switch.yml --tags vlans --limit crs310-maker
```
Expected: clean both runs. Verify `/interface/bridge/vlan/print` shows correct tagged/untagged sets and you can still reach the mgmt IP after `vlan-filtering=yes`.
- [ ] **Step 3: Commit**
```bash
git add roles/makerfloss.mikrotik_switch/tasks/vlans.yml
git commit -m "feat(vlans): VLAN-aware bridge, ports, mgmt interface"
```
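The manual reachability check from Step 2 could also be automated as a final task in `vlans.yml`. This is a sketch, not part of the original plan; it assumes the control machine has a route to the management VLAN and that `switch_mgmt_address` is written in CIDR form as in the defaults.

```yaml
---
# Hypothetical post-check, run from the control machine rather than the switch.
- name: Confirm SSH still answers on the management address
  ansible.builtin.wait_for:
    host: "{{ switch_mgmt_address.split('/')[0] }}"
    port: "{{ switch_ssh_port }}"
    timeout: 15
  delegate_to: localhost
```

If this task fails after `vlan-filtering=yes`, the switch is probably unreachable in-band and recovery goes through the WinBox MAC session or serial console that the lockout note asks you to keep open.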
---
### Task 8: backup.yml — export + binary backup, fetch into repo
**Files:**
- Modify: `roles/makerfloss.mikrotik_switch/tasks/backup.yml`
- Create: `play_backup.yml`
- Create: `backups/.gitkeep`
- [ ] **Step 1: Replace `backup.yml`**
```yaml
---
- name: Generate a config export on the device
  community.routeros.command:
    commands:
      - /export file=export
  changed_when: false
- name: Generate a binary system backup on the device
  community.routeros.command:
    commands:
      - /system/backup/save name=backup dont-encrypt=yes
  changed_when: false
- name: Fetch the export file into the repo
  ansible.netcommon.net_get:
    src: "export.rsc"
    dest: "{{ playbook_dir }}/backups/{{ inventory_hostname }}/export.rsc"
- name: Fetch the binary backup into the repo
  ansible.netcommon.net_get:
    src: "backup.backup"
    dest: "{{ playbook_dir }}/backups/{{ inventory_hostname }}/backup.backup"
```
- [ ] **Step 2: Create `play_backup.yml`**
```yaml
---
- name: Back up MikroTik switch configuration
  hosts: mikrotik
  gather_facts: false
  tasks:
    - name: Ensure local backup directory exists
      ansible.builtin.file:
        path: "{{ playbook_dir }}/backups/{{ inventory_hostname }}"
        state: directory
        mode: "0755"
      delegate_to: localhost
    - name: Run backup tasks
      ansible.builtin.include_role:
        name: makerfloss.mikrotik_switch
        tasks_from: backup.yml
```
- [ ] **Step 3: Create `backups/.gitkeep`** (empty file) so the dir exists.
- [ ] **Step 4: Run the backup play**
Run:
```bash
ansible-playbook play_backup.yml --limit crs310-maker
```
Expected: `backups/crs310-maker/export.rsc` and `backup.backup` appear locally and are non-empty.
- [ ] **Step 5: Commit (export only — binary backup may contain secrets)**
```bash
echo 'backups/**/*.backup' >> .gitignore
git add roles/makerfloss.mikrotik_switch/tasks/backup.yml play_backup.yml backups/crs310-maker/export.rsc .gitignore
git commit -m "feat(backup): export + binary backup, fetch into repo"
```
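The "appear locally and are non-empty" expectation from Step 4 can be asserted rather than eyeballed. A minimal sketch, assuming the file names used in `backup.yml`; the variable `backup_stats` is hypothetical, and the tasks would sit at the end of `play_backup.yml`.

```yaml
---
# Hypothetical verification tasks, not part of the original plan.
- name: Stat the fetched backup artifacts
  ansible.builtin.stat:
    path: "{{ playbook_dir }}/backups/{{ inventory_hostname }}/{{ item }}"
  loop:
    - export.rsc
    - backup.backup
  delegate_to: localhost
  register: backup_stats
- name: Assert each artifact exists and is non-empty
  ansible.builtin.assert:
    that:
      - item.stat.exists
      - item.stat.size > 0
    fail_msg: "{{ item.item }} is missing or empty"
  loop: "{{ backup_stats.results }}"
  loop_control:
    label: "{{ item.item }}"
```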
---
### Task 9: firmware.yml — RouterOS/RouterBOOT upgrade to pinned target
**Files:**
- Modify: `roles/makerfloss.mikrotik_switch/tasks/firmware.yml`
> Opt-in only (`switch_firmware_enabled: true` + `switch_firmware_target` set in host_vars).
> Upgrades reboot the switch — run deliberately, with a recovery channel.
- [ ] **Step 1: Replace `firmware.yml`**
```yaml
---
- name: Assert a firmware target is set
  ansible.builtin.assert:
    that:
      - switch_firmware_target | length > 0
    fail_msg: "switch_firmware_target must be set in host_vars to run firmware upgrades."
- name: Read current RouterOS version
  community.routeros.facts:
  register: ros_facts
- name: Upgrade RouterOS to the target channel and reboot
  community.routeros.command:
    commands:
      - /system/package/update/set channel=stable
      - /system/package/update/install
  when: ansible_net_version is version(switch_firmware_target, '<')
  changed_when: true
- name: Pause for device reboot
  ansible.builtin.wait_for_connection:
    delay: 30
    timeout: 300
  when: ansible_net_version is version(switch_firmware_target, '<')
# Facts are not re-gathered after the reboot, so the condition below still reflects
# the pre-upgrade version and gates RouterBOOT work to runs that upgraded RouterOS.
- name: Upgrade RouterBOOT firmware to match RouterOS
  community.routeros.command:
    commands:
      - /system/routerboard/upgrade
  when: ansible_net_version is version(switch_firmware_target, '<')
  changed_when: true
- name: Reboot to apply RouterBOOT upgrade
  community.routeros.command:
    commands:
      - /system/reboot
  when: ansible_net_version is version(switch_firmware_target, '<')
  changed_when: true
  ignore_errors: true  # connection drops on reboot; expected
```
- [ ] **Step 2: Syntax + lint only (do NOT auto-run upgrades in CI)**
Run:
```bash
ansible-playbook play_switch.yml --syntax-check
ansible-lint
```
Expected: clean.
- [ ] **Step 3: (Manual, optional) run the upgrade deliberately**
Run:
```bash
ansible-playbook play_switch.yml --tags firmware --limit crs310-maker \
-e switch_firmware_enabled=true
```
Expected: upgrades only if current `< switch_firmware_target`; switch reboots and comes back.
- [ ] **Step 4: Commit**
```bash
git add roles/makerfloss.mikrotik_switch/tasks/firmware.yml
git commit -m "feat(firmware): opt-in RouterOS + RouterBOOT upgrade to pinned target"
```
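After a deliberate upgrade run, the "switch reboots and comes back" expectation can be checked by re-reading facts. This sketch is not part of the original plan. It assumes `ansible_net_version` begins with a dotted version string comparable to `switch_firmware_target`; verify the exact format against the pinned release.

```yaml
---
# Hypothetical post-upgrade check.
- name: Re-read RouterOS facts after the upgrade reboot
  community.routeros.facts:
- name: Assert the running version has reached the pinned target
  ansible.builtin.assert:
    that:
      - ansible_net_version is version(switch_firmware_target, '>=')
    fail_msg: "Running {{ ansible_net_version }}, expected at least {{ switch_firmware_target }}."
```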
---
## Phase 5: Docs and publish
### Task 10: README, role README, CLAUDE.md, push to Forgejo
**Files:**
- Create: `README.md`, `roles/makerfloss.mikrotik_switch/README.md`, `CLAUDE.md`
- [ ] **Step 1: Create `README.md`** covering: purpose, prerequisites (Phase 0 checklist), setup (`direnv allow`, `pip install -r requirements.txt`, `ansible-galaxy collection install -r requirements.yml`), bootstrap (`play_bootstrap.yml --ask-pass`), day-2 (`play_switch.yml`), backup (`play_backup.yml`), and the lockout-safety note.
- [ ] **Step 2: Create `roles/makerfloss.mikrotik_switch/README.md`** documenting every variable in `defaults/main.yml`, the enable-flags, and the `switch_bridge_ports`/`switch_vlans` data shapes with an example.
- [ ] **Step 3: Create `CLAUDE.md`** — short project guide: tech stack, structure, essential commands (lint, syntax-check, bootstrap, day-2, backup), the idempotency rule, and the lockout-safety rule.
- [ ] **Step 4: Final static verification**
Run:
```bash
yamllint . && ansible-lint && ansible-playbook play_switch.yml --syntax-check
```
Expected: all clean.
- [ ] **Step 5: Add remote and push**
Run:
```bash
git remote add origin git@forgejo.makerfloss.eu:<owner>/MakerFLOSS_Mikrotik.git
git add README.md roles/makerfloss.mikrotik_switch/README.md CLAUDE.md
git commit -m "docs: README, role README, CLAUDE.md"
git push -u origin main
```
Expected: repo populated on `forgejo.makerfloss.eu`.
---
## Self-review checklist (run before execution)
- [ ] **Spec coverage:** identity/services (Task 5), users/keys (Tasks 4,6), VLANs/bridge/ports (Task 7), backups (Task 8), firmware (Task 9), bring-over conventions (Tasks 1,2), separate vault (Task 2), placeholder topology overridable in host_vars (Tasks 3,7). ✔
- [ ] **Open items** from the spec are surfaced in the plan: firmware target (Phase 0.3 / Task 9), winbox on/off (`switch_disabled_services` default keeps winbox), admin username (`switch_admin_user`), backup scheduling (on-demand `play_backup.yml`; RouterOS scheduler left as a future enhancement).
- [ ] **Idempotency** is explicitly tested (run-twice) on every device-touching task.
- [ ] **Lockout safety** called out at the top and on Tasks 6 and 7.
## Notes / risks to validate during execution
- **RouterOS version drift:** exact CLI syntax (NTP `servers=` property, `ssh-keys/import` path) is RouterOS-7 specific; verify against the pinned version from Phase 0.3 and adjust.
- **`net_put`/`net_get` over `network_cli`:** depends on SCP being available on the RouterOS SSH service; if it fails, fall back to importing the key by pasting its contents via `/user/ssh-keys/...` or enabling SCP.
- **`changed_when: false`** is used widely because the `command` module can't detect RouterOS state changes; idempotency comes from the `:if [find]` guards. Revisit if you want accurate change reporting (parse command output).
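The last note suggests parsing command output to get accurate change reporting. One way this could look for the identity setting is a read-then-write pair, sketched below. It assumes `/system/identity/print` returns a single `name: <value>` line, which should be confirmed on the pinned RouterOS version; the variable `identity_print` is hypothetical.

```yaml
---
# Hypothetical read-then-write pattern for accurate change reporting.
- name: Read the current system identity
  community.routeros.command:
    commands:
      - /system/identity/print
  register: identity_print
  changed_when: false
- name: Set system identity only when it differs
  community.routeros.command:
    commands:
      - /system/identity/set name="{{ switch_identity_name }}"
  when: (identity_print.stdout[0] | trim) != ('name: ' ~ switch_identity_name)
  changed_when: true
```

With this shape the first run reports `changed=1` and the second reports `changed=0` because the write task is skipped, which makes the run-twice check meaningful. The cost is one extra round trip per setting.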