MakerFLOSS_Mikrotik/docs/superpowers/plans/2026-06-07-mikrotik-crs310-ansible.md
sjat 0721ecc34c docs(plan): carry-over notes from skeleton code review
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 08:38:23 +02:00

MakerFLOSS_Mikrotik — CRS310 Ansible Management — Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Build a fresh Ansible repo that configures the makerspace MikroTik CRS310-8G+2S+IN switch over SSH (identity, services, users/keys, VLANs, backups, firmware), idempotently and version-controlled.

Architecture: One role makerfloss.mikrotik_switch with per-domain task files gated by enable-flags, driven by community.routeros over network_cli. Real values live in host_vars; connection vars in group_vars/mikrotik.yml; mechanism + placeholders in role defaults. A one-time play_bootstrap.yml (password auth) imports the operator SSH key; day-2 play_switch.yml runs key-only.

Tech Stack: Ansible 10.x / ansible-core 2.17, community.routeros (+ ansible.netcommon), RouterOS 7.x, SSH key auth, ansible-vault (makerfloss identity), ansible-lint + yamllint.

Working directory: ~/Projects/MakerFLOSS_Mikrotik (git repo already initialised, main, with the design spec committed).

Reference spec: docs/superpowers/specs/2026-06-07-mikrotik-crs310-ansible-design.md


How verification works in this plan

There is no unit-test framework for RouterOS config. "Tests" in this plan are:

  • Static: yamllint ., ansible-lint, ansible-playbook --syntax-check.
  • Connectivity: ansible -m community.routeros.command -a "commands='/system/resource/print'" <switch>.
  • Idempotency: run the play twice; the second run must report changed=0 for that domain.
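The connectivity check above can also be expressed as a tiny play, which is handy once more than one switch is in the inventory. This is a minimal sketch, not part of the plan: the file name `smoke_test.yml` is hypothetical, and it assumes the `/system/resource/print` reply contains the word `version`.

```yaml
---
# smoke_test.yml (hypothetical file name; not one of the plan's deliverables)
- name: Connectivity smoke test
  hosts: mikrotik
  gather_facts: false
  tasks:
    - name: Ask the switch for its resource summary
      community.routeros.command:
        commands:
          - /system/resource/print
      register: resource
      changed_when: false

    - name: Fail early if the reply does not look like RouterOS output
      ansible.builtin.assert:
        that:
          # Assumption: the resource summary includes a "version" field.
          - resource.stdout[0] is search('version')
        fail_msg: "Unexpected reply from {{ inventory_hostname }}"
```

Run it with `ansible-playbook smoke_test.yml --limit crs310-maker` once Phase 0 is complete.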

Device-dependent tasks (Phase 3 onward) require the switch to be prepared and reachable (Phase 0). Until then, only static checks pass — that is expected and fine.

⚠️ Lockout safety: When applying VLAN/bridge or service changes, keep an independent recovery channel open (WinBox MAC-telnet, or serial console) so a mistake in management reachability doesn't strand the switch. Enable vlan-filtering last, after the management path is proven.


Phase 0: Prerequisites (manual, out-of-band — do before Phase 3)

These are not code tasks; they gate the device-dependent phases.

  • 0.1 Create the empty repo MakerFLOSS_Mikrotik on forgejo.makerfloss.eu (no README/license, so the first push isn't rejected).
  • 0.2 Confirm boot OS is RouterOS (not SwOS) on the CRS310. Switch in RouterBOOT if needed.
  • 0.3 Upgrade + pin firmware: update RouterOS and RouterBOOT to a known-good stable (e.g. latest 7.x stable). Record the exact version string for switch_firmware_target (used in Task 9).
  • 0.4 Factory-reset to NO default configuration (/system/reset-configuration no-defaults=yes or Netinstall). Ansible will own the whole config.
  • 0.5 First contact: connect a laptop and reach the switch. After a no-defaults reset the switch has no IP address, so assign a temporary one via a WinBox MAC session (e.g. /ip/address/add address=192.168.88.1/24 interface=ether1) and make sure SSH is on (/ip/service/enable ssh). Confirm ssh admin@<ip> works.
  • 0.6 Record identity facts into a scratch note for Task 3 host_vars: serial, base MAC, model, RouterOS version, the temporary mgmt IP/port.
  • 0.7 Physical: fit the SFP+ module/DAC for the 10G uplink; confirm PSU/mounting.

Phase 1: Repo scaffolding (no device required)

Task 1: direnv, ansible.cfg, lint configs, requirements

Files:

  • Create: .envrc, ansible.cfg, .ansible-lint, .yamllint, requirements.txt, requirements.yml, .gitignore

  • Step 1: Create .gitignore

.venv/
*.retry
__pycache__/
*.pyc
.DS_Store
  • Step 2: Create .envrc (verbatim from AnsibleBaobabV4)
# Create .venv automatically if it doesn't exist
if [ ! -d .venv ]; then
  python3 -m venv .venv
  .venv/bin/python -m pip install -U pip setuptools wheel
fi

# Activate the environment manually (avoids Python 3.13 deprecation warning)
export VIRTUAL_ENV=$PWD/.venv
PATH_add .venv/bin
  • Step 3: Create ansible.cfg
[defaults]
inventory = inventories/prod/hosts.yml
roles_path = roles:~/.ansible/roles
collections_path = ~/.ansible/collections
host_key_checking = False
retry_files_enabled = False
interpreter_python = auto_silent
nocows = 1
timeout = 30
stdout_callback = yaml
bin_ansible_callbacks = True
vault_identity_list = makerfloss@~/.ansible/vault-keys/makerfloss.txt

[persistent_connection]
command_timeout = 60
connect_timeout = 60

[ssh_connection]
pipelining = True
  • Step 4: Create requirements.txt
# Core Ansible
ansible==10.3.0

# Linting & validation
ansible-lint==24.7.0
yamllint==1.35.1

# Network connection plugins / SCP for SSH key transfer to RouterOS
paramiko>=3.4.0
scp>=0.15.0
  • Step 5: Create requirements.yml
---
collections:
  - name: community.routeros
    version: ">=3.0.0,<4.0.0"
  - name: ansible.netcommon
    version: ">=6.0.0,<8.0.0"
  • Step 6: Create .ansible-lint
---
profile: production
skip_list:
  - line-length
  - no-changed-when
exclude_paths:
  - .venv/
  - backups/
  • Step 7: Create .yamllint
---
extends: default
rules:
  line-length: disable
  comments:
    min-spaces-from-content: 1
  truthy:
    allowed-values: ["true", "false", "yes", "no"]
ignore: |
  .venv/
  backups/
  • Step 8: Bootstrap the venv and install

Run:

cd ~/Projects/MakerFLOSS_Mikrotik
direnv allow 2>/dev/null || true   # .envrc only creates .venv when the direnv shell hook next fires
[ -d .venv ] || python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
ansible-galaxy collection install -r requirements.yml

Expected: ansible, ansible-lint, yamllint installed; community.routeros + ansible.netcommon installed.

  • Step 9: Verify yamllint passes on the config files

Run: yamllint . Expected: no errors.

  • Step 10: Commit
git add .gitignore .envrc ansible.cfg requirements.txt requirements.yml .ansible-lint .yamllint
git commit -m "chore: repo scaffolding (direnv, ansible.cfg, lint, requirements)"

Task 2: Vault identity + inventory + connection group_vars

Files:

  • Create: ~/.ansible/vault-keys/makerfloss.txt (outside repo), inventories/prod/hosts.yml, group_vars/mikrotik.yml, group_vars/all.yml

  • Step 1: Create the vault key (outside the repo)

Run:

mkdir -p ~/.ansible/vault-keys
( umask 077; openssl rand -base64 48 > ~/.ansible/vault-keys/makerfloss.txt )
chmod 600 ~/.ansible/vault-keys/makerfloss.txt

Expected: a 600-perm key file exists. (This is the makerfloss identity from ansible.cfg.)

  • Step 2: Create inventories/prod/hosts.yml

Replace crs310-maker and the ansible_host IP with the real values from Phase 0.6.

---
all:
  children:
    mikrotik:
      hosts:
        crs310-maker:
          ansible_host: 192.168.88.1   # temp mgmt IP until Task 7 sets the real one
  • Step 3: Create group_vars/mikrotik.yml (connection/platform vars)
---
ansible_connection: ansible.netcommon.network_cli
ansible_network_os: community.routeros.routeros
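ansible_user: sjat   # named admin created by play_bootstrap.yml; the bootstrap run overrides this with -e ansible_user=admin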
ansible_ssh_private_key_file: "~/.ssh/id_ed25519"

# Domain enable-flags (day-2 play). Override per-host if needed.
switch_identity_enabled: true
switch_users_enabled: true
switch_vlans_enabled: true
switch_backup_enabled: true
switch_firmware_enabled: false   # opt-in; upgrades are disruptive
  • Step 4: Create group_vars/all.yml (placeholder for shared, non-secret defaults)
---
# Shared non-secret defaults across all hosts go here.
# Secrets live in the vault (see host_vars / a vaulted file), not in this file.
org_name: "MakerFLOSS"
  • Step 5: Verify inventory parses

Run: ansible-inventory --graph Expected: shows the @mikrotik group with crs310-maker listed under it.

  • Step 6: Commit
git add inventories group_vars
git commit -m "feat: inventory, connection group_vars, makerfloss vault identity"

Phase 2: Role skeleton

Task 3: Role skeleton + host_vars + meta

Files:

  • Create: roles/makerfloss.mikrotik_switch/defaults/main.yml

  • Create: roles/makerfloss.mikrotik_switch/meta/main.yml

  • Create: roles/makerfloss.mikrotik_switch/tasks/main.yml

  • Create: host_vars/crs310-maker.yml

  • Create: play_switch.yml

  • Step 1: Create roles/makerfloss.mikrotik_switch/meta/main.yml

---
galaxy_info:
  role_name: mikrotik_switch
  namespace: makerfloss
  author: sjat
  description: Configure a MikroTik RouterOS switch (CRS310) over SSH.
  license: MIT
  min_ansible_version: "2.17"
  platforms: []
dependencies: []
  • Step 2: Create roles/makerfloss.mikrotik_switch/defaults/main.yml (mechanism + PLACEHOLDER topology)
---
# ----- Identity / management -----
switch_identity_name: "{{ inventory_hostname }}"
switch_mgmt_vlan_id: 99
switch_mgmt_address: "192.168.88.1/24"   # PLACEHOLDER — override in host_vars
switch_mgmt_gateway: "192.168.88.254"    # PLACEHOLDER — override in host_vars
switch_dns_servers: "192.168.88.254"
switch_ntp_servers: "192.168.88.254"

# Services to disable for hardening (winbox kept on by default for recovery)
switch_disabled_services:
  - telnet
  - ftp
  - www
  - www-ssl
  - api
  - api-ssl
switch_ssh_port: 22

# ----- Users -----
switch_admin_user: "sjat"
switch_admin_group: "full"
switch_admin_ssh_pubkey_file: "~/.ssh/id_ed25519.pub"
switch_disable_default_admin: true

# ----- VLAN / bridge / ports (PLACEHOLDER example) -----
# Real topology is defined in host_vars/<switch>.yml.
switch_bridge_name: "bridge"
switch_vlans:
  - { id: 99, name: "mgmt" }
  - { id: 10, name: "members" }
switch_bridge_ports:
  # ether1..ether8 = 2.5GbE access ports; sfp-sfpplus1/2 = 10G uplinks
  - { interface: "ether1", pvid: 10, mode: access }
  - { interface: "sfp-sfpplus1", pvid: 1, mode: trunk, tagged_vlans: [99, 10] }

# ----- Firmware -----
switch_firmware_target: ""   # set in host_vars when opting into upgrades
  • Step 3: Create roles/makerfloss.mikrotik_switch/tasks/main.yml (domain dispatch)
---
- name: Identity, management and services
  ansible.builtin.import_tasks: identity.yml
  when: switch_identity_enabled | bool
  tags: [identity]

- name: Users and SSH keys
  ansible.builtin.import_tasks: users.yml
  when: switch_users_enabled | bool
  tags: [users]

- name: VLANs, bridge and ports
  ansible.builtin.import_tasks: vlans.yml
  when: switch_vlans_enabled | bool
  tags: [vlans]

- name: Backup configuration
  ansible.builtin.import_tasks: backup.yml
  when: switch_backup_enabled | bool
  tags: [backup]

- name: Firmware upgrade
  ansible.builtin.import_tasks: firmware.yml
  when: switch_firmware_enabled | bool
  tags: [firmware]
  • Step 4: Create stub identity.yml, users.yml, vlans.yml, backup.yml, firmware.yml

Each stub (replaced in later tasks) is just:

---
- name: Placeholder
  ansible.builtin.debug:
    msg: "not yet implemented"

Create all five files in roles/makerfloss.mikrotik_switch/tasks/ with that content.

  • Step 5: Create host_vars/crs310-maker.yml (REAL values from Phase 0.6)
---
# Identity facts recorded during Phase 0.6 (edit to match the device)
switch_identity_name: "crs310-maker"
switch_mgmt_vlan_id: 99
switch_mgmt_address: "10.0.99.2/24"      # EDIT: real mgmt IP
switch_mgmt_gateway: "10.0.99.1"         # EDIT: real gateway
switch_dns_servers: "10.0.99.1"
switch_ntp_servers: "10.0.99.1"

switch_admin_user: "sjat"

# Real VLAN/port topology (EDIT to the makerspace plan when known)
switch_vlans:
  - { id: 99, name: "mgmt" }
  - { id: 10, name: "members" }
switch_bridge_ports:
  - { interface: "ether1", pvid: 10, mode: access }
  - { interface: "ether2", pvid: 10, mode: access }
  - { interface: "sfp-sfpplus1", pvid: 1, mode: trunk, tagged_vlans: [99, 10] }

# Firmware (opt-in)
# switch_firmware_enabled: true
# switch_firmware_target: "7.x.y"   # EDIT to the version pinned in Phase 0.3
  • Step 6: Create play_switch.yml
---
- name: Configure MikroTik switches (day-2, key auth)
  hosts: mikrotik
  gather_facts: false
  roles:
    - makerfloss.mikrotik_switch
  • Step 7: Verify syntax + lint

Run:

ansible-playbook play_switch.yml --syntax-check
yamllint .
ansible-lint

Expected: syntax OK; yamllint clean; ansible-lint clean (fix any findings).

  • Step 8: Commit
git add roles host_vars play_switch.yml
git commit -m "feat: role skeleton, host_vars, day-2 play (stubbed domains)"

Phase 3: Bootstrap play (device-dependent — needs Phase 0 done)

Task 4: First-contact bootstrap — create user, import SSH key

Files:

  • Create: play_bootstrap.yml

  • Create: group_vars/mikrotik.vault.yml (vaulted admin password)

  • Step 1: Create the vaulted admin password file

Run:

ansible-vault create group_vars/mikrotik.vault.yml

Put in it:

---
vault_switch_admin_password: "CHOOSE-A-STRONG-PASSWORD"

Expected: file is encrypted (head -1 shows $ANSIBLE_VAULT).

  • Step 2: Create play_bootstrap.yml

Run with password auth the first time: ansible-playbook play_bootstrap.yml -e ansible_user=admin --ask-pass. net_put copies the public key file to the device over SCP, then RouterOS imports it.

---
- name: Bootstrap MikroTik switch (first contact, password auth)
  hosts: mikrotik
  gather_facts: false
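  vars_files:
    # Loaded explicitly: group_vars only auto-loads files named after a group
    # (mikrotik.yml or mikrotik/), so mikrotik.vault.yml is not picked up on its own.
    - group_vars/mikrotik.vault.yml
  vars:
    pubkey_local: "{{ switch_admin_ssh_pubkey_file | default('~/.ssh/id_ed25519.pub') }}"
    pubkey_remote: "id_ansible.pub"
  tasks:
    - name: Create named admin user (idempotent)
      community.routeros.command:
        commands:
          - >-
            :if ([:len [/user find name="{{ switch_admin_user }}"]] = 0) do={
            /user add name="{{ switch_admin_user }}" group="{{ switch_admin_group }}"
            password="{{ vault_switch_admin_password }}" }
      register: user_create
      changed_when: true
      no_log: true   # the command line contains the vaulted password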

    - name: Copy operator public key to the switch
      ansible.netcommon.net_put:
        src: "{{ pubkey_local }}"
        dest: "{{ pubkey_remote }}"

    - name: Import the SSH public key for the admin user
      community.routeros.command:
        commands:
          - /user/ssh-keys/import public-key-file="{{ pubkey_remote }}" user="{{ switch_admin_user }}"
      register: key_import
      changed_when: true

    - name: Ensure SSH service is enabled
      community.routeros.command:
        commands:
          - /ip/service/set ssh disabled=no port={{ switch_ssh_port | default(22) }}
      changed_when: true
  • Step 3: Syntax check

Run: ansible-playbook play_bootstrap.yml --syntax-check Expected: OK.

  • Step 4: Run bootstrap against the switch (password auth)

Run:

ansible-playbook play_bootstrap.yml -e ansible_user=admin --ask-pass

Expected: user created, key file copied, key imported, SSH enabled. (Keep a WinBox MAC session open per the lockout note.)

  • Step 5: Prove key login works

Run:

ansible -m community.routeros.command -a "commands='/user/print'" crs310-maker

Expected: succeeds using ~/.ssh/id_ed25519 (no password prompt), and lists your named user.

  • Step 6: Commit
git add play_bootstrap.yml group_vars/mikrotik.vault.yml
git commit -m "feat: first-contact bootstrap play (named admin + SSH key import)"

Phase 4: Domain tasks (device-dependent — idempotency-verified)

For every task below: run ansible-playbook play_switch.yml --tags <domain> --limit crs310-maker twice; the second run must report changed=0 (or all changed_when: false). RouterOS :if ([:len [... find ...]] = 0) guards make add idempotent.

Task 5: identity.yml — identity, DNS/NTP, service hardening

Files:

  • Modify: roles/makerfloss.mikrotik_switch/tasks/identity.yml

  • Step 1: Replace identity.yml with the real implementation

---
- name: Set system identity
  community.routeros.command:
    commands:
      - /system/identity/set name="{{ switch_identity_name }}"
  changed_when: false

- name: Configure DNS
  community.routeros.command:
    commands:
      - /ip/dns/set servers="{{ switch_dns_servers }}" allow-remote-requests=no
  changed_when: false

- name: Configure NTP client
  community.routeros.command:
    commands:
      - /system/ntp/client/set enabled=yes servers="{{ switch_ntp_servers }}"
  changed_when: false

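- name: Disable unused services
  community.routeros.command:
    # The backreference is written '\\1': Jinja2 string literals process backslash
    # escapes, so a bare '\1' would not reach the regex as a group reference.
    commands: >-
      {{ switch_disabled_services
         | map('regex_replace', '^(.*)$', '/ip/service/set \\1 disabled=yes')
         | list }}
  changed_when: false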

- name: Set SSH service port
  community.routeros.command:
    commands:
      - /ip/service/set ssh disabled=no port={{ switch_ssh_port }}
  changed_when: false
  • Step 2: Run twice, assert idempotent

Run (twice):

ansible-playbook play_switch.yml --tags identity --limit crs310-maker

Expected: completes cleanly both runs; /system/identity/print shows the new name; disabled services show X (disabled).
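Because every identity task reports `changed_when: false`, a clean run alone does not prove the settings landed. A minimal read-back sketch, which could be run ad hoc or appended temporarily while testing (the task names and the `readback` variable are illustrative, not part of the plan):

```yaml
- name: Read back identity and service state
  community.routeros.command:
    commands:
      - /system/identity/print
      - /ip/service/print
  register: readback
  changed_when: false

- name: Show what the switch reports
  ansible.builtin.debug:
    var: readback.stdout_lines
```

Check the output by eye for the new identity name and for the disabled flag on each service listed in switch_disabled_services.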

  • Step 3: Commit
git add roles/makerfloss.mikrotik_switch/tasks/identity.yml
git commit -m "feat(identity): identity, DNS, NTP, service hardening"

Task 6: users.yml — ensure admin user, key, disable default admin

Files:

  • Modify: roles/makerfloss.mikrotik_switch/tasks/users.yml

  • Step 1: Replace users.yml

---
- name: Ensure named admin user exists
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/user find name="{{ switch_admin_user }}"]] = 0) do={
        /user add name="{{ switch_admin_user }}" group="{{ switch_admin_group }}" }
  changed_when: false

- name: Disable the default admin user
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/user find name="admin"]] > 0) do={
        /user/set admin disabled=yes }
  when: switch_disable_default_admin | bool
  changed_when: false
  • Step 2: Run twice, assert idempotent

Run (twice):

ansible-playbook play_switch.yml --tags users --limit crs310-maker

Expected: clean both runs. Before this lands, confirm key login as the named user works (Task 4 Step 5), or disabling admin could lock you out.
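One way to make that lockout check mechanical is a guard task placed before "Disable the default admin user". This is a sketch under the assumption that the day-2 connection user is exposed as `ansible_user`; the task name and message are illustrative.

```yaml
- name: Refuse to disable admin while still connecting as admin
  ansible.builtin.assert:
    that:
      # If the play itself logs in as "admin", disabling it would cut off Ansible.
      - ansible_user != "admin"
    fail_msg: >-
      ansible_user is still "admin"; switch the connection to the named user
      and prove key login works before disabling the default account.
  when: switch_disable_default_admin | bool
```

The guard costs nothing on a healthy setup and stops the play before the one step in this task that cannot be undone over SSH.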

  • Step 3: Commit
git add roles/makerfloss.mikrotik_switch/tasks/users.yml
git commit -m "feat(users): ensure named admin, disable default admin"

Task 7: vlans.yml — VLAN-aware bridge, ports, mgmt interface

Files:

  • Modify: roles/makerfloss.mikrotik_switch/tasks/vlans.yml

Ordering matters to avoid lockout: create bridge (filtering OFF) → add ports → define VLANs → add mgmt VLAN interface + IP → enable vlan-filtering LAST.

  • Step 1: Replace vlans.yml
---
- name: Create VLAN-aware bridge (filtering off initially)
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/interface/bridge/find name="{{ switch_bridge_name }}"]] = 0) do={
        /interface/bridge/add name="{{ switch_bridge_name }}" vlan-filtering=no }
  changed_when: false

- name: Add bridge ports with PVIDs
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/interface/bridge/port/find interface="{{ item.interface }}"]] = 0) do={
        /interface/bridge/port/add bridge="{{ switch_bridge_name }}"
        interface="{{ item.interface }}" pvid={{ item.pvid }} }
        else={ /interface/bridge/port/set
        [find interface="{{ item.interface }}"] pvid={{ item.pvid }} }
  loop: "{{ switch_bridge_ports }}"
  loop_control:
    label: "{{ item.interface }}"
  changed_when: false

- name: Define bridge VLANs (tagged/untagged membership)
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/interface/bridge/vlan/find vlan-ids={{ item.id }}]] = 0) do={
        /interface/bridge/vlan/add bridge="{{ switch_bridge_name }}" vlan-ids={{ item.id }}
        tagged="{{ ([switch_bridge_name] + (switch_bridge_ports
        | selectattr('mode','equalto','trunk')
        | selectattr('tagged_vlans','defined')
        | selectattr('tagged_vlans','contains', item.id)
        | map(attribute='interface') | list)) | join(',') }}"
        untagged="{{ switch_bridge_ports
        | selectattr('mode','equalto','access')
        | selectattr('pvid','equalto', item.id)
        | map(attribute='interface') | list | join(',') }}" }
  loop: "{{ switch_vlans }}"
  loop_control:
    label: "vlan {{ item.id }}"
  changed_when: false

- name: Create management VLAN interface
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/interface/vlan/find name="vlan-mgmt"]] = 0) do={
        /interface/vlan/add name="vlan-mgmt" interface="{{ switch_bridge_name }}"
        vlan-id={{ switch_mgmt_vlan_id }} }
  changed_when: false

- name: Assign management IP address
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/ip/address/find interface="vlan-mgmt"]] = 0) do={
        /ip/address/add address="{{ switch_mgmt_address }}" interface="vlan-mgmt" }
  changed_when: false

- name: Set default gateway route
  community.routeros.command:
    commands:
      - >-
        :if ([:len [/ip/route/find dst-address="0.0.0.0/0"]] = 0) do={
        /ip/route/add dst-address=0.0.0.0/0 gateway="{{ switch_mgmt_gateway }}" }
  changed_when: false

- name: Enable VLAN filtering (LAST — verify mgmt reachability first)
  community.routeros.command:
    commands:
      - /interface/bridge/set "{{ switch_bridge_name }}" vlan-filtering=yes
  changed_when: false
  • Step 2: Run twice, assert idempotent — WITH a recovery channel open

Run (twice), keeping WinBox MAC session open:

ansible-playbook play_switch.yml --tags vlans --limit crs310-maker

Expected: clean both runs. Verify /interface/bridge/vlan/print shows correct tagged/untagged sets and you can still reach the mgmt IP after vlan-filtering=yes.
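The "verify mgmt reachability first" step can be sketched as a task placed immediately before "Enable VLAN filtering". It assumes the control node can route to the management VLAN and that SSH listens on switch_ssh_port at the management address; if the control node is still attached via the temporary address, the check will fail by design.

```yaml
- name: Confirm the management address answers on SSH before filtering is enabled
  ansible.builtin.wait_for:
    # Strip the prefix length from e.g. "10.0.99.2/24" to get the bare address.
    host: "{{ switch_mgmt_address | split('/') | first }}"
    port: "{{ switch_ssh_port }}"
    timeout: 30
  delegate_to: localhost
```

A failure here stops the play with filtering still off, which leaves the existing path to the switch untouched.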

  • Step 3: Commit
git add roles/makerfloss.mikrotik_switch/tasks/vlans.yml
git commit -m "feat(vlans): VLAN-aware bridge, ports, mgmt interface"

Task 8: backup.yml — export + binary backup, fetch into repo

Files:

  • Modify: roles/makerfloss.mikrotik_switch/tasks/backup.yml

  • Create: play_backup.yml

  • Create: backups/.gitkeep

  • Step 1: Replace backup.yml

---
- name: Generate a config export on the device
  community.routeros.command:
    commands:
      - /export file=export
  changed_when: false

- name: Generate a binary system backup on the device
  community.routeros.command:
    commands:
      - /system/backup/save name=backup dont-encrypt=yes
  changed_when: false

- name: Fetch the export file into the repo
  ansible.netcommon.net_get:
    src: "export.rsc"
    dest: "{{ playbook_dir }}/backups/{{ inventory_hostname }}/export.rsc"

- name: Fetch the binary backup into the repo
  ansible.netcommon.net_get:
    src: "backup.backup"
    dest: "{{ playbook_dir }}/backups/{{ inventory_hostname }}/backup.backup"
  • Step 2: Create play_backup.yml
---
- name: Back up MikroTik switch configuration
  hosts: mikrotik
  gather_facts: false
  tasks:
    - name: Ensure local backup directory exists
      ansible.builtin.file:
        path: "{{ playbook_dir }}/backups/{{ inventory_hostname }}"
        state: directory
        mode: "0755"
      delegate_to: localhost

    - name: Run backup tasks
      ansible.builtin.include_role:
        name: makerfloss.mikrotik_switch
        tasks_from: backup.yml
  • Step 3: Create backups/.gitkeep (empty file) so the dir exists.

  • Step 4: Run the backup play

Run:

ansible-playbook play_backup.yml --limit crs310-maker

Expected: backups/crs310-maker/export.rsc and backup.backup appear locally and are non-empty.
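The "non-empty" expectation can be asserted rather than eyeballed. A minimal sketch that could follow the backup tasks in play_backup.yml (the `export_stat` variable name is illustrative):

```yaml
    - name: Stat the fetched export
      ansible.builtin.stat:
        path: "{{ playbook_dir }}/backups/{{ inventory_hostname }}/export.rsc"
      delegate_to: localhost
      register: export_stat

    - name: Assert the export exists and is non-empty
      ansible.builtin.assert:
        that:
          - export_stat.stat.exists
          - export_stat.stat.size > 0
        fail_msg: "export.rsc for {{ inventory_hostname }} is missing or empty"
```

The same pair of tasks can be repeated for backup.backup if the binary backup should be checked too.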

  • Step 5: Commit (export only — binary backup may contain secrets)
echo 'backups/**/*.backup' >> .gitignore
git add roles/makerfloss.mikrotik_switch/tasks/backup.yml play_backup.yml backups/.gitkeep backups/crs310-maker/export.rsc .gitignore
git commit -m "feat(backup): export + binary backup, fetch into repo"

Task 9: firmware.yml — RouterOS/RouterBOOT upgrade to pinned target

Files:

  • Modify: roles/makerfloss.mikrotik_switch/tasks/firmware.yml

Opt-in only (switch_firmware_enabled: true + switch_firmware_target set in host_vars). Upgrades reboot the switch — run deliberately, with a recovery channel.

  • Step 1: Replace firmware.yml
---
- name: Assert a firmware target is set
  ansible.builtin.assert:
    that:
      - switch_firmware_target | length > 0
    fail_msg: "switch_firmware_target must be set in host_vars to run firmware upgrades."

- name: Read current RouterOS version
  community.routeros.facts:
  register: ros_facts

- name: Upgrade RouterOS to the target channel and reboot
  community.routeros.command:
    commands:
      - /system/package/update/set channel=stable
      - /system/package/update/install
  # ansible_net_version may carry a channel suffix after the number
  # (e.g. "<version> (stable)"); compare only the numeric part.
  when: (ansible_net_version.split(' ') | first) is version(switch_firmware_target, '<')
  changed_when: true

- name: Pause for device reboot
  ansible.builtin.wait_for_connection:
    delay: 30
    timeout: 300
  when: (ansible_net_version.split(' ') | first) is version(switch_firmware_target, '<')

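- name: Upgrade RouterBOOT firmware to match RouterOS
  community.routeros.command:
    commands:
      - /system/routerboard/upgrade
  # Same guard as the RouterOS upgrade, so a no-op run does not reboot the switch.
  when: (ansible_net_version.split(' ') | first) is version(switch_firmware_target, '<')
  changed_when: true

- name: Reboot to apply RouterBOOT upgrade
  community.routeros.command:
    commands:
      - /system/reboot
  when: (ansible_net_version.split(' ') | first) is version(switch_firmware_target, '<')
  changed_when: true
  ignore_errors: true   # connection drops on reboot; expected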
  • Step 2: Syntax + lint only (do NOT auto-run upgrades in CI)

Run:

ansible-playbook play_switch.yml --syntax-check
ansible-lint

Expected: clean.

  • Step 3: (Manual, optional) run the upgrade deliberately

Run:

ansible-playbook play_switch.yml --tags firmware --limit crs310-maker \
  -e switch_firmware_enabled=true

Expected: upgrades only if current < switch_firmware_target; switch reboots and comes back.
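Before running the real upgrade, the version gate can be previewed without touching the device. A sketch, assuming `ansible_net_version` is populated by `community.routeros.facts` and may include a trailing channel label (the task names are illustrative):

```yaml
- name: Gather RouterOS facts
  community.routeros.facts:

- name: Report whether an upgrade would run
  ansible.builtin.debug:
    msg: >-
      running={{ ansible_net_version }}
      target={{ switch_firmware_target }}
      upgrade_needed={{ (ansible_net_version.split(' ') | first)
      is version(switch_firmware_target, '<') }}
  # Skip the comparison when no target has been pinned yet.
  when: switch_firmware_target | length > 0
```

If `upgrade_needed` is not what you expect, fix switch_firmware_target in host_vars before enabling the firmware domain.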

  • Step 4: Commit
git add roles/makerfloss.mikrotik_switch/tasks/firmware.yml
git commit -m "feat(firmware): opt-in RouterOS + RouterBOOT upgrade to pinned target"

Phase 5: Docs and publish

Task 10: README, role README, CLAUDE.md, push to Forgejo

Files:

  • Create: README.md, roles/makerfloss.mikrotik_switch/README.md, CLAUDE.md

  • Step 1: Create README.md covering: purpose, prerequisites (Phase 0 checklist), setup (direnv allow, pip install, ansible-galaxy install), bootstrap (play_bootstrap.yml --ask-pass), day-2 (play_switch.yml), backup (play_backup.yml), and the lockout-safety note.

  • Step 2: Create roles/makerfloss.mikrotik_switch/README.md documenting every variable in defaults/main.yml, the enable-flags, and the switch_bridge_ports/switch_vlans data shapes with an example.

  • Step 3: Create CLAUDE.md — short project guide: tech stack, structure, essential commands (lint, syntax-check, bootstrap, day-2, backup), the idempotency rule, and the lockout-safety rule.

  • Step 4: Final static verification

Run:

yamllint . && ansible-lint && ansible-playbook play_switch.yml --syntax-check

Expected: all clean.

  • Step 5: Add remote and push

Run:

git remote add origin git@forgejo.makerfloss.eu:<owner>/MakerFLOSS_Mikrotik.git
git add README.md roles/makerfloss.mikrotik_switch/README.md CLAUDE.md
git commit -m "docs: README, role README, CLAUDE.md"
git push -u origin main

Expected: repo populated on forgejo.makerfloss.eu.


Self-review checklist (run before execution)

  • Spec coverage: identity/services (Task 5), users/keys (Tasks 4,6), VLANs/bridge/ports (Task 7), backups (Task 8), firmware (Task 9), bring-over conventions (Tasks 1–2), separate vault (Task 2), placeholder topology overridable in host_vars (Tasks 3,7). ✔
  • Open items from the spec are surfaced in the plan: firmware target (Phase 0.3 / Task 9), winbox on/off (switch_disabled_services default keeps winbox), admin username (switch_admin_user), backup scheduling (on-demand play_backup.yml; RouterOS scheduler left as a future enhancement).
  • Idempotency is explicitly tested (run-twice) on every device-touching task.
  • Lockout safety called out at the top and on Tasks 6 and 7.

Notes / risks to validate during execution

  • RouterOS version drift: exact CLI syntax (NTP servers= property, ssh-keys/import path) is RouterOS-7 specific; verify against the pinned version from Phase 0.3 and adjust.
  • net_put/net_get over network_cli: depends on SCP being available on the RouterOS SSH service; if it fails, fall back to importing the key by pasting its contents via /user/ssh-keys/... or enabling SCP.
  • changed_when: false is used widely because the command module can't detect RouterOS state changes; idempotency comes from the :if [find] guards. Revisit if you want accurate change reporting (parse command output).
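The "parse command output" idea in the last note could look like the following read-then-set pair for the system identity. It is a sketch, not the plan's implementation: the substring test is deliberately crude (a name that is a prefix of the current one would be treated as already set), and it assumes `/system/identity/print` echoes the current name.

```yaml
- name: Read current identity
  community.routeros.command:
    commands:
      - /system/identity/print
  register: identity_now
  changed_when: false

- name: Set system identity only when it differs
  community.routeros.command:
    commands:
      - /system/identity/set name="{{ switch_identity_name }}"
  # Runs, and reports "changed", only when the desired name is absent from the output.
  when: switch_identity_name not in identity_now.stdout[0]
  changed_when: true
```

With this shape the second run skips the set task, so changed=0 reflects device state rather than a blanket `changed_when: false`.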

Carry-over notes from the skeleton code review (Tasks 1–3, done 2026-06-07)

The no-device tasks (1–3) are implemented, reviewed, and committed on branch feat/initial-scaffolding. The code-quality review of the role skeleton raised these points to handle WHEN the device task files (Tasks 5–9) are written:

  • switch_ssh_port (default 22): the identity task will set the SSH port. If the device was manually moved to a non-standard port before Ansible manages it, the first run resets it to 22 and the connection drops. Confirm the live port matches before the identity task runs, or override switch_ssh_port in host_vars.
  • switch_bridge_name / switch_admin_group: these default to the CRS310 factory values (bridge / full) and are NOT overridden in host_vars. Correct for this one device; if the bridge/group name ever differs, the VLAN and users tasks silently target the wrong object. Add explicit host_vars overrides if a second device is ever onboarded.
  • Trunk pvid: 1 (sfp-sfpplus1): untagged frames on the uplink land in VLAN 1. In a hardened VLAN design VLAN 1 is usually unused — when writing vlans.yml, decide deliberately whether the trunk should accept untagged traffic at all, and comment intent.
  • host_vars # EDIT: placeholders: switch_mgmt_address/gateway/dns/ntp in host_vars/crs310-maker.yml hold plausible 10.0.99.x placeholders. Replace with the real values from the field guide (Step 7) and remove the # EDIT comments so it's unambiguous they were updated.
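The first two carry-over points amount to making implicit defaults explicit per host. A minimal sketch of the additions to host_vars/crs310-maker.yml, with values copied from the role defaults (adjust them if the live device differs):

```yaml
# host_vars/crs310-maker.yml (additions; values shown are the current role defaults)
switch_ssh_port: 22            # must match the port the switch listens on today
switch_bridge_name: "bridge"   # explicit, so a second device cannot inherit it silently
switch_admin_group: "full"
```

Pinning these per host keeps the identity, VLAN and users tasks pointed at the intended objects if a second switch is ever added to the inventory.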