GCodeOverlay/docs/superpowers/specs/2026-06-08-gcode-overlay-design.md
sjat 08767cf821 Add G-code overlay design spec
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 21:55:06 +02:00

G-Code Overlay — Design

Date: 2026-06-08
Status: Approved (pending spec review)

Purpose

At the makerspace, a fixed camera mounted on the wall films the CNC router table top-down, and the feed is restreamed to monitors around the space. This project is a web app that overlays a G-code toolpath on top of that camera stream, so an operator can:

  • Reality-check a job — see where the machine will actually cut before/while it runs.
  • Plan fixturing — see where the tool won't go, to decide where screws/clamps can safely be placed.

A user opens a local G-code file in the browser, the toolpath is drawn over the live video, and they align it to the real material.

Scope

In scope (v1):

  • Load a local G-code file (no upload; parsed in-browser).
  • Draw cutting paths (G1/G2/G3) and rapid/travel moves (G0, styled distinctly) over the video.
  • One-time camera→machine calibration (perspective transform), shared via a served config file.
  • Per-job alignment of the toolpath to the material (drag+rotate, click-to-set-origin, numeric offset).

Out of scope (v1), designed to add later:

  • Plunge/screw-placement highlighting.
  • Depth cues (color/weight by Z or cut order).
  • Broadcasting the overlay to all makerspace monitors (requires shared session state / backend).
  • Lens-distortion correction.
  • Multi-camera support.

Audience: local viewer only. The overlay appears on the screen of whoever loaded the G-code; the makerspace-wide monitors keep showing the existing raw stream. No shared session state.

Architecture

A pure client-side static single-page app — TypeScript + Vite + Canvas 2D, no backend. The page stacks three layers:

  1. Video layer: the camera stream, shown in an <img> (MJPEG) or <video> (HLS/WebRTC) element. The exact format is still to be confirmed early in implementation (see Risks / open questions). The overlay never reads the stream's pixels, so there is no CORS dependency and no proxy is needed.
  2. Overlay canvas: a transparent <canvas> sized to the video element, on which the toolpaths are drawn.
  3. UI: file open, calibration panel, alignment controls.

Transforms

Two transforms do all the geometric work:

  • Calibration homography H (fixed, one-time): maps machine-mm → image-px. A 3×3 projective transform. Calibrated once by jogging the spindle to known machine coordinates and clicking the tip in the video — this ties the camera image directly to the machine's own coordinate system (the same system the G-code lives in), collapsing "camera→table" and "table→machine" into one transform with no origin guesswork.
  • Per-job alignment A (per session): maps G-code work-coords → machine-mm (rotation + translation), accounting for where the operator zeroed the material this job.

Render pipeline:

G-code text
  → parse → polylines in work-mm (arcs flattened to segments)
  → apply A (per-job align)        → machine-mm
  → apply H (calibration homography) → image-px
  → draw on overlay canvas

Why transform points in JS rather than the canvas? A camera-perspective transform is projective, and neither Canvas 2D nor SVG can apply a projective transform natively (both are affine-only). Points are therefore transformed in JS, then drawn as polylines. This also makes Canvas 2D the natural, dependency-light choice.
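As a minimal sketch of that point-wise step, the function below maps one machine-mm point to image-px through H, including the perspective divide that an affine canvas transform cannot express. The `Vec2` object shape, the `Mat3` tuple type, and the function names are assumptions made for illustration; the spec only fixes that H is a 3×3 matrix stored as nested arrays.

```typescript
// Assumed shapes: the spec names Vec2 but does not define it.
type Vec2 = { x: number; y: number };
// Row-major 3x3 matrix, matching the nested-array "homography" field in config.json.
type Mat3 = [[number, number, number], [number, number, number], [number, number, number]];

/** Map a machine-mm point to image-px through H, including the perspective divide. */
function applyHomography(H: Mat3, p: Vec2): Vec2 {
  const w = H[2][0] * p.x + H[2][1] * p.y + H[2][2];
  if (Math.abs(w) < 1e-12) {
    throw new Error("point maps to infinity under H");
  }
  return {
    x: (H[0][0] * p.x + H[0][1] * p.y + H[0][2]) / w,
    y: (H[1][0] * p.x + H[1][1] * p.y + H[1][2]) / w,
  };
}

/** Project a whole polyline; the result can be drawn with plain moveTo/lineTo calls. */
function projectPolyline(H: Mat3, points: Vec2[]): Vec2[] {
  return points.map((p) => applyHomography(H, p));
}
```

Because straight lines stay straight under a homography, projecting only the polyline vertices is enough; no extra subdivision of straight segments is needed.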

Modules

  • gcode-parser: text → modal-state interpreter → ordered list of segments {kind: 'cut' | 'rapid', points: Vec2[]} in mm. Handles G0/G1/G2/G3, G90/G91 (absolute/incremental), G20/G21 (units), G17 (XY plane assumed), both arc forms (I/J and R), and full circles. Z is read for modal state, but v1 projects to XY only.
  • geometry: homography estimation from ≥4 point-pairs (DLT), plus apply and invert; affine compose/decompose for the per-job transform; arc flattening to a chord tolerance.
  • calibration: UI to capture point-pairs (click in the image, type the machine X/Y), compute H, display the per-point residual error (px) to catch a bad click, and export/import the calibration as JSON.
  • alignment: per-job placement by drag+rotate by eye (primary), click-to-set-origin (a video click mapped into machine space through the inverse of H), or numeric X/Y/rotation entry. All three set the same A.
  • renderer: draw cuts (solid) vs rapids (dashed/faint), redraw on any change, and track the video element's size and position so the canvas stays matched to it.
  • app/config: load config.json, hold app state, wire modules together.
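The arc flattening mentioned for the geometry module could look roughly like the sketch below. It assumes the arc is already expressed as start, end, and center (for the I/J form the center is usually the start point plus the I/J offsets; the R form would need a separate center computation that is not shown). The function name, the `Vec2` shape, and the 0.05 mm default tolerance are illustrative assumptions, not values from the spec.

```typescript
type Vec2 = { x: number; y: number }; // assumed shape

/**
 * Flatten a circular arc into a polyline whose chords deviate from the true
 * arc by at most toleranceMm. clockwise = true corresponds to G2, false to G3.
 * A start point equal to the end point is treated as a full circle.
 */
function flattenArc(
  start: Vec2,
  end: Vec2,
  center: Vec2,
  clockwise: boolean,
  toleranceMm = 0.05,
): Vec2[] {
  if (toleranceMm <= 0) throw new Error("tolerance must be positive");
  const r = Math.hypot(start.x - center.x, start.y - center.y);
  if (r < 1e-9) return [start, end]; // degenerate arc: fall back to a straight line

  const a0 = Math.atan2(start.y - center.y, start.x - center.x);
  const a1 = Math.atan2(end.y - center.y, end.x - center.x);
  const eps = 1e-9;
  let sweep = a1 - a0;
  // Clockwise arcs sweep through negative angles, counter-clockwise through positive.
  if (clockwise && sweep > -eps) sweep -= 2 * Math.PI;
  if (!clockwise && sweep < eps) sweep += 2 * Math.PI;

  // A chord spanning angle t has sagitta r * (1 - cos(t / 2)); solve for t at the tolerance.
  const maxStep = Math.min(2 * Math.acos(1 - Math.min(toleranceMm / r, 1)), Math.PI / 2);
  const n = Math.max(1, Math.ceil(Math.abs(sweep) / maxStep));

  const points: Vec2[] = [start];
  for (let i = 1; i < n; i++) {
    const a = a0 + (sweep * i) / n;
    points.push({ x: center.x + r * Math.cos(a), y: center.y + r * Math.sin(a) });
  }
  points.push(end); // use the exact end point to avoid accumulated rounding error
  return points;
}
```

Flattening in work-mm before applying the transforms keeps the tolerance meaningful in physical units; the on-screen error then depends on the camera scale.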

Config / persistence

A config.json served alongside the app:

{
  "streamUrl": "http://internal-server/stream",
  "calibration": {
    "imagePoints":   [[px, py], ...],
    "machinePoints": [[mx, my], ...],
    "homography":    [[...], [...], [...]]
  },
  "renderDefaults": { "cutColor": "...", "rapidColor": "...", "lineWidth": 1 }
}

There is no backend, so persisting a new calibration means committing or dropping in an updated config.json. The in-app calibration tool computes the new values and presents the JSON to paste back into the file. Per-job alignment is ephemeral session state and is not persisted.
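One way to type this file and decide whether the overlay can be enabled is sketched below. The field names follow the JSON above; treating `calibration` as optional, requiring at least four point pairs, and the helper names `isUsableCalibration` and `overlayEnabled` are assumptions for illustration.

```typescript
type Point2 = [number, number];
type Row3 = [number, number, number];

interface Calibration {
  imagePoints: Point2[];
  machinePoints: Point2[];
  homography: [Row3, Row3, Row3];
}

interface AppConfig {
  streamUrl: string;
  calibration?: Calibration; // assumption: absent until the first calibration is saved
  renderDefaults: { cutColor: string; rapidColor: string; lineWidth: number };
}

// True when v is an array of exactly len finite numbers.
const isFiniteTuple = (v: unknown, len: number): boolean =>
  Array.isArray(v) &&
  v.length === len &&
  v.every((n) => typeof n === "number" && Number.isFinite(n));

/** Hypothetical guard: is the calibration complete enough to draw an overlay? */
function isUsableCalibration(c: unknown): c is Calibration {
  if (typeof c !== "object" || c === null) return false;
  const { imagePoints, machinePoints, homography } = c as Record<string, unknown>;
  return (
    Array.isArray(imagePoints) &&
    Array.isArray(machinePoints) &&
    imagePoints.length >= 4 &&
    imagePoints.length === machinePoints.length &&
    imagePoints.every((p) => isFiniteTuple(p, 2)) &&
    machinePoints.every((p) => isFiniteTuple(p, 2)) &&
    Array.isArray(homography) &&
    homography.length === 3 &&
    homography.every((row) => isFiniteTuple(row, 3))
  );
}

/** Parse the config text and report whether the overlay should be enabled. */
function overlayEnabled(configText: string): boolean {
  const cfg = JSON.parse(configText) as Partial<AppConfig>;
  return isUsableCalibration(cfg.calibration);
}
```

A guard like this lets the app load with a missing or half-filled calibration and show a prompt instead of drawing a misplaced toolpath.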

Workflows

Calibration (one-time)

  1. Enter calibration mode.
  2. For each of 4 to 6 well-spread points (near the table corners):
    • Jog the CNC spindle to a known machine X/Y.
    • Type those coordinates.
    • Click the spindle tip where it appears in the video.
  3. Compute H; display residual error per point so a misclick is obvious.
  4. Export the JSON and save it into config.json.
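Step 3 could be implemented along the lines of the sketch below. It is a simplified stand-in for full DLT: it fixes the bottom-right entry of H to 1 and solves the resulting linear least-squares problem through the normal equations, rather than using the SVD-based formulation. That shortcut assumes the machine origin does not map to infinity in the image, which should hold for a camera looking at the table. The coordinate scaling step and all function names are illustrative assumptions.

```typescript
type Vec2 = { x: number; y: number }; // assumed shape
type Mat3 = number[][]; // 3x3, row-major

// Solve the square system M x = b by Gaussian elimination with partial pivoting.
function solve(M: number[][], b: number[]): number[] {
  const n = b.length;
  const a = M.map((row, i) => [...row, b[i]]);
  for (let col = 0; col < n; col++) {
    let piv = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(a[r][col]) > Math.abs(a[piv][col])) piv = r;
    }
    if (Math.abs(a[piv][col]) < 1e-12) throw new Error("degenerate point configuration");
    [a[col], a[piv]] = [a[piv], a[col]];
    for (let r = col + 1; r < n; r++) {
      const f = a[r][col] / a[col][col];
      for (let c = col; c <= n; c++) a[r][c] -= f * a[col][c];
    }
  }
  const x = new Array<number>(n).fill(0);
  for (let r = n - 1; r >= 0; r--) {
    let s = a[r][n];
    for (let c = r + 1; c < n; c++) s -= a[r][c] * x[c];
    x[r] = s / a[r][r];
  }
  return x;
}

/** Estimate H (machine-mm to image-px) from at least 4 point pairs, with H[2][2] = 1. */
function estimateHomography(machine: Vec2[], image: Vec2[]): Mat3 {
  if (machine.length < 4 || machine.length !== image.length) {
    throw new Error("need at least 4 matching point pairs");
  }
  // Scale both coordinate sets into roughly [-1, 1] to keep the system well conditioned.
  const maxAbs = (pts: Vec2[]) =>
    Math.max(1e-9, ...pts.map((p) => Math.max(Math.abs(p.x), Math.abs(p.y))));
  const sm = 1 / maxAbs(machine);
  const si = 1 / maxAbs(image);

  const AtA = Array.from({ length: 8 }, () => new Array<number>(8).fill(0));
  const Atb = new Array<number>(8).fill(0);
  machine.forEach((m, k) => {
    const X = m.x * sm, Y = m.y * sm, u = image[k].x * si, v = image[k].y * si;
    const rows: [number[], number][] = [
      [[X, Y, 1, 0, 0, 0, -u * X, -u * Y], u],
      [[0, 0, 0, X, Y, 1, -v * X, -v * Y], v],
    ];
    for (const [row, rhs] of rows) {
      for (let i = 0; i < 8; i++) {
        Atb[i] += row[i] * rhs;
        for (let j = 0; j < 8; j++) AtA[i][j] += row[i] * row[j];
      }
    }
  });
  const h = solve(AtA, Atb);
  const Hn = [[h[0], h[1], h[2]], [h[3], h[4], h[5]], [h[6], h[7], 1]];
  // Undo the scaling so H maps raw mm to raw px.
  return Hn.map((row, r) => row.map((val, c) => (val * (c < 2 ? sm : 1)) / (r < 2 ? si : 1)));
}

function applyH(H: Mat3, p: Vec2): Vec2 {
  const w = H[2][0] * p.x + H[2][1] * p.y + H[2][2];
  return {
    x: (H[0][0] * p.x + H[0][1] * p.y + H[0][2]) / w,
    y: (H[1][0] * p.x + H[1][1] * p.y + H[1][2]) / w,
  };
}

/** Per-point reprojection error in pixels, for the residual display. */
function residualsPx(H: Mat3, machine: Vec2[], image: Vec2[]): number[] {
  return machine.map((m, k) => {
    const q = applyH(H, m);
    return Math.hypot(q.x - image[k].x, q.y - image[k].y);
  });
}
```

With exactly four pairs the fit is exact and every residual is close to zero, so a misclick cannot show up; capturing five or six points is what makes the residual display informative.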

Per-job

  1. Open a G-code file (drag-drop or file picker) — parsed locally.
  2. The toolpath appears over the video at a default placement.
  3. Align it: drag+rotate by eye (primary), or click-to-set-origin, or type a numeric offset; arrow keys nudge for fine tuning.
  4. Reality-check the toolpath against the live cut.
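The alignment step can be pictured as a small rigid transform plus one inverse-homography lookup, as in the sketch below. The `Alignment` shape (rotation in radians plus a translation in machine-mm) and the function names are assumptions for illustration; the spec only states that A is a rotation plus translation and that click-to-set-origin maps a video click into machine space through the inverse of H.

```typescript
type Vec2 = { x: number; y: number }; // assumed shape
type Mat3 = number[][]; // 3x3, row-major

// Hypothetical representation of the per-job alignment A.
interface Alignment { rotationRad: number; tx: number; ty: number }

/** Apply A: G-code work coordinates to machine-mm (rotate, then translate). */
function applyAlignment(a: Alignment, p: Vec2): Vec2 {
  const c = Math.cos(a.rotationRad), s = Math.sin(a.rotationRad);
  return { x: c * p.x - s * p.y + a.tx, y: s * p.x + c * p.y + a.ty };
}

/** Invert a 3x3 matrix via its adjugate. */
function invert3(m: Mat3): Mat3 {
  const [a, b, c] = m[0], [d, e, f] = m[1], [g, h, i] = m[2];
  const A = e * i - f * h, B = -(d * i - f * g), C = d * h - e * g;
  const det = a * A + b * B + c * C;
  if (Math.abs(det) < 1e-15) throw new Error("H is not invertible");
  return [
    [A / det, (c * h - b * i) / det, (b * f - c * e) / det],
    [B / det, (a * i - c * g) / det, (c * d - a * f) / det],
    [C / det, (b * g - a * h) / det, (a * e - b * d) / det],
  ];
}

function applyH(H: Mat3, p: Vec2): Vec2 {
  const w = H[2][0] * p.x + H[2][1] * p.y + H[2][2];
  return {
    x: (H[0][0] * p.x + H[0][1] * p.y + H[0][2]) / w,
    y: (H[1][0] * p.x + H[1][1] * p.y + H[1][2]) / w,
  };
}

/** Click-to-set-origin: place the work origin at the machine point under the click. */
function setOriginFromClick(H: Mat3, clickPx: Vec2, current: Alignment): Alignment {
  const machinePoint = applyH(invert3(H), clickPx);
  return { ...current, tx: machinePoint.x, ty: machinePoint.y };
}

/** Full per-point pipeline: work-mm to machine-mm (A) to image-px (H). */
function workToImage(H: Mat3, a: Alignment, p: Vec2): Vec2 {
  return applyH(H, applyAlignment(a, p));
}
```

In this sketch, dragging, numeric entry, and arrow-key nudges would all reduce to updating the same three numbers in `Alignment`, which matches the requirement that every control sets the same A.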

Error handling

  • Unsupported/garbage G-code lines — skipped, counted, surfaced as a non-fatal warning. Never crash.
  • Arcs — flattened to a chord tolerance; both R and I/J forms; full circles handled.
  • Missing/incomplete calibration — app still loads; overlay disabled with a "calibrate first" prompt.
  • Stream not loading — overlay still usable over a blank/last frame; a notice is shown.
  • Units — normalized to mm internally regardless of G20/G21.
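A reduced sketch of how the parser might combine these rules is shown below. It covers only G0/G1 with G90/G91 and G20/G21; arcs are deliberately left out and are counted as skipped here, although their end point is still tracked so later moves start from the right place. Everything except the `{kind, points}` segment shape is an illustrative assumption, including the result type and the treatment of `%` lines as blank.

```typescript
type Vec2 = { x: number; y: number }; // assumed shape
type Segment = { kind: "cut" | "rapid"; points: Vec2[] };
interface ParseResult { segments: Segment[]; skippedLines: number }

// One G-code word: a letter followed by a number, e.g. "G1", "X-12.5", "Y.5".
const WORD = /[A-Z]\s*[-+]?(?:\d+\.?\d*|\.\d+)/g;

function parseGcode(text: string): ParseResult {
  const segments: Segment[] = [];
  let skippedLines = 0;
  let pos: Vec2 = { x: 0, y: 0 };
  let motion: "cut" | "rapid" | null = null;
  let absolute = true; // G90 until G91 is seen
  let unitScale = 1; // mm per program unit: 1 for G21, 25.4 for G20

  for (const raw of text.split(/\r?\n/)) {
    // Strip (inline) and ;trailing comments, then normalize case.
    const line = raw.replace(/\(.*?\)/g, "").replace(/;.*$/, "").trim().toUpperCase();
    if (line === "" || line === "%") continue;
    if (line.replace(WORD, "").trim() !== "") { skippedLines++; continue; } // garbage text

    let supported = true;
    let x: number | undefined;
    let y: number | undefined;
    for (const w of line.match(WORD) ?? []) {
      const value = parseFloat(w.slice(1));
      if (w[0] === "X") x = value;
      else if (w[0] === "Y") y = value;
      else if (w[0] === "G") {
        if (value === 0) motion = "rapid";
        else if (value === 1) motion = "cut";
        else if (value === 20) unitScale = 25.4;
        else if (value === 21) unitScale = 1;
        else if (value === 90) absolute = true;
        else if (value === 91) absolute = false;
        else if (value === 2 || value === 3) { motion = null; supported = false; } // arcs: not in this sketch
        else if (value !== 17) supported = false;
      }
      // Other words (Z, F, S, M, N, ...) are ignored in this XY-only sketch.
    }

    const hasXY = x !== undefined || y !== undefined;
    const target: Vec2 = absolute
      ? { x: x !== undefined ? x * unitScale : pos.x, y: y !== undefined ? y * unitScale : pos.y }
      : { x: pos.x + (x ?? 0) * unitScale, y: pos.y + (y ?? 0) * unitScale };

    if (!hasXY && supported) continue; // modal-only or Z-only line
    if (!supported || motion === null) {
      skippedLines++;
      if (hasXY) pos = target; // keep position tracking even when no segment is drawn
      continue;
    }

    const last = segments[segments.length - 1];
    const tail = last ? last.points[last.points.length - 1] : undefined;
    if (last && tail && last.kind === motion && tail.x === pos.x && tail.y === pos.y) {
      last.points.push(target); // extend the current polyline
    } else {
      segments.push({ kind: motion, points: [pos, target] });
    }
    pos = target;
  }
  return { segments, skippedLines };
}
```

The `skippedLines` count is what a non-fatal warning in the UI could report, for example as "N lines were not understood".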

Testing

TDD for the testable cores:

  • Parser — sample G-code files → expected segment lists (modal state, incremental/absolute, units, arcs).
  • Geometry — known point-pairs → known mapping; round-trip apply/invert; arc flattening within tolerance.

Manual visual check: overlay a known test pattern (e.g. a measured rectangle) over the real stream and confirm alignment.

Risks / open questions

  • Lens distortion — a wide-angle camera ~2.5 m up may bow straight cuts; a plain homography assumes straight lines stay straight. The calibration residual-error display will reveal if this is a real problem; if so, add a distortion-correction step (deferred).
  • Stream format unknown — the internal server's protocol (MJPEG / HLS / WebRTC) is not yet confirmed; this determines the embedding element. Confirm early in implementation.
  • Camera not perfectly top-down — fine for a homography as long as the table is planar and calibration points are well spread.