# G-Code Overlay — Design

**Date:** 2026-06-08
**Status:** Approved (pending spec review)

## Purpose

At the makerspace, a fixed camera mounted on the wall films the CNC router table top-down, and the feed is restreamed to monitors around the space. This project is a web app that overlays a G-code toolpath on top of that camera stream, so an operator can:

- **Reality-check a job** — see where the machine will actually cut, before or while it runs.
- **Plan fixturing** — see where the tool *won't* go, to decide where screws/clamps can safely be placed.

A user opens a local G-code file in the browser, the toolpath is drawn over the live video, and they align it to the real material.

## Scope

**In scope (v1):**
- Load a local G-code file (no upload; parsed in-browser).
- Draw cutting paths (G1/G2/G3) and rapid/travel moves (G0, styled distinctly) over the video.
- One-time camera→machine calibration (perspective transform), shared via a served config file.
- Per-job alignment of the toolpath to the material (drag+rotate, click-to-set-origin, numeric offset).

**Out of scope (v1), designed to add later:**
- Plunge/screw-placement highlighting.
- Depth cues (color/weight by Z or cut order).
- Broadcasting the overlay to all makerspace monitors (requires shared session state / backend).
- Lens-distortion correction.
- Multi-camera support.

**Audience:** local viewer only. The overlay appears on the screen of whoever loaded the G-code; the makerspace-wide monitors keep showing the existing raw stream. No shared session state.

## Architecture

A pure client-side static single-page app — **TypeScript + Vite + Canvas 2D**, no backend. The page stacks three layers:

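1. **Video layer** — the camera stream, shown in an `<img>` (MJPEG) or `<video>` (HLS/WebRTC) element. The exact format will be confirmed early in implementation (see Risks / open questions). The overlay never reads the stream's pixels, so displaying it in an `<img>` or a natively supported `<video>` source needs no CORS headers and no proxy. A player library that fetches stream segments from script (typical for HLS outside Safari) would still be subject to CORS.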
2. **Overlay canvas** — a transparent `<canvas>` sized to the video element, on which the toolpaths are drawn.
3. **UI** — file open, calibration panel, alignment controls.

### Transforms

Two transforms do all the geometric work:

- **Calibration homography `H`** (fixed, one-time): maps **machine-mm → image-px**. A 3×3 projective transform. Calibrated once by jogging the spindle to known machine coordinates and clicking the tip in the video. This ties the camera image directly to the machine's own coordinate system (the system that G-code work coordinates are offset from), collapsing "camera→table" and "table→machine" into one transform with no origin guesswork.
- **Per-job alignment `A`** (per session): maps **G-code work-coords → machine-mm** (rotation + translation), accounting for where the operator zeroed the material this job.

**Render pipeline:**

```
G-code text
  → parse → polylines in work-mm (arcs flattened to segments)
  → apply A (per-job align)        → machine-mm
  → apply H (calibration homography) → image-px
  → draw on overlay canvas
```

> **Why transform points in JS rather than the canvas?** A camera-perspective transform is projective, and neither Canvas 2D nor SVG can apply a projective transform natively (both are affine-only). Points are therefore transformed in JS, then drawn as polylines. This also makes Canvas 2D the natural, dependency-light choice.
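
The two transforms can be sketched as plain functions over points. This is a minimal illustration, not the project's actual code: the names `Alignment`, `applyAlignment`, `applyHomography`, and `workToImage`, and the row-major 9-element layout for `H`, are assumptions made for the example.

```typescript
type Vec2 = [number, number];
// Row-major 3x3 matrix (assumed layout for this sketch).
type Mat3 = [number, number, number, number, number, number, number, number, number];

// Per-job alignment A: rotate about the work origin, then translate (mm).
interface Alignment {
  thetaRad: number;
  tx: number;
  ty: number;
}

// work-mm -> machine-mm
function applyAlignment(a: Alignment, [x, y]: Vec2): Vec2 {
  const c = Math.cos(a.thetaRad);
  const s = Math.sin(a.thetaRad);
  return [c * x - s * y + a.tx, s * x + c * y + a.ty];
}

// machine-mm -> image-px, including the perspective divide by w.
function applyHomography(h: Mat3, [x, y]: Vec2): Vec2 {
  const w = h[6] * x + h[7] * y + h[8];
  return [(h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w];
}

// Full pipeline for one polyline: apply A, then H.
function workToImage(a: Alignment, h: Mat3, polyline: Vec2[]): Vec2[] {
  return polyline.map((p) => applyHomography(h, applyAlignment(a, p)));
}
```

Because a homography maps straight lines to straight lines, transforming only the polyline vertices and joining them with straight canvas segments is exact; no extra subdivision of line moves is needed.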

## Modules

- **`gcode-parser`** — text → modal-state interpreter → ordered list of segments `{kind: 'cut' | 'rapid', points: Vec2[]}` in mm. Handles G0/G1/G2/G3, G90/G91 (absolute/incremental), G20/G21 (units), G17 (XY plane assumed), arc forms (I/J and R), and full circles. Z is read for modal state but v1 projects to XY only.
- **`geometry`** — homography estimation from ≥4 point-pairs (DLT), apply and invert; affine compose/decompose for the per-job transform; arc flattening to a chord tolerance.
- **`calibration`** — UI to capture point-pairs (click image + type machine X/Y), compute `H`, display per-point residual error (px) to catch a bad click, and export/import the calibration as JSON.
- **`alignment`** — per-job placement: drag+rotate by eye (primary), click-to-set-origin (inverse-`H` a video click into machine space), numeric X/Y/rotation entry. All three set the same `A`.
- **`renderer`** — draw cuts (solid) vs rapids (dashed/faint), redraw on any change, track and match the video element's size/position.
- **`app/config`** — load `config.json`, hold app state, wire modules together.
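
As an illustration of the arc flattening that `geometry` provides, the sketch below picks the angular step from the chord tolerance using the sagitta bound `r * (1 - cos(step / 2)) <= tol`. The function name `flattenArc`, its signature, and the 0.05 mm default are assumptions for the example. It takes the arc center directly; for the I/J form the center is typically the start point plus `(I, J)`.

```typescript
type Vec2 = [number, number];

// Flatten a circular arc into a polyline (G2 = clockwise, G3 = counter-clockwise).
// A start point equal to the end point is treated as a full circle.
function flattenArc(start: Vec2, end: Vec2, center: Vec2, clockwise: boolean, tolMm = 0.05): Vec2[] {
  if (!(tolMm > 0)) throw new RangeError("tolMm must be positive");
  const r = Math.hypot(start[0] - center[0], start[1] - center[1]);
  const a0 = Math.atan2(start[1] - center[1], start[0] - center[0]);
  const a1 = Math.atan2(end[1] - center[1], end[0] - center[0]);

  // Signed sweep angle: negative for clockwise, positive for counter-clockwise.
  let sweep = a1 - a0;
  if (clockwise && sweep >= 0) sweep -= 2 * Math.PI;
  if (!clockwise && sweep <= 0) sweep += 2 * Math.PI;

  // Largest step whose chord midpoint stays within tolMm of the true arc.
  const maxStep = 2 * Math.acos(Math.max(0, 1 - tolMm / r));
  const n = Math.max(1, Math.ceil(Math.abs(sweep) / maxStep));

  const pts: Vec2[] = [start];
  for (let i = 1; i < n; i++) {
    const a = a0 + (sweep * i) / n;
    pts.push([center[0] + r * Math.cos(a), center[1] + r * Math.sin(a)]);
  }
  pts.push(end); // use the programmed end point exactly
  return pts;
}
```

A smaller tolerance yields more segments; since the overlay is viewed at video resolution, a fairly loose tolerance is likely sufficient, though the right value is a tuning decision this spec does not fix.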

## Config / persistence

A `config.json` served alongside the app:

```json
{
  "streamUrl": "http://internal-server/stream",
  "calibration": {
    "imagePoints":   [[px, py], ...],
    "machinePoints": [[mx, my], ...],
    "homography":    [[...], [...], [...]]
  },
  "renderDefaults": { "cutColor": "...", "rapidColor": "...", "lineWidth": 1 }
}
```

There is no backend, so persisting a new calibration means committing or dropping in an updated `config.json`. The in-app calibration tool computes the new values and presents the JSON to paste back into the file. Per-job alignment is ephemeral session state and is not persisted.
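
One way the app could type this file and decide whether the overlay may be enabled is sketched below. The interface and function names (`OverlayConfig`, `Calibration`, `hasUsableCalibration`, `parseConfig`) are hypothetical, and treating `calibration` and `renderDefaults` as optional is an assumption made so that a missing calibration does not stop the app from loading.

```typescript
interface Calibration {
  imagePoints: [number, number][];
  machinePoints: [number, number][];
  homography: number[][]; // 3 rows of 3 numbers
}

interface OverlayConfig {
  streamUrl: string;
  calibration?: Calibration;
  renderDefaults?: { cutColor: string; rapidColor: string; lineWidth: number };
}

const isFiniteNumber = (v: unknown): v is number =>
  typeof v === "number" && Number.isFinite(v);

const isPointList = (v: unknown): v is [number, number][] =>
  Array.isArray(v) &&
  v.every((p) => Array.isArray(p) && p.length === 2 && p.every(isFiniteNumber));

// True only when the calibration block is complete enough to draw an overlay.
function hasUsableCalibration(config: unknown): boolean {
  const cal = (config as { calibration?: unknown } | null)?.calibration as
    | Partial<Calibration>
    | null
    | undefined;
  if (!cal) return false;
  const h = cal.homography;
  const is3x3 =
    Array.isArray(h) &&
    h.length === 3 &&
    h.every((row) => Array.isArray(row) && row.length === 3 && row.every(isFiniteNumber));
  return (
    is3x3 &&
    isPointList(cal.imagePoints) &&
    isPointList(cal.machinePoints) &&
    cal.imagePoints.length >= 4 &&
    cal.imagePoints.length === cal.machinePoints.length
  );
}

// Parse the served file; JSON.parse throws on malformed JSON, which the caller should catch.
function parseConfig(text: string): { config: OverlayConfig; overlayEnabled: boolean } {
  const config = JSON.parse(text) as OverlayConfig;
  return { config, overlayEnabled: hasUsableCalibration(config) };
}
```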

## Workflows

### Calibration (one-time)
1. Enter calibration mode.
2. For each of 4–6 well-spread points (near the table corners):
   - Jog the CNC spindle to a known machine X/Y.
   - Type those coordinates.
   - Click the spindle tip where it appears in the video.
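3. Compute `H`; display residual error per point so a misclick is obvious. Four points determine `H` exactly, so with only four every residual is zero; capture at least five for the residuals to be meaningful.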
4. Export the JSON and save it into `config.json`.
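
Step 3 can be sketched as follows. This is a simplified variant of DLT, not the SVD-based formulation: it fixes the last entry of `H` to 1 and solves the resulting least-squares problem through the normal equations, which minimizes an algebraic error rather than the pixel reprojection error. It also assumes the machine origin does not map to infinity in the image, which holds when the table is in view. All function names are hypothetical.

```typescript
type Vec2 = [number, number];

// Solve the square system M x = b by Gauss-Jordan elimination with partial pivoting.
function solveLinear(M: number[][], b: number[]): number[] {
  const n = b.length;
  const a = M.map((row, i) => [...row, b[i]]); // augmented copy
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
    }
    if (Math.abs(a[pivot][col]) < 1e-12) throw new Error("degenerate calibration points");
    [a[col], a[pivot]] = [a[pivot], a[col]];
    for (let r = 0; r < n; r++) {
      if (r === col) continue;
      const f = a[r][col] / a[col][col];
      for (let c = col; c <= n; c++) a[r][c] -= f * a[col][c];
    }
  }
  return a.map((row, i) => row[n] / row[i]);
}

// Estimate H (machine-mm -> image-px) from >= 4 point pairs; returns 9 row-major entries.
function estimateHomography(machine: Vec2[], image: Vec2[]): number[] {
  if (machine.length < 4 || machine.length !== image.length) {
    throw new Error("need at least 4 matching point pairs");
  }
  // Scale both point sets to roughly unit size so the system is better conditioned.
  const maxAbs = (pts: Vec2[]) =>
    Math.max(1e-9, ...pts.map(([x, y]) => Math.max(Math.abs(x), Math.abs(y))));
  const sm = maxAbs(machine);
  const si = maxAbs(image);

  // Accumulate the normal equations (AtA) h = Atb, two rows per point pair.
  const AtA = Array.from({ length: 8 }, () => new Array<number>(8).fill(0));
  const Atb = new Array<number>(8).fill(0);
  machine.forEach(([mx, my], k) => {
    const x = mx / sm, y = my / sm, u = image[k][0] / si, v = image[k][1] / si;
    const rows: [number[], number][] = [
      [[x, y, 1, 0, 0, 0, -u * x, -u * y], u],
      [[0, 0, 0, x, y, 1, -v * x, -v * y], v],
    ];
    for (const [row, rhs] of rows) {
      for (let i = 0; i < 8; i++) {
        Atb[i] += row[i] * rhs;
        for (let j = 0; j < 8; j++) AtA[i][j] += row[i] * row[j];
      }
    }
  });
  const h = solveLinear(AtA, Atb);
  // Undo the scaling: H = diag(si, si, 1) * Hn * diag(1/sm, 1/sm, 1).
  const g = si / sm;
  return [h[0] * g, h[1] * g, h[2] * si, h[3] * g, h[4] * g, h[5] * si, h[6] / sm, h[7] / sm, 1];
}

// Per-point reprojection error in pixels, for the residual display.
function residualsPx(h: number[], machine: Vec2[], image: Vec2[]): number[] {
  return machine.map(([x, y], k) => {
    const w = h[6] * x + h[7] * y + h[8];
    const u = (h[0] * x + h[1] * y + h[2]) / w;
    const v = (h[3] * x + h[4] * y + h[5]) / w;
    return Math.hypot(u - image[k][0], v - image[k][1]);
  });
}
```

If the residuals turn out noisy in practice, an SVD-based solve with full point normalization would be the more robust choice; the sketch trades that robustness for having no dependencies.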

### Per-job
1. Open a G-code file (drag-drop or file picker) — parsed locally.
2. The toolpath appears over the video at a default placement.
3. Align it: drag+rotate by eye (primary), or click-to-set-origin, or type a numeric offset; arrow keys nudge for fine tuning.
4. Reality-check the toolpath against the live cut.
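
Click-to-set-origin in step 3 amounts to mapping the clicked pixel back through the inverse of `H` and using the result as the translation part of `A`. The sketch below assumes `H` is stored as 9 row-major numbers and that the click has already been converted to the pixel space used during calibration; `invert3x3`, `imageToMachine`, and `setOriginFromClick` are illustrative names.

```typescript
type Vec2 = [number, number];
type Mat3 = number[]; // 9 entries, row-major (assumed layout)

interface Alignment {
  thetaRad: number;
  tx: number;
  ty: number;
}

// Invert a 3x3 matrix via its adjugate (transposed cofactors) divided by the determinant.
function invert3x3(m: Mat3): Mat3 {
  const [a, b, c, d, e, f, g, h, i] = m;
  const A = e * i - f * h;
  const B = -(d * i - f * g);
  const C = d * h - e * g;
  const det = a * A + b * B + c * C;
  if (det === 0 || !Number.isFinite(det)) throw new Error("H is not invertible");
  return [
    A, -(b * i - c * h), b * f - c * e,
    B, a * i - c * g, -(a * f - c * d),
    C, -(a * h - b * g), a * e - b * d,
  ].map((v) => v / det);
}

// image-px -> machine-mm using the inverted homography.
function imageToMachine(hInv: Mat3, [u, v]: Vec2): Vec2 {
  const w = hInv[6] * u + hInv[7] * v + hInv[8];
  return [
    (hInv[0] * u + hInv[1] * v + hInv[2]) / w,
    (hInv[3] * u + hInv[4] * v + hInv[5]) / w,
  ];
}

// Place the work origin at the clicked location; the rotation is left unchanged.
function setOriginFromClick(a: Alignment, h: Mat3, clickPx: Vec2): Alignment {
  const [tx, ty] = imageToMachine(invert3x3(h), clickPx);
  return { ...a, tx, ty };
}
```

Since the work origin `(0, 0)` maps to `(tx, ty)` under `A`, overwriting only the translation moves the origin to the click while keeping any rotation the operator has already set.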

## Error handling

- **Unsupported/garbage G-code lines** — skipped, counted, surfaced as a non-fatal warning. Never crash.
- **Arcs** — flattened to a chord tolerance; both R and I/J forms; full circles handled.
- **Missing/incomplete calibration** — app still loads; overlay disabled with a "calibrate first" prompt.
- **Stream not loading** — overlay still usable over a blank/last frame; a notice is shown.
- **Units** — normalized to mm internally regardless of G20/G21.
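
The skip-and-count behavior and unit normalization can be illustrated with a deliberately small interpreter. It handles only linear moves (G0/G1) plus G17, G20/G21, and G90/G91, and it counts arcs as unsupported, which the real parser would not do. Note that skipping a move leaves the tracked position stale. The names `parseLinearGcode` and `ParseResult` are made up for this sketch; the segment shape follows the `gcode-parser` description above.

```typescript
type Vec2 = [number, number];
interface Segment { kind: "cut" | "rapid"; points: Vec2[]; }
interface ParseResult { segments: Segment[]; skippedLines: number; }

const SUPPORTED_G = new Set([0, 1, 17, 20, 21, 90, 91]);

function parseLinearGcode(text: string): ParseResult {
  const segments: Segment[] = [];
  let skippedLines = 0;
  let pos: Vec2 = [0, 0];                    // current position, always in mm
  let kind: "cut" | "rapid" | null = null;   // modal motion mode (G1 / G0)
  let absolute = true;                       // G90 = true, G91 = false
  let scale = 1;                             // 1 for G21 (mm), 25.4 for G20 (inch)
  const wordRe = /[A-Z]\s*[-+]?(?:\d+\.?\d*|\.\d+)/g;

  for (const raw of text.split(/\r?\n/)) {
    // Strip (inline) and ;trailing comments before tokenizing.
    const line = raw.replace(/\(.*?\)/g, "").replace(/;.*$/, "").trim().toUpperCase();
    if (line === "") continue;
    const words = (line.match(wordRe) ?? []).map((tok) => ({
      letter: tok.charAt(0),
      value: Number(tok.slice(1)),
    }));
    const leftover = line.replace(wordRe, "").trim();
    const unknownG = words.some((w) => w.letter === "G" && !SUPPORTED_G.has(w.value));
    if (leftover !== "" || words.length === 0 || unknownG) {
      skippedLines++; // unsupported or garbage: count it, never throw
      continue;
    }
    // Apply modal G words first so word order within a line does not matter.
    for (const w of words) {
      if (w.letter !== "G") continue;
      if (w.value === 0) kind = "rapid";
      else if (w.value === 1) kind = "cut";
      else if (w.value === 20) scale = 25.4;
      else if (w.value === 21) scale = 1;
      else if (w.value === 90) absolute = true;
      else if (w.value === 91) absolute = false;
    }
    const coord = (letter: string): number | undefined => {
      const w = words.find((c) => c.letter === letter);
      return w ? w.value * scale : undefined;
    };
    const x = coord("X");
    const y = coord("Y");
    if (x === undefined && y === undefined) continue; // no XY motion on this line
    if (kind === null) { skippedLines++; continue; }  // XY given before any G0/G1
    const target: Vec2 = absolute
      ? [x ?? pos[0], y ?? pos[1]]
      : [pos[0] + (x ?? 0), pos[1] + (y ?? 0)];
    const last = segments[segments.length - 1];
    if (last && last.kind === kind) last.points.push(target);
    else segments.push({ kind, points: [pos, target] });
    pos = target;
  }
  return { segments, skippedLines };
}
```

The returned `skippedLines` count is what the UI would surface as the non-fatal warning.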

## Testing

TDD for the testable cores:

- **Parser** — sample G-code files → expected segment lists (modal state, incremental/absolute, units, arcs).
- **Geometry** — known point-pairs → known mapping; round-trip apply/invert; arc flattening within tolerance.

Manual visual check: overlay a known test pattern (e.g. a measured rectangle) over the real stream and confirm alignment.

## Risks / open questions

- **Lens distortion** — a wide-angle camera ~2.5 m up may bow straight cuts, whereas a plain homography assumes straight lines stay straight. With more calibration points than the minimum four, the residual-error display will reveal whether this is a real problem; if so, add a distortion-correction step (deferred).
- **Stream format unknown** — the internal server's protocol (MJPEG / HLS / WebRTC) is not yet confirmed; this determines the embedding element. Confirm early in implementation.
- **Camera not perfectly top-down** — fine for a homography as long as the table is planar and calibration points are well spread.