check_one() always compares the render against committed golden images, but
the CLI never loaded the plugin's test/harness.json, so the deterministic
settings the goldens were generated with (config, mock data, frozen time,
sizes) weren't applied. For any time- or data-dependent plugin, the CLI (and
the plugins-repo CI workflow that calls it) therefore rendered live data, and
the render drifted from the golden on every run even with no real
regression. The pytest matrix path already reads harness.json via
load_harness_spec; the CLI now does too.
- check_one calls load_harness_spec(plugin_dir) and layers the result under
explicit CLI flags. Precedence, highest first: config = --config >
harness.json > schema defaults; sizes = --sizes > LEDMATRIX_TEST_SIZES env >
harness.json > default sample. mock_data/freeze_time/skip_update fall back
to harness.json when not given on the CLI.
- parse_sizes returns None (not DEFAULT_TEST_SIZES) when --sizes is omitted,
so the env/harness.json/default fallback chain in resolve_test_sizes applies.
- Regression tests cover two cases: harness.json supplies render settings,
and CLI flags override them. Both use a temp fixture plugin so they run in
core CI (no plugins).
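A minimal sketch of the precedence described above, for reviewers. It is not
the harness's actual code: layer_config, pick_sizes, DEFAULT_SAMPLE, and the
comma-separated "WxH" env format are assumptions made for illustration.

```python
import os

# Hypothetical stand-in for the default sample; the real list may differ.
DEFAULT_SAMPLE = [(64, 32), (128, 32)]


def layer_config(schema_defaults, harness_config, cli_config):
    """Merge config dicts so later sources win: defaults < harness.json < --config."""
    merged = dict(schema_defaults or {})
    merged.update(harness_config or {})
    merged.update(cli_config or {})
    return merged


def pick_sizes(cli_sizes, harness_sizes, env=None):
    """Return the first source that is set: --sizes, env, harness.json, default."""
    env = os.environ if env is None else env
    if cli_sizes is not None:  # None means --sizes was omitted
        return list(cli_sizes)
    env_value = env.get("LEDMATRIX_TEST_SIZES", "").strip()
    if env_value:
        # Assumed format: comma-separated WxH tokens, e.g. "64x32,128x64".
        return [
            tuple(int(n) for n in token.lower().split("x"))
            for token in env_value.split(",")
        ]
    if harness_sizes:
        return [tuple(size) for size in harness_sizes]
    return list(DEFAULT_SAMPLE)
```

Returning None from the size parser when --sizes is omitted is what lets the
later branches run; a parser that returned the default list would always win.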
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* feat(testing): add cross-size/cross-screen plugin safety harness
Render every plugin across all supported matrix sizes (64x32, 128x32,
128x64, 256x32) and every declared screen, failing on crashes, content
drawn past the panel edge, or visual drift vs committed golden images.
- BoundsCheckingDisplayManager: oversized-canvas overflow detection
- harness.py: multi-size/multi-screen render engine + golden compare
- scripts/check_plugin.py: CLI (functional+bounds, --out-dir, --update-golden,
--freeze-time); render_plugin.py refactored onto shared loading helpers
- test/plugins/test_harness.py + test_plugin_matrix.py (parametrized,
honors per-plugin test/harness.json; skips when no plugins present)
- MockCacheManager.cache_dir so cache-dir-using plugins load headlessly
- .github/workflows/test.yml + docs/plugin-safety-harness.md
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(testing): address PR review feedback on plugin safety harness
- check_plugin: friendly error for non-numeric --sizes; reject non-object
--config / --mock-data JSON; sanitize plugin mode before using as a
filename; stop --update-golden from masking crash/overflow failures
- bounds_display_manager: pad the canvas out to the largest supported panel
(not a fixed 16px) so far-overshoot coordinates are caught, not clipped
- harness: merge config_schema defaults inside render_plugin_matrix; surface
update() failures as a non-fatal warning + result field instead of a debug
log; sanitize mode in golden_path
- loading: fail fast when harness.json references a missing mock_data fixture
- mocks: clean up the per-instance temp cache dir via weakref.finalize
- test_plugin_matrix: add a discovery guard that fails when
LEDMATRIX_REQUIRE_PLUGINS=1 but no plugins are found (still skips locally);
add type hints
- bound test deps with upper version pins for deterministic CI
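For context on the weakref.finalize cleanup mentioned above, a minimal sketch
of the pattern. TempCacheOwner is a hypothetical stand-in, not the real
MockCacheManager, and the directory prefix is an assumption.

```python
import os
import shutil
import tempfile
import weakref


class TempCacheOwner:
    """Hypothetical object that owns a per-instance temp cache directory."""

    def __init__(self):
        self.cache_dir = tempfile.mkdtemp(prefix="harness-cache-")
        # The callback must not hold a reference to self, otherwise the
        # instance could never be garbage collected. Passing the path and
        # shutil.rmtree separately avoids that.
        self._finalizer = weakref.finalize(
            self, shutil.rmtree, self.cache_dir, ignore_errors=True
        )

    def close(self):
        """Remove the directory now; later calls are no-ops."""
        self._finalizer()
```

The finalizer also runs at interpreter exit if the instance is still alive,
so the directory is removed even when close() is never called.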
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(testing): render plugins across arbitrary panel sizes, not a fixed list
Addresses maintainer feedback that there is no canonical set of supported
panel sizes — a build can be any size/configuration (square, 2x2, 4x4, 8x2,
long strips, tall stacks).
- sizes.py: SUPPORTED_SIZES -> DEFAULT_TEST_SIZES (back-compat alias kept),
reframed as a representative SAMPLE of real panel-grid arrangements rather
than an authoritative list; add parse_size_token / coerce_sizes /
resolve_test_sizes helpers
- sizes are now fully overridable: LEDMATRIX_TEST_SIZES env (global, e.g. test
on your exact hardware) > per-plugin harness.json "sizes" > default sample;
CLI --sizes unchanged
- bounds_display_manager: pad the canvas to the largest panel IN THE CURRENT
RUN (via overflow_extent) instead of a hardcoded max, so cross-size overflow
detection scales to whatever sizes a run uses
- harness: compute per-run extent and thread it into the bounds manager
- tests: arbitrary-shape + size-parsing/precedence coverage
- docs: rewrite "Supported sizes" -> "Sizes: a sample, not a fixed list"
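A rough sketch of why the padding is derived from the current run. It models
the canvas with plain coordinates rather than the real display manager;
run_extent and caught_overflow are hypothetical names, and the actual
implementation may compute the extent differently.

```python
def run_extent(sizes):
    """Return (max_width, max_height) over the sizes used in this run."""
    sizes = list(sizes)
    if not sizes:
        raise ValueError("at least one size is required")
    return max(w for w, _ in sizes), max(h for _, h in sizes)


def caught_overflow(points, panel, extent):
    """Return the drawn points that land off the panel but on the padded canvas.

    Points beyond the padded canvas would be clipped by the canvas itself and
    so would go unnoticed, which is why the padding needs to be large enough.
    """
    width, height = panel
    pad_x, pad_y = extent
    caught = []
    for x, y in points:
        on_canvas = -pad_x <= x < width + pad_x and -pad_y <= y < height + pad_y
        on_panel = 0 <= x < width and 0 <= y < height
        if on_canvas and not on_panel:
            caught.append((x, y))
    return caught
```

With a 64x32 panel, a pixel drawn at x=200 is missed by a 16px pad but caught
once the pad is the run's largest width, for example 256.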
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(testing): fail the harness on non-connectivity update() errors
Addresses the remaining review thread: recording every update() exception as a
non-fatal warning still let a real update() regression pass green as long as
display() survived. Now update() failures are classified — a tolerated set of
connectivity errors (ConnectionError/TimeoutError/socket/ssl/urllib/http/
requests) is recorded non-fatally (expected with no network in CI), while any
other exception is treated as a genuine bug and fails that render.
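The classification can be sketched roughly as follows. TOLERATED and
run_update are hypothetical names, and the tuple below is an assumed subset:
the real set also covers http and requests exceptions, which are left out
here to keep the sketch standard-library only.

```python
import socket
import ssl
import urllib.error

# Assumed subset of the tolerated connectivity errors.
TOLERATED = (
    ConnectionError,
    TimeoutError,
    socket.timeout,
    socket.gaierror,
    ssl.SSLError,
    urllib.error.URLError,
)


def run_update(update):
    """Call update() and return (fatal, message) for the render result."""
    try:
        update()
    except TOLERATED as exc:
        # Expected with no network in CI: record it, keep the render green.
        return False, f"update() skipped: {type(exc).__name__}: {exc}"
    except Exception as exc:
        # Anything else is treated as a genuine bug and fails the render.
        return True, f"update() failed: {type(exc).__name__}: {exc}"
    return False, None
```

Ordering matters: the tolerated clause comes first so a connectivity error is
never swallowed by the generic handler.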
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* ci(security): pin actions to SHAs and disable checkout credential persistence
Addresses the CodeRabbit/zizmor workflow-hardening finding: pin
actions/checkout and actions/setup-python to full commit SHAs and set
persist-credentials: false on checkout to reduce supply-chain and
token-exposure risk.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(testing): validate positive sizes; narrow requests import except
Two review findings:
- sizes.py: parse_size_token / coerce_sizes now reject non-positive
dimensions (0x32, -64x32) with a clear message instead of passing invalid
sizes downstream (CodeRabbit).
- harness.py: the optional `requests` import now catches ImportError
specifically and logs instead of `except Exception: pass`, clearing the
Codacy medium "Try, Except, Pass" (harness.py L52) and Ruff S110/BLE001.
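A minimal sketch of the positive-size validation, for illustration only.
parse_size is a hypothetical helper; the real parse_size_token may accept
other token formats and word its errors differently.

```python
def parse_size(token):
    """Parse 'WxH' into (width, height), rejecting malformed or non-positive sizes."""
    parts = token.strip().lower().split("x")
    if len(parts) != 2:
        raise ValueError(f"invalid size {token!r}: expected WIDTHxHEIGHT, e.g. 64x32")
    try:
        width, height = (int(part) for part in parts)
    except ValueError:
        raise ValueError(
            f"invalid size {token!r}: width and height must be integers"
        ) from None
    if width <= 0 or height <= 0:
        raise ValueError(
            f"invalid size {token!r}: width and height must be positive"
        )
    return width, height
```

Rejecting 0x32 and -64x32 at parse time keeps the error next to the bad
input instead of surfacing later as a canvas allocation failure.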
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>