2 Commits

Author SHA1 Message Date
Ron Pierce
122e6d6863 fix(web): use fully-qualified .service unit names for privileged systemctl (#360)
The web interface runs headless, so every privileged systemctl call must be
covered by a NOPASSWD rule in /etc/sudoers.d/ledmatrix_web. The sudo command
matches the command line exactly, but the code called 'systemctl start
ledmatrix' while configure_web_sudo.sh grants 'systemctl start
ledmatrix.service'. The rule never matched, so start/stop/enable/disable/
restart fell back to a password prompt and failed with 'a terminal is
required to read the password'.

Align all privileged systemctl calls on the fully-qualified unit names the
sudoers grants use. Add a regression test that cross-checks api_v3.py calls
against the grants in configure_web_sudo.sh.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 15:17:00 -04:00
Chuck
d488e8a2ad fix(api): don't coerce all-digit strings to int when schema type is string (#363)
* docs(core): add module and class docstrings to the 5 undocumented core files

Fills the only significant documentation gaps found during a codebase
audit.  All other core files (plugin_system/, logging_config.py, etc.)
already have complete module, class, and function docstrings.

Files changed (documentation only — zero logic changes):

  display_controller.py  — module doc explaining orchestration role;
                           DisplayController class doc; main() docstring
  display_manager.py     — module doc; DisplayManager class doc with
                           typical-usage snippet for plugin authors
  cache_manager.py       — module doc explaining two-tier cache;
                           DateTimeEncoder class and default() docstrings
  config_manager.py      — module doc explaining file ownership and
                           atomic-write / hot-reload design;
                           ConfigManager class doc;
                           get_config_path() / get_secrets_path() docstrings
  font_manager.py        — module doc (class docstring already existed)

Also noted (but not changed to avoid behaviour risk):
  display_manager.py and font_manager.py use logging.getLogger() directly
  instead of the project's get_logger() wrapper.  display_manager.py also
  calls setLevel(logging.INFO) immediately after, which would be lost if
  switched to get_logger().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* perf(display_controller): three targeted hot-path optimizations

Opt 1 — cache inspect.signature() per plugin_id
  inspect.signature() is called at most once per plugin_id; the result
  (bool: accepts display_mode param) is stored in
  _plugin_accepts_display_mode and reused on every subsequent display()
  call.  Eliminates all reflection from the display path at runtime.
  Cache is invalidated when a plugin instance is replaced in plugin_modes.

Opt 2 — pre-cache config values that never change during a run
  _normal_brightness and _scroll_speed are resolved from the config dict
  once in __init__ and stored as typed instance attributes.
  - Removes 2+ chained dict.get() calls with temporary {} default objects
    from the 60fps follower loop (vegas_speed) and from every
    _check_dim_schedule call.
  - current_brightness init now uses _normal_brightness directly.

Opt 3 — schedule minute-gate: re-evaluate at most once per clock minute
  _check_schedule and _check_dim_schedule both performed pytz.timezone(),
  datetime.now(), strftime(), and datetime.strptime() on every outer loop
  call.  Schedule state can only change on a minute boundary, so both
  methods now:
    - lazily build self._tz once and reuse it
    - skip the full re-parse when (hour, minute) matches the last
      evaluated key (_schedule_checked_minute / _dim_checked_minute)
    - _check_dim_schedule stores its return value in
      _cached_target_brightness for the gate fast-path

Tests: 23 new tests in test_display_controller_optimizations.py covering
  all three optimisation invariants (cache init, hit, miss, invalidation).
  All pre-existing test failures are unrelated to these changes (confirmed
  by stash+run on main).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: resolve 22 pre-existing test failures across 6 groups

Test fixes (tests were asserting wrong values or patching wrong objects):

  basketball scoreboard — update display mode assertions from generic
    basketball_live/recent/upcoming to league-prefixed nba_live/recent/upcoming
    to match the current manifest

  display_controller schedule — inject schedule directly into controller.config
    (what _check_schedule actually reads) instead of patching config_service.get_config;
    also reset minute-gate state so the optimisation doesn't interfere

  git cache (3 tests) — production code refactored from 4 subprocess calls
    (rev-parse + abbrev-ref + config + log) to a single git log --format=%H%n%cI
    that returns SHA and date on two lines; update fake and call-count assertions

  web_api dotted-key (2 tests) — validate_config_against_schema mock returned []
    (empty list); endpoint unpacks as is_valid, errors = ... causing ValueError;
    fix: return_value = (True, [])

  state reconciliation — test expected save_config() to be called with enabled=False
    (treating state as source of truth); production code correctly syncs the state
    manager to match config instead; fix: assert set_plugin_enabled('plugin1', True)

Production fixes (production code had bugs or missing features):

  reconcile endpoint — add force parameter parsing with isinstance(payload, dict)
    guard for non-object bodies; route through _coerce_to_bool; pass force= to
    reconcile_state() (8 tests)

  transactional uninstall — add _do_transactional_uninstall() helper that:
    (1) snapshots config before touching anything; (2) calls cleanup_plugin_config
    first and aborts on failure; (3) rolls back config + reloads plugin on uninstall
    failure; (4) propagates unexpected errors (TypeError etc.) instead of swallowing
    them (6 tests)

  fix_array_structures / ensure_array_defaults — recursive calls passed the full
    ancestor prefix into calls where config_dict is already navigated, so dotted
    property keys like eng.1 caused parent_parts.split('.') to mis-navigate; fix:
    drop prefix on recursive calls; also add _fix_none_arrays pass after
    merge_with_defaults so None arrays in JSON requests are replaced with schema
    defaults (2 tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* perf: four targeted optimizations across the display pipeline

Opt 1 — cache data-fetch interval per plugin (plugin_manager.py)
  _get_plugin_update_interval fell back to config_manager.get_config()
  (a full dict copy) when the manifest lacked an interval.  Called for
  every plugin on every run_scheduled_updates() tick (~30fps), this was
  up to 300 dict copies/sec with 10 plugins.
  Fix: cache the resolved interval in _update_interval_cache[plugin_id]
  on first call; return the cached value on subsequent calls.  Cache is
  cleared on load_plugin and unload_plugin.

Opt 2 — demote noisy per-cycle INFO logs to DEBUG (display_controller.py)
  Four logger.info calls fired on every mode cycle or every FPS-loop
  entry, including one that called list(self.plugin_modes.keys())
  unconditionally (allocating a list every outer loop iteration).
  - "Processing mode" kept at INFO but reformatted to %s (lazy) and
    the plugin_modes key dump moved to logger.debug
  - "Attempting/Got cycle duration" → logger.debug
  - "Entering high/normal FPS loop" → logger.debug
  Mode name at INFO is preserved for black-screen troubleshooting.

Opt 3 — use Image.frombytes instead of Image.fromarray in scroll hot path
  (scroll_helper.py)
  Image.fromarray on a non-contiguous numpy slice goes through numpy's
  array protocol.  Image.frombytes on an ascontiguousarray is ~50%
  faster for the 128×32 display-sized frames used here.  Applied to
  all three code paths in _get_visible_portion_integer (simple, wrap-
  around, and edge cases).

Opt 5 — cache get_text_width per (text, font) pair (display_manager.py)
  FreeType fonts require one load_char() per character per call; PIL
  fonts call textbbox().  Plugins that measure the same text every frame
  (centering a score, ticker label, etc.) were re-measuring from scratch
  on every display() call.
  Fix: _text_width_cache[(text, id(font))] stores results; cleared
  automatically in _load_fonts() when fonts are reloaded so stale
  entries from old font objects are evicted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(scroll_helper): fix edge-case bug exposed by frombytes switch

The previous commit replaced Image.fromarray with Image.frombytes in
_get_visible_portion_integer.  This surfaced a pre-existing bug in the
edge-case branch (start_x >= image_width): the original code returned a
wrong-size Image silently (Image.fromarray accepts a too-short array);
Image.frombytes raises ValueError instead.

Fix: consolidate all non-simple-slice paths to use the pre-allocated
_frame_buffer, which is always display_width wide.  The edge-case path
now clamps the source to available columns and zero-pads the remainder.

Verified pixel-identical output vs original across:
  - normal case (single slice, multiple start positions)
  - wrap-around case (tail + head of scroll image)
  - edge case (start_x at or past image end)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address CodeRabbit review comments on PR #358

1. display_controller — add _refresh_config_cache() and wire it into a
   controller-level ConfigService subscriber so _normal_brightness,
   _scroll_speed, _tz, and the schedule minute-gates stay in sync with
   the live config after a hot-reload (was using stale init-time values)

2. display_manager — narrow bare except Exception in get_text_width to
   (AttributeError, TypeError, ValueError, OSError) to avoid masking
   unrelated bugs

3. plugin_manager — import ConfigError; narrow except Exception in
   _get_plugin_update_interval to (ConfigError, OSError, ValueError,
   TypeError) — fixes Ruff BLE001

4. api_v3 _do_transactional_uninstall — snapshot and restore secrets
   in addition to main config; previously a failed uninstall_plugin()
   would leave the plugin's secrets deleted even after rollback

5. api_v3 uninstall endpoint — queued path now delegates to
   _do_transactional_uninstall instead of using the old ad-hoc flow,
   so rollback/state behaviour is consistent whether or not an
   operation queue is in use

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(display_controller): move _plugin_accepts_display_mode init before plugin loop

Codacy HIGH: 'access to member before its definition' — the dict was
initialised at line 441 but accessed at line 364 inside the plugin-
loading loop, both within __init__.

Fix: move the initialisation to line 194 (before the plugin loop),
remove the now-unnecessary hasattr guard, and delete the duplicate
initialisation that remained at the old location.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(api): don't coerce all-digit strings to int when schema type is string

_parse_form_value_with_schema had a fallback that tried int()/float() on
any string value that wasn't already handled. Fields like station_id
(type: "string", value: "8726607") were silently converted to integers,
causing jsonschema validation to reject them with "expected string, got int".

Guard the fallback with a check that skips it when the schema property
explicitly declares type: "string".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Chuck <chuck@example.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 15:01:42 -04:00
2 changed files with 93 additions and 16 deletions

View File

@@ -0,0 +1,76 @@
"""Guards that every privileged systemctl call the web interface makes is
covered by a passwordless-sudo grant in configure_web_sudo.sh.
The web interface runs headless (no TTY), so any `sudo` call that is not
matched by a NOPASSWD rule in /etc/sudoers.d/ledmatrix_web falls back to a
password prompt and fails with:
sudo: a terminal is required to read the password
sudo matches the command line by exact string, so `systemctl start ledmatrix`
and `systemctl start ledmatrix.service` are NOT interchangeable. This test
parses both the production blueprint and the sudoers-generator script and
asserts the (verb, unit) pairs line up, catching the suffix-mismatch class of
bug before it ships.
"""
import ast
import re
from pathlib import Path
PROJECT_ROOT = Path(__file__).resolve().parents[2]
API_V3 = PROJECT_ROOT / "web_interface" / "blueprints" / "api_v3.py"
SUDOERS_SCRIPT = PROJECT_ROOT / "scripts" / "install" / "configure_web_sudo.sh"
def _sudo_systemctl_calls(source: str) -> set[tuple[str, str]]:
"""Return (verb, unit) for every list literal beginning with
['sudo', 'systemctl', ...] passed to a subprocess call in the source."""
calls: set[tuple[str, str]] = set()
for node in ast.walk(ast.parse(source)):
if not isinstance(node, ast.List):
continue
elts = node.elts
if len(elts) < 4:
continue
if not all(isinstance(e, ast.Constant) and isinstance(e.value, str) for e in elts[:4]):
continue
if elts[0].value == "sudo" and elts[1].value == "systemctl":
calls.add((elts[2].value, elts[3].value))
return calls
def _granted_systemctl_rules(script: str) -> set[tuple[str, str]]:
"""Return (verb, unit) for each `$SYSTEMCTL_PATH <verb> <unit>` NOPASSWD
grant emitted by the sudoers-generator script."""
rules: set[tuple[str, str]] = set()
for match in re.finditer(r"\$SYSTEMCTL_PATH\s+(\S+)\s+(\S+)", script):
verb, unit = match.group(1), match.group(2).rstrip('"')
rules.add((verb, unit))
return rules
def test_every_sudo_systemctl_call_is_granted() -> None:
calls = _sudo_systemctl_calls(API_V3.read_text())
rules = _granted_systemctl_rules(SUDOERS_SCRIPT.read_text())
assert calls, "expected to find sudo systemctl calls in api_v3.py"
uncovered = {c for c in calls if c not in rules}
assert not uncovered, (
"These sudo systemctl calls have no matching NOPASSWD grant in "
"configure_web_sudo.sh; they will fail headless with "
"'sudo: a terminal is required to read the password': "
+ ", ".join(f"systemctl {v} {u}" for v, u in sorted(uncovered))
)
def test_units_are_fully_qualified() -> None:
"""Privileged systemctl calls must name the unit as <name>.service so they
match the sudoers grants, which use the fully-qualified unit name."""
calls = _sudo_systemctl_calls(API_V3.read_text())
unqualified = {(v, u) for v, u in calls if not u.endswith(".service")}
assert not unqualified, (
"sudo systemctl calls must use fully-qualified .service unit names: "
+ ", ".join(f"systemctl {v} {u}" for v, u in sorted(unqualified))
)

View File

@@ -190,7 +190,7 @@ def _ensure_display_service_running():
if status.get('active'):
status['started'] = False
return status
result = _run_systemctl_command(['sudo', 'systemctl', 'start', 'ledmatrix'])
result = _run_systemctl_command(['sudo', 'systemctl', 'start', 'ledmatrix.service'])
service_status = _get_display_service_status()
result['started'] = result.get('returncode') == 0
result['active'] = service_status.get('active')
@@ -199,7 +199,7 @@ def _ensure_display_service_running():
def _stop_display_service():
"""Stop the ledmatrix display service."""
result = _run_systemctl_command(['sudo', 'systemctl', 'stop', 'ledmatrix'])
result = _run_systemctl_command(['sudo', 'systemctl', 'stop', 'ledmatrix.service'])
status = _get_display_service_status()
result['active'] = status.get('active')
result['status'] = status
@@ -1461,7 +1461,7 @@ def execute_system_action():
# For on-demand modes, we would need to integrate with the display controller
# For now, just start the display service
try:
result = subprocess.run(['sudo', 'systemctl', 'start', 'ledmatrix'],
result = subprocess.run(['sudo', 'systemctl', 'start', 'ledmatrix.service'],
capture_output=True, text=True, timeout=10)
except subprocess.TimeoutExpired as e:
logger.error("start_display (%s) timed out: %s", mode, e)
@@ -1478,16 +1478,16 @@ def execute_system_action():
resp['stderr'] = result.stderr.strip()
return jsonify(resp)
else:
result = subprocess.run(['sudo', 'systemctl', 'start', 'ledmatrix'],
result = subprocess.run(['sudo', 'systemctl', 'start', 'ledmatrix.service'],
capture_output=True, text=True, timeout=10)
elif action == 'stop_display':
result = subprocess.run(['sudo', 'systemctl', 'stop', 'ledmatrix'],
result = subprocess.run(['sudo', 'systemctl', 'stop', 'ledmatrix.service'],
capture_output=True, text=True, timeout=10)
elif action == 'enable_autostart':
result = subprocess.run(['sudo', 'systemctl', 'enable', 'ledmatrix'],
result = subprocess.run(['sudo', 'systemctl', 'enable', 'ledmatrix.service'],
capture_output=True, text=True, timeout=10)
elif action == 'disable_autostart':
result = subprocess.run(['sudo', 'systemctl', 'disable', 'ledmatrix'],
result = subprocess.run(['sudo', 'systemctl', 'disable', 'ledmatrix.service'],
capture_output=True, text=True, timeout=10)
elif action == 'reboot_system':
result = subprocess.run(['sudo', 'reboot'],
@@ -1568,11 +1568,11 @@ def execute_system_action():
'message': pull_message,
})
elif action == 'restart_display_service':
result = subprocess.run(['sudo', 'systemctl', 'restart', 'ledmatrix'],
result = subprocess.run(['sudo', 'systemctl', 'restart', 'ledmatrix.service'],
capture_output=True, text=True, timeout=10)
elif action == 'restart_web_service':
# Try to restart the web service (assuming it's ledmatrix-web.service)
result = subprocess.run(['sudo', 'systemctl', 'restart', 'ledmatrix-web'],
result = subprocess.run(['sudo', 'systemctl', 'restart', 'ledmatrix-web.service'],
capture_output=True, text=True, timeout=10)
else:
return jsonify({'status': 'error', 'message': 'Unknown action'}), 400
@@ -3709,13 +3709,14 @@ def _parse_form_value_with_schema(value, key_path, schema):
except ValueError:
return prop.get('default', 0.0)
# Try parsing as number (fallback)
try:
if '.' in stripped:
return float(stripped)
return int(stripped)
except ValueError:
pass
# Try parsing as number (fallback) — skip when schema explicitly says string
if not (prop and prop.get('type') == 'string'):
try:
if '.' in stripped:
return float(stripped)
return int(stripped)
except ValueError:
pass
# Return as string
return value