Improve first-time install error diagnostics and resilience (#369)

* fix(install): don't let outer ERR trap mask first_time_install.sh failures

set +e alone doesn't suppress bash's ERR trap, so any non-zero exit from
first_time_install.sh inside the one-shot installer immediately triggered
the outer on_error handler with a generic "Main installation, line 370"
message, before the script could report the real exit code or point to
logs/. Suspend the trap for that block so the existing if/else handling
runs instead.

* feat(install): surface root cause of web dependency install failures

install_dependencies_apt.py previously reported only which packages
failed, not why - the actual apt/pip error was discarded (apt) or
could scroll out of the on_error log tail (pip), leaving "Step 7:
Install web interface dependencies (line 915)" as the only visible
detail.

Capture command output for each install attempt and print a compact
DEPENDENCY INSTALLATION FAILURES summary with the last lines of error
output per package. Also run the installer with `python3 -u` for
real-time, correctly-ordered logging, and widen the on_error tail from
50 to 100 lines so the summary isn't cut off.
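
As a rough illustration of the capture-and-summarize idea described above
(not the code in this commit): `run_capture`, `format_failure_summary`, and
`TAIL_LINES` are hypothetical names, and this sketch buffers output in
memory for brevity.

```python
import subprocess
import sys

TAIL_LINES = 15  # hypothetical: trailing lines kept per failed package


def run_capture(cmd):
    """Run cmd and return (success, combined stdout/stderr) instead of raising."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.returncode == 0, proc.stdout


def format_failure_summary(failures, tail_lines=TAIL_LINES):
    """Build a compact report: the last few output lines for each failed package."""
    report = ["DEPENDENCY INSTALLATION FAILURES"]
    for package, output in failures.items():
        report.append(f"Package: {package}")
        tail = output.strip().splitlines()[-tail_lines:]
        report.extend(f"  {line}" for line in tail or ["(no output captured)"])
    return "\n".join(report)


if __name__ == "__main__":
    # Stand-in for a failing install command; it prints one line and exits 3.
    ok, out = run_capture(
        [sys.executable, "-c", "import sys; print('boom'); sys.exit(3)"]
    )
    if not ok:
        print(format_failure_summary({"demo-package": out}))
```

Returning a (success, output) pair rather than raising lets the caller keep
going through the package list and defer all reporting to the end of the run.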

* feat(install): harden first-time install against common Pi failure modes

- wait_for_apt_lock: apt_update/apt_install now wait (up to 3 min) for
  unattended-upgrades to release the dpkg lock instead of failing
  outright with "Command failed after 3 attempts" right after first boot.
- check_disk_space: new pre-flight check (Step 1) so a full SD card fails
  fast with a clear message instead of a cryptic mid-build error.
- Step 6: wrap rpi-rgb-led-matrix git clone/submodule operations in retry
  for resilience to transient network issues.
- Step 6: capture `pip install .` build output and print the last 50
  lines on failure, so the actual cmake/compiler error is visible instead
  of just "Failed to install rpi-rgb-led-matrix Python package".

* fix(install): bound subprocess output and dedupe apt update in dependency installer

Address coderabbitai review on PR #369:
- _run() now streams combined stdout/stderr to a temp file and returns
  only the last ERROR_TAIL_LINES lines, instead of buffering full
  output in memory (Codacy also flagged the previous capture_output
  call as a subprocess-without-static-string security issue; the new
  call is annotated as safe since cmd is built from hardcoded args).
- `apt update` now runs once in main() instead of once per package
  needing an apt fallback.

* fix(install): suppress remaining Codacy subprocess false-positive

Codacy's Semgrep-based check still flagged the cmd-built subprocess.run
call as "without a static string" even with the Bandit nosec applied.
Add a nosemgrep marker alongside it - cmd is always a hardcoded
apt/pip argument list, never user input.

* fix(install): correctly detect already-installed dateutil/websocket-client

Address remaining coderabbitai findings on PR #369:
- check_package_installed() did __import__(package_name) directly, but
  python-dateutil and websocket-client import as dateutil/websocket. Both
  always failed the "already installed" check and were reinstalled on
  every run. Add an IMPORT_NAME_MAP for the mismatched names.
- _run() still read the entire temp file into memory before slicing the
  tail. Stream it line-by-line into a deque(maxlen=ERROR_TAIL_LINES)
  instead so memory use stays bounded for very chatty commands.
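
A minimal sketch of the two fixes above, under stated assumptions:
`is_importable` and `tail_of_file` are hypothetical helper names, and the
mapping is passed as a parameter only so the example can be exercised without
dateutil or websocket-client installed.

```python
import importlib
import tempfile
from collections import deque

# Distribution name -> importable module name, for the two mismatches above.
IMPORT_NAME_MAP = {
    'python-dateutil': 'dateutil',
    'websocket-client': 'websocket',
}


def is_importable(package_name, import_names=IMPORT_NAME_MAP):
    """Translate the distribution name to its module name, then try importing it."""
    import_name = import_names.get(package_name, package_name)
    try:
        importlib.import_module(import_name)
        return True
    except ImportError:
        return False


def tail_of_file(f, max_lines):
    """Return the last max_lines lines of binary file f.

    The deque discards older lines as it fills, so at most max_lines
    decoded lines are held in memory however large the file is.
    """
    f.seek(0)
    decoded = (
        line.decode('utf-8', errors='replace').rstrip('\n') for line in f
    )
    return list(deque(decoded, maxlen=max_lines))


if __name__ == "__main__":
    # 'fake-json-dist' is a made-up distribution name mapped to a stdlib module.
    print(is_importable('fake-json-dist', {'fake-json-dist': 'json'}))
    with tempfile.TemporaryFile(mode='w+b') as f:
        for i in range(100):
            f.write(f"line {i}\n".encode('utf-8'))
        print(tail_of_file(f, 3))
```

Without the mapping, a name such as `python-dateutil` can never be imported
as written, so the "already installed" check fails on every run.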

---------

Co-authored-by: Chuck <chuck@example.com>
Chuck
2026-06-11 18:12:35 -04:00
committed by GitHub
parent cf28a8c0d5
commit 5beef0aa01
3 changed files with 213 additions and 94 deletions


@@ -340,9 +340,14 @@ main() {
     echo ""
     # Execute with proper error handling and non-interactive mode
-    # Temporarily disable errexit to capture exit code instead of exiting immediately
+    # Temporarily disable errexit AND the ERR trap to capture exit code instead of
+    # exiting immediately. `set +e` alone does not suppress the ERR trap, so without
+    # `trap '' ERR` a non-zero exit from first_time_install.sh would trigger on_error
+    # here with the generic "Main installation" message instead of the detailed
+    # if/else handling below.
     set +e
+    trap '' ERR
     # Check /tmp permissions - only fix if actually wrong (common in automated scenarios)
     # When running manually, /tmp usually has correct permissions (1777)
     TMP_PERMS=$(stat -c '%a' /tmp 2>/dev/null || echo "unknown")
@@ -370,6 +375,7 @@ main() {
         sudo -E env TMPDIR=/tmp LEDMATRIX_ASSUME_YES=1 bash ./first_time_install.sh -y </dev/null
     fi
     INSTALL_EXIT_CODE=$?
+    trap 'on_error $LINENO' ERR # Re-enable ERR trap
     set -e # Re-enable errexit
     if [ $INSTALL_EXIT_CODE -eq 0 ]; then


@@ -6,46 +6,67 @@ then falls back to pip with --break-system-packages
 import subprocess
 import sys
+import tempfile
 import warnings
+from collections import deque
 from pathlib import Path
+# How many trailing lines of a failed command's output to keep for the
+# end-of-run failure summary. Keeps the root cause near the end of the log,
+# which is where first_time_install.sh's error handler tails from.
+ERROR_TAIL_LINES = 15
+def _run(cmd):
+    """Run a command, streaming combined stdout/stderr to a temp file.
+    Returns (success, output) instead of raising, so callers can report
+    *why* a command failed rather than just that it failed. `output` is
+    bounded to the last ERROR_TAIL_LINES lines so failures from very
+    chatty commands (e.g. pip build logs) don't get buffered in memory.
+    """
+    with tempfile.TemporaryFile(mode='w+b') as f:
+        result = subprocess.run(cmd, stdout=f, stderr=subprocess.STDOUT) # nosec B603 B607 - hardcoded apt/pip args # nosemgrep
+        f.seek(0)
+        # Stream line-by-line so only the last ERROR_TAIL_LINES are ever held
+        # in memory, regardless of how much output the command produced.
+        tail = deque(
+            (line.decode('utf-8', errors='replace').rstrip('\n') for line in f),
+            maxlen=ERROR_TAIL_LINES,
+        )
+    return result.returncode == 0, '\n'.join(tail)
 def install_via_apt(package_name):
-    """Try to install a package via apt."""
-    try:
-        # Map pip package names to apt package names
-        apt_package_map = {
-            'flask': 'python3-flask',
-            'PIL': 'python3-pil',
-            'freetype': 'python3-freetype',
-            'psutil': 'python3-psutil',
-            'werkzeug': 'python3-werkzeug',
-            'numpy': 'python3-numpy',
-            'requests': 'python3-requests',
-            'python-dateutil': 'python3-dateutil',
-            'pytz': 'python3-tz',
-            'geopy': 'python3-geopy',
-            'unidecode': 'python3-unidecode',
-            'websockets': 'python3-websockets',
-            'websocket-client': 'python3-websocket-client'
-        }
-        apt_package = apt_package_map.get(package_name, f'python3-{package_name}')
-        print(f"Trying to install {apt_package} via apt...")
-        subprocess.check_call([
-            'sudo', 'apt', 'update'
-        ], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
-        subprocess.check_call([
-            'sudo', 'apt', 'install', '-y', apt_package
-        ], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+    """Try to install a package via apt. Returns (success, output)."""
+    # Map pip package names to apt package names
+    apt_package_map = {
+        'flask': 'python3-flask',
+        'PIL': 'python3-pil',
+        'freetype': 'python3-freetype',
+        'psutil': 'python3-psutil',
+        'werkzeug': 'python3-werkzeug',
+        'numpy': 'python3-numpy',
+        'requests': 'python3-requests',
+        'python-dateutil': 'python3-dateutil',
+        'pytz': 'python3-tz',
+        'geopy': 'python3-geopy',
+        'unidecode': 'python3-unidecode',
+        'websockets': 'python3-websockets',
+        'websocket-client': 'python3-websocket-client'
+    }
+    apt_package = apt_package_map.get(package_name, f'python3-{package_name}')
+    print(f"Trying to install {apt_package} via apt...")
+    success, output = _run(['sudo', 'apt', 'install', '-y', apt_package])
+    if success:
         print(f"Successfully installed {apt_package} via apt")
-        return True
-    except subprocess.CalledProcessError:
-        print(f"Failed to install {package_name} via apt, will try pip")
-        return False
+        return True, ""
+    print(f"Failed to install {apt_package} via apt, will try pip")
+    return False, output
 def install_via_pip(package_name):
     """Install a package via pip with --break-system-packages and --prefer-binary.
@@ -54,34 +75,65 @@ def install_via_pip(package_name):
     Debian/Ubuntu-based systems without a virtual environment.
     --prefer-binary prefers pre-built wheels over source distributions to avoid
     exhausting /tmp space during compilation.
+    Returns (success, output).
     """
-    try:
-        print(f"Installing {package_name} via pip...")
-        subprocess.check_call([
-            sys.executable, '-m', 'pip', 'install', '--break-system-packages', '--prefer-binary', package_name
-        ])
+    print(f"Installing {package_name} via pip...")
+    success, output = _run([
+        sys.executable, '-m', 'pip', 'install', '--break-system-packages', '--prefer-binary', package_name
+    ])
+    if success:
         print(f"Successfully installed {package_name} via pip")
-        return True
-    except subprocess.CalledProcessError as e:
-        print(f"Failed to install {package_name} via pip: {e}")
-        return False
+        return True, ""
+    print(f"Failed to install {package_name} via pip (see failure summary at end of log)")
+    return False, output
+# Distribution (pip/apt) names whose importable module name differs.
+IMPORT_NAME_MAP = {
+    'python-dateutil': 'dateutil',
+    'websocket-client': 'websocket',
+}
 def check_package_installed(package_name):
     """Check if a package is already installed."""
+    import_name = IMPORT_NAME_MAP.get(package_name, package_name)
     # Suppress deprecation warnings when checking if packages are installed
     # (we're just checking, not using them)
     with warnings.catch_warnings():
         warnings.filterwarnings('ignore', category=DeprecationWarning)
         try:
-            __import__(package_name)
+            __import__(import_name)
             return True
         except ImportError:
             return False
+def print_failure_summary(failed_packages, failure_details):
+    print("\n" + "=" * 60)
+    print("DEPENDENCY INSTALLATION FAILURES - DETAILS")
+    print("=" * 60)
+    for package in failed_packages:
+        print(f"\nPackage: {package}")
+        print("-" * 40)
+        output = failure_details.get(package, "").strip()
+        if not output:
+            print(" (no output captured)")
+            continue
+        for line in output.splitlines()[-ERROR_TAIL_LINES:]:
+            print(f" {line}")
+    print("=" * 60)
 def main():
     """Main installation function."""
     print("Installing dependencies for LED Matrix Web Interface V2...")
+    print("Refreshing apt package index...")
+    _run(['sudo', 'apt', 'update']) # best-effort; individual installs surface their own errors
     # List of required packages
     required_packages = [
         'flask',
@@ -98,19 +150,23 @@ def main():
         'websockets',
         'websocket-client'
     ]
     failed_packages = []
+    failure_details = {}
     for package in required_packages:
         if check_package_installed(package):
             print(f"{package} is already installed")
             continue
         # Try apt first, then pip
-        if not install_via_apt(package):
-            if not install_via_pip(package):
+        ok, apt_output = install_via_apt(package)
+        if not ok:
+            ok, pip_output = install_via_pip(package)
+            if not ok:
                 failed_packages.append(package)
+                failure_details[package] = pip_output or apt_output
     # Install packages that don't have apt equivalents
     special_packages = [
         'timezonefinder>=6.5.0,<7.0.0',
@@ -122,47 +178,49 @@ def main():
         'python-socketio>=5.11.0,<6.0.0',
         'python-engineio>=4.9.0,<5.0.0'
     ]
     for package in special_packages:
-        if not install_via_pip(package):
+        ok, pip_output = install_via_pip(package)
+        if not ok:
             failed_packages.append(package)
+            failure_details[package] = pip_output
     # Install rgbmatrix module from local source (optional - may already be installed in Step 6)
     # Check if already installed first
     if check_package_installed('rgbmatrix'):
         print("rgbmatrix module already installed, skipping...")
     else:
         print("Installing rgbmatrix module from local source...")
-        try:
-            # Get project root (parent of scripts directory)
-            PROJECT_ROOT = Path(__file__).parent.parent
-            rgbmatrix_path = PROJECT_ROOT / 'rpi-rgb-led-matrix-master' / 'bindings' / 'python'
-            if rgbmatrix_path.exists():
-                # Check if the module has been built (look for setup.py)
-                setup_py = rgbmatrix_path / 'setup.py'
-                if setup_py.exists():
-                    # Try installing - use regular install, not editable mode
-                    # This is optional for web interface and should already be installed in Step 6
-                    subprocess.check_call([
-                        sys.executable, '-m', 'pip', 'install', '--break-system-packages', str(rgbmatrix_path)
-                    ], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+        # Get project root (parent of scripts directory)
+        PROJECT_ROOT = Path(__file__).parent.parent
+        rgbmatrix_path = PROJECT_ROOT / 'rpi-rgb-led-matrix-master' / 'bindings' / 'python'
+        if rgbmatrix_path.exists():
+            # Check if the module has been built (look for setup.py)
+            setup_py = rgbmatrix_path / 'setup.py'
+            if setup_py.exists():
+                # Try installing - use regular install, not editable mode
+                # This is optional for web interface and should already be installed in Step 6
+                ok, output = _run([sys.executable, '-m', 'pip', 'install', '--break-system-packages', str(rgbmatrix_path)])
+                if ok:
                     print("rgbmatrix module installed successfully")
                 else:
-                    print("Warning: rgbmatrix setup.py not found, module may need to be built first")
-                    print(" This is normal if Step 6 hasn't completed yet.")
+                    # Don't fail the whole installation - rgbmatrix is optional for web interface
+                    # and should be installed in Step 6 of first_time_install.sh
+                    print("Warning: Failed to install rgbmatrix module:")
+                    for line in output.strip().splitlines()[-ERROR_TAIL_LINES:]:
+                        print(f" {line}")
+                    print(" This is normal if rgbmatrix hasn't been built yet (Step 6).")
+                    print(" The web interface will work without it.")
             else:
-                print("Warning: rgbmatrix source not found (this is normal if Step 6 hasn't run yet)")
-        except subprocess.CalledProcessError as e:
-            # Don't fail the whole installation - rgbmatrix is optional for web interface
-            # and should be installed in Step 6 of first_time_install.sh
-            print(f"Warning: Failed to install rgbmatrix module: {e}")
-            print(" This is normal if rgbmatrix hasn't been built yet (Step 6).")
-            print(" The web interface will work without it.")
-            # Don't add to failed_packages since it's optional
+                print("Warning: rgbmatrix setup.py not found, module may need to be built first")
+                print(" This is normal if Step 6 hasn't completed yet.")
+        else:
+            print("Warning: rgbmatrix source not found (this is normal if Step 6 hasn't run yet)")
     if failed_packages:
         print(f"\nFailed to install the following packages: {failed_packages}")
         print("You may need to install them manually or check your system configuration.")
+        print_failure_summary(failed_packages, failure_details)
         return False
     else:
         print("\nAll dependencies installed successfully!")