LEDMatrix/docs/TROUBLESHOOTING.md
Chuck ddd300a117 Docs/consolidate documentation (#217)
Co-authored-by: Chuck <chuck@example.com>
2026-01-29 10:32:00 -05:00


Troubleshooting Guide

Quick Diagnosis Steps

Run these checks first to quickly identify common issues:

1. Check Service Status

# Check all LEDMatrix services
sudo systemctl status ledmatrix
sudo systemctl status ledmatrix-web
sudo systemctl status ledmatrix-wifi-monitor

# Check AP mode services (if using WiFi)
sudo systemctl status hostapd
sudo systemctl status dnsmasq

Note: Look for active (running) status and check for error messages in the output.

2. View Service Logs

IMPORTANT: The web service logs to the systemd journal (syslog), not to your terminal. Use journalctl to view logs:

# View all recent logs
sudo journalctl -u ledmatrix -n 50
sudo journalctl -u ledmatrix-web -n 50

# Follow logs in real-time
sudo journalctl -u ledmatrix -f

# View logs from last hour
sudo journalctl -u ledmatrix-web --since "1 hour ago"

# Filter for errors (priority err and more severe)
sudo journalctl -u ledmatrix -p err

3. Run Diagnostic Scripts

# Web interface diagnostics
bash scripts/diagnose_web_interface.sh

# WiFi setup verification
./scripts/verify_wifi_setup.sh

# Weather plugin troubleshooting
./troubleshoot_weather.sh

# Captive portal troubleshooting
./scripts/troubleshoot_captive_portal.sh

4. Check Configuration

# Verify web interface autostart
cat config/config.json | grep web_display_autostart

# Check plugin enabled status
cat config/config.json | grep -A 2 "plugin-id"

# Verify API keys present
ls -l config/config_secrets.json
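
The grep checks above can also be done in one pass with a short script. This is a minimal sketch, not part of the project: the function name `summarize_config` is hypothetical, and it assumes that every top-level object in config.json with an "enabled" key is a plugin section, which matches the examples in this guide but is not confirmed by it.

```python
import json
from pathlib import Path


def summarize_config(config_path):
    """Return (web_display_autostart, {section_name: enabled}) for a config file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    autostart = config.get("web_display_autostart")
    # Assumption: any top-level object carrying an "enabled" key is a plugin section.
    plugins = {
        name: bool(section["enabled"])
        for name, section in config.items()
        if isinstance(section, dict) and "enabled" in section
    }
    return autostart, plugins
```

From the project root, `summarize_config("config/config.json")` returns the autostart flag and a dictionary of enabled states. A JSON syntax error in the file surfaces as `json.JSONDecodeError`, which grep would not catch.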

5. Test Manual Startup

# Test web interface manually
python3 web_interface/start.py

# If it works manually but not as a service, check the systemd service file

Common Issues by Category

Web Interface & Service Issues

Service Not Running/Starting

Symptoms:

Solutions:

  1. Start the service:

    sudo systemctl start ledmatrix-web
    
  2. Enable on boot:

    sudo systemctl enable ledmatrix-web
    
  3. Check why it failed:

    sudo journalctl -u ledmatrix-web -n 50
    

web_display_autostart is False

Symptoms:

  • Service exists but web interface doesn't start automatically
  • Logs show service starting but nothing happens

Solution:

# Edit config.json
nano config/config.json

# Set web_display_autostart to true
{
  "web_display_autostart": true,
  ...
}

# Restart service
sudo systemctl restart ledmatrix-web

Import or Dependency Errors

Symptoms:

  • Logs show ModuleNotFoundError or ImportError
  • Service fails to start with Python errors

Solutions:

  1. Install dependencies:

    pip3 install --break-system-packages -r requirements.txt
    pip3 install --break-system-packages -r web_interface/requirements.txt
    
  2. Test imports step-by-step:

    python3 -c "from src.config_manager import ConfigManager; print('OK')"
    python3 -c "from src.plugin_system.plugin_manager import PluginManager; print('OK')"
    python3 -c "from web_interface.app import app; print('OK')"
    
  3. Check Python path:

    python3 -c "import sys; print(sys.path)"
    

Port Already in Use

Symptoms:

  • Error: Address already in use
  • Service fails to bind to port 5050

Solutions:

  1. Check what's using the port:

    sudo lsof -i :5050
    
  2. Kill the conflicting process:

    sudo kill -9 <PID>
    
  3. Or change the port in start.py:

    app.run(host='0.0.0.0', port=5051)
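
If lsof is not installed, a bind test gives the same answer. The sketch below is illustrative only (the helper name `port_is_free` is an assumption, not project code): it tries to bind a TCP socket to the port and reports whether the bind succeeded.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "Address already in use": something else holds the port.
            return False
    return True
```

Run `port_is_free(5050)` while the ledmatrix-web service is stopped. If it returns False, another process is holding the port.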
    

Permission Issues

Symptoms:

  • Permission denied errors in logs
  • Cannot read/write configuration files

Solutions:

# Fix ownership of LEDMatrix directory
sudo chown -R ledpi:ledpi /home/ledpi/LEDMatrix

# Fix config file permissions
sudo chmod 644 config/config.json
sudo chmod 640 config/config_secrets.json

# Verify service runs as correct user
sudo systemctl cat ledmatrix-web | grep User

Flask/Blueprint Import Errors

Symptoms:

  • ImportError: cannot import name 'app'
  • ModuleNotFoundError: No module named 'blueprints'

Solutions:

  1. Verify file structure:

    ls -l web_interface/app.py
    ls -l web_interface/blueprints/api_v3.py
    ls -l web_interface/blueprints/pages_v3.py
    
  2. Check for __init__.py files:

    ls -l web_interface/__init__.py
    ls -l web_interface/blueprints/__init__.py
    
  3. Test import manually:

    cd /home/ledpi/LEDMatrix
    python3 -c "from web_interface.app import app"
    

WiFi & AP Mode Issues

AP Mode Not Activating

Symptoms:

  • WiFi disconnected but AP mode doesn't start
  • Cannot find "LEDMatrix-Setup" network

Solutions:

  1. Check auto-enable setting:

    cat config/wifi_config.json | grep auto_enable_ap_mode
    # Should show: "auto_enable_ap_mode": true
    
  2. Verify WiFi monitor service is running:

    sudo systemctl status ledmatrix-wifi-monitor
    
  3. Wait for grace period (90 seconds):

    • AP mode requires 3 consecutive disconnected checks at 30-second intervals
    • Total wait time: 90 seconds after WiFi disconnects
  4. Check if Ethernet is connected:

    nmcli device status
    # If Ethernet is connected, AP mode won't activate
    
  5. Check required services:

    sudo systemctl status hostapd
    sudo systemctl status dnsmasq
    
  6. Manually enable AP mode:

    # Via API
    curl -X POST http://localhost:5050/api/wifi/ap/enable
    
    # Via Python
    python3 -c "
    from src.wifi_manager import WiFiManager
    wm = WiFiManager()
    wm.enable_ap_mode()
    "
    
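
The grace period in step 3 can be modelled as a counter of consecutive failed checks. The sketch below is a hypothetical model, not the WiFi monitor's actual code. The class name and the rule that any working link resets the counter are assumptions. The values 3 and 30 seconds come from this guide.

```python
from dataclasses import dataclass, field


@dataclass
class ApFailoverCounter:
    """Hypothetical model of the AP-mode grace period described above."""

    required_checks: int = 3    # consecutive disconnected checks (from this guide)
    check_interval_s: int = 30  # seconds between checks (from this guide)
    misses: int = field(default=0, init=False)

    @property
    def grace_period_s(self) -> int:
        """Total wait before AP mode starts: 3 checks x 30 s = 90 s."""
        return self.required_checks * self.check_interval_s

    def record_check(self, wifi_connected: bool, ethernet_connected: bool) -> bool:
        """Record one check and return True once AP mode should start."""
        if wifi_connected or ethernet_connected:
            # Assumption: any working link resets the countdown.
            self.misses = 0
            return False
        self.misses += 1
        return self.misses >= self.required_checks
```

Under this model, a brief reconnect during the wait restarts the full 90 seconds, and a connected Ethernet cable keeps AP mode off no matter how long WiFi is down.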

Cannot Connect to AP Mode / Connection Refused

Symptoms:

  • Can see "LEDMatrix-Setup" network but can't connect to web interface
  • Browser shows "Connection Refused" or "Can't connect to server"
  • AP mode active but web interface not accessible

Solutions:

  1. Verify web server is running:

    sudo systemctl status ledmatrix-web
    # Should be active (running)
    
  2. Use correct IP address and port:

    • Correct: http://192.168.4.1:5050
    • NOT: http://192.168.4.1 (port 80)
    • NOT: http://192.168.4.1:5000
  3. Check wlan0 has correct IP:

    ip addr show wlan0
    # Should show: inet 192.168.4.1/24
    
  4. Verify hostapd and dnsmasq are running:

    sudo systemctl status hostapd
    sudo systemctl status dnsmasq
    
  5. Test from the Pi itself:

    curl http://192.168.4.1:5050
    # Should return HTML
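
To tell "wrong port" apart from "server down", it can help to probe each candidate URL and compare the results. This is a minimal sketch using only the standard library; the helper name `probe` is an assumption, not project code.

```python
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return a short description of what happens when fetching url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return f"HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # Connection refused, DNS failure, and similar problems.
        return f"unreachable: {exc.reason}"
    except OSError as exc:
        # For example, a timeout while reading the response.
        return f"unreachable: {exc}"
```

For example, calling `probe("http://192.168.4.1:5050")` and `probe("http://192.168.4.1:5000")` from a device on the AP network shows which port answers. Any "HTTP ..." result means a server is listening there.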
    

DNS Resolution Failures

Symptoms:

  • Captive portal doesn't redirect automatically
  • DNS lookups fail when connected to AP mode

Solutions:

  1. Check dnsmasq status:

    sudo systemctl status dnsmasq
    sudo journalctl -u dnsmasq -n 20
    
  2. Verify DNS configuration:

    cat /etc/dnsmasq.conf | grep -v "^#" | grep -v "^$"
    
  3. Test DNS resolution:

    nslookup captive.apple.com
    # Should resolve to 192.168.4.1 when in AP mode
    
  4. Manual captive portal testing:

    • Try these URLs manually:
      • http://192.168.4.1:5050
      • http://captive.apple.com
      • http://connectivitycheck.gstatic.com/generate_204

Firewall Blocking Port 5050

Symptoms:

  • Services running but cannot connect
  • Works from Pi but not from other devices

Solutions:

  1. Check UFW status:

    sudo ufw status
    
  2. Allow port 5050:

    sudo ufw allow 5050/tcp
    
  3. Check iptables:

    sudo iptables -L -n
    
  4. Temporarily disable firewall to test:

    sudo ufw disable
    # Test if it works, then re-enable and add rule
    sudo ufw enable
    sudo ufw allow 5050/tcp
    

Plugin Issues

Plugin Not Enabled

Symptoms:

  • Plugin installed but doesn't appear in rotation
  • Plugin shows in web interface but is greyed out

Solutions:

  1. Enable in configuration:

    {
      "plugin-id": {
        "enabled": true,
        ...
      }
    }
    
  2. Restart display:

    sudo systemctl restart ledmatrix
    
  3. Verify in web interface:

    • Navigate to Plugin Management tab
    • Toggle the switch to enable
    • Restart display

Plugin Not Loading

Symptoms:

  • Plugin enabled but not showing
  • Errors in logs about plugin

Solutions:

  1. Check plugin directory exists:

    ls -ld plugins/plugin-id/
    
  2. Verify manifest.json:

    cat plugins/plugin-id/manifest.json
    # Verify all required fields present
    
  3. Check dependencies installed:

    if [ -f plugins/plugin-id/requirements.txt ]; then
      pip3 install --break-system-packages -r plugins/plugin-id/requirements.txt
    fi
    
  4. Check logs for plugin errors:

    sudo journalctl -u ledmatrix -f | grep plugin-id
    
  5. Test plugin import:

    python3 -c "
    import sys
    sys.path.insert(0, 'plugins/plugin-id')
    from manager import PluginClass
    print('Plugin imports successfully')
    "
    
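
Step 2 asks you to verify that the manifest has all required fields, and a short script can do that check. This guide does not list the required fields, so the sketch below takes them as a parameter. The function name and any field names you pass are assumptions to be checked against the plugin documentation.

```python
import json
from pathlib import Path


def missing_manifest_fields(manifest_path, required_fields):
    """Return the required fields that are absent or empty in a manifest file."""
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    return [
        name
        for name in required_fields
        # Treat missing keys and empty values alike: both need attention.
        if manifest.get(name) in (None, "", [], {})
    ]
```

An empty list means every field you asked about is present. A `json.JSONDecodeError` means the manifest itself is malformed, which would also prevent the plugin from loading.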

Stale Cache Data

Symptoms:

  • Plugin shows old data
  • Data doesn't update even after restarting
  • Clearing cache in web interface doesn't help

Solutions:

  1. Manual cache clearing:

    # Remove plugin-specific cache
    rm -rf cache/plugin-id*
    
    # Or remove all cache
    rm -rf cache/*
    
    # Restart display
    sudo systemctl restart ledmatrix
    
  2. Check cache permissions:

    ls -ld cache/
    sudo chown -R ledpi:ledpi cache/
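
Before deleting anything, you can list which cache entries are actually old. The sketch below assumes what the rm commands above imply: cache entries live in cache/ and start with the plugin id. The function name and the age threshold are hypothetical.

```python
import time
from pathlib import Path


def stale_cache_entries(cache_dir, prefix, max_age_s, now=None):
    """Return cache paths starting with prefix whose mtime is older than max_age_s."""
    now = time.time() if now is None else now
    return sorted(
        path
        for path in Path(cache_dir).glob(f"{prefix}*")
        if now - path.stat().st_mtime > max_age_s
    )
```

For example, `stale_cache_entries("cache", "plugin-id", 3600)` lists entries for that plugin not modified in the last hour. If the list stays non-empty after the plugin's update interval has passed, the plugin is likely not refreshing its cache.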
    

Weather Plugin Specific Issues

Missing or Invalid API Key

Symptoms:

  • "No Weather Data" message on display
  • Logs show API authentication errors

Solutions:

  1. Get OpenWeatherMap API key:

  2. Add to config_secrets.json (recommended):

    {
      "openweathermap_api_key": "your-api-key-here"
    }
    
  3. Or add to config.json:

    {
      "ledmatrix-weather": {
        "enabled": true,
        "openweathermap_api_key": "your-api-key-here",
        ...
      }
    }
    
  4. Secure the API key file:

    chmod 640 config/config_secrets.json
    
  5. Restart display:

    sudo systemctl restart ledmatrix
    

API Rate Limits Exceeded

Symptoms:

  • Weather works initially then stops
  • Logs show HTTP 429 errors (Too Many Requests)
  • Error message: "Rate limit exceeded"

Solutions:

  1. Increase update interval:

    {
      "ledmatrix-weather": {
        "update_interval": 300,
        ...
      }
    }
    

    Note: The minimum recommended interval is 300 seconds (5 minutes).

  2. Check current rate limit usage:

    • OpenWeatherMap free tier: 1,000 calls/day, 60 calls/minute
    • With 300s interval: 288 calls/day (well within limits)
  3. Monitor API calls:

    sudo journalctl -u ledmatrix -f | grep "openweathermap"
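
The arithmetic behind step 2 is easy to check for other intervals. The sketch below assumes one API call per update, which this guide's 288 calls/day figure implies; the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def calls_per_day(update_interval_s):
    """Number of updates (and so API calls) made in 24 hours."""
    return SECONDS_PER_DAY // update_interval_s


def min_interval_for_quota(daily_quota):
    """Smallest whole-second interval that stays within a daily call quota."""
    # Ceiling division without floats.
    return -(-SECONDS_PER_DAY // daily_quota)
```

With these definitions, `calls_per_day(300)` is 288, matching the figure above, and `min_interval_for_quota(1000)` is 87 seconds. The recommended 300 seconds therefore leaves a wide margin for retries and for other plugins sharing the same key.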
    

Invalid Location Configuration

Symptoms:

  • "No Weather Data" message
  • Logs show location not found errors

Solutions:

  1. Use correct location format:

    {
      "ledmatrix-weather": {
        "city": "Tampa",
        "state": "FL",
        "country": "US"
      }
    }
    
  2. Use ISO country codes:

    • US = United States
    • GB = United Kingdom
    • CA = Canada
    • etc.
  3. Test API call manually:

    API_KEY="your-key-here"
    curl "https://api.openweathermap.org/data/2.5/weather?q=Tampa,FL,US&appid=${API_KEY}"
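
City names with spaces or accents need URL encoding, which is easy to get wrong by hand. The sketch below builds the same request URL as the manual test above; the function name is an assumption, and the endpoint and parameter names are taken from that curl command.

```python
from urllib.parse import urlencode


def build_weather_url(city, state, country, api_key):
    """Build the current-weather URL for a city/state/country location."""
    # Skip empty parts so locations without a state still work.
    location = ",".join(part for part in (city, state, country) if part)
    query = urlencode({"q": location, "appid": api_key})
    return f"https://api.openweathermap.org/data/2.5/weather?{query}"
```

Pass the result to curl or a browser. Commas appear as %2C and spaces as + in the output, which is equivalent to the unencoded form.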
    

Network Connectivity to OpenWeatherMap

Symptoms:

  • Other internet features work
  • Weather specifically fails
  • Connection timeout errors

Solutions:

  1. Test connectivity:

    ping api.openweathermap.org
    
  2. Test DNS resolution:

    nslookup api.openweathermap.org
    
  3. Test API endpoint:

    curl -I https://api.openweathermap.org
    # Should return HTTP 200 or 301
    
  4. Check firewall:

    # Ensure HTTPS (443) is allowed for outbound connections
    sudo ufw status
    

Diagnostic Commands Reference

Service Commands

# Check status
sudo systemctl status ledmatrix
sudo systemctl status ledmatrix-web
sudo systemctl status ledmatrix-wifi-monitor

# Start service
sudo systemctl start <service-name>

# Stop service
sudo systemctl stop <service-name>

# Restart service
sudo systemctl restart <service-name>

# Enable on boot
sudo systemctl enable <service-name>

# Disable on boot
sudo systemctl disable <service-name>

# View service file
sudo systemctl cat <service-name>

# Reload systemd after editing service files
sudo systemctl daemon-reload

Log Viewing Commands

# View recent logs (last 50 lines)
sudo journalctl -u ledmatrix -n 50

# Follow logs in real-time
sudo journalctl -u ledmatrix -f

# View logs from specific time
sudo journalctl -u ledmatrix --since "1 hour ago"
sudo journalctl -u ledmatrix --since "2024-01-01 10:00:00"

# View logs until specific time
sudo journalctl -u ledmatrix --until "2024-01-01 12:00:00"

# Filter by priority (errors and more severe)
sudo journalctl -u ledmatrix -p err

# Filter by priority (warnings and more severe)
sudo journalctl -u ledmatrix -p warning

# Search logs for specific text
sudo journalctl -u ledmatrix | grep "error"
sudo journalctl -u ledmatrix | grep -i "plugin"

# View logs for multiple services
sudo journalctl -u ledmatrix -u ledmatrix-web -n 50

# Export logs to file
sudo journalctl -u ledmatrix > ledmatrix.log

Network Testing Commands

# Test connectivity
ping -c 4 8.8.8.8
ping -c 4 api.openweathermap.org

# Test DNS resolution
nslookup api.openweathermap.org
dig api.openweathermap.org

# Test HTTP endpoint
curl -I http://your-pi-ip:5050
curl http://192.168.4.1:5050

# Check listening ports
sudo lsof -i :5050
sudo netstat -tuln | grep 5050

# Check network interfaces
ip addr show
nmcli device status

File/Directory Verification

# Check file exists
ls -l config/config.json
ls -l plugins/plugin-id/manifest.json

# Check directory structure
ls -la web_interface/
ls -la plugins/

# Check file permissions
ls -l config/config_secrets.json

# Check file contents
cat config/config.json | jq .
cat config/wifi_config.json | grep auto_enable

Python Import Testing

# Test core imports
python3 -c "from src.config_manager import ConfigManager; print('OK')"
python3 -c "from src.plugin_system.plugin_manager import PluginManager; print('OK')"
python3 -c "from src.display_manager import DisplayManager; print('OK')"

# Test web interface imports
python3 -c "from web_interface.app import app; print('OK')"
python3 -c "from web_interface.blueprints.api_v3 import api_v3; print('OK')"

# Test WiFi manager
python3 -c "from src.wifi_manager import WiFiManager; print('OK')"

# Test plugin import
python3 -c "
import sys
sys.path.insert(0, 'plugins/plugin-id')
from manager import PluginClass
print('Plugin imports OK')
"

Service File Template

If your systemd service file is corrupted or missing, use this template:

[Unit]
Description=LEDMatrix Web Interface
After=network.target

[Service]
Type=simple
User=ledpi
Group=ledpi
WorkingDirectory=/home/ledpi/LEDMatrix
Environment="PYTHONUNBUFFERED=1"
ExecStart=/usr/bin/python3 /home/ledpi/LEDMatrix/web_interface/start.py
Restart=on-failure
RestartSec=5s
StandardOutput=journal
StandardError=journal
SyslogIdentifier=ledmatrix-web

[Install]
WantedBy=multi-user.target

Save to /etc/systemd/system/ledmatrix-web.service and run:

sudo systemctl daemon-reload
sudo systemctl enable ledmatrix-web
sudo systemctl start ledmatrix-web

Complete Diagnostic Script

Run this script for comprehensive diagnostics:

#!/bin/bash
# Run from the LEDMatrix project root so the relative paths below resolve.

echo "=== LEDMatrix Diagnostic Report ==="
echo ""

echo "1. Service Status:"
systemctl status ledmatrix --no-pager -n 5
systemctl status ledmatrix-web --no-pager -n 5
echo ""

echo "2. Recent Logs:"
journalctl -u ledmatrix -n 20 --no-pager
echo ""

echo "3. Configuration:"
cat config/config.json | grep -E "(web_display_autostart|enabled)"
echo ""

echo "4. Network Status:"
ip addr show | grep -E "(wlan|eth|inet )"
curl -s http://localhost:5050 > /dev/null && echo "Web interface: OK" || echo "Web interface: FAILED"
echo ""

echo "5. File Structure:"
ls -la web_interface/ | head -10
ls -la plugins/ | head -10
echo ""

echo "6. Python Imports:"
python3 -c "from src.config_manager import ConfigManager" && echo "ConfigManager: OK" || echo "ConfigManager: FAILED"
python3 -c "from web_interface.app import app" && echo "Web app: OK" || echo "Web app: FAILED"
echo ""

echo "=== End Diagnostic Report ==="

Success Indicators

A properly functioning system should show:

  1. Services Running:

    ● ledmatrix.service - active (running)
    ● ledmatrix-web.service - active (running)
    
  2. Web Interface Accessible:

  3. Logs Show Normal Operation:

    INFO: Web interface started on port 5050
    INFO: Loaded X plugins
    INFO: Display rotation active
    
  4. Process Listening on Port:

    $ sudo lsof -i :5050
    COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    python3  1234 ledpi  3u   IPv4  12345      0t0  TCP *:5050 (LISTEN)
    
  5. Plugins Loading:

    • Logs show plugin initialization
    • Plugins appear in web interface
    • Display cycles through enabled plugins

Emergency Recovery

If the system is completely broken:

1. Git Rollback

# View recent commits
git log --oneline -10

# Rollback to previous commit (discards uncommitted local changes)
git reset --hard HEAD~1

# Or rollback to specific commit
git reset --hard <commit-hash>

# Restart all services
sudo systemctl restart ledmatrix
sudo systemctl restart ledmatrix-web

2. Fresh Service Installation

# Reinstall WiFi monitor
sudo ./scripts/install/install_wifi_monitor.sh

# Recreate service files from templates
sudo cp templates/ledmatrix.service /etc/systemd/system/
sudo cp templates/ledmatrix-web.service /etc/systemd/system/

# Reload and restart
sudo systemctl daemon-reload
sudo systemctl restart ledmatrix ledmatrix-web

3. Full System Reboot

# As a last resort
sudo reboot