hvac-kia-content/docs/ARCHITECTURE_DECISIONS.md
Ben Reed a80af693ba Add comprehensive production documentation and testing
2025-08-18 20:20:52 -03:00

Architecture Decisions

Why Systemd Instead of Kubernetes/Docker

Decision

We chose to use systemd services for production deployment instead of the originally specified Kubernetes/Docker containerization.

Context

The original specification called for:

  • Docker containerization with multi-stage builds
  • Kubernetes deployment with CronJobs
  • Running on a Kubernetes cluster control plane node

Problem

TikTok scraping using the Scrapling library requires:

  1. Display Server Access: Scrapling uses a real browser (Chromium) for JavaScript rendering
  2. X11/Wayland Session: Browser automation needs a live GUI session and its environment variables (DISPLAY and XAUTHORITY under X11)
  3. GPU Acceleration: Optional but improves performance for browser rendering
  4. Session Persistence: Browser cookies and local storage for authentication
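
The display requirements above can be checked before a browser is ever launched. The following is a minimal sketch, assuming an X11 session; the function name check_display_env is hypothetical and is not part of Scrapling or of this project.

```python
import os
from pathlib import Path


def check_display_env(env=None):
    """Return a list of problems that would likely stop a headed browser from starting.

    Hypothetical helper: it only inspects the X11 variables named in this
    document (DISPLAY, XAUTHORITY) and does not try to connect to the server.
    """
    env = os.environ if env is None else env
    problems = []

    if not env.get("DISPLAY"):
        problems.append("DISPLAY is not set")

    xauthority = env.get("XAUTHORITY")
    if not xauthority:
        problems.append("XAUTHORITY is not set")
    elif not Path(xauthority).is_file():
        problems.append(f"XAUTHORITY file not found: {xauthority}")

    return problems
```

A scraper entry point could call check_display_env() at startup and exit with a non-zero status when the list is non-empty, so a missing display shows up as a clear log line rather than as a browser launch failure.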

Why Containers Don't Work

Technical Limitations

  1. No Native Display Server: Containers don't have built-in X11/Wayland support

  2. Complex Workarounds:

    • X11 forwarding requires mounting the /tmp/.X11-unix socket into the container
    • Display access often ends up needing host network mode
    • GPU access typically means privileged mode or explicit device passthrough
    • Privileged containers carry security implications
  3. Environment Variables:

    • DISPLAY and XAUTHORITY are host-specific
    • Their values can change between reboots
    • They are difficult to manage in container orchestration
  4. Browser Automation Issues:

    • Headless mode doesn't work for all TikTok features
    • Virtual displays (Xvfb) are unreliable for modern web apps
    • WebGL and video playback issues in virtual displays
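
Because DISPLAY is host-specific, a process running directly on the host can look the value up instead of hardcoding it. The sketch below assumes the conventional X11 layout, where each server exposes a socket named X<n> under /tmp/.X11-unix; the function name find_x11_displays is hypothetical.

```python
import re
from pathlib import Path

# X servers conventionally create sockets named X0, X1, ... in this directory.
_SOCKET_NAME = re.compile(r"^X(\d+)$")


def find_x11_displays(socket_dir="/tmp/.X11-unix"):
    """Return DISPLAY values (":0", ":1", ...) for entries found in socket_dir.

    Only the entry names are inspected; this does not verify that an entry
    is a live socket or that the current user is authorized to connect.
    """
    directory = Path(socket_dir)
    if not directory.is_dir():
        return []

    numbers = []
    for entry in directory.iterdir():
        match = _SOCKET_NAME.match(entry.name)
        if match:
            numbers.append(int(match.group(1)))

    return [f":{n}" for n in sorted(numbers)]
```

Inside a container the same lookup returns an empty list unless the host socket directory has been mounted, which is the workaround described above.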

Systemd Advantages

  1. Native Environment Access:

    • Direct access to host display server
    • Can read environment variables from user session
    • No abstraction layer complications
  2. Simpler Configuration:

    • Single service file vs Dockerfile + k8s manifests
    • Easy to debug and troubleshoot
    • Native logging with journald
  3. Resource Management:

    • CPU and memory limits via systemd
    • Automatic restart on failure
    • Built-in timer units for scheduling
  4. Production Ready:

    • Battle-tested for system services
    • Excellent integration with Linux systems
    • No container runtime overhead
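
The resource limits, restart policy, and timer scheduling listed above map onto a handful of unit file directives. As a rough sketch, the Python below renders a service unit and a matching timer unit as text. The directives (CPUQuota=, MemoryMax=, Restart=, OnCalendar=, Persistent=) are standard systemd options, but every value, path, and function name here is an illustrative assumption and not this project's actual configuration.

```python
def render_service(description, exec_start, user,
                   display=":0",
                   xauthority="/run/user/1000/.Xauthority",
                   cpu_quota="50%", memory_max="2G"):
    """Render a systemd service unit as text (all limit values are placeholders)."""
    lines = [
        "[Unit]",
        f"Description={description}",
        "",
        "[Service]",
        f"User={user}",
        f'Environment="DISPLAY={display}"',
        f'Environment="XAUTHORITY={xauthority}"',
        f"ExecStart={exec_start}",
        f"CPUQuota={cpu_quota}",      # CPU limit
        f"MemoryMax={memory_max}",    # memory limit
        "Restart=on-failure",         # restart only after a non-clean exit
        "RestartSec=60",
    ]
    return "\n".join(lines) + "\n"


def render_timer(description, on_calendar="*-*-* 08:00:00"):
    """Render a timer unit; by default it activates the service of the same name."""
    lines = [
        "[Unit]",
        f"Description={description}",
        "",
        "[Timer]",
        f"OnCalendar={on_calendar}",  # placeholder schedule: daily at 08:00
        "Persistent=true",            # catch up on a run missed while powered off
        "",
        "[Install]",
        "WantedBy=timers.target",
    ]
    return "\n".join(lines) + "\n"
```

For example, render_service("TikTok scraper", "/usr/bin/python3 /opt/scraper/scraper.py", "scraper") produces the text for a hypothetical scraper.service file, and render_timer("Run TikTok scraper daily") the text for the scraper.timer file that would schedule it.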

Implementation

# systemd service can access display directly
[Service]
Environment="DISPLAY=:0"
Environment="XAUTHORITY=/run/user/1000/.Xauthority"

vs

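# Docker requires complex workarounds
FROM python:3.11
# Install Xvfb and x11vnc (refresh the package index first; -y keeps the build non-interactive)
RUN apt-get update && apt-get install -y xvfb x11vnc
WORKDIR /app
COPY scraper.py .
# Run under a virtual display (unreliable)
CMD xvfb-run -a python scraper.py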

Trade-offs

Lost Benefits of Containerization:

  • Platform independence
  • Easy scaling across nodes
  • Isolated dependencies
  • Reproducible builds

Gained Benefits:

  • Simpler deployment
  • Direct hardware access
  • Lower overhead
  • Easier debugging
  • Native browser automation

Alternatives Considered

  1. Selenium Grid: Too complex for single-node deployment
  2. Puppeteer in Docker: Still requires display server workarounds
  3. Headless Chrome: Doesn't work reliably with TikTok
  4. API-only approach: TikTok has no public API

Conclusion

For this specific use case where:

  • Browser automation with display access is required
  • Single node deployment is sufficient
  • Simplicity and reliability are priorities

Systemd provides a more appropriate solution than containerization.

Future Considerations

If containerization becomes necessary:

  1. Consider separating the TikTok scraper into a standalone service
  2. Use containers for the non-browser scrapers only
  3. Investigate newer options such as Playwright's official Docker images
  4. Re-evaluate if TikTok provides an official API

Decision Date: 2024-12-18
Decision Makers: Development Team
Status: Implemented