# Production Readiness Todo List

## Overview

This document outlines all tasks required to meet the original specification and prepare the HKIA Content Aggregator for production deployment. Tasks are organized by priority and phase.

**Note:** Docker/Kubernetes deployment is not feasible because TikTok scraping requires display server access. The system uses systemd for service management instead.

---

## Phase 1: Meet Original Specification
**Priority: CRITICAL - Core functionality gaps**
**Timeline: Week 1**

### Scheduling & Timing
- [ ] Fix scheduling times to match spec (8 AM & 12 PM ADT instead of 6 AM & 6 PM)
  - Update systemd timer files
  - Update production configuration
  - Test timer activation

### Data Synchronization
- [ ] Enable NAS sync in production runner
  - Add `orchestrator.sync_to_nas()` call
  - Verify NAS mount path
  - Test rsync functionality

### File Organization
- [ ] Fix file naming convention to match spec format
  - Change from: `update_20241218_060000.md`
  - To: `hkia__2024-12-18-T060000.md`
- [ ] Create proper directory structure
  ```
  data/
  ├── markdown_current/
  ├── markdown_archives/
  │   ├── WordPress/
  │   ├── Instagram/
  │   ├── YouTube/
  │   ├── Podcast/
  │   └── MailChimp/
  ├── media/
  │   ├── WordPress/
  │   ├── Instagram/
  │   ├── YouTube/
  │   ├── Podcast/
  │   └── MailChimp/
  └── .state/
  ```

### Content Processing
- [ ] Implement media downloading for all sources
  - YouTube thumbnails and videos (optional)
  - Instagram images and videos
  - WordPress featured images
  - Podcast episode artwork
- [ ] Standardize markdown output format to specification
  ```markdown
  # ID: [unique_identifier]

  ## Title: [content_title]

  ## Type: [content_type]

  ## Permalink: [url]

  ## Description:
  [content_description]

  ## Metadata:

  ### Comments: [count]

  ### Likes: [count]

  ### Tags:
  - tag1
  - tag2
  ```
- [ ] Add MarkItDown package for proper markdown conversion
  - Install markitdown
  - Replace custom formatting logic
  - Test output quality

### Security Enhancements
- [ ] Implement user agent rotation for web scrapers
  - Create user agent pool
  - Rotate on each request
  - Add to Instagram and TikTok scrapers

---

## Phase 2: Testing Suite
**Priority: HIGH - Required by specification**
**Timeline: Week 1-2**

### Unit Testing
- [ ] Create pytest unit tests with mocking
  - Test each scraper independently
  - Mock external API calls
  - Test state management
  - Test markdown conversion
  - Test error handling

### Integration Testing
- [ ] Create integration tests for parallel processing
  - Test ThreadPoolExecutor functionality
  - Test file archiving
  - Test rsync functionality
  - Test scheduling logic

### End-to-End Testing
- [ ] Create end-to-end tests with mock data
  - Full workflow simulation
  - Verify markdown output format
  - Verify file naming and placement
  - Test incremental updates

---

## Phase 3: Fix Critical Production Issues
**Priority: CRITICAL - Security & reliability**
**Timeline: Week 2**

### Systemd Service Fixes
- [ ] Fix hardcoded paths in systemd services
  - Replace `User=ben` with configurable user
  - Replace `/home/ben/dev/hvac-kia-content` with `/opt/hvac-kia-content`
  - Use environment variables or templating
- [ ] Remove hardcoded DISPLAY/XAUTHORITY from systemd services
  - Move to separate environment file
  - Only load for TikTok-specific service
  - Document display server requirements

### Startup Validation
- [ ] Add environment variable validation on startup
  ```python
  import os

  def validate_environment():
      # Variables that must be set (and non-empty) before any scraper runs
      required = [
          'WORDPRESS_USERNAME',
          'WORDPRESS_API_KEY',
          'YOUTUBE_CHANNEL_URL',
          'INSTAGRAM_USERNAME',
          'INSTAGRAM_PASSWORD'
      ]
      # os.getenv returns None for unset variables; empty strings also count as missing
      missing = [k for k in required if not os.getenv(k)]
      if missing:
          raise ValueError(f"Missing required env vars: {missing}")
  ```

### Error Handling & Recovery
- [ ] Implement retry logic using configured `RETRY_CONFIG`
  - Add tenacity library
  - Wrap network calls with retry decorator
  - Use exponential backoff settings
- [ ] Add HTTP connection pooling with `requests.Session`
  - Create session in `base_scraper.__init__`
  - Reuse session across requests
  - Configure connection pool size
- [ ] Fix error isolation (don't crash orchestrator on single failure)
  - Continue processing other scrapers
  - Collect all errors for reporting
  - Return partial results

---

## Phase 4: Production Hardening
**Priority: HIGH - Operations & monitoring**
**Timeline: Week 2-3**

### Monitoring & Alerting
- [ ] Implement health check monitoring and alerting
  - Send ping to healthcheck URL on success
  - Email alerts on critical failures
  - Track metrics (items processed, errors, duration)

### Logging Improvements
- [ ] Add log rotation with `RotatingFileHandler`
  - Configure max file size (10MB)
  - Keep 5 backup files
  - Implement for each source

### Input Validation
- [ ] Add input validation for configuration values
  - Validate numeric values are positive
  - Check rate limits are reasonable
  - Verify paths exist and are writable

---

## Phase 5: Documentation & Deployment
**Priority: MEDIUM - Final preparation**
**Timeline: Week 3**

### Documentation
- [ ] Document why systemd was chosen over k8s
  - TikTok requires display server access
  - Browser automation incompatible with containers
  - Add to README and architecture docs
- [ ] Create production deployment checklist
  - Pre-deployment verification steps
  - Configuration validation
  - Rollback procedures
- [ ] Create rollback procedures and documentation
  - Backup current version
  - Database/state rollback steps
  - Service restoration process

### Testing & Monitoring
- [ ] Test full production deployment on staging environment
  - Clone production config
  - Run for 24 hours
  - Verify all sources working
- [ ] Set up monitoring dashboards and alerts
  - Grafana dashboard for metrics
  - Alert rules for failures
  - Disk usage monitoring

---

## Implementation Priority

### 🔴 Critical (Do First)
1. Fix hardcoded paths in systemd services
2. Add environment variable validation
3. Enable NAS sync
4. Fix error isolation
5. Fix scheduling times

### 🟠 High Priority (Do Second)
6. Implement retry logic
7. Add connection pooling
8. Create pytest unit tests
9. Implement health monitoring
10. Add log rotation

### 🟡 Medium Priority (Do Third)
11. Fix file naming convention
12. Create proper directory structure
13. Standardize markdown format
14. Implement media downloading
15. Add MarkItDown package

### 🟢 Nice to Have (If Time Permits)
16. User agent rotation
17. Integration tests
18. End-to-end tests
19. Monitoring dashboards
20. Comprehensive documentation

---

## Success Criteria

### Minimum Viable Production
- [x] All scrapers functional
- [x] Incremental updates working
- [ ] NAS sync enabled
- [ ] Proper error handling
- [ ] Systemd services portable
- [ ] Environment validation
- [ ] Basic monitoring

### Full Production Ready
- [ ] All specification requirements met
- [ ] Comprehensive test suite
- [ ] Full monitoring and alerting
- [ ] Complete documentation
- [ ] Rollback procedures
- [ ] 99% uptime capability

---

## Notes

### Why Not Docker/Kubernetes?
TikTok scraping requires a display server (X11/Wayland) for browser automation with Scrapling. This makes containerization impractical, because containers have no native display server access. Systemd provides adequate service management for this use case.

### Current Gaps from Specification
1. **Scheduling**: Currently 6 AM/6 PM, spec requires 8 AM/12 PM
2. **NAS Sync**: Implemented but not activated
3. **Media Downloads**: Not implemented
4. **File Naming**: Simplified format used
5. **Directory Structure**: Flat structure instead of source-separated
6. **Testing**: Manual tests only, no pytest suite
7. **Markdown Format**: Custom format instead of specified structure

### Estimated Timeline
- **Week 1**: Critical fixes and spec compliance
- **Week 2**: Testing and error handling
- **Week 3**: Monitoring and documentation
- **Total**: 3 weeks to full production readiness

---

## Quick Start Commands

```bash
# Step 1: Fix hardcoded user and paths in systemd services
# Single quotes keep ${SERVICE_USER} as a literal placeholder. systemd does not
# expand variables in User=, so the placeholder must be substituted at install time.
sed -i 's/User=ben/User=${SERVICE_USER}/g' systemd/*.service
sed -i 's|/home/ben/dev|/opt|g' systemd/*.service

# Step 2: Enable NAS sync
# This appends the call to the end of the file; check that `orchestrator` is in scope there.
echo "orchestrator.sync_to_nas()" >> run_production.py

# Step 3: Fix scheduling (6 AM -> 8 AM, 6 PM -> 12 PM)
sed -i 's/06:00:00/08:00:00/g' systemd/*.timer
sed -i 's/18:00:00/12:00:00/g' systemd/*.timer

# Step 4: Test deployment
./install_production.sh
systemctl status hkia-content-aggregator.timer
```

---

*Last Updated: 2024-12-18*
*Version: 1.0*