Production Readiness Todo List
Overview
This document outlines all tasks required to meet the original specification and prepare the HKIA Content Aggregator for production deployment. Tasks are organized by priority and phase.
Note: Docker/Kubernetes deployment is not feasible due to TikTok scraping requiring display server access. The system uses systemd for service management instead.
Phase 1: Meet Original Specification
Priority: CRITICAL - Core functionality gaps
Timeline: Week 1
Scheduling & Timing
- Fix scheduling times to match spec (8 AM & 12 PM ADT instead of 6 AM & 6 PM)
  - Update systemd timer files
  - Update production configuration
  - Test timer activation
Data Synchronization
- Enable NAS sync in production runner
  - Add orchestrator.sync_to_nas() call
  - Verify NAS mount path
  - Test rsync functionality
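As a rough illustration of the sync step, the sketch below assumes sync_to_nas() shells out to rsync and refuses to run when the destination is missing. The helper name build_rsync_command, the function signature, and the paths in the usage note are hypothetical; the real orchestrator method may look quite different.

```python
import subprocess
from pathlib import Path


def build_rsync_command(source: Path, dest: Path) -> list[str]:
    """Build an rsync invocation that copies the contents of source into dest."""
    # The trailing slash on the source copies its contents, not the directory itself.
    return ["rsync", "-a", f"{source}/", f"{dest}/"]


def sync_to_nas(source: Path, dest: Path) -> None:
    """Hypothetical stand-in for orchestrator.sync_to_nas()."""
    if not dest.is_dir():
        # Fail loudly instead of writing into an unmounted mount point.
        raise FileNotFoundError(f"NAS path not available: {dest}")
    # check=True raises CalledProcessError if rsync exits with a non-zero status.
    subprocess.run(build_rsync_command(source, dest), check=True)
```

A production runner could call sync_to_nas(Path("data"), Path("/mnt/nas/hkia")) after the scrapers finish; both paths here are assumptions for illustration.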
File Organization
- Fix file naming convention to match spec format
  - Change from: update_20241218_060000.md
  - To: hkia_<source>_2024-12-18-T060000.md
- Create proper directory structure

    data/
    ├── markdown_current/
    ├── markdown_archives/
    │   ├── WordPress/
    │   ├── Instagram/
    │   ├── YouTube/
    │   ├── Podcast/
    │   └── MailChimp/
    ├── media/
    │   ├── WordPress/
    │   ├── Instagram/
    │   ├── YouTube/
    │   ├── Podcast/
    │   └── MailChimp/
    └── .state/
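The naming pattern and directory layout above can be sketched in a few lines of Python. The function names build_filename and create_data_tree are hypothetical, and the casing of the `<source>` part of the filename is an assumption, since the spec excerpt does not show it.

```python
from datetime import datetime
from pathlib import Path

# Source folders as listed in the directory structure above.
SOURCES = ["WordPress", "Instagram", "YouTube", "Podcast", "MailChimp"]


def build_filename(source: str, when: datetime) -> str:
    """Return a name shaped like hkia_<source>_2024-12-18-T060000.md."""
    return f"hkia_{source}_{when:%Y-%m-%d}-T{when:%H%M%S}.md"


def create_data_tree(root: Path) -> None:
    """Create the data/ layout; safe to call repeatedly."""
    (root / "markdown_current").mkdir(parents=True, exist_ok=True)
    (root / ".state").mkdir(parents=True, exist_ok=True)
    for parent in ("markdown_archives", "media"):
        for source in SOURCES:
            (root / parent / source).mkdir(parents=True, exist_ok=True)
```

For example, build_filename("wordpress", datetime(2024, 12, 18, 6, 0, 0)) yields hkia_wordpress_2024-12-18-T060000.md.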
Content Processing
- Implement media downloading for all sources
  - YouTube thumbnails and videos (optional)
  - Instagram images and videos
  - WordPress featured images
  - Podcast episode artwork
- Standardize markdown output format to specification

    # ID: [unique_identifier]
    ## Title: [content_title]
    ## Type: [content_type]
    ## Permalink: [url]
    ## Description: [content_description]
    ## Metadata:
    ### Comments: [count]
    ### Likes: [count]
    ### Tags:
    - tag1
    - tag2

- Add MarkItDown package for proper markdown conversion
  - Install markitdown
  - Replace custom formatting logic
  - Test output quality
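To make the target format concrete, here is a minimal sketch of a renderer that emits one field per line in the order given by the specification. The function name render_item and the dictionary keys are hypothetical; the scrapers' actual item structure is not shown in this document.

```python
def render_item(item: dict) -> str:
    """Render one content item in the specified markdown layout."""
    lines = [
        f"# ID: {item['id']}",
        f"## Title: {item['title']}",
        f"## Type: {item['type']}",
        f"## Permalink: {item['permalink']}",
        f"## Description: {item['description']}",
        "## Metadata:",
        # Missing counts default to 0 rather than raising.
        f"### Comments: {item.get('comments', 0)}",
        f"### Likes: {item.get('likes', 0)}",
        "### Tags:",
    ]
    # One bullet per tag; an item without tags just ends after the heading.
    lines.extend(f"- {tag}" for tag in item.get("tags", []))
    return "\n".join(lines) + "\n"
```

Keeping the renderer a pure function of the item makes the "Verify markdown output format" test in Phase 2 a plain string comparison.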
Security Enhancements
- Implement user agent rotation for web scrapers
  - Create user agent pool
  - Rotate on each request
  - Add to Instagram and TikTok scrapers
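One simple way to rotate on each request is a round-robin pool. The sketch below is only an illustration: the class name UserAgentRotator is hypothetical, and the pool entries are placeholders rather than real browser strings.

```python
import itertools

# Placeholder pool; a real deployment would list current browser user agents.
USER_AGENT_POOL = [
    "ExampleBrowser/1.0",
    "ExampleBrowser/2.0",
    "ExampleBrowser/3.0",
]


class UserAgentRotator:
    """Hand out user agents in round-robin order, one per request."""

    def __init__(self, agents):
        if not agents:
            raise ValueError("user agent pool must not be empty")
        self._cycle = itertools.cycle(agents)

    def next_headers(self) -> dict:
        """Return request headers carrying the next user agent in the pool."""
        return {"User-Agent": next(self._cycle)}
```

A scraper would call rotator.next_headers() before each request and pass the result as its headers. Round-robin is predictable; picking with random.choice is an equally small alternative if a less regular pattern is wanted.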
Phase 2: Testing Suite
Priority: HIGH - Required by specification
Timeline: Week 1-2
Unit Testing
- Create pytest unit tests with mocking
  - Test each scraper independently
  - Mock external API calls
  - Test state management
  - Test markdown conversion
  - Test error handling
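The sketch below shows the general shape of a pytest-style test that mocks an external API call. The WordPressScraper class here is a deliberately tiny, hypothetical stand-in written for the example, not the project's scraper; the only real interface it relies on is the WordPress REST endpoint /wp-json/wp/v2/posts, whose posts carry a title.rendered field.

```python
from unittest.mock import Mock


class WordPressScraper:
    """Hypothetical minimal scraper: takes an injected HTTP session."""

    def __init__(self, session, base_url: str):
        self.session = session
        self.base_url = base_url

    def fetch_titles(self) -> list:
        response = self.session.get(f"{self.base_url}/wp-json/wp/v2/posts", timeout=30)
        response.raise_for_status()
        return [post["title"]["rendered"] for post in response.json()]


def test_fetch_titles_uses_mocked_api():
    # The Mock replaces requests.Session, so no network traffic occurs.
    session = Mock()
    session.get.return_value.json.return_value = [{"title": {"rendered": "Hello"}}]

    scraper = WordPressScraper(session, "https://example.com")

    assert scraper.fetch_titles() == ["Hello"]
    session.get.assert_called_once_with(
        "https://example.com/wp-json/wp/v2/posts", timeout=30
    )
```

Injecting the session through the constructor is the design choice that makes this work; it also fits the connection-pooling task in Phase 3, where each scraper reuses one session.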
Integration Testing
- Create integration tests for parallel processing
  - Test ThreadPoolExecutor functionality
  - Test file archiving
  - Test rsync functionality
  - Test scheduling logic
End-to-End Testing
- Create end-to-end tests with mock data
  - Full workflow simulation
  - Verify markdown output format
  - Verify file naming and placement
  - Test incremental updates
Phase 3: Fix Critical Production Issues
Priority: CRITICAL - Security & reliability
Timeline: Week 2
Systemd Service Fixes
- Fix hardcoded paths in systemd services
  - Replace User=ben with configurable user
  - Replace /home/ben/dev/hvac-kia-content with /opt/hvac-kia-content
  - Use environment variables or templating
- Remove hardcoded DISPLAY/XAUTHORITY from systemd services
  - Move to separate environment file
  - Only load for TikTok-specific service
  - Document display server requirements
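If the templating route is taken, the install step could generate the [Service] section instead of patching it with sed. The following is a sketch under assumptions: render_service_section is a hypothetical helper, it emits only a fragment of a unit file, and the environment file path in the usage note is invented for illustration. User=, WorkingDirectory=, and EnvironmentFile= are standard systemd directives.

```python
from typing import Optional


def render_service_section(
    service_user: str, install_dir: str, env_file: Optional[str] = None
) -> str:
    """Render a [Service] fragment with no hardcoded user or path."""
    lines = [
        "[Service]",
        f"User={service_user}",
        f"WorkingDirectory={install_dir}",
    ]
    if env_file:
        # Only the TikTok service passes env_file, so DISPLAY/XAUTHORITY
        # stay out of every other unit.
        lines.append(f"EnvironmentFile={env_file}")
    return "\n".join(lines) + "\n"
```

For instance, render_service_section("hkia", "/opt/hvac-kia-content", env_file="/etc/hkia/tiktok-display.env") would be used for the TikTok unit, while the other services omit env_file.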
Startup Validation
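- Add environment variable validation on startup

    import os

    def validate_environment():
        """Fail fast at startup if any required variable is unset or empty."""
        required = [
            'WORDPRESS_USERNAME',
            'WORDPRESS_API_KEY',
            'YOUTUBE_CHANNEL_URL',
            'INSTAGRAM_USERNAME',
            'INSTAGRAM_PASSWORD',
        ]
        # os.getenv returns None for unset variables; empty strings also count as missing
        missing = [k for k in required if not os.getenv(k)]
        if missing:
            raise ValueError(f"Missing required env vars: {missing}")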
Error Handling & Recovery
- Implement retry logic using configured RETRY_CONFIG
  - Add tenacity library
  - Wrap network calls with retry decorator
  - Use exponential backoff settings
- Add HTTP connection pooling with requests.Session
  - Create session in base_scraper.__init__
  - Reuse session across requests
  - Configure connection pool size
- Fix error isolation (don't crash orchestrator on single failure)
  - Continue processing other scrapers
  - Collect all errors for reporting
  - Return partial results
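The retry and error-isolation tasks can be illustrated together with the standard library alone. This is a sketch, not the planned implementation: with_retries is a hand-rolled stand-in for the tenacity decorator the list calls for, its attempts and base_delay parameters would presumably come from RETRY_CONFIG (whose shape is not shown here), and run_all is a hypothetical name for the orchestrator loop.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff between failures."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller record the failure
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


def run_all(scrapers, max_workers=4):
    """Run every scraper in parallel; one failure never stops the others."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run): name for name, run in scrapers.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                errors[name] = exc  # collected for reporting, not re-raised
    return results, errors
```

Here scrapers maps a source name to a zero-argument callable, for example {"wordpress": lambda: with_retries(wordpress.run)}. The caller receives partial results alongside the error map and decides whether the run counts as a failure.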
Phase 4: Production Hardening
Priority: HIGH - Operations & monitoring
Timeline: Week 2-3
Monitoring & Alerting
- Implement health check monitoring and alerting
  - Send ping to healthcheck URL on success
  - Email alerts on critical failures
  - Track metrics (items processed, errors, duration)
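A possible shape for the metrics record and the success ping is sketched below. RunMetrics, report_run, and the opener parameter are hypothetical names introduced for the example, and the healthcheck URL is whatever the chosen monitoring service provides. Email alerting is left out of the sketch.

```python
import urllib.request
from dataclasses import dataclass, field


@dataclass
class RunMetrics:
    """Per-run counters matching the metrics listed above."""

    items_processed: int = 0
    errors: list = field(default_factory=list)
    duration_seconds: float = 0.0

    @property
    def succeeded(self) -> bool:
        return not self.errors


def report_run(metrics: RunMetrics, healthcheck_url: str, opener=urllib.request.urlopen) -> bool:
    """Ping the healthcheck URL only after a clean run; return True if the ping was accepted."""
    if not metrics.succeeded:
        # No ping: the monitoring service notices the missed check-in.
        return False
    with opener(healthcheck_url, timeout=10) as response:
        return 200 <= response.status < 300
```

Skipping the ping on failure relies on the monitoring service alerting when a scheduled check-in is missed. The opener argument exists so tests can substitute a fake instead of making a network call.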
Logging Improvements
- Add log rotation with RotatingFileHandler
  - Configure max file size (10MB)
  - Keep 5 backup files
  - Implement for each source
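The rotation settings above map directly onto the standard library's RotatingFileHandler. In this sketch the function name get_source_logger, the hkia.<source> logger naming, and the log format are assumptions; the 10 MB limit and 5 backups come from the list.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path

MAX_BYTES = 10 * 1024 * 1024  # 10 MB per file
BACKUP_COUNT = 5  # keep 5 rotated files


def get_source_logger(source: str, log_dir: Path) -> logging.Logger:
    """Return a per-source logger that writes to a rotating file."""
    log_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(f"hkia.{source}")
    # Guard against attaching a second handler when called twice.
    if not any(isinstance(h, RotatingFileHandler) for h in logger.handlers):
        handler = RotatingFileHandler(
            log_dir / f"{source}.log", maxBytes=MAX_BYTES, backupCount=BACKUP_COUNT
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

With these settings each source uses at most about 60 MB of disk: one active file plus five backups.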
Input Validation
- Add input validation for configuration values
  - Validate numeric values are positive
  - Check rate limits are reasonable
  - Verify paths exist and are writable
Phase 5: Documentation & Deployment
Priority: MEDIUM - Final preparation
Timeline: Week 3
Documentation
- Document why systemd was chosen over k8s
  - TikTok requires display server access
  - Browser automation incompatible with containers
  - Add to README and architecture docs
- Create production deployment checklist
  - Pre-deployment verification steps
  - Configuration validation
  - Rollback procedures
- Create rollback procedures and documentation
  - Backup current version
  - Database/state rollback steps
  - Service restoration process
Testing & Monitoring
- Test full production deployment on staging environment
  - Clone production config
  - Run for 24 hours
  - Verify all sources working
- Set up monitoring dashboards and alerts
  - Grafana dashboard for metrics
  - Alert rules for failures
  - Disk usage monitoring
Implementation Priority
🔴 Critical (Do First)
- Fix hardcoded paths in systemd services
- Add environment variable validation
- Enable NAS sync
- Fix error isolation
- Fix scheduling times
🟠 High Priority (Do Second)
- Implement retry logic
- Add connection pooling
- Create pytest unit tests
- Implement health monitoring
- Add log rotation
🟡 Medium Priority (Do Third)
- Fix file naming convention
- Create proper directory structure
- Standardize markdown format
- Implement media downloading
- Add MarkItDown package
🟢 Nice to Have (If Time Permits)
- User agent rotation
- Integration tests
- End-to-end tests
- Monitoring dashboards
- Comprehensive documentation
Success Criteria
Minimum Viable Production
- All scrapers functional
- Incremental updates working
- NAS sync enabled
- Proper error handling
- Systemd services portable
- Environment validation
- Basic monitoring
Full Production Ready
- All specification requirements met
- Comprehensive test suite
- Full monitoring and alerting
- Complete documentation
- Rollback procedures
- 99% uptime capability
Notes
Why Not Docker/Kubernetes?
TikTok scraping requires a display server (X11/Wayland) for browser automation with Scrapling. This makes containerization impractical as containers don't have native display server access. Systemd provides adequate service management for this use case.
Current Gaps from Specification
- Scheduling: Currently 6 AM/6 PM, spec requires 8 AM/12 PM
- NAS Sync: Implemented but not activated
- Media Downloads: Not implemented
- File Naming: Simplified format used
- Directory Structure: Flat structure instead of source-separated
- Testing: Manual tests only, no pytest suite
- Markdown Format: Custom format instead of specified structure
Estimated Timeline
- Week 1: Critical fixes and spec compliance
- Week 2: Testing and error handling
- Week 3: Monitoring and documentation
- Total: 3 weeks to full production readiness
Quick Start Commands
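# Step 1: Remove hardcoded user and paths (Phase 3)
# Note: this leaves a literal ${SERVICE_USER} placeholder; substitute the real user
# before installing, because systemd does not expand variables in User=
sed -i 's/User=ben/User=${SERVICE_USER}/g' systemd/*.service
sed -i 's|/home/ben/dev|/opt|g' systemd/*.service
# Step 2: Enable NAS sync (Phase 1)
# Note: appending only works if run_production.py ends at module level with orchestrator in scope
echo "orchestrator.sync_to_nas()" >> run_production.py
# Step 3: Fix scheduling (Phase 1)
sed -i 's/06:00:00/08:00:00/g' systemd/*.timer
sed -i 's/18:00:00/12:00:00/g' systemd/*.timer
# Step 4: Test deployment
./install_production.sh
systemctl status hkia-content-aggregator.timer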
Last Updated: 2024-12-18
Version: 1.0