Commit graph

3 commits

Author SHA1 Message Date
Ben Reed
dabef8bfcb Implement retry logic, connection pooling, and production hardening
Major Production Improvements:
- Added retry logic with exponential backoff using tenacity
- Implemented HTTP connection pooling via requests.Session
- Added health check monitoring with metrics reporting
- Implemented configuration validation for all numeric values
- Fixed error isolation (verified continues on failure)

Technical Changes:
- BaseScraper: Added session management and make_request() method
- WordPressScraper: Updated all HTTP calls to use retry logic
- Production runner: Added validate_config() and health check ping
- Retry config: 3 attempts, 5-60s exponential backoff

System is now production-ready with robust error handling,
automatic retries, and health monitoring. Remaining tasks
focus on spec compliance (media downloads, markdown format)
and testing/documentation.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-18 20:16:02 -03:00
Ben Reed
05218a873b Fix critical production issues and improve spec compliance
Production Readiness Improvements:
- Fixed scheduling to match spec (8 AM & 12 PM ADT instead of 6 AM/6 PM)
- Enabled NAS synchronization in production runner with error handling
- Fixed file naming convention to spec format (hvacknowitall_combined_YYYY-MM-DD-THHMMSS.md)
- Made systemd services portable (removed hardcoded user/paths)
- Added environment variable validation on startup
- Moved DISPLAY/XAUTHORITY to .env configuration

Systemd Improvements:
- Created template service file (@.service) for any user
- Changed all paths to /opt/hvac-kia-content
- Updated installation script for portable deployment
- Fixed service dependencies and resource limits

Documentation:
- Created comprehensive PRODUCTION_TODO.md with 25 tasks
- Added PRODUCTION_GUIDE.md with deployment instructions
- Documented spec compliance gaps (65% complete)

Remaining work includes retry logic, connection pooling, media downloads,
and pytest test suite as documented in PRODUCTION_TODO.md

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-18 20:07:55 -03:00
Ben Reed
95e0499791 feat: Implement WordPress scraper with comprehensive tests
- Created WordPressScraper class extending BaseScraper
- Fetches posts with pagination support
- Enriches posts with author, category, and tag information
- Implements incremental updates via state management
- Word count calculation for content
- All 11 tests passing

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-18 12:19:56 -03:00