HKIA Content Aggregation System: content scraping and markdown generation for five sources (WordPress, MailChimp RSS, Podcast RSS, YouTube, Instagram)
Ben Reed 8d5750b1d1 Add comprehensive test infrastructure
- Created unit tests for BaseScraper with mocking
- Added integration tests for parallel processing
- Created end-to-end tests with realistic mock data
- Fixed initialization order in BaseScraper (logger before user agent)
- Fixed orchestrator method name (archive_current_file)
- Added tenacity dependency for retry logic
- Validated parallel processing performance and overlap detection
- Confirmed spec-compliant markdown formatting in tests
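
To illustrate why the initialization order matters, here is a minimal sketch, not the repository's actual code. It assumes the user-agent selection logs its choice, so the logger must exist first. The class name `BaseScraperSketch`, the `_pick_user_agent` helper, and the `USER_AGENTS` values are all hypothetical.

```python
import logging
import random


class BaseScraperSketch:
    """Hypothetical stand-in for BaseScraper; all names here are assumed."""

    USER_AGENTS = ["agent-a/1.0", "agent-b/1.0"]  # placeholder values

    def __init__(self, name):
        self.name = name
        # Create the logger first: _pick_user_agent() below logs its choice,
        # so calling it before this line would raise AttributeError.
        self.logger = logging.getLogger(f"scraper.{name}")
        self.user_agent = self._pick_user_agent()

    def _pick_user_agent(self):
        agent = random.choice(self.USER_AGENTS)
        self.logger.info("Using user agent %s", agent)
        return agent


scraper = BaseScraperSketch("wordpress")
```

Swapping the two assignments in `__init__` would fail as soon as `_pick_user_agent` touched `self.logger`, which is the kind of bug an ordering fix like this addresses.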

Tests cover:
- Base scraper functionality (state, markdown, retry logic, media downloads)
- Parallel vs sequential execution timing
- Error isolation between scrapers
- Directory structure creation
- State management across runs
- Full workflow with realistic data
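
The parallel execution and error isolation covered above can be sketched as follows. This is a hedged illustration using only the standard library, not the project's orchestrator: `run_scrapers_parallel` and `fake_scraper` are hypothetical names, and the scrapers are simulated with `time.sleep`.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_scrapers_parallel(scrapers, max_workers=4):
    """Run each scraper callable in its own thread.

    Returns (results, errors) so one failing scraper does not
    prevent the others from finishing.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn): name for name, fn in scrapers.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # Isolate the failure to this scraper only.
                errors[name] = exc
    return results, errors


def fake_scraper(delay, fail=False):
    """Build a simulated scraper that sleeps, then succeeds or raises."""
    def run():
        time.sleep(delay)
        if fail:
            raise RuntimeError("simulated scraper failure")
        return f"ok after {delay}s"
    return run


scrapers = {
    "wordpress": fake_scraper(0.2),
    "youtube": fake_scraper(0.2),
    "instagram": fake_scraper(0.2, fail=True),
}

start = time.perf_counter()
results, errors = run_scrapers_parallel(scrapers)
elapsed = time.perf_counter() - start
print(f"finished in {elapsed:.2f}s, ok={sorted(results)}, failed={sorted(errors)}")
```

Because the three simulated scrapers run concurrently, the elapsed time should be close to the longest single delay rather than the sum of all three, and the simulated Instagram failure ends up in `errors` without affecting the other two results.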

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-18 21:16:14 -03:00
config Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
docs Add comprehensive production documentation and testing 2025-08-18 20:20:52 -03:00
src Add comprehensive test infrastructure 2025-08-18 21:16:14 -03:00
systemd Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
test_data Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
tests Add comprehensive test infrastructure 2025-08-18 21:16:14 -03:00
.gitignore Initial commit: Project foundation with base scraper and tests 2025-08-18 12:15:17 -03:00
.python-version Initial commit: Project foundation with base scraper and tests 2025-08-18 12:15:17 -03:00
capture_tiktok_backlog.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
CLAUDE.md Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
claude.md Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
debug_wordpress.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
debug_wordpress_raw.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
debug_youtube_detailed.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
debug_youtube_videos.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
detailed_monitor.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
install.sh Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
install_production.sh Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
main.py Initial commit: Project foundation with base scraper and tests 2025-08-18 12:15:17 -03:00
monitor_backlog.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
pyproject.toml Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
requirements.txt Implement retry logic, connection pooling, and production hardening 2025-08-18 20:16:02 -03:00
requirements_new.txt Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
run_production.py Implement retry logic, connection pooling, and production hardening 2025-08-18 20:16:02 -03:00
status.md Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
test_instagram_debug.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
test_instagram_fix.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
test_markitdown_fix.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
test_production_deployment.py Add comprehensive production documentation and testing 2025-08-18 20:20:52 -03:00
test_real_data.py feat: Enhance TikTok scraper with caption fetching and improved video discovery 2025-08-18 18:59:46 -03:00
test_sources_simple.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
test_tiktok_advanced.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
test_tiktok_scrapling.py Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00
uv.lock Fix critical production issues and improve spec compliance 2025-08-18 20:07:55 -03:00