hvac-kia-content/UPDATED_CAPTURE_STATUS.md
Ben Reed 7e5377e7b1 docs: Update all documentation to use hkia naming convention
2025-08-19 13:40:27 -03:00


HKIA - Updated Production Backlog Capture

🚀 Updated Configuration

Started: August 18, 2025 @ 10:54 PM ADT

📈 New Rate Limits & Targets

| Source | Previous Target | New Target | Rate Limit | Estimated Time |
|---|---|---|---|---|
| Instagram | 200 posts | 1000 posts | 200/hour | ~5 hours |
| TikTok | 300 videos | 1000 videos | Browser-based | ~2-3 hours |

Instagram Optimization Changes

  • Rate limit: Increased from 100 to 200 posts/hour
  • Delays: Reduced from 15-30s to 10-20 seconds
  • Extended breaks: Every 10 requests (was 5)
  • Break duration: 30-60 seconds (was 60-120s)
  • Speed improvement: ~40-50% faster
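
The pacing settings above can be turned into a rough throughput model. The sketch below is illustrative only: the function names are hypothetical, it assumes one request fetches one post, that an extended break is added on top of the normal delay, and that network time is negligible. None of this is confirmed by the capture scripts themselves.

```python
import random

# Values taken from the list above; names are hypothetical.
DELAY_RANGE = (10, 20)   # seconds between requests
BREAK_EVERY = 10         # extended break after every 10th request
BREAK_RANGE = (30, 60)   # seconds per extended break


def pause_after(request_number, rng=random):
    """Return the seconds to wait after the given 1-based request number."""
    pause = rng.uniform(*DELAY_RANGE)
    if request_number % BREAK_EVERY == 0:
        # Assumption: the extended break is added to the normal delay.
        pause += rng.uniform(*BREAK_RANGE)
    return pause


def expected_posts_per_hour(delay_range, break_every, break_range):
    """Average throughput if pauses were the only cost (ignores network time)."""
    mean_delay = sum(delay_range) / 2
    mean_break = sum(break_range) / 2
    seconds_per_request = mean_delay + mean_break / break_every
    return 3600 / seconds_per_request


old_rate = expected_posts_per_hour((15, 30), 5, (60, 120))
new_rate = expected_posts_per_hour(DELAY_RANGE, BREAK_EVERY, BREAK_RANGE)
print(f"old: {old_rate:.0f}/hour, new: {new_rate:.0f}/hour")
```

Under these assumptions the model gives roughly 89 posts/hour for the old settings and roughly 185 posts/hour for the new ones, which sits just under the 200/hour limit.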

🎯 TikTok Enhancements

  • Total videos: 1000 (if available)
  • Videos with captions: 100 (increased from 50)
  • Caption fetching: Individual page visits for detailed content

📊 Already Completed Sources

| Source | Items Captured | File Size | Status |
|---|---|---|---|
| WordPress | 139 posts | 1.5 MB | Complete |
| Podcast | 428 episodes | 727 KB | Complete |
| YouTube | 200 videos | 107 KB | Complete |

🔄 Currently Processing

  • Instagram: Fetching 1000 posts with optimized rate limiting
  • Next: TikTok, with a target of 1000 videos

📁 Output Location

/home/ben/dev/hvac-kia-content/data_production_backlog/markdown_current/
├── hkia_wordpress_backlog_[timestamp].md
├── hkia_podcast_backlog_[timestamp].md
├── hkia_youtube_backlog_[timestamp].md
├── hkia_instagram_backlog_[timestamp].md (pending)
└── hkia_tiktok_backlog_[timestamp].md (pending)
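
One way to see which of these files exist so far is to glob for the naming pattern shown in the tree. This is a minimal sketch: `backlog_files` is a hypothetical helper, not part of the project, and it matches any timestamp text because the exact timestamp format is not stated here.

```python
from pathlib import Path

SOURCES = ("wordpress", "podcast", "youtube", "instagram", "tiktok")


def backlog_files(output_dir):
    """Map each source name to the sorted list of its backlog files."""
    root = Path(output_dir)
    return {
        source: sorted(root.glob(f"hkia_{source}_backlog_*.md"))
        for source in SOURCES
    }


if __name__ == "__main__":
    found = backlog_files(
        "/home/ben/dev/hvac-kia-content/data_production_backlog/markdown_current"
    )
    for source, files in found.items():
        # Report the last file by name, or "pending" if none exists yet.
        status = files[-1].name if files else "pending"
        print(f"{source}: {status}")
```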

📈 Progress Monitoring

To monitor real-time progress:

# Watch Instagram progress
tail -f instagram_1000.log

# Check overall status
./monitor_backlog_progress.sh --live

⏱️ Time Estimates

  • Instagram: ~5 hours for 1000 posts at 200/hour
  • TikTok: ~2-3 hours for 1000 videos (depends on caption fetching)
  • Total remaining: ~7-8 hours
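
The estimates above can be projected onto the start time as a rough completion window. The sketch assumes the two captures run back to back from the recorded start with no interruptions, and it treats ADT as a fixed UTC-3 offset. The TikTok figure is the quoted 2-3 hour range, since no per-hour rate is given for the browser-based capture.

```python
from datetime import datetime, timedelta, timezone

ADT = timezone(timedelta(hours=-3), "ADT")
started = datetime(2025, 8, 18, 22, 54, tzinfo=ADT)  # August 18, 2025 @ 10:54 PM ADT

instagram_hours = 1000 / 200   # 1000 posts at 200 posts/hour
tiktok_hours = (2, 3)          # quoted estimate range, in hours

# Assumption: Instagram runs first, then TikTok, with no gap between them.
earliest = started + timedelta(hours=instagram_hours + tiktok_hours[0])
latest = started + timedelta(hours=instagram_hours + tiktok_hours[1])

print(f"Estimated finish: {earliest:%b %d %I:%M %p} to {latest:%b %d %I:%M %p} ADT")
```

Under these assumptions the window falls between 5:54 AM and 6:54 AM ADT on August 19.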

🎯 Final Deliverables

  • ~2,767 total items (767 already captured + 2,000 new, if the Instagram and TikTok targets are met)
  • Specification-compliant markdown for all sources
  • Media files downloaded and organized
  • NAS synchronization upon completion

📝 Notes

The increased targets will provide a much more comprehensive historical dataset:

  • Instagram: 5x more content than originally planned
  • TikTok: 3.3x more content than originally planned
  • This will capture a significant portion of the brand's social media history
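
The totals and multipliers quoted in this document follow from the counts in the tables. A short check, with hypothetical variable names and assuming both new targets are fully met:

```python
completed = {"WordPress": 139, "Podcast": 428, "YouTube": 200}
targets = {"Instagram": (200, 1000), "TikTok": (300, 1000)}  # (previous, new)

already_captured = sum(completed.values())
new_items = sum(new_target for _, new_target in targets.values())
total_items = already_captured + new_items

print(f"already captured: {already_captured}, total if targets are met: {total_items}")
for source, (previous, new_target) in targets.items():
    print(f"{source}: {new_target / previous:.1f}x the previous target")
```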