Major Updates:
- Added image downloading for Instagram, YouTube, and Podcast scrapers
- Implemented cumulative markdown system for maintaining single source-of-truth files
- Deployed production services with automatic NAS sync for images
- Standardized file naming conventions per project specification
New Features:
- Instagram: Downloads all post images, carousel images, and video thumbnails
- YouTube: Downloads video thumbnails (highest quality available)
- Podcast: Downloads episode artwork/thumbnails
- Consistent image naming: {source}_{item_id}_{type}.{ext}
- Cumulative markdown updates to prevent file proliferation
- Automatic media sync to NAS at /mnt/nas/hvacknowitall/media/
Production Deployment:
- New systemd services: hvac-content-images-8am and hvac-content-images-12pm
- Runs twice daily at 8 AM and 12 PM Atlantic time
- Comprehensive rsync for both markdown and media files
File Structure Compliance:
- Renamed Instagram backlog to spec-compliant format
- Archived legacy directory structures
- Ensured all new files follow <brandName>_<source>_<dateTime>.md format
Testing:
- Successfully captured Instagram posts 1-1000 with images
- Launched the next batch (posts 1001-2000), which is currently in progress
- Verified thumbnail downloads for YouTube and Podcast content
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
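The two naming patterns mentioned above, `{source}_{item_id}_{type}.{ext}` for images and `<brandName>_<source>_<dateTime>.md` for markdown, can be sketched as small shell helpers. This is only an illustration: the function names `image_name` and `markdown_name`, the sample IDs, the brand value `hvacknowitall`, and the timestamp layout are assumptions, not taken from the project code.

```shell
#!/bin/bash
# Hypothetical helpers illustrating the naming conventions from the commit message.

# Build a media filename of the form {source}_{item_id}_{type}.{ext}.
image_name() {
    local source="$1" item_id="$2" type="$3" ext="$4"
    printf '%s_%s_%s.%s\n' "$source" "$item_id" "$type" "$ext"
}

# Build a markdown filename of the form <brandName>_<source>_<dateTime>.md.
# The caller supplies the timestamp; its exact layout is an assumption here.
markdown_name() {
    local brand="$1" source="$2" stamp="$3"
    printf '%s_%s_%s.md\n' "$brand" "$source" "$stamp"
}

# Sample values are made up for illustration.
image_name instagram ABC123 thumbnail jpg
markdown_name hvacknowitall youtube 2025-01-01T080000
```

Keeping the pattern in one function per file type means every scraper produces names the same way, which is the point of the standardization described above.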
74 lines
No EOL
2.1 KiB
Bash
Executable file
#!/bin/bash
# Update script to enable image downloading in production

echo "Updating HVAC Content Aggregation to include image downloads..."
echo

# Stop and disable old services
echo "Stopping old services..."
sudo systemctl stop hvac-content-8am.timer hvac-content-12pm.timer
sudo systemctl disable hvac-content-8am.service hvac-content-12pm.service
sudo systemctl disable hvac-content-8am.timer hvac-content-12pm.timer

# Copy new service files
echo "Installing new services with image downloads..."
sudo cp hvac-content-images-8am.service /etc/systemd/system/
sudo cp hvac-content-images-12pm.service /etc/systemd/system/

# Create new timer files (reuse existing timers with new names)
# Note: OnCalendar has no time zone suffix, so both timers fire in the
# server's local time zone. The server clock must be set to Atlantic time
# for the 8 AM / 12 PM schedule to match.
sudo tee /etc/systemd/system/hvac-content-images-8am.timer > /dev/null <<EOF
[Unit]
Description=Run HVAC Content with Images at 8 AM daily

[Timer]
OnCalendar=*-*-* 08:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF

sudo tee /etc/systemd/system/hvac-content-images-12pm.timer > /dev/null <<EOF
[Unit]
Description=Run HVAC Content with Images at 12 PM daily

[Timer]
OnCalendar=*-*-* 12:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF

# Reload systemd
echo "Reloading systemd..."
sudo systemctl daemon-reload

# Enable new services
echo "Enabling new services..."
sudo systemctl enable hvac-content-images-8am.timer
sudo systemctl enable hvac-content-images-12pm.timer

# Start timers
echo "Starting timers..."
sudo systemctl start hvac-content-images-8am.timer
sudo systemctl start hvac-content-images-12pm.timer

# Show status
echo
echo "Service status:"
sudo systemctl status hvac-content-images-8am.timer --no-pager
echo
sudo systemctl status hvac-content-images-12pm.timer --no-pager
echo
echo "Next scheduled runs:"
# Quote the pattern so the shell does not expand it against the
# hvac-content-images-*.service files in the current directory
sudo systemctl list-timers 'hvac-content-images-*' --no-pager

echo
echo "✅ Update complete! Image downloading is now enabled in production."
echo "The scrapers will now download:"
echo " - Instagram post images and video thumbnails"
echo " - YouTube video thumbnails"
echo " - Podcast episode thumbnails"
echo
echo "Images will be synced to: /mnt/nas/hvacknowitall/media/"
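
The two timer heredocs in the script differ only in their description label and clock time. One possible way to remove the duplication is a small generator function, sketched below. The name `render_timer` and its parameters are hypothetical and not part of the committed script; the sketch only prints the unit text and installs nothing.

```shell
#!/bin/bash
# Hypothetical helper: print a systemd timer unit for a given label and time.
# Arguments: $1 = human-readable label (e.g. "8 AM"), $2 = HH:MM:SS clock time.
render_timer() {
    local label="$1" clock="$2"
    cat <<EOF
[Unit]
Description=Run HVAC Content with Images at ${label} daily

[Timer]
OnCalendar=*-*-* ${clock}
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Preview both units on stdout without touching /etc/systemd/system.
render_timer "8 AM" "08:00:00"
echo
render_timer "12 PM" "12:00:00"
```

To install a unit with this approach, the output could be piped into the same `sudo tee /etc/systemd/system/hvac-content-images-8am.timer > /dev/null` command the script already uses.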