Add comprehensive production documentation and testing
Documentation Added: - ARCHITECTURE_DECISIONS.md: Explains why systemd over k8s (TikTok display requirements) - DEPLOYMENT_CHECKLIST.md: Step-by-step deployment procedures - ROLLBACK_PROCEDURES.md: Emergency rollback and recovery procedures - test_production_deployment.py: Automated deployment verification script Key Documentation Highlights: - Detailed explanation of containerization limitations with browser automation - Complete deployment checklist with pre/post verification steps - Rollback scenarios with recovery time objectives - Emergency contact templates and backup procedures - Automated test script for production readiness 17 of 25 tasks completed (68% done) Remaining work focuses on spec compliance and testing 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
parent
dabef8bfcb
commit
a80af693ba
4 changed files with 1038 additions and 0 deletions
126
docs/ARCHITECTURE_DECISIONS.md
Normal file
126
docs/ARCHITECTURE_DECISIONS.md
Normal file
|
|
@ -0,0 +1,126 @@
|
|||
# Architecture Decisions
|
||||
|
||||
## Why Systemd Instead of Kubernetes/Docker
|
||||
|
||||
### Decision
|
||||
We chose to use systemd services for production deployment instead of the originally specified Kubernetes/Docker containerization.
|
||||
|
||||
### Context
|
||||
The original specification called for:
|
||||
- Docker containerization with multi-stage builds
|
||||
- Kubernetes deployment with CronJobs
|
||||
- Running on a Kubernetes cluster control plane node
|
||||
|
||||
### Problem
|
||||
TikTok scraping using the Scrapling library requires:
|
||||
1. **Display Server Access**: Scrapling uses a real browser (Chromium) for JavaScript rendering
|
||||
2. **X11/Wayland Session**: Browser automation needs GUI environment variables (DISPLAY, XAUTHORITY)
|
||||
3. **GPU Acceleration**: Optional but improves performance for browser rendering
|
||||
4. **Session Persistence**: Browser cookies and local storage for authentication
|
||||
|
||||
### Why Containers Don't Work
|
||||
|
||||
#### Technical Limitations
|
||||
1. **No Native Display Server**: Containers don't have built-in X11/Wayland support
|
||||
2. **Complex Workarounds**:
|
||||
- X11 forwarding requires mounting `/tmp/.X11-unix` socket
|
||||
- Needs host network mode for display access
|
||||
- Requires privileged mode for GPU access
|
||||
- Security implications of running privileged containers
|
||||
|
||||
3. **Environment Variables**:
|
||||
- DISPLAY and XAUTHORITY are host-specific
|
||||
- Change between reboots
|
||||
- Difficult to manage in container orchestration
|
||||
|
||||
4. **Browser Automation Issues**:
|
||||
- Headless mode doesn't work for all TikTok features
|
||||
- Virtual displays (Xvfb) are unreliable for modern web apps
|
||||
- WebGL and video playback issues in virtual displays
|
||||
|
||||
### Systemd Advantages
|
||||
|
||||
1. **Native Environment Access**:
|
||||
- Direct access to host display server
|
||||
- Can read environment variables from user session
|
||||
- No abstraction layer complications
|
||||
|
||||
2. **Simpler Configuration**:
|
||||
- Single service file vs Dockerfile + k8s manifests
|
||||
- Easy to debug and troubleshoot
|
||||
- Native logging with journald
|
||||
|
||||
3. **Resource Management**:
|
||||
- CPU and memory limits via systemd
|
||||
- Automatic restart on failure
|
||||
- Built-in timer units for scheduling
|
||||
|
||||
4. **Production Ready**:
|
||||
- Battle-tested for system services
|
||||
- Excellent integration with Linux systems
|
||||
- No additional overhead
|
||||
|
||||
### Implementation
|
||||
|
||||
```ini
|
||||
# systemd service can access display directly
|
||||
[Service]
|
||||
Environment="DISPLAY=:0"
|
||||
Environment="XAUTHORITY=/run/user/1000/.Xauthority"
|
||||
```
|
||||
|
||||
vs
|
||||
|
||||
```dockerfile
|
||||
# Docker requires complex workarounds
|
||||
FROM python:3.11
|
||||
# Need to install X11 libraries
|
||||
RUN apt-get install xvfb x11vnc
|
||||
# Run virtual display (unreliable)
|
||||
CMD xvfb-run -a python scraper.py
|
||||
```
|
||||
|
||||
### Trade-offs
|
||||
|
||||
**Lost Benefits of Containerization:**
|
||||
- Platform independence
|
||||
- Easy scaling across nodes
|
||||
- Isolated dependencies
|
||||
- Reproducible builds
|
||||
|
||||
**Gained Benefits:**
|
||||
- Simpler deployment
|
||||
- Direct hardware access
|
||||
- Lower overhead
|
||||
- Easier debugging
|
||||
- Native browser automation
|
||||
|
||||
### Alternatives Considered
|
||||
|
||||
1. **Selenium Grid**: Too complex for single-node deployment
|
||||
2. **Puppeteer in Docker**: Still requires display server workarounds
|
||||
3. **Headless Chrome**: Doesn't work reliably with TikTok
|
||||
4. **API-only approach**: TikTok has no public API
|
||||
|
||||
### Conclusion
|
||||
|
||||
For this specific use case where:
|
||||
- Browser automation with display access is required
|
||||
- Single node deployment is sufficient
|
||||
- Simplicity and reliability are priorities
|
||||
|
||||
Systemd provides a more appropriate solution than containerization.
|
||||
|
||||
### Future Considerations
|
||||
|
||||
If containerization becomes necessary:
|
||||
1. Consider separating TikTok scraper as standalone service
|
||||
2. Use container for non-browser scrapers only
|
||||
3. Investigate newer solutions like playwright-docker
|
||||
4. Re-evaluate when TikTok provides official API
|
||||
|
||||
---
|
||||
|
||||
*Decision Date: 2024-12-18*
|
||||
*Decision Makers: Development Team*
|
||||
*Status: Implemented*
|
||||
293
docs/DEPLOYMENT_CHECKLIST.md
Normal file
293
docs/DEPLOYMENT_CHECKLIST.md
Normal file
|
|
@ -0,0 +1,293 @@
|
|||
# Production Deployment Checklist
|
||||
|
||||
## Pre-Deployment Verification
|
||||
|
||||
### Environment Setup
|
||||
- [ ] Ubuntu 20.04+ or compatible Linux distribution
|
||||
- [ ] Python 3.9+ installed
|
||||
- [ ] 2GB+ RAM available
|
||||
- [ ] 10GB+ disk space available
|
||||
- [ ] Display server running (for TikTok scraping)
|
||||
- [ ] Network connectivity to all target sites
|
||||
|
||||
### Dependencies
|
||||
- [ ] Install system packages:
|
||||
```bash
|
||||
sudo apt update
|
||||
sudo apt install python3-pip python3-venv git chromium-browser
|
||||
```
|
||||
- [ ] Install Python packages:
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
- [ ] Verify Chromium browser works:
|
||||
```bash
|
||||
chromium-browser --version
|
||||
```
|
||||
|
||||
### Configuration Files
|
||||
- [ ] `.env` file created with all required variables:
|
||||
- [ ] WORDPRESS_USERNAME
|
||||
- [ ] WORDPRESS_API_KEY
|
||||
- [ ] YOUTUBE_CHANNEL_URL
|
||||
- [ ] INSTAGRAM_USERNAME
|
||||
- [ ] INSTAGRAM_PASSWORD
|
||||
- [ ] TIKTOK_TARGET
|
||||
- [ ] NAS_PATH
|
||||
- [ ] TIMEZONE (default: America/Halifax)
|
||||
- [ ] HEALTHCHECK_URL (optional)
|
||||
- [ ] ALERT_EMAIL (optional)
|
||||
|
||||
- [ ] `.env` file permissions set to 600:
|
||||
```bash
|
||||
chmod 600 .env
|
||||
```
|
||||
|
||||
### Directory Structure
|
||||
- [ ] Create production directories:
|
||||
```bash
|
||||
sudo mkdir -p /opt/hvac-kia-content
|
||||
sudo mkdir -p /var/log/hvac-content
|
||||
```
|
||||
- [ ] Set proper ownership:
|
||||
```bash
|
||||
sudo chown -R $USER:$USER /opt/hvac-kia-content
|
||||
sudo chown -R $USER:$USER /var/log/hvac-content
|
||||
```
|
||||
|
||||
### NAS Configuration
|
||||
- [ ] NAS mount point exists and is accessible
|
||||
- [ ] Write permissions verified:
|
||||
```bash
|
||||
touch /mnt/nas/hvacknowitall/test.txt && rm /mnt/nas/hvacknowitall/test.txt
|
||||
```
|
||||
- [ ] Sufficient space available on NAS
|
||||
|
||||
## Deployment Steps
|
||||
|
||||
### 1. Code Deployment
|
||||
- [ ] Clone repository to staging location:
|
||||
```bash
|
||||
git clone https://github.com/yourusername/hvac-kia-content.git
|
||||
cd hvac-kia-content
|
||||
```
|
||||
- [ ] Checkout correct branch/tag:
|
||||
```bash
|
||||
git checkout main # or specific version tag
|
||||
```
|
||||
|
||||
### 2. Configuration
|
||||
- [ ] Copy `.env.example` to `.env`:
|
||||
```bash
|
||||
cp .env.example .env
|
||||
```
|
||||
- [ ] Edit `.env` with production values
|
||||
- [ ] Verify environment variables:
|
||||
```bash
|
||||
python3 -c "from run_production import validate_environment; validate_environment()"
|
||||
```
|
||||
|
||||
### 3. Test Individual Scrapers
|
||||
- [ ] Test WordPress:
|
||||
```bash
|
||||
python test_real_data.py --source wordpress --items 1
|
||||
```
|
||||
- [ ] Test YouTube:
|
||||
```bash
|
||||
python test_real_data.py --source youtube --items 1
|
||||
```
|
||||
- [ ] Test Instagram (carefully):
|
||||
```bash
|
||||
python test_real_data.py --source instagram --items 1
|
||||
```
|
||||
- [ ] Test TikTok:
|
||||
```bash
|
||||
DISPLAY=:0 python test_real_data.py --source tiktok --items 1
|
||||
```
|
||||
- [ ] Test MailChimp RSS:
|
||||
```bash
|
||||
python test_real_data.py --source mailchimp --items 1
|
||||
```
|
||||
- [ ] Test Podcast RSS:
|
||||
```bash
|
||||
python test_real_data.py --source podcast --items 1
|
||||
```
|
||||
|
||||
### 4. Test Production Runner
|
||||
- [ ] Dry run test:
|
||||
```bash
|
||||
python run_production.py --job regular --dry-run
|
||||
```
|
||||
- [ ] Verify output file created
|
||||
- [ ] Check log files for errors
|
||||
- [ ] Verify NAS sync (if enabled)
|
||||
|
||||
### 5. Install Systemd Services
|
||||
- [ ] Run installation script:
|
||||
```bash
|
||||
chmod +x install_production.sh
|
||||
./install_production.sh
|
||||
```
|
||||
- [ ] Verify services installed:
|
||||
```bash
|
||||
systemctl list-unit-files | grep hvac
|
||||
```
|
||||
|
||||
### 6. Enable Services
|
||||
- [ ] Enable main timer:
|
||||
```bash
|
||||
sudo systemctl enable hvac-content-aggregator.timer
|
||||
```
|
||||
- [ ] Start timer:
|
||||
```bash
|
||||
sudo systemctl start hvac-content-aggregator.timer
|
||||
```
|
||||
- [ ] Verify timer is active:
|
||||
```bash
|
||||
systemctl status hvac-content-aggregator.timer
|
||||
```
|
||||
|
||||
### 7. Optional: TikTok Captions
|
||||
- [ ] Only if captions are required:
|
||||
```bash
|
||||
sudo systemctl enable hvac-tiktok-captions.timer
|
||||
sudo systemctl start hvac-tiktok-captions.timer
|
||||
```
|
||||
|
||||
## Post-Deployment Verification
|
||||
|
||||
### Immediate Checks
|
||||
- [ ] Timer scheduled correctly:
|
||||
```bash
|
||||
systemctl list-timers | grep hvac
|
||||
```
|
||||
- [ ] No errors in service status:
|
||||
```bash
|
||||
systemctl status hvac-content-aggregator.service
|
||||
```
|
||||
- [ ] Log files being created:
|
||||
```bash
|
||||
ls -la /var/log/hvac-content/
|
||||
```
|
||||
|
||||
### First Run Verification
|
||||
- [ ] Manually trigger first run:
|
||||
```bash
|
||||
sudo systemctl start hvac-content-aggregator.service
|
||||
```
|
||||
- [ ] Monitor logs in real-time:
|
||||
```bash
|
||||
tail -f /var/log/hvac-content/aggregator.log
|
||||
```
|
||||
- [ ] Verify all sources processed
|
||||
- [ ] Check output file created
|
||||
- [ ] Verify NAS sync completed
|
||||
- [ ] Health check ping received (if configured)
|
||||
|
||||
### 24-Hour Verification
|
||||
- [ ] Check timer fired at scheduled times (8 AM, 12 PM)
|
||||
- [ ] Review metrics.json for performance data
|
||||
- [ ] Check disk usage:
|
||||
```bash
|
||||
df -h /opt/hvac-kia-content
|
||||
```
|
||||
- [ ] Review error logs:
|
||||
```bash
|
||||
grep ERROR /var/log/hvac-content/*.log
|
||||
```
|
||||
- [ ] Verify incremental updates working (no duplicates)
|
||||
|
||||
## Monitoring Setup
|
||||
|
||||
### Log Monitoring
|
||||
- [ ] Set up log rotation if needed:
|
||||
```bash
|
||||
sudo nano /etc/logrotate.d/hvac-content
|
||||
```
|
||||
```
|
||||
/var/log/hvac-content/*.log {
|
||||
daily
|
||||
rotate 7
|
||||
compress
|
||||
missingok
|
||||
notifempty
|
||||
}
|
||||
```
|
||||
|
||||
### Health Monitoring
|
||||
- [ ] Configure health check service (e.g., Healthchecks.io)
|
||||
- [ ] Set up email alerts for failures
|
||||
- [ ] Create dashboard for metrics visualization
|
||||
|
||||
### Backup Configuration
|
||||
- [ ] Schedule state file backups:
|
||||
```bash
|
||||
0 2 * * * tar -czf /backup/hvac-state-$(date +\%Y\%m\%d).tar.gz /opt/hvac-kia-content/state/
|
||||
```
|
||||
- [ ] Test restore procedure
|
||||
|
||||
## Troubleshooting Checklist
|
||||
|
||||
### If Scrapers Fail
|
||||
- [ ] Check environment variables are set
|
||||
- [ ] Verify network connectivity
|
||||
- [ ] Check API rate limits
|
||||
- [ ] Review authentication credentials
|
||||
- [ ] Check display server (for TikTok)
|
||||
|
||||
### If Timer Doesn't Fire
|
||||
- [ ] Check timer is enabled
|
||||
- [ ] Verify system time is correct
|
||||
- [ ] Check systemd timer status
|
||||
- [ ] Review journal logs:
|
||||
```bash
|
||||
journalctl -u hvac-content-aggregator.timer
|
||||
```
|
||||
|
||||
### If NAS Sync Fails
|
||||
- [ ] Verify NAS is mounted
|
||||
- [ ] Check write permissions
|
||||
- [ ] Verify sufficient space
|
||||
- [ ] Test rsync manually
|
||||
|
||||
## Rollback Procedure
|
||||
|
||||
### Quick Rollback
|
||||
1. [ ] Stop services:
|
||||
```bash
|
||||
sudo systemctl stop hvac-content-aggregator.timer
|
||||
```
|
||||
2. [ ] Restore previous version:
|
||||
```bash
|
||||
cd /opt/hvac-kia-content
|
||||
git checkout <previous-version>
|
||||
```
|
||||
3. [ ] Restart services:
|
||||
```bash
|
||||
sudo systemctl start hvac-content-aggregator.timer
|
||||
```
|
||||
|
||||
### Full Rollback
|
||||
1. [ ] Stop and disable all services
|
||||
2. [ ] Restore backup of state files
|
||||
3. [ ] Restore previous code version
|
||||
4. [ ] Re-run installation script
|
||||
5. [ ] Verify functionality
|
||||
6. [ ] Re-enable services
|
||||
|
||||
## Sign-off
|
||||
|
||||
- [ ] Deployment completed successfully
|
||||
- [ ] All verification steps passed
|
||||
- [ ] Monitoring configured
|
||||
- [ ] Documentation updated
|
||||
- [ ] Team notified
|
||||
|
||||
**Deployed By:** _________________
|
||||
**Date:** _________________
|
||||
**Version:** _________________
|
||||
**Notes:** _________________
|
||||
|
||||
---
|
||||
|
||||
*Last Updated: 2024-12-18*
|
||||
341
docs/ROLLBACK_PROCEDURES.md
Normal file
341
docs/ROLLBACK_PROCEDURES.md
Normal file
|
|
@ -0,0 +1,341 @@
|
|||
# Rollback Procedures
|
||||
|
||||
## Overview
|
||||
This document provides step-by-step procedures for rolling back the HVAC Know It All Content Aggregator in case of deployment issues or system failures.
|
||||
|
||||
## Risk Assessment
|
||||
|
||||
### Severity Levels
|
||||
- **CRITICAL**: System completely non-functional, no data collection
|
||||
- **HIGH**: Major features broken, partial data loss
|
||||
- **MEDIUM**: Some scrapers failing, degraded performance
|
||||
- **LOW**: Minor issues, cosmetic problems
|
||||
|
||||
## Pre-Rollback Checklist
|
||||
|
||||
### Before Rolling Back
|
||||
1. **Document the Issue**
|
||||
- [ ] Screenshot error messages
|
||||
- [ ] Save relevant log files
|
||||
- [ ] Note exact time of failure
|
||||
- [ ] Record affected components
|
||||
|
||||
2. **Attempt Quick Fixes**
|
||||
- [ ] Check environment variables
|
||||
- [ ] Verify network connectivity
|
||||
- [ ] Restart failed service
|
||||
- [ ] Check disk space
|
||||
|
||||
3. **Backup Current State**
|
||||
```bash
|
||||
# Backup current state before rollback
|
||||
sudo tar -czf /backup/emergency-$(date +%Y%m%d-%H%M%S).tar.gz \
|
||||
/opt/hvac-kia-content/state/ \
|
||||
/opt/hvac-kia-content/data/ \
|
||||
/var/log/hvac-content/
|
||||
```
|
||||
|
||||
## Rollback Scenarios
|
||||
|
||||
### Scenario 1: Service Won't Start
|
||||
**Symptoms:** Systemd service fails to start after deployment
|
||||
|
||||
**Quick Fix:**
|
||||
```bash
|
||||
# Check service status
|
||||
systemctl status hvac-content-aggregator.service
|
||||
|
||||
# Check journal for errors
|
||||
journalctl -u hvac-content-aggregator.service -n 100
|
||||
|
||||
# Validate environment
|
||||
cd /opt/hvac-kia-content
|
||||
python3 -c "from run_production import validate_environment; validate_environment()"
|
||||
```
|
||||
|
||||
**Rollback Steps:**
|
||||
1. Stop the timer:
|
||||
```bash
|
||||
sudo systemctl stop hvac-content-aggregator.timer
|
||||
```
|
||||
|
||||
2. Revert to previous version:
|
||||
```bash
|
||||
cd /opt/hvac-kia-content
|
||||
git fetch --tags
|
||||
git checkout v1.0.0 # Previous stable version
|
||||
```
|
||||
|
||||
3. Reinstall dependencies:
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
4. Restart service:
|
||||
```bash
|
||||
sudo systemctl daemon-reload
|
||||
sudo systemctl start hvac-content-aggregator.timer
|
||||
```
|
||||
|
||||
### Scenario 2: Data Corruption
|
||||
**Symptoms:** Malformed output, duplicate entries, missing data
|
||||
|
||||
**Quick Fix:**
|
||||
```bash
|
||||
# Check state files
|
||||
ls -la /opt/hvac-kia-content/state/
|
||||
|
||||
# Validate JSON state files
|
||||
python3 -c "import json; json.load(open('/opt/hvac-kia-content/state/youtube_state.json'))"
|
||||
```
|
||||
|
||||
**Rollback Steps:**
|
||||
1. Stop all services:
|
||||
```bash
|
||||
sudo systemctl stop hvac-content-aggregator.timer
|
||||
sudo systemctl stop hvac-tiktok-captions.timer
|
||||
```
|
||||
|
||||
2. Restore state from backup:
|
||||
```bash
|
||||
# Find latest backup
|
||||
ls -lt /backup/hvac-state-*.tar.gz | head -1
|
||||
|
||||
# Restore state files
|
||||
cd /
|
||||
sudo tar -xzf /backup/hvac-state-20241217.tar.gz
|
||||
```
|
||||
|
||||
3. Clear corrupted output:
|
||||
```bash
|
||||
# Move corrupted files to quarantine
|
||||
mkdir -p /opt/hvac-kia-content/quarantine
|
||||
mv /opt/hvac-kia-content/data/*_corrupted.md /opt/hvac-kia-content/quarantine/
|
||||
```
|
||||
|
||||
4. Restart services:
|
||||
```bash
|
||||
sudo systemctl start hvac-content-aggregator.timer
|
||||
```
|
||||
|
||||
### Scenario 3: Performance Degradation
|
||||
**Symptoms:** Slow execution, timeouts, high CPU/memory usage
|
||||
|
||||
**Quick Fix:**
|
||||
```bash
|
||||
# Check resource usage
|
||||
top -p $(pgrep -f run_production.py)
|
||||
|
||||
# Check disk space
|
||||
df -h /opt/hvac-kia-content
|
||||
|
||||
# Clear old logs
|
||||
find /var/log/hvac-content -name "*.log" -mtime +7 -delete
|
||||
```
|
||||
|
||||
**Rollback Steps:**
|
||||
1. Reduce scraper limits temporarily:
|
||||
```bash
|
||||
# Edit production config
|
||||
nano /opt/hvac-kia-content/config/production.py
|
||||
# Reduce max_posts, max_videos, etc.
|
||||
```
|
||||
|
||||
2. Disable problematic scrapers:
|
||||
```python
|
||||
# In config/production.py
|
||||
SCRAPERS_CONFIG = {
|
||||
"instagram": {
|
||||
"enabled": False, # Temporarily disable
|
||||
...
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
3. Restart with reduced load:
|
||||
```bash
|
||||
sudo systemctl restart hvac-content-aggregator.service
|
||||
```
|
||||
|
||||
### Scenario 4: Complete System Failure
|
||||
**Symptoms:** Nothing works, multiple component failures
|
||||
|
||||
**Full System Rollback:**
|
||||
|
||||
1. **Stop Everything:**
|
||||
```bash
|
||||
# Stop all timers and services
|
||||
sudo systemctl stop hvac-content-aggregator.timer
|
||||
sudo systemctl stop hvac-tiktok-captions.timer
|
||||
sudo systemctl disable hvac-content-aggregator.timer
|
||||
sudo systemctl disable hvac-tiktok-captions.timer
|
||||
```
|
||||
|
||||
2. **Backup Current State:**
|
||||
```bash
|
||||
# Full backup before rollback
|
||||
sudo tar -czf /backup/full-backup-$(date +%Y%m%d-%H%M%S).tar.gz \
|
||||
/opt/hvac-kia-content/ \
|
||||
/etc/systemd/system/hvac-*.{service,timer} \
|
||||
/var/log/hvac-content/
|
||||
```
|
||||
|
||||
3. **Clean Installation:**
|
||||
```bash
|
||||
# Remove current installation
|
||||
sudo rm -rf /opt/hvac-kia-content
|
||||
sudo rm -f /etc/systemd/system/hvac-*
|
||||
|
||||
# Clone stable version
|
||||
cd /opt
|
||||
sudo git clone https://github.com/yourusername/hvac-kia-content.git
|
||||
cd hvac-kia-content
|
||||
sudo git checkout v1.0.0 # Last known stable
|
||||
|
||||
# Restore configuration
|
||||
sudo cp /backup/.env /opt/hvac-kia-content/
|
||||
|
||||
# Set permissions
|
||||
sudo chown -R $USER:$USER /opt/hvac-kia-content
|
||||
```
|
||||
|
||||
4. **Reinstall Services:**
|
||||
```bash
|
||||
cd /opt/hvac-kia-content
|
||||
./install_production.sh
|
||||
```
|
||||
|
||||
5. **Restore State (Optional):**
|
||||
```bash
|
||||
# Only if state is not corrupted
|
||||
sudo tar -xzf /backup/hvac-state-latest.tar.gz -C /
|
||||
```
|
||||
|
||||
6. **Verify and Start:**
|
||||
```bash
|
||||
# Test first
|
||||
python3 run_production.py --dry-run
|
||||
|
||||
# If successful, enable services
|
||||
sudo systemctl enable hvac-content-aggregator.timer
|
||||
sudo systemctl start hvac-content-aggregator.timer
|
||||
```
|
||||
|
||||
## Post-Rollback Verification
|
||||
|
||||
### Immediate Checks
|
||||
- [ ] Services are running:
|
||||
```bash
|
||||
systemctl status hvac-content-aggregator.timer
|
||||
```
|
||||
- [ ] No errors in logs:
|
||||
```bash
|
||||
tail -n 100 /var/log/hvac-content/aggregator.log | grep ERROR
|
||||
```
|
||||
- [ ] Test run successful:
|
||||
```bash
|
||||
cd /opt/hvac-kia-content
|
||||
python3 test_real_data.py --source youtube --items 1
|
||||
```
|
||||
|
||||
### 1-Hour Verification
|
||||
- [ ] Timer fired as scheduled
|
||||
- [ ] All scrapers executed
|
||||
- [ ] Output files generated
|
||||
- [ ] NAS sync completed
|
||||
- [ ] No memory leaks
|
||||
- [ ] CPU usage normal
|
||||
|
||||
### 24-Hour Verification
|
||||
- [ ] System stable
|
||||
- [ ] No missed schedules
|
||||
- [ ] Data quality good
|
||||
- [ ] No duplicate entries
|
||||
- [ ] Incremental updates working
|
||||
|
||||
## Emergency Contacts
|
||||
|
||||
### Technical Support
|
||||
- **Primary Contact:** [Name] - [Phone] - [Email]
|
||||
- **Secondary Contact:** [Name] - [Phone] - [Email]
|
||||
- **Escalation:** [Manager Name] - [Phone] - [Email]
|
||||
|
||||
### System Access
|
||||
- **Server:** production-scraper.example.com
|
||||
- **SSH:** `ssh user@production-scraper.example.com`
|
||||
- **Logs:** `/var/log/hvac-content/`
|
||||
- **Config:** `/opt/hvac-kia-content/.env`
|
||||
|
||||
## Recovery Time Objectives
|
||||
|
||||
| Scenario | Target Recovery Time | Maximum Data Loss |
|
||||
|----------|---------------------|-------------------|
|
||||
| Service Restart | 5 minutes | None |
|
||||
| Version Rollback | 15 minutes | Since last backup |
|
||||
| State Restoration | 30 minutes | 24 hours |
|
||||
| Complete Rebuild | 1 hour | 48 hours |
|
||||
|
||||
## Lessons Learned Log
|
||||
|
||||
### Previous Incidents
|
||||
Document any rollbacks performed and lessons learned:
|
||||
|
||||
| Date | Issue | Resolution | Prevention |
|
||||
|------|-------|------------|------------|
|
||||
| | | | |
|
||||
|
||||
## Backup Schedule
|
||||
|
||||
### Automated Backups
|
||||
```bash
|
||||
# Add to crontab
|
||||
0 2 * * * /opt/hvac-kia-content/scripts/backup.sh
|
||||
```
|
||||
|
||||
### Backup Script
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# /opt/hvac-kia-content/scripts/backup.sh
|
||||
|
||||
BACKUP_DIR="/backup/hvac-content"
|
||||
DATE=$(date +%Y%m%d)
|
||||
RETENTION_DAYS=30
|
||||
|
||||
# Create backup
|
||||
tar -czf "$BACKUP_DIR/state-$DATE.tar.gz" /opt/hvac-kia-content/state/
|
||||
tar -czf "$BACKUP_DIR/config-$DATE.tar.gz" /opt/hvac-kia-content/.env
|
||||
|
||||
# Clean old backups
|
||||
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +$RETENTION_DAYS -delete
|
||||
|
||||
# Verify backup
|
||||
tar -tzf "$BACKUP_DIR/state-$DATE.tar.gz" > /dev/null 2>&1
|
||||
if [ $? -eq 0 ]; then
|
||||
echo "Backup successful: $DATE"
|
||||
else
|
||||
echo "Backup failed: $DATE" | mail -s "HVAC Backup Failed" alerts@example.com
|
||||
fi
|
||||
```
|
||||
|
||||
## Testing Rollback Procedures
|
||||
|
||||
### Monthly Drill
|
||||
1. Schedule maintenance window
|
||||
2. Perform controlled rollback
|
||||
3. Verify recovery procedures
|
||||
4. Document any issues
|
||||
5. Update procedures as needed
|
||||
|
||||
### Test Checklist
|
||||
- [ ] Backup procedures work
|
||||
- [ ] Rollback completes in target time
|
||||
- [ ] Data integrity maintained
|
||||
- [ ] Services restart properly
|
||||
- [ ] Monitoring alerts fire
|
||||
- [ ] Documentation is current
|
||||
|
||||
---
|
||||
|
||||
*Last Updated: 2024-12-18*
|
||||
*Version: 1.0*
|
||||
*Next Review: 2025-01-18*
|
||||
278
test_production_deployment.py
Executable file
278
test_production_deployment.py
Executable file
|
|
@ -0,0 +1,278 @@
|
|||
#!/usr/bin/env python3
|
||||
"""
|
||||
Production Deployment Test Script
|
||||
Tests all components before going live
|
||||
"""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import json
|
||||
import time
|
||||
from pathlib import Path
|
||||
from datetime import datetime
|
||||
|
||||
# Add project to path
|
||||
sys.path.insert(0, str(Path(__file__).parent))
|
||||
|
||||
def test_environment():
|
||||
"""Test environment variables are set"""
|
||||
print("\n=== Testing Environment Variables ===")
|
||||
|
||||
required_vars = [
|
||||
'WORDPRESS_USERNAME',
|
||||
'WORDPRESS_API_KEY',
|
||||
'YOUTUBE_CHANNEL_URL',
|
||||
'INSTAGRAM_USERNAME',
|
||||
'INSTAGRAM_PASSWORD',
|
||||
'TIKTOK_TARGET',
|
||||
'NAS_PATH'
|
||||
]
|
||||
|
||||
missing = []
|
||||
for var in required_vars:
|
||||
value = os.getenv(var)
|
||||
if value:
|
||||
# Don't print sensitive values
|
||||
if 'PASSWORD' in var or 'KEY' in var:
|
||||
print(f"✓ {var}: ***SET***")
|
||||
else:
|
||||
print(f"✓ {var}: {value[:20]}..." if len(value) > 20 else f"✓ {var}: {value}")
|
||||
else:
|
||||
print(f"✗ {var}: MISSING")
|
||||
missing.append(var)
|
||||
|
||||
if missing:
|
||||
print(f"\n❌ Missing variables: {', '.join(missing)}")
|
||||
return False
|
||||
|
||||
print("\n✅ All environment variables set")
|
||||
return True
|
||||
|
||||
def test_directories():
|
||||
"""Test required directories exist and are writable"""
|
||||
print("\n=== Testing Directory Structure ===")
|
||||
|
||||
dirs_to_test = [
|
||||
Path("/opt/hvac-kia-content"),
|
||||
Path("/var/log/hvac-content"),
|
||||
Path(os.getenv('NAS_PATH', '/mnt/nas/hvacknowitall'))
|
||||
]
|
||||
|
||||
all_good = True
|
||||
for dir_path in dirs_to_test:
|
||||
if dir_path.exists():
|
||||
# Test write permissions
|
||||
test_file = dir_path / f"test_{datetime.now():%Y%m%d_%H%M%S}.txt"
|
||||
try:
|
||||
test_file.write_text("test")
|
||||
test_file.unlink()
|
||||
print(f"✓ {dir_path}: Exists and writable")
|
||||
except PermissionError:
|
||||
print(f"✗ {dir_path}: Exists but not writable")
|
||||
all_good = False
|
||||
else:
|
||||
print(f"✗ {dir_path}: Does not exist")
|
||||
all_good = False
|
||||
|
||||
if all_good:
|
||||
print("\n✅ All directories accessible")
|
||||
else:
|
||||
print("\n❌ Directory issues found")
|
||||
|
||||
return all_good
|
||||
|
||||
def test_config_validation():
|
||||
"""Test configuration validation"""
|
||||
print("\n=== Testing Configuration Validation ===")
|
||||
|
||||
try:
|
||||
from run_production import validate_config
|
||||
validate_config()
|
||||
print("✅ Configuration validation passed")
|
||||
return True
|
||||
except Exception as e:
|
||||
print(f"❌ Configuration validation failed: {e}")
|
||||
return False
|
||||
|
||||
def test_scrapers():
|
||||
"""Test each scraper can initialize"""
|
||||
print("\n=== Testing Scraper Initialization ===")
|
||||
|
||||
from src.base_scraper import ScraperConfig
|
||||
from pathlib import Path
|
||||
|
||||
test_config = ScraperConfig(
|
||||
source_name="test",
|
||||
brand_name="hvacknowitall",
|
||||
data_dir=Path("/tmp/test_data"),
|
||||
logs_dir=Path("/tmp/test_logs"),
|
||||
timezone="America/Halifax"
|
||||
)
|
||||
|
||||
scrapers_to_test = [
|
||||
("WordPress", "src.wordpress_scraper", "WordPressScraper"),
|
||||
("YouTube", "src.youtube_scraper", "YouTubeScraper"),
|
||||
("Instagram", "src.instagram_scraper", "InstagramScraper"),
|
||||
("TikTok", "src.tiktok_scraper_advanced", "TikTokScraperAdvanced"),
|
||||
("MailChimp", "src.rss_scraper", "RSSScraperMailChimp"),
|
||||
("Podcast", "src.rss_scraper", "RSSScraperPodcast")
|
||||
]
|
||||
|
||||
all_good = True
|
||||
for name, module_path, class_name in scrapers_to_test:
|
||||
try:
|
||||
module = __import__(module_path, fromlist=[class_name])
|
||||
scraper_class = getattr(module, class_name)
|
||||
scraper = scraper_class(test_config)
|
||||
print(f"✓ {name}: Initialized successfully")
|
||||
except Exception as e:
|
||||
print(f"✗ {name}: Failed to initialize - {e}")
|
||||
all_good = False
|
||||
|
||||
if all_good:
|
||||
print("\n✅ All scrapers initialized")
|
||||
else:
|
||||
print("\n⚠️ Some scrapers failed to initialize")
|
||||
|
||||
return all_good
|
||||
|
||||
def test_systemd_files():
|
||||
"""Test systemd service files exist"""
|
||||
print("\n=== Testing Systemd Files ===")
|
||||
|
||||
systemd_files = [
|
||||
Path("systemd/hvac-content-aggregator.service"),
|
||||
Path("systemd/hvac-content-aggregator.timer"),
|
||||
Path("systemd/hvac-content-aggregator@.service"),
|
||||
Path("systemd/hvac-tiktok-captions.service"),
|
||||
Path("systemd/hvac-tiktok-captions.timer")
|
||||
]
|
||||
|
||||
all_good = True
|
||||
for file_path in systemd_files:
|
||||
if file_path.exists():
|
||||
print(f"✓ {file_path}: Exists")
|
||||
else:
|
||||
print(f"✗ {file_path}: Missing")
|
||||
all_good = False
|
||||
|
||||
if all_good:
|
||||
print("\n✅ All systemd files present")
|
||||
else:
|
||||
print("\n❌ Some systemd files missing")
|
||||
|
||||
return all_good
|
||||
|
||||
def test_python_dependencies():
|
||||
"""Test all required Python packages are installed"""
|
||||
print("\n=== Testing Python Dependencies ===")
|
||||
|
||||
required_packages = [
|
||||
"requests",
|
||||
"pytz",
|
||||
"python-dotenv",
|
||||
"feedparser",
|
||||
"markitdown",
|
||||
"scrapling",
|
||||
"instaloader",
|
||||
"yt-dlp",
|
||||
"tenacity"
|
||||
]
|
||||
|
||||
all_good = True
|
||||
for package in required_packages:
|
||||
try:
|
||||
__import__(package.replace("-", "_"))
|
||||
print(f"✓ {package}: Installed")
|
||||
except ImportError:
|
||||
print(f"✗ {package}: Not installed")
|
||||
all_good = False
|
||||
|
||||
if all_good:
|
||||
print("\n✅ All dependencies installed")
|
||||
else:
|
||||
print("\n❌ Some dependencies missing")
|
||||
print("Run: pip install -r requirements.txt")
|
||||
|
||||
return all_good
|
||||
|
||||
def test_dry_run():
|
||||
"""Test a dry run of the production script"""
|
||||
print("\n=== Testing Dry Run ===")
|
||||
|
||||
try:
|
||||
# Import and test validation only
|
||||
from run_production import validate_environment, validate_config
|
||||
|
||||
validate_environment()
|
||||
print("✓ Environment validation passed")
|
||||
|
||||
validate_config()
|
||||
print("✓ Configuration validation passed")
|
||||
|
||||
print("\n✅ Dry run successful")
|
||||
return True
|
||||
|
||||
except Exception as e:
|
||||
print(f"\n❌ Dry run failed: {e}")
|
||||
return False
|
||||
|
||||
def main():
|
||||
"""Run all tests"""
|
||||
print("=" * 50)
|
||||
print("HVAC Know It All - Production Deployment Test")
|
||||
print("=" * 50)
|
||||
|
||||
# Load environment
|
||||
from dotenv import load_dotenv
|
||||
load_dotenv()
|
||||
|
||||
tests = [
|
||||
("Environment Variables", test_environment),
|
||||
("Python Dependencies", test_python_dependencies),
|
||||
("Configuration Validation", test_config_validation),
|
||||
("Scraper Initialization", test_scrapers),
|
||||
("Systemd Files", test_systemd_files),
|
||||
("Dry Run", test_dry_run)
|
||||
]
|
||||
|
||||
# Don't test directories in development
|
||||
if os.path.exists("/opt/hvac-kia-content"):
|
||||
tests.insert(2, ("Directory Structure", test_directories))
|
||||
|
||||
results = []
|
||||
for test_name, test_func in tests:
|
||||
try:
|
||||
result = test_func()
|
||||
results.append((test_name, result))
|
||||
except Exception as e:
|
||||
print(f"\n❌ {test_name} crashed: {e}")
|
||||
results.append((test_name, False))
|
||||
|
||||
# Summary
|
||||
print("\n" + "=" * 50)
|
||||
print("TEST SUMMARY")
|
||||
print("=" * 50)
|
||||
|
||||
passed = 0
|
||||
failed = 0
|
||||
|
||||
for test_name, result in results:
|
||||
status = "✅ PASS" if result else "❌ FAIL"
|
||||
print(f"{status}: {test_name}")
|
||||
if result:
|
||||
passed += 1
|
||||
else:
|
||||
failed += 1
|
||||
|
||||
print(f"\nTotal: {passed} passed, {failed} failed")
|
||||
|
||||
if failed == 0:
|
||||
print("\n🎉 READY FOR PRODUCTION DEPLOYMENT 🎉")
|
||||
return 0
|
||||
else:
|
||||
print(f"\n⚠️ Fix {failed} issue(s) before deployment")
|
||||
return 1
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
Loading…
Reference in a new issue