# Storage & Draft Posts - Complete Guide

## Storage Architecture

### How Data is Stored

The Multi-Site SEO Analyzer **does NOT use a local database**. Instead, it:

1. **Fetches on-demand** from the WordPress REST API
2. **Analyzes in-memory** using Python
3. **Exports to CSV files** for long-term storage and review

```
┌─────────────────────────────┐
│     3 WordPress Sites       │
│     (via REST API)          │
└──────────┬──────────────────┘
           │
           ├─→ Fetch posts (published + optional drafts)
           │
┌──────────▼──────────────────┐
│     Python Analysis         │
│     (in-memory processing)  │
└──────────┬──────────────────┘
           │
           ├─→ Analyze titles
           │
           ├─→ Analyze meta descriptions
           │
           ├─→ Score (0-100)
           │
           ├─→ AI recommendations (optional)
           │
┌──────────▼──────────────────┐
│     CSV File Export         │
│     (persistent storage)    │
└─────────────────────────────┘
```

### Why CSV Instead of a Database?

**Advantages:**

- ✓ No database setup or maintenance
- ✓ Easy to import to Excel/Google Sheets
- ✓ Human-readable format
- ✓ Shareable with non-technical team members
- ✓ Version control friendly (Git-trackable)
- ✓ No dependencies on database software

**Disadvantages:**

- ✗ Each run is independent (no running total)
- ✗ No real-time updates
- ✗ Runs must be compared manually

**When to use a database instead:**

- If you regularly analyze more than 10,000 posts
- If you need real-time dashboards
- If you want automatic tracking over time

---

## CSV Output Structure

### File Location

```
output/reports/seo_analysis_TIMESTAMP.csv
```

### Columns

| Column | Description | Example |
|--------|-------------|---------|
| `site` | WordPress site | mistergeek.net |
| `post_id` | WordPress post ID | 2845 |
| `status` | Post status | publish / draft |
| `title` | Post title | "Best VPN Services 2025" |
| `slug` | URL slug | best-vpn-services-2025 |
| `url` | Full URL | https://mistergeek.net/best-vpn-2025/ |
| `meta_description` | Meta description text | "Compare 50+ VPN..." |
| `title_score` | Title SEO score (0-100) | 92 |
| `title_issues` | Problems with title | "None" |
| `title_recommendations` | How to improve | "None" |
| `meta_score` | Meta description score (0-100) | 88 |
| `meta_issues` | Problems with meta | "None" |
| `meta_recommendations` | How to improve | "None" |
| `overall_score` | Combined score | 90 |
| `ai_recommendations` | Claude-generated tips | "Consider adding..." |

### Importing to Google Sheets

1. Download CSV from `output/reports/`
2. Open Google Sheets
3. File → Import → Upload CSV
4. Add columns for tracking:
   - [ ] Status (Not Started / In Progress / Done)
   - [ ] Notes
   - [ ] Date Completed
5. Share with team
6. Filter and sort as needed

---

## Draft Posts Feature

### What Are Drafts?

Draft posts are unpublished WordPress posts. They're:

- Written but not published
- Not visible on the website
- Still stored in the WordPress database
- Perfect for analyzing before publishing

### Using Draft Posts

**By default**, the analyzer fetches **only published posts**:

```bash
python scripts/multi_site_seo_analyzer.py
```

**To include draft posts**, use the `--include-drafts` flag:

```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```

### Output with Drafts

The CSV includes a `status` column showing which posts are published and which are drafts:

```csv
site,post_id,status,title,meta_score,overall_score
mistergeek.net,2845,publish,"Best VPN",88,90
mistergeek.net,2901,draft,"New VPN Draft",45,55
webscroll.fr,1234,publish,"Torrent Guide",72,75
webscroll.fr,1235,draft,"Draft Tracker Review",20,30
```

### Use Cases for Drafts

**1. Optimize Before Publishing**

If you have draft posts ready to publish:

```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```

Review their SEO scores and improve titles/meta before publishing.

**2. Recover Previous Content**

If you unpublished posts and kept them as drafts:

```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```

Analyze them to decide: republish, improve, or delete.

**3. Audit Unpublished Work**

See what's sitting in drafts that could be published:

```bash
python scripts/multi_site_seo_analyzer.py --include-drafts | grep "draft"
```

---

## Complete Examples

### Example 1: Analyze Published Only

```bash
python scripts/multi_site_seo_analyzer.py
```

**Output:**

- Analyzes: ~262 published posts
- Time: 2-3 minutes
- Drafts: Not included

### Example 2: Analyze Published + Drafts

```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```

**Output:**

- Analyzes: ~262 published + X drafts
- Time: 2-5 minutes (depending on draft count)
- Shows status column: "publish" or "draft"

### Example 3: Analyze Published + Drafts + AI

```bash
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20
```

**Output:**

- Analyzes: All posts (published + drafts)
- AI recommendations: Top 20 worst-scoring posts
- Cost: ~$0.20
- Time: 10-15 minutes

### Example 4: Focus on Drafts Only

With `--include-drafts`, the script always returns both published posts and drafts, so filter for drafts in Excel/Sheets:

1. Run: `python scripts/multi_site_seo_analyzer.py --include-drafts`
2. Open CSV in Google Sheets
3. Filter `status` column = "draft"
4. Sort by `overall_score` (lowest first)
5. Optimize the 10 lowest-scoring drafts before publishing

---

## Comparing Results Over Time

### Manual Comparison

Since results are exported to CSV, you can track progress manually:

```bash
# Week 1
python scripts/multi_site_seo_analyzer.py --no-ai
# Save: seo_analysis_week1.csv

# (Optimize posts for 4 weeks)

# Week 5
python scripts/multi_site_seo_analyzer.py --no-ai
# Save: seo_analysis_week5.csv

# Compare in Excel/Sheets:
# Sort both by site, then post_id (post IDs are only unique within a site)
# Compare scores: Week 1 vs Week 5
```

### Calculating Improvement

Example:

| Post | Week 1 Score | Week 5 Score | Change |
|------|--------------|--------------|--------|
| Best VPN | 45 | 92 | +47 |
| Top 10 Software | 38 | 78 | +40 |
| Streaming Guide | 52 | 65 | +13 |
| **Average** | **45** | **78** | **+33** |

---

## Organizing Your CSV Files

### Naming Convention

Keep dated copies of each report for historical analysis:

```
output/
├── reports/
│   ├── 2025-02-16_initial_analysis.csv
│   ├── 2025-03-16_after_optimization.csv
│   ├── 2025-04-16_follow_up.csv
│   └── seo_analysis_20250216_120000.csv (latest)
```

### Archive Strategy

1. Run analyzer monthly
2. Save result with date: `seo_analysis_2025-02-16.csv`
3. Keep 12 months of history
4. Compare trends over time

---

## Advanced: Storing Recommendations

### Using a Master Spreadsheet

Instead of relying on CSV alone, create a master Google Sheet:

**Columns:**

- Post ID
- Title
- Current Score
- Issues
- Improvements Needed
- Status (Not Started / In Progress / Done)
- Completed Date
- New Score

**Process:**

1. Run analyzer: `python scripts/multi_site_seo_analyzer.py`
2. Copy relevant rows to master spreadsheet
3. As you optimize: update "Status" and "New Score"
4. Track progress visually

---

## Performance Considerations

### Fetch Time

- **Published only:** ~10-30 seconds (262 posts)
- **Published + drafts:** ~10-30 seconds (+X seconds per 100 drafts)

Drafts don't significantly impact speed, since published posts and drafts are fetched in the same API requests.

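
To illustrate why drafts add so little fetch time, here is a minimal sketch of how one paginated request can cover both statuses. It assumes the standard WordPress REST API posts route (`/wp-json/wp/v2/posts`) with its `status`, `per_page`, and `page` parameters; the function names `build_posts_url` and `page_count` are hypothetical and are not taken from the analyzer's code. Note that WordPress only returns drafts to authenticated requests.

```python
from urllib.parse import urlencode


def build_posts_url(site, include_drafts=False, page=1, per_page=100):
    """Build the URL for one page of posts (hypothetical helper).

    Assumes the standard WordPress REST API route and parameters.
    """
    statuses = ["publish", "draft"] if include_drafts else ["publish"]
    query = urlencode(
        {
            "status": ",".join(statuses),  # both statuses in one request
            "per_page": per_page,          # WordPress allows at most 100
            "page": page,
        },
        safe=",",  # keep the comma readable instead of %2C
    )
    return f"https://{site}/wp-json/wp/v2/posts?{query}"


def page_count(total_posts, per_page=100):
    """Number of paginated requests needed (ceiling division, minimum 1)."""
    return max(1, -(-total_posts // per_page))


print(build_posts_url("mistergeek.net", include_drafts=True))
print(page_count(262))        # published only
print(page_count(262 + 100))  # published + 100 drafts
```

Under these assumptions, drafts cost extra time only when they push the total past a page boundary: 262 posts need 3 requests, and adding 100 drafts needs 4.
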
### Analysis Time

- **Without AI:** ~1-2 minutes
- **With AI (10 posts):** ~5-10 minutes
- **With AI (50 posts):** ~20-30 minutes

AI recommendations account for most of the run time (not the fetching).

### Memory Usage

- **262 posts:** ~20-30 MB
- **262 posts + 100 drafts:** ~35-50 MB

No memory issues for typical WordPress sites.

---

## Troubleshooting

### "No drafts found"

**Problem:** You're using `--include-drafts` but get the same result as without it.

**Solutions:**

1. Verify you have draft posts on the site
2. Check that the user has permission to view drafts (needs the `edit_posts` capability)
3. Try logging in and checking WordPress directly

### CSV Encoding Issues

**Problem:** CSV opens with weird characters in Excel.

**Solution:** Open with UTF-8 encoding:

- Excel: Data → From Text/CSV → select the CSV → set File Origin to "65001: Unicode (UTF-8)" → Load
- Sheets: Upload CSV, let Google handle encoding

### Want to Use a Database Later?

If you outgrow CSV files, consider:

**SQLite** (built into Python, no installation):

```python
import sqlite3

conn = sqlite3.connect('seo_analysis.db')
# Insert results into database
```

**PostgreSQL** (professional option):

```python
# Requires: pip install psycopg2-binary
import psycopg2

conn = psycopg2.connect("dbname=seo_db user=postgres")
# Insert results
```

But for now, CSV is perfect for your needs.

---

## Summary

### Storage

| Aspect | Implementation |
|--------|-----------------|
| Database? | No - CSV files |
| Location | `output/reports/` |
| Format | CSV (Excel/Sheets compatible) |
| Persistence | Permanent (until deleted) |

### Draft Posts

| Aspect | Usage |
|--------|-------|
| Default | Published only |
| Include drafts | `--include-drafts` flag |
| Output column | `status` (publish/draft) |
| Use case | Optimize before publishing, recover removed content |

### Commands

```bash
# Published only
python scripts/multi_site_seo_analyzer.py

# Published + Drafts
python scripts/multi_site_seo_analyzer.py --include-drafts

# Published + Drafts + AI
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20

# Skip AI (faster)
python scripts/multi_site_seo_analyzer.py --no-ai
```

---

## Next Steps

1. **First run (published only):**
   ```bash
   python scripts/multi_site_seo_analyzer.py --no-ai
   ```

2. **Analyze results:**
   ```bash
   # "open" is the macOS command; on Linux use xdg-open
   open output/reports/seo_analysis_*.csv
   ```

3. **Optimize published posts** with score < 50

4. **Second run (include drafts):**
   ```bash
   python scripts/multi_site_seo_analyzer.py --include-drafts
   ```

5. **Decide on drafts:** Publish, improve, or delete

6. **Track progress:** Re-run monthly and compare scores

Ready? Start with: `python scripts/multi_site_seo_analyzer.py --no-ai`
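
For the monthly comparison in step 6, the manual sort-and-compare in a spreadsheet could also be scripted. The following is a minimal sketch, assuming both reports contain the `site`, `post_id`, and `overall_score` columns described above; the helper names `load_scores` and `score_changes` are hypothetical and not part of the analyzer.

```python
import csv


def load_scores(path):
    """Map (site, post_id) -> overall_score for one exported report.

    Post IDs are only unique within a site, so the key includes both.
    "utf-8-sig" also tolerates a byte-order mark left by a spreadsheet re-save.
    """
    with open(path, newline="", encoding="utf-8-sig") as f:
        return {
            (row["site"], row["post_id"]): float(row["overall_score"])
            for row in csv.DictReader(f)
        }


def score_changes(old_path, new_path):
    """Return {(site, post_id): (old, new, delta)} for posts in both reports."""
    old = load_scores(old_path)
    new = load_scores(new_path)
    return {
        key: (old[key], new[key], new[key] - old[key])
        for key in old.keys() & new.keys()  # skip posts missing from either run
    }


def average_change(changes):
    """Mean score delta across all compared posts (0.0 if none match)."""
    if not changes:
        return 0.0
    return sum(delta for _, _, delta in changes.values()) / len(changes)
```

For example, `score_changes("seo_analysis_week1.csv", "seo_analysis_week5.csv")` would return one entry per post found in both files, and `average_change` on that result gives the figure shown in the "Average" row of the improvement table.
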