# SEO Analysis & Improvement System - Project Guide

## 📋 Overview

A complete 4-phase SEO analysis pipeline that:

1. **Integrates** Google Analytics, Search Console, and WordPress data
2. **Identifies** high-potential keywords for optimization (positions 11-30)
3. **Discovers** new content opportunities using AI
4. **Generates** a comprehensive report with 90-day action plan

## 📂 Project Structure

```
seo/
├── input/                          # SOURCE DATA (your exports)
│   ├── new-propositions.csv        # WordPress posts
│   ├── README.md                   # How to export data
│   └── analytics/
│       ├── ga4_export.csv          # Google Analytics
│       └── gsc/
│           ├── Pages.csv           # GSC pages (required)
│           ├── Requêtes.csv        # GSC queries (optional)
│           └── ...
│
├── output/                         # RESULTS (auto-generated)
│   ├── results/
│   │   ├── seo_optimization_report.md   # PRIMARY OUTPUT
│   │   ├── posts_with_analytics.csv
│   │   ├── posts_prioritized.csv
│   │   ├── keyword_opportunities.csv
│   │   └── content_gaps.csv
│   │
│   ├── logs/
│   │   ├── import_log.txt
│   │   ├── opportunity_analysis_log.txt
│   │   └── content_gap_analysis_log.txt
│   │
│   └── README.md                   # Output guide
│
├── 🚀 run_analysis.sh              # Run entire pipeline
├── analytics_importer.py           # Phase 1: Merge data
├── opportunity_analyzer.py         # Phase 2: Find wins
├── content_gap_analyzer.py         # Phase 3: Find gaps
├── report_generator.py             # Phase 4: Generate report
├── config.py
├── requirements.txt
├── .env.example
└── .gitignore
```

## 🚀 Getting Started

### Step 1: Prepare Input Data

**Place WordPress posts CSV:**

```
input/new-propositions.csv
```

**Export Google Analytics 4:**

1. Go to: Analytics > Reports > Engagement > Pages and Screens
2. Set date range: Last 90 days
3. Download CSV → Save as: `input/analytics/ga4_export.csv`

**Export Google Search Console (Pages):**

1. Go to: Performance
2. Set date range: Last 90 days
3. Export CSV → Save as: `input/analytics/gsc/Pages.csv`

### Step 2: Run Analysis

```bash
# Run entire pipeline
./run_analysis.sh

# OR run steps individually
./venv/bin/python analytics_importer.py
./venv/bin/python opportunity_analyzer.py
./venv/bin/python content_gap_analyzer.py
./venv/bin/python report_generator.py
```

### Step 3: Review Report

Open: **`output/results/seo_optimization_report.md`**

Contains:

- Executive summary with current metrics
- Top 20 posts ranked by opportunity (with AI recommendations)
- Keyword opportunities breakdown
- Content gap analysis
- 90-day phased action plan

## 📊 What Each Script Does

### `analytics_importer.py` (Phase 1)

**Purpose:** Merge analytics data with WordPress posts

**Input:**

- `input/new-propositions.csv` (WordPress posts)
- `input/analytics/ga4_export.csv` (Google Analytics)
- `input/analytics/gsc/Pages.csv` (Search Console)

**Output:**

- `output/results/posts_with_analytics.csv` (enriched dataset)
- `output/logs/import_log.txt` (matching report)

**Handles:** French and English column names, URL normalization, multi-source merging

### `opportunity_analyzer.py` (Phase 2)

**Purpose:** Identify high-potential optimization opportunities

**Input:**

- `output/results/posts_with_analytics.csv`

**Output:**

- `output/results/keyword_opportunities.csv` (26 opportunities)
- `output/logs/opportunity_analysis_log.txt`

**Features:**

- Filters posts at positions 11-30 (page 2-3)
- Calculates opportunity scores (0-100)
- Generates AI recommendations for top 20 posts

### `content_gap_analyzer.py` (Phase 3)

**Purpose:** Discover new content opportunities

**Input:**

- `output/results/posts_with_analytics.csv`
- `input/analytics/gsc/Requêtes.csv` (optional)

**Output:**

- `output/results/content_gaps.csv`
- `output/logs/content_gap_analysis_log.txt`

**Features:**

- Topic cluster extraction
- Gap identification
- AI-powered content suggestions

### `report_generator.py` (Phase 4)

**Purpose:** Create comprehensive report with action plan

**Input:**

- All analysis results from phases 1-3

**Output:**

- `output/results/seo_optimization_report.md` ← **PRIMARY DELIVERABLE**
- `output/results/posts_prioritized.csv`

**Features:**

- Comprehensive markdown report
- All 262 posts ranked
- 90-day action plan with estimated gains

## 📈 Understanding Your Report

### Key Metrics (Executive Summary)

- **Total Posts:** All posts analyzed
- **Monthly Traffic:** Current organic traffic
- **Total Impressions:** Search visibility (90 days)
- **Average Position:** Current ranking position
- **Opportunities:** Posts ready to optimize

### Top 20 Posts to Optimize

Each post shows:

- **Title** (the post name)
- **Current Position** (search ranking)
- **Impressions** (search visibility)
- **Traffic** (organic visits)
- **Priority Score** (0-100 opportunity rating)
- **Status** (page 1 vs page 2-3)
- **Recommendations** (how to improve)

### Priority Scoring (0-100)

Higher scores = more opportunity for gain with less effort

Calculated from:

- **Position (35%)** - How close to page 1
- **Traffic Potential (30%)** - Search impressions
- **CTR Gap (20%)** - Improvement opportunity
- **Content Quality (15%)** - Existing engagement

## 🎯 Action Plan

### Week 1-2: Quick Wins (+100 visits/month)

- Focus on posts at positions 11-15
- Update SEO titles and meta descriptions
- 30-60 minutes per post

### Week 3-4: Core Optimization (+150 visits/month)

- Posts 6-15 in priority list
- Add content sections
- Improve structure with headers
- 2-3 hours per post

### Week 5-8: New Content (+300 visits/month)

- Create 3-5 new posts from gap analysis
- Target high-search-demand topics
- 4-6 hours per post

### Week 9-12: Refinement (+100 visits/month)

- Monitor ranking improvements
- Refine underperforming optimizations
- Prepare next round of analysis

**Total: +650 visits/month potential gain**

## 🔧 Configuration

Edit `.env` to customize analysis:

```bash
# Position range for opportunities
ANALYSIS_MIN_POSITION=11
ANALYSIS_MAX_POSITION=30

# Minimum impressions to consider
ANALYSIS_MIN_IMPRESSIONS=50

# Posts for AI recommendations
ANALYSIS_TOP_N_POSTS=20
```

## 🐛 Troubleshooting

### Missing Input Files

```
❌ Error: File not found: input/...
```

→ Check that all files are in the correct locations

### Empty Report Titles

✓ FIXED - Now correctly loads post titles from multiple column names

### No Opportunities Found

```
⚠️ No opportunities found in specified range
```

→ Try lowering `ANALYSIS_MIN_IMPRESSIONS` in `.env`

### API Errors

```
❌ AI generation failed: ...
```

→ Check `OPENROUTER_API_KEY` in `.env` and account balance

## 📚 Additional Resources

- **`input/README.md`** - How to export analytics data
- **`output/README.md`** - Output files guide
- **`QUICKSTART_ANALYSIS.md`** - Step-by-step tutorial
- **`ANALYSIS_SYSTEM.md`** - Technical documentation

## ✅ Success Checklist

- [ ] All input files placed in `input/` directory
- [ ] `.env` file configured with API key
- [ ] Ran `./run_analysis.sh` successfully
- [ ] Reviewed `output/results/seo_optimization_report.md`
- [ ] Identified 5-10 quick wins to start with
- [ ] Created action plan for first week

## 🎓 Key Learnings

### Why Positions 11-30 Matter

- **Page 1** posts are hard to move
- **Page 2-3** posts are easy wins (small improvements move them up)
- **Quick gains:** 1-2 position improvements = CTR increases 20-30%

### CTR Expectations by Position

- Position 1: ~30% CTR
- Position 5-10: 4-7% CTR
- Position 11-15: 1-2% CTR (quick wins)
- Position 16-20: 0.8-1% CTR
- Position 21-30: ~0.5% CTR

### Content Quality Signals

- Higher bounce rate = less relevant content
- Low traffic = poor CTR or position
- Low impressions = insufficient optimization

## 📞 Support

### Check Logs First

```
output/logs/import_log.txt
output/logs/opportunity_analysis_log.txt
output/logs/content_gap_analysis_log.txt
```

### Common Issues

1. **Empty titles** → Fixed with flexible column name mapping
2. **File not found** → Check file locations match structure
3. **API errors** → Verify API key and account balance
4. **No opportunities** → Lower minimum impressions threshold

## 🚀 Ready to Optimize?

1. Prepare your input data
2. Run `./run_analysis.sh`
3. Open the report
4. Start with quick wins
5. Track improvements in 4 weeks

Good luck boosting your SEO! 📈

---

**Last Updated:** February 2026

**System Status:** Production Ready
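
## Appendix: Priority Score Sketch

The Priority Scoring section above gives the four weights (35/30/20/15) but does not say how each component is normalized. The block below is a minimal sketch of how such a weighted score could be combined. It is not the code in `opportunity_analyzer.py`. The names `priority_score`, `position_component`, and `clamp01` are hypothetical, and the linear falloff across positions 11-30 is an assumption chosen to match the default `ANALYSIS_MIN_POSITION` and `ANALYSIS_MAX_POSITION` values.

```python
# Weights copied from the "Priority Scoring (0-100)" section of this guide.
WEIGHTS = {
    "position": 0.35,
    "traffic_potential": 0.30,
    "ctr_gap": 0.20,
    "content_quality": 0.15,
}


def clamp01(value: float) -> float:
    """Clamp a value into the range [0.0, 1.0]."""
    return max(0.0, min(1.0, value))


def position_component(position: float, min_pos: int = 11, max_pos: int = 30) -> float:
    """ASSUMPTION: linear falloff, 1.0 at min_pos (closest to page 1), 0.0 at max_pos."""
    return clamp01((max_pos - position) / (max_pos - min_pos))


def priority_score(components: dict) -> float:
    """Combine four components, each expected in [0, 1], into a 0-100 score.

    How the real pipeline normalizes traffic potential, CTR gap, and
    content quality is not documented here, so callers supply them.
    """
    total = sum(WEIGHTS[name] * clamp01(components[name]) for name in WEIGHTS)
    return round(100 * total, 1)


if __name__ == "__main__":
    # Illustrative inputs only, not real pipeline data.
    example = {
        "position": position_component(12),
        "traffic_potential": 0.5,
        "ctr_gap": 0.5,
        "content_quality": 0.5,
    }
    print(priority_score(example))
```

Because every component is clamped before weighting, the result always stays within 0-100, and a post at position 11 earns the full 35 points from the position term under this assumed normalization.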