Refactor SEO automation into unified CLI application
Major refactoring to create a clean, integrated CLI application:

### New Features:
- Unified CLI executable (./seo) with simple command structure
- All commands accept optional CSV file arguments
- Auto-detection of latest files when no arguments provided
- Simplified output directory structure (output/ instead of output/reports/)
- Cleaner export filename format (all_posts_YYYY-MM-DD.csv)

### Commands:
- export: Export all posts from WordPress sites
- analyze [csv]: Analyze posts with AI (optional CSV input)
- recategorize [csv]: Recategorize posts with AI
- seo_check: Check SEO quality
- categories: Manage categories across sites
- approve [files]: Review and approve recommendations
- full_pipeline: Run complete workflow
- analytics, gaps, opportunities, report, status

### Changes:
- Moved all scripts to scripts/ directory
- Created config.yaml for configuration
- Updated all scripts to use output/ directory
- Deprecated old seo-cli.py in favor of new ./seo
- Added AGENTS.md and CHANGELOG.md documentation
- Consolidated README.md with updated usage

### Technical:
- Added PyYAML dependency
- Removed hardcoded configuration values
- All scripts now properly integrated
- Better error handling and user feedback

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
330
guides/API_TROUBLESHOOTING.md
Normal file
@@ -0,0 +1,330 @@
# API Troubleshooting - 400 Bad Request Issues

## The Problem

WordPress REST API returned **400 Bad Request** errors on pagination:

```
✓ Fetched 100 posts (page 1)
✓ Fetched 100 posts (page 2)
✓ Fetched 100 posts (page 3)
✗ Error page 4: 400 Bad Request
```

This is a **server-side limitation**, not a bug in our code.

---

## Root Causes

### 1. **API Pagination Limits**

Some WordPress configurations limit how many pages can be fetched:
- Pages 1-3: OK (limit reached)
- Page 4+: 400 Bad Request

**Common causes:**
- Plugin restrictions (security, performance)
- Server configuration limits
- REST API throttling
- Custom WordPress filters

### 2. **_fields Parameter Issues**

The `_fields` parameter (used to fetch only specific fields of each post) might cause issues on:
- Specific API versions
- Custom REST API implementations
- Security plugins that filter fields
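
To make the difference concrete, here is a minimal sketch of the same request built with and without `_fields`. The helper name `build_posts_url` and the field list are illustrative assumptions, not taken from the analyzer scripts. Only the endpoint path and the `page`, `per_page`, `status` and `_fields` query parameters come from the WordPress REST API itself.

```python
from urllib.parse import urlencode


def build_posts_url(site, page, status="publish", per_page=100, fields=None):
    """Build a /wp/v2/posts URL (hypothetical helper).

    When `fields` is None the _fields parameter is left out entirely,
    so the server returns every field of each post.
    """
    params = {"page": page, "per_page": per_page, "status": status}
    if fields:
        params["_fields"] = ",".join(fields)
    # safe="," keeps the comma-separated field list readable in the URL
    return f"{site.rstrip('/')}/wp-json/wp/v2/posts?{urlencode(params, safe=',')}"


# Filtered request (smaller payload, but may be rejected on some setups)
print(build_posts_url("https://www.mistergeek.net", 1, fields=["id", "title", "link"]))
# Fallback request without _fields
print(build_posts_url("https://www.mistergeek.net", 1))
```

If the first URL returns 400 and the second returns 200 for the same page, the `_fields` filter is the likely culprit rather than a pagination limit.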

### 3. **Status Parameter Encoding**

Multi-status queries (`status=publish,draft`) can fail on pagination.

---

## The Solution

The script now:

1. **Gracefully handles 400 errors** - Treats pagination limit as end of data
2. **Retries without _fields** - Falls back to fetching all fields if needed
3. **Continues analysis** - Uses the posts it was able to fetch (doesn't fail)
4. **Logs what it got** - Shows exactly how many posts were fetched

```python
# Graceful error handling (this check runs inside the pagination loop)
if response.status_code == 400:
    logger.info(f"API limit reached (got {status_count} posts)")
    break  # Stop pagination, use what we have
```
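
For readers who want to see how the four behaviours listed above could fit together, here is a minimal, self-contained sketch. It is not the analyzer's actual code. The `get_page` callable and its `(status_code, posts)` return shape are assumptions made so the loop can run without a network connection, and `fake_site` merely simulates a server that stops answering after page 3.

```python
import logging
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger("seo")

# Assumed interface: get_page(page, use_fields) -> (status_code, posts)
GetPage = Callable[[int, bool], Tuple[int, List[Dict]]]


def fetch_all_posts(get_page: GetPage, per_page: int = 100, max_pages: int = 1000) -> List[Dict]:
    """Fetch pages until the data ends or the API answers 400."""
    posts: List[Dict] = []
    use_fields = True
    page = 1
    while page <= max_pages:
        status_code, batch = get_page(page, use_fields)
        if status_code == 400 and use_fields:
            # First 400: retry the same page once without _fields
            logger.info("Retrying without _fields parameter")
            use_fields = False
            continue
        if status_code == 400:
            # Still 400 without _fields: treat as end of data
            logger.info("API limit reached (fetched %d posts)", len(posts))
            break
        if status_code != 200:
            raise RuntimeError(f"Unexpected status {status_code} on page {page}")
        posts.extend(batch)
        logger.info("Fetched %d posts (page %d)", len(batch), page)
        if len(batch) < per_page:
            break  # Short page: nothing left to fetch
        page += 1
    logger.info("Total posts: %d", len(posts))
    return posts


def fake_site(page: int, use_fields: bool) -> Tuple[int, List[Dict]]:
    """Simulated server: 100 + 100 + 28 posts, then 400 on page 4+."""
    sizes = {1: 100, 2: 100, 3: 28}
    if page not in sizes:
        return 400, []
    return 200, [{"id": (page - 1) * 100 + i} for i in range(sizes[page])]


print(len(fetch_all_posts(fake_site)))  # prints 228
```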

---

## What Happens Now

### Before (Failed)
```
Fetching mistergeek.net...
✓ Fetched 100 posts (page 1)
✓ Fetched 100 posts (page 2)
✗ Error page 4: 400 Bad Request
ERROR: No posts found on any site
```

### After (Works)
```
Fetching mistergeek.net...
✓ Fetched 100 publish posts (page 1)
✓ Fetched 100 publish posts (page 2)
✓ Fetched 28 publish posts (page 3)
ⓘ API limit reached (fetched 228 posts)
✓ Total publish posts: 228
```

---

## How to Check If This Affects You

### If you see:
```
✓ Fetched 100 posts (page 1)
✓ Fetched 100 posts (page 2)
✓ Fetched 28 posts (page 3)
✓ Fetched 15 posts (page 4)
✓ Total posts: 243
```

**Good!** Your API supports full pagination. All posts are being fetched.

### If you see:
```
✓ Fetched 100 posts (page 1)
ⓘ API limit reached (fetched 100 posts)
✓ Total posts: 100
```

**Limited pagination.** API only allows page 1. Script continues with 100 posts.

### If you see:
```
✓ Fetched 100 posts (page 1)
✓ Fetched 100 posts (page 2)
ⓘ API limit reached (fetched 200 posts)
✓ Total posts: 200
```

**Partial pagination.** API allows pages 1-2. Script gets 200 posts.

---

## Impact on Analysis

### Scenario 1: All Posts Fetched (Full Pagination)

```
262 posts total
262 posts analyzed ✓
100% coverage
```

**Result:** Complete analysis, no issues.

### Scenario 2: Limited to First Page (100 posts)

```
262 posts total
100 posts analyzed
38% coverage
```

**Result:** Analysis of first 100 posts only. Missing 162 posts.

**Impact:**
- Report shows only first 100 posts
- Cannot analyze all content
- Must run analyzer multiple times or contact hosting provider

### Scenario 3: Limited to First 3 Pages (up to 300 posts)

```
262 posts total
228 posts analyzed ✓
87% coverage
```

**Result:** Analyzes most posts, misses the last 34.

---

## Solutions If Limited

### Solution 1: Contact Hosting Provider

**Ask for:**
> "Can you increase the WordPress REST API pagination limit? Currently limited to X posts per site."

Most providers can increase this in:
- WordPress settings
- PHP configuration
- Plugin settings

### Solution 2: Fetch in Batches

If API limits to 100 posts at a time:

```bash
# Run 1: Analyze first 100
python scripts/multi_site_seo_analyzer.py

# Save results
cp output/reports/seo_analysis_*.csv week1_batch1.csv

# Then manually get remaining posts another way
# (export from WordPress admin, use different tool, etc.)
```

### Solution 3: Check Security Plugins

Some plugins limit REST API access:
- Wordfence
- Sucuri
- iThemes Security
- Jetpack

Try:
1. Temporarily disable security plugins
2. Run analyzer
3. Re-enable plugins

If this works, configure plugin to allow REST API for your IP.

### Solution 4: Use WordPress Export Feature

If REST API is completely broken:

1. WordPress Admin → Tools → Export
2. Select posts to export
3. Download XML
4. Convert XML to CSV
5. Run analyzer on CSV (different mode)
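
Step 4 can be done with the Python standard library. The sketch below is one possible conversion, under stated assumptions. The namespace URIs are those of WXR 1.2 exports, and older exports may carry a different version suffix. The CSV column names (`id`, `title`, `link`, `status`, `content`) are illustrative guesses, since this guide does not specify which columns the analyzer expects. Both function names are hypothetical.

```python
import csv
import xml.etree.ElementTree as ET

# Namespaces as used by WXR 1.2 export files (assumption: adjust the
# version suffix if your export declares a different one)
NS = {
    "wp": "http://wordpress.org/export/1.2/",
    "content": "http://purl.org/rss/1.0/modules/content/",
}
COLUMNS = ["id", "title", "link", "status", "content"]  # illustrative column set


def wxr_to_rows(root):
    """Collect one dict per <item> whose wp:post_type is 'post'."""
    rows = []
    for item in root.iter("item"):
        if item.findtext("wp:post_type", default="", namespaces=NS) != "post":
            continue  # skip pages, attachments, menu items, ...
        rows.append({
            "id": item.findtext("wp:post_id", default="", namespaces=NS),
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "status": item.findtext("wp:status", default="", namespaces=NS),
            "content": item.findtext("content:encoded", default="", namespaces=NS),
        })
    return rows


def write_csv(rows, path):
    """Write the rows to a UTF-8 CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


# Usage (file names are placeholders):
# rows = wxr_to_rows(ET.parse("export.xml").getroot())
# write_csv(rows, "posts_from_export.csv")
```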

---

## When to Worry

### No Worries If:
- API fetches 150+ posts (most content covered)
- Error message says "API limit reached" (graceful)
- Analysis completes successfully
- CSV has all/most posts

### Worth Investigating If:
- Only fetching <50 posts
- API returning other errors (401, 403, 500)
- All 3 sites have same issue
- Posts are missing from analysis

---

## Checking Your Hosting

### How to check API pagination limit:

**In browser/terminal:**
```bash
# Replace with your site. Quote the URL, otherwise the shell treats "&" as
# "run in background" and drops the remaining parameters.
# -i prints the HTTP status line and headers before the body.
curl -i "https://www.mistergeek.net/wp-json/wp/v2/posts?per_page=100&status=publish"

# Try different pages
curl -i "https://www.mistergeek.net/wp-json/wp/v2/posts?page=1&per_page=100&status=publish"
curl -i "https://www.mistergeek.net/wp-json/wp/v2/posts?page=2&per_page=100&status=publish"
curl -i "https://www.mistergeek.net/wp-json/wp/v2/posts?page=3&per_page=100&status=publish"
```

**If you get:**
- 200 OK: Page works
- 400 Bad Request: Pagination limited
- 401 Unauthorized: Auth needed
- 403 Forbidden: Access denied
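
The same probe can be scripted. The sketch below uses only the standard library, and its function names are hypothetical. It also reads the `X-WP-TotalPages` response header, which WordPress sends on successful collection requests. WordPress core itself answers 400 when the requested page is past the last one, so comparing the page number against that header helps distinguish "no more data" from a hosting-imposed limit.

```python
import urllib.error
import urllib.request

# Meanings copied from the list above
MEANINGS = {
    200: "Page works",
    400: "Pagination limited",
    401: "Auth needed",
    403: "Access denied",
}


def describe_status(code):
    """Translate an HTTP status code into the wording used in this guide."""
    return MEANINGS.get(code, "Unexpected response")


def past_last_page(page, total_pages):
    """True when an earlier response reported fewer pages than `page`."""
    return total_pages is not None and page > int(total_pages)


def probe(site, pages=range(1, 5), per_page=100):
    """Request each page and return (page, status, meaning, past_last_page) tuples."""
    total_pages = None
    results = []
    for page in pages:
        url = (f"{site.rstrip('/')}/wp-json/wp/v2/posts"
               f"?page={page}&per_page={per_page}&status=publish")
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                code = resp.status
                # Remember the page count reported by the last successful response
                total_pages = resp.headers.get("X-WP-TotalPages", total_pages)
        except urllib.error.HTTPError as err:
            code = err.code
        results.append((page, code, describe_status(code), past_last_page(page, total_pages)))
    return results


# Example (performs real HTTP requests, so it is left commented out):
# for row in probe("https://www.mistergeek.net"):
#     print(row)
```

A 400 with `past_last_page` equal to `True` simply means every page has already been fetched. A 400 with `False` points to one of the restrictions described in this guide.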

### Common Limits by Hosting:

| Host | Typical Limit | Notes |
|------|---------------|-------|
| Shared hosting | 1-2 pages | Often limited for performance |
| WP Engine | Unlimited | Usually good |
| Kinsta | Unlimited | Usually good |
| Bluehost | Often limited | Contact support |
| GoDaddy | Limited | May need plugin adjustment |

---

## Advanced: Manual Pagination

If API pagination is broken, you can manually specify which posts to analyze:

```bash
# Fetch from Google Sheets instead of API
# Or use WordPress XML export
# Or manually create CSV of posts you want to analyze
```

(Contact us if you need help with this)

---

## Logs Explained

### New Log Messages:

```
✓ Fetched 100 publish posts (page 1)
→ Successful fetch of 100 posts

ⓘ API limit reached (fetched 228 posts)
→ API doesn't allow page 4+, got 228 total

ⓘ Retrying without _fields parameter
→ Trying again without field filtering

✓ Total publish posts: 228
→ Final count for this status
```

---

## Summary

| Issue | Impact | Solution |
|-------|--------|----------|
| Can't fetch page 2+ | Limited analysis | Contact host, check plugins |
| 400 Bad Request | Graceful handling | Script continues with what it got |
| All 3 sites fail | API-wide issue | Check WordPress REST API |
| Missing top 50 posts | Incomplete analysis | Use WordPress export as backup |

---

## Next Steps

1. **Run analyzer** and note pagination limits for each site
2. **Check logs** - see how many posts were fetched
3. **If limited:**
   - Note the numbers (e.g., "Only fetched 100 of 262")
   - Contact your hosting provider
   - Ask about REST API pagination limits
4. **Re-run when fixed** (hosting provider increases limit)
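
For step 3, the numbers to report can be derived from the `X-WP-Total` header that WordPress returns with post collection responses. The following is a small sketch with hypothetical helper names. The header value is passed in as a plain dict here, so the lookup is case-sensitive, unlike the header objects of real HTTP libraries.

```python
def coverage(fetched, total):
    """Percentage of posts fetched, rounded to a whole percent."""
    if total <= 0:
        return 0
    return round(100 * fetched / total)


def coverage_note(fetched, headers):
    """Build a one-line summary from the fetched count and response headers."""
    total = int(headers.get("X-WP-Total", 0))
    return f"Fetched {fetched} of {total} ({coverage(fetched, total)}% coverage)"


print(coverage_note(100, {"X-WP-Total": "262"}))  # Fetched 100 of 262 (38% coverage)
```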

---

## Still Having Issues?

Check:
1. ✓ WordPress credentials correct
2. ✓ REST API enabled on all 3 sites
3. ✓ User has read permissions
4. ✓ No IP blocking (firewall/security)
5. ✓ No SSL certificate issues
6. ✓ Sites are online and responding

See: `guides/SEO_ANALYZER_GUIDE.md` → Troubleshooting section