---
name: seo-auditor
description: |
author: synthoperator
---

# /seo-auditor

Systematically scan, audit, and optimize documentation files for SEO. Targets README.md files and docs/ pages — fixes issues in place, preserves rankings on high-performing pages, and generates a final report.

## Usage

```bash
/seo-auditor                    # Audit all docs/ and root README.md
/seo-auditor docs/skills/       # Audit a specific docs subdirectory
/seo-auditor --report-only      # Scan without making changes
```

## What It Does

Execute all 7 phases sequentially. Auto-fix non-destructive issues. Preserve existing high-ranking content. Report everything at the end.

---

## Phase 1: Discovery & Baseline

### 1a. Identify target files

Scan for documentation files that need SEO audit:

```bash
# Find all markdown files in docs/ and root README files
find docs/ -name '*.md' -type f | sort
find . -maxdepth 2 -name 'README.md' -not -path './.codex/*' -not -path './.gemini/*' | sort
```

Classify each file:
- **New/recently modified** — files changed in the last 2 commits (check via `git log`)
- **Index pages** — `index.md` files (high authority, handle with care)
- **Skill pages** — `docs/skills/**/*.md` (generated by `generate-docs.py`)
- **Static pages** — `docs/index.md`, `docs/getting-started.md`, `docs/integrations.md`, etc.
- **README files** — root and domain-level README.md
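
The classification above can be sketched as a small function. This is a minimal illustration, not the skill's actual implementation: the `classify` name, the label strings, and the `STATIC_PAGES` set are assumptions, and a file may carry more than one label.

```python
from pathlib import PurePosixPath

# Static pages named in the list above; extend for your own site (assumed set).
STATIC_PAGES = {"docs/index.md", "docs/getting-started.md", "docs/integrations.md"}


def classify(path, recently_modified=()):
    """Return every category label that applies to a repo-relative path."""
    name = PurePosixPath(path).name
    labels = []
    if path in recently_modified:
        labels.append("recent")
    if name == "index.md":
        labels.append("index")
    if path.startswith("docs/skills/"):
        labels.append("skill")
    if path in STATIC_PAGES:
        labels.append("static")
    if name == "README.md":
        labels.append("readme")
    return labels
```

One way to build `recently_modified` is from the output of `git diff --name-only HEAD~2 HEAD`, split into lines.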

### 1b. Capture baseline

For each target file, extract current SEO state:
- `title:` frontmatter field → becomes `<title>` tag
- `description:` frontmatter field → becomes `<meta name="description">`
- First `# H1` heading
- All `## H2` and `### H3` subheadings
- Word count
- Internal link count
- External link count

Store baseline in memory for the report.
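
A baseline record could be captured roughly as follows. This is a hedged sketch: `capture_baseline` and its dictionary keys are hypothetical names, the frontmatter parsing only handles single-line `key: value` pairs (a real YAML parser would be needed for block scalars), and links are counted as external only when they start with `http://` or `https://`.

```python
import re

FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)
# Markdown links, excluding images (which start with "!").
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)[^)]*\)")
HEADING_RE = re.compile(r"^(#{1,3}) +(.+?)\s*$", re.MULTILINE)


def capture_baseline(text):
    """Extract the SEO-relevant fields listed above from one markdown file."""
    fm, body = {}, text
    m = FRONTMATTER_RE.match(text)
    if m:
        body = text[m.end():]
        for line in m.group(1).splitlines():
            key, sep, value = line.partition(":")
            if sep and key.strip() in ("title", "description"):
                fm[key.strip()] = value.strip().strip("\"'")
    # Drop fenced code so '# comment' lines are not counted as headings.
    prose = re.sub(r"```.*?```", "", body, flags=re.DOTALL)
    headings = HEADING_RE.findall(prose)
    links = LINK_RE.findall(prose)
    external = [u for u in links if u.startswith(("http://", "https://"))]
    return {
        "title": fm.get("title"),
        "description": fm.get("description"),
        "h1": next((t for lvl, t in headings if lvl == "#"), None),
        "subheadings": [t for lvl, t in headings if lvl != "#"],
        "word_count": len(prose.split()),
        "internal_links": len(links) - len(external),
        "external_links": len(external),
    }
```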

---

## Phase 2: Meta Tag Audit

For every file with YAML frontmatter, check and fix:

### Title Tag (`title:`)

**Rules:**
- Must exist and be non-empty
- Length: 50-60 characters ideal (search engines truncate at ~60)
- Must contain a primary keyword
- Must NOT duplicate another page's title
- For skill pages: the rendered `<title>` should follow the pattern `{Skill Name} — {Differentiator} - {site_name}`
- `site_name` from `mkdocs.yml` is appended automatically, so the frontmatter `title:` should contain only `{Skill Name} — {Differentiator}`

**Auto-fix:** If title is generic (e.g., just the skill name), enrich it with domain context using the DOMAIN_SEO_SUFFIX pattern from `scripts/generate-docs.py`.

### Meta Description (`description:`)

**Rules:**
- Must exist and be non-empty
- Length: 120-160 characters (search engines truncate at ~160)
- Must contain the primary keyword naturally
- Must be unique across all pages — no two pages share the same description
- Should include a call-to-action or value proposition
- Must NOT start with "This page..." or "This document..."

**Auto-fix:** If description is missing or generic, generate one from the SKILL.md frontmatter description (if available) or from the first paragraph of content. Use the `extract_description_from_frontmatter()` function from `generate-docs.py` as reference.
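
The mechanical parts of the title and description rules in this phase (presence, length, keyword, uniqueness, generic openings) can be checked with a few lines of code. The sketch below is illustrative only: `check_meta` and its issue strings are hypothetical, and the call-to-action rule is left to human judgment.

```python
def check_meta(title, description, keyword, seen_titles=(), seen_descriptions=()):
    """Return a list of rule violations for one page's frontmatter."""
    issues = []
    title = (title or "").strip()
    description = (description or "").strip()
    kw = keyword.lower()

    if not title:
        issues.append("title: missing")
    else:
        if not 50 <= len(title) <= 60:
            issues.append(f"title: length {len(title)}, want 50-60")
        if kw not in title.lower():
            issues.append("title: primary keyword missing")
        if title in seen_titles:
            issues.append("title: duplicate")

    if not description:
        issues.append("description: missing")
    else:
        if not 120 <= len(description) <= 160:
            issues.append(f"description: length {len(description)}, want 120-160")
        if kw not in description.lower():
            issues.append("description: primary keyword missing")
        if description in seen_descriptions:
            issues.append("description: duplicate")
        if description.lower().startswith(("this page", "this document")):
            issues.append("description: generic opening")
    return issues
```

Passing the titles and descriptions already seen on other pages as `seen_titles` and `seen_descriptions` lets the same call cover the uniqueness rules.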

### Validation Script

Run on each file that has HTML output in `site/`:

```bash
python3 marketing-skill/seo-audit/scripts/seo_checker.py --file site/{path}/index.html
```

Parse the score. Flag any page scoring below 60.

---

## Phase 3: Content Quality & Readability

For each target file, analyze and improve:

### Heading Structure

**Rules:**
- Exactly one `# H1` per page
- H2s follow H1, H3s follow H2 — no skipping levels
- Headings should contain keywords naturally (not stuffed)
- No duplicate headings on the same page

**Auto-fix:** If heading levels skip (H1 → H3), adjust to proper hierarchy.
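
One possible way to repair skipped levels is to track the chain of open headings and renumber each heading by its depth in that chain. This is a sketch under assumptions: `fix_heading_levels` is a hypothetical helper, it only handles ATX (`#`) headings, and it would promote a leading H2 to H1 if the file has no H1, so review its output before committing.

```python
import re


def fix_heading_levels(markdown):
    """Renumber headings so levels never skip (H1 -> H3 becomes H1 -> H2)."""
    out, stack, in_fence = [], [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # ignore '#' lines inside code fences
        m = None if in_fence else re.match(r"^(#{1,6}) +(.*)$", line)
        if m:
            original = len(m.group(1))
            # Close headings at the same or a deeper original level.
            while stack and stack[-1] >= original:
                stack.pop()
            stack.append(original)
            line = "#" * len(stack) + " " + m.group(2)
        out.append(line)
    result = "\n".join(out)
    return result + "\n" if markdown.endswith("\n") else result
```

Using a stack rather than clamping to "previous level + 1" keeps sibling headings as siblings: two consecutive H3s under an H1 both become H2s.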

### Readability

Run the content scorer on each file:

```bash
python3 marketing-skill/content-production/scripts/content_scorer.py {file_path}
```

Check scores for:
- **Readability** — aim for score ≥ 70
- **Structure** — aim for score ≥ 60
- **Engagement** — aim for score ≥ 50

### Content Quality Rules

- **Paragraphs:** No single paragraph longer than 5 sentences
- **Sentences:** Average sentence length 15-20 words
- **Passive voice:** Less than 15% of sentences
- **Transition words:** At least 30% of sentences use transitions
- **Bullet lists:** Use lists for 3+ items instead of comma-separated inline lists
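
The paragraph and sentence-length rules above lend themselves to a quick numeric check. The following is a rough sketch, not what `content_scorer.py` does: `prose_metrics` is a hypothetical helper, and sentences are split naively on `.`, `!`, or `?` followed by whitespace, so abbreviations will be miscounted. Passive voice and transition words need real language tooling and are not covered.

```python
import re


def _sentences(chunk):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", chunk.strip()) if s]


def prose_metrics(text):
    """Average words per sentence and the longest paragraph, in sentences."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    all_sentences = [s for p in paragraphs for s in _sentences(p)]
    words = sum(len(s.split()) for s in all_sentences)
    return {
        "avg_sentence_words": words / len(all_sentences) if all_sentences else 0.0,
        "max_paragraph_sentences": max(
            (len(_sentences(p)) for p in paragraphs), default=0
        ),
    }
```

A page would violate the rules above if `max_paragraph_sentences` exceeds 5 or `avg_sentence_words` falls outside 15-20.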

### AI Content Detection

Run the humanizer scorer on non-generated content (README.md files, static pages):

```bash
python3 marketing-skill/content-humanizer/scripts/humanizer_scorer.py {file_path}
```

Flag pages scoring below 50 (too AI-sounding). For these pages, apply voice techniques from `marketing-skill/content-humanizer/references/voice-techniques.md`:
- Replace AI clichés ("delve into", "leverage", "it's important to note")
- Vary sentence length
- Add specific examples instead of generic statements
- Use active voice

**Important:** Only modify content that was recently created or updated. Do NOT rewrite pages that are ranking well — preserve their content.

---

## Phase 4: Keyword Optimization

### 4a. Identify target keywords per page

Based on the page's purpose and domain:

| Page Type | Primary Keywords | Secondary Keywords |
|-----------|-----------------|-------------------|
| Homepage (docs/index.md) | "Claude Code Skills", "agent plugins" | "Codex skills", "Gemini CLI", "OpenClaw" |
| Skill pages | Skill name + "Claude Code" | "agent skill", "Codex plugin", domain terms |
| Agent pages | Agent name + "AI coding agent" | "Claude Code", "orchestrator" |
| Command pages | Command name + "slash command" | "Claude Code", "AI coding" |
| Getting started | "install Claude Code skills" | platform names |
| Domain index | Domain + "skills" + "plugins" | "Claude Code", platform names |

### 4b. Keyword placement checks

For each page, verify the primary keyword appears in:
- [ ] Title tag (frontmatter `title:`)
- [ ] Meta description (frontmatter `description:`)
- [ ] H1 heading
- [ ] First paragraph (within first 100 words)
- [ ] At least one H2 subheading
- [ ] Image alt text (if images present)
- [ ] URL slug (for new pages only — never change existing URLs)

### 4c. Keyword density

- Primary keyword: 1-2% of total word count
- Secondary keywords: 0.5-1% each
- No keyword stuffing — if density exceeds 3%, reduce it
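
As an illustration of these thresholds, density can be computed as phrase occurrences divided by total word count. That formula is an assumption (some tools weight multi-word phrases differently), and `keyword_density` and `density_verdict` are hypothetical helpers rather than part of `seo_optimizer.py`.

```python
import re


def _tokens(s):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9']+", s.lower())


def keyword_density(text, keyword):
    """Occurrences of the keyword phrase as a percentage of total words."""
    words, kw = _tokens(text), _tokens(keyword)
    if not words or not kw:
        return 0.0
    n = len(kw)
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == kw)
    return 100.0 * hits / len(words)


def density_verdict(pct):
    """Map a primary-keyword density to the ranges listed above."""
    if pct > 3.0:
        return "stuffed"
    if pct > 2.0:
        return "high"
    if pct >= 1.0:
        return "ok"
    return "low"
```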

**Important:** Never change URLs of existing pages. URL changes break incoming links and destroy rankings. Only optimize content and meta tags.

---

## Phase 5: Link Audit

### 5a. Internal links

For each target file, check all markdown links `[text](url)`:

- Verify the target exists (file path resolves)
- Check for broken relative links (`../`, `./`)
- Verify anchor links (`#section-name`) point to existing headings

**Auto-fix:** Use the `rewrite_skill_internal_links()` and `rewrite_relative_links()` functions from `generate-docs.py` as reference. Rewrite broken skill-internal links to GitHub source URLs.
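
The three link checks above could look something like this. It is a sketch under stated assumptions: `broken_links` and `slugify` are hypothetical helpers, the slug rule (lowercase, strip punctuation, spaces to hyphens) only approximates what the site generator produces, and site-absolute links starting with `/` are skipped.

```python
import re
from pathlib import Path

LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)")
HEADING_RE = re.compile(r"^#{1,6} +(.+?)\s*$", re.MULTILINE)


def slugify(heading):
    """Approximate anchor slug for a heading (assumed rule, see lead-in)."""
    s = re.sub(r"[^\w\s-]", "", heading.strip().lower())
    return re.sub(r"\s+", "-", s)


def broken_links(md_path):
    """Return link targets in md_path whose file or anchor cannot be found."""
    md_path = Path(md_path)
    broken = []
    for url in LINK_RE.findall(md_path.read_text(encoding="utf-8")):
        # Skip http:, mailto:, etc. and site-absolute paths.
        if re.match(r"[a-z][a-z0-9+.-]*:", url, re.I) or url.startswith("/"):
            continue
        target, _, anchor = url.partition("#")
        dest = md_path.parent / target if target else md_path
        if not dest.exists():
            broken.append(url)
        elif anchor and dest.suffix == ".md":
            headings = HEADING_RE.findall(dest.read_text(encoding="utf-8"))
            if anchor not in {slugify(h) for h in headings}:
                broken.append(url)
    return broken
```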

### 5b. Duplicate content detection

Compare meta descriptions across all pages:

```bash
grep -rh --include='*.md' '^description:' docs/ | sort | uniq -d
```

If duplicates found, make each description unique by adding page-specific context.

Compare H1 headings across all pages — no two pages should have the same H1.
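
The H1 comparison has no shell one-liner above, so here is a minimal sketch of it. `duplicate_h1s` is a hypothetical helper that takes a mapping of path to file contents and compares headings case-insensitively; fenced code is stripped first so shell comments are not mistaken for headings.

```python
import re
from collections import defaultdict


def duplicate_h1s(pages):
    """Map each H1 (lowercased) shared by 2+ pages to the paths that use it."""
    by_h1 = defaultdict(list)
    for path, text in pages.items():
        prose = re.sub(r"```.*?```", "", text, flags=re.DOTALL)
        m = re.search(r"^# +(.+?)\s*$", prose, re.MULTILINE)
        if m:
            by_h1[m.group(1).lower()].append(path)
    return {h1: sorted(paths) for h1, paths in by_h1.items() if len(paths) > 1}
```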

### 5c. Orphan page detection

Check if every page in `docs/` is referenced in `mkdocs.yml` nav. Pages not in nav are orphans — they won't appear in navigation and may not be indexed.

```bash
# Find doc pages not in mkdocs nav
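find docs -name '*.md' -not -name 'index.md' | while IFS= read -r f; do
  slug=${f#docs/}   # path relative to docs/, as written in the mkdocs.yml nav
  # -F matches the slug literally (dots are not regex wildcards)
  grep -qF -- "$slug" mkdocs.yml || echo "ORPHAN: $f"
done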
```

**Auto-fix:** Add orphan pages to the correct nav section in `mkdocs.yml`.

---

## Phase 6: Sitemap & Build

### 6a. Rebuild the site

```bash
mkdocs build
```

This regenerates `site/sitemap.xml` automatically (MkDocs generates it during every build).

### 6b. Verify sitemap

Check the generated sitemap:

```bash
python3 marketing-skill/site-architecture/scripts/sitemap_analyzer.py site/sitemap.xml
```

Verify:
- All documentation pages appear in the sitemap
- No broken/404 URLs
- URL count matches expected page count
- Depth distribution is reasonable (no pages deeper than 4 levels)

### 6c. Check for sitemap issues

- **Missing pages:** Pages in `mkdocs.yml` nav that don't appear in sitemap
- **Extra pages:** Pages in sitemap that aren't in nav (orphans)
- **Duplicate URLs:** Same page accessible via multiple URLs
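
The URL-level checks from 6b and 6c (URL count, duplicates, depth) can be approximated with the standard library. This sketch is not the `sitemap_analyzer.py` implementation: `sitemap_summary` is a hypothetical helper, depth is counted as the number of non-empty path segments, and comparing against the nav in `mkdocs.yml` is left out.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_summary(xml_text, max_depth=4):
    """Summarize a sitemap: URL count, duplicate URLs, and overly deep URLs."""
    root = ET.fromstring(xml_text)
    urls = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]

    def depth(url):
        return len([seg for seg in urlparse(url).path.split("/") if seg])

    counts = Counter(urls)
    return {
        "total": len(urls),
        "duplicates": sorted(u for u, c in counts.items() if c > 1),
        "too_deep": sorted(u for u in counts if depth(u) > max_depth),
    }
```

Reading the file as bytes, for example `Path("site/sitemap.xml").read_bytes()`, lets the parser honor the encoding declared in the XML header.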

---

## Phase 7: Report

Generate a concise report for the user:

```
╔══════════════════════════════════════════════════════════════╗
║  SEO AUDITOR REPORT                                         ║
╠══════════════════════════════════════════════════════════════╣
║                                                              ║
║  Pages scanned:        {n}                                   ║
║  Issues found:         {n}                                   ║
║  Auto-fixed:           {n}                                   ║
║  Manual review needed: {n}                                   ║
║                                                              ║
║  META TAGS                                                   ║
║    Titles optimized:     {n}                                 ║
║    Descriptions fixed:   {n}                                 ║
║    Duplicate titles:     {n} → {n} (fixed)                   ║
║    Duplicate descs:      {n} → {n} (fixed)                   ║
║                                                              ║
║  CONTENT                                                     ║
║    Readability improved: {n} pages                           ║
║    Heading fixes:        {n}                                 ║
║    AI score improved:    {n} pages                           ║
║                                                              ║
║  KEYWORDS                                                    ║
║    Pages missing primary keyword in title: {n}               ║
║    Pages missing keyword in description:   {n}               ║
║    Pages with keyword stuffing:            {n}               ║
║                                                              ║
║  LINKS                                                       ║
║    Broken links found:   {n} → {n} (fixed)                   ║
║    Orphan pages:         {n} → {n} (added to nav)            ║
║    Duplicate content:    {n} → {n} (deduplicated)            ║
║                                                              ║
║  SITEMAP                                                     ║
║    Total URLs:           {n}                                 ║
║    Sitemap regenerated:  ✅                                  ║
║                                                              ║
║  PRESERVED (no changes — ranking well)                       ║
║    {list of pages left untouched}                            ║
║                                                              ║
╚══════════════════════════════════════════════════════════════╝
```

### Pages to preserve (do NOT modify)

These pages rank well for their target keywords. Only fix critical issues (broken links, missing meta tags). Do NOT rewrite content:

- `docs/index.md` — homepage, ranks for "Claude Code Skills"
- `docs/getting-started.md` — installation guide
- `docs/integrations.md` — multi-tool support
- Any page the user explicitly marks as "preserve"

---

## Skill References

| Tool | Path | Use |
|------|------|-----|
| SEO Checker | `marketing-skill/seo-audit/scripts/seo_checker.py` | Score HTML pages 0-100 |
| Content Scorer | `marketing-skill/content-production/scripts/content_scorer.py` | Score content readability/structure/engagement |
| Humanizer Scorer | `marketing-skill/content-humanizer/scripts/humanizer_scorer.py` | Detect AI-sounding content |
| Headline Scorer | `marketing-skill/copywriting/scripts/headline_scorer.py` | Score title quality |
| SEO Optimizer | `marketing-skill/content-production/scripts/seo_optimizer.py` | Optimize content for target keyword |
| Sitemap Analyzer | `marketing-skill/site-architecture/scripts/sitemap_analyzer.py` | Analyze sitemap structure |
| Schema Validator | `marketing-skill/schema-markup/scripts/schema_validator.py` | Validate structured data |
| Topic Cluster Mapper | `marketing-skill/content-strategy/scripts/topic_cluster_mapper.py` | Group pages into content clusters |

### Reference Docs

| Reference | Path | Use |
|-----------|------|-----|
| SEO Audit Framework | `marketing-skill/seo-audit/references/seo-audit-reference.md` | Priority order for SEO fixes |
| AI Search Optimization | `marketing-skill/ai-seo/references/content-patterns.md` | Make content citable by AI |
| Content Optimization | `marketing-skill/content-production/references/optimization-checklist.md` | Pre-publish checklist |
| URL Design Guide | `marketing-skill/site-architecture/references/url-design-guide.md` | URL structure best practices |
| Internal Linking | `marketing-skill/site-architecture/references/internal-linking-playbook.md` | Internal linking strategy |
| AI Writing Detection | `marketing-skill/content-humanizer/references/ai-tells-checklist.md` | AI cliché removal |
