Most operators don't know whether they're being cited or invisible in the layer that's quietly replacing traditional search. ChatGPT processes 1B+ queries weekly. AI Overviews appear on roughly 55% of Google searches. AI-referred sessions grew 527% YoY through mid-2025. Gartner projects traditional search traffic falling 25% by end of 2026. The visibility shift is structural and accelerating — and most brands have no methodology for diagnosing where they stand.

This is the AEO audit methodology we run on every Praxxii client engagement: 5 audit zones, 24 specific checks, a tooling stack matched to engagement stage, and a prioritization matrix that ranks findings by leverage. Run it on your brand in 90 minutes. You'll know whether you're invisible, cited-but-misrepresented, or competitively positioned in this layer.

This is the practical companion to Day 1's AEO overview and Day 2's GEO playbook. Those pieces gave the frameworks. This gives the operational checklist.

The 5 audit zones

Every AEO operation has five interlocking layers. Weakness in one caps the others. The audit checks each against current 2026 benchmarks and ranks findings by what's binding the rest.

Zone 1 — Citation Tracking and Visibility Measurement (do you know where you stand)

Zone 2 — Entity Hygiene and Knowledge Graph (does the web know who you are)

Zone 3 — Content Structure and AI-Readability (can the engines parse your pages)

Zone 4 — Authority and E-E-A-T Signals (do the engines trust you)

Zone 5 — Technical Crawlability for AI Bots (can the engines reach your content)

The right order is Zone 1 first — without measurement, every Zone 2-5 finding is theoretical. Most operators skip Zone 1 because the tooling is new. That's the first mistake.

Zone 1: Citation Tracking and Visibility Measurement (5 checks)

1.1 Manual citation audit across the four engines that matter most — ChatGPT, Gemini, Perplexity, and Claude — for your top 20 commercial queries. Document who gets cited, where you appear, and where competitors do. Red flag: never run, or older than 90 days.
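
To keep the manual audit repeatable quarter over quarter, fix the grid before you start. A minimal sketch in Python that generates the scoring template; the queries shown are placeholders for your own top 20 commercial queries:

```python
import csv

# Placeholder queries; replace with your own top 20 commercial queries.
queries = [
    "best project management software for agencies",
    "acme analytics alternatives",
]
engines = ["ChatGPT", "Gemini", "Perplexity", "Claude"]

with open("citation_audit.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["query", "engine", "brand_cited", "position",
                     "competitors_cited", "notes"])
    # One row per query/engine pair, filled in by hand during the audit.
    for q in queries:
        for e in engines:
            writer.writerow([q, e, "", "", "", ""])
```

The fixed grid matters more than the tooling: if the queries and engines change between audits, the 90-day comparison is meaningless.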

1.2 Automated citation tracker deployed at appropriate tier: free tools (HubSpot AEO Grader, manual prompts) for diagnostic baseline, mid-tier ($29-99/mo: Otterly.AI, AthenaHQ) for ongoing tracking, or enterprise (Profound, Scrunch, Peec.ai at $250-399/mo+) for scale. Red flag: no automated tracking, citation share of voice unknown.

1.3 Coverage across all six platforms that matter for your category: ChatGPT, Google AI Overviews, Perplexity, Gemini, Claude, Copilot. Vertical-specific engines (Grok for X/news, DeepSeek for technical) if relevant. Red flag: tracking only one engine.

1.4 Methodology matched to use case — UI scraping for accurate real-world data, API access for scale, both for triangulation. Red flag: relying on API-only data and assuming it matches what real users see.

1.5 Source-of-truth dashboard reconciles AI visibility data with GA4 (AI referral traffic) and GSC (branded query growth). Red flag: AI visibility data lives in a separate tool with no integration into the broader analytics stack.
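
Reconciliation starts with tagging AI-referred sessions in your GA4 exports. A sketch using a hostname filter; the referrer domains listed are the common ones as of this writing and should be verified against your own session-source data, since these strings change over time:

```python
import re

# Illustrative AI referrer hostnames; confirm against your own GA4 data.
AI_REFERRER_PATTERN = re.compile(
    r"(chatgpt\.com|chat\.openai\.com|perplexity\.ai|gemini\.google\.com"
    r"|copilot\.microsoft\.com|claude\.ai)",
    re.IGNORECASE,
)

def is_ai_referred(session_source: str) -> bool:
    """Classify a GA4 session source string as AI-referred or not."""
    return bool(AI_REFERRER_PATTERN.search(session_source or ""))

print(is_ai_referred("chatgpt.com / referral"))  # True
print(is_ai_referred("google / organic"))        # False
```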

Scoring: 5/5 trustworthy measurement · 3-4 directional only · 0-2 binding constraint, every Zone 2-5 finding is theoretical.

Zone 2: Entity Hygiene and Knowledge Graph (5 checks)

2.1 Wikipedia article exists and is accurate (where the brand meets Wikipedia's notability threshold). Red flag: no article for a brand large enough to qualify, or factual inaccuracies in an existing one.

2.2 Wikidata entry is canonical: correct industry classification, valid identifiers (Crunchbase, LinkedIn, GitHub), accurate descriptions across languages where relevant. Red flag: missing entry, outdated descriptions, wrong company size or industry code.

2.3 Google Knowledge Panel claimed and verified, with accurate descriptions, founding date, executives, social profiles. Red flag: unclaimed panel, outdated info, missing executive entries.

2.4 Cross-database reconciliation across Crunchbase, LinkedIn Company Page, G2/Capterra/TrustRadius (for B2B), Google Business Profile (for local), and industry-specific directories. All entries consistent in name, founding year, employee count, location, and leadership. Red flag: facts vary across databases — AI engines use this triangulation to score authority.
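
The reconciliation itself is mechanical once the facts are pulled. A sketch of the consistency check; the snapshot below is hypothetical, and the field names are illustrative rather than any real API's schema:

```python
# Hypothetical entity facts pulled by hand from each database.
records = {
    "crunchbase": {"name": "Acme Inc.", "founded": 2015, "employees": "51-100"},
    "linkedin":   {"name": "Acme Inc.", "founded": 2015, "employees": "51-200"},
    "wikidata":   {"name": "Acme",      "founded": 2016, "employees": "51-100"},
}

# Flag any field where the databases disagree: the same triangulation
# an AI engine can run when scoring entity consistency.
fields = {k for rec in records.values() for k in rec}
for field in sorted(fields):
    values = {src: rec.get(field) for src, rec in records.items()}
    if len(set(values.values())) > 1:
        print(f"INCONSISTENT {field}: {values}")
```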

2.5 Schema markup deployed: Organization, FAQPage, BreadcrumbList, Article (with author), Product (where applicable), Speakable, HowTo. Validated through the Schema.org validator. Red flag: missing Organization schema, no FAQPage on Q&A content, no Article schema with author on editorial content.
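
For reference, a minimal JSON-LD sketch covering the two highest-leverage types here, Organization and FAQPage, with placeholder values throughout:

```html
<!-- Placeholder values throughout; swap in your real entity data. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Inc.",
  "url": "https://www.example.com",
  "logo": "https://www.example.com/logo.png",
  "foundingDate": "2015",
  "sameAs": [
    "https://www.linkedin.com/company/example",
    "https://www.crunchbase.com/organization/example"
  ]
}
</script>

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What does Acme Inc. do?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Acme Inc. builds workflow automation for mid-market agencies. (Direct answer, 40-100 words.)"
    }
  }]
}
</script>
```

The sameAs array is where schema and Zone 2's cross-database work meet: it tells engines which external profiles are the same entity.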

Scoring: 4-5/5 strong entity foundation · 2-3 recoverable with focused work · 0-1 you appear inconsistent to AI engines and they will hesitate to cite you.

Zone 3: Content Structure and AI-Readability (5 checks)

3.1 Question-format headings on commercial content — H2s and H3s that match the actual queries users ask. Red flag: clever headlines that don't match user search intent; AI engines extract from question-format structure.

3.2 Direct-answer paragraphs in the first 100 words of every commercial page — the AI-extractable answer comes before the prose context. Red flag: long intros before the answer; AI engines often extract only the first 40-100 words.
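
Structurally, checks 3.1 and 3.2 reduce to a simple page skeleton: question-format heading, then the extractable answer before any context. An illustrative sketch, with placeholder copy and numbers:

```html
<!-- Illustrative skeleton; headings and figures are placeholders. -->
<h2>How much does agency project management software cost?</h2>
<p>
  Agency project management tools typically run $10-30 per seat per month,
  with enterprise tiers adding SSO and audit logs.
  <!-- The extractable answer lives in the first ~40-100 words. -->
</p>
<p>
  Context, caveats, and methodology follow the answer rather than
  preceding it, so an engine that truncates early still gets the claim.
</p>
```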

3.3 Comparative content for "X vs Y" and "alternatives to Z" queries with tables, structured comparisons, and explicit recommendations. Red flag: no comparative content despite high commercial intent for these queries.

3.4 Citation-ready statistics, data points, and proprietary research — numbers AI engines can extract and attribute. Red flag: no proprietary data on the brand site; all data sourced from third parties.

3.5 Content freshness — last-updated dates on every commercial page, with actual content review (not just metadata refresh) at minimum quarterly. Red flag: pages aged 18+ months with no update; AI engines actively prefer recent sources.

Scoring: 4-5/5 AI-readable · 2-3 partially extractable · 0-1 invisible to citation pipelines regardless of authority.

Zone 4: Authority and E-E-A-T Signals (5 checks)

4.1 Author bylines on every editorial page with named expertise, credentials, and professional URLs (LinkedIn, professional bio). Red flag: anonymous content or "Admin" bylines on content claiming expertise.

4.2 Outbound citations to authoritative sources in commercial content. AI engines reward content that itself cites authority. Red flag: claims made without sourced support, especially on regulated topics (financial, medical, legal).

4.3 Earned coverage in sources AI engines themselves cite — major industry publications, Wikipedia mentions, reputable news outlets. Red flag: no earned media in the past 12 months; AI engines triangulate authority across the web's citation graph.

4.4 Topical authority depth — at least 10-15 content assets covering a single subject from multiple angles, not single isolated pieces. Red flag: thin coverage with one article per topic; AI engines favor sites with demonstrated cluster authority.

4.5 Reddit, Quora, and community presence on the queries your category drives — authentic answers from authenticated brand accounts or recognized contributors. Red flag: zero community presence; AI engines disproportionately cite Reddit threads for recommendation queries.

Scoring: 4-5/5 strong authority signal · 2-3 building · 0-1 invisible to engines that triangulate authority across multiple sources.

Zone 5: Technical Crawlability for AI Bots (4 checks)

5.1 GPTBot, ClaudeBot, PerplexityBot, GoogleOther accessibility verified — these user agents can reach your content. Red flag: blocked in robots.txt or by CDN/firewall rules.
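
A minimal robots.txt that explicitly allows the major AI crawlers. Note that robots.txt access is necessary but not sufficient: CDN and WAF rules can still block these bots at the network layer, so verify against live server logs as well.

```
# robots.txt: explicitly allow the major AI crawlers.
# These are the documented user-agent tokens; confirm against each
# vendor's current crawler documentation before shipping.
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: GoogleOther
Allow: /
```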

5.2 llms.txt deployed at root with structured site map, key page descriptions, and content priorities. Red flag: no llms.txt, or one that just duplicates sitemap.xml without semantic context.
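
A minimal llms.txt sketch following the llmstxt.org convention (H1 name, blockquote summary, sectioned link lists with one-line descriptions); every name, fact, and URL below is a placeholder:

```
# Acme Inc.

> Acme builds workflow automation for mid-market agencies. Key facts:
> founded 2015, 80 employees, headquartered in Austin, TX.

## Products
- [Platform overview](https://www.example.com/platform): What the product does and who it serves
- [Pricing](https://www.example.com/pricing): Plans, tiers, and typical cost per seat

## Comparisons
- [Acme vs. Competitor](https://www.example.com/vs-competitor): Structured feature and pricing comparison
```

The one-line descriptions are the point: they give an LLM semantic context that sitemap.xml, a bare list of URLs, cannot.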

5.3 Server-side rendering for content that needs to be cited — JavaScript-rendered content has substantially lower AI citation rates than server-rendered. Red flag: critical commercial content client-rendered, requiring JS execution to read.
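
A quick way to test this is to fetch the raw HTML the way a non-JS crawler would and check for a sentence that must be citable. A sketch in Python, with a hypothetical URL and key phrase:

```python
import requests

# Hypothetical page and phrase; pick a sentence that must be citable.
URL = "https://www.example.com/pricing"
KEY_PHRASE = "per seat per month"

# Fetch raw HTML without executing JavaScript, as most AI bots do.
html = requests.get(URL, timeout=10,
                    headers={"User-Agent": "aeo-audit-check"}).text

if KEY_PHRASE in html:
    print("OK: phrase present in server-rendered HTML")
else:
    print("RISK: phrase missing from raw HTML; likely client-rendered")
```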

5.4 Core Web Vitals in the green, with LCP under 2.5s — fast pages get crawled more frequently and consistently. Red flag: LCP above 4s; AI engines deprioritize slow sources.

Scoring: 3-4/4 crawlable · 2 recoverable · 0-1 you are technically invisible to citation pipelines.

The prioritization matrix

After scoring all 5 zones, the rule: always start with the lowest-scoring zone, weighted toward earlier-numbered zones. The zones compound. Broken Zone 1 (measurement) means every Zone 2-5 finding is theoretical. Broken Zone 5 (crawlability) means perfect entity hygiene and content won't translate into citations.

The decision tree:

Zone 1 < 3/5 → tooling and measurement is P0. Without it, you're optimizing blind. 2-3 weeks.

Zone 5 < 2/4 → technical crawlability rebuild. Citations are technically impossible until this is fixed. 1-2 weeks.

Zone 2 < 3/5 → entity hygiene sprint. Fast, bounded, high-leverage. 3-4 weeks.

Zone 3 or 4 < 3/5 → content/authority work. The longest cycle (8-16 weeks) but the highest compounding once shipped.
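
The same tree expressed as a minimal scoring helper, useful for keeping triage honest across brands; a sketch, with thresholds exactly as above:

```python
def next_priority(scores: dict) -> str:
    """Apply the decision tree to zone scores,
    e.g. {"Z1": 4, "Z2": 2, "Z3": 3, "Z4": 1, "Z5": 3}.
    Earlier zones are checked first, encoding the weighting rule."""
    if scores["Z1"] < 3:
        return "P0: tooling and measurement (2-3 weeks)"
    if scores["Z5"] < 2:
        return "P0: technical crawlability rebuild (1-2 weeks)"
    if scores["Z2"] < 3:
        return "Entity hygiene sprint (3-4 weeks)"
    if scores["Z3"] < 3 or scores["Z4"] < 3:
        return "Content/authority work (8-16 weeks)"
    return "No zone below threshold: optimize the lowest-scoring zone"

print(next_priority({"Z1": 4, "Z2": 2, "Z3": 3, "Z4": 1, "Z5": 3}))
# -> Entity hygiene sprint (3-4 weeks)
```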

Most operators start with Zone 3 (content) because that's where the visible work is. But Zone 3 fixes typically take two quarters to show up in citations, and those gains don't compound if Zones 1, 2, and 5 are broken. Zone 1 (measurement) and Zone 2 (entity hygiene) typically produce the fastest visible lifts because they unlock everything else.

The tooling stack by engagement stage

The right tools depend on what you're trying to do and how mature your AEO operation is.

Diagnostic stage (weeks 1-4) — free or near-free tools to establish a baseline. HubSpot AEO Grader for a free AI-readiness assessment, Ahrefs Brand Radar free tier for AI crawler monitoring, manual ChatGPT/Perplexity/Gemini/Claude testing on your top 20 queries, Sona's free 17-check audit for crawlability validation. Output: baseline citation share, top gaps, technical issues.

Active optimization (months 2-6) — mid-tier paid tools for ongoing tracking and recommendations. Otterly.AI ($29/mo Lite) for 6-platform tracking and sentiment, AthenaHQ ($99/mo) for actionable task-based recommendations, AI Rank Lab for citation-level forensics. Output: daily citation monitoring, schema audit, GEO recommendations.

Scaled operations (month 6+) — enterprise platforms. Profound ($399+/mo Growth) for 10+ platform coverage with revenue attribution, Scrunch AI ($250+/mo) for daily prompt testing across 4+ LLMs, Conductor for SEO+AEO integration. Output: board-level reporting, share-of-voice benchmarking, pipeline attribution.

Universal complement — a content optimization tool (Frase, Surfer, MarketMuse) running alongside whichever tracker you choose. Tracking without optimization guidance has limited value.

What to do this weekend

Run each of the 24 checks against your brand. Score each zone. Identify the two lowest-scoring zones. Cross-reference against the prioritization matrix. Build the rebuild plan.

If your audit produces:

Zone 1 below 3: deploy free diagnostic tools this week, pick mid-tier tracker by end of month.

Zone 2 below 3: entity hygiene sprint — Wikidata, Knowledge Panel, cross-database reconciliation. Bounded 3-4 week project.

Zone 3 below 3: content restructure — question-format headings, direct-answer paragraphs, citation-ready data. The corresponding playbook is Day 2's GEO playbook.

Zone 4 below 3: authority build — earned coverage, author bylines, topical cluster expansion. Longer cycle, 6-12 months to compound.

Zone 5 below 2: technical crawlability fix — robots.txt audit, llms.txt deployment, server-side rendering. 1-2 week sprint.

If three or more zones score below 3, you're not looking at tactical optimization — you're looking at a structural AEO rebuild. The right move is a 90-day diagnostic-and-rebuild engagement.

If you'd rather have us run this audit on your brand — same framework, ranked rebuild plan, the tooling stack matched to your stage — that's part of the discovery-edge work we do at Praxxii Global. The audit produces a 5-zone scorecard, the prioritization matrix specific to your brand and category, and a 90-day rebuild plan you can run with your existing team or with us.

The window to operationalize AEO while it's still a competitive advantage is closing. By 2027, the brands cited consistently by AI engines will have entrenched their positions; the brands that aren't will be defending against incumbents who already own the citations, the entity graph, and the authority signals. Most accounts haven't even started measuring yet. Run the audit. The binding constraint is usually not where you've been looking.