AEO audits answered one question: are you cited inside AI search engines like ChatGPT, Perplexity, Claude, Gemini, and AI Overviews? That question is necessary but no longer sufficient. The 2026 reality, established in Day 37's framework piece, is that AI-mediated discovery happens across five distinct surfaces: AI search engines, productivity assistants, shopping agents, AI procurement systems, and SaaS copilots. A brand cited inside ChatGPT but invisible to Microsoft Copilot, Perplexity's shopping agent, enterprise procurement AI, and platform-native SaaS copilots is winning one battle while losing four others.

The AIO audit extends the general AEO Audit Methodology to cover all five surfaces. Same 5-zone diagnostic framework.

Expanded check set — 32 total checks across the zones, with surface-specific additions for each of the four surfaces AEO didn't cover. Most accounts can run the full audit in a long working session. This piece is the operational companion to Day 37: Day 37 defined the discipline; Day 38 walks through how to diagnose your posture against it.

Why expand the audit beyond AEO

The 2026 numbers anchoring this audit:

AI search engines (ChatGPT, Perplexity, Claude, Gemini, AI Overviews) collectively process over 4 billion queries weekly as of mid-2026. This is the surface AEO audits already cover.

Productivity AI assistants (Microsoft Copilot, Google Gemini Workspace, ChatGPT on macOS, Notion AI) collectively serve over 850 million weekly active users across enterprise + consumer contexts. When users ask category questions inside these assistants, brand recommendations get surfaced.

AI shopping agents (Perplexity shopping, OpenAI agentic browsing, Anthropic Claude with computer use) executed over 38 million autonomous shopping workflows in Q2 2026. The agent's reasoning chain determines which brands enter consideration.

Enterprise AI procurement systems (DocuSign IRIS, SAP procurement AI, Coupa's AI evaluator, plus emerging vertical-specific procurement tools) are now running pre-evaluation on 21% of mid-market+ vendor selections.

Platform-internal SaaS copilots (HubSpot Breeze, Salesforce Einstein, Shopify Magic, Notion AI for integrations) collectively recommend integration partners to over 12 million business users monthly.

A brand visible only to AI search engines is reaching maybe 35-50% of the AI-mediated discovery surface area. The other 50-65% — productivity AI, agents, procurement, SaaS copilots — is going to competitors who optimized for it or to no one at all. The brands that capture the full surface area first will own positions through 2028 that incumbents who never audited beyond AEO can't dislodge.

The 5-zone AIO audit framework

Same zone structure as the AEO audit. Different checks. Surface-specific additions inside each zone.

Zone 1 — Multi-Surface Visibility Measurement (6 checks)
Zone 2 — Entity Hygiene + Cross-Surface Consistency (7 checks)
Zone 3 — Content Structure for Multi-Surface AI Consumption (7 checks)
Zone 4 — Authority Signals Across the Five Surfaces (6 checks)
Zone 5 — Technical Crawlability + Agent Accessibility (6 checks)

32 total checks. Each scoreable on a 0-1 binary or 0-2 partial scale. Total framework runs about 8-10 hours per audit including documentation.

Zone 1 — Multi-Surface Visibility Measurement

The expansion from AEO to AIO is most visible in Zone 1. AEO audits measure citation share in AI search engines. AIO audits measure recommendation share across all five surfaces — which means deploying tracking infrastructure for each surface separately.

1.1 AI search citation tracking deployed. ChatGPT, Perplexity, Claude, Gemini, AI Overviews tracked weekly across documented query universe. Baseline measurement established for citation share. Red flag: AI search tracking not deployed; only manual spot-checks happening.

1.2 Productivity AI assistant recommendation tracking deployed. Documented test prompts run weekly inside Microsoft Copilot (Word/Excel/Outlook/Teams contexts), Google Gemini Workspace, ChatGPT macOS, and Notion AI. Recommendation share measured for category queries inside each productivity surface. Red flag: productivity AI surfaces completely unmonitored — most accounts have zero visibility here.

1.3 Shopping agent recommendation tracking deployed (for B2C / D2C brands). Perplexity shopping agent test workflows, OpenAI agentic browsing test workflows, Anthropic Claude computer-use test workflows. Documented brand inclusion or exclusion in autonomously-executed shopping shortlists. Red flag: agent visibility unmeasured despite being shopping-category-relevant.

1.4 AI procurement evaluation tracking deployed (for B2B brands). Documented test scenarios in DocuSign IRIS, SAP procurement AI, Coupa, and emerging vertical procurement tools. Inclusion or exclusion in AI-generated vendor shortlists tracked. Red flag: procurement AI visibility unmeasured despite being B2B-procurement-relevant.

1.5 SaaS copilot recommendation tracking deployed (for SaaS partnership-relevant brands). HubSpot Breeze integration recommendations, Salesforce Einstein integration suggestions, Shopify Magic app recommendations, Notion AI for integrations. Documented inclusion/exclusion. Red flag: SaaS copilot recommendations untracked despite being relevant to partner/integration economics.

1.6 Multi-surface recommendation share reconciled to acquisition data. Total AI-mediated touch points across all 5 surfaces, source-attributed where possible (UTM + interview-the-prospect), correlated to MQO/customer/funded-account/booked-engagement data per the vertical. Red flag: AI visibility tracked across surfaces but never reconciled to acquisition outcomes; no measurement of which surface drives which conversion type.
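Before scoring the zone, it helps to see what the measurement output of checks 1.1-1.5 looks like. A minimal Python sketch of the recommendation-share computation; the log format, surface names, and substring-match heuristic are illustrative assumptions, not any tracking vendor's actual API:

```python
from collections import defaultdict

# Illustrative log format: each record is one test prompt run on one surface.
# In practice these records come from whatever tracking tooling you deploy.
RESULTS = [
    {"surface": "ai_search",       "query": "best crm for smb",      "response": "...YourBrand and Competitor..."},
    {"surface": "productivity_ai", "query": "best crm for smb",      "response": "...Competitor only..."},
    {"surface": "procurement_ai",  "query": "crm vendor shortlist",  "response": "...YourBrand shortlisted..."},
]

BRAND = "YourBrand"  # placeholder

def recommendation_share(results, brand):
    """Share of test prompts per surface where the brand appears in the response."""
    totals, hits = defaultdict(int), defaultdict(int)
    for r in results:
        totals[r["surface"]] += 1
        # Naive substring check; real tracking should handle aliases and fuzzy matches.
        if brand.lower() in r["response"].lower():
            hits[r["surface"]] += 1
    return {s: hits[s] / totals[s] for s in totals}

for surface, share in sorted(recommendation_share(RESULTS, BRAND).items()):
    print(f"{surface:16s} {share:.0%}")
```

Real tracking needs brand-alias handling and week-over-week trending, but the share-per-surface shape is the deliverable that check 1.6 reconciles against acquisition data.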

Scoring: 5-6/6 trustworthy AIO measurement · 3-4 directional · 0-2 binding constraint; every Zone 2-5 finding is theoretical until measurement closes.

Zone 2 — Entity Hygiene + Cross-Surface Consistency

AEO entity hygiene focused on AI search citation eligibility. AIO entity hygiene extends to ensuring your entity is legible — and consistent — across all five surfaces. Inconsistency across surfaces (different brand descriptions in different databases, different category placements, different feature claims) creates confusion that AI systems resolve by recommending more consistently-defined competitors.

2.1 Brand Wikipedia article + Wikidata entry canonical. Same checks as AEO audit, but verify cross-database accuracy specifically including the sources productivity AI / shopping agents / procurement AI / SaaS copilots rely on (not just search engines). Red flag: Wikipedia/Wikidata accurate for AI search but inconsistent with what enterprise procurement systems pull from D&B, Hoovers, or industry-specific databases.

2.2 Google Knowledge Panel claimed + LinkedIn Company Page complete + Crunchbase canonical. The three databases productivity AI assistants most commonly cross-reference for brand context. Red flag: any of the three out of date or inconsistent with primary brand site.

2.3 Industry-specific directory presence for procurement AI. G2, TrustRadius, Capterra, Gartner Peer Insights (for software); BBB, NerdWallet, Trustpilot (for consumer); industry-specific procurement databases (for vertical SaaS, manufacturing, services). Each profile claimed, current, and consistent with primary brand description. Red flag: missing or low-engagement profiles on the procurement databases AI procurement tools actually consult.

2.4 Platform-partner directory presence for SaaS copilots. Shopify App Store, HubSpot Marketplace, Salesforce AppExchange, Slack App Directory, Notion gallery, Zapier directory — whichever are relevant to your brand. Listing accuracy, integration claims verified, partner-tier where applicable. Red flag: brand integrates with major platforms but isn't listed in their directories, OR is listed with stale information.

2.5 Schema markup deployed comprehensively. All standard schema types (Organization, Product, FAQPage, Review, AggregateRating, BreadcrumbList) plus vertical-specific schema (SoftwareApplication for B2B SaaS, FinancialProduct for fintech, MedicalOrganization for healthcare, LegalService for B2B services). Validated. Red flag: missing vertical-specific schema; AI systems can't reliably classify the brand's category.
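For reference, a minimal JSON-LD sketch combining Organization markup with vertical-specific SoftwareApplication markup for a hypothetical B2B SaaS brand. Every name, URL, and value is a placeholder; validate your real markup with Google's Rich Results Test or the Schema.org validator:

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "name": "ExampleCo",
      "url": "https://www.example.com",
      "sameAs": [
        "https://www.linkedin.com/company/exampleco",
        "https://www.crunchbase.com/organization/exampleco"
      ]
    },
    {
      "@type": "SoftwareApplication",
      "name": "ExampleCo Platform",
      "applicationCategory": "BusinessApplication",
      "operatingSystem": "Web",
      "offers": { "@type": "Offer", "price": "49.00", "priceCurrency": "USD" },
      "aggregateRating": { "@type": "AggregateRating", "ratingValue": "4.6", "reviewCount": "212" }
    }
  ]
}
```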

2.6 llms.txt deployed and current. Site root file specifying AI crawler priorities, key page hierarchy, brand description for AI consumption. Red flag: no llms.txt, or one not updated in 6+ months.
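The llms.txt format is still a community proposal rather than a ratified standard, so treat this as one plausible shape. A hedged sketch for a hypothetical brand, following the proposal's markdown convention (H1 name, blockquote summary echoing the canonical description from check 2.7, then link sections):

```markdown
# ExampleCo

> ExampleCo is a revenue platform for mid-market B2B teams, unifying CRM data,
> forecasting, and pipeline analytics. Founded 2019; SOC 2 Type II certified.

## Key pages

- [Product overview](https://www.example.com/product): what the platform does and for whom
- [Pricing](https://www.example.com/pricing): tiers, what each includes, implementation timeline
- [Integrations](https://www.example.com/integrations): supported platforms and setup guides
- [Customers](https://www.example.com/customers): named case studies with quantified outcomes
```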

2.7 Brand description consistency across surfaces. A 50-100 word canonical brand description used consistently across LinkedIn, Crunchbase, Wikipedia, Google Knowledge Panel, primary review sites, and platform directories. Red flag: each database has a different brand description; AI systems resolve inconsistency by trusting none of them.
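A quick way to catch drift before an AI system does: diff each live profile description against the canonical copy. A minimal stdlib-only Python sketch; the descriptions and the 0.85 threshold are illustrative:

```python
import difflib

# Canonical description vs. what each database currently shows (placeholders).
CANONICAL = "ExampleCo is a revenue platform for mid-market B2B teams..."
PROFILES = {
    "linkedin":   "ExampleCo is a revenue platform for mid-market B2B teams...",
    "crunchbase": "ExampleCo builds sales software.",
    "g2":         "ExampleCo is a revenue platform for mid-market teams...",
}

def drift_report(canonical, profiles, threshold=0.85):
    """Flag profiles whose description has drifted from the canonical copy."""
    for name, text in profiles.items():
        ratio = difflib.SequenceMatcher(None, canonical.lower(), text.lower()).ratio()
        status = "OK" if ratio >= threshold else "DRIFT"
        print(f"{name:12s} similarity={ratio:.2f}  {status}")

drift_report(CANONICAL, PROFILES)
```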

Scoring: 6-7/7 strong cross-surface entity foundation · 4-5 recoverable · 0-3 brand appears inconsistent across surfaces and gets relegated to "alternative" recommendations.

Zone 3 — Content Structure for Multi-Surface AI Consumption

AEO content structure focused on what AI search engines extract. AIO content structure extends to what each surface's AI extraction logic needs. The mechanics are similar but the priorities shift.

3.1 Question-format content covering AI search query universe. Same as AEO audit — page H2/H3 headings match real user query patterns. Red flag: content structured around brand-narrative rather than user-question patterns.

3.2 Direct-answer paragraphs in the first 100-150 words. Same as AEO audit — AI-extractable answer lives in page opening. Red flag: long brand-story intros before operational content.

3.3 Comparison content for evaluation queries. X-vs-Y pages, alternatives-to-incumbent pages, comparison tables with explicit recommendations. AI search engines AND productivity assistants AND procurement AI all weight comparison content. Red flag: no comparison content despite high commercial intent.

3.4 Use-case + vertical-specific landing pages. Generic category pages don't surface in long-tail AI recommendations. Use-case content captures the queries that drive most AI-mediated discovery volume. Red flag: only horizontal category content; no use-case or vertical-specific landing pages.

3.5 Integration-specific content for SaaS copilot surfaces. For brands relevant to platform copilots: documented integrations with named platforms, integration setup guides, use-case content showing the integrated workflow. The content SaaS copilots reference when recommending integrations. Red flag: brand integrates with major platforms but documentation is thin or buried, making it hard for platform copilots to recommend.

3.6 Procurement-readable content for AI procurement surfaces. For B2B brands: clear pricing transparency (or "contact sales" with clear tiering), security/compliance certifications surfaced (SOC 2, ISO 27001, HIPAA, GDPR where relevant), implementation timeline documented, customer onboarding process described. AI procurement systems specifically look for these signals. Red flag: B2B brand has weak procurement-readable content; AI procurement systems can't generate complete vendor profiles.

3.7 Authority-signaling content (case studies + research + thought leadership) with named-author bylines. Every piece of authority content surfaces named-author credentials. This applies across all 5 surfaces; AI systems weight credentialed authorship as a primary trust signal. Red flag: anonymous or "Team [Brand]" bylines on authority content.

Scoring: 6-7/7 AI-readable across all surfaces · 4-5 partially extractable · 0-3 invisible to most surface-specific recommendation pipelines.

Zone 4 — Authority Signals Across the Five Surfaces

AEO authority focused on signals AI search engines weight. AIO authority extends to signals each of the five surfaces weights — which overlap but differ in priority.

4.1 Earned media in AI-cited sources. Tier-1 industry publications, tier-2 vertical publications, podcast guest appearances, conference speaking. Same as AEO audit. Red flag: no earned media in past 12 months.

4.2 Review depth + recency on category-relevant platforms. G2/TrustRadius/Capterra/Gartner for B2B; BBB/Trustpilot/Reddit/Healthgrades/etc. for consumer/regulated. Review count above category median, average rating above category median, recency weighted disproportionately (2025-2026 reviews count more heavily than older ones), active response to negative reviews. Red flag: stale review profiles; unanswered reviews; ratings below 4.0.

4.3 Named-expert / founder / executive bylines and profiles. LinkedIn profiles complete with credentials, professional bios across the brand site, individual expert pages with publications/speaking/affiliations. AI systems triangulate authority through named-individual signals — especially productivity AI assistants when users ask "who's a good [category] expert" or "who has written about [topic]". Red flag: founder/CEO profile thin or absent; subject-matter experts not surfaced.

4.4 Platform-partner certifications + tier surfacing. Salesforce ISV Partner status, HubSpot Solutions Partner tier, Shopify Plus Partner certification, AWS/GCP/Azure partnership levels. Surfaced on brand site, structured data, and platform directories. SaaS copilots weight these heavily when recommending integrations. Red flag: brand holds partner certifications but they're not surfaced visibly.

4.5 Customer logo grid + case study depth + quantified outcomes. Named-customer logos, case study pages with structured problem→solution→outcome narrative, quantified results (X% improvement in Y, $Z saved, etc.). Procurement AI and SaaS copilots both weight named-customer references disproportionately. Red flag: anonymous case studies; vague outcome claims; logo grid missing or stale.

4.6 Reddit + community presence with authenticated brand accounts. Reddit threads, Discord communities, vertical-specific forums where genuine helpful participation has accrued. AI search engines and shopping agents both surface community-mentioned brands disproportionately. Red flag: zero community presence; or, worse, shadowbanned/banned brand accounts.

Scoring: 5-6/6 strong multi-surface authority · 3-4 building · 0-2 invisible across the broader trust graph regardless of single-surface optimization.

Zone 5 — Technical Crawlability + Agent Accessibility

AEO technical crawlability focused on AI search bot accessibility. AIO technical crawlability extends to ensuring all AI agents — including agentic browsing systems that simulate user behavior — can access, parse, and complete workflows on your site.

5.1 AI search crawler accessibility verified. GPTBot, ClaudeBot, PerplexityBot, GoogleOther accessible across all marketing, product, and content pages. Red flag: blocked in robots.txt or CDN rules.
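What an explicitly permissive robots.txt can look like for those four crawlers. The /admin/ disallow is a placeholder, and vendors do rename user agents, so verify each string against current crawler documentation before shipping:

```text
# Explicitly allow the major AI crawlers (verify current UA strings per vendor)
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: GoogleOther
Allow: /

# Everything else: normal rules (the /admin/ path is a placeholder)
User-agent: *
Allow: /
Disallow: /admin/
```

Remember that CDN and WAF rules can block these bots even when robots.txt allows them; check both layers.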

5.2 Agentic browsing accessibility. OpenAI's agentic browsing, Anthropic Claude with computer use, Perplexity's shopping agent — these systems execute user-flow simulations including form interactions, calendar bookings, demo requests. Forms that require CAPTCHA, complex JavaScript flows, or unusual interactions block agent workflows. Red flag: demo request forms, booking flows, or contact pages hostile to programmatic interaction.
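One way to test this yourself: drive the form with a headless browser and see whether it completes without human intervention. A hedged Playwright sketch; the URL and selectors are hypothetical, so substitute your own demo-request flow:

```python
# pip install playwright && playwright install chromium
from playwright.sync_api import sync_playwright

DEMO_URL = "https://www.example.com/request-demo"  # hypothetical

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto(DEMO_URL, wait_until="networkidle")
    # If these steps fail (CAPTCHA, JS-only widgets, unusual controls),
    # agentic browsing systems will likely abandon the flow too.
    page.fill("input[name='email']", "agent-test@example.com")
    page.fill("input[name='company']", "Agent Accessibility Test")
    page.click("button[type='submit']")
    page.wait_for_load_state("networkidle")
    print("Form submitted without manual intervention:", page.url)
    browser.close()
```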

5.3 Server-side rendering on key pages. Marketing pages, product pages, and authority content (case studies, founder bios, customer testimonials) all readable without JS execution. Red flag: critical content requires JavaScript to render.
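A quick verification: fetch the raw HTML without executing JavaScript and confirm the critical copy is present. A minimal stdlib-only Python sketch; the URL and phrases are placeholders, and the User-Agent is abbreviated (real crawler UA strings are longer):

```python
import urllib.request

# Placeholder page and the copy that must be visible without JS execution.
URL = "https://www.example.com/customers/acme-case-study"
MUST_CONTAIN = ["Acme Corp", "43% reduction", "About the author"]

# Identify as an AI crawler to also catch UA-based blocking.
req = urllib.request.Request(URL, headers={"User-Agent": "GPTBot"})
html = urllib.request.urlopen(req).read().decode("utf-8", errors="replace")

for phrase in MUST_CONTAIN:
    status = "present" if phrase in html else "MISSING (JS-rendered?)"
    print(f"{phrase!r}: {status}")
```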

5.4 Core Web Vitals green on high-conversion pages. LCP under 2.5s, INP under 200ms, CLS under 0.1 on the highest-traffic + highest-conversion-value pages. AI agents respect performance budgets and abandon slow flows. Red flag: LCP above 3s on flagship product or pricing pages.
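Field data at the 75th percentile is what to measure against. A sketch querying the Chrome UX Report API for the three metrics, assuming you have a CrUX API key; the page URL is a placeholder, and the endpoint and response shape should be verified against current CrUX documentation:

```python
import json
import urllib.request

# Requires a Chrome UX Report API key (see developer.chrome.com/docs/crux/api).
API_KEY = "YOUR_CRUX_API_KEY"  # placeholder
ENDPOINT = f"https://chromeuxreport.googleapis.com/v1/records:queryRecord?key={API_KEY}"

body = json.dumps({
    "url": "https://www.example.com/pricing",  # a high-conversion page
    "metrics": [
        "largest_contentful_paint",
        "interaction_to_next_paint",
        "cumulative_layout_shift",
    ],
}).encode()

req = urllib.request.Request(ENDPOINT, data=body, headers={"Content-Type": "application/json"})
record = json.load(urllib.request.urlopen(req))["record"]

# Compare each p75 value to the green thresholds above (2.5s / 200ms / 0.1).
for metric, data in record["metrics"].items():
    print(metric, "p75 =", data["percentiles"]["p75"])
```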

5.5 Structured data for agent-completable workflows. For booking-relevant brands: Event/Reservation/MedicalProcedure schema as appropriate. For purchase-relevant brands: Product/Offer schema with all required attributes. For B2B brands: clear next-step CTA structured with Action schema. AI agents look for structured data to confirm action paths exist. Red flag: no structured data signaling completable next actions.
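For the purchase-relevant case, a minimal Product/Offer JSON-LD sketch signaling a completable purchase path for a hypothetical D2C product; every value is a placeholder:

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "sku": "EW-1001",
  "image": "https://www.example.com/img/widget.jpg",
  "offers": {
    "@type": "Offer",
    "price": "79.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock",
    "url": "https://www.example.com/widget"
  }
}
```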

5.6 API + documentation accessibility for SaaS copilot surfaces. For brands relevant to platform integrations: public API documentation, integration guides, sample code, status pages. SaaS copilots reference this material when recommending technical integrations. Red flag: API documentation gated, sparse, or outdated.

Scoring: 5-6/6 fully crawlable + agent-accessible · 3-4 recoverable · 0-2 technically invisible to large fractions of the AI-mediated discovery surface.

The AIO prioritization matrix

After scoring all 5 zones, the rule for AIO: start with Zone 1 (measurement) — same as AEO audit. Then prioritize by surface relevance to your business.

Zone 1 < 4/6 → measurement deployment is P0. Without it, every other finding is theoretical. Includes the surface-specific tracking deployments for productivity AI, shopping agents, procurement AI, and SaaS copilots as relevant to your category. 3-4 weeks to deploy comprehensively.

Zone 5 < 3/6 → technical crawlability + agent accessibility fix. Critical because it can render Zone 2-4 work invisible. 1-2 weeks for most fixes.

Zone 2 < 5/7 → entity hygiene sprint. Wikipedia, Wikidata, Knowledge Panel, industry directories, platform-partner directories, vertical-specific procurement databases. 4-6 weeks.

Zone 3 < 5/7 → content restructure with question-format, use-case, integration-specific, and procurement-readable content. 8-16 weeks. Longest workstream.

Zone 4 < 4/6 → authority build across earned media, review depth, named-expert profiles, platform certifications, customer references, community presence. 6-12 month compounding build.

The surface-prioritization rule: weight zones by which surfaces matter most to your category.

D2C brands: weight Surfaces 1 (AI search) + 3 (shopping agents) heaviest. Surfaces 2 (productivity AI) + 5 (SaaS copilots) tertiary.

B2B SaaS brands: weight Surfaces 1 (AI search) + 4 (procurement AI) + 5 (SaaS copilots) heaviest. Surface 2 (productivity AI) secondary.

Fintech brands: weight Surfaces 1 (AI search) + 4 (procurement AI for B2B fintech, productivity AI for consumer). Surface 3 (shopping agents) for consumer products.

Healthcare brands: weight Surfaces 1 (AI search) + 2 (productivity AI — patients researching during work hours). Surfaces 3-5 less critical.

B2B services brands: weight Surfaces 1 (AI search) + 4 (procurement AI). Surface 2 (productivity AI) for professional service queries during work.

Most accounts find Zone 1 below 4 (multi-surface measurement nonexistent), Zone 2 below 5 (cross-surface entity inconsistencies), and Zone 4 below 4 (authority signals fragmented across surfaces). That combination produces the highest-leverage 90-day AIO rebuild for most categories.
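If you want the triage to be mechanical, the matrix's thresholds encode directly. A minimal Python sketch; the scores are illustrative placeholders:

```python
# Thresholds from the prioritization matrix above: zone -> (max score, trigger floor).
THRESHOLDS = {
    "zone1_measurement": (6, 4),
    "zone2_entity":      (7, 5),
    "zone3_content":     (7, 5),
    "zone4_authority":   (6, 4),
    "zone5_technical":   (6, 3),
}

# Illustrative audit scores; substitute your own.
SCORES = {
    "zone1_measurement": 2,
    "zone2_entity":      4,
    "zone3_content":     5,
    "zone4_authority":   3,
    "zone5_technical":   5,
}

triggered = [z for z, (_, floor) in THRESHOLDS.items() if SCORES[z] < floor]

for zone in triggered:
    print(f"{zone}: {SCORES[zone]}/{THRESHOLDS[zone][0]} below threshold, remediation triggered")

# Three or more zones below threshold signals a structural rebuild
# rather than tactical optimization (see the closing section).
if len(triggered) >= 3:
    print("Structural AIO rebuild indicated.")
```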

The citation-to-recommendation conversion math

Why AIO compounds harder than AEO: each surface adds compounding rather than linear value to acquisition.

In Praxxii engagement data across mid-market accounts running AIO rebuilds in 2026:

  • Accounts moving multi-surface recommendation share from below 20% to above 45% see total AI-mediated acquisition become 35-58% of new customer/MQO/booking volume (vs 4-12% at intake)

  • Each additional surface optimized adds compounding acquisition share — not linear. Moving from AI search only to AI search + productivity AI + 1 additional relevant surface typically produces 2.4-3.1× the lift of optimizing AI search alone.

  • The compounding effect reflects how prospects move across surfaces during evaluation. A B2B prospect might encounter your brand in Perplexity (AI search), again in Microsoft Copilot when asking about category options (productivity AI), and again when procurement AI generates the formal RFP shortlist. Three touches create consideration; one touch creates awareness.

  • Fully-loaded AIO investment per recovered MQO / qualified-prospect / funded-account lands at $180-$420 in current engagement data — substantially below blended paid acquisition.

The implication: AIO has higher per-investment ROI than AEO alone because the surface compounding produces multi-touch consideration that single-surface optimization can't match.

What to do this quarter

Pull each of the 32 AIO-specific checks for your brand. Score each zone. Identify the lowest-scoring two. Cross-reference against the prioritization matrix AND the surface-prioritization rule for your category. Build the rebuild plan.

If your audit produces:

Zone 1 below 4: deploy measurement infrastructure across all relevant surfaces. The work is bounded — 3-4 weeks for most categories.

Zone 5 below 3: technical crawlability + agent accessibility sprint. 1-2 weeks.

Zone 2 below 5: cross-surface entity hygiene sprint. 4-6 weeks.

Zone 3 below 5: content restructure prioritized against surface relevance. 8-16 weeks.

Zone 4 below 4: multi-surface authority build. 6-12 month cycle.

If three or more zones score below threshold, you're looking at a structural AIO rebuild rather than tactical optimization. The right move is a 90-day diagnostic-and-rebuild engagement following the Day 19 audit framework adapted for AIO scope.

If you'd rather have an outside team run the AIO-specific audit, prioritize findings against your category's surface relevance, and stand up the rebuild alongside your in-house team — that's part of the discovery-edge work Praxxii Global does. Free 60-minute diagnostic call before any commercial commitment.

The AIO window is wider than the AEO window because the surfaces are still consolidating in 2026. Most accounts haven't deployed multi-surface measurement yet, which means few are optimizing for anything beyond AI search engines. The brands that capture the full surface area first will own positions through 2028 that incumbents who never updated their audit framework can't dislodge. Run the audit. The binding constraint is rarely where you've been looking.