We Checked 200 WordPress Sites for AI Search Readiness -- The Results Are Rough

We spent two weeks manually checking 200 WordPress sites across 10 industries. We looked at five things: llms.txt files, robots.txt AI rules, Schema markup, meta descriptions, and structured FAQ content. We scored each site on a 100-point scale.

The average score was 22 out of 100.

That number isn't shocking because WordPress sites are bad at SEO -- most of them handle traditional SEO reasonably well. It's shocking because almost nobody is optimizing for AI search, even though AI platforms already drive measurable referral traffic.

Here's everything we found.

Methodology

We selected 200 WordPress-powered sites by randomly sampling 20 sites from each of 10 industries:

  1. SaaS / Software
  2. E-commerce
  3. Digital Marketing Agencies
  4. Health & Wellness
  5. Education / Online Courses
  6. Real Estate
  7. Legal Services
  8. Local Business (restaurants, salons, repair shops)
  9. Finance / Fintech
  10. Media / Publishing

We verified each site runs WordPress using BuiltWith and WP fingerprinting (checking for /wp-content/ and /wp-includes/ paths). Sites ranged from small businesses with 20 pages to publishers with 10,000+ posts.
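The fingerprint half of that verification is easy to reproduce. A minimal Python sketch (the helper name is ours; a real check would first fetch the page source with curl or urllib):

```python
import re

def looks_like_wordpress(html: str) -> bool:
    """Heuristic WP fingerprint: look for core asset paths in the page source."""
    # WordPress serves themes/plugins from /wp-content/ and core assets
    # from /wp-includes/; either path appearing in the HTML is a strong signal.
    return bool(re.search(r"/wp-(content|includes)/", html))

# Example with a snippet of page source:
sample = '<link rel="stylesheet" href="https://example.com/wp-content/themes/x/style.css">'
print(looks_like_wordpress(sample))           # True
print(looks_like_wordpress("<p>plain</p>"))   # False
```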

What we checked

For each site, we tested five categories whose maximum points sum to 100:

| Category | Max Points | What We Looked For |
| --- | --- | --- |
| llms.txt | 20 | File exists, properly formatted, descriptive summaries |
| AI Bot Rules | 20 | robots.txt includes rules for GPTBot, ClaudeBot, PerplexityBot, or other AI crawlers |
| Schema Markup | 25 | JSON-LD present, relevant types (Article, Organization, FAQ, Breadcrumb, Product) |
| Meta Coverage | 20 | Meta descriptions on all pages (sampled 10 pages per site) |
| FAQ Structure | 15 | Structured FAQ sections with H2/H3 headings and concise answers |
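A site's total score is simply the sum of the points earned in each category. A minimal sketch of the arithmetic, with weights copied from the rubric (the function and dict names are ours):

```python
# Max points per category, mirroring the rubric table above.
WEIGHTS = {
    "llms_txt": 20,
    "ai_bot_rules": 20,
    "schema": 25,
    "meta_coverage": 20,
    "faq_structure": 15,
}

def readiness_score(earned: dict) -> int:
    """Sum per-category points, capping each at its rubric maximum."""
    return sum(min(earned.get(k, 0), cap) for k, cap in WEIGHTS.items())

# A site with full schema and meta coverage but nothing else:
print(readiness_score({"schema": 25, "meta_coverage": 20}))  # 45
```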

We accessed each site directly -- no APIs, no automated tools, no crawlers. Just a browser, curl, and a spreadsheet.

The headline numbers

| Metric | Result |
| --- | --- |
| Average AI Readiness Score | 22 / 100 |
| Median score | 18 / 100 |
| Sites scoring above 60 | 7 out of 200 (3.5%) |
| Sites scoring 0 | 14 out of 200 (7%) |
| Highest score | 87 (a SaaS documentation site) |

Let that sink in: 7% of sites we checked scored a flat zero. No schema, no meta descriptions, no AI bot rules, no llms.txt, no FAQ content. These weren't abandoned blogs -- they were active businesses with recent content.

Finding #1: Almost nobody has llms.txt

6 out of 200 sites (3%) had an llms.txt file.

Of those six, five were SaaS companies and one was a digital marketing agency. Zero e-commerce sites. Zero local businesses. Zero legal or real estate sites.

Of the six that had llms.txt:

  • 4 were well-formatted with descriptive summaries per link
  • 1 was just a list of URLs with no descriptions (defeats the purpose)
  • 1 was malformed -- it used HTML instead of Markdown

Only two sites also had an llms-full.txt file, both SaaS companies with extensive documentation.

The takeaway: If you add an llms.txt file to your WordPress site today, you'll be ahead of 97% of the sites we checked. That's not a marketing claim -- it's what the data shows.
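For reference, a well-formed llms.txt follows a simple Markdown shape: an H1 with the site name, a blockquote summary, and H2 sections of links, each with a one-line description. A minimal example (all names and URLs are placeholders):

```markdown
# Example Co

> Example Co makes invoicing software for freelancers.

## Docs

- [Getting started](https://example.com/docs/start): Install and send your first invoice
- [API reference](https://example.com/docs/api): Endpoints, authentication, and rate limits

## Blog

- [Pricing guide](https://example.com/blog/pricing): How our plans compare
```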

Finding #2: robots.txt AI rules are rare

23 out of 200 sites (11.5%) had any robots.txt rules mentioning AI crawlers.

But here's the interesting part -- the majority of those rules were blocks, not allows:

| Rule Type | Count |
| --- | --- |
| Block all AI bots | 14 |
| Block training bots only, allow search bots | 4 |
| Explicitly allow AI bots | 3 |
| Mixed/custom rules | 2 |

Most sites that had AI bot rules used a blanket Disallow: / for GPTBot, CCBot, and similar crawlers. Only 7 sites (3.5%) took a nuanced approach -- blocking training bots while allowing search-oriented crawlers like OpenAI's OAI-SearchBot and PerplexityBot.

The remaining 177 sites had no mention of AI bots in their robots.txt at all. Their content is technically accessible to every AI crawler, but they probably don't know it.

The takeaway: Not configuring AI bot rules is a choice -- just not an informed one. If you want AI search engines to cite your content, you need to explicitly allow the right bots. If you don't want AI training on your content, you need to explicitly block those bots. Doing neither is leaving the decision to someone else.
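A nuanced ruleset like the one described above might look like the snippet below. The user-agent names are the ones these vendors currently document; verify them against each vendor's crawler docs before deploying:

```
# Block crawlers that collect training data
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

# Allow search-oriented crawlers so AI answers can cite you
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /
```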

Finding #3: Schema markup is the bright spot (sort of)

91 out of 200 sites (45.5%) had some form of JSON-LD schema markup.

This was the highest-scoring category, but there's a caveat: most implementations were minimal.

| Schema Type | Sites Using It |
| --- | --- |
| Organization | 72 |
| WebSite (with SearchAction) | 58 |
| Article / BlogPosting | 49 |
| BreadcrumbList | 41 |
| LocalBusiness | 18 |
| Product | 15 |
| FAQ | 9 |
| HowTo | 3 |

The typical site had Organization and WebSite schema (usually auto-generated by their SEO plugin) but nothing else. Only 9 sites had FAQ schema, and only 3 had HowTo schema -- both of which are highly valuable for rich results and AI citation.

The takeaway: Having some schema is better than none, but most sites stop at the bare minimum. Adding Article schema to blog posts, FAQ schema to FAQ sections, and BreadcrumbList for navigation can significantly improve both Google rich results and AI model comprehension. See our Schema Markup guide for WordPress.
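As an illustration, FAQ schema is just a small JSON-LD block in the page markup. This hand-written sketch follows schema.org's FAQPage type; the question and answer text are placeholders:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "Does the free plan include schema markup?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Yes. JSON-LD schema is generated automatically on every page."
    }
  }]
}
</script>
```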


Finding #4: Meta descriptions are inconsistent

121 out of 200 sites (60.5%) had meta descriptions on all sampled pages.

We checked 10 pages per site (homepage + 9 random internal pages). A site "passed" only if all 10 had meta descriptions.

The breakdown:

  • 121 sites (60.5%): All sampled pages had meta descriptions
  • 47 sites (23.5%): Some pages had them, others didn't (usually blog posts had them, but category/archive pages didn't)
  • 32 sites (16%): No meta descriptions on any page

This is the easiest item on the list to fix. Every major SEO plugin can set meta descriptions, either manually or from templates. There's no technical barrier -- it's just neglect.
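Checking a page for a meta description takes a few lines once you have the HTML. A small sketch using only the standard library (`has_meta_description` is our own helper name, not a library API):

```python
import re

def has_meta_description(html: str) -> bool:
    """True if the page source contains a non-empty meta description tag."""
    # Match <meta name="description" content="..."> in either attribute order.
    name_first = (r'<meta\s+[^>]*name=["\']description["\'][^>]*'
                  r'content=["\'][^"\']+["\']')
    content_first = (r'<meta\s+[^>]*content=["\'][^"\']+["\'][^>]*'
                     r'name=["\']description["\']')
    return bool(re.search(name_first, html, re.IGNORECASE) or
                re.search(content_first, html, re.IGNORECASE))

page = '<head><meta name="description" content="Plumbing services in Austin."></head>'
print(has_meta_description(page))            # True
print(has_meta_description("<head></head>")) # False
```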

Finding #5: Structured FAQ content barely exists

16 out of 200 sites (8%) had structured FAQ sections on any page.

By "structured," we mean: a clear FAQ heading (H2 or H3), individual questions as subheadings, and concise paragraph answers. Not a wall-of-text FAQ page with questions buried in prose.

Of the 16 sites with structured FAQs:

  • 9 also had FAQ schema markup (these scored well overall)
  • 7 had the content structure but no schema (missed opportunity)

This is arguably the biggest gap in AI readiness. AI models love Q&A format. When ChatGPT answers "How much does a WordPress developer cost?" it's looking for content structured exactly like an FAQ. Sites without that structure are at a disadvantage.

Industry breakdown

Average AI Readiness Score by industry:

| Industry | Avg Score | Highest | Lowest |
| --- | --- | --- | --- |
| SaaS / Software | 38 | 87 | 8 |
| Digital Marketing | 34 | 71 | 11 |
| Media / Publishing | 29 | 65 | 5 |
| Finance / Fintech | 24 | 52 | 3 |
| Education | 22 | 48 | 0 |
| E-commerce | 19 | 44 | 0 |
| Health & Wellness | 17 | 39 | 0 |
| Legal Services | 16 | 41 | 0 |
| Real Estate | 14 | 33 | 0 |
| Local Business | 12 | 28 | 0 |

No surprise that SaaS and marketing agencies lead -- they're closer to the technology and more likely to follow industry trends. The gap between SaaS (38) and local business (12) is striking though. Local businesses are the least prepared for AI search, but they arguably have the most to gain from it -- "best plumber near me" is exactly the type of query AI search handles.

The 5 biggest gaps (ranked by impact)

Based on our data and the relative impact of each factor on AI search visibility:

1. No llms.txt (97% of sites). Highest impact, lowest adoption. A single file that takes 15 minutes to create. The gap between effort and reward is enormous.

2. No FAQ structure (92% of sites). FAQ content is the #1 format AI models pull from when answering questions. Adding FAQ sections to your top 10 pages would immediately improve AI citability.

3. No AI bot rules in robots.txt (88.5% of sites). Most sites haven't consciously decided how AI crawlers should interact with their content. This is a strategic decision being made by default.

4. Minimal or missing schema markup (54.5% of sites have none at all). Having Organization schema is a start, but sites that add Article, FAQ, and BreadcrumbList schema are far more likely to earn rich results and AI citations.

5. Missing meta descriptions (39.5% of sites incomplete or missing). The simplest fix on this list. Template-based meta descriptions are better than no meta descriptions.

What an "AI-ready" site looks like

The top-scoring site in our study (87 points, a SaaS documentation platform) had:

  • A well-structured llms.txt with 40+ links organized by section, each with a descriptive summary
  • An llms-full.txt containing their complete documentation in Markdown
  • robots.txt rules allowing search-oriented AI bots while blocking training bots
  • JSON-LD schema on every page: Organization, WebSite, Article, BreadcrumbList, and FAQ
  • Meta descriptions on every page, customized (not template-generated)
  • FAQ sections on 12 of their most-visited pages, with proper heading structure and FAQ schema

They scored 20/20 on llms.txt, 18/20 on AI bot rules, 22/25 on schema, 18/20 on meta coverage, and 9/15 on FAQ structure (they lost points because not all FAQ sections were perfectly structured).

This site was also the only one in our study that appeared in ChatGPT search results for relevant queries. Correlation isn't causation -- but it's not nothing, either.

How to check your own site

You can run through this same checklist in about 10 minutes:

  1. llms.txt: Visit yoursite.com/llms.txt. Does it load? Is it properly formatted Markdown with descriptions?

  2. robots.txt AI rules: Visit yoursite.com/robots.txt. Search for "GPTBot," "ClaudeBot," or "PerplexityBot." Any rules?

  3. Schema: Open any page, right-click > View Source, search for application/ld+json. What types are present?

  4. Meta descriptions: Check 5 random pages. Right-click > View Source, search for name="description". Present on all of them?

  5. FAQ structure: Look at your top 5 pages. Do any have a clearly structured FAQ section with individual questions as headings?
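Step 3 in the checklist can be automated too: extract the JSON-LD blocks from a page's source and list the schema types present. A sketch with the standard library only (the helper name is ours):

```python
import json
import re

def jsonld_types(html: str) -> set:
    """Collect @type values from every JSON-LD block in a page's source."""
    types = set()
    blocks = re.findall(
        r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
        html, re.DOTALL | re.IGNORECASE)
    for block in blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD counts as missing
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            # Some SEO plugins wrap everything in an @graph array.
            for node in item.get("@graph", [item]):
                if not isinstance(node, dict):
                    continue
                t = node.get("@type")
                if isinstance(t, str):
                    types.add(t)
                elif isinstance(t, list):
                    types.update(x for x in t if isinstance(x, str))
    return types

page = ('<script type="application/ld+json">'
        '{"@context": "https://schema.org", "@type": "Organization"}'
        '</script>')
print(jsonld_types(page))  # {'Organization'}
```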

If you want to automate this, Prime SEO covers all five categories: it generates llms.txt automatically, manages AI bot access in robots.txt, adds JSON-LD schema markup, ensures meta descriptions are set, and its AI Settings module lets you monitor which AI crawlers are actually visiting your site.

What this means for 2026

The window of opportunity here is wide open. When only 3% of sites have llms.txt and 8% have structured FAQ content, any effort you put in immediately puts you ahead of the majority.

Compare this to traditional SEO circa 2010: the sites that adopted Schema markup early, before Google rewarded it with rich results, were ahead for years. The sites that waited until it became "required" had to catch up to an established field.

AI search optimization is in that same early phase. The standards are forming, adoption is minimal, and the competitive advantage of being early is massive.

The data is clear: most WordPress sites aren't ready for AI search. The question is whether yours will be.

Frequently Asked Questions

How were the 200 sites selected for this study?

We used a combination of WordPress showcase directories, industry-specific business listings, and random sampling from Google search results. Sites were verified as WordPress-powered using BuiltWith and WP fingerprinting. We selected 20 sites per industry to ensure representation across different sectors and site sizes, from small business sites with 20 pages to large publishers with 10,000+ posts.

What's a good AI Readiness Score to aim for?

Based on our scoring criteria, we'd consider 50+ a solid baseline and 70+ as well-optimized. The average was 22, so anything above 40 puts you in the top quartile. The most impactful improvements are llms.txt (which can add up to 20 points) and structured FAQ content (up to 15 points) -- both of which take minimal time to implement.

Will AI search readiness affect my Google rankings?

Not directly -- Google doesn't use llms.txt or AI bot access rules as ranking factors. However, several items in our scoring criteria (schema markup, meta descriptions, structured content) are already Google ranking factors or rich result qualifiers. Optimizing for AI readiness improves your traditional SEO at the same time. The real benefit is visibility in a new channel that's growing at 30%+ annually.

How often should I re-check my AI search readiness?

Monthly is a reasonable cadence. The AI search landscape is evolving quickly -- new crawlers appear, standards develop, and AI platforms change how they discover and cite content. A monthly check ensures you catch any regressions (like a plugin update that removes your schema) and adapt to new opportunities. Prime SEO's AI Crawler Stats can help automate the monitoring side.
