Research · updated June 2026

The State of AI Search Readiness

We ran Oraql's deterministic audit on 154 leading websites — across SaaS, e-commerce, dev tools, fintech and media — to see how ready the web is for answer engines like ChatGPT, Perplexity, Claude and Google AI. No LLM guesswork: every score is reproducible from public signals.
Average score: 80 / 100
Median score: 81
Scored A or B: 55%
Scored D or F: 9%

Grade distribution

Most sites land in the B–C band: 122 of 154, or 79%. A long tail of 14 D/F sites (9%) is effectively invisible to answer engines.
A: 18 sites · 11.7%
B: 66 sites · 42.9%
C: 56 sites · 36.4%
D: 9 sites · 5.8%
F: 5 sites · 3.2%
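
The percentages above follow directly from the grade counts. As a quick check, the short Python sketch below recomputes them, along with the headline "A or B" and "D or F" shares. The variable and function names are illustrative only; the counts are the ones reported in this section.

```python
# Grade counts as reported in the distribution above.
grade_counts = {"A": 18, "B": 66, "C": 56, "D": 9, "F": 5}

total_sites = sum(grade_counts.values())


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


grade_shares = {grade: share(n, total_sites) for grade, n in grade_counts.items()}

a_or_b = grade_counts["A"] + grade_counts["B"]  # 84 sites
d_or_f = grade_counts["D"] + grade_counts["F"]  # 14 sites

print(total_sites)   # 154
print(grade_shares)  # {'A': 11.7, 'B': 42.9, 'C': 36.4, 'D': 5.8, 'F': 3.2}
print(round(100 * a_or_b / total_sites), round(100 * d_or_f / total_sites))  # 55 9
```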

Where sites win — and where they lose

Average score per category as a share of the points available, weakest first. The pattern is stark.
Structured Data: 5.5 / 15 avg · 36.9%
Answer-Ready Content: 5.4 / 12 avg · 45.4%
Metadata & Semantics: 12.5 / 15 avg · 83.6%
Content Extractability: 19.2 / 20 avg · 95.8%
Crawl Infrastructure: 7.8 / 8 avg · 97.2%
AI Crawler Access: 24.4 / 25 avg · 97.7%
Technical Hygiene: 5.0 / 5 avg · 99.2%
The takeaway.

AI crawler access, crawl infrastructure, content extractability and technical hygiene all average above 95% of their available points. Structured data (37%) and answer-ready content (45%) are where sites lose the most points on average, and those are exactly the signals AI engines lean on to quote, cite and recommend a page. The web is crawlable; it just isn't quotable.
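
To make "share of the points available" concrete, here is a small Python sketch that divides each category's average by its maximum and sorts weakest first. It uses the one-decimal averages printed in this section, so the results can differ slightly from the published percentages, which were presumably computed from unrounded averages (an assumption; the report does not say). All names in the code are illustrative.

```python
# (average points, points available) per category, as printed in this section.
categories = {
    "Structured Data": (5.5, 15),
    "Answer-Ready Content": (5.4, 12),
    "Metadata & Semantics": (12.5, 15),
    "Content Extractability": (19.2, 20),
    "Crawl Infrastructure": (7.8, 8),
    "AI Crawler Access": (24.4, 25),
    "Technical Hygiene": (5.0, 5),
}

# The seven category maxima should add up to the 100-point scale.
points_available = sum(max_pts for _, max_pts in categories.values())
# The seven averages should add up to roughly the reported average score.
average_total = sum(avg for avg, _ in categories.values())

# Share of available points per category, weakest first.
shares = sorted(
    ((name, 100 * avg / max_pts) for name, (avg, max_pts) in categories.items()),
    key=lambda item: item[1],
)

for name, pct in shares:
    print(f"{name}: {pct:.1f}%")
print(points_available, round(average_total, 1))
```

With the rounded inputs, Structured Data comes out at 36.7% rather than 36.9%, and the seven averages sum to 79.8, in line with the reported average score of 80. The ordering matches the list above.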

Five signals that decide AI readiness

97% have a meta description. The basics are well covered: only 4 of 154 sites are missing one.
49% publish an llms.txt file. Just under half (76 sites) expose the emerging standard that hands AI engines a clean summary of the site.
36% publish no structured data. 56 sites ship zero JSON-LD, so answer engines have to guess what each page is even about.
25% lack a clean H1 heading. 39 homepages don't expose one clear H1, leaving engines a weaker anchor for what the page is about.
5% block at least one AI crawler. 8 sites disallow GPTBot, ClaudeBot, PerplexityBot or Google-Extended (or block all bots) in robots.txt.
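
Several of these signals can be checked with nothing more than the Python standard library. The sketch below is not Oraql's implementation; it is a minimal illustration under stated assumptions: a "clean H1" is taken to mean exactly one `<h1>` on the page, structured data means at least one `application/ld+json` script tag, and the helper names (`homepage_signals`, `blocked_ai_crawlers`) are hypothetical. It works on HTML and robots.txt text you have already fetched.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# The four crawler user agents named in this report.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


class SignalParser(HTMLParser):
    """Collects three homepage signals: meta description, JSON-LD, H1 count."""

    def __init__(self) -> None:
        super().__init__()
        self.has_meta_description = False
        self.json_ld_blocks = 0
        self.h1_count = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "description":
            # Count the description only if its content is non-empty.
            if (attr.get("content") or "").strip():
                self.has_meta_description = True
        elif tag == "script" and (attr.get("type") or "").lower() == "application/ld+json":
            self.json_ld_blocks += 1
        elif tag == "h1":
            self.h1_count += 1


def homepage_signals(html: str) -> dict:
    """Return the three HTML signals for one page (assumed definitions)."""
    parser = SignalParser()
    parser.feed(html)
    return {
        "meta_description": parser.has_meta_description,
        "structured_data": parser.json_ld_blocks > 0,
        "clean_h1": parser.h1_count == 1,  # assumption: exactly one H1
    }


def blocked_ai_crawlers(robots_txt: str, url: str = "https://example.com/") -> list:
    """Return the AI crawler user agents that robots.txt disallows for url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not rules.can_fetch(agent, url)]


# Made-up sample inputs, for illustration only.
sample_html = (
    '<html><head><meta name="description" content="Demo page">'
    '<script type="application/ld+json">{"@type": "Organization"}</script>'
    "</head><body><h1>Demo</h1></body></html>"
)
sample_robots = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"

print(homepage_signals(sample_html))
print(blocked_ai_crawlers(sample_robots))  # ['GPTBot']
```

A robots.txt that blocks all bots (`User-agent: *` with `Disallow: /`) returns all four agents, which mirrors the "or block all bots" case counted above.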

The leaderboard

Most AI-ready
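| # | Website | Score | Grade |
|---|---------|-------|-------|
| 1 | jasper.ai | 99 | A |
| 2 | salesforce.com | 97 | A |
| 3 | beehiiv.com | 96 | A |
| 4 | apollo.io | 95 | A |
| 5 | auth0.com | 95 | A |
| 6 | klaviyo.com | 94 | A |
| 7 | heroku.com | 92 | A |
| 8 | render.com | 92 | A |
| 9 | intercom.com | 91 | A |
| 10 | pinecone.io | 91 | A |

Most work to do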
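| # | Website | Score | Grade |
|---|---------|-------|-------|
| 1 | gymshark.com | 49 | F |
| 2 | substack.com | 52 | F |
| 3 | nytimes.com | 56 | F |
| 4 | ebay.com | 58 | F |
| 5 | jetbrains.com | 59 | F |
| 6 | adobe.com | 62 | D |
| 7 | affirm.com | 62 | D |
| 8 | ramp.com | 63 | D |
| 9 | fly.io | 66 | D |
| 10 | squareup.com | 66 | D |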

Get the full dataset

Every number behind this report — all 154 sites with scores, grades and live report links, as a clean CSV. Built for agencies, consultants and researchers. Emailed instantly after checkout.
Get the dataset → $99

Methodology

Each site is scored 0–100 across seven weighted categories — AI Crawler Access, Content Extractability, Structured Data, Metadata & Semantics, Answer-Ready Content, Crawl Infrastructure and Technical Hygiene. Oraql fetches only public resources (the homepage, robots.txt, llms.txt and sitemap.xml); scoring is fully deterministic with no LLM in the loop, so any result is reproducible. Population: 154 distinct domains audited in June 2026; duplicate apex/www hosts and domains that block automated fetching were excluded so the numbers reflect real, comparable sites.
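
As an illustration of how a deterministic, weighted score of this kind can be assembled, the Python sketch below sums per-category points and maps the total to a letter grade. The category maxima are the ones shown in the category breakdown. The grade cutoffs are assumptions: the leaderboards only show that 91 is an A, 62 to 66 is a D and 59 is an F, and the exact boundaries are not published here. The function names and the sample site are hypothetical.

```python
# Points available per category, as shown in the category breakdown.
CATEGORY_MAX = {
    "AI Crawler Access": 25,
    "Content Extractability": 20,
    "Structured Data": 15,
    "Metadata & Semantics": 15,
    "Answer-Ready Content": 12,
    "Crawl Infrastructure": 8,
    "Technical Hygiene": 5,
}

# Assumed cutoffs, chosen to be consistent with the leaderboards
# (91 is an A, 62 to 66 is a D, 59 is an F). Not confirmed by the report.
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def total_score(points: dict) -> float:
    """Sum category points after checking each stays within its maximum."""
    for name, value in points.items():
        if name not in CATEGORY_MAX:
            raise KeyError(f"unknown category: {name}")
        if not 0 <= value <= CATEGORY_MAX[name]:
            raise ValueError(f"{name} must be between 0 and {CATEGORY_MAX[name]}")
    return sum(points.values())


def grade(score: float) -> str:
    """Map a 0-100 score to a letter grade using the assumed cutoffs."""
    for cutoff, letter in GRADE_CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# A made-up site: strong on access and hygiene, weak on structured data.
example_site = {
    "AI Crawler Access": 25,
    "Content Extractability": 19,
    "Structured Data": 5,
    "Metadata & Semantics": 13,
    "Answer-Ready Content": 5,
    "Crawl Infrastructure": 8,
    "Technical Hygiene": 5,
}

score = total_score(example_site)
print(score, grade(score))  # 80 B
```

Because every input is a fixed rule applied to a public resource, running the same checks on the same fetched files yields the same total, which is the sense in which a score is reproducible.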