Research · updated June 2026

The State of AI Search Readiness

We ran Oraql's deterministic audit on 154 leading websites — across SaaS, e-commerce, dev tools, fintech and media — to see how ready the web is for answer engines like ChatGPT, Perplexity, Claude and Google AI. No LLM guesswork: every score is reproducible from public signals.

average score / 100

median score

55%

scored A or B

9%

scored D or F

Grade distribution

Most sites land in the B–C band (122 of 154, or 79%). A smaller tail of 14 D/F sites is effectively invisible to answer engines.

A · 18 sites · 11.7%

B · 66 sites · 42.9%

C · 56 sites · 36.4%

D · 9 sites · 5.8%

F · 5 sites · 3.2%
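The percentages above are plain shares of the 154-site population, and the headline "A or B" and "D or F" figures are sums over the same counts. Here is a minimal sketch of that arithmetic. It assumes the five bars run A through F from top to bottom, an ordering inferred from the leaderboard rather than stated beside the bars; the function names are illustrative.

```python
# Site counts per grade, in the order the bars appear in the report.
# Assumption: the bars are ordered A, B, C, D, F.
GRADE_COUNTS = {"A": 18, "B": 66, "C": 56, "D": 9, "F": 5}


def grade_shares(counts):
    """Return each grade's share of all sites as a percentage (one decimal)."""
    total = sum(counts.values())
    return {grade: round(100 * n / total, 1) for grade, n in counts.items()}


def combined_share(counts, grades):
    """Return the combined percentage share of several grades, e.g. 'AB'."""
    total = sum(counts.values())
    return round(100 * sum(counts[g] for g in grades) / total, 1)


shares = grade_shares(GRADE_COUNTS)
top_share = combined_share(GRADE_COUNTS, "AB")      # A or B
bottom_share = combined_share(GRADE_COUNTS, "DF")   # D or F

for grade, pct in shares.items():
    print(f"{grade}: {GRADE_COUNTS[grade]} sites, {pct}%")
print(f"A or B: {top_share}%  |  D or F: {bottom_share}%")
```

Recomputing from the counts is a quick way to check that the per-grade bars and the headline cards describe the same population.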

Where sites win — and where they lose

Average score per category as a share of the points available, weakest first. The pattern is stark.

Structured Data: 5.5 / 15 avg · 36.9%

Answer-Ready Content: 5.4 / 12 avg · 45.4%

Metadata & Semantics: 12.5 / 15 avg · 83.6%

Content Extractability: 19.2 / 20 avg · 95.8%

Crawl Infrastructure: 7.8 / 8 avg · 97.2%

AI Crawler Access: 24.4 / 25 avg · 97.7%

Technical Hygiene: 5.0 / 5 avg · 99.2%
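The "weakest first" ordering comes from dividing each category's average by the points it can contribute. The sketch below redoes that division from the rounded averages shown above. The recomputed shares differ from the published ones by a few tenths of a point, presumably because the report divides unrounded averages (an assumption). The function name is illustrative.

```python
# (average points earned, points available) per category, as published.
CATEGORY_AVERAGES = {
    "AI Crawler Access": (24.4, 25),
    "Content Extractability": (19.2, 20),
    "Structured Data": (5.5, 15),
    "Metadata & Semantics": (12.5, 15),
    "Answer-Ready Content": (5.4, 12),
    "Crawl Infrastructure": (7.8, 8),
    "Technical Hygiene": (5.0, 5),
}


def rank_by_share(averages):
    """Return (category, percent of available points) pairs, weakest first."""
    rows = [
        (name, round(100 * earned / available, 1))
        for name, (earned, available) in averages.items()
    ]
    return sorted(rows, key=lambda row: row[1])


ranked = rank_by_share(CATEGORY_AVERAGES)
total_available = sum(available for _, available in CATEGORY_AVERAGES.values())

for name, pct in ranked:
    print(f"{name}: {pct}% of available points")
print(f"Points available across all categories: {total_available}")
```

Normalizing by available points is what makes a 15-point category comparable with a 5-point one; ranking by raw averages would put Technical Hygiene last only because it is worth the least.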

The takeaway.

AI crawler access, crawl infrastructure, content extractability and technical hygiene all average above 95% of the points available. Structured data (37%) and answer-ready content (45%) are where sites lose the most points, and those are exactly the signals AI engines lean on to quote, cite and recommend a page. The web is crawlable; it just isn't quotable.

Five signals that decide AI readiness

97% have a meta description. The basics are well covered: only 4 of 154 sites are missing one.

49% publish an llms.txt file. Just under half (76 sites) expose the emerging standard that hands AI engines a clean summary of the site.

36% publish no structured data. 56 sites ship zero JSON-LD, so answer engines have to guess what each page is even about.

25% lack a clean H1 heading. 39 homepages don't expose one clear H1, leaving engines a weaker anchor for what the page is about.

5% block at least one AI crawler. 8 sites disallow GPTBot, ClaudeBot, PerplexityBot or Google-Extended (or block all bots) in robots.txt.
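Several of these signals can be checked with nothing but the Python standard library. The sketch below is not Oraql's implementation. It assumes a "clean H1" means exactly one `<h1>` element and that "structured data" means at least one `application/ld+json` script tag. The names `HomepageSignals`, `homepage_signals` and `blocked_ai_crawlers` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# The four AI crawler user agents named in the report.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")


class HomepageSignals(HTMLParser):
    """Collects meta description presence, JSON-LD block count and H1 count."""

    def __init__(self):
        super().__init__()
        self.has_meta_description = False
        self.json_ld_blocks = 0
        self.h1_count = 0

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "description":
            # Only count a description that actually has text in it.
            if attr.get("content", "").strip():
                self.has_meta_description = True
        elif tag == "script" and attr.get("type", "").lower() == "application/ld+json":
            self.json_ld_blocks += 1
        elif tag == "h1":
            self.h1_count += 1


def homepage_signals(html):
    """Return the three homepage signals for an HTML string."""
    parser = HomepageSignals()
    parser.feed(html)
    parser.close()
    return {
        "meta_description": parser.has_meta_description,
        "structured_data": parser.json_ld_blocks > 0,
        "single_h1": parser.h1_count == 1,  # assumed definition of a "clean" H1
    }


def blocked_ai_crawlers(robots_txt, url="https://example.com/"):
    """Return the AI crawlers that robots.txt forbids from fetching `url`."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not rules.can_fetch(bot, url)]


sample_html = (
    '<html><head><meta name="description" content="Docs for Example">'
    '<script type="application/ld+json">{"@type": "Organization"}</script>'
    "</head><body><h1>Example</h1></body></html>"
)
sample_robots = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"

print(homepage_signals(sample_html))
print(blocked_ai_crawlers(sample_robots))
```

Both functions take strings rather than URLs, so the fetching step stays separate and the checks return the same result every time for the same input.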

The leaderboard

Most AI-ready

#	Website	Score	Grade
1	jasper.ai	99	A
2	salesforce.com	97	A
3	beehiiv.com	96	A
4	apollo.io	95	A
5	auth0.com	95	A
6	klaviyo.com	94	A
7	heroku.com	92	A
8	render.com	92	A
9	intercom.com	91	A
10	pinecone.io	91	A

Most work to do

#	Website	Score	Grade
1	gymshark.com	49	F
2	substack.com	52	F
3	nytimes.com	56	F
4	ebay.com	58	F
5	jetbrains.com	59	F
6	adobe.com	62	D
7	affirm.com	62	D
8	ramp.com	63	D
9	fly.io	66	D
10	squareup.com	66	D

See all 154 sites in the full directory →

Get the full dataset

Every number behind this report — all 154 sites with scores, grades and live report links, as a clean CSV. Built for agencies, consultants and researchers. Emailed instantly after checkout.

Get the dataset → $99

Methodology

Each site is scored 0–100 across seven weighted categories — AI Crawler Access, Content Extractability, Structured Data, Metadata & Semantics, Answer-Ready Content, Crawl Infrastructure and Technical Hygiene. Oraql fetches only public resources (the homepage, robots.txt, llms.txt and sitemap.xml); scoring is fully deterministic with no LLM in the loop, so any result is reproducible. Population: 154 distinct domains audited in June 2026; duplicate apex/www hosts and domains that block automated fetching were excluded so the numbers reflect real, comparable sites.
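To make the scoring arithmetic concrete, here is a minimal sketch of how seven category results could roll up into a 0–100 score and a letter grade. The category maximums are taken from this report. The grade cutoffs (90/80/70/60) are an assumption: they fit the leaderboard (59 is an F, 62 a D, 91 an A) but are not published here. The function names are hypothetical and this is not Oraql's code.

```python
# Points available per category, as listed in the report (they sum to 100).
MAX_POINTS = {
    "AI Crawler Access": 25,
    "Content Extractability": 20,
    "Structured Data": 15,
    "Metadata & Semantics": 15,
    "Answer-Ready Content": 12,
    "Crawl Infrastructure": 8,
    "Technical Hygiene": 5,
}

# Assumed cutoffs, consistent with the leaderboard but not confirmed by it.
GRADE_CUTOFFS = ((90, "A"), (80, "B"), (70, "C"), (60, "D"))


def total_score(points):
    """Sum per-category points into a 0-100 score, validating each value."""
    unknown = set(points) - set(MAX_POINTS)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    total = 0.0
    for category, available in MAX_POINTS.items():
        earned = points.get(category, 0)  # a missing category earns nothing
        if not 0 <= earned <= available:
            raise ValueError(f"{category}: {earned} is outside 0..{available}")
        total += earned
    return round(total)


def letter_grade(score):
    """Map a 0-100 score to a letter using the assumed cutoffs."""
    for cutoff, letter in GRADE_CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical site: strong on access and hygiene, weak on structured data.
example_points = {
    "AI Crawler Access": 25,
    "Content Extractability": 19,
    "Structured Data": 3,
    "Metadata & Semantics": 12,
    "Answer-Ready Content": 4,
    "Crawl Infrastructure": 8,
    "Technical Hygiene": 5,
}
example_score = total_score(example_points)
print(example_score, letter_grade(example_score))
```

Because the score is a plain sum of bounded category points with no model in the loop, the same inputs always produce the same score, which is the reproducibility property described above.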
