We ran Oraql's deterministic audit on 154 leading websites — across SaaS, e-commerce, dev tools, fintech and media — to see how ready the web is for answer engines like ChatGPT, Perplexity, Claude and Google AI. No LLM guesswork: every score is reproducible from public signals.
Average score: 80 / 100
Median score: 81
Scored A or B: 55%
Scored D or F: 9%
Grade distribution
Most sites land in the B–C band. A long tail of D/F sites is effectively invisible to answer engines.
A: 18 sites · 11.7%
B: 66 sites · 42.9%
C: 56 sites · 36.4%
D: 9 sites · 5.8%
F: 5 sites · 3.2%
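The headline percentages follow directly from the grade counts above. A minimal sketch of that arithmetic, assuming whole-percent rounding for the headline figures and one decimal place for the per-grade shares (variable names are illustrative, not part of Oraql):

```python
# Grade counts as reported in the distribution above.
grade_counts = {"A": 18, "B": 66, "C": 56, "D": 9, "F": 5}

total = sum(grade_counts.values())  # all audited sites

# Share of all sites per grade, rounded to one decimal place.
shares = {grade: round(100 * n / total, 1) for grade, n in grade_counts.items()}

# Headline figures: combined A/B and D/F shares, rounded to whole percent.
a_or_b = round(100 * (grade_counts["A"] + grade_counts["B"]) / total)
d_or_f = round(100 * (grade_counts["D"] + grade_counts["F"]) / total)

print(total)    # 154
print(shares)   # {'A': 11.7, 'B': 42.9, 'C': 36.4, 'D': 5.8, 'F': 3.2}
print(a_or_b)   # 55
print(d_or_f)   # 9
```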
Where sites win — and where they lose
Average score per category as a share of the points available, weakest first. The pattern is stark.
Structured Data: 5.5 / 15 avg · 36.9%
Answer-Ready Content: 5.4 / 12 avg · 45.4%
Metadata & Semantics: 12.5 / 15 avg · 83.6%
Content Extractability: 19.2 / 20 avg · 95.8%
Crawl Infrastructure: 7.8 / 8 avg · 97.2%
AI Crawler Access: 24.4 / 25 avg · 97.7%
Technical Hygiene: 5.0 / 5 avg · 99.2%
The takeaway.
Crawler access, extractability and technical hygiene all clear 95%. But structured data (37%) and answer-ready content (45%) are where nearly every site bleeds points — exactly the signals AI engines lean on to quote, cite and recommend a page. The web is crawlable; it just isn't quotable.
Five signals that decide AI readiness
97% have a meta description. The basics are well covered: only 4 of 154 sites are missing one.
49% publish an llms.txt file. Just under half (76 sites) expose the emerging standard that hands AI engines a clean summary of the site.
36% publish no structured data. 56 sites ship zero JSON-LD, so answer engines have to guess what each page is even about.
25% lack a clean H1 heading. 39 homepages don't expose one clear H1, leaving engines a weaker anchor for what the page is about.
5% block at least one AI crawler. 8 sites disallow GPTBot, ClaudeBot, PerplexityBot or Google-Extended (or block all bots) in robots.txt.
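Several of these signals can be checked with nothing but the Python standard library. The sketch below is illustrative only and is not Oraql's implementation: the function and class names are hypothetical, "clean H1" is assumed to mean exactly one `<h1>` element, and a JSON-LD block is counted only if its contents parse as JSON. It works on already-fetched text, so it makes no network requests.

```python
import json
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the report.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_ai_crawlers(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI crawlers that this robots.txt disallows from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]


class HomepageSignals(HTMLParser):
    """Collect meta description, H1 count and JSON-LD blocks from homepage HTML."""

    def __init__(self):
        super().__init__()
        self.has_meta_description = False
        self.h1_count = 0
        self.json_ld_blocks = 0
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            has_content = bool((attrs.get("content") or "").strip())
            self.has_meta_description = self.has_meta_description or has_content
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                json.loads("".join(self._buffer))
                self.json_ld_blocks += 1  # count only blocks that are valid JSON
            except json.JSONDecodeError:
                pass


def homepage_signals(html: str) -> dict:
    """Summarise the three homepage signals as a plain dict."""
    parser = HomepageSignals()
    parser.feed(html)
    parser.close()
    return {
        "meta_description": parser.has_meta_description,
        "json_ld_blocks": parser.json_ld_blocks,
        "single_h1": parser.h1_count == 1,  # assumption: "clean H1" = exactly one
    }


robots = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
print(blocked_ai_crawlers(robots))  # ['GPTBot']
```

Parsing text that has already been fetched keeps the checks deterministic: the same input always produces the same result.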
Every number behind this report — all 154 sites with scores, grades and live report links, as a clean CSV. Built for agencies, consultants and researchers. Emailed instantly after checkout.
Each site is scored 0–100 across seven weighted categories — AI Crawler Access, Content Extractability, Structured Data, Metadata & Semantics, Answer-Ready Content, Crawl Infrastructure and Technical Hygiene. Oraql fetches only public resources (the homepage, robots.txt, llms.txt and sitemap.xml); scoring is fully deterministic with no LLM in the loop, so any result is reproducible. Population: 154 distinct domains audited in June 2026; duplicate apex/www hosts and domains that block automated fetching were excluded so the numbers reflect real, comparable sites.
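To make the weighting concrete, here is a minimal sketch of how a 0–100 total could be assembled from the seven categories. The per-category caps are the "points available" figures from the category breakdown in this report. Treating the total as a plain sum of category points is an assumption, though it is consistent with the published numbers: the caps add up to 100 and the category averages add up to roughly the reported average of 80. The function names are hypothetical.

```python
# Maximum points per category, taken from the category breakdown in this report.
MAX_POINTS = {
    "AI Crawler Access": 25,
    "Content Extractability": 20,
    "Structured Data": 15,
    "Metadata & Semantics": 15,
    "Answer-Ready Content": 12,
    "Crawl Infrastructure": 8,
    "Technical Hygiene": 5,
}


def total_score(points: dict[str, float]) -> float:
    """Sum per-category points (assumed additive), checking each against its cap."""
    for category, earned in points.items():
        cap = MAX_POINTS[category]
        if not 0 <= earned <= cap:
            raise ValueError(f"{category}: {earned} is outside 0..{cap}")
    return sum(points.values())


def category_share(points: dict[str, float]) -> dict[str, float]:
    """Points earned per category as a percentage of the points available."""
    return {c: round(100 * p / MAX_POINTS[c], 1) for c, p in points.items()}


# Average points per category as published (rounded to one decimal place).
AVERAGES = {
    "AI Crawler Access": 24.4,
    "Content Extractability": 19.2,
    "Structured Data": 5.5,
    "Metadata & Semantics": 12.5,
    "Answer-Ready Content": 5.4,
    "Crawl Infrastructure": 7.8,
    "Technical Hygiene": 5.0,
}

print(sum(MAX_POINTS.values()))       # 100
print(round(total_score(AVERAGES)))   # 80
```

Shares recomputed from these rounded averages with `category_share` will differ slightly from the published percentages, which presumably come from unrounded averages.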