Saved February 14, 2026
Do you care about this?
The article explores how SEOs use Common Crawl's Web Graph data, particularly Harmonic Centrality and PageRank, to gauge and improve site visibility in AI training datasets. It covers new tools for analyzing domain authority and highlights the correlation between traditional search rankings and AI citations.
If you do, here's more
Metehan Yesilyurt, an SEO consultant, asked what role training data plays in determining why certain domains are favored in search results. The question led him to Common Crawl's monthly Web Graph data, which includes two key metrics: Harmonic Centrality (HC) and PageRank. HC measures how close a domain is to every other domain in the link graph, so a high score means the domain can be reached in few hops from many others. PageRank measures a domain's authority through the quality and quantity of links pointing to it. Common Crawl's crawler already uses these metrics, and SEOs are now applying them to understand AI visibility.
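To make the two metrics concrete, here is a minimal Python sketch that computes both on a toy link graph. It is an illustration only, not Common Crawl's implementation: the four-domain graph in `EDGES`, the function names, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for the example. Harmonic centrality is taken in its usual form, the sum of 1/d(u, v) over every other node u that can reach v, and PageRank is computed by plain power iteration.

```python
from collections import deque

# Hypothetical toy graph: each pair is (linking domain, linked domain).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("d", "c")]


def harmonic_centrality(edges):
    """Sum of 1/distance from every other node that can reach the target."""
    nodes = sorted({n for edge in edges for n in edge})
    incoming = {n: [] for n in nodes}
    for src, dst in edges:
        incoming[dst].append(src)

    scores = {}
    for target in nodes:
        # Breadth-first search backwards along links gives the shortest
        # distance from each node to the target.
        dist = {target: 0}
        queue = deque([target])
        while queue:
            cur = queue.popleft()
            for prev in incoming[cur]:
                if prev not in dist:
                    dist[prev] = dist[cur] + 1
                    queue.append(prev)
        # Unreachable nodes never enter dist, so they contribute nothing.
        scores[target] = sum(1.0 / d for n, d in dist.items() if n != target)
    return scores


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank; dangling nodes spread their rank evenly."""
    nodes = sorted({n for edge in edges for n in edge})
    outgoing = {n: [] for n in nodes}
    for src, dst in edges:
        outgoing[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        dangling = sum(rank[n] for n in nodes if not outgoing[n])
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in outgoing[src]:
                new_rank[dst] += damping * rank[src] / len(outgoing[src])
        rank = new_rank
    return rank


if __name__ == "__main__":
    print(harmonic_centrality(EDGES))
    print(pagerank(EDGES))
```

In this toy graph, domain `c` is linked directly by all three other domains, so it comes out on top under both metrics, while `d`, which nothing links to, sits at the bottom.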
Yesilyurt created a free tool, the CC Rank Checker, which lets users check HC and PageRank for about 18 million domains and view historical rank data. The tool helps SEOs track changes in authority and compare multiple domains. The full dataset includes around 607 million domain records, with each monthly release covering millions of domains. The Common Crawl Index Server lets users confirm whether their sites have been crawled, and the article ties that presence directly to how often they appear in AI training datasets.
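As a rough sketch of how such a crawl check might look, the Python below builds a query URL for the Common Crawl Index Server and summarizes a response in its newline-delimited JSON format. Several things here are assumptions rather than details from the article: the crawl ID `CC-MAIN-2024-10` is just one example release, the helper names are hypothetical, and `SAMPLE_RESPONSE` is made-up data shaped like an index response so the example runs without network access.

```python
import json
from urllib.parse import urlencode

CDX_BASE = "https://index.commoncrawl.org"


def build_index_query(crawl_id, domain):
    """Return an index URL asking for every capture under the domain."""
    params = {"url": f"{domain}/*", "output": "json"}
    return f"{CDX_BASE}/{crawl_id}-index?{urlencode(params)}"


def summarize_captures(ndjson_text):
    """Count captures in a newline-delimited JSON index response."""
    records = [
        json.loads(line) for line in ndjson_text.splitlines() if line.strip()
    ]
    # The index reports HTTP status as a string, e.g. "200".
    ok = [r for r in records if r.get("status") == "200"]
    return {
        "captures": len(records),
        "ok": len(ok),
        "urls": sorted({r["url"] for r in records}),
    }


# Made-up records for illustration; real responses carry more fields.
SAMPLE_RESPONSE = "\n".join([
    '{"url": "https://example.com/", "status": "200", "timestamp": "20240301000000"}',
    '{"url": "https://example.com/old", "status": "404", "timestamp": "20240301000500"}',
])

if __name__ == "__main__":
    print(build_index_query("CC-MAIN-2024-10", "example.com"))
    print(summarize_captures(SAMPLE_RESPONSE))
```

To run this against live data, fetch the built URL with any HTTP client and pass the response body to `summarize_captures`. An empty result for a given crawl suggests the domain was not captured in that release.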
Research indicates a strong connection between traditional Google rankings and the likelihood of being cited by AI. Sites in the top Google positions have a 46-48% chance of being cited, and that probability falls as rankings drop. Comparative listicles account for over 32% of AI citations, which points to a preference for content that presents information in a structured format. Taken together, the data suggests that the interplay between a site's authority, its presence in training datasets, and its citation frequency in AI tools is becoming increasingly important for SEOs.
As the focus shifts from traditional SEO to AI visibility, understanding metrics like Harmonic Centrality could become standard practice. The article suggests that major SEO tools will likely incorporate these metrics soon, making them essential for optimizing web presence in an AI-driven environment.