Quit Emailing Yourself

# copyright → journalism

1 link tagged with all of: copyright + journalism

Links

The Company Quietly Funneling Paywalled Articles to AI Developers

Common Crawl has been scraping the internet for over a decade, building a vast archive of webpages that AI companies use to train language models. Although it claims to collect only freely available content, the organization has allegedly included paywalled articles and misled publishers about removal requests. The practice raises significant concerns about copyright and the ethics of using journalistic work without compensation.

Saved by tldr-importer · Last saved February 14, 2026 · 6 min read