Reddit is about to make the internet a bit less useful by blocking the Wayback Machine, part of the Internet Archive, from indexing most of its content. This move aims to stop AI companies from scraping Reddit data through the archive, which has long been a helpful tool for preserving website history and tracking changes over time.
The Wayback Machine takes snapshots of websites at different points in time, allowing users to see pages that no longer exist or have changed significantly. For example, it holds archives of the old BioWare forums from before their 2016 closure, and it helps track how Steam pages change over time. It even answers weird questions like whether the CIA ever ran a Star Wars fan site (yes, it did).
Reddit will now block the Wayback Machine from crawling anything beyond its homepage, which means individual subreddits and posts won’t be archived anymore. Reddit’s spokesperson, Tim Rathschmidt, said this is because AI companies have been scraping data from the Wayback Machine in ways that violate Reddit’s platform policies.
These restrictions on the Wayback Machine’s Reddit coverage started ramping up recently, with Reddit informing the Internet Archive ahead of time. But here’s the thing: Reddit isn’t doing this to protect users from AI abuse. It signed a deal with Google in 2024 that licenses its content for AI training, and followed it up with a partnership with OpenAI. So, it’s less about principle and more about money.
The Internet Archive is a non-profit that provides a genuinely helpful service by preserving internet history accurately, unlike AI chatbots that sometimes spit out nonsense or offensive content. Cutting the Wayback Machine off from Reddit means losing access to a massive trove of information on countless topics, which kind of sucks for anyone who values digital preservation.
What’s more helpful here: a money-driven content gate, or a free archive that helps keep internet history alive? Reddit’s choice makes you wonder about its priorities in the AI age.