Reddit posts could become the next fuel for the AI innovation machine: the “front page of the internet” has reportedly negotiated a content licensing arrangement that allows its data to be used to train AI models.
Under the new licensing agreement, Reddit will give “an unnamed large AI company” access to the user-generated content on its platform. The agreement is worth about $60 million on an annualized basis, though the terms could change because the company is still finalizing its plans to go public.
Reddit has reportedly signed over its content to train AI models
A Reddit-pilled AI?!
— Teagan Clayton (@TeaganClay50245) February 17, 2024
Reddit and Search Engines
This deal follows an October report that Reddit had threatened to cut off Google’s and Bing’s search crawlers if it could not strike training data deals with AI companies. According to the report, the company believed it could survive without search traffic.
Whether or not that is true (adding “Reddit” to a search query is a common way to avoid SEO spam), Reddit has shown a willingness to play hardball in the past. When changes to its third-party API access fees led the most popular third-party Reddit apps to shut down last year, it successfully stonewalled its way through the biggest protest in the site’s history.
Reddit’s upcoming API changes will make AI companies pony up – https://t.co/BRmgzwj4H2
Illustration: Alex Castro / The Verge
Reddit announced new API changes today that will eventually pinch its content pipeline from being used to train artificial i… pic.twitter.com/aL9XvdoTXX
— Techosmo (@techosmo) April 19, 2023
Thousands of Reddit communities went dark in protest last year when Reddit said it would start charging for access to its APIs. The site suffered an outage shortly after, and a few days later, hackers threatened to expose previously stolen site data unless Reddit CEO Steve Huffman paid them $4.5 million or revoked the API plan. Reddit later said it was deleting chat data from before Jan. 1, 2023, as part of a move to a new chat infrastructure, a change that erased years’ worth of private chat logs and messages from users’ accounts.
The Dilemma of AI Companies and Data
Until recently, most AI companies trained their models on data scraped from the web without explicit permission. However, that practice has proven to be legally dubious, which has prompted leading companies to try to put their data sourcing on a more stable footing.
If only there were a stack of studies on whether protecting copyright does stifle innovation… https://t.co/1pQJt66vKg
— Johannes Klingebiel @[email protected] (@Klingebeil) February 18, 2024
The company Reddit struck the deal with has not been named. Still, the reported $60 million is a significant increase over the $5 million yearly payment OpenAI has allegedly been making to news publishers in exchange for their data.
Apple has also been pursuing multi-year agreements with significant news organizations that may be valued at “at least $50 million.”
Reddit’s Revenue Surge and Other Changes
By the end of 2023, Reddit’s year-over-year revenue was up by 20 percent, but it was still $200 million short of the $1 billion target it had set two years earlier. The company has also been advised to seek a $5 billion valuation when it opens up for public investment, which is expected to happen in March. That is at most half the $10 billion to $15 billion valuation it might have achieved when it first filed to go public in 2021, before a market downturn prevented it from doing so.
Reddit seeks to launch its IPO in March
Here's the earliest known Reddit pitch deck (it's incredible): pic.twitter.com/tYCjsug5um
— PitchDeckGuy (@BetterPitchGuy) January 20, 2024
Reddit also revealed other changes, including new automatic moderation features and a new “official” badge designed to distinguish real accounts from impersonators.
Reddit’s decision in September to remove the option to turn off ad personalization turned even more users against the direction the platform is taking.
With the ongoing debate on the ethics of using public data, art, and other human-created content to train AI, this new AI deal could generate even more ire from users.