The content collapse: How AI broke the internet’s oldest business model

Chatbots are killing clicks, and publishers are scrambling to survive on licensing deals with the very platforms that displaced them.

For decades, the business model of most content sites was simple: they provided free content, including to social networks and search engines, in exchange for traffic. Visitors would then view or click on ads hosted on those sites. The model often proved flawed: platforms like Google and Facebook did their best to keep users on their own services, or to monetize them before they ever reached the publishers, leaving content creators to survive on the leftovers. Still, it was enough to keep the system limping along.
Now, even that fragile balance is collapsing. The AI revolution, led by the rapid spread of chatbots like ChatGPT and Gemini, is poised to siphon off a significant portion of web traffic. These tools summarize answers to user queries directly, bypassing the need to click through to original content sources. As a result, the traditional content-for-traffic exchange that once underpinned digital media is fast disappearing.
AI Tools: Gemini, Copilot, ChatGPT
(Photo: Koshiro K/Shutterstock)
A new model is emerging in its place: charging AI companies for access to the content they need to train and refine their models.
AI companies have a problem: they need vast amounts of content (text, images, and videos) to train and update their models. Most existing digital content has already been scraped and processed. The only consistent supply of new material comes from dynamic sources like social media and news websites.
As chatbots increasingly replace search engines as the default way people get information, AI companies also require access to accurate, up-to-date news and current events. In short, they need content, and content producers are starting to realize they can charge for it.
Google and OpenAI have already struck deals with Reddit worth $60 million and $70 million per year, respectively, to access user-generated content. Leading news organizations such as AP, Reuters, AFP, the Financial Times, News Corp, The Washington Post, and The New York Times have signed licensing agreements with Google, OpenAI, Amazon, Meta, and Mistral. These deals typically allow AI companies to use the content to train models and provide summaries to users, with attribution and linking.
Some of these deals are worth significant sums. OpenAI is reportedly paying News Corp. $250 million over five years for access to content from publications such as The Wall Street Journal, The New York Post, The Times, The Sun, and The Daily Telegraph. The German publisher Axel Springer is receiving more than $10 million a year for access to Business Insider and Politico.
Companies that control both content platforms and AI models have a built-in advantage. Meta uses public posts from Facebook and Instagram to train its Llama models. Elon Musk folded X (formerly Twitter) into his AI company xAI to ensure smoother access to the platform’s content.
The value of these content sources is reflected not just in the price tags, but also in the legal battles being waged to protect them. Last week, Musk updated X’s terms of service to explicitly prohibit the use of its content for training AI models. The company is also suing Israeli firm Bright Data for mass scraping.
Reddit, meanwhile, filed a lawsuit against Anthropic, accusing the AI company of using user-generated data without consent or payment and of failing to respect user privacy, such as by retaining deleted posts. “Anthropic believes it is entitled to take whatever content it wants and use that content however it desires, with impunity,” the lawsuit states.
This lawsuit joins a growing list of legal challenges by media companies against AI firms. The New York Times is leading a major lawsuit against OpenAI and Microsoft, claiming billions of dollars in damages for unauthorized content usage. Similar lawsuits have been filed by Canadian publishers, India’s ANI, and groups of authors against both OpenAI and Google.
Long-Term Settlements on the Horizon?
The outcomes of these lawsuits could determine the future of digital media. The central legal question is whether using published content to train AI models falls under "fair use," a designation that would allow AI companies to use the material without compensating rights holders or obtaining their permission.
If the courts accept this argument, content producers will have no legal recourse to prevent their material from being used for free, essentially accelerating their own decline. Ironically, this could backfire on AI companies: if original content creators go out of business, or turn to generative AI to create low-quality filler, the training data pool will degrade. Training models on AI-generated content leads to a feedback loop that diminishes performance and reliability.
A legal victory for content creators, on the other hand, would mark a moral win, but perhaps not a financial one. Legal proceedings may take years, by which time ad revenues may be gone and traffic may have evaporated.
The more realistic and sustainable solution may lie in negotiated licensing deals. These would include symbolic compensation for copyright holders and long-term agreements that govern content use.
Such a framework could extend beyond major publishers to include niche and non-English content creators. In this model, websites would join a collective network and receive proportional payments based on the use of their content, ensuring their survival even as direct traffic dwindles.
But this raises new questions: What kind of content will be created primarily for AI models? Will it cater to human readers, machines, or both? And will there still be value in writing for people in a world where AI consumes most of what we produce?