Cloudflare Sidesteps Copyright Issues, Blocking AI Scrapers By Default

Source: Forbes

Web infrastructure and security company Cloudflare is striking back on behalf of content creators, blocking AI scrapers by default.

Web scrapers are bots that crawl the internet, collecting and cataloguing content of all types, and are used by AI firms to collect material that can be used to train their models.

Now, though, Cloudflare is allowing website owners to choose whether they want AI crawlers to access their content, and to decide how the AI companies can use it. They can opt to allow crawlers for certain purposes -- search, for example -- but block others. AI companies will have to obtain explicit permission from a website before scraping.

"Original content is what makes the internet one of the greatest inventions in the last century, and it's essential that creators continue making it," said Matthew Prince, co-founder and CEO of Cloudflare.
"AI crawlers have been scraping content without limits. Our goal is to put the power back in the hands of creators, while still helping AI companies innovate. This is about safeguarding the future of a free and vibrant internet with a new model that works for everyone."

The company is also introducing "pay per crawl" -- the ability to charge companies for access. Currently in beta, the feature means that every time an AI crawler requests content, it must either present payment intent via request headers to gain access, or receive a "402 Payment Required" response with pricing.

Users will be able to bypass charges for specific crawlers as needed -- useful if they want to allow a certain crawler through for free, or to take part in a content partnership.
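
To make the pay-per-crawl flow concrete, here is a minimal sketch in Python of the decision a site might make for each crawler request: let an exempt crawler through for free, accept a request that declares enough payment intent, or answer with 402 and a price. It is an illustration under stated assumptions, not Cloudflare's implementation. The class name, the crawler identifiers and all three header names (`x-crawl-max-price`, `x-crawl-price`, `x-crawl-charged`) are hypothetical placeholders, and the per-request price is invented for the example.

```python
from dataclasses import dataclass, field

# Hypothetical header names, used for illustration only.
MAX_PRICE_HEADER = "x-crawl-max-price"  # crawler states the most it will pay
PRICE_HEADER = "x-crawl-price"          # site states its price on a 402
CHARGED_HEADER = "x-crawl-charged"      # site confirms the amount charged


@dataclass
class PayPerCrawlPolicy:
    """Toy model of a site's pay-per-crawl settings (assumed structure)."""
    price_usd: float
    free_crawlers: set = field(default_factory=set)

    def respond(self, crawler_id, request_headers):
        """Return (status_code, response_headers) for one crawler request."""
        # Crawlers on the bypass list get the content without any charge.
        if crawler_id in self.free_crawlers:
            return 200, {}

        # Header names are case-insensitive in HTTP, so normalize the keys.
        headers = {k.lower(): v for k, v in request_headers.items()}
        offered = headers.get(MAX_PRICE_HEADER)
        if offered is not None:
            try:
                max_price = float(offered)
            except ValueError:
                max_price = None  # malformed offer is treated as no offer
            if max_price is not None and max_price >= self.price_usd:
                # Payment intent covers the price: serve and report the charge.
                return 200, {CHARGED_HEADER: f"{self.price_usd:.2f}"}

        # No usable payment intent: 402 Payment Required, with pricing.
        return 402, {PRICE_HEADER: f"{self.price_usd:.2f}"}


policy = PayPerCrawlPolicy(price_usd=0.01, free_crawlers={"partner-bot"})
print(policy.respond("partner-bot", {}))
print(policy.respond("some-bot", {}))
print(policy.respond("some-bot", {"X-Crawl-Max-Price": "0.05"}))
```

In this sketch the bypass list is checked first, so a content partner is never asked to pay, while any other crawler that sends no offer, a malformed offer, or an offer below the price gets the 402 response carrying the price.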

Crucially, the new measures don't depend on copyright law -- currently a legal minefield when it comes to the use of data for AI -- but on standard contract law.

The move has been welcomed by content owners and creators, with dozens of media organizations and others saying they plan to sign up.

"Cloudflare's innovative approach to block AI crawlers is a game-changer for publishers and sets a new standard for how content is respected online. When AI companies can no longer take anything they want for free, it opens the door to sustainable innovation built on permission and partnership," said Roger Lynch, CEO of Condé Nast.
"This is a critical step toward creating a fair value exchange on the Internet that protects creators, supports quality journalism and holds AI companies accountable."

However, with Cloudflare used by millions of organizations around the world, the move looks like bad news for AI companies.

"This long-awaited feature by Cloudflare is a true disaster for many GenAI vendors, which may be fatal to the current business models of GenAI," said Ilia Kolochenko, CEO at ImmuniWeb and an adjunct professor of cybersecurity at Capital Technology University in Maryland.
"Most GenAI vendors will soon face a tough reality: paying a fair price for high-quality training data while staying profitable. In view of the formidable competition emanating from China, many Western GenAI companies may simply quit the business as economically unviable."