Website publishers tired of seeing their content repurposed by search engines for products like Google’s AI Overviews now have a clearer way to state how AI systems may use it.
Cloudflare has launched a new tool designed to help publishers and website owners declare which of their content may be used and how it may be scraped and reused.
The Content Signals Policy, announced globally on 25 September, is available for free to all users of Cloudflare’s robots.txt management service, which currently spans over 3.8 million domains. The update allows site owners to communicate more specific preferences to bots, especially AI crawlers, about whether their content can be used for search, AI inputs, or training large language models.
Unlike traditional robots.txt files, which control whether a crawler can access a webpage, the new policy defines permitted uses of content after it’s accessed. Cloudflare says this distinction is increasingly important as AI tools evolve from merely indexing content to answering questions using scraped data, often without driving traffic back to the original source.
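To illustrate the access-only side of that distinction, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rules, the `ExampleBot` user agent, the `example.com` URLs and the `can_access` helper are all hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def can_access(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/articles/1", "/private/report"):
        url = "https://example.com" + path
        print(path, can_access(ROBOTS_TXT, "ExampleBot", url))
```

The answer is a yes or no about fetching a URL. Nothing in these rules says what a crawler may do with a page once it has been fetched, which is the gap the new policy is meant to address.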
“The Internet cannot wait for a solution, while in the meantime, creators’ original content is used for profit by other companies,” said Matthew Prince, co-founder and CEO of Cloudflare. “To ensure the web remains open and thriving, we’re giving website owners a better way to express how companies are allowed to use their content.”
Push for industry-wide clarity
Cloudflare’s Content Signals Policy sets out a simple, machine-readable system in which “yes” means a use is permitted, “no” means it is not allowed, and the absence of a signal means no preference has been expressed. It also sorts content use into distinct categories, such as AI training or search indexing, and notes the potential legal implications of ignoring publisher directives.
While the policy itself is not a technical barrier to scraping, Cloudflare hopes that its adoption will encourage AI developers to respect stated preferences, especially as regulators and courts begin to examine data use practices.
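As a rough sketch of how a cooperating crawler might read such signals, the Python below parses a `Content-Signal` line from a robots.txt file. The line syntax and the signal names (`search`, `ai-input`, `ai-train`) are assumptions based on Cloudflare's published policy rather than anything shown in this article. The function names are hypothetical, and the sketch ignores per-crawler `User-Agent` grouping for brevity.

```python
from typing import Dict, Optional

# Assumed syntax for illustration; not quoted from this article.
ROBOTS_TXT = """\
User-Agent: *
Content-Signal: search=yes, ai-train=no
Allow: /
"""


def parse_content_signals(robots_txt: str) -> Dict[str, bool]:
    """Collect yes/no signals from every Content-Signal line."""
    signals: Dict[str, bool] = {}
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != "content-signal":
            continue
        for item in value.split(","):
            name, eq, setting = item.partition("=")
            setting = setting.strip().lower()
            if eq and setting in ("yes", "no"):
                signals[name.strip().lower()] = setting == "yes"
    return signals


def is_use_permitted(signals: Dict[str, bool], use: str) -> Optional[bool]:
    """True for yes, False for no, None when no preference is expressed."""
    return signals.get(use)


if __name__ == "__main__":
    signals = parse_content_signals(ROBOTS_TXT)
    for use in ("search", "ai-input", "ai-train"):
        print(use, is_use_permitted(signals, use))
```

Returning `None` for a missing signal keeps "no preference expressed" distinct from an explicit "no". As the policy is not a technical barrier, a check like this only has effect if the crawler chooses to run it and honour the result.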
Industry backs creator-first approach
The launch has received support from several major platforms and publishing alliances.
• Danielle Coffey, President and CEO, News/Media Alliance: “Cloudflare is offering a powerful new tool for publishers to dictate how and where their content is used. We hope this encourages tech companies to respect content creators’ preferences.”
• Ricky Arai-Lopez, Head of Product, Quora: “We applaud Cloudflare’s leadership in building controls to help publishers manage access to their content.”
• Chris Slowe, CTO, Reddit: “We support initiatives that advocate for clear signals protecting against the abuse and misuse of content.”
• Eckart Walther, co-founder, RSL Collective: “This is an essential step forward in allowing publishers to assert their rights and define licensing and compensation terms.”
• Prashanth Chandrasekar, CEO, Stack Overflow: “We applaud Cloudflare for empowering and protecting content creators in this new AI era.”
Cloudflare has also published open-source tools to help developers implement custom policies and is encouraging wider adoption of standards like RSL (Really Simple Licensing) to create an ecosystem of transparent, rights-respecting AI content use.