How we configured Cloudflare so scrapers get blocked, while ChatGPT, Claude and Perplexity have free access to the site.
How we configured Cloudflare rules that stop scrapers and suspicious tools, while guaranteeing free access for ChatGPT, Claude, Perplexity and Google — on our own site and for client websites.
ClaudeBot → HTTP/2 200 ✓
ChatGPT-User → HTTP/2 200 ✓
PerplexityBot → HTTP/2 200 ✓
Empty UA → HTTP/2 403 ✓
Python script → Managed Challenge ✓
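Results like the ones above can be reproduced with curl by setting the User-Agent header yourself. A minimal sketch (the domain and the exact UA strings below are placeholders, not the precise commands we ran):

```shell
#!/usr/bin/env bash
# Print the HTTP status code a site returns for a given User-Agent.
# Usage: check_ua "<user-agent string>" "<url>"
check_ua() {
  # -s: quiet, -o /dev/null: discard body, -w: print only the status code
  # -A "": curl sends no User-Agent header at all, simulating an "empty UA" client
  curl -s -o /dev/null -w "%{http_code}" -A "$1" "$2"
}

# Example checks (replace example.com with your own domain):
# check_ua "ClaudeBot"            "https://example.com/"  # expect 200
# check_ua "PerplexityBot"        "https://example.com/"  # expect 200
# check_ua ""                     "https://example.com/"  # expect 403
# check_ua "python-requests/2.31" "https://example.com/"  # expect 403 or a challenge page
```

A Managed Challenge typically shows up here as a 403 with a Cloudflare challenge page in the body, so check the response body as well if you need to tell the two apart.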
We do this as a standalone service or as part of a site audit and ongoing maintenance.
Most people think of Cloudflare as something that "sits in front of the site and protects it." Technically true — but if you haven't told Cloudflare exactly what to do, it does very little about bots.
When we run a site analysis or technical audit, we almost always find the same thing: legitimate AI crawlers are blocked, while scrapers pass through freely. Not because someone configured it that way intentionally — but because nobody finished the configuration.
At first: "We have Cloudflare, the site is protected."
After checking: AI crawlers get 403. Scrapers get 200.
Additionally: there's an llms.txt file and robots.txt looks fine, but Cloudflare blocks crawlers before they ever reach those files.
At this point the problem isn't the site. The problem is the layer in front of it.
The configuration involves several steps — and the order matters. First, we disable the built-in Cloudflare features that block AI crawlers globally. Then we add rules in the correct sequence: blocking empty requests, challenging suspicious tools, and explicitly allowing legitimate AI bots.
The rules themselves aren't complicated, but they require knowing exactly what each one does, the order they must run in, and how to verify they're working. Get the order or a condition wrong and you achieve the opposite of what you intended.
Requests with no User-Agent header are blocked immediately. No legitimate browser or bot ever sends an empty UA.
Bulk-download tools receive a Managed Challenge — unless Cloudflare already recognizes them as legitimate bots.
GPTBot, ClaudeBot, PerplexityBot, Google-Extended — get a Skip, bypassing WAF and Bot Fight Mode entirely.
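In Cloudflare's rules language, the three rules above look roughly like this. Treat it as a sketch: the exact field names and dashboard layout vary by plan and over time, and the tool list in the second rule is illustrative, not exhaustive.

```
# Rule 1 – Block: requests with no User-Agent header
(http.user_agent eq "")

# Rule 2 – Managed Challenge: bulk-download tools,
# unless Cloudflare already verifies the client as a known good bot
(http.user_agent contains "curl" or
 http.user_agent contains "wget" or
 http.user_agent contains "python-requests")
and not cf.client.bot

# Rule 3 – Skip (WAF, Bot Fight Mode): legitimate AI crawlers
(http.user_agent contains "GPTBot" or
 http.user_agent contains "ClaudeBot" or
 http.user_agent contains "PerplexityBot" or
 http.user_agent contains "Google-Extended")
```

The Skip rule is what guarantees the AI crawlers aren't caught by any other protection layer; the `cf.client.bot` exception in rule 2 prevents verified bots from being challenged by accident.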
More and more people search for services directly in ChatGPT, Claude or Perplexity. If your site blocks these crawlers, it simply doesn't exist for them: you don't appear in answers, you aren't cited, you aren't recommended.
Cloudflare is just one part. You also need an llms.txt file, ai-summary meta tags, a proper robots.txt and correct DNS settings. When everything is in order — AI assistants read your content and recommend it for relevant queries.
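For the robots.txt part, a minimal sketch that explicitly welcomes the AI crawlers mentioned above might look like this (the crawler names are the publicly documented ones; adjust the list to your own policy):

```
# robots.txt – allow AI crawlers explicitly
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: *
Allow: /
```

Remember that robots.txt is advisory: it only matters once Cloudflare actually lets the crawler through to read it.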
We check this complete setup during site analysis, technical audits and ongoing maintenance.
If you want your site to be accessible to AI crawlers and protected from scrapers, start with a site analysis.
Does this require a paid Cloudflare plan? No. The configuration works on Cloudflare's free plan.
Will it affect Google and other search engines? No. The configuration is built so that legitimate search engines, such as Googlebot and Bingbot, are not affected in any way.
Where do we start? With a site analysis. We check the Cloudflare configuration and give you a concrete answer on whether AI crawlers have access or not.
What happens when new AI crawlers appear? The configuration needs to be updated: each new crawler must be added explicitly. That's exactly why, as part of ongoing maintenance, we track these changes regularly.
Can traffic from specific countries be blocked as well? Yes, with a separate rule that doesn't affect the rest of the configuration. In our case we have such a rule for traffic from Singapore, which runs independently.
Will real visitors ever be challenged or blocked? No. The configuration targets automated tools and empty requests, things a normal browser never sends. Real visitors are unaffected.
How can you check your own site? With a site analysis. We check the Cloudflare configuration alongside all other technical parameters and give you specific recommendations on what needs to be done.
This article was written after real work on the ntg.bg configuration. We do the same for every site we work on — because Cloudflare is a powerful tool, but only when configured with a clear purpose.