NETWORK TECHNOLOGY

Cloudflare Security Rules — How We Blocked Bad Bots and Allowed AI Crawlers

How we configured Cloudflare rules that stop scrapers and suspicious tools, while guaranteeing free access for ChatGPT, Claude, Perplexity and Google — on our own site and for client websites.

Real results from ntg.bg after configuration:
ClaudeBot     → HTTP/2 200 ✓
ChatGPT-User  → HTTP/2 200 ✓
PerplexityBot → HTTP/2 200 ✓
Empty UA      → HTTP/2 403 ✓
Python script → Managed Challenge ✓
200 for AI crawlers. 403 for empty User-Agent. Exactly what we wanted.
Want the same for your site?

We do this as a standalone service or as part of a site audit and ongoing maintenance.

The Problem We See All the Time

Most people think of Cloudflare as something that "sits in front of the site and protects it." Technically true — but if you haven't told Cloudflare exactly what to do, it does very little about bots.

When we run a site analysis or technical audit, we almost always find the same thing: legitimate AI crawlers are blocked, while scrapers pass through freely. Not because someone configured it that way intentionally — but because nobody finished the configuration.

  • 🤖 AI crawlers blocked: ChatGPT, Claude and Perplexity get 403, so they can't read the site and won't recommend it.
  • 🕷️ Scrapers pass freely: automated bulk-download tools get through with no challenge whatsoever.
  • 🌀 Built-in blockers enabled: "Block AI training bots" and "AI Labyrinth" are features that actively interfere with the bots you want to allow.

Typical situation during an audit:

At first: "We have Cloudflare, the site is protected."

After checking: AI crawlers get 403. Scrapers get 200.

Additionally: There's an llms.txt file, robots.txt looks fine — but Cloudflare blocks everything before it even gets there.

At this point the problem isn't the site. The problem is the layer in front of it.
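
The mismatch described above (robots.txt says yes, the edge says no) can be made concrete with a minimal sketch. It assumes a hypothetical robots.txt body and a hypothetical observed status code, and relies only on Python's standard `urllib.robotparser`. The function name `diagnose` and the example URL are illustrative and do not come from any audit tool.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that welcomes an AI crawler.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /admin/
"""

def diagnose(robots_txt: str, bot: str, url: str, observed_status: int) -> str:
    """Compare what robots.txt permits with the status the bot actually received."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed_by_robots = parser.can_fetch(bot, url)
    if allowed_by_robots and observed_status == 403:
        return "robots.txt allows it, but the layer in front of the site blocks it"
    if allowed_by_robots:
        return "allowed by robots.txt and reachable"
    return "disallowed by robots.txt"

# The observed status (403) is an assumed value for illustration.
print(diagnose(ROBOTS_TXT, "ClaudeBot", "https://example.com/", 403))
```

A crawler that is rejected with 403 at the edge is turned away regardless of what robots.txt permits, which is why both layers have to be checked together.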

What proper Cloudflare configuration is NOT
  • Not simply activating Cloudflare.
  • Not leaving all settings at their defaults.
  • Not blocking AI crawlers while expecting to appear in ChatGPT results.
  • Not running AI Labyrinth on a site that also publishes llms.txt: one feeds crawlers decoy pages while the other invites them in.
  • Not a configuration that's never been verified with real requests.

How We Fix It

The configuration involves several steps — and the order matters. First, we disable the built-in Cloudflare features that block AI crawlers globally. Then we add rules in the correct sequence: blocking empty requests, challenging suspicious tools, and explicitly allowing legitimate AI bots.

The rules themselves aren't complicated — but they require knowing exactly what each one does, in what order they need to run, and how to verify they're working. A mistake in the order or the condition and you get the opposite of what you wanted.

1 Block

Requests with no User-Agent header are blocked immediately. No legitimate browser or bot ever sends an empty UA.

2 Challenge

Bulk-download tools receive a Managed Challenge — unless Cloudflare already recognizes them as legitimate bots.

3 Allow

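GPTBot, ClaudeBot and PerplexityBot get a Skip, bypassing WAF and Bot Fight Mode entirely. Google-Extended is also on the allow list, although it is a robots.txt token and not a request User-Agent, so its access is actually governed by robots.txt.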
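
To show why the order and the conditions matter, here is a simplified model of the three steps above. It is a sketch, not Cloudflare's rule engine: the `Request` and `Rule` classes, the `verified_bot` flag (standing in for Cloudflare's own bot recognition) and the substring lists are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Request:
    user_agent: str
    verified_bot: bool = False  # stand-in for Cloudflare recognizing a legitimate bot

@dataclass
class Rule:
    name: str
    matches: Callable[[Request], bool]
    action: str  # "block", "managed_challenge" or "skip"

# Illustrative substrings only; the article does not publish the real lists.
BULK_TOOLS = ("python-requests", "curl", "wget")
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

RULES = [
    Rule("1 Block", lambda r: r.user_agent == "", "block"),
    Rule(
        "2 Challenge",
        lambda r: any(t in r.user_agent.lower() for t in BULK_TOOLS)
        and not r.verified_bot,
        "managed_challenge",
    ),
    Rule("3 Allow", lambda r: any(b in r.user_agent for b in AI_BOTS), "skip"),
]

def evaluate(request: Request, rules=RULES) -> str:
    """Return the action of the first matching rule, or "pass" if none match."""
    for rule in rules:
        if rule.matches(request):
            return rule.action
    return "pass"

for ua in ("", "python-requests/2.31.0", "Mozilla/5.0 (compatible; ClaudeBot/1.0)"):
    print(repr(ua), "->", evaluate(Request(ua)))
```

In this model the first matching rule decides the outcome, so a condition that is too broad in step 2 would challenge a crawler before step 3 ever sees it.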

Result after configuration: AI crawlers read the site freely. Scrapers and automated tools — don't. Googlebot and legitimate search engines are unaffected.
Everything is verified with real requests after deployment — we don't accept "looks like it's working".
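
A verification pass with real requests could look roughly like the sketch below. It uses only Python's standard `urllib`. The User-Agent strings are shortened placeholders, not the crawlers' full official strings, and the expected status codes are taken from the results listed at the top of this article. The helper names (`build_request`, `fetch_status`, `check`) are hypothetical.

```python
import urllib.error
import urllib.request

# name -> (User-Agent placeholder, expected HTTP status)
PROBES = {
    "ClaudeBot": ("ClaudeBot/1.0", 200),
    "ChatGPT-User": ("ChatGPT-User/1.0", 200),
    "PerplexityBot": ("PerplexityBot/1.0", 200),
    "Empty UA": ("", 403),
}

def build_request(url, user_agent):
    """Build a HEAD request that carries exactly the given User-Agent value."""
    return urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )

def fetch_status(url, user_agent):
    """Send the request and return the HTTP status, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

def check(url, fetch=fetch_status):
    """Return {probe name: True/False}: did the status match the expectation?"""
    return {
        name: fetch(url, ua) == expected
        for name, (ua, expected) in PROBES.items()
    }
```

Calling `check("https://your-site.example/")` against a live site returns one True/False entry per probe. The `fetch` parameter exists so the comparison logic can be exercised without network access. Requests sent from an ordinary machine with a spoofed User-Agent may be treated differently from the crawlers' real traffic, so a result like this is a first check and not proof.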

Why AI Visibility Matters Now

More and more people search for services directly in ChatGPT, Claude or Perplexity. If your site has blocked these crawlers — it simply doesn't exist for them. You don't appear in answers, you're not cited, you're not recommended.

Cloudflare is just one part. You also need an llms.txt file, ai-summary meta tags, a proper robots.txt and correct DNS settings. When everything is in order — AI assistants read your content and recommend it for relevant queries.

We check this complete setup during site analysis, technical audits and ongoing maintenance.

Complete AI visibility includes:
  • Cloudflare Security Rules
  • llms.txt and llms.md files
  • ai-summary meta tags
  • robots.txt with AI bots
  • Fast site and DNS
  • Verified with real requests

What We Can Do for You

Cloudflare audit and configuration

  • Review of current Security settings.
  • Identifying blocked AI crawlers and unchecked scrapers.
  • Proper configuration in the correct order.
  • Disabling AI Labyrinth and Block AI training bots if they interfere.
  • Verification with real requests after deployment.

As part of a larger service

If you want your site to be accessible to AI crawlers and protected from scrapers:

Questions and Answers

Do I need a paid Cloudflare plan for this configuration?

No. The configuration works on Cloudflare's free plan.

Will Googlebot be affected?

No. The configuration is built so that legitimate search engines — Googlebot, Bingbot and similar — are not affected in any way.

How do I know if AI crawlers are currently blocked?

From a site analysis. We check the Cloudflare configuration and give you a concrete answer on whether AI crawlers have access or not.

What happens when a new AI bot appears?

The configuration needs to be updated — each new crawler must be added explicitly. That's exactly why with ongoing maintenance we track these changes regularly.
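
One way to keep such updates small is to generate the allow rule's expression from a single list of crawler names. The sketch below assumes the rule matches on the `http.user_agent` field with `contains`; the article does not show the actual expression, and `build_allow_expression` and "NewAIBot" are made-up names.

```python
def build_allow_expression(bot_names):
    """Join one `contains` clause per crawler into a single rule expression."""
    clauses = [f'(http.user_agent contains "{name}")' for name in bot_names]
    return " or ".join(clauses)

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]
print(build_allow_expression(AI_BOTS))

# Adding a new crawler becomes a one-line change to the list.
print(build_allow_expression(AI_BOTS + ["NewAIBot"]))
```

The generated string can then be pasted into the rule editor, and the verification requests should be repeated after every change.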

Can traffic from a specific country be blocked?

Yes, with a separate rule that doesn't affect the rest of the configuration. In our case we have such a rule for traffic from Singapore, and it runs independently of the bot rules.

Is there a risk of blocking real visitors?

No. The configuration targets automated tools and empty requests — things a normal browser never sends. Real visitors are unaffected.

Where do I start if I don't know what's currently configured?

With a site analysis. We check the Cloudflare configuration alongside all other technical parameters and give you specific recommendations on what needs to be done.

This article was written after real work on the ntg.bg configuration. We do the same for every site we work on — because Cloudflare is a powerful tool, but only when configured with a clear purpose.
