Cloudflare Security Rules — How We Blocked Bad Bots and Allowed AI Crawlers

How we configured Cloudflare rules that stop scrapers and suspicious tools, while guaranteeing free access for ChatGPT, Claude, Perplexity and Google — on our own site and for client websites.

Real results from ntg.bg after configuration:

ClaudeBot     → HTTP/2 200 ✓
ChatGPT-User  → HTTP/2 200 ✓
PerplexityBot → HTTP/2 200 ✓
Empty UA      → HTTP/2 403 ✓
Python script → Managed Challenge ✓

200 for AI crawlers. 403 for empty User-Agent. Exactly what we wanted.

Want the same for your site?

We do this as a standalone service or as part of a site audit and ongoing maintenance.

Request configuration | Analyze your site

The Problem We See All the Time

Most people think of Cloudflare as something that "sits in front of the site and protects it." Technically true — but if you haven't told Cloudflare exactly what to do, it does very little about bots.

When we run a site analysis or technical audit, we almost always find the same thing: legitimate AI crawlers are blocked, while scrapers pass through freely. Not because someone configured it that way intentionally — but because nobody finished the configuration.

🤖

AI crawlers blocked

ChatGPT, Claude, Perplexity get 403 — they can't read the site and won't recommend it

🕷️

Scrapers pass freely

Automated bulk-download tools get through with no challenge whatsoever

🌀

Built-in blockers enabled

"Block AI training bots" and "AI Labyrinth" — features that actively interfere with the bots you want to allow

Typical situation during an audit:

At first: "We have Cloudflare, the site is protected."

After checking: AI crawlers get 403. Scrapers get 200.

Additionally: There's an llms.txt file, robots.txt looks fine — but Cloudflare blocks everything before it even gets there.

At this point the problem isn't the site. The problem is the layer in front of it.
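
This mismatch can be checked in two steps: what robots.txt permits, and what the layer in front of the site actually returned. Below is a minimal Python sketch, assuming a made-up robots.txt body and a status code you observed yourself. The names `robots_allows` and `diagnose` are illustrative only and not part of any Cloudflare tooling.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: GPTBot
Allow: /
"""


def robots_allows(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if the robots.txt text permits `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def diagnose(robots_ok: bool, edge_status: int) -> str:
    """Compare the robots.txt permission with the status code that came back."""
    if not robots_ok:
        return "disallowed by robots.txt"
    if edge_status == 403:
        return "blocked in front of the site"
    if edge_status == 200:
        return "reachable"
    return "inconclusive"
```

If `robots_allows(...)` is true for a crawler but the request still comes back as 403, the block most likely sits in the security layer and not in the site's own files.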

What proper Cloudflare configuration is NOT

Not simply activating Cloudflare.
Not leaving all settings at their defaults.
Not blocking AI crawlers while expecting to appear in ChatGPT results.
Not having AI Labyrinth and llms.txt enabled at the same time — they contradict each other.
Not a configuration that's never been verified with real requests.

How We Fix It

The configuration involves several steps — and the order matters. First, we disable the built-in Cloudflare features that block AI crawlers globally. Then we add rules in the correct sequence: blocking empty requests, challenging suspicious tools, and explicitly allowing legitimate AI bots.

The rules themselves aren't complicated — but they require knowing exactly what each one does, in what order they need to run, and how to verify they're working. A mistake in the order or the condition and you get the opposite of what you wanted.

1 Block

Requests with no User-Agent header are blocked immediately. Browsers and the well-known crawlers always send a User-Agent, so an empty one almost always means a script.

2 Challenge

Bulk-download tools receive a Managed Challenge — unless Cloudflare already recognizes them as legitimate bots.

3 Allow

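GPTBot, ClaudeBot and PerplexityBot get a Skip action, so the remaining custom rules and the security features selected in the Skip rule no longer challenge them. Google-Extended is a robots.txt token, not a separate User-Agent, so it is handled in robots.txt and not in this rule. Super Bot Fight Mode can be skipped this way; the free plan's Bot Fight Mode is not controlled by custom rules and has to be switched off separately if it interferes.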
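
The three steps above can be sketched as an ordered rule list. This is an illustration and not our exact production rules: the substring lists are assumptions, the expression strings use Cloudflare's `http.user_agent` and `cf.client.bot` fields, and the small Python evaluator is a simplified local model in which the first matching rule decides the outcome.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical substring lists (case-sensitive, like the `contains` operator).
BULK_TOOLS = ("python-requests", "Python-urllib", "curl/", "Wget/", "Scrapy")
AI_CRAWLERS = ("GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot")


@dataclass(frozen=True)
class Request:
    user_agent: str
    verified_bot: bool = False  # stands in for the cf.client.bot field


@dataclass(frozen=True)
class Rule:
    name: str
    action: str      # "block", "managed_challenge" or "skip"
    expression: str  # Cloudflare-style expression, kept for reference
    matches: Callable[[Request], bool]


def contains_expr(needles: Sequence[str]) -> str:
    """Join substrings into an `a or b or c` User-Agent expression."""
    return " or ".join(f'http.user_agent contains "{n}"' for n in needles)


def ua_has(request: Request, needles: Sequence[str]) -> bool:
    """Local stand-in for the `contains` checks in the expression."""
    return any(n in request.user_agent for n in needles)


# Order matters: block first, then challenge, then the explicit allow.
RULES = [
    Rule("1 Block", "block", '(http.user_agent eq "")',
         lambda r: r.user_agent == ""),
    Rule("2 Challenge", "managed_challenge",
         f"({contains_expr(BULK_TOOLS)}) and not cf.client.bot",
         lambda r: ua_has(r, BULK_TOOLS) and not r.verified_bot),
    Rule("3 Allow", "skip", f"({contains_expr(AI_CRAWLERS)})",
         lambda r: ua_has(r, AI_CRAWLERS)),
]


def evaluate(request: Request) -> str:
    """Return the action of the first matching rule, or "pass" if none match."""
    for rule in RULES:
        if rule.matches(request):
            return rule.action
    return "pass"
```

For example, `evaluate(Request(""))` gives `"block"`, while a request whose User-Agent contains `ClaudeBot` reaches the third rule and gives `"skip"`. A normal browser matches nothing and passes through untouched.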

Result after configuration: AI crawlers read the site freely, while scrapers and automated tools do not. Googlebot and legitimate search engines are unaffected.
Everything is verified with real requests after deployment — we don't accept "looks like it's working".
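
A verification pass of this kind can be scripted. The following is a minimal sketch using only the Python standard library, not the exact tooling we run. The User-Agent strings are shortened placeholders, `run_probes` is a hypothetical helper name, and the `cf-mitigated: challenge` response header is what Cloudflare documents for challenge pages. Keep in mind that sending a crawler's User-Agent from your own machine only exercises User-Agent-based rules, not any IP-based bot verification.

```python
import urllib.error
import urllib.request

# Placeholder User-Agent strings, shortened for illustration.
PROBES = {
    "ClaudeBot": "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "ChatGPT-User": "Mozilla/5.0 (compatible; ChatGPT-User/1.0)",
    "PerplexityBot": "Mozilla/5.0 (compatible; PerplexityBot/1.0)",
    "Empty UA": "",
}


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a GET request that carries exactly the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def classify(status: int, headers: dict) -> str:
    """Turn a status code and response headers into a short verdict."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if lowered.get("cf-mitigated", "").lower() == "challenge":
        return "challenged"
    if status == 403:
        return "blocked"
    if 200 <= status < 300:
        return "allowed"
    return f"other ({status})"


def fetch_status(url: str, user_agent: str, timeout: float = 10.0):
    """Send one request and return (status, headers), including error responses."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status, dict(resp.headers.items())
    except urllib.error.HTTPError as err:
        return err.code, dict(err.headers.items())


def run_probes(url: str) -> dict:
    """Probe the URL once per User-Agent and collect the verdicts."""
    results = {}
    for label, user_agent in PROBES.items():
        status, headers = fetch_status(url, user_agent)
        results[label] = classify(status, headers)
    return results
```

Calling `run_probes` with your own site's URL returns one verdict per probe, which can be compared against what you expect: allowed for the AI crawlers, blocked for the empty User-Agent.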

Why AI Visibility Matters Now

More and more people search for services directly in ChatGPT, Claude or Perplexity. If your site blocks their crawlers, it simply doesn't exist for them. You don't appear in answers, you're not cited, you're not recommended.

Cloudflare is just one part. You also need an llms.txt file, ai-summary meta tags, a proper robots.txt and correct DNS settings. When everything is in order, AI assistants can read your content and are able to recommend it for relevant queries.

We check this complete setup during site analysis, technical audits and ongoing maintenance.

Complete AI visibility includes:

Cloudflare Security Rules
llms.txt and llms.md files
ai-summary meta tags
robots.txt with explicit rules for AI bots
Fast site and DNS
Verified with real requests

What We Can Do for You

Cloudflare audit and configuration

Review of current Security settings.
Identifying blocked AI crawlers and unchecked scrapers.
Proper configuration in the correct order.
Disabling AI Labyrinth and Block AI training bots if they interfere.
Verification with real requests after deployment.

As part of a larger service

Cloudflare audit included in site analysis.
Configuration as part of a technical audit.
Regular checks with ongoing maintenance.
Setup of DNS zones alongside the rules.
Bot and spam protection for online stores.

If you want your site to be accessible to AI crawlers and protected from scrapers:

Get in touch | Analyze your site | Technical audit

Questions and Answers

Do I need a paid Cloudflare plan for this configuration?

No. The configuration works on Cloudflare's free plan.

Will Googlebot be affected?

No. The configuration is built so that legitimate search engines — Googlebot, Bingbot and similar — are not affected in any way.

How do I know if AI crawlers are currently blocked?

From a site analysis. We check the Cloudflare configuration and give you a concrete answer on whether AI crawlers have access or not.

What happens when a new AI bot appears?

The configuration needs to be updated: each new crawler must be added to the allow rule explicitly. That's why we track these changes regularly as part of ongoing maintenance.
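
One way to keep such updates manageable is to generate the allow expression from a single list of crawler tokens. This is a sketch under assumptions: `build_allow_expression` is a hypothetical helper, and the output uses Cloudflare's `http.user_agent contains` syntax. It should be reviewed before being pasted into a rule.

```python
from typing import Iterable


def build_allow_expression(crawler_tokens: Iterable[str]) -> str:
    """Build an `or`-joined User-Agent expression from crawler name tokens."""
    # Trim whitespace and drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(t.strip() for t in crawler_tokens if t.strip()))
    if not unique:
        raise ValueError("at least one crawler token is required")
    for token in unique:
        # Quotes or backslashes would break the quoted string in the expression.
        if '"' in token or "\\" in token:
            raise ValueError(f"unsupported character in token: {token!r}")
    return " or ".join(f'(http.user_agent contains "{t}")' for t in unique)
```

Adding a new crawler then means appending one token to the list, regenerating the expression, and verifying the result with a real request.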

Can traffic from a specific country be blocked?

Yes — with a separate rule that doesn't affect the rest of the configuration. In our case we have such a rule for traffic from Singapore, which runs independently.

Is there a risk of blocking real visitors?

No. The configuration targets automated tools and empty requests — things a normal browser never sends. Real visitors are unaffected.

Where do I start if I don't know what's currently configured?

With a site analysis. We check the Cloudflare configuration alongside all other technical parameters and give you specific recommendations on what needs to be done.

This article was written after real work on the ntg.bg configuration. We do the same for every site we work on — because Cloudflare is a powerful tool, but only when configured with a clear purpose.
