Cloudflare: Perplexity uses stealth crawling techniques, like undeclared user agents and rotating IP addresses, to evade robots.txt rules and network blocks

Pro@mander.xyz · edited 1 month ago

0_o7@lemmy.dbzer0.com · 1 month ago

Or they haven’t been caught yet.

The article explains that PerplexityBot respects robots.txt, but Perplexity then sends a separate request from a different IP address with a different user agent. They could very well be using a different method to get around it.

CarbonatedPastaSauce@lemmy.world · 1 month ago

The article explains how they tested for that, and as far as they could tell OpenAI is respecting the rules.

Perplexity is using stealth, undeclared crawlers to evade website no-crawl directives