# Website Recycling · 2026 71-bot AI allowlist · refreshed 2026-05-23
# Matches CANONICAL_BOTS in /functions/api/scan-live.ts
# OpenAI · Anthropic · Google · Perplexity · Microsoft/Bing · Meta · Apple ·
# 15 other AI engines · 6 research crawlers · 8 Asian search · 6 social ·
# 8 emerging 2026 crawlers · everything we want reading us.

# Tier 1 · OpenAI
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
# Tier 1 · Anthropic
User-agent: ClaudeBot
User-agent: Claude-SearchBot
User-agent: Claude-User
User-agent: Claude-Web
User-agent: anthropic-ai
# Tier 1 · Google
User-agent: Google-Extended
User-agent: Googlebot
User-agent: GoogleOther
User-agent: GoogleOther-Image
User-agent: GoogleOther-Video
# Tier 1 · Perplexity
User-agent: PerplexityBot
User-agent: Perplexity-User
# Tier 1 · Microsoft / Bing
User-agent: Bingbot
User-agent: BingPreview
User-agent: BingChat
User-agent: MSNBot
User-agent: adidxbot
# Tier 1 · Meta (FB / IG / WhatsApp)
User-agent: Meta-ExternalFetcher
User-agent: Meta-ExternalAgent
User-agent: FacebookBot
User-agent: FacebookExternalHit
# Tier 1 · Apple (Siri / Spotlight / Apple Intelligence)
User-agent: Applebot
User-agent: Applebot-Extended
# Tier 2 · Other AI engines
User-agent: MistralAI-User
User-agent: Cohere-AI
User-agent: Cohere-Train
User-agent: DuckAssistBot
User-agent: Diffbot
User-agent: Komo
User-agent: Andi
User-agent: Phind
User-agent: Kagibot
User-agent: YouBot
User-agent: NeevaBot
User-agent: Bravebot
User-agent: Brave-SearchAssist
User-agent: GrokBot
User-agent: DeepSeekBot
# Tier 2 · Research + open-data crawlers
User-agent: CCBot
User-agent: Common-Crawl-Bot
User-agent: ICC-Crawler
User-agent: ImagesiftBot
User-agent: archive.org_bot
User-agent: Wayback-Save-Page-Bot
# Tier 2 · Asian search + AI
User-agent: Bytespider
User-agent: TikTokSpider
User-agent: Baiduspider
User-agent: YandexBot
User-agent: YandexImages
User-agent: PetalBot
User-agent: Amazonbot
User-agent: Sogou
# Tier 3 · Social + structured-data fetchers
User-agent: LinkedInBot
User-agent: TwitterBot
User-agent: Slackbot-LinkExpanding
User-agent: Discordbot
User-agent: WhatsApp
User-agent: Pinterestbot
# Tier 3 · DuckDuckGo + memory
User-agent: DuckDuckBot
User-agent: Arquivo-web-crawler
# Tier 3 · Known emerging 2026 AI crawlers
User-agent: AnchorBrowser
User-agent: NovellumAICrawl
User-agent: ProRataInc
User-agent: TerracottaBot
User-agent: Timpibot
User-agent: Webz-io
User-agent: AtlasCrawler
User-agent: CometCrawler
Allow: /

# Catch-all · default-allow
User-agent: *
Allow: /

Sitemap: https://websiterecycling.com/sitemap-index.xml