Simple PHP function to detect search engine bots and crawlers.
It compares the request's user agent string with a list of more than 200 common bots, spiders and crawlers.
It reads the user agent from $_SERVER['HTTP_USER_AGENT'].
This server variable alone is not enough to block spambots and other unwanted traffic, because a client can send any user agent it likes; it only identifies bots that announce themselves with a clearly defined user agent.
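The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the actual function: the name `is_bot`, the constant `BOT_SIGNATURES`, and the shortened signature list are assumptions made for the example, and a case-insensitive substring match is only one plausible way to do the comparison.

```php
<?php
// Hypothetical, abbreviated signature list; the real list has 200+ entries.
const BOT_SIGNATURES = [
    'facebookexternalhit',
    'DuckDuckBot',
    'SemrushBot',
    'Applebot',
    'Go-http-client',
    'curl/7',
];

/**
 * Returns true when the user agent contains any known bot signature.
 * The function name and signature are illustrative assumptions.
 */
function is_bot(?string $userAgent = null): bool
{
    // Fall back to the request header when no string is passed in.
    $userAgent = $userAgent ?? ($_SERVER['HTTP_USER_AGENT'] ?? '');

    if ($userAgent === '') {
        // Header missing or empty: this sketch does not treat that as a bot.
        return false;
    }

    foreach (BOT_SIGNATURES as $signature) {
        // Case-insensitive substring match against each signature.
        if (stripos($userAgent, $signature) !== false) {
            return true;
        }
    }

    return false;
}
```

Calling `is_bot()` with no argument checks the current request, while passing a string makes the function easy to test outside a web request. A substring match keeps the list short, but it also means a broad entry such as `curl/7` matches every 7.x release of that client.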
- facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
- SpiderLing (a SPIDER for LINGustic research); +http://nlp.fi.muni.cz/projects/biwec/
- check_http/v2.2 (monitoring-plugins 2.2)
- Selfoss/2.18 (+https://selfoss.aditu.de)
- Mozilla/5.0 (compatible; Adsbot/3.1)
- Domains Project
- SerendeputyBot
- Moreover
- DuckDuckGo
- AHC/2.1
- eCairn-Grabber
- mediawords bot
- PHP-Curl-Class
- Scrapy
- curl/7
- Blackboard
- NetNewsWire
- node-fetch
- admantx
- metadataparser
- Seekport Crawler
- AwarioSmartBot
- Apache-HttpClient/5
- Winds: Open Source RSS & Podcast
- dlvr.it
- BehloolBot
- 7Siters
- DomainStatsBot
- SeznamBot/3.2
- VelenPublicWebCrawler/1.0
- WordPress.com mShots
- adscanner
- BacklinkCrawler
- netEstate NE Crawler
- Astute SRM
- GigablastOpenSource/1.0
- serpstatbot
- PocketParser
- newspaper
- scalaj-http
- XoviBot
- sysomos.com
- Jetslide
- OnalyticaBot
- Linguee Bot
- admantx-adform
- Buck/2.2
- Barkrowler
- ZoominfoBot
- Seokicks
- barkrowler
- DuckDuckBot (thanks RaphaelWimmer)
- axios/0.17.0
- semantic-visions.com crawler;
- webdatastats.com
- AnyEvent-HTTP/2.24;
- 360Spider
- linkfluence.com
- glutenfreepleasure.com
- Gluten Free Crawler
- YaK/1.0
- Cliqzbot
- app.hypefactors.com
- semantic-visions.com
- archive.org_bot
- FemtosearchBot
- SemrushBot
- ltx71
- commoncrawl
- istellabot
- DomainCrawler
- cs.daum.net
- StormCrawler
- GarlikCrawler
- The Knowledge AI
- getstream.io/winds
- YisouSpider
- ScooperBot
- TrendsmapResolver
- Nuzzel
- Go-http-client
- Applebot
- LivelapBot
- GroupHigh