Recognise search providers inapp browsers, analyzer bots (#251)
omrilotan authored Apr 2, 2024
1 parent 850fcd7 commit b245678
Showing 5 changed files with 17 additions and 7 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,11 @@
# Changelog

+## [5.1.4](https://github.com/omrilotan/isbot/compare/v5.1.3...v5.1.4)
+
+- Recognise search providers inapp browsers
+- Ignore Crosswalk project: An old project that is no longer maintained and has insignificant usage
+- PDRL Analyzer
+
## [5.1.3](https://github.com/omrilotan/isbot/compare/v5.1.2...v5.1.3)

- Recognise browsers: Ecosia ios in-app browser, Phantom in-app browser
7 changes: 6 additions & 1 deletion fixtures/browsers.yml
@@ -365,7 +365,9 @@ Motorola Internet:
- MOT-VE240/00.72 UP.Browser/7.2.7.5.548 (GUI) MMP/2.0 Novarra-Vision/8.0
Mozilla Android Components:
- MozacFetch/49.0.20200702190156
-Naver Whale:
+Naver:
+- Mozilla/5.0 (iPhone; CPU iPhone OS 17_2_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.2 Mobile/15E148 Safari/605.1 NAVER(inapp; search; 2000; 12.3.6; 14PRO)
+- Mozilla/5.0 (Linux; Android 8.0.0; SM-N950N Build/R16NW; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/79.0.3945.88 Mobile Safari/537.36 NAVER(inapp; search; 1000; 11.8.4; 11)
- Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.57 Whale/3.14.133.23 Safari/537.36
NCSA Mosaic:
- NCSA_Mosaic/2.7b5 (X11;Linux 2.6.7 i686) libwww/2.12 modified
@@ -703,6 +705,9 @@ ZZZ Glitches and Misidentified Browsers - These browsers are legit user agent ev
- User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 11_3_1) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/92.0 Safari /535.7
- User-Agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.74 Safari/537.36 Edg/90.0.818.62
- User-Agent:Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0 Safari /537.36
+ZZZ Insignificat bots - Crosswalk project (deprecated):
+- Mozilla/5.0 (Linux; Android 11;SM-G9866N Build/PR1A.2007820.012; wv) AppleWebKit/537.36 (KHTML,linke Gecko) Version/4.0 Chrome/80.0.3987.163 Whale/1.0.0.0 Crosswalk/25.80.14.26 Mobile Safari/537.36 NAVER(inapp; search; 900; 11.2.5)
+- Mozilla/5.0 (Linux; Android 12; SM-G975N Build/SP1A.210812.016; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/90.0.4430.232 Whale/1.0.0.0 Crosswalk/26.90.3.21 Mobile Safari/537.36 NAVER(inapp; search; 1010; 11.11.3)
ZZZ Insignificat bots - These bots have very low appearance rate and are not worth blocking:
- Mozilla/5.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322) 360JK yunjiankong 427691
- Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; Banca Caboto s.p.a.)
5 changes: 2 additions & 3 deletions fixtures/crawlers.yml
@@ -470,9 +470,6 @@ Nagios check_http:
- check_http/v1.5 (nagios-plugins 1.5)
NalezenCzBot:
- NalezenCzBot/1.0 (http://www.nalezen.cz/about-crawler)
-Naver Search:
-- Mozilla/5.0 (Linux; Android 12; SM-G975N Build/SP1A.210812.016; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/90.0.4430.232 Whale/1.0.0.0 Crosswalk/26.90.3.21 Mobile Safari/537.36 NAVER(inapp; search; 1010; 11.11.3)
-- Mozilla/5.0 (Linux; Android 8.0.0; SM-N950N Build/R16NW; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/79.0.3945.88 Mobile Safari/537.36 NAVER(inapp; search; 1000; 11.8.4; 11)
nbertaupete95:
- Mozilla/5.0/Firefox/42.0 - nbertaupete95(at)gmail.com
Netcraft Survey Bot:
@@ -537,6 +534,8 @@ Pageburst:
- Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; compatible; pageburst) Chrome/111.0.5563.146 Safari/537.36
PaperLiBot:
- Mozilla/5.0 (compatible; PaperLiBot/2.1; http://support.paper.li/entries/20023257-what-is-paper-li)
+PDRL:
+- pdrl.fm Analyzer / 1.0.0
PerimeterX:
- PerimeterX Integration Services
PetalBot:
2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "isbot",
-"version": "5.1.3",
+"version": "5.1.4",
"description": "🤖/👨‍🦰 Recognise bots/crawlers/spiders using the user agent string.",
"keywords": [
"bot",
4 changes: 2 additions & 2 deletions src/patterns.json
@@ -5,7 +5,7 @@
"(?:^|[^g])news",
"(?<! (?:channel/|google/))google(?!(app|/google| pixel))",
"(?<! cu)bot(?:[^\\w]|_|$)",
-"(?<!(?: ya| yandex|^job) ?)search",
+"(?<!(?: ya| yandex|^job|inapp;) ?)search",
"(?<!(?:lib))http",
"(?<![hg]m)score",
"(?<!android|ios)@",
@@ -35,7 +35,6 @@
"^facebook",
"^getright/",
"^gozilla/",
-"^hatena",
"^hobbit",
"^hotzonu",
"^hwcdn/",
@@ -76,6 +75,7 @@
"^zdm/\\d",
"^zoom marketplace/",
"^{{.*}}$",
+"analyzer",
"archive",
"ask jeeves/teoma",
"bit\\.ly/",
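
The effect of the `search` and `analyzer` pattern changes can be checked against the fixtures in this commit. The following is a minimal sketch, not the library's own code: it compiles three patterns from the `src/patterns.json` diff one by one with the case-insensitive flag (an assumption made for illustration) and tests them against two user agent strings from the fixtures. The variable names and the `genericSearchBot` string are hypothetical.

```typescript
// Sketch only: patterns copied from this commit's src/patterns.json diff.
// Compiling each one separately with the "i" flag is an assumption for illustration.
const searchBefore = new RegExp("(?<!(?: ya| yandex|^job) ?)search", "i");
const searchAfter = new RegExp("(?<!(?: ya| yandex|^job|inapp;) ?)search", "i");
const analyzer = new RegExp("analyzer", "i");

// User agent strings taken from the fixtures touched by this commit.
const naverInApp =
  "Mozilla/5.0 (iPhone; CPU iPhone OS 17_2_1 like Mac OS X) AppleWebKit/605.1.15 " +
  "(KHTML, like Gecko) Version/17.2 Mobile/15E148 Safari/605.1 " +
  "NAVER(inapp; search; 2000; 12.3.6; 14PRO)";
const pdrl = "pdrl.fm Analyzer / 1.0.0";

// Hypothetical user agent, not part of the fixtures.
const genericSearchBot = "ExampleCorp Search Fetcher/1.0";

// The old pattern flags the Naver in-app browser because of "search".
console.log(searchBefore.test(naverInApp)); // true
// The added "inapp;" lookbehind alternative stops that match.
console.log(searchAfter.test(naverInApp)); // false
// Other user agents containing "search" are still matched.
console.log(searchAfter.test(genericSearchBot)); // true
// The new "analyzer" pattern matches the PDRL fixture.
console.log(analyzer.test(pdrl)); // true
```

The lookbehind only exempts `search` when it directly follows `inapp;` and an optional space, which is why the Naver in-app fixtures could move from `fixtures/crawlers.yml` to `fixtures/browsers.yml` while other `search` user agents remain matched.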
