[Radar] Update url scanner docs to use V2 (#18974)
* [Radar] Update url scanner docs to use V2
* [Radar] Address hyperlint issues
* [Radar] Point to the wappalyser fork we're using
* [Radar] Add andre as Radar code owner
* Apply suggestions from code review
Co-authored-by: Pedro Sousa <[email protected]>
---------
Co-authored-by: Sofia Cardita <[email protected]>
Co-authored-by: Pedro Sousa <[email protected]>
The `result.uuid` property in the response above identifies the scan and will be required when fetching the scan report.
+The `uuid` property in the response above identifies the scan and will be required when fetching the scan report.

 #### Submit a custom URL Scan

@@ -61,53 +55,58 @@ Here's an example request body with some custom configuration options:
 "screenshotsResolutions": [
 "desktop", "mobile", "tablet"
 ],
+"customagent": "XXX-my-user-agent",
+"referer": "example"
 "customHeaders": {
-"user-agent": "My-custom-user-agent",
+"Authorization": "xxx-token",
 },
 "visibility": "Unlisted"
 }
 ```

 Above, the visibility level is set as `Unlisted`, which means that the scan report won't be included in the [recent scans](https://radar.cloudflare.com/scan#recent-scans) list nor in search results. In effect, only users with knowledge of the scan ID will be able to access it.

-There will also be three screenshots taken of the webpage, one per target device type. The [`User-Agent`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent) HTTP Header will be set as "My-custom-user-agent". Note that you can set any custom HTTP header, including [Authorization](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Authorization).
+There will also be three screenshots taken of the webpage, one per target device type. The [`User-Agent`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent) will be set as "XXX-my-user-agent". Note that you can set any custom HTTP header, including [Authorization](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Authorization).

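As a rough illustration of the new-side request body in this hunk, the sketch below assembles the same options in Python. The field names `screenshotsResolutions`, `customagent`, `referer`, `customHeaders`, and `visibility` come from the diff; the `url` key, the helper name `build_scan_body`, and the validation rules are assumptions made for this example, not something the commit confirms.

```python
import json

# Device types and visibility levels mentioned on this page; the real API
# may accept other values.
DEVICE_TYPES = ("desktop", "mobile", "tablet")
VISIBILITY_LEVELS = ("Public", "Unlisted")


def build_scan_body(url, visibility="Unlisted", resolutions=DEVICE_TYPES,
                    custom_agent=None, referer=None, custom_headers=None):
    """Assemble a custom scan request body as a plain dict.

    The "url" key is an assumption: the hunk does not show the start of
    the JSON object.
    """
    if visibility not in VISIBILITY_LEVELS:
        raise ValueError(f"unsupported visibility: {visibility!r}")
    unknown = set(resolutions) - set(DEVICE_TYPES)
    if unknown:
        raise ValueError(f"unsupported device types: {sorted(unknown)}")

    body = {
        "url": url,
        "screenshotsResolutions": list(resolutions),
        "visibility": visibility,
    }
    # Optional fields are only included when the caller supplies them.
    if custom_agent is not None:
        body["customagent"] = custom_agent
    if referer is not None:
        body["referer"] = referer
    if custom_headers:
        body["customHeaders"] = dict(custom_headers)
    return body


if __name__ == "__main__":
    example = build_scan_body(
        "https://example.com",
        custom_agent="XXX-my-user-agent",
        referer="example",
        custom_headers={"Authorization": "xxx-token"},
    )
    print(json.dumps(example, indent=2))
```

Building the body as a dict and serializing it with `json.dumps` avoids hand-written JSON punctuation mistakes such as a missing or trailing comma.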
 ### Get scan report

-Once the URL Scan submission is made, the current progress can be checked by calling `https://api.cloudflare.com/client/v4/accounts/{account_id}/urlscanner/scan/{scan_id}`. The `scan_id` will be the `result.uuid` value returned in the previous response.
+Once the URL Scan submission is made, the current progress can be checked by calling `https://api.cloudflare.com/client/v4/accounts/{account_id}/urlscanner/v2/result/{scan_id}`. The `scan_id` will be the `uuid` value returned in the previous response.

-While the scan is in progress, the HTTP status code will be `202`, once it's finished it will be `200`. Clients are advised to poll every 10-30 seconds.
+While the scan is in progress, the HTTP status code will be `404`; once it is finished, it will be `200`. Cloudflare recommends that you poll every 10-30 seconds.

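The V2 polling behaviour described on the `+` lines above (404 while the scan is in progress, 200 once finished, poll every 10-30 seconds) could be sketched as follows. This is a minimal sketch, not the documented client: the function names, the 15-second default, the attempt limit, and the Bearer-token header in `http_fetch` are all assumptions for illustration. Only the result URL template is taken from the diff.

```python
import json
import time
import urllib.error
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def result_url(account_id, scan_id):
    """Build the V2 result URL quoted in the hunk above."""
    return f"{API_BASE}/accounts/{account_id}/urlscanner/v2/result/{scan_id}"


def http_fetch(url, token):
    """Return (status_code, parsed_json_or_None).

    Bearer authentication is an assumption of this sketch.
    """
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; hand the status code back to the caller.
        return error.code, None


def wait_for_report(url, fetch, interval=15.0, max_attempts=40,
                    sleep=time.sleep):
    """Poll until the report is ready.

    `fetch` is any callable taking a URL and returning (status, body).
    A 404 is treated as "still in progress", a 200 as "finished".
    """
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status == 200:
            return body
        if status != 404:
            raise RuntimeError(f"unexpected HTTP status {status}")
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError("scan did not finish within the polling budget")
```

A caller might wire the pieces together as `wait_for_report(result_url(account_id, scan_id), lambda u: http_fetch(u, token))`. Passing `fetch` and `sleep` in as parameters keeps the loop testable without network access or real delays.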
-The response will include, among others, the following top properties in `result.scan`:
+The response will include, among others, the following top properties:

 * `task` - Information on the scan submission.
-* `page` - Information pertaining to the primary request (for example, response cookies) and the webpage itself (e.g. console messages).
-* `meta` - Meta processors output including detected technologies, categories, rank and others.
-* `ips` - IPs contacted.
-* `asns` - AS Numbers contacted.
-* `geo` - GeoIP information derived from contacted IPs.
-* `domains` - Hostnames contacted, including `dns` record information.
-* `links` - Outgoing links detected in the DOM.
-* `performance` - Timings as given by the [`PerformanceNavigationTiming`](https://developer.mozilla.org/en-US/docs/Web/API/PerformanceNavigationTiming) interface.
-* `certificates` - TLS certificates of HTTP responses.
+* `page` - Information pertaining to the primary response, for example IP address, ASN, server, and page redirect history.
+* `data.requests` - Request chains involved in the page load.
+* `data.cookies` - Cookies set by the page.
+* `data.globals` - Non-standard JavaScript global variables.
+* `data.console` - Console logs.
+* `data.performance` - Timings as given by the [`PerformanceNavigationTiming`](https://developer.mozilla.org/en-US/docs/Web/API/PerformanceNavigationTiming) interface.
+* `meta` - Meta processors output including detected technologies, domain and URL categories, rank, geolocation information, and others.
+* `lists.ips` - IPs contacted.
+* `lists.asns` - AS Numbers contacted.
+* `lists.domains` - Hostnames contacted, including `dns` record information.
+* `lists.hashes` - Hashes of response bodies, of the main page HTML structure, screenshots, and favicons.
+* `lists.certificates` - TLS certificates of HTTP responses.
 * `verdicts` - Verdicts on malicious content.

 Some examples of more specific properties include:

 * `task.uuid` - ID of the scan.
-* `task.effectiveUrl` - URL of the primary request, after all HTTP redirects.
+* `task.url` - Submitted URL of the scan. May differ from final URL (`page.url`) if there are HTTP redirects.
 * `task.success` - Whether scan was successful or not. Scans can fail for various reasons, including DNS errors.
 * `task.status` - Current scan status, for example, `Queued`, `InProgress`, or `Finished`.
-* `meta.processors.categories` - Cloudflare categories of the main hostname contacted.
-* `meta.processors.securityRiskCategories` - Cloudflare categories, representing a security risk, of the main hostname contacted.
+* `meta.processors.domainCategories` - Cloudflare categories of the main hostname contacted.
 * `meta.processors.phishing` - What kind of phishing, if any, was detected.
-* `meta.processors.rank` - [Cloudflare Radar Rank](http://blog.cloudflare.com/radar-domain-rankings/) of the main hostname contacted.
-* `meta.processors.tech` - What kind of technologies were detected as being in use by the website, with the help of [Wappalyzer](https://github.com/wappalyzer/wappalyzer).
+* `meta.processors.radarRank` - [Cloudflare Radar Rank](http://blog.cloudflare.com/radar-domain-rankings/) of the main hostname contacted.
+* `meta.processors.wappa` - The kind of technologies detected as being in use by the website, with the help of [Wappalyzer](https://github.com/Lissy93/wapalyzer).
+* `page.url` - URL of the primary request, after all HTTP redirects.
 * `page.country` - GeoIP country name of the main IP address contacted.
-* `page.cookies` - Cookies set by the page.
-* `page.console` - JavaScript console messages
-* `page.js.variables` - Non-standard JavaScript global variables.
-* `page.securityViolations` - <GlossaryTooltip term="content security policy (CSP)" link="https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP">CSP</GlossaryTooltip> or [SRI](https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity) violations.
+* `page.history` - Main page history, including any HTTP redirects.
+* `page.screenshot` - Various hashes of the main screenshot. Can be used to search for sites with similar screenshots.
+* `page.domStructHash` - HTML structure hash. Use it to search for sites with similar structure.
+* `page.favicon.hash` - MD5 hash of the favicon.
 * `verdicts.overall.malicious` - Whether the website was considered malicious *at the time of the scan*. Please check the remaining properties for each subsystem(s) for specific threats detected.

 The [Get URL Scan](/api/resources/url_scanner/subresources/scans/methods/get/) API endpoint documentation contains the full response schema.
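The dotted property names listed in this hunk (for example `task.url`, `page.url`, `verdicts.overall.malicious`) read naturally as paths into the parsed JSON report. Below is a small sketch of a lookup helper for such paths. The helper name `dig` and the sample report are hypothetical: the values are invented placeholders for illustration, not real scan output, and the full schema is in the linked API reference.

```python
def dig(report, dotted_path, default=None):
    """Follow a dotted path such as "page.url" through nested dicts.

    Returns `default` as soon as a key is missing, so callers do not
    need to guard every level of the report themselves.
    """
    current = report
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Invented placeholder data shaped after the property names in the diff.
sample_report = {
    "task": {"uuid": "placeholder-uuid", "url": "http://example.com"},
    "page": {"url": "https://www.example.com/"},
    "verdicts": {"overall": {"malicious": False}},
}

# A submitted URL that differs from the final URL indicates a redirect.
redirected = dig(sample_report, "task.url") != dig(sample_report, "page.url")
is_malicious = dig(sample_report, "verdicts.overall.malicious", default=False)
```

Returning a default instead of raising keeps the helper usable on partial reports, at the cost of hiding typos in a path, so a stricter variant may be preferable in production code.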
@@ -116,31 +115,46 @@ To fetch the scan's [screenshots](/api/resources/url_scanner/subresources/scans/

 ### Search scans

-`Public` scans can also be searched for. To search for scans to the hostname `google.com`, use the query parameter `page_hostname=google.com`:
+Use a subset of ElasticSearch Query syntax to filter scans. Search results will include `Public` scans and your own `Unlisted` scans.
+
+To search for scans to the hostname `google.com`, use the query parameter `q=page.domain:"google.com"`:
Search results will also include your *own* `Unlisted` scans.

 If, instead, you wanted to search for scans that made at least one request to the hostname `cdnjs.cloudflare.com`, for example sites that use a JavaScript library hosted at `cdnjs.cloudflare.com`, use the query parameter `hostname=cdnjs.cloudflare.com`:
You can also search for the hash in the URL Scanner API.
+Some other example queries:
+
+- `task.url:"https://google.com" OR task.url:"https://www.google.com"`: Search for scans whose submitted URL was either `google.com` or `www.google.com`. URLs must be enclosed in quotes.
+- `page.url:"https://google.com" AND NOT task.url:"https://google.com"`: Search for scans to `google.com` whose submitted URL was not `google.com` (that is, sites that redirected to google.com).
+- `page.domain:microsoft AND verdicts.malicious:true AND NOT page.domain:microsoft.com`: Malicious scans whose hostname starts with `microsoft`. Would match domains like `microsoft.phish.com`.
+- `apikey:me AND date:[2024-01 TO 2024-10]`: Your scans from January 2024 to October 2024.
+- `page.domain:(blogspot OR www.blogspot)`: Searches for scans whose main domain starts with `blogspot` or with `www.blogspot`.
+- `date:>now-7d AND path:okta-sign-in.min.js`: Scans from the last seven days with any request path that ends with `okta-sign-in.min.js`.
+- `page.asn:AS24940 AND hash:-557369673`: Websites hosted in AS24940 where a resource with the given hash was retrieved.
+- `hash:8f662c2ce9472ba8d03bfeb8cdae112dbc0426f99da01c5d70c7eb4afd5893ca`: Using the hash at `page.domStructHash` search for other scans with the same HTML structure hash.

 Go to [Search URL scans](/api/resources/url_scanner/subresources/scans/methods/list/) in the API documentation for the full list of available options.

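The example queries above contain quotes, colons, and spaces, so they need percent-encoding before they can travel in the `q` query parameter. The sketch below shows one way to build such a URL with the Python standard library. The helper names and the `endpoint` argument are hypothetical: the search endpoint itself is not shown in this hunk, so the caller is expected to supply it from the linked API documentation. The helper does no escaping of quotes inside values.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def quoted_term(field, value):
    """Build a field:"value" term. The diff notes URLs must be quoted."""
    return f'{field}:"{value}"'


def all_of(*terms):
    """Join terms with AND, as in the example queries."""
    return " AND ".join(terms)


def search_url(endpoint, query):
    """Attach the query as a percent-encoded `q` parameter.

    `endpoint` is a placeholder for the search endpoint, which this
    hunk does not spell out.
    """
    return f"{endpoint}?{urlencode({'q': query})}"


# Scans of google.com, as in the `+` line of the hunk.
by_domain = quoted_term("page.domain", "google.com")

# Your own scans in a date range, combining two terms from the list above.
mine_2024 = all_of("apikey:me", "date:[2024-01 TO 2024-10]")

# "https://api.example.invalid/search" is a stand-in, not the real endpoint.
url = search_url("https://api.example.invalid/search", by_domain)
```

Letting `urlencode` handle the encoding avoids hand-escaping characters such as `"`, `:`, `[`, and spaces, which is an easy place to introduce subtle query bugs.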
-Alternatively, you can search for the hash on the [Cloudflare dashboard](https://dash.cloudflare.com/) by selecting your account > **Security Center** > **Investigate** > Enter the hash > Select **Search**.

-### Search filters
+### Security Center
+
+Alternatively, you can search in the Security Center:
+
+1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/) and select your account.
+2. Go to **Security Center** > **Investigate**.
+3. Enter your query and select **Search**.

-You can search through the URL Scanner [reports](/radar/investigate/url-scanner/#search-filters) and retrieve information filtered by:
+In the Security Center, you can retrieve information already pre-filtered by: