This guide explains how to use cURL Impersonate to mimic browser behavior for web scraping:
- What Is cURL Impersonate?
- How cURL Impersonate Works
- cURL Impersonate: Command Line Tutorial
- cURL Impersonate: Python Tutorial
- cURL Impersonate Advanced Usage
cURL Impersonate is a specialized cURL build designed to mimic major browsers (Chrome, Edge, Safari, and Firefox). This tool performs TLS and HTTP handshakes that closely resemble those of real browsers.
You can use this HTTP client either through the curl-impersonate command-line tool, similar to regular curl, or as a library in Python.
These browsers can be impersonated:
Browser | Simulated OS | Wrapper Script |
Chrome 99 | Windows 10 | curl_chrome99 |
Chrome 100 | Windows 10 | curl_chrome100 |
Chrome 101 | Windows 10 | curl_chrome101 |
Chrome 104 | Windows 10 | curl_chrome104 |
Chrome 107 | Windows 10 | curl_chrome107 |
Chrome 110 | Windows 10 | curl_chrome110 |
Chrome 116 | Windows 10 | curl_chrome116 |
Chrome 99 | Android 12 | curl_chrome99_android |
Edge 99 | Windows 10 | curl_edge99 |
Edge 101 | Windows 10 | curl_edge101 |
Firefox 91 ESR | Windows 10 | curl_ff91esr |
Firefox 95 | Windows 10 | curl_ff95 |
Firefox 98 | Windows 10 | curl_ff98 |
Firefox 100 | Windows 10 | curl_ff100 |
Firefox 102 | Windows 10 | curl_ff102 |
Firefox 109 | Windows 10 | curl_ff109 |
Firefox 117 | Windows 10 | curl_ff117 |
Safari 15.3 | macOS Big Sur | curl_safari15_3 |
Safari 15.5 | macOS Monterey | curl_safari15_5 |
Each supported browser has a specific wrapper script that configures curl-impersonate with the appropriate headers, flags, and settings to simulate that browser.
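If you select wrappers programmatically, the table above can be expressed as a small lookup. This is a minimal sketch: the wrapper names are copied from the table, while `WRAPPERS`, `wrapper_for`, and `latest_wrapper` are hypothetical helpers and not part of curl-impersonate.

```python
# Wrapper names are taken from the table above.
# The dictionary layout and helper functions are illustrative assumptions.
WRAPPERS = {
    "chrome": {
        "99": "curl_chrome99", "100": "curl_chrome100", "101": "curl_chrome101",
        "104": "curl_chrome104", "107": "curl_chrome107", "110": "curl_chrome110",
        "116": "curl_chrome116",
    },
    "chrome_android": {"99": "curl_chrome99_android"},
    "edge": {"99": "curl_edge99", "101": "curl_edge101"},
    "firefox": {
        "91esr": "curl_ff91esr", "95": "curl_ff95", "98": "curl_ff98",
        "100": "curl_ff100", "102": "curl_ff102", "109": "curl_ff109",
        "117": "curl_ff117",
    },
    "safari": {"15.3": "curl_safari15_3", "15.5": "curl_safari15_5"},
}


def wrapper_for(browser: str, version: str) -> str:
    """Return the wrapper script name for a browser/version pair."""
    try:
        return WRAPPERS[browser.lower()][version]
    except KeyError:
        raise ValueError(f"no wrapper script for {browser} {version}") from None


def latest_wrapper(browser: str) -> str:
    """Return the wrapper listed last for a browser (the newest in the table)."""
    versions = WRAPPERS.get(browser.lower())
    if not versions:
        raise ValueError(f"unknown browser: {browser}")
    # Dictionaries keep insertion order, so the last value is the newest entry.
    return list(versions.values())[-1]
```

For example, `wrapper_for("chrome", "116")` returns `curl_chrome116`, which you can then call from the command line.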
When sending an HTTPS request, a TLS handshake occurs. During this process, details about the HTTP client are shared with the web server, creating a unique TLS fingerprint.
Standard HTTP clients have configurations different from browsers, resulting in a TLS fingerprint that easily reveals automated requests. This allows anti-bot systems to detect and block your scraping attempts.
cURL Impersonate solves this by modifying the standard curl tool to match real browsers' TLS fingerprints through:
- TLS library modification: For Chrome versions, curl is compiled with BoringSSL, Google's TLS library. For Firefox versions, it uses NSS, Firefox's TLS library.
- Configuration adjustments: It modifies cURL's TLS extensions and SSL options to mimic browser settings and adds support for browser-specific TLS extensions.
- HTTP/2 handshake customization: It aligns cURL's HTTP/2 connection settings with real browsers.
- Non-default flags: It runs with specific flags like --ciphers and --curves, plus custom headers, to further mimic browser behavior.
This makes curl-impersonate requests appear as if they come from a real browser, helping bypass many bot detection mechanisms.
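To make the idea of a TLS fingerprint more concrete, here is a minimal sketch of JA3, one widely used fingerprinting scheme. JA3 is not named in this guide and serves only as an illustration. The numeric values below are made-up examples, not captures from real clients, and `ja3_string` and `ja3_hash` are hypothetical helper names.

```python
import hashlib
from typing import Sequence


def ja3_string(tls_version: int, ciphers: Sequence[int], extensions: Sequence[int],
               curves: Sequence[int], point_formats: Sequence[int]) -> str:
    """Build the JA3 string: five comma-separated fields, list values joined by '-'."""
    lists = [ciphers, extensions, curves, point_formats]
    fields = [str(tls_version)] + ["-".join(str(v) for v in values) for values in lists]
    return ",".join(fields)


def ja3_hash(tls_version: int, ciphers: Sequence[int], extensions: Sequence[int],
             curves: Sequence[int], point_formats: Sequence[int]) -> str:
    """Return the MD5 hex digest of the JA3 string (the JA3 fingerprint)."""
    raw = ja3_string(tls_version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Illustrative ClientHello parameters (assumed values, not real captures).
# The two clients offer the same ciphers, but in a different order and with
# different extensions, so their fingerprints differ.
browser_like = ja3_hash(771, [4865, 4866, 4867], [0, 10, 11, 13, 16], [29, 23, 24], [0])
script_like = ja3_hash(771, [4866, 4867, 4865], [0, 10, 11, 13], [29, 23, 24], [0])

print(browser_like != script_like)
```

Because the hash covers the order of ciphers and extensions, even a small difference in client configuration yields a different fingerprint. This is why cURL Impersonate has to match the browser's settings exactly.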
Follow these steps to use cURL Impersonate from the command line.
Note: Multiple installation methods are shown, but you only need one. Docker is recommended.
Download pre-compiled binaries for Linux and macOS from the GitHub releases page. Before using them, install:
- NSS (Network Security Services): Libraries supporting cross-platform security-enabled applications.
- CA certificates: Digital certificates authenticating server and client identities.
To meet prerequisites on Ubuntu:
sudo apt install libnss3 nss-plugin-pem ca-certificates
On Red Hat, Fedora, or CentOS, execute:
yum install nss nss-pem ca-certificates
On Arch Linux, run:
pacman -S nss ca-certificates
On macOS, run:
brew install nss ca-certificates
Also ensure zlib is installed, as the pre-compiled binaries are gzipped.
Docker images with curl-impersonate are available on Docker Hub, based on Alpine Linux and Debian.
Chrome images (*-chrome) can impersonate Chrome, Edge, and Safari. Firefox images (*-ff) can impersonate Firefox.
To download a Docker image:
For Chrome version on Alpine Linux:
docker pull lwthiker/curl-impersonate:0.5-chrome
For Firefox version on Alpine Linux:
docker pull lwthiker/curl-impersonate:0.5-ff
For Chrome version on Debian:
docker pull lwthiker/curl-impersonate:0.5-chrome-slim-buster
For Firefox version on Debian:
docker pull lwthiker/curl-impersonate:0.5-ff-slim-buster
Once downloaded, execute curl-impersonate using a docker run command.
On Arch Linux, install through the AUR package curl-impersonate-bin.
On macOS, install the unofficial Homebrew package:
brew tap shakacode/brew
brew install curl-impersonate
Execute a curl-impersonate command using:
curl-impersonate-wrapper [options] [target-url]
Or with Docker:
docker run --rm lwthiker/curl-impersonate:[curl-impersonate-version] curl-impersonate-wrapper [options] [target-url]
Where:
- curl-impersonate-wrapper is your chosen wrapper (e.g., curl_chrome116, curl_edge101)
- options are optional cURL flags
- target-url is the web page URL
Be cautious with custom options as some flags might alter the TLS signature.
The wrappers automatically set default HTTP headers, which you can customize by modifying the scripts.
Example: Request the Wikipedia homepage using Chrome:
curl_chrome110 https://www.wikipedia.org
With Docker:
docker run --rm lwthiker/curl-impersonate:0.5-chrome curl_chrome110 https://www.wikipedia.org
Result:
<html lang="en" class="no-js">
<head>
<meta charset="utf-8">
<title>Wikipedia</title>
<meta name="description" content="Wikipedia is a free online encyclopedia, created and edited by volunteers around the world and hosted by the Wikimedia Foundation.">
<!-- omitted for brevity... -->
The server returns the HTML as if you were using a browser.
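If you want to drive the wrappers from a script, the command shapes above can be assembled in Python. This is a minimal sketch that assumes the wrapper (or Docker) is installed and on your PATH. `build_command` and `fetch` are hypothetical helper names, not part of cURL Impersonate.

```python
import subprocess
from typing import List, Optional, Sequence


def build_command(wrapper: str, url: str, options: Sequence[str] = (),
                  docker_image: Optional[str] = None) -> List[str]:
    """Build the argument list: wrapper [options] target-url, optionally via Docker."""
    command = [wrapper, *options, url]
    if docker_image is not None:
        # Same shape as the "docker run --rm <image> <wrapper> ..." form shown above.
        command = ["docker", "run", "--rm", docker_image, *command]
    return command


def fetch(wrapper: str, url: str, options: Sequence[str] = (),
          docker_image: Optional[str] = None, timeout: float = 60.0) -> str:
    """Run the wrapper and return the response body it writes to stdout."""
    result = subprocess.run(
        build_command(wrapper, url, options, docker_image),
        capture_output=True,
        text=True,
        check=True,       # raise CalledProcessError on a non-zero exit code
        timeout=timeout,
    )
    return result.stdout


# Example calls (not executed here because they need the tool and network access):
# html = fetch("curl_chrome110", "https://www.wikipedia.org")
# html = fetch("curl_chrome110", "https://www.wikipedia.org",
#              docker_image="lwthiker/curl-impersonate:0.5-chrome")
```

Passing the command as a list avoids shell quoting issues with URLs that contain characters such as `&` or `?`.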
While the command line is great for testing, web scraping is typically done in a programming language like Python.
You can use cURL Impersonate in Python through curl-cffi, a Python binding for curl-impersonate.
Prerequisites:
- Python 3.8+
- A Python project with virtual environment setup
- Optionally, a Python IDE like Visual Studio Code
Installation:
Install via pip:
pip install curl_cffi
Usage:
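Typically, you want to use the requests-like API. To do this, import requests from curl_cffi:
from curl_cffi import requests
Then, make a GET request to the target page while impersonating Chrome:
response = requests.get("https://www.wikipedia.org", impersonate="chrome")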
Print the response HTML with:
print(response.text)
Put it all together, and you will get:
from curl_cffi import requests
# make a GET request to the target page with
# the Chrome version of curl-impersonate
response = requests.get("https://www.wikipedia.org", impersonate="chrome")
# print the server response
print(response.text)
Running this script prints:
<html lang="en" class="no-js">
<head>
<meta charset="utf-8">
<title>Wikipedia</title>
<meta name="description" content="Wikipedia is a free online encyclopedia, created and edited by volunteers around the world and hosted by the Wikimedia Foundation.">
<!-- omitted for brevity... -->
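In a scraping script, you usually parse the returned HTML rather than print it. The sketch below extracts the page title with Python's standard library. `TitleParser`, `extract_title`, and `fetch_title` are hypothetical names, and the `timeout` value is an arbitrary assumption. curl_cffi is imported inside `fetch_title` so the parsing part runs even where the package is not installed.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text of the first <title> element in a document."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> (e.g. inside inline SVG)

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Return the stripped text of the first <title>, or an empty string."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


def fetch_title(url: str, browser: str = "chrome") -> str:
    """Fetch a page with curl_cffi and return its title."""
    from curl_cffi import requests  # imported lazily; requires `pip install curl_cffi`

    response = requests.get(url, impersonate=browser, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return extract_title(response.text)
```

Calling `extract_title` on the response shown above should yield `Wikipedia`. For anything beyond simple tags, a dedicated HTML parsing library is likely a better fit.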
Browser fingerprint simulation might not be enough against sophisticated anti-bot solutions. Proxies can help by providing fresh IP addresses.
To use a proxy with cURL Impersonate via command line:
curl_chrome116 -x http://84.18.12.16:8888 https://httpbin.org/ip
In Python:
from curl_cffi import requests
proxies = {"http": "http://84.18.12.16:8888", "https": "http://84.18.12.16:8888"}
response = requests.get("https://httpbin.org/ip", impersonate="chrome", proxies=proxies)
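Since the point of proxies is to provide fresh IP addresses, you may want to rotate through several of them. Here is a minimal round-robin sketch. The second proxy address is a placeholder from a documentation-only IP range, and `proxies_for`, `proxy_rotation`, and `fetch_ip` are hypothetical helper names.

```python
from itertools import cycle
from typing import Dict, Iterable, Iterator

# The first address comes from the example above; the second is a
# hypothetical placeholder. Replace both with proxies you actually control.
PROXY_POOL = ["http://84.18.12.16:8888", "http://203.0.113.10:8888"]


def proxies_for(proxy_url: str) -> Dict[str, str]:
    """Build the proxies mapping used above, routing HTTP and HTTPS traffic alike."""
    return {"http": proxy_url, "https": proxy_url}


def proxy_rotation(proxy_urls: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield proxies mappings in round-robin order, forever."""
    return (proxies_for(url) for url in cycle(proxy_urls))


def fetch_ip(rotation: Iterator[Dict[str, str]]) -> str:
    """Request httpbin.org/ip through the next proxy in the rotation."""
    from curl_cffi import requests  # imported lazily; requires `pip install curl_cffi`

    response = requests.get(
        "https://httpbin.org/ip",
        impersonate="chrome",
        proxies=next(rotation),
        timeout=30,
    )
    return response.text


# Example (needs working proxies and network access):
# rotation = proxy_rotation(PROXY_POOL)
# print(fetch_ip(rotation))  # goes through the first proxy
# print(fetch_ip(rotation))  # goes through the second proxy
```

Round-robin is the simplest policy. In practice you may also want to drop proxies that fail repeatedly.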
libcurl-impersonate is a compiled libcurl version with cURL Impersonate features and an extended API for TLS details and header configurations.
Install it using the pre-compiled package. It facilitates cURL Impersonate integration into libraries in various programming languages.
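As a rough illustration of that integration, the sketch below prepares the environment for an existing libcurl-based program. It assumes, based on my reading of the project's README, that libcurl-impersonate can be injected on Linux via `LD_PRELOAD` and configured with a `CURL_IMPERSONATE` environment variable. Verify both against the version you install. The library path and `my_app` are placeholders, and `impersonation_env` is a hypothetical helper.

```python
import os
from typing import Dict, Mapping, Optional


def impersonation_env(lib_path: str, target: str,
                      base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment that preloads libcurl-impersonate.

    Assumption: the library honors LD_PRELOAD (Linux only) and reads the
    impersonation target from CURL_IMPERSONATE.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LD_PRELOAD"] = lib_path
    env["CURL_IMPERSONATE"] = target
    return env


# Example (placeholder path and program name, not executed here):
# import subprocess
# env = impersonation_env("/path/to/libcurl-impersonate-chrome.so", "chrome116")
# subprocess.run(["my_app"], env=env, check=True)
```

This approach leaves the application's code untouched, which can be handy when you cannot rebuild it against a different libcurl.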
Note that advanced anti-bot solutions like Cloudflare may still detect automated requests. For a comprehensive solution, consider Bright Data's Scraper API, which handles browser fingerprinting, CAPTCHA solving, and IP rotation.
Register for a free trial of Bright Data's web scraping infrastructure!