This is a Python library of web-related functions, such as:
 - remove comments or tags from HTML snippets
 - extract base url from HTML snippets
 - translate entities in HTML strings
 - convert raw HTTP headers to dicts and vice-versa
 - construct HTTP auth header
 - convert HTML pages to Unicode
 - sanitize urls (like browsers do)
 - extract arguments from urls
 
Requires Python 3.9+.
Install with: pip install w3lib
Documentation: http://w3lib.readthedocs.org/
The w3lib library is licensed under the BSD license.