corvx requires Python 3.10 or later.
You can install the corvx package directly with pip:
pip install -e .
or
pip install corvx
To work with the X scraper, import the Corvx class and create a client first:
import os

from corvx import Corvx

# You can also set the tokens with environment variables X_AUTH_TOKEN and X_CSRF_TOKEN
corvx = Corvx(
    auth_token=os.getenv('X_AUTH_TOKEN'),
    csrf_token=os.getenv('X_CSRF_TOKEN'),
)
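Since os.getenv returns None for an unset variable, a missing token would otherwise go unnoticed until the first request. The following is a minimal sketch of a pre-flight check. The load_tokens helper is hypothetical and not part of corvx; only the two environment variable names come from the snippet above.

```python
import os


def load_tokens(env=None):
    """Return (auth_token, csrf_token), or raise if either one is unset or empty.

    Hypothetical helper, not part of corvx. `env` defaults to os.environ and
    can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    names = ('X_AUTH_TOKEN', 'X_CSRF_TOKEN')
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return env['X_AUTH_TOKEN'], env['X_CSRF_TOKEN']


# Intended use, assuming the Corvx constructor shown above:
# auth_token, csrf_token = load_tokens()
# corvx = Corvx(auth_token=auth_token, csrf_token=csrf_token)
```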
The available methods and their usage are described below.
corvx supports two query formats:
You can directly pass strings as queries:
queries = [
    'python programming',
    'javascript development',
]
for tweet in corvx.search(queries=queries):
    print(tweet)
For more complex searches, a JSON-based query language is available. The query must be specified as a Python dictionary containing a list of fields and global options.
The lang option forces the tweets to match a given language. The language must be specified with its ISO 639-1 two-letter code (e.g., es for Spanish).
This parameter sets the minimum allowed date. It has to be specified in the YYYY-MM-DD format.
This parameter sets the maximum allowed date. It has to be specified in the YYYY-MM-DD format.
The location has to be specified as a tuple composed of a text location and a range in miles (e.g., ('Santiago de Compostela', 15)).
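The value formats described above can be checked locally before a query is sent. This is a rough sketch only: the three helper names are hypothetical and not part of corvx, and the language check tests only the two-letter shape, not whether the code is actually assigned in ISO 639-1.

```python
import re
from datetime import datetime

_DATE_RE = re.compile(r'[0-9]{4}-[0-9]{2}-[0-9]{2}')
_LANG_RE = re.compile(r'[a-z]{2}')


def is_valid_date(value):
    """True if value is a real calendar date written strictly as YYYY-MM-DD."""
    if not isinstance(value, str) or not _DATE_RE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, '%Y-%m-%d')
    except ValueError:  # e.g. month 13 or February 30
        return False
    return True


def is_valid_lang(value):
    """True if value has the shape of a two-letter lowercase code."""
    return isinstance(value, str) and _LANG_RE.fullmatch(value) is not None


def is_valid_location(value):
    """True if value is a (non-empty place name, positive range in miles) tuple."""
    if not isinstance(value, tuple) or len(value) != 2:
        return False
    place, miles = value
    if not isinstance(place, str) or not place.strip():
        return False
    if isinstance(miles, bool) or not isinstance(miles, (int, float)):
        return False
    return miles > 0
```

The regular expression runs before strptime because strptime on its own also accepts unpadded values such as 2024-2-9.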
A query can specify multiple fields, which are Python dictionaries with one or more keys and values:
'items' is a list of strings, either terms or phrases.
If the exact-match flag is True, the specified terms or phrases must match exactly as they were written in the tweets (case/latin insensitive). If this flag is set, the target parameter will be ignored.
'match': if not specified, the tweets will match every item. Other values:
'any' (the tweets must match at least one of the items)
'none' (the tweets must not match any item)
'target': if not specified, the tweets will match ordinary keywords. Other values:
'hashtag' (tweets containing #item)
'mention' (tweets mentioning @item)
'from' (tweets written by @item)
'to' (tweets that are replies to @item)
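Because a field is a plain dictionary, a mistyped value is easy to miss. The sketch below builds one field and rejects values outside the sets listed above. The make_field helper is hypothetical and not part of corvx; the 'items' and 'match' keys appear in this README's examples, while using 'target' as the literal key name is an assumption based on the parameter name mentioned above.

```python
MATCH_MODES = {'any', 'none'}
TARGETS = {'hashtag', 'mention', 'from', 'to'}


def make_field(items, match=None, target=None):
    """Build one field dictionary, omitting keys left at their default.

    Hypothetical helper, not part of corvx. The 'target' key name is assumed.
    """
    if isinstance(items, str) or not all(isinstance(item, str) for item in items):
        raise TypeError('items must be a list of strings')
    if not items:
        raise ValueError('items must not be empty')

    field = {'items': list(items)}
    if match is not None:
        if match not in MATCH_MODES:
            raise ValueError(f'match must be one of {sorted(MATCH_MODES)}')
        field['match'] = match
    if target is not None:
        if target not in TARGETS:
            raise ValueError(f'target must be one of {sorted(TARGETS)}')
        field['target'] = target
    return field


# A query dictionary assembled from two fields and a global language option
query = {
    'fields': [
        make_field(['Santiago']),
        make_field(['Chile'], match='none'),
    ],
    'lang': 'es',
}
```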
Simple search with multiple queries:
# Simple string queries
queries = ['python', 'javascript']
for tweet in corvx.search(queries=queries):
    print(tweet)
Advanced search with a single query:
# Advanced query format
query = {
    'fields': [
        {'items': ['Santiago']},
        {'items': ['Chile'], 'match': 'none'},
    ],
    'lang': 'es',
}
for tweet in corvx.search(query=query):
    print(tweet)
Advanced search with multiple queries:
# Multiple advanced queries
queries = [
    {
        'fields': [{'items': ['python'], 'match': 'any'}],
        'lang': 'en',
    },
    {
        'fields': [{'items': ['javascript'], 'match': 'any'}],
        'lang': 'en',
    },
]
for tweet in corvx.search(queries=queries):
    print(tweet)
Search for all available results by going back in time:
for tweet in corvx.search(queries=queries, deep=True):
    print(tweet)
Limit the number of results:
for tweet in corvx.search(queries=queries, limit=100):
    print(tweet)
Control the time between API calls:
for tweet in corvx.search(queries=queries, sleep_time=30):
    print(tweet)
Search continuously for new results:
# Stream with multiple queries
queries = ['python', 'javascript']
for tweet in corvx.stream(queries=queries):
    print(tweet)
Enable debug logging to see detailed information:
import logging
# Configure logging at the start of your script
logging.basicConfig(
    level=logging.DEBUG,
    format='[%(asctime)s] %(levelname)s - %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
)
# Configure corvx logger
logger = logging.getLogger('corvx')
logger.setLevel(logging.DEBUG)
logger.propagate = True
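Setting the root level to DEBUG also turns on debug output from every other library. If only the corvx messages are of interest, one alternative is to attach a handler to the 'corvx' logger itself and leave the root logger alone. This is a sketch using the standard logging module only; the child logger name 'corvx.example' is hypothetical and used just to show that children inherit the level, and the StringIO buffer is a stand-in for a real stream or file.

```python
import io
import logging

buffer = io.StringIO()  # stand-in for sys.stderr or an open log file
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter('%(levelname)s %(name)s - %(message)s'))

corvx_logger = logging.getLogger('corvx')
corvx_logger.setLevel(logging.DEBUG)
corvx_logger.addHandler(handler)
corvx_logger.propagate = False  # keep records away from root handlers (no duplicates)

# A child logger with no level of its own inherits DEBUG from 'corvx'
logging.getLogger('corvx.example').debug('request sent')

# An unrelated logger never reaches the handler attached to 'corvx'
logging.getLogger('other').debug('not captured')

captured = buffer.getvalue()
```

With this setup, captured holds only the record emitted under the 'corvx' logger tree.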