VidalQuevedo/tweetstats

tweetstats

A set of MapReduce scripts to get descriptive stats from Twitter objects stored in MongoDB.
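
As a rough illustration of the MapReduce pattern behind a statistic such as the most used hashtags, here is a minimal pure-Python sketch. It is not the code in tweetStats.py and does not touch MongoDB: the helper names (`map_hashtags`, `reduce_counts`, `map_reduce`, `most_used_hashtags`) and the sample data are hypothetical, and lowercasing hashtags is an assumption. The only thing taken from Twitter's JSON format is the `entities.hashtags[].text` field.

```python
from collections import defaultdict


def map_hashtags(tweet):
    """Map step: emit one (hashtag, 1) pair per hashtag in a tweet."""
    for tag in tweet.get("entities", {}).get("hashtags", []):
        # Assumption: hashtags are compared case-insensitively.
        yield tag["text"].lower(), 1


def reduce_counts(key, values):
    """Reduce step: sum all counts emitted for one hashtag."""
    return sum(values)


def map_reduce(documents, map_fn, reduce_fn):
    """Group the emitted pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


def most_used_hashtags(tweets, limit=10):
    """Return the `limit` most frequent hashtags as (hashtag, count) pairs."""
    counts = map_reduce(tweets, map_hashtags, reduce_counts)
    # Sort by descending count, breaking ties alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Hypothetical sample documents shaped like Twitter JSON objects.
sample_tweets = [
    {"entities": {"hashtags": [{"text": "MongoDB"}, {"text": "python"}]}},
    {"entities": {"hashtags": [{"text": "mongodb"}]}},
    {"entities": {"hashtags": []}},
]

print(most_used_hashtags(sample_tweets, limit=2))
```

In a real run the grouping and reducing would presumably happen inside MongoDB rather than in Python; the sketch only shows the shape of the map and reduce steps.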

Requirements

To use tweetStats.py, you need:

  1. A MongoDB database with a collection of BSON documents imported from a raw Twitter dataset in JSON format.
  2. PyMongo, the Python driver for MongoDB.

Usage

  1. Download tweetStats.py

  2. Make sure your MongoDB server is running

  3. From the directory containing tweetStats.py, run:

    python tweetStats.py -cm COMMAND -db DATABASE -coll COLLECTION [-regen REGENERATE] [-lim LIMIT]

Parameters

  • -cm[--command]: The command you want tweetStats to execute. Available commands include:

      • getDescriptives: Generate basic descriptives, such as TotalNumTweets, TotalNumberOfUsers, NumberOfTweetsPerUser, MostMentionedUsers, MostUsedHashtags, and MostLinkedToUrls

      • getTotalNumberOfRTd

      • getMostRepliedToUsers

  • -db[--database]: The name of the MongoDB database to use.

  • -coll[--collection]: The name of the collection to use.

  • [-regen[--regenerate]]: (True/False) Boolean indicating whether you would like the results to be recalculated. Default: True

  • [-lim[--limit]]: (Int) Number of results to return. Default: 10
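
For reference, the flags and defaults listed above could be declared with Python's standard `argparse` module roughly as follows. This is a sketch under assumptions, not the parser used in tweetStats.py: `build_parser` and `str_to_bool` are hypothetical helpers, and the database name `twitter` and collection name `tweets` are placeholders.

```python
import argparse

# Commands documented in the Parameters section.
COMMANDS = ["getDescriptives", "getTotalNumberOfRTd", "getMostRepliedToUsers"]


def str_to_bool(value):
    """Convert the text 'True' or 'False' (any case) to a bool."""
    lowered = value.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="tweetStats.py")
    parser.add_argument("-cm", "--command", required=True, choices=COMMANDS)
    parser.add_argument("-db", "--database", required=True)
    parser.add_argument("-coll", "--collection", required=True)
    # Optional flags, with the defaults stated in the Parameters section.
    parser.add_argument("-regen", "--regenerate", type=str_to_bool, default=True)
    parser.add_argument("-lim", "--limit", type=int, default=10)
    return parser


# Placeholder database and collection names, for illustration only.
args = build_parser().parse_args(
    ["-cm", "getDescriptives", "-db", "twitter", "-coll", "tweets", "-lim", "5"]
)
print(args.command, args.database, args.collection, args.regenerate, args.limit)
```

Parsing `-regen` through a converter rather than `type=bool` matters because `bool("False")` is `True` in Python, which would silently ignore a request to skip recalculation.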
