Stars
Streamlit — A faster way to build and share data apps.
Chat with your database (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc.). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
Implementing an Elo rating system for Men's NCAA Basketball
Data and code behind the articles and graphics at FiveThirtyEight
A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
Guide to securing and improving privacy on macOS
A repository of LIVE malware for your own joy and pleasure. theZoo is a project created to make malware analysis open and available to the public.
A web interface to create custom vector-based visualizations on top of RAWGraphs core
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
Basic Python client implementation of the TD Ameritrade API
Private Internet Access - Chrome Extension
Practical Gremlin - An Apache TinkerPop Tutorial
Jupyter metapackage for installation, docs and chat
The Social-Engineer Toolkit (SET) repository from TrustedSec - All new versions of SET will be deployed here.
Apache Spark - A unified analytics engine for large-scale data processing
SparklingGraph provides an easy-to-use set of features that let you process large-scale graphs using Spark and GraphX.
Members of the United States Congress, 1789-Present, in YAML/JSON/CSV, as well as committees, presidents, and vice presidents.
A global repository of open address, building, and parcel data.
A topic-centric list of HQ open datasets.