The idea is to create a voice-enabled assistant that can run automated statistical analysis on a given dataset and build models on top of it.
Check the demo video in the DEMO folder for a glimpse of how the assistant responds to voice commands.
- Supports the OSEMN data pipeline:
  - O: Obtaining our data
  - S: Scrubbing / cleaning our data
  - E: Exploring / visualizing our data to find patterns and trends
  - M: Modeling our data to gain predictive power
  - N: Interpreting our data
- Supports uploading a CSV dataset using a file browser.
- Supports data description, data cleaning and data visualisation.
- Supports pairplot for v0.1.
- Supports cleaning NaN values from the dataset on request.
- Supports 5 classification models:
  - Logistic Regression
  - K Neighbors Classifier
  - Decision Tree Classifier
  - Gaussian NB Classifier
  - Support Vector Classifier
- Supports train-test splits and K-fold cross-validation.
- Supports multiple visualisations.
- Supports multiple models and evaluation reports.
- Supports downloading models to disk.
- Supports numerical datasets of any size.
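
To make the modelling features above concrete, here is a minimal sketch of how the five listed classifiers could be compared with a train-test split and K-fold cross-validation. It assumes scikit-learn is installed. The `evaluate` helper, the `MODELS` dictionary, and the synthetic dataset are illustrative assumptions and are not taken from the project's code.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic numerical data stands in for an uploaded CSV.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# The five classifier families listed above, with default settings.
MODELS = {
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=0),
    "NB": GaussianNB(),
    "SVC": SVC(),
}


def evaluate(models, X, y, test_size=0.2, n_splits=5, seed=0):
    """Return {name: (hold-out accuracy, mean K-fold accuracy)}."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for name, model in models.items():
        # Cross-validate on the training portion only.
        cv_scores = cross_val_score(
            model, X_train, y_train, cv=folds, scoring="accuracy"
        )
        # Refit on the full training portion, then score on the held-out part.
        model.fit(X_train, y_train)
        results[name] = (model.score(X_test, y_test), cv_scores.mean())
    return results


results = evaluate(MODELS, X, y)
for name, (holdout, cv_mean) in results.items():
    print(f"{name}: hold-out accuracy={holdout:.3f}, K-fold mean={cv_mean:.3f}")
```

Cross-validating on the training portion keeps the held-out split untouched until the final score, so the two numbers can be compared as a rough check for overfitting.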
- Install the necessary dependencies.
- Create an API access token in wit.ai.
- Update the token in the speech_to_text.py file.
- Use any of the sample datasets.
greetings = ['hey there anna', 'hello anna', 'hi anna', 'Hai anna', 'hey! anna', 'hey anna']
var1 = ['upload a dataset', 'load dataset', 'load data', 'upload data']
var2 = ['what can you do', 'show me your skills', 'show options', 'help me']
var3 = ['describe the dataset', 'what is in my dataset', 'can you describe the dataset', 'describe the data', 'describe data']
var4 = ['clean the dataset', 'clean dataset', 'clean the data', 'clean my dataset', 'clean the data set']
var5 = ['visualize', 'generate graphs', 'create graphs', 'visualize the dataset', 'create plots', 'plot the dataset', 'show the dataset', 'build plots']
var6 = ['build models', 'build a classification model', 'create a classification model', 'build model']
LR = ['logistic regression', 'logistic regression model', 'logistic regression classifier', 'regression']
KNN = ['okay neighbors', 'k n n', 'okay neighbors classifier']
CART = ['decision tree', 'decision tree classifier', 'decision trees']
NB = ['gaussian', 'bayes', 'gaussian n b', 'naive bayes']
SVC = ['support vector', 'support vector machine', 'support vector classifier']
var7 = ['bye', 'cya', 'good bye', 'thank you']
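
The phrase lists above could be used to map a transcribed utterance to an action. The following is a rough sketch of one way to do that, not necessarily how the assistant itself does it. The intent names, the `INTENTS` dictionary, and the `normalize` and `match_intent` helpers are hypothetical. Only the phrases are copied from the lists above.

```python
import re

# Phrases copied from the lists above; the intent names are hypothetical.
INTENTS = {
    "greet": ['hey there anna', 'hello anna', 'hi anna', 'Hai anna', 'hey! anna', 'hey anna'],
    "upload": ['upload a dataset', 'load dataset', 'load data', 'upload data'],
    "describe": ['describe the dataset', 'what is in my dataset', 'describe the data', 'describe data'],
    "clean": ['clean the dataset', 'clean dataset', 'clean the data', 'clean my dataset', 'clean the data set'],
    "build": ['build models', 'build a classification model', 'create a classification model', 'build model'],
    "bye": ['bye', 'cya', 'good bye', 'thank you'],
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def match_intent(utterance, intents=INTENTS):
    """Return the intent whose longest phrase occurs in the utterance, or None."""
    # Padding with spaces makes the substring test match whole words only,
    # so 'load data' does not fire inside 'upload data'.
    spoken = f" {normalize(utterance)} "
    best_length, best_intent = 0, None
    for intent, phrases in intents.items():
        for phrase in phrases:
            needle = f" {normalize(phrase)} "
            if needle in spoken and len(needle) > best_length:
                best_length, best_intent = len(needle), intent
    return best_intent


print(match_intent("Hey! Anna"))                  # greet
print(match_intent("please clean the data set"))  # clean
print(match_intent("play some music"))            # None
```

Normalizing both sides means that punctuation or capitalization differences in the speech-to-text output do not prevent a match, and preferring the longest matching phrase gives a simple tie-break when several phrases overlap.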