In this project, we build an end-to-end machine learning application that classifies messages into 36 disaster-related categories. This makes it possible to extract only disaster-related messages from multiple media sources, so that the appropriate disaster relief agency can be contacted for help.
The project comprises three components: an ETL pipeline, an ML pipeline, and a Flask app.
To run the project successfully, the following dependencies need to be installed in the Python environment:
- python >3.6
- scikit-learn (the `sklearn` package on PyPI is only a deprecated alias)
- nltk==3.5
- SQLAlchemy==1.3.22
- pandas==1.1.5
- numpy==1.19.4
- plotly==4.14.1
- Flask==1.1.2
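The list above can be installed with pip in one step; the `nltk` corpora downloads are an assumption based on what NLP pipelines of this kind typically need (the exact corpora depend on the tokenizer used in `train_classifier.py`):

```shell
# Install the pinned project dependencies
pip install scikit-learn nltk==3.5 SQLAlchemy==1.3.22 pandas==1.1.5 numpy==1.19.4 plotly==4.14.1 Flask==1.1.2

# NLTK resources commonly required for tokenization/lemmatization (adjust as needed)
python -m nltk.downloader punkt wordnet stopwords
```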
- Performs Extract, Transform, Load (ETL) on the data provided by Figure Eight, including `messages` and their corresponding `categories`.
- The output is stored in SQLite under the table `disaster_response`.
- Run `python process_data.py messages.csv categories.csv disaster_response.db disaster_response` in the `data` directory to execute the ETL pipeline.
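The ETL steps above can be sketched as below. This is a minimal illustration, not the actual `process_data.py`: the function names and the assumption that each `categories` cell is a `;`-separated string like `related-1;request-0` are inferred from the typical layout of this dataset.

```python
import pandas as pd
from sqlalchemy import create_engine


def load_data(messages_filepath, categories_filepath):
    """Extract: load the two CSVs and merge them on their shared id."""
    messages = pd.read_csv(messages_filepath)
    categories = pd.read_csv(categories_filepath)
    return messages.merge(categories, on="id")


def clean_data(df):
    """Transform: split the single 'categories' string into one binary column per category."""
    categories = df["categories"].str.split(";", expand=True)
    # Cells look like "related-1"; the part before the dash names the category
    categories.columns = categories.iloc[0].str.rsplit("-", n=1).str[0]
    for col in categories.columns:
        # The part after the dash is the 0/1 label
        categories[col] = categories[col].str.rsplit("-", n=1).str[1].astype(int)
    df = df.drop(columns="categories").join(categories)
    return df.drop_duplicates()


def save_data(df, database_filepath, table_name="disaster_response"):
    """Load: write the cleaned frame into a SQLite table."""
    engine = create_engine(f"sqlite:///{database_filepath}")
    df.to_sql(table_name, engine, index=False, if_exists="replace")
```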
- Loads the data, builds the model, trains it, runs cross-validation to find the best parameters, and exports the model artifact.
- The target output is a `classifier.pkl` file, which is used later for prediction in the Flask app.
- Run `python train_classifier.py ./../data/disaster_response.db classifier.pkl` in the `models` directory to execute the ML pipeline.
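One plausible shape for this pipeline is sketched below: a TF-IDF text vectorizer feeding a multi-output classifier, tuned with a grid search and exported with pickle. The estimator choice, parameter grid, and function names are illustrative assumptions, not the project's exact code (which may, for example, use a custom tokenizer).

```python
import pickle

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline


def build_model():
    """Text pipeline with one classifier per category, wrapped in a small grid search."""
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", MultiOutputClassifier(RandomForestClassifier(random_state=0))),
    ])
    # Illustrative grid; the real search space is likely larger
    param_grid = {"clf__estimator__n_estimators": [10, 50]}
    return GridSearchCV(pipeline, param_grid=param_grid, cv=2)


def save_model(model, filepath="classifier.pkl"):
    """Export the fitted model artifact for the Flask app to load."""
    with open(filepath, "wb") as f:
        pickle.dump(model, f)
```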
- Visualizes reports of the category and message genre data.
- Classifies an input message from the dashboard.
- Run `python run.py` in the `app` directory to start the Flask application.

The dashboard looks as below:
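The classification endpoint in the Flask app can be sketched as below. The app-factory pattern, route name, and JSON response shape are assumptions for illustration; the project's `run.py` likely loads `classifier.pkl` and the SQLite table at startup and renders HTML templates instead.

```python
from flask import Flask, jsonify, request


def create_app(model):
    """Build a Flask app around an already-loaded classifier (e.g. from classifier.pkl)."""
    app = Flask(__name__)

    @app.route("/go")
    def go():
        # Read the user's message from the query string and classify it
        query = request.args.get("query", "")
        labels = model.predict([query])[0]
        return jsonify({"query": query, "labels": [int(x) for x in labels]})

    return app
```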
Credit goes to Figure Eight for providing the data.
