paawan01/Titanic_dataset_analysis

Table of Contents

  1. Installation
  2. Project Motivation
  3. File Descriptions
  4. Results
  5. Licensing, Authors, and Acknowledgements

Installation

The libraries/packages used are:

  • numpy
  • pandas
  • matplotlib
  • scikit-learn (imported as sklearn)
  • seaborn

No libraries beyond those included in the Anaconda distribution of Python are needed to run the code. The code should run without issues on any Python 3.* version.
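As a quick check before opening the notebook, the sketch below reports which of the listed packages are not installed in the current environment. The helper name `missing_packages` is hypothetical and not part of this repository; it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util

# Import names for the packages listed above
# (scikit-learn is imported under the name "sklearn").
REQUIRED = ["numpy", "pandas", "matplotlib", "sklearn", "seaborn"]


def missing_packages(names):
    """Return the top-level module names that cannot be found.

    Hypothetical helper for illustration; it only looks the modules up
    and does not import them.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, installing it with the package manager of your Python distribution should be enough.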

Project Motivation

For this project, I was interested in using the Titanic dataset from Kaggle to answer the following questions:

  • Does having family members on board increase your chance of survival?
  • Did one gender have a survival advantage?
  • Which factor played the most crucial role in passenger survival?

These questions are answered using descriptive statistics.
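A minimal sketch of how questions like these can be approached with descriptive statistics in pandas is shown below. It is not the notebook's code. It assumes the Kaggle column names `Survived`, `Sex`, `SibSp`, and `Parch`; the helper names and the toy rows are made up for illustration and are not taken from the real dataset.

```python
import pandas as pd


def add_family_flag(df):
    """Return a copy with `HasFamily`: True when SibSp + Parch > 0.

    `HasFamily` is a hypothetical column name used only in this sketch.
    """
    out = df.copy()
    out["HasFamily"] = (out["SibSp"] + out["Parch"]) > 0
    return out


def survival_rate_by(df, column):
    """Mean of the 0/1 `Survived` column per group, i.e. the survival rate."""
    return df.groupby(column)["Survived"].mean()


# Toy rows for illustration only; these are NOT real passengers.
toy = pd.DataFrame(
    {
        "Survived": [1, 0, 1, 0, 1, 0],
        "Sex": ["female", "male", "female", "male", "male", "female"],
        "SibSp": [1, 0, 0, 0, 1, 0],
        "Parch": [0, 0, 2, 0, 0, 0],
    }
)

toy = add_family_flag(toy)
print(survival_rate_by(toy, "Sex"))
print(survival_rate_by(toy, "HasFamily"))
```

Applied to the real training set, the same two calls give per-group survival rates that can be compared directly, without any inferential statistics.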

File Descriptions

The notebook 'Titanic_dataset_analysis.ipynb' answers the chosen questions using simple exploratory data analysis and descriptive statistics on the Titanic dataset; the aim is to avoid inferential statistics and machine learning. The notebook follows the Cross-Industry Standard Process for Data Mining (CRISP-DM).

'Titanic_dataset_analysis.html' is the static HTML version of the notebook.

The data folder contains two files:

  • training set (train.csv): The training set should be used to build machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger.
  • test set (test.csv): The test set should be used to see how well the model performs on unseen data. For the test set, we do not provide the ground truth for each passenger.
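The difference between the two files can be checked directly after loading them. The sketch below assumes the files sit at `data/train.csv` and `data/test.csv` relative to the working directory (an assumption based on the description above); the function names are hypothetical.

```python
import pandas as pd


def load_titanic(train_path="data/train.csv", test_path="data/test.csv"):
    """Load both CSV files into DataFrames.

    The default paths are assumptions; adjust them to your checkout.
    """
    return pd.read_csv(train_path), pd.read_csv(test_path)


def outcome_columns(train, test):
    """Columns present in the training set but absent from the test set."""
    return sorted(set(train.columns) - set(test.columns))


# Example usage (requires the data files to be present):
# train, test = load_titanic()
# print(outcome_columns(train, test))
```

Since only the training set carries the outcome, the descriptive analysis of survival is restricted to train.csv.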

The data has been taken from Kaggle's website here.

Results

The main findings of the code can be found at the post available here.

Licensing, Authors, and Acknowledgements

Credit goes to Kaggle for the data. You can find the licensing for the data and other descriptive information at the Kaggle link available here.

Credit also goes to https://github.com/jjrunner/stackoverflow/blob/master/README.md for the README template. Otherwise, feel free to use the code here as you would like!

About

Analysis on Titanic dataset
