raghavtk/stats-sfsal

Statistics for Data Science Project

Abstract:

This research models different components of the salaries attached to various job titles in the city of San Francisco. For example, one may compare the salary of a police officer with that of a member of the fire department in terms of base pay, benefits, overtime pay or any of the other available categories. The aim has been to simplify the dataset by plotting and comparing these salaries across job titles so that the data becomes easier to understand. This would be useful for someone living in San Francisco who is deciding on a job, or for someone planning to move to San Francisco for job prospects. We start by cleaning the data. Columns in which over 70% of the values were NaN were removed. In some columns, NaN values were replaced by the mean computed after grouping by job title, and some rows with specific strings and NaN values were removed. The processed data is then analyzed and visualized: histograms, bar graphs, box plots and scatter plots are used to visualize and interpret the data. A null hypothesis and an alternative hypothesis were formulated, and a z-test showed the null hypothesis to be plausible. It was also observed that all columns are positively correlated with one another.
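The cleaning steps described in the abstract could be sketched roughly as follows. This is a minimal illustration using pandas, not the project's actual code: the function name `clean_salaries`, the column names `JobTitle`, `BasePay` and `Notes`, and the toy data are all assumptions made for the example. The removal of rows containing specific strings is omitted because the abstract does not say which strings were involved.

```python
import numpy as np
import pandas as pd


def clean_salaries(df, job_col="JobTitle", nan_threshold=0.70):
    """Hypothetical helper mirroring the cleaning steps in the abstract."""
    # Step 1: drop columns where more than 70% of the values are NaN.
    nan_fraction = df.isna().mean()
    df = df.loc[:, nan_fraction <= nan_threshold].copy()

    # Step 2: replace NaN in numeric columns with the mean of the same job title.
    numeric_cols = list(df.select_dtypes(include="number").columns)
    group_means = df.groupby(job_col)[numeric_cols].transform("mean")
    df[numeric_cols] = df[numeric_cols].fillna(group_means)

    # Step 3: drop rows that still contain NaN (no group mean was available).
    return df.dropna(subset=numeric_cols).reset_index(drop=True)


# Toy data standing in for the real dataset (values are made up).
toy = pd.DataFrame({
    "JobTitle": ["Police Officer", "Police Officer", "Firefighter", "Firefighter", "Clerk"],
    "BasePay": [100.0, np.nan, 80.0, 90.0, np.nan],
    "Notes": [np.nan, np.nan, np.nan, np.nan, "x"],
})
cleaned = clean_salaries(toy)
print(cleaned)
```

In this toy frame the `Notes` column is 80% NaN and is dropped, the missing police officer pay is filled from the other police officer row, and the clerk row is removed because its job title has no known pay to average.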

Introduction

If you were to live in a big city like San Francisco, you would probably analyze many different things about the city, and one of the primary analyses would concern the jobs you may be able to take up there. This is a primary concern because where you live, where you buy household necessities and your general standard of living all depend on the wealth you are able to accumulate, which comes primarily from your job. We believe it is necessary to analyze this data, and its numerical columns in particular, to understand which jobs are better in which respects. Our research analyzes the salary details of different jobs across the city of San Francisco in order to evaluate, in general terms, the living standard of different people across the city. We also grouped the jobs by profession and identified the highest-paying jobs among the professions in our dataset.
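The grouping step mentioned above might look something like the sketch below. It assumes a pandas DataFrame with hypothetical `JobTitle` and `TotalPay` columns and ranks job titles by mean pay; the helper name `top_paying_jobs` and the toy values are illustrative only, and the project may have grouped or ranked the jobs differently.

```python
import pandas as pd


def top_paying_jobs(df, job_col="JobTitle", pay_col="TotalPay", n=3):
    """Hypothetical helper: rank job titles by their mean pay, highest first."""
    mean_pay = df.groupby(job_col)[pay_col].mean()
    return mean_pay.sort_values(ascending=False).head(n)


# Toy data standing in for the real dataset (values are made up).
toy = pd.DataFrame({
    "JobTitle": ["Police Officer", "Police Officer", "Firefighter", "Firefighter", "Clerk"],
    "TotalPay": [100.0, 120.0, 90.0, 110.0, 50.0],
})
top = top_paying_jobs(toy, n=2)
print(top)
```

Using the mean per job title is one possible choice; the median would be less sensitive to a few unusually large salaries.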
