
guihunwansui/CV

Data Science and Analytics Portfolio

Welcome to my Data Science and Analytics Portfolio! This repository showcases a collection of projects that demonstrate my skills in data analysis, machine learning, and algorithm development. Each project highlights my ability to work with complex datasets, apply advanced analytical techniques, and derive meaningful insights.


Chicago Crime Analysis

Overview

This project explores crime data in Chicago from 2015 to 2022, focusing on data visualization and analysis to uncover trends and patterns in criminal activities.

Key Features

  • Data cleaning and preprocessing
  • Visualization of crime numbers over time
  • Analysis of crime types and their proportions
  • Crime distribution across different districts and community areas
  • Investigation of arrest rates across districts
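
The crime-type proportions and per-district arrest rates listed above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual preprocessing code: the column names (`District`, `Primary Type`, `Arrest`) and the sample rows are assumptions made up for the example, and it uses only the standard library so it runs without dependencies.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample; the column names are assumptions, not taken from the project files.
SAMPLE = """District,Primary Type,Arrest
001,THEFT,false
001,BATTERY,true
002,THEFT,false
002,THEFT,true
002,CRIMINAL DAMAGE,false
"""


def summarize(rows):
    """Return (share of each crime type, arrest rate per district)."""
    rows = list(rows)
    total = len(rows)

    # Proportion of incidents belonging to each crime type.
    type_counts = Counter(r["Primary Type"] for r in rows)
    type_share = {t: n / total for t, n in type_counts.items()}

    # Arrest rate = incidents with an arrest / all incidents, per district.
    incidents = Counter(r["District"] for r in rows)
    arrests = Counter(
        r["District"] for r in rows if r["Arrest"].strip().lower() == "true"
    )
    arrest_rate = {d: arrests[d] / n for d, n in incidents.items()}
    return type_share, arrest_rate


type_share, arrest_rate = summarize(csv.DictReader(io.StringIO(SAMPLE)))
print(type_share)
print(arrest_rate)
```

On a real export, a pandas `groupby` would likely be the more convenient route; the counting logic stays the same.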

Tools and Technologies

  • MATLAB for data visualization and analysis
  • Python for data preprocessing and cleaning

Insights

  • Crime numbers have shown a relatively steady trend with a noticeable drop during the pandemic years.
  • Theft, battery, and criminal damage are the most prevalent types of crimes.
  • Crime distribution is more concentrated in central areas of Chicago.
  • Arrest rates vary significantly across districts, which may point to differences in law enforcement efficiency.

Files

Isle Royale Wolf-Moose Population Analysis

Overview

This project analyzes the population dynamics of wolves and moose on Isle Royale, exploring the relationships between predator and prey populations and environmental factors.

Key Features

  • Data cleaning and creation of codebooks
  • Exploratory data analysis (EDA) of population trends and environmental variables
  • Hypothesis testing and confidence interval estimation
  • Linear regression and classification models to predict population dynamics
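
As a rough illustration of the confidence-interval and linear-regression steps above, the sketch below computes a normal-approximation interval for a mean and an ordinary least-squares line. The function names and the numbers are made up for the example and do not come from the project's datasets; the notebook itself may use Statsmodels or Scikit-learn instead.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(values) / sqrt(len(values))
    centre = mean(values)
    return centre - half_width, centre + half_width


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up numbers for illustration only (not from wolf_moose_yearly.csv).
wolves = [10, 14, 18, 22, 26]
kill_rate = [1.0, 0.9, 0.8, 0.7, 0.6]

low, high = mean_confidence_interval(wolves)
slope, intercept = fit_line(wolves, kill_rate)
print(f"95% CI for mean wolf count: ({low:.1f}, {high:.1f})")
print(f"kill_rate = {slope:.3f} * wolves + {intercept:.2f}")
```

For samples this small, a Student t interval would be the more appropriate choice; the normal approximation is used here only to keep the sketch dependency-free.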

Tools and Technologies

  • Python with libraries such as Pandas, NumPy, Matplotlib, Seaborn, Statsmodels, and Scikit-learn

Insights

  • There is a moderate negative linear relationship between kill rate and wolf population.
  • Moose have a shorter lifespan on Isle Royale compared to their natural lifespan, likely due to predation and harsh environmental conditions.
  • The moose count is positively correlated with the wolf count and kill rate, and negatively correlated with predation rate and moose recruitment rate.
  • Classification models can predict the wolf count from year, moose count, and kill rate with reasonable accuracy.

Files

  • final_project.ipynb: Jupyter notebook containing the data analysis.
  • wolf_moose_yearly.csv: Dataset containing yearly counts of wolf and moose populations.
  • moose_deaths.csv: Dataset containing information about moose deaths.

Point Cloud Processing using Point Feature Histograms

Overview

This project introduces optimizations to the traditional Point Feature Histogram (PFH) method for point cloud alignment, enhancing its efficiency and scalability for real-time applications and large-scale datasets.

Key Features

  • Implementation of the PFH method with optimizations such as fast PFH computation, flattened histograms, logarithmic-scaled queries, adaptive filtering, and improved ICP recalculations
  • Evaluation of different optimization techniques on point clouds of varying sizes
  • Application to real-life room scan data
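
For readers unfamiliar with PFH, the sketch below shows the standard pair feature at its core: the three angular features and the distance computed from two oriented points via a Darboux frame. The `flat_bin_index` helper is one plausible reading of "flattened histograms" (a single 1D index instead of a 3D bin grid) and is an assumption, not the repository's implementation; all function names are hypothetical, and normals are assumed to be unit length.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _norm(a):
    return math.sqrt(_dot(a, a))


def pair_features(p_s, n_s, p_t, n_t):
    """Return (alpha, phi, theta, d) for a source/target point pair with unit normals."""
    delta = _sub(p_t, p_s)
    d = _norm(delta)
    if d == 0:
        raise ValueError("points coincide")
    direction = tuple(c / d for c in delta)

    # Darboux frame: u = n_s, v = u x direction (normalized), w = u x v.
    u = n_s
    v = _cross(u, direction)
    v_len = _norm(v)
    if v_len == 0:
        raise ValueError("source normal is parallel to the connecting line")
    v = tuple(c / v_len for c in v)
    w = _cross(u, v)

    alpha = _dot(v, n_t)
    phi = _dot(u, direction)
    theta = math.atan2(_dot(w, n_t), _dot(u, n_t))
    return alpha, phi, theta, d


def flat_bin_index(alpha, phi, theta, bins=5):
    """Map the three angular features to one index in a histogram of length bins**3."""

    def bin_of(value, low, high):
        i = int((value - low) / (high - low) * bins)
        return min(max(i, 0), bins - 1)  # clamp the upper edge into the last bin

    a = bin_of(alpha, -1.0, 1.0)
    p = bin_of(phi, -1.0, 1.0)
    t = bin_of(theta, -math.pi, math.pi)
    return a + bins * p + bins * bins * t


features = pair_features((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1))
index = flat_bin_index(*features[:3])
print(features, index)
```

A point's full descriptor accumulates these indices over all pairs in its neighbourhood, which is the step the optimizations listed above target.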

Tools and Technologies

  • Python for algorithm implementation and optimization
  • Visualization tools for point cloud alignment

Insights

  • Optimized PFH algorithms significantly reduce computational complexity and improve alignment accuracy.
  • Adaptive filtering techniques enhance robustness to noise and improve processing speed.
  • The improved ICP method achieves high-precision alignment suitable for complex objects.
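
To make the ICP step concrete, here is a minimal sketch of one plain ICP iteration: match each source point to its nearest target point, then solve for the rigid transform that best aligns the matches. It is written in 2D for brevity, where the best rotation has a closed form, and it shows textbook ICP rather than the improved variant described above. Function names and the sample points are hypothetical.

```python
import math


def nearest(point, candidates):
    """Brute-force nearest neighbour by squared Euclidean distance."""
    return min(
        candidates,
        key=lambda c: (c[0] - point[0]) ** 2 + (c[1] - point[1]) ** 2,
    )


def best_rigid_transform(src, dst):
    """Least-squares rotation angle and translation mapping src[i] onto dst[i] (2D)."""
    n = len(src)
    csx, csy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    cdx, cdy = sum(p[0] for p in dst) / n, sum(p[1] for p in dst) / n
    s_dot = s_cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - csx, ay - csy, bx - cdx, by - cdy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    angle = math.atan2(s_cross, s_dot)
    c, s = math.cos(angle), math.sin(angle)
    # Translation moves the rotated source centroid onto the target centroid.
    t = (cdx - (c * csx - s * csy), cdy - (s * csx + c * csy))
    return angle, t


def transform(points, angle, t):
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]


def icp_step(source, target):
    """One ICP iteration: nearest-neighbour matching, then rigid alignment."""
    matches = [nearest(p, target) for p in source]
    angle, t = best_rigid_transform(source, matches)
    return transform(source, angle, t)


target = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
source = transform(target, 0.1, (0.2, -0.1))  # slightly rotated and shifted copy
aligned = icp_step(source, target)
print(aligned)
```

In 3D the rotation is usually recovered with an SVD of the cross-covariance matrix, and the step is repeated until the matches stop changing. Feature descriptors such as PFH are typically used to find a good initial alignment before ICP refines it.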

Files

SURE ML for Balance Rehab

Overview

This project involves the development of machine learning models to evaluate balance performance during rehabilitation.

Key Features

  • Data extraction and conversion into spectrograms
  • Python implementation of CNN and BiLSTM models for classification
  • Fine-tuning of models to improve metrics
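
The spectrogram conversion mentioned above can be sketched as a short-time Fourier transform: slice the signal into overlapping frames, apply a window, and take the magnitude of each frame's spectrum. The frame length, hop size, and test signal below are illustrative assumptions, not the project's settings, and the naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def spectrogram(signal, frame_len=32, hop=16):
    """Return a list of frames, each a list of DFT magnitudes for bins 0..frame_len//2."""
    # Periodic Hann window to reduce spectral leakage at frame edges.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)
    ]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        magnitudes = [
            abs(
                sum(
                    frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)
                )
            )
            for k in range(n_bins)
        ]
        frames.append(magnitudes)
    return frames


# Made-up test signal: a sine that falls exactly on frequency bin 4.
signal = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spec = spectrogram(signal)
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
print(len(spec), len(spec[0]), peak_bins)
```

Each row of `spec` becomes one column of the image fed to a CNN, or one time step for a BiLSTM. On real sensor data, an FFT-based routine would replace the naive DFT for speed.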

Tools and Technologies

  • MATLAB and Python scripts

Files

UpliftAI Internship

Overview

During my internship at UpliftAI, I worked on SQL files for database setup and access management.

Key Features

  • SQL files for database setup and access management

Tools and Technologies

  • SQL

Files

Conclusion

This portfolio highlights my proficiency in various aspects of data science and analytics, from data visualization and statistical analysis to algorithm development and optimization. These projects demonstrate my ability to tackle complex problems, derive actionable insights, and implement efficient solutions. I am excited to bring these skills to a data science or analytics role and contribute to impactful projects.

Feel free to explore the code and reports in this repository. If you have any questions or would like to discuss further, please reach out to me at [email protected].

Thank you for visiting my portfolio!
