This repository is a comprehensive collection of data analysis projects and tutorials using Python's most powerful libraries: NumPy, Pandas, Seaborn, and Matplotlib. It is designed to help you explore, clean, visualize, and analyze data efficiently.

V41BH4VR4JPUT/Data-Analysis-with-Python

Data Analysis with Python

📌 Overview

This repository contains a collection of Jupyter Notebooks covering core data analysis tasks in Python: data cleaning, handling missing data, visualization, and reading data from different sources (CSV and TXT files, Excel workbooks, SQL databases, and HTML tables). The main libraries are Pandas, NumPy, Matplotlib, and Seaborn.

📂 Directory Structure & File Descriptions

└── v41bh4vr4jput-data-analysis-with-python/
    ├── README.md
    ├── Cleaning_not_null_values.ipynb
    ├── Handling_missing_data.ipynb
    ├── Pandas_Dataframe.ipynb
    ├── Pandas_series.ipynb
    ├── Matplotlib/
    │   └── Visualization.ipynb
    ├── Reading and Extracting data/
    │   └── data/
    │       ├── btc-market-price.csv
    │       ├── eth-price.csv
    │       ├── Reading_External_data_and_Plottng.ipynb
    │       └── .ipynb_checkpoints/
    │           └── btc-market-price-checkpoint.csv
    ├── Reading CSV and TXT files/
    │   ├── btc-market-price.csv
    │   ├── exam_review.csv
    │   ├── Main.ipynb
    │   └── out.csv
    ├── Reading Data from Relational databases/
    │   ├── chinook.db
    │   └── main.ipynb
    ├── Reading Excel Files/
    │   ├── main.ipynb
    │   ├── out.xlsx
    │   └── products.xlsx
    └── Reading HTML tables/
        └── Main.ipynb

📝 Notebooks & Descriptions

1️⃣ Data Cleaning & Handling Missing Data

  • Cleaning_not_null_values.ipynb → Techniques for cleaning values that are present (non-null) but invalid or inconsistent.
  • Handling_missing_data.ipynb → Methods for dealing with missing values in datasets using Pandas and NumPy.
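
As a rough illustration of the kind of operations these two notebooks deal with, here is a minimal sketch. The column names and values are made up for the example and are not taken from the notebooks.

```python
import numpy as np
import pandas as pd

# Hypothetical data: NaN marks a missing value.
df = pd.DataFrame({
    "price": [100.0, np.nan, 102.5, np.nan],
    "volume": [10, 12, np.nan, 9],
})

# Count the missing values in each column.
missing_per_column = df.isna().sum()

# Option 1: drop every row that has at least one missing value.
dropped = df.dropna()

# Option 2: fill the gaps, here with the column mean for price and 0 for volume.
filled = df.fillna({"price": df["price"].mean(), "volume": 0})

# Non-null but invalid values: keep only the known codes, turn the rest into NaN.
codes = pd.Series(["M", "F", "?", "F"])
valid_codes = codes.where(codes.isin(["M", "F"]))

print(missing_per_column)
print(filled)
```

Whether dropping or filling is the better choice depends on the dataset; the mean fill above is only one of several possible strategies.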

2️⃣ Pandas Basics: DataFrame & Series

  • Pandas_Dataframe.ipynb → Introduction to Pandas DataFrames, data manipulation, and transformations.
  • Pandas_series.ipynb → Understanding Pandas Series, operations, and indexing.
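
The sketch below shows the basic Series and DataFrame operations these descriptions refer to (indexing, boolean selection, and a derived column). The item names and numbers are hypothetical.

```python
import pandas as pd

# A Series is a one-dimensional array with a labelled index (hypothetical values).
prices = pd.Series([2.5, 4.0, 1.5], index=["apple", "bread", "milk"], name="price")

by_label = prices["bread"]        # label-based lookup
by_position = prices.iloc[0]      # position-based lookup
expensive = prices[prices > 2]    # boolean mask keeps matching rows only

# A DataFrame is a table whose columns are Series sharing one index.
df = prices.to_frame()
df["quantity"] = [4, 1, 2]
df["total"] = df["price"] * df["quantity"]   # element-wise column arithmetic

print(df)
```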

3️⃣ Data Visualization

  • Matplotlib/Visualization.ipynb → Creating various visualizations using Matplotlib and Seaborn, including bar charts, histograms, line plots, and scatter plots.
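
For orientation, here is a minimal Matplotlib sketch with a line plot and a histogram side by side. The data is synthetic, and the example uses only Matplotlib; it is not a reproduction of the notebook.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data for the two plots.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
samples = rng.normal(size=200)

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

ax_line.plot(x, np.sin(x), label="sin(x)")
ax_line.set_xlabel("x")
ax_line.set_ylabel("value")
ax_line.legend()

ax_hist.hist(samples, bins=20)
ax_hist.set_xlabel("sample value")
ax_hist.set_ylabel("count")

fig.tight_layout()

# Render to an in-memory PNG; in a notebook the figure is displayed inline instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```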

4️⃣ Reading and Extracting Data

  • Reading_External_data_and_Plottng.ipynb → How to read external datasets (CSV) and visualize data trends.
  • btc-market-price.csv & eth-price.csv → Sample datasets for Bitcoin and Ethereum price trends.
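
Reading a price CSV and deriving a simple trend could look roughly like the sketch below. It reads from an in-memory string so that it is self-contained; the `Timestamp` and `Price` column names and all values are assumptions for illustration, not the actual layout of the sample files.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a price file.
csv_text = """Timestamp,Price
2020-01-01,100
2020-01-02,110
2020-01-03,105
2020-01-04,120
"""

# Parse the date column and use it as the index so the data behaves as a time series.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Timestamp"], index_col="Timestamp")

# A 2-day rolling mean smooths short-term noise; the first value has no full window.
df["rolling_mean"] = df["Price"].rolling(window=2).mean()

# Day-over-day relative change.
df["daily_change"] = df["Price"].pct_change()

print(df)
```

With Matplotlib installed, `df["Price"].plot()` draws the series as a line chart with the dates on the x-axis.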

5️⃣ Reading Different File Formats

  • Reading CSV and TXT files/Main.ipynb → Techniques for reading and processing CSV and TXT files.
  • Reading Data from Relational databases/main.ipynb → Using Pandas and SQLAlchemy to extract data from SQLite databases (chinook.db).
  • Reading Excel Files/main.ipynb → Working with Excel files (out.xlsx, products.xlsx) using Pandas.
  • Reading HTML tables/Main.ipynb → Extracting and parsing data from HTML tables.
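
As a sketch of the database case, the example below queries a SQLite database into a DataFrame and then round-trips the result through CSV. It uses the standard-library `sqlite3` module rather than SQLAlchemy to stay dependency-free, and an in-memory database with a hypothetical `items` table instead of `chinook.db`.

```python
import io
import sqlite3

import pandas as pd

# Build a small throwaway database in memory (hypothetical table and rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO items (name, price) VALUES (?, ?)",
    [("pen", 1.5), ("book", 12.0), ("bag", 30.0)],
)
conn.commit()

# Run a SQL query and load the result set straight into a DataFrame.
df = pd.read_sql("SELECT name, price FROM items WHERE price > 5 ORDER BY price", conn)
conn.close()

# Write the result as CSV text and read it back.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
round_trip = pd.read_csv(buffer)

print(round_trip)
```

To query a database file on disk, pass its path to `sqlite3.connect` in place of `":memory:"`.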

🔧 Setup & Installation

Prerequisites

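Ensure Python 3.8+ is installed, then install the core libraries:

pip install numpy pandas matplotlib seaborn jupyterlab

The database, Excel, and HTML notebooks rely on packages this command does not install: SQLAlchemy, openpyxl (for .xlsx files), and an HTML parser such as lxml (for pandas.read_html).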

Run JupyterLab

Navigate to the project directory and launch JupyterLab:

cd v41bh4vr4jput-data-analysis-with-python
jupyter lab
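
If a notebook fails on import, a quick check like the following can show which packages are missing from the active environment. It is a small helper sketch, not part of the repository; `missing_packages` is a hypothetical name, and the list mirrors the pip command in the prerequisites.

```python
import importlib.util
import sys

# Import names of the packages installed by the pip command above.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "jupyterlab"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


python_ok = sys.version_info >= (3, 8)
missing = missing_packages(REQUIRED)

print("Python 3.8+:", python_ok)
print("Missing packages:", missing if missing else "none")
```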

🏆 Key Features

  • Comprehensive Data Handling – Cleaning, missing data handling, and manipulation.
  • Data Visualization – Plotting and analyzing trends with Matplotlib & Seaborn.
  • File Handling – Read and process CSV, Excel, SQL, and HTML tables.
  • Real-world Data – Work with datasets related to finance, e-commerce, and reviews.
