Skip to content

We will use Python and unsupervised learning techniques to predict whether cryptocurrencies are affected by 24-hour or 7-day price changes. We will perform clustering using K-means and optimize the clusters using Principal Component Analysis (PCA).

Notifications You must be signed in to change notification settings

jnliou/CryptoClustering

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 

Repository files navigation

CryptoClustering Assignment

Stockimage

Table of Contents


Introduction

In this assignment, we will use Python and unsupervised learning techniques to predict whether cryptocurrencies are affected by 24-hour or 7-day price changes. We will perform clustering using K-means and optimize the clusters using Principal Component Analysis (PCA). The assignment involves the following steps:

  • Data preparation and normalization.
  • Finding the best value for 'k' using the original data.
  • Clustering cryptocurrencies with K-means using the original data.
  • Optimizing clusters with PCA.
  • Finding the best value for 'k' using PCA data.
  • Clustering cryptocurrencies with K-means using PCA data.
  • Visualizing and comparing the results.

Files

  • Crypto_Clustering.ipynb: Jupyter Notebook containing the Python code for the assignment.
  • crypto_market_data.csv: The dataset containing cryptocurrency market data.

Getting Started

  1. Clone this repository to your local machine.
  2. Open the Jupyter Notebook Crypto_Clustering.ipynb to start working on the assignment.

Data Preparation

We will load the cryptocurrency market data, normalize it using StandardScaler, and create scaled DataFrames. The data will be prepared for further analysis.


Finding the Best Value for k

To determine the optimal value for 'k' (number of clusters) using the original data, we will use the elbow method. We will visualize the inertia values for different 'k' values to find the best number of clusters.

Elbow


Clustering Cryptocurrencies with K-means

We will cluster cryptocurrencies using K-means with the best 'k' value obtained from the previous step. We will visualize the results using scatter plots and identify the impact of clustering.

Scatter


Optimizing Clusters with Principal Component Analysis (PCA)

PCA will be applied to reduce the feature dimensions. We will calculate the total explained variance of the three principal components and create a new DataFrame with PCA data. PCA

Finding the Best Value for k Using PCA Data

Similar to the previous step, we will find the best 'k' value using the PCA data. The elbow method will be employed to determine the optimal number of clusters for PCA-transformed data. ElbowPCA


Clustering Cryptocurrencies with K-means Using PCA Data

We will use K-means clustering with the best 'k' value obtained from PCA data. The results will be visualized using scatter plots to analyze the impact of using PCA for clustering. ScatterPCA

Visualizing and Comparing Results

We will create composite plots to compare the elbow curves and cryptocurrency clusters from the original data and PCA data. This will help us understand the impact of using fewer features for clustering.

About

We will use Python and unsupervised learning techniques to predict whether cryptocurrencies are affected by 24-hour or 7-day price changes. We will perform clustering using K-means and optimize the clusters using Principal Component Analysis (PCA).

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published