- Brazilian E-Commerce Marketplace is a real commercial dataset containing information on about 100 thousand orders placed from 2016 to 2018 across various marketplaces in Brazil. It supports analysis of order status, price, payment, delivery, customer location, product attributes, and customer reviews.
- The goal of this analysis is to give e-commerce businesses insights for optimizing sales strategies, increasing customer satisfaction, and improving product delivery.
- This database consists of nine tables: customers_dataset, geolocation_dataset, order_items_dataset, order_payments_dataset, order_reviews_dataset, orders_dataset, product_category_name_translation, products_dataset, and sellers_dataset. Dataset source
- This project follows a data analysis cycle:
  - Question (Business Questions)
  - Data Wrangling
  - Exploratory Data Analysis (EDA)
  - Data Visualization and Explanatory Analysis
  - Draw Conclusions and Recommendations
- Project Result Link: Click here
Business questions that will be answered through this data analysis include:
- How are the sales performance, number of ordered items, and revenue on E-Commerce in the last few months?
- What products sell the most and least?
- How is the distribution of review scores on products sold and what is the percentage for each review?
- How is the product delivery time distributed?
- What are the demographic characteristics of customers in the e-commerce marketplace sector?
- Is there a correlation between product price and shipping costs (freight_value)?
- How is the distribution of geolocation based on latitude and longitude in Brazil?
Requirements to deploy dashboard.py on Streamlit
- Make sure the `virtualenv` is ready to run for this project.
- Install all the required libraries using the following command.
  `pip install streamlit Babel geopandas matplotlib numpy pandas seaborn`
- Prepare the `requirements.txt` file.
  `pip install pipreqs`
  This package generates the `requirements.txt` file, which lists all the packages used in this project.
  `pipreqs`
  This command automatically creates the `requirements.txt` file.
- Run the Streamlit application.
  `streamlit run dashboard.py`
  This command starts the Streamlit application defined in the `dashboard.py` file.
Streamlit Dashboard Link: E-Commerce Marketplace Dashboard
Total Data from the Merge Results
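As a rough illustration of how the tables can be combined into one merged DataFrame, here is a minimal sketch. It is not the project's actual code: the rows are made up, only three of the nine tables are shown, and the join keys `order_id` and `customer_id` as well as the column `customer_state` are assumptions about the schema.

```python
import pandas as pd

# Toy stand-ins for three of the nine tables (made-up rows, assumed column names).
orders = pd.DataFrame({
    "order_id": ["o1", "o2", "o3"],
    "customer_id": ["c1", "c2", "c3"],
})
order_items = pd.DataFrame({
    "order_id": ["o1", "o1", "o2"],
    "price": [10.0, 20.0, 35.0],
    "freight_value": [2.0, 3.0, 5.0],
})
customers = pd.DataFrame({
    "customer_id": ["c1", "c2", "c3"],
    "customer_state": ["SP", "RJ", "SP"],
})

# Left joins keep every order, even one that has no item rows (o3).
merged = (
    orders
    .merge(order_items, on="order_id", how="left")
    .merge(customers, on="customer_id", how="left")
)

print(merged.shape)
print(merged.isna().sum())
```

Checking the shape and the missing-value counts after each merge is a cheap way to spot rows that were duplicated or left unmatched by a join.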
How are the sales performance, number of ordered items, and revenue on E-Commerce in the last few months?
- By orders & items
Based on the visualization above, the month with the highest number of orders is November 2017.
There was also a drastic decline in September 2018.
- By revenue
The drastic decline in the number of orders in September 2018 is likely to have a large impact on the marketplace's future profits.
Identifying the cause of this decline requires further analysis of factors such as competitor activity, marketing campaigns, and possibly incomplete data at the end of the collection period.
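A monthly summary like the one described above could be computed roughly as follows. This is a sketch on made-up rows, not the dashboard's code; the column names `order_purchase_timestamp`, `order_id`, and `price` are assumptions, and revenue is approximated here as the sum of item prices.

```python
import pandas as pd

# Made-up order lines (one row per ordered item); column names are assumed.
df = pd.DataFrame({
    "order_id": ["o1", "o1", "o2", "o3"],
    "order_purchase_timestamp": pd.to_datetime(
        ["2017-11-03", "2017-11-03", "2017-11-24", "2017-12-02"]
    ),
    "price": [10.0, 20.0, 35.0, 15.0],
})

# Group by calendar month, then count distinct orders, item rows, and total price.
month = df["order_purchase_timestamp"].dt.to_period("M")
monthly = df.groupby(month).agg(
    orders=("order_id", "nunique"),
    items=("order_id", "count"),
    revenue=("price", "sum"),
)

# Month with the highest number of orders.
best_month = monthly["orders"].idxmax()

print(monthly)
print(best_month)
```

Counting distinct `order_id` values separately from item rows matters because a single order can contain several items.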
What products sell the most and least?
Based on the plot above, the `Bed Bath Table` product category has the highest sales.
Meanwhile, the `Security and Services` category has the lowest sales.
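One way to rank categories by number of items sold is sketched below. The data is invented and the column name `product_category_name_english` is an assumption; the real analysis may aggregate differently.

```python
import pandas as pd

# Made-up item rows; the category column name is assumed.
items = pd.DataFrame({
    "product_category_name_english": [
        "bed_bath_table", "bed_bath_table", "bed_bath_table",
        "health_beauty", "health_beauty",
        "security_and_services",
    ]
})

# value_counts returns categories sorted from most to least frequent.
sales_by_category = items["product_category_name_english"].value_counts()

top_category = sales_by_category.idxmax()
bottom_category = sales_by_category.idxmin()

print(sales_by_category)
```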
How is the distribution of review scores on products sold and what is the percentage for each review?
- By number of review scores
This distribution shows how often each review score appears across the products sold.
A review score of 5 is the most frequent, with 66,264 reviews.
This suggests that most customers were very satisfied with the products they purchased.
- By percentage review score
A rating of 5 accounts for the highest share of review scores, at 56.5%.
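The counts and percentages per review score can be derived along these lines. The scores below are made up and the column name `review_score` is an assumption.

```python
import pandas as pd

# Made-up review scores; the column name is assumed.
reviews = pd.DataFrame({"review_score": [5, 5, 5, 4, 4, 3, 1, 5]})

# Absolute number of reviews per score, ordered by score.
counts = reviews["review_score"].value_counts().sort_index()

# Share of each score as a percentage of all reviews.
percentages = (
    reviews["review_score"].value_counts(normalize=True).sort_index() * 100
).round(1)

print(counts)
print(percentages)
```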
How is the product delivery time distributed?
Most deliveries take between 1 and 50 days.
A few deliveries take much longer, such as one that took 210 days (frequency of 1), but such long delivery times are very rare.
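Delivery time in days can be computed as the difference between two timestamps, as in the sketch below. The rows are invented, and the column names `order_purchase_timestamp` and `order_delivered_customer_date` are assumptions about which timestamps the analysis uses.

```python
import pandas as pd

# Made-up orders; the third one has not been delivered yet.
orders = pd.DataFrame({
    "order_purchase_timestamp": pd.to_datetime(
        ["2018-01-01 10:00", "2018-01-05 08:00", "2018-02-01 12:00"]
    ),
    "order_delivered_customer_date": pd.to_datetime(
        ["2018-01-08 15:00", "2018-01-20 09:00", None]
    ),
})

# Whole days between purchase and delivery; undelivered orders become NaN.
delivery_days = (
    orders["order_delivered_customer_date"] - orders["order_purchase_timestamp"]
).dt.days

# Drop undelivered orders before summarizing the distribution.
delivered = delivery_days.dropna()

print(delivered.describe())
```

Looking at `describe()` or a histogram of `delivered` makes rare extreme values, such as a single very long delivery, easy to spot.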
What are the demographic characteristics of customers in the e-commerce marketplace sector?
- By customer state
SP is the state that places the most orders.
- By customer city
The city of Sao Paulo has the largest number of orders.
- By payment type
Based on the output above, the most commonly used payment method is `credit_card`. Transactions paid with `debit_card` are very few in number.
Is there a correlation between product price and shipping costs (freight_value)?
The output is the DataFrame correlation_matrix, which holds the pairwise correlation coefficients between product price, freight_value, and the other variables analyzed for the E-Commerce marketplace.
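A correlation matrix of this kind can be produced with `DataFrame.corr`. The sketch below uses invented numbers, so its coefficients say nothing about the real dataset; `payment_value` is an assumed column name, and Pearson correlation is assumed as the method.

```python
import pandas as pd

# Made-up values purely for illustration; column names are assumed.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0, 50.0],
    "freight_value": [5.0, 7.0, 8.0, 12.0, 13.0],
    "payment_value": [15.0, 27.0, 38.0, 52.0, 63.0],
})

# Pairwise Pearson correlation between all numeric columns.
correlation_matrix = df.corr(method="pearson")

# Single coefficient of interest: price vs. shipping cost.
price_vs_freight = correlation_matrix.loc["price", "freight_value"]

print(correlation_matrix.round(2))
```

A coefficient close to 1 indicates that the two variables rise together; it does not by itself show that one causes the other.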
How is the distribution of geolocation based on latitude and longitude in Brazil?
The geolocation plot draws the country borders in red and overlays the geolocation points.
The image above shows that the state of SP has the most geolocation points.
- Conclusions
  - Based on the analysis of the E-Commerce Marketplace, there are several important findings. First, sales performance, number of ordered items, and revenue have been very volatile in recent months. November 2017 showed excellent performance, with a high number of orders and high revenue. However, there was a drastic decline in September 2018, which could have a negative impact on future profits.
  - The best-selling product category is `Bed Bath Table`, while `Security and Services` has the lowest sales. This shows that products in the `Bed Bath Table` category are highly popular among customers.
  - In terms of review score distribution, a score of 5 dominates with 66,264 reviews (56.5%), indicating that the majority of customers gave very good reviews to the products they purchased. This points to a high level of customer satisfaction.
  - Product delivery times mostly fall in the range of 1 to 50 days. Some deliveries take longer, but such cases are very rare.
  - In terms of customer demographics, the state of SP has the largest number of orders, and the city of Sao Paulo has the largest number of orders of any city. The most commonly used payment method is `credit_card`, while `debit_card` has a very small number of transactions.
  - There is a fairly strong positive correlation between product price and shipping cost, indicating that higher-priced products tend to have higher shipping costs. In addition, there is a strong positive correlation between product price and payment value, indicating that higher-priced products tend to come with higher payment values.
  - The geolocation distribution shows that the state of SP has the most geolocation points, while AP (by `geolocation_state`) has the fewest. Understanding geolocation distribution patterns in more depth requires further analysis.
- Recommendations
  - Based on these findings, several recommendations can be made. First, further analysis should be carried out to identify the causes of the decline in performance in September 2018, considering factors such as competitors and marketing campaigns. In addition, marketing efforts should be increased for products with low sales, such as `Security and Services`.
  - In terms of product delivery, care should be taken to ensure that deliveries arrive on time in order to meet customer expectations. In addition, it is worth considering alternative payment methods that are attractive to customers, besides `credit_card`.