From bc51289b25965d7d0e38a9f42c84a2f7c7fcd6b5 Mon Sep 17 00:00:00 2001
From: Amr Yasser <74487632+Odd-Baron@users.noreply.github.com>
Date: Sun, 29 Aug 2021 09:52:45 +0200
Subject: [PATCH] Update README.md

---
 README.md | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/README.md b/README.md
index e9244bb..a07fa31 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,18 @@
 # tmdb_analysis
 Movies data analysis, This data set contains informationa about 10,000 movies collected from The Movie Database (TMDb)
+The data include each movie's popularity, budget, revenue, original title, cast, runtime, genres, release date, and more.
+
+Before starting the analysis, several issues were found in the data and addressed:
+- Many redundant columns: 20 columns are too many to explore
+- Some rows are duplicated
+- Some values are not logical, such as runtime = 0 or revenue and budget = $0
+- Some columns hold multiple values in one cell, such as genres
+
+Then, using visuals and graphs, some questions about movie revenue were explored:
+- Whether there is a correlation between vote, budget, and revenue
+- What impact the voting score has on revenue
+- How genres rank by revenue in the last year
+Some extra questions about the number of releases per genre:
+- Which genres dominate the market in terms of number of releases
+- How releases of the top 4 genres changed through the years
+
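
The four cleaning steps listed in the README could be sketched roughly as below. This is a minimal, dependency-free illustration, not the project's actual code. The sample rows, the column names (`original_title`, `homepage`, and so on), the `KEEP` list, and the `|` separator inside `genres` are all assumptions made for the example. Dropping rows with zero values is only one possible treatment, since the README does not say which one was used.

```python
# Hypothetical sample rows; column names and the "|" genre separator are assumptions.
rows = [
    {"original_title": "A", "runtime": 120, "budget": 1000, "revenue": 5000,
     "genres": "Action|Drama", "homepage": "x"},
    {"original_title": "A", "runtime": 120, "budget": 1000, "revenue": 5000,
     "genres": "Action|Drama", "homepage": "x"},   # exact duplicate of the first row
    {"original_title": "B", "runtime": 0, "budget": 500, "revenue": 900,
     "genres": "Comedy", "homepage": "y"},         # illogical runtime
    {"original_title": "C", "runtime": 95, "budget": 0, "revenue": 0,
     "genres": "Drama", "homepage": "z"},          # illogical budget and revenue
    {"original_title": "D", "runtime": 100, "budget": 200, "revenue": 800,
     "genres": "Comedy|Drama", "homepage": "w"},
]

# Columns assumed to be worth keeping; everything else is treated as redundant.
KEEP = ("original_title", "runtime", "budget", "revenue", "genres")


def clean(rows):
    """Drop duplicate rows, redundant columns, and rows with zero-valued fields."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))      # identity of the full row
        if key in seen:                       # skip exact duplicates
            continue
        seen.add(key)
        slim = {k: row[k] for k in KEEP}      # keep only the chosen columns
        if 0 in (slim["runtime"], slim["budget"], slim["revenue"]):
            continue                          # drop rows with illogical zeros
        cleaned.append(slim)
    return cleaned


def explode_genres(rows):
    """Turn one row with 'G1|G2' into one row per genre."""
    out = []
    for row in rows:
        for genre in row["genres"].split("|"):
            out.append({**row, "genres": genre})
    return out


cleaned = clean(rows)
by_genre = explode_genres(cleaned)
```

With the sample above, `cleaned` keeps only movies "A" and "D", and `by_genre` holds one row per (movie, genre) pair. That per-genre shape is what the questions about release counts per genre would need.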