diff --git a/README.md b/README.md
index e9244bb..a07fa31 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,18 @@
 # tmdb_analysis
 Movies data analysis, This data set contains informationa about 10,000 movies collected from The Movie Database (TMDb)
+The data include each movie's popularity, budget, revenue, original title, cast, runtime, genres, release date, and more.
+
+Before starting the analysis, some issues were found in the data and treated in various ways:
+- Many redundant columns; 20 columns are too many to explore
+- Some rows are duplicated
+- Some values are not logical, such as runtime = 0 or revenue and budget = $0
+- Some columns hold multiple values in one cell, such as genre
+
+Then, using visuals and graphs, some questions about movie revenue were explored:
+- Whether there is a correlation between vote, budget, and revenue
+- What impact the voting score has on revenue
+- The ranking of genres by revenue in the last year
+Some extra questions about the number of releases in each genre:
+- Which genres dominate the market in terms of number of releases
+- How the releases of the top 4 genres changed through the years
+
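Review note: the four data issues listed in the added README text could be handled along the lines of the sketch below. It is only an illustration, not the repository's actual code. The column names, the `|` genre separator, the `clean_movies` helper, and the choice to drop rows with zero values are all assumptions; the README only says the issues were treated, not how.

```python
# Hypothetical column names; the README names the fields only loosely.
KEEP = ["original_title", "genres", "runtime", "budget", "revenue", "vote_average"]


def clean_movies(rows, keep=KEEP, sep="|"):
    """Apply the four cleaning steps to a list of dict rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # 1. Keep only the columns needed for the analysis.
        slim = {col: row.get(col) for col in keep}
        # 2. Skip rows that repeat an earlier row on the kept columns.
        key = tuple(slim[col] for col in keep)
        if key in seen:
            continue
        seen.add(key)
        # 3. Drop rows whose runtime, budget, or revenue is zero or missing
        #    (an assumed treatment; imputing would be another option).
        if any(not slim[col] for col in ("runtime", "budget", "revenue")):
            continue
        # 4. Split the multi-valued genre cell into one row per genre
        #    (the "|" separator is an assumption).
        for genre in filter(None, (slim["genres"] or "").split(sep)):
            cleaned.append({**slim, "genres": genre})
    return cleaned


# Made-up sample rows, for illustration only.
sample = [
    {"id": 1, "original_title": "A", "genres": "Action|Drama", "runtime": 100,
     "budget": 10, "revenue": 50, "vote_average": 7.0, "homepage": "x"},
    {"id": 1, "original_title": "A", "genres": "Action|Drama", "runtime": 100,
     "budget": 10, "revenue": 50, "vote_average": 7.0, "homepage": "x"},
    {"id": 2, "original_title": "B", "genres": "Comedy", "runtime": 0,
     "budget": 5, "revenue": 8, "vote_average": 6.0, "homepage": "y"},
]
cleaned = clean_movies(sample)
```

With one row per genre, the genre questions in the README (revenue ranking, release counts per year) reduce to grouping on the `genres` field. Note that revenue totals summed across genres will then count a multi-genre movie more than once.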