tmdb_analysis

Movies data analysis. This data set contains information about 10,000 movies collected from The Movie Database (TMDb). The data include each movie's popularity, budget, revenue, original title, cast, runtime, genres, release date, and more.

Before starting the analysis, some issues were found in the data and treated through various solutions:

  • Many redundant columns; 20 columns are too many to explore
  • Some rows are duplicated
  • Some values are not logical, such as runtime = 0 or revenue and budget = $0
  • Some columns hold multiple values in one cell, such as genres

Then, using visuals and graphs, some questions about movie revenue were explored:

  • Whether vote score, budget, and revenue are correlated
  • What impact the voting score has on revenue
  • How genres rank by revenue in the last year

Some extra questions about each genre's release numbers:

  • Which genres dominate the market in terms of release numbers
  • How releases of the top 4 genres changed through the years
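
The numbers behind these questions could be computed along the following lines before plotting. This is a minimal sketch on made-up data: the field names, the sample values, the use of Pearson correlation, and reading "last year" as the latest release year in the data are assumptions, not the project's confirmed method.

```python
from collections import Counter
from math import sqrt

# Hypothetical cleaned rows; field names and values are assumptions.
movies = [
    {"release_year": 2014, "vote_average": 6.0, "budget": 10, "revenue": 20,
     "genres": ["Drama"]},
    {"release_year": 2014, "vote_average": 6.5, "budget": 20, "revenue": 40,
     "genres": ["Action", "Drama"]},
    {"release_year": 2015, "vote_average": 7.0, "budget": 30, "revenue": 60,
     "genres": ["Action", "Drama"]},
    {"release_year": 2015, "vote_average": 7.5, "budget": 40, "revenue": 80,
     "genres": ["Comedy", "Drama"]},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


revenue = [m["revenue"] for m in movies]
budget_revenue_r = pearson([m["budget"] for m in movies], revenue)
vote_revenue_r = pearson([m["vote_average"] for m in movies], revenue)

# Release counts per genre; a movie counts once for each of its genres.
genre_counts = Counter(g for m in movies for g in m["genres"])
top_genres = [genre for genre, _ in genre_counts.most_common(4)]

# Releases per (year, genre) for the top genres, to trace change over time.
releases_per_year = Counter(
    (m["release_year"], g)
    for m in movies
    for g in m["genres"]
    if g in top_genres
)

# Revenue ranking of genres in the latest release year.
latest_year = max(m["release_year"] for m in movies)
revenue_by_genre = Counter()
for m in movies:
    if m["release_year"] == latest_year:
        for g in m["genres"]:
            revenue_by_genre[g] += m["revenue"]
ranking = revenue_by_genre.most_common()
```

Because a movie is counted under every genre it lists, genre totals can add up to more than the number of movies or their combined revenue.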