Showing 1 changed file with 16 additions and 0 deletions.
# tmdb_analysis
Movie data analysis. This data set contains information about 10,000 movies collected from The Movie Database (TMDb).
The data include each movie's popularity, budget, revenue, original title, cast, runtime, genres, release date, and more.

Before starting the analysis, several issues were found in the data and treated:
- Many redundant columns: 20 columns are too many to explore
- Some rows are duplicated
- Some values are not logical, such as runtime = 0 or revenue and budget = $0
- Some columns hold multiple values in one cell, such as genres

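The four cleaning steps above could be sketched in pandas roughly as follows. This is an illustration under stated assumptions, not the code used in the analysis: the sample rows are made up, `clean_movies` and `keep_cols` are hypothetical names, the `|` separator inside the genres cell is assumed, and marking zeros as missing is only one possible treatment (dropping those rows is another).

```python
import numpy as np
import pandas as pd

# Made-up sample standing in for the real TMDb table (values are illustrative).
raw = pd.DataFrame({
    "original_title": ["A", "A", "B", "C"],
    "genres": ["Action|Drama", "Action|Drama", "Comedy", "Drama"],
    "runtime": [120, 120, 0, 95],
    "budget": [100, 100, 50, 0],
    "revenue": [300, 300, 0, 80],
    "homepage": ["x", "x", None, None],  # stands in for a redundant column
})


def clean_movies(df: pd.DataFrame, keep_cols: list) -> pd.DataFrame:
    """Hypothetical helper applying the four cleaning steps to a copy of df."""
    # 1. Keep only the columns needed for the analysis.
    # 2. Drop fully duplicated rows.
    out = df[keep_cols].drop_duplicates().copy()
    # 3. Treat illogical zeros as missing values (assumed treatment).
    num_cols = ["runtime", "budget", "revenue"]
    out[num_cols] = out[num_cols].replace(0, np.nan)
    # 4. Split multi-valued genre cells (assumed "|" separator) and
    #    give each genre its own row.
    out["genres"] = out["genres"].str.split("|")
    return out.explode("genres").reset_index(drop=True)


clean = clean_movies(
    raw, ["original_title", "genres", "runtime", "budget", "revenue"]
)
print(clean)
```

After exploding, a movie with several genres appears once per genre, so per-movie totals should be computed before this step or on a de-duplicated view.
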
Then, using visuals and graphs, some questions about movie revenue were explored:
- Whether there is a correlation between vote, budget, and revenue
- The impact of voting score on revenue
- The ranking of genres by revenue in the last year

Some extra questions about the number of releases per genre:
- Which genres dominate the market in terms of number of releases
- How the releases of the top 4 genres changed through the years

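The questions above map onto a few pandas operations, sketched here on made-up rows. The column names `vote_average` and `release_year` are assumptions (the README only mentions vote and release date), the data is assumed to be already cleaned with one row per movie and genre, summing revenue is one possible ranking metric, and the sample uses the top 2 genres instead of 4 to stay small.

```python
import pandas as pd

# Made-up, already-cleaned rows: one row per (movie, genre) pair.
movies = pd.DataFrame({
    "release_year": [2014, 2014, 2015, 2015, 2015, 2015],
    "genres": ["Drama", "Action", "Drama", "Action", "Drama", "Comedy"],
    "vote_average": [6.0, 7.0, 6.5, 7.5, 5.5, 6.0],
    "budget": [10.0, 50.0, 20.0, 80.0, 5.0, 15.0],
    "revenue": [30.0, 200.0, 40.0, 400.0, 8.0, 45.0],
})

# Pairwise Pearson correlation between vote, budget, and revenue.
corr = movies[["vote_average", "budget", "revenue"]].corr()

# Genres ranked by total revenue in the most recent year of the data.
last_year = movies["release_year"].max()
genre_revenue = (
    movies[movies["release_year"] == last_year]
    .groupby("genres")["revenue"]
    .sum()
    .sort_values(ascending=False)
)

# Number of releases per genre, most frequent first.
release_counts = movies["genres"].value_counts()

# Releases per year for the top genres (2 here; the README looks at 4).
top_genres = release_counts.head(2).index
releases_by_year = (
    movies[movies["genres"].isin(top_genres)]
    .groupby(["release_year", "genres"])
    .size()
    .unstack(fill_value=0)
)

print(corr, genre_revenue, release_counts, releases_by_year, sep="\n\n")
```

Each result is a plain Series or DataFrame, so it can be passed directly to a plotting call such as `genre_revenue.plot(kind="bar")` to produce the visuals.
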