This project analyzes Amazon reviews to explore the relationship between star ratings and sentiment scores. Using the VADER Sentiment Analysis tool, it uncovers insights into how users express sentiment in their reviews and visualizes patterns such as common word usage and mismatches between ratings and sentiments.
π Notebook
- Navigate to the Sentiment Analyser Notebook.
- Run the code blocks sequentially.
- The notebook processes a dataset of Amazon reviews, tokenizes the text, and applies the VADER Sentiment Analysis tool.
- Sentiment scores (compound, pos, neu, and neg) are calculated for each review.
- Results are visualized to identify patterns and trends.
- Distribution of sentiment scores by ratings.
- Common words used in 1-star and 5-star reviews.
- Correlation between text length and sentiment.
- Discrepancies between sentiment and user ratings.
- Statistical summaries such as average ratings and sentiment mismatches.
- Sentiment-Rating Correlation: A positive correlation is observed between sentiment scores and star ratings, meaning higher sentiment scores often align with higher ratings.
- Common Words: Words like βgreatβ and βloveβ are frequent in positive reviews, while negative reviews feature words such as βpoorβ or βworst.β
- Mismatch Analysis: Cases identified where sentiment scores conflict with the star ratings, often revealing nuances or limitations in sentiment analysis tools.
- Context Understanding: The VADER tool struggles with sarcasm, context, and nuanced language.
- Focus on English: Analysis is limited to English text reviews only.
- Incorporate topic modeling to explore themes across reviews.
- Compare performance with other sentiment analysis tools or models.
- Expand analysis to include product categories for more specific insights.