Project
Using the Yelp dataset, we built a predictive analytics application that helps identify which emotions have the greatest influence on Yelp ratings in Charlotte, based on the correlation between the sentiment analysis of user reviews and the average star rating of each business.
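As a rough illustration of the correlation step described above, the sketch below averages one emotion score per business and computes its Pearson correlation with the business's average stars. The sample records, the field layout, and the helper `pearson` are assumptions made up for this example and are not the project's actual code or data.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Hypothetical per-review records: (business_id, joy_score, business_avg_stars).
reviews = [
    ("b1", 4, 4.5),
    ("b1", 6, 4.5),
    ("b2", 1, 2.0),
    ("b2", 3, 2.0),
    ("b3", 2, 3.5),
    ("b3", 4, 3.5),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Group the emotion scores by business and remember each business's rating.
scores = defaultdict(list)
stars = {}
for business_id, joy, business_stars in reviews:
    scores[business_id].append(joy)
    stars[business_id] = business_stars

# One data point per business: mean emotion score vs. average stars.
businesses = sorted(scores)
mean_joy = [mean(scores[b]) for b in businesses]
star_ratings = [stars[b] for b in businesses]

r_joy = pearson(mean_joy, star_ratings)
print(f"joy vs. average stars: r = {r_joy:.3f}")
```

Repeating the same calculation for each emotion gives one coefficient per emotion, which makes it easy to compare how strongly each one tracks the ratings.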
TECHNOLOGIES USED:
R
Sentiment analysis conducted on user reviews using the Syuzhet library.
Pandas
Dataset cleansing, filtering, and merging into a single dataset for further analysis.
Matplotlib
Visualization of the multiple linear regression applied to the dataset.
SciKits
Multiple linear regression and decision tree analysis applied to the dataset.
HTML
Project website development.
Bootstrap
Project website responsiveness and design.
Dataset
Dataset used: Yelp Reviewer Dataset
Compiles information on 11 metropolitan areas: Edinburgh (UK), Stuttgart (Germany), Montreal (Canada), Toronto (Canada), Pittsburgh, Charlotte, Champaign-Urbana, Phoenix, Las Vegas, Madison, and Cleveland.
NRC Emotion Lexicon
List of words and their associations with eight emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive).
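To make the lexicon lookup concrete, here is a minimal sketch of word-level emotion counting. The tiny `TOY_LEXICON` is invented for illustration and is not taken from the NRC Emotion Lexicon. The project ran this step in R with Syuzhet, so the Python function `emotion_counts` is a hypothetical stand-in, not the actual implementation.

```python
import re
from collections import Counter

# Invented sample entries; the real NRC lexicon covers thousands of words.
TOY_LEXICON = {
    "delicious": {"joy", "positive"},
    "friendly": {"joy", "trust", "positive"},
    "rude": {"anger", "disgust", "negative"},
    "dirty": {"disgust", "negative"},
}

def emotion_counts(text, lexicon=TOY_LEXICON):
    """Count how many words in `text` are associated with each category."""
    counts = Counter()
    # Lowercase the text and split it into simple word tokens.
    for word in re.findall(r"[a-z']+", text.lower()):
        # Words missing from the lexicon contribute nothing.
        for category in lexicon.get(word, ()):
            counts[category] += 1
    return counts

review = "Delicious food and friendly staff, but the table was dirty."
counts = emotion_counts(review)
print(dict(counts))
```

Each review then yields one count per emotion and sentiment, which can be stored as columns alongside the review's other fields.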
Extra variables
Yelp allows readers to tag reviews with three attributes: “cool”, “useful”, and “funny”. These were included to gain additional context on how the reviews were interpreted.
Yelp Reviews Analyzed