Guest Review Text Analysis
With so much information about the hospitality industry available online, travelers can draw on travel review platforms during their research before making a reservation. These reviews contain previous guests' evaluations of their stay. They can influence future customers, working to a host's advantage when they are positive and to a host's disadvantage when they are negative, and most travelers are known to use them to build trust. This analysis explores Airbnb guest reviews using text analysis to gain insight into the overall performance of hosts. Even though reviews are subjective opinions of the guest experience, whether positive or negative, hosts can use them to improve their service, which in turn improves the guest experience at their property.
Area of concentration
This analysis concentrates on the overall guest experience across aspects such as property features, overall and specific satisfaction, communication with the host, and the service from start to finish. The reviews are explored using frequent words, bigrams, and sentiment scores. The analysis is conducted in Python using pandas for data manipulation, nltk for text cleaning and analysis, and plotly and wordcloud for visualization.
Note that reviews are submitted after the guest checks out of a rental property.
The raw review text will be cleaned and transformed by:
- Converting all text to lower case.
- Removing all automated bot texts.
- Removing all punctuation and empty text.
- Removing all numbers, stopwords and words that do not impact the analysis.
- Lemmatizing words, e.g. "recommended" is transformed to "recommend".
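The cleaning steps above can be sketched roughly as follows. This is a minimal illustration rather than the code used for the analysis: the stopword set, the lemma dictionary, and the bot-text pattern are small hypothetical stand-ins, and a real run would more likely rely on nltk's stopword list and WordNet lemmatizer.

```python
import re

# Hypothetical, deliberately tiny resources used only for illustration.
# In practice nltk's stopword corpus and WordNetLemmatizer would fill this role.
STOPWORDS = {"the", "a", "an", "and", "was", "is", "to", "we", "i", "it", "in", "of"}
LEMMAS = {"recommended": "recommend", "rooms": "room", "hosts": "host"}

# Assumed marker for automated postings; the actual pattern is not stated in the text.
BOT_PATTERN = re.compile(r"this is an automated posting", re.IGNORECASE)


def clean_review(text):
    """Return the cleaned tokens of one review, or [] if it should be dropped."""
    if not text or BOT_PATTERN.search(text):
        return []  # empty text or automated bot text
    text = text.lower()
    # Replace anything that is not a lowercase ASCII letter or whitespace
    # (punctuation, digits, accented characters) with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Drop stopwords, then map each remaining token to its lemma if one is known.
    return [LEMMAS.get(tok, tok) for tok in text.split() if tok not in STOPWORDS]
```

For example, `clean_review("We HIGHLY recommended the 2 rooms!")` keeps only the tokens "highly", "recommend", and "room".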
Overall guest reviews
The top 10 most frequent words used by guests include positive words such as "clean", "great", and "nice", as well as neutral words such as "location", "room", "host", and "apartment". In addition, the overall sentiment summary reveals that the majority of reviews are neutral, which suggests that most guests did not express any particular feeling or emotion. A further 45.1% of reviews were positive about the host and accommodation, while only a small fraction, 0.45%, were negative.
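As a rough sketch of how such a summary could be produced, the snippet below counts the most frequent words and computes the share of each sentiment label. The function names are hypothetical, and the rule that turns a numeric sentiment score into a label (a symmetric threshold of 0.05) is an assumption, since the scoring method and cut-offs are not stated here.

```python
from collections import Counter


def top_words(token_lists, n=10):
    """Return the n most frequent words across all cleaned reviews."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    return counts.most_common(n)


def label_from_score(score, threshold=0.05):
    """Map a numeric sentiment score to a label (threshold is an assumption)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def sentiment_share(labels):
    """Return the percentage of reviews carrying each sentiment label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}
```

With pandas, the same shares could be obtained from a label column with `value_counts(normalize=True)`.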
Common word cloud
A word cloud of recurring words from the overall guest reviews points to a large number of words relating to recommendations, the lodging's location, the tidiness of the accommodation, convenience, and so on.
Selected positive & negative guest review words
The summary bar chart shows how often the selected positive and negative words were used by guests: 33.2% of guests used the word "clean" while just 0.6% used the word "dirty", and 8.8% used the word "convenient" while just 0.18% used the word "poor".
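A percentage of this kind can be computed as the share of reviews in which a word appears at least once. The sketch below assumes one review per guest and counts a word once per review no matter how often it is repeated; whether the figures above were derived per review or per guest is not stated, and the function name is hypothetical.

```python
def share_of_reviews_with(word, token_lists):
    """Percentage of reviews whose cleaned tokens contain `word` at least once."""
    if not token_lists:
        return 0.0
    hits = sum(1 for tokens in token_lists if word in tokens)
    return 100 * hits / len(token_lists)


# Toy data for illustration only, not taken from the Airbnb reviews.
sample_reviews = [["clean", "room"], ["dirty"], ["clean", "clean"], ["nice"]]
clean_share = share_of_reviews_with("clean", sample_reviews)
```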
Bigrams of guest reviews
Looking at the positive bigrams in guest reviews, the plot shows the number of times each pair of words was used to describe the location of the host's property, the guest's intention to recommend the host to other people, the host's personality traits, and the overall tidiness of the accommodation.
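A bigram is simply a pair of adjacent words, so the counts can be sketched by pairing each token with the one that follows it. This minimal version uses only the standard library (nltk offers an equivalent `bigrams` helper), and the function name is hypothetical.

```python
from collections import Counter


def bigram_counts(token_lists):
    """Count adjacent word pairs across all cleaned reviews."""
    counts = Counter()
    for tokens in token_lists:
        # zip pairs each token with its successor; pairs never span two reviews.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Toy data for illustration only.
sample = [["great", "location", "highly", "recommend"], ["great", "location"]]
sample_counts = bigram_counts(sample)
```

Counting within each review separately avoids creating spurious pairs from the last word of one review and the first word of the next.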
Review by host
There are 1,403 unique hosts in the data. Because this is a large number, only a few selected hosts are explored.
The summary focuses on the hosts with the highest counts of words frequently used by guests. At the host level, neutral is again the most common sentiment expressed by guests.
Number of reviews per host by sentiment
Sentiment | Minimum | Average | Median | Maximum
---|---|---|---|---
Negative | 1 | 1.571 | 1.0 | 9
Neutral | 1 | 18.684 | 6.0 | 729
Positive | 1 | 15.186 | 6.0 | 487
According to the table, the highest number of negative reviews a single host received is 9, while the highest number of positive reviews is 487. The average number of neutral reviews per host is 18.68, while the average number of negative reviews per host is 1.57.
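Summary statistics like these can be derived by first counting reviews per host and sentiment, then summarizing those counts for each sentiment. The sketch below is an assumption about the approach rather than the original code, and the function name and input layout are hypothetical. Hosts with no reviews of a given sentiment contribute nothing instead of a zero, which would be consistent with a minimum of 1 for every sentiment.

```python
from collections import Counter, defaultdict
from statistics import mean, median


def sentiment_stats_by_host(rows):
    """Summarize per-host review counts for each sentiment.

    rows: iterable of (host_id, sentiment_label) tuples, one per review.
    """
    per_host = Counter(rows)  # (host_id, sentiment) -> number of reviews
    counts_by_sentiment = defaultdict(list)
    for (_host, sentiment), n in per_host.items():
        counts_by_sentiment[sentiment].append(n)
    return {
        sentiment: {
            "min": min(counts),
            "mean": mean(counts),
            "median": median(counts),
            "max": max(counts),
        }
        for sentiment, counts in counts_by_sentiment.items()
    }


# Toy data for illustration only.
toy_rows = [("a", "positive"), ("a", "positive"), ("a", "neutral"),
            ("b", "positive"), ("b", "negative")]
toy_stats = sentiment_stats_by_host(toy_rows)
```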
Top host by selected review bigrams
Focusing on positive bigrams from guest reviews, the chart shows the top 10 hosts by the number of times each word pair was used. Host Jose/Jason has a high number of positive guest reviews relating to "Great Location" and "Highly Recommend", which indicates that one of their biggest strengths is being in a very good location. Soledad & Rodrigo had the second-highest count of reviews with "Highly Recommend" and "Clean Comfortable", with 41 and 40 respectively, while also coming 5th for "Great Host". Other hosts that appear repeatedly in the selected categories include Will, Robert, Ravi, Dror, and Izzy.
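A ranking like this can be sketched by counting, for each host, the reviews that contain a given bigram and keeping the top entries. The function name and the (key, tokens) input layout are hypothetical, and each review is counted once per bigram even if the pair is repeated in it, which is an assumption. Since the grouping key is arbitrary, the same function works for a host name or a neighborhood.

```python
from collections import Counter


def top_groups_for_bigram(reviews, bigram, n=10):
    """Rank groups by the number of reviews containing `bigram`.

    reviews: iterable of (group_key, tokens) pairs, e.g. (host_name, tokens).
    bigram:  a tuple of two words, e.g. ("great", "location").
    """
    counts = Counter()
    for key, tokens in reviews:
        # Membership test against the review's adjacent word pairs.
        if bigram in zip(tokens, tokens[1:]):
            counts[key] += 1
    return counts.most_common(n)


# Toy data for illustration only; host names are placeholders.
toy_reviews = [
    ("host_a", ["great", "location"]),
    ("host_a", ["great", "location", "great", "location"]),
    ("host_b", ["great", "host"]),
    ("host_b", ["great", "location"]),
]
ranking = top_groups_for_bigram(toy_reviews, ("great", "location"))
```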
By Host Location
The bigram count summary shows the number of guest reviews containing the selected words within the top 10 host locations. Jamaica Plain has the highest number of guests willing to book again, while Fenway/Kenmore has the highest count of guests who enjoyed the host environment.
Conclusion
The majority of reviews carry neutral sentiment, which indicates that many guests did not express any specific emotion about their experience with the host's service. In addition, more guests expressed some kind of satisfaction and comfort than did not. Most hosts also have very few negative reviews, with an average of approximately 2 negative reviews per host.