Hi there!
This post is about review analysis, a very useful tool for sellers, especially those who sell online.
Customers often leave feedback after a purchase. It is essential to analyze those reviews and take the necessary actions to improve the business.
Unfortunately, there are many spammers. Spammers may post unrelated reviews or post the same review multiple times, and it is difficult for a seller to tell which reviews are genuine.
So, here is a solution using R. This R script identifies irrelevant reviews, that is, reviews that contain none of the keywords we care about.
Logic of the code:
Build a list of all the words we are interested in. For example, for phone reviews the keyword list might be camera, display, heating, battery, etc.
Read all the reviews to be analyzed and score each one with a simple algorithm based on the keyword list. A review's score is the total number of occurrences of the keywords in that review.
Depending on the scores, take appropriate action. For example, a review with a score of zero can be treated as irrelevant.
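The three steps above can be tried out on a couple of in-memory strings before reading any files. This is only a minimal sketch: the keyword list, the two sample reviews, and the helper name `count_keywords` are made up for illustration, and it uses base R matching (literal, case-insensitive) instead of the stringr package so that it runs without extra libraries.

```r
# Hypothetical keyword list and reviews, for illustration only
keywords <- c("camera", "display", "heating", "battery")
reviews <- c(
  r1 = "The camera is sharp and the battery lasts two days. Great camera!",
  r2 = "Visit my site for cheap watches"
)

# Count how often each keyword occurs in one review.
# Matching is literal (fixed = TRUE) and case-insensitive (via tolower).
count_keywords <- function(review, keywords) {
  text <- tolower(review)
  vapply(tolower(keywords), function(k) {
    sum(lengths(regmatches(text, gregexpr(k, text, fixed = TRUE))))
  }, numeric(1))
}

# Total score per review: the sum of all keyword counts
scores <- vapply(reviews, function(r) sum(count_keywords(r, keywords)), numeric(1))
print(scores)
# r1 scores 3 (camera twice, battery once); r2 scores 0

# A score of zero marks the review as a candidate for removal
print(names(scores)[scores == 0])
```

Literal matching is a deliberate choice in this sketch: a keyword such as "c++" would otherwise be read as a regular expression.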
Here is the R script:
#cleanup the work space
rm(list = setdiff(ls(), lsf.str()))
#load stringr library for string operations
library(stringr)
#################################################################################
#read the important keywords
#################################################################################
keywords <- scan('keywords.txt',
                 what = 'character', comment.char = ';', sep = "\n")
#################################################################################
#read the data to be evaluated (any method of reading these files works; all reviews can also be combined in a single file)
#################################################################################
review1 <- scan('review1.txt',
                what = 'character', comment.char = ';', sep = "\n")
review2 <- scan('review2.txt',
                what = 'character', comment.char = ';', sep = "\n")
review3 <- scan('review3.txt',
                what = 'character', comment.char = ';', sep = "\n")
review4 <- scan('review4.txt',
                what = 'character', comment.char = ';', sep = "\n")
#################################################################################
#score it and compare
#################################################################################
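findScore <- function(review, k) {
  #start with an empty matrix so that every row corresponds to one keyword
  matScore <- NULL
  for (i in seq_along(k)) {
    #one row per keyword: the keyword and its number of occurrences across all lines of the review
    tDF <- c(k[i], sum(str_count(review, k[i])))
    matScore <- rbind(matScore, tDF)
  }
  return(matScore)
}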
#call above function and score reviews
matScoreR1 <- findScore(review1,keywords)
matScoreR2 <- findScore(review2,keywords)
matScoreR3 <- findScore(review3,keywords)
matScoreR4 <- findScore(review4,keywords)
#view score matrices
View(matScoreR1)
View(matScoreR2)
View(matScoreR3)
View(matScoreR4)
findRel <- function(reviewScore) {
  #column 2 holds the counts as text; convert them to numbers and add up every row
  totalScore <- sum(as.numeric(reviewScore[, 2]))
  return(totalScore)
}
#################################################################################
#find irrelevant reviews
#################################################################################
totalScoreR1 <- findRel(matScoreR1)
totalScoreR2 <- findRel(matScoreR2)
totalScoreR3 <- findRel(matScoreR3)
totalScoreR4 <- findRel(matScoreR4)
#function to find if the review is relevant or not
findIfRel <- function(score, name) {
  if (score == 0) {
    cat(name, "is irrelevant\n")
  } else {
    cat("Number of keywords found in", name, ":", score, "\n")
  }
}
#check review relevance of each review
findIfRel(totalScoreR1,"review 1")
findIfRel(totalScoreR2,"review 2")
findIfRel(totalScoreR3,"review 3")
findIfRel(totalScoreR4,"review 4")
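The script repeats the same calls for each of the four reviews. As a rough sketch of how the same scoring could scale to any number of reviews, and also flag the repeated posts mentioned at the start, the block below works on a character vector with one string per review. The function name `screen_reviews`, the column names, and the sample data are assumptions for illustration. It uses base R literal matching rather than `str_count`, and it treats two reviews as duplicates only when they are identical after lowercasing and trimming whitespace.

```r
# Hypothetical helper: score every review and flag irrelevant or repeated ones
screen_reviews <- function(reviews, keywords) {
  ids <- if (is.null(names(reviews))) as.character(seq_along(reviews)) else names(reviews)
  normalized <- trimws(tolower(reviews))
  keys <- tolower(keywords)

  # Total number of keyword occurrences in each review (literal matching)
  score <- vapply(normalized, function(text) {
    sum(vapply(keys, function(k) {
      lengths(regmatches(text, gregexpr(k, text, fixed = TRUE)))
    }, numeric(1)))
  }, numeric(1), USE.NAMES = FALSE)

  data.frame(
    review     = ids,
    score      = score,
    irrelevant = score == 0,
    # TRUE for the second and later copies of an identical review
    duplicate  = unname(duplicated(normalized)),
    stringsAsFactors = FALSE
  )
}

# Made-up sample data, for illustration only
keywords <- c("camera", "display", "heating", "battery")
reviews <- c(
  review1 = "Battery drains fast but the display is bright",
  review2 = "Buy followers at example dot com",
  review3 = "battery drains fast but the display is bright ",
  review4 = "Camera is fine, no heating issues"
)

result <- screen_reviews(reviews, keywords)
print(result)

# Keep only reviews that mention a keyword and are not repeats
kept <- result$review[!result$irrelevant & !result$duplicate]
print(kept)
```

A review read with `scan()` as several lines would first need to be joined into one string, for example with `paste(review1, collapse = " ")`.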
Eliminating irrelevant reviews alone is not enough for a seller. Many other aspects need to be analyzed, and we will discuss them in upcoming posts.
Find materials related to this post on my GitHub repo here.
Thanks for visiting my blog. I always love to hear constructive feedback. Please give your feedback in the comment section below or write to me personally here.