calc_bing_word_counts.Rd
Count the number of times a word with a positive or negative sentiment occurs in a given text.
calc_bing_word_counts( x, target_col_name = NULL, text_col_name, filter_class = NULL )
Argument | Description |
---|---|
x | A data frame with one or more columns: the column with the classes (if |
target_col_name | A string with the column name of the target variable. Defaults to NULL. |
text_col_name | A string with the column name of the text variable. |
filter_class | A string or vector of strings with the name(s) of the class(es) for which to count the words. Defaults to NULL. |
A data frame with three columns: word; sentiment ("positive" or "negative"; see Hu & Liu, 2004); and count (printed as n in the example output).
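To make the shape of the returned data frame concrete, here is a minimal sketch that builds a comparable word/sentiment/count table directly with tidytext and dplyr. It is not the package's implementation: the toy data frame, its column names ("label", "feedback") and the pipeline itself are assumptions made for illustration, and the only thing taken from this page is that the sentiments come from the Bing lexicon of Hu & Liu (2004).

```r
library(dplyr)
library(tidytext)

# Toy input (made up for illustration): one class column and one text column
toy <- data.frame(
  label = c("praise", "complaint"),
  feedback = c("The staff were good and the care was great",
               "The wait was bad and the room was dirty"),
  stringsAsFactors = FALSE
)

counts <- toy %>%
  unnest_tokens(word, feedback) %>%                  # one lower-cased word per row
  inner_join(get_sentiments("bing"), by = "word") %>% # keep words found in the Bing lexicon
  count(word, sentiment, sort = TRUE)                # columns: word, sentiment, n

counts
```

The count column is named n here because that is what dplyr::count() produces by default, which matches the header printed in the examples on this page.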
When supplying more than one class in filter_class, the returned data frame will NOT separate the results for the different classes. If separation is desired, either run the function for each class separately or split the data by class and map the function over the pieces:
    # Assuming that the class and text columns are called "label" and
    # "feedback" respectively
    x %>%
      split(.$label) %>%
      purrr::map(
        ~ calc_bing_word_counts(., target_col_name = NULL,
                                text_col_name = "feedback",
                                filter_class = NULL)
      )
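Splitting by class yields a named list with one data frame per class. If a single data frame with a class column is more convenient, the list can be stacked with dplyr::bind_rows() and its .id argument. The sketch below is only an illustration: count_bing_words() is a hypothetical stand-in written with tidytext so that the example runs without experienceAnalysis installed, and the toy data and column names are assumptions. With the real function, the same bind_rows() step should apply to the list returned by purrr::map().

```r
library(dplyr)
library(purrr)
library(tidytext)

# Hypothetical stand-in for calc_bing_word_counts(); NOT the package's code
count_bing_words <- function(df, text_col_name) {
  df %>%
    unnest_tokens(word, !!rlang::sym(text_col_name)) %>%
    inner_join(get_sentiments("bing"), by = "word") %>%
    count(word, sentiment, sort = TRUE)
}

# Toy input (made up for illustration)
toy <- data.frame(
  label = c("praise", "complaint"),
  feedback = c("The staff were good and the care was great",
               "The wait was bad and the room was dirty"),
  stringsAsFactors = FALSE
)

# One result per class, as a list named after the classes
by_class <- toy %>%
  split(.$label) %>%
  map(~ count_bing_words(.x, text_col_name = "feedback"))

# Stack the list; the list names become the "label" column
combined <- bind_rows(by_class, .id = "label")

combined
```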
Hu M. & Liu B. (2004). Mining and summarizing customer reviews. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD-2004), Seattle, Washington, USA, Aug 22-25, 2004.
library(experienceAnalysis)

books <- janeaustenr::austen_books() # Jane Austen books
emma <- paste(books[books$book == "Emma", ], collapse = " ") # String with whole book
pp <- paste(books[books$book == "Pride & Prejudice", ], collapse = " ") # String with whole book

# Make data frame with books Emma and Pride & Prejudice
x <- data.frame(
  text = c(emma, pp),
  book = c("Emma", "Pride & Prejudice")
)

# Word counts for both books
calc_bing_word_counts(x, target_col_name = "book", text_col_name = "text",
                      filter_class = NULL) %>%
  head()
#>     word sentiment   n
#> 1   miss  negative 882
#> 2   well  positive 625
#> 3   good  positive 559
#> 4  great  positive 406
#> 5   like  positive 277
#> 6 better  positive 265

# Word counts for Emma
calc_bing_word_counts(x, target_col_name = "book", text_col_name = "text",
                      filter_class = "Emma") %>%
  head()
#>     word sentiment   n
#> 1   miss  negative 599
#> 2   well  positive 401
#> 3   good  positive 359
#> 4  great  positive 264
#> 5   like  positive 200
#> 6 better  positive 173

# Word counts for Pride & Prejudice
calc_bing_word_counts(x, target_col_name = "book", text_col_name = "text",
                      filter_class = "Pride & Prejudice") %>%
  head()
#>     word sentiment   n
#> 1   miss  negative 283
#> 2   well  positive 224
#> 3   good  positive 200
#> 4  great  positive 142
#> 5 enough  positive 106
#> 6 better  positive  92