Counts of words with a positive or negative sentiment — calc_bing_word

Count the number of times a word with a positive or negative sentiment occurs in a given text.

calc_bing_word_counts(
  x,
  target_col_name = NULL,
  text_col_name,
  filter_class = NULL
)

Arguments

x	A data frame with one or more columns: the column with the classes (if `target_col_name` is not `NULL`); and the column with the text. Any other columns will be ignored.
target_col_name	A string with the column name of the target variable. Defaults to `NULL`.
text_col_name	A string with the column name of the text variable.
filter_class	A string or vector of strings with the name(s) of the class(es) for which to count the words. Defaults to `NULL` (all rows).

Value

A data frame with three columns: word; sentiment ("positive" or "negative"; see Hu & Liu, 2004); and count (named `n` in the output shown under Examples).
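
To make the shape of the return value concrete, here is a minimal base R sketch of lexicon-based word counting. It is not the package's implementation: `count_sentiment_words()` and `toy_lexicon` are hypothetical names, and the five-word lexicon is only a stand-in for the much larger Bing lexicon.

```r
# Toy stand-in for the Bing lexicon (Hu & Liu, 2004). ASSUMPTION: these five
# entries are for illustration only; the real lexicon is far larger.
toy_lexicon <- data.frame(
  word = c("good", "great", "well", "miss", "bad"),
  sentiment = c("positive", "positive", "positive", "negative", "negative"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of experienceAnalysis.
count_sentiment_words <- function(text, lexicon = toy_lexicon) {
  # Lower-case the text and split on anything that is not a letter or apostrophe
  tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
  tokens <- tokens[tokens != ""]
  # Keep only the tokens that appear in the lexicon
  hits <- tokens[tokens %in% lexicon$word]
  # Tabulate the matches into a data frame with columns `word` and `n`
  counts <- as.data.frame(table(word = hits), responseName = "n",
                          stringsAsFactors = FALSE)
  # Attach the sentiment of each word and sort by descending count
  out <- merge(counts, lexicon, by = "word")
  out <- out[order(-out$n, out$word), c("word", "sentiment", "n")]
  rownames(out) <- NULL
  out
}

res <- count_sentiment_words(c("A good day, a great day.",
                               "I miss the good days."))
print(res)
```

The result has one row per matched word, with the same three columns described above.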

Note

When more than one class is supplied in `filter_class`, the returned data frame does NOT separate the results by class. If separate results are needed, either run the function once per class or split the data by class first, for example:

# Assuming that the class and text columns are called "label" and
# "feedback" respectively
x %>%
    split(.$label) %>%
    purrr::map(
        ~ calc_bing_word_counts(., target_col_name = NULL,
                               text_col_name = "feedback",
                               filter_class = NULL)
    )
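
The pipeline above returns a named list with one data frame per class. If a single data frame with a class column is more convenient, the list can be stacked. A minimal base R sketch follows, in which `per_class` and its counts are made-up stand-ins for the pipeline's output:

```r
# ASSUMPTION: illustrative stand-in for the named list returned by the
# split/map pipeline, one data frame of counts per class. The class names
# and numbers are made up, not real results.
per_class <- list(
  positive_feedback = data.frame(word = c("good", "miss"),
                                 sentiment = c("positive", "negative"),
                                 n = c(5L, 1L)),
  negative_feedback = data.frame(word = c("bad", "good"),
                                 sentiment = c("negative", "positive"),
                                 n = c(4L, 2L))
)

# Prepend the class name to each data frame as a `label` column
labelled <- Map(function(df, class_name) cbind(label = class_name, df),
                per_class, names(per_class))

# Stack the per-class data frames into one
combined <- do.call(rbind, labelled)
rownames(combined) <- NULL
print(combined)
```

With dplyr available, `dplyr::bind_rows(per_class, .id = "label")` should give an equivalent result in one call.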

References

Hu M. & Liu B. (2004). Mining and summarizing customer reviews. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD-2004), Seattle, Washington, USA, Aug 22-25, 2004.

Examples

library(experienceAnalysis)
books <- janeaustenr::austen_books() # Jane Austen books
# Collapse the lines of each book into a single string (text column only)
emma <- paste(books$text[books$book == "Emma"], collapse = " ")
pp <- paste(books$text[books$book == "Pride & Prejudice"], collapse = " ")

# Make data frame with books Emma and Pride & Prejudice
x <- data.frame(
  text = c(emma, pp),
  book = c("Emma", "Pride & Prejudice")
)

# Word counts for both books
calc_bing_word_counts(x, target_col_name = "book", text_col_name = "text",
                      filter_class = NULL) %>%
  head()
#>     word sentiment   n
#> 1   miss  negative 882
#> 2   well  positive 625
#> 3   good  positive 559
#> 4  great  positive 406
#> 5   like  positive 277
#> 6 better  positive 265

# Word counts for Emma
calc_bing_word_counts(x, target_col_name = "book", text_col_name = "text",
                      filter_class = "Emma") %>%
  head()
#>     word sentiment   n
#> 1   miss  negative 599
#> 2   well  positive 401
#> 3   good  positive 359
#> 4  great  positive 264
#> 5   like  positive 200
#> 6 better  positive 173

# Word counts for Pride & Prejudice
calc_bing_word_counts(x, target_col_name = "book", text_col_name = "text",
                      filter_class = "Pride & Prejudice") %>%
  head()
#>     word sentiment   n
#> 1   miss  negative 283
#> 2   well  positive 224
#> 3   good  positive 200
#> 4  great  positive 142
#> 5 enough  positive 106
#> 6 better  positive  92