ChrisMuir/refinr

In `n_gram_merge()`, incorrect output when arg `bus_suffix = FALSE` and `ignore_strings` is non-NULL

ChrisMuir opened this issue · 0 comments

n_gram_merge() returns incorrect output when arg bus_suffix is set to FALSE and a character vector is passed to arg ignore_strings. Here's an example:

vect <- c("cats, inc", "cats, incorporated", "cats, llc")
refinr::n_gram_merge(vect, bus_suffix = FALSE, ignore_strings = "dogs")
#> [1] "cats, inc" "cats, inc" "cats, llc"

The intended output is that none of the input values are merged together. Currently, when bus_suffix = FALSE and ignore_strings is not NULL, refinr:::get_fingerprint_ngram() still runs vect through business_suffix(). This should not happen, and it is what causes the incorrect merge.
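A quick check along these lines could confirm whether a given installed version of refinr still shows the problem. It is only a sketch: it assumes refinr is installed, and it treats "output identical to input" as the expected result, based on the statement above that none of these values should be merged.

```r
library(refinr)

vect <- c("cats, inc", "cats, incorporated", "cats, llc")

# Same call as in the report: business suffix handling is turned off,
# and ignore_strings is non-NULL.
merged <- n_gram_merge(vect, bus_suffix = FALSE, ignore_strings = "dogs")

# Assumption from the report: nothing should be merged, so the output
# should be identical to the input vector.
if (identical(merged, vect)) {
  message("OK: no values were merged")
} else {
  message("Bug reproduced: ", paste(merged, collapse = " | "))
}
```

The same comparison could be wrapped in a unit test once a fix is in place, so the case with bus_suffix = FALSE and a non-NULL ignore_strings stays covered.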