deleting the file attachments from flattened message
schochastics commented
Is the (.)*? part of the regex really necessary? It slows down the removal of the pattern significantly.
pattern <- "(.)*?(\\s\\(file attached\\))($|\\s)"
Flat <- readLines("test.txt") # ~1 million characters
#> system.time(gsub(pattern, "", Flat, perl = TRUE))
#> user system elapsed
#> 27127.242 1.162 27129.033
That's roughly 7.5 hours, compared to practically instant when (.)*? is removed.
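One thing to check before dropping the prefix: as far as I can tell, it is not purely redundant, because the lazy (.)*? also consumes whatever precedes the marker on the same line (for example the file name). Here is a minimal sketch of the difference. The msgs vector and the variable names are made up for illustration and are not taken from the package or from test.txt.

```r
# Pattern from the comment above, and the same pattern without the lazy prefix.
with_prefix    <- "(.)*?(\\s\\(file attached\\))($|\\s)"
without_prefix <- "(\\s\\(file attached\\))($|\\s)"

# Hypothetical flattened messages (assumption: not real data from the issue).
msgs <- c("IMG-001.jpg (file attached)", "no attachment here")

# With the prefix, the match starts at the beginning of the line,
# so the file name is deleted together with the marker.
res_with <- gsub(with_prefix, "", msgs, perl = TRUE)

# Without the prefix, only the " (file attached)" marker is deleted.
res_without <- gsub(without_prefix, "", msgs, perl = TRUE)

print(res_with)    # "" "no attachment here"
print(res_without) # "IMG-001.jpg" "no attachment here"
```

So if the file name is supposed to disappear along with the marker, removing (.)*? changes the result. If only the marker needs to go, the shorter pattern should be enough.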