PolMine/RcppCWB

cl_regex2id expects single string value

Closed · 1 comment

... But often you may want to get ids for multiple regexes, so it would be more efficient if this worked:

RcppCWB::cl_regex2id("REUTERS", p_attribute = "word", regex = c("oi.*", "crud.*"))

Yes ... but: No. The return value for a single regex is often several integer ids. To know which regex yields which ids, the return value of cl_regex2id() would have to be turned into a list, which is the result you already get when using lapply(). The additional performance of generating the list in pure C++ would only be relevant if getting the ids for multiple regexes was a significant bottleneck. I do not see this, so I close the issue.
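For reference, the lapply() approach could look roughly like the sketch below. It assumes the REUTERS sample corpus shipped with RcppCWB and that its registry sits in the package's extdata directory; the `registry` path and the variable names `regexes` and `ids` are illustrative assumptions, so adjust them to your own setup.

```r
library(RcppCWB)

# Assumption: registry directory of the REUTERS sample corpus bundled with RcppCWB
registry <- system.file(package = "RcppCWB", "extdata", "cwb", "registry")

regexes <- c("oi.*", "crud.*")

# One call per regex; each call returns an integer vector of matching token ids
ids <- lapply(
  regexes,
  function(rx) {
    cl_regex2id(
      corpus = "REUTERS", p_attribute = "word",
      regex = rx, registry = registry
    )
  }
)

# Name the list elements so it is clear which regex produced which ids
names(ids) <- regexes
```

The result is a named list with one integer vector per regex, so `ids[["oi.*"]]` gives the ids matched by the first pattern.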