Word boundary not working
vitalyli opened this issue · 2 comments
Hi!
I ran into a word boundary issue. I have a long list of tokens, some of which I
need to match on word boundaries. The standard Pattern class works, but Automaton does not.
Is there any way this can be fixed?
Automaton p0_AA = new RegExp(".*(something|\\b(blah|foo|goo)\\b)").toAutomaton();
RunAutomaton p0_RA = new RunAutomaton(p0_AA);
System.out.println(p0_RA.run("ba foo nery"));
-->false
The standard java.util.regex classes find a match with the same pattern and input:
Pattern p0 = Pattern.compile(".*(something|\\b(blah|foo|goo)\\b)");
String s = "ba foo nery";
Matcher m = p0.matcher(s);
if (m.find()) {
    System.out.println("pattern found");
} else {
    System.out.println("not found");
}
-->found
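Besides `\b` support, the two snippets above differ in one more way that is worth checking: `Matcher.find()` searches for a matching substring, while `RunAutomaton.run` accepts only if the whole input string is in the automaton's language. The sketch below uses only java.util.regex (the class name `FindVsMatches` and its helper methods are made up for illustration) to show that the same pattern also fails once the entire input has to match:

```java
import java.util.regex.Pattern;

public class FindVsMatches {
    // Same pattern as in the report, with backslashes escaped for a Java string literal.
    static final Pattern P = Pattern.compile(".*(something|\\b(blah|foo|goo)\\b)");

    // Substring search: succeeds if any part of the input matches the pattern.
    static boolean found(String s) {
        return P.matcher(s).find();
    }

    // Whole-string match: the entire input must match, which is how
    // automaton acceptance is decided.
    static boolean wholeMatch(String s) {
        return P.matcher(s).matches();
    }

    public static void main(String[] args) {
        String s = "ba foo nery";
        System.out.println(found(s));      // true: "ba foo" matches as a substring
        System.out.println(wholeMatch(s)); // false: nothing in the pattern covers " nery"
    }
}
```

So even with word boundary support, the pattern would need a trailing `.*` (or similar) to accept this input under whole-string matching.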
Please see the FAQ (https://www.brics.dk/automaton/faq.html).
You can try something like new RegExp("(.*[\\ ])?[a-z]+([\\ ].*)?")
(modify according to what delimiters and word characters you're interested in).
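The suggestion above replaces `\b` with explicit delimiter classes. A minimal sketch of that idea applied to the tokens from the report follows. It is checked here with java.util.regex using whole-string matching, not against the automaton library itself. The class name `WordBoundaryRewrite`, the method `containsToken`, and the choice of `a-z` as the only word characters are assumptions for illustration. The pattern string uses only `.`, `?`, `|`, grouping, and a negated character class, which the brics RegExp syntax also provides, so it should carry over to `new RegExp(...)`, but that is not verified here.

```java
import java.util.regex.Pattern;

public class WordBoundaryRewrite {
    // Assumption: word characters are a-z only; widen the class as needed.
    // The optional prefix must end in a non-word character and the optional
    // suffix must start with one, so the token sits either at an edge of the
    // input or between delimiters.
    static final String REGEX = "(.*[^a-z])?(blah|foo|goo)([^a-z].*)?";

    static final Pattern P = Pattern.compile(REGEX);

    // Whole-string match, mirroring how an automaton accepts or rejects input.
    static boolean containsToken(String s) {
        return P.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsToken("ba foo nery"));  // true: token between spaces
        System.out.println(containsToken("foo"));          // true: token is the whole input
        System.out.println(containsToken("ba food nery")); // false: "foo" is part of "food"
    }
}
```

Spelling out the delimiters keeps the expression regular, at the cost of having to decide up front which characters count as part of a word.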