The O tag has high precision and recall (98). The score they report is an overall score, not broken down by type (e.g. ORG, MISC and LOC). You might want to try (weighted) averaging over the types. I have not tried that yet. Do try it and let me know.
@LopezGG
In my opinion, the overall NE F1 is calculated as:
F1 = 2 * P * R / (P + R)
where the overall precision P is calculated as:
P = #correct NEs in result / #total NEs in result
So the "O" tag is already excluded when they compute the overall F1.
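
To make that concrete, here is a minimal sketch of the entity-level computation. It is not the official evaluation script, and it rests on a few assumptions that are not stated in this thread: tags follow the BIO scheme, an NE counts as correct only if both its span and its type match exactly, and recall is #correct NEs / #total NEs in the gold data. The function names `extract_entities` and `overall_prf` are made up for illustration.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence.

    Assumes BIO tags such as "B-PER", "I-PER", "O". Tokens tagged "O"
    never produce a span, so "O" cannot contribute to the scores.
    """
    entities = set()
    start, ent_type = None, None
    # A trailing "O" sentinel closes an entity that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        # An open entity ends on "O", on any "B-", or on an "I-" of another type.
        if start is not None and (
            tag == "O" or tag.startswith("B-") or tag[2:] != ent_type
        ):
            entities.add((start, i, ent_type))
            start, ent_type = None, None
        # Open a new entity on "B-" (or on a stray "I-" with nothing open).
        if tag != "O" and start is None:
            start, ent_type = i, tag[2:]
    return entities


def overall_prf(gold_tags, pred_tags):
    """Overall entity-level precision, recall and F1 (all types pooled)."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    correct = len(gold & pred)  # exact span and type match
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0  # assumed definition of recall
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


gold = ["B-PER", "I-PER", "O", "B-LOC", "O"]
pred = ["B-PER", "I-PER", "O", "B-ORG", "O"]
print(overall_prf(gold, pred))  # (0.5, 0.5, 0.5)
```

In this toy example 4 of 5 tokens are tagged correctly, but only 1 of 2 predicted NEs is right, so the overall F1 is 0.5. That is why the "O" tokens do not inflate the reported number.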