UCDenver-ccp/CRAFT

Future plans to release the full corpus of 97 articles?

Closed this issue · 2 comments

Hi all,

First of all, thank you for creating this awesome resource. The biomedical text-mining community is in desperate need of a large, high-quality and extensively annotated corpus, and this fits the bill!

I noticed that in the original CRAFT paper, a larger corpus of 97 articles is alluded to:

"The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released)."

I am just wondering: are there still plans to release the additional 30 articles, and if so, what would the timeline look like?

Thank you for the nice words. We've devoted an enormous amount of time and effort to this corpus (especially me, ha ha), and so we're very glad to see it being used by so many researchers.

We're definitely going to release the annotations for the additional 30 articles at some point. (Personally, I would've preferred for them to be released already.) But I think our PI would still like to use these annotations for a future competition. However, there's nothing planned for the near future, so unfortunately I don't have a good answer for you as to a timeline.

I see! Well, thank you for responding anyway. I'm looking forward to getting my hands on those last 30 articles :)