datasets-for-sequential-sentence-classification

A curated list of public datasets that focus on sentence classification in academic papers or abstracts.

License: The Unlicense


| Name | Year | Domains | Source | Annotated by | #Papers | Text Type | Classes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CODA-19 | 2020 | Biomedical Sciences | CORD-19 | Crowdworkers | 10,966 | abstracts | (4+1) BACKGROUND, PURPOSE, METHOD, FINDING/CONTRIBUTION, OTHER |
| cs.combined | 2020 | Computer Science (cs.NI + cs.TLT + cs.TPAMI) | arXiv + IEEE Transactions | Experts | 450 | abstracts | (3) BACKGROUND, TECHNIQUE, OBSERVATION |
| CSABSTRUCT | 2019 | Computer Science | Semantic Scholar corpus | Crowdworkers | 2,189 | abstracts | (4+1) BACKGROUND, OBJECTIVE, METHOD, RESULT, OTHER |
| CS Abstracts | 2019 | Computer Science | arXiv | Crowdworkers | 654 | abstracts | (5) BACKGROUND, OBJECTIVE, METHODS, RESULTS, CONCLUSIONS |
| PubMed PICO Element Detection Dataset | 2018 | Biomedical Sciences | PubMed | Author | 24,668 | abstracts | (7) AIM, PARTICIPANTS, INTERVENTION, OUTCOME, METHOD, RESULTS, CONCLUSION |
| PubMed 200k RCT | 2017 | Biomedical Sciences | PubMed | Author | 200,000 | abstracts | (5) BACKGROUND, OBJECTIVE, METHOD, RESULT, CONCLUSION |
| PubMed 20k RCT | 2017 | Biomedical Sciences | PubMed | Author | 20,000 | abstracts | (5) BACKGROUND, OBJECTIVE, METHOD, RESULT, CONCLUSION |
| MCCRA (Multi-CoreSC CRA corpus) | 2016 | Cancer Risk Assessment (CRA) | selected by a domain expert | Experts | 50 | full paper | (11) HYPOTHESIS, MOTIVATION, BACKGROUND, GOAL, OBJECT, METHOD, EXPERIMENT, MODEL, OBSERVATION, RESULT, CONCLUSION |
| DRI Corpus (Dr. Inventor Multi-Layer Scientific Corpus) | 2015 | Computer Graphics | a bigger collection provided by experts in the domain | Experts | 40 | full paper | (5) BACKGROUND, CHALLENGE, APPROACH, OUTCOME, FUTURE WORK |
| NICTA-PIBOSO | 2011 | Biomedical Sciences | PubMed | Experts | 1,000 | abstracts | (5+1) BACKGROUND, POPULATION, INTERVENTION, OUTCOME, STUDY DESIGN, OTHER |
| ART Corpus (CoreSC) | 2010 | Physical Chemistry and Biochemistry | Royal Society of Chemistry (RSC) Publishing | Experts | 225 | full paper | (11) HYPOTHESIS, MOTIVATION, BACKGROUND, GOAL, OBJECT, METHOD, EXPERIMENT, MODEL, OBSERVATION, RESULT, CONCLUSION |
| AZ Corpus | 2002 | Computational Linguistics | arXiv | Experts & Author | 80 | full paper | (6+1) AIM, TEXTUAL, OWN, BACKGROUND, CONTRAST, BASIS, OTHER |
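
Because each dataset uses its own class scheme, working with several of them usually starts by turning the class names into integer ids. The following is a minimal sketch of that step, assuming the class lists exactly as given in the table above. The names `LABEL_SETS`, `label_to_index`, and `encode` are hypothetical and are not part of any dataset's official tooling, and the example label sequence is made up for illustration.

```python
# Class lists copied from the table above (a subset of the datasets).
# LABEL_SETS and the helpers below are illustrative names only.
LABEL_SETS = {
    "CODA-19": ["BACKGROUND", "PURPOSE", "METHOD", "FINDING/CONTRIBUTION", "OTHER"],
    "CSABSTRUCT": ["BACKGROUND", "OBJECTIVE", "METHOD", "RESULT", "OTHER"],
    "PubMed 20k RCT": ["BACKGROUND", "OBJECTIVE", "METHOD", "RESULT", "CONCLUSION"],
    "NICTA-PIBOSO": ["BACKGROUND", "POPULATION", "INTERVENTION", "OUTCOME",
                     "STUDY DESIGN", "OTHER"],
}


def label_to_index(dataset):
    """Map each class name of a dataset to an integer id, in table order."""
    return {label: i for i, label in enumerate(LABEL_SETS[dataset])}


def encode(dataset, labels):
    """Encode the per-sentence labels of one document as integer ids."""
    mapping = label_to_index(dataset)
    return [mapping[label] for label in labels]


if __name__ == "__main__":
    # A made-up label sequence for one abstract, for illustration only.
    print(encode("PubMed 20k RCT", ["BACKGROUND", "OBJECTIVE", "METHOD", "RESULT"]))
```

Keeping the ids in table order is an arbitrary choice. Any stable ordering works, as long as the same mapping is used for training and evaluation.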