COER

Chinese Open Entity-Relation Knowledge Base.

COER is a scalable entity and relation corpus that currently contains more than 100,000,000 relation triples, in which the relations are open and arbitrary. It is designed to make up for the lack of corpora in the field of Chinese information extraction. It is created automatically by an unsupervised open extractor from diverse and heterogeneous web text, including encyclopedias and news. The source texts cover military, sports, entertainment, economics, and other fields, which ensures the openness of the knowledge base.

The extracted triples are stored in a series of XML files. Relation items are composed of the original text, entity pairs, relation phrases, and shortest dependency paths. Each “Entity_pair” unit includes two argument entries, and every “relation_phrase” unit contains several mention entries. Entries also carry rich attributes. The content is organized as a tree structure.
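The tree layout described above could be read with Python's standard `xml.etree.ElementTree` module. The following is only a rough sketch: `Entity_pair` and `relation_phrase` are the unit names given in this description, while every other tag name (`relation_items`, `relation_item`, `original_text`, `argument`, `mention`, `shortest_dependency_path`), the `type` attribute, and the sample sentence are assumptions made up for illustration and may not match the real files.

```python
import xml.etree.ElementTree as ET

# Invented sample for illustration only. "Entity_pair" and "relation_phrase"
# come from the description; all other tags and attributes are assumptions.
SAMPLE = """
<relation_items>
  <relation_item>
    <original_text>同济大学位于上海。</original_text>
    <Entity_pair>
      <argument type="ORG">同济大学</argument>
      <argument type="LOC">上海</argument>
    </Entity_pair>
    <relation_phrase>
      <mention>位于</mention>
    </relation_phrase>
    <shortest_dependency_path>同济大学 位于 上海</shortest_dependency_path>
  </relation_item>
</relation_items>
"""


def parse_items(xml_text):
    """Return one dict per (hypothetical) relation_item element."""
    root = ET.fromstring(xml_text.strip())
    items = []
    for item in root.iter("relation_item"):
        # Two argument entries are expected under each Entity_pair unit.
        arguments = [a.text for a in item.findall("./Entity_pair/argument")]
        # A relation_phrase unit may hold several mention entries.
        mentions = [m.text for m in item.findall("./relation_phrase/mention")]
        items.append({
            "text": item.findtext("original_text"),
            "arguments": arguments,
            "mentions": mentions,
            "path": item.findtext("shortest_dependency_path"),
        })
    return items


if __name__ == "__main__":
    for entry in parse_items(SAMPLE):
        print(entry["arguments"], entry["mentions"])
```

For the full corpus, which spans many large XML files, a streaming reader such as `ET.iterparse` would likely be a better fit than loading each file into memory at once.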

Please email us to get more information. Contact: shengbinjia@tongji.edu.cn