TREC KBA & StreamCorpus
common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text
Earth
TREC KBA & StreamCorpus's Repositories
trec-kba/streamcorpus
common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text
trec-kba/many-stop-words
stop word lists in several languages
trec-kba/streamcorpus-pipeline
framework for making streamcorpus data
trec-kba/kba-corpus
Tools for working with TREC KBA Corpora
trec-kba/kba-tools
Tools for working with TREC KBA entities, training data, and run submissions
trec-kba/kba-stanford-corenlp
Wrappers for generating one-word-per-line output representing all the goodies from Stanford CoreNLP, so we can include it in the KBA stream corpus.
trec-kba/kba-scorer
scoring tools for TREC KBA
trec-kba/streamcorpus-elasticsearch
trec-kba/trec-kba.org
TREC KBA Website
trec-kba/kba-2012-hadoop-job
This project contains some Hadoop code for working with the TREC Knowledge Base Acceleration dataset. In particular, it provides classes to read/write topic files, read/write run files, and expose the documents in the Thrift files as Hadoop-readable objects.
trec-kba/streamcorpus-factorie
Integrates the factorie language analyzer into streamcorpus-pipeline
trec-kba/streamcorpus-opensextant
MOVED to
trec-kba/streamcorpus-scrapy
trec-kba/streamcorpus.org
Streamcorpus website