Pinned Repositories
BMP
Faster Learned Sparse Retrieval with Block-Max Pruning. ACM SIGIR 2024.
ciff
The inverted index exchange format defined as part of the Open-Source IR Replicability Challenge (OSIRRC) initiative
docker
Docker image for PISA
pisa
PISA: Performant Indexes and Search for Academia
Porter2
Porter2 stemming library
pypisa
A Python interface to the PISA IR engine
topk-threshold-estimation
Experiments for "A Comparison of Top-k Threshold Estimation Techniques for Disjunctive Query Processing"
trecpp
A C++ parser for the TREC document format.
wapopp
A C++ parser for the Washington Post (WaPo) format.
warcpp
A C++ parser for the Web Archive (WARC) format.
PISA's Repositories
pisa-engine/pisa
PISA: Performant Indexes and Search for Academia
pisa-engine/BMP
Faster Learned Sparse Retrieval with Block-Max Pruning. ACM SIGIR 2024.
pisa-engine/ciff
The inverted index exchange format defined as part of the Open-Source IR Replicability Challenge (OSIRRC) initiative
pisa-engine/Porter2
Porter2 stemming library
pisa-engine/pypisa
A Python interface to the PISA IR engine
pisa-engine/ecir19-bisection
Experiments for "Compressing Inverted Indexes with Recursive Graph Bisection: A Reproducibility Study".
pisa-engine/raxpp
C++ bindings for rax: https://github.com/antirez/rax
pisa-engine/taily
Implementation of the Taily algorithm, as described by Aly et al. in the 2013 paper "Taily: shard selection using the tail of score distributions."
pisa-engine/topk-threshold-estimation
Experiments for "A Comparison of Top-k Threshold Estimation Techniques for Disjunctive Query Processing"
pisa-engine/warcpp
A C++ parser for the Web Archive (WARC) format.
pisa-engine/accumulator
Benchmarking several score accumulators used in IR
pisa-engine/KrovetzStemmer
Krovetz stemming library
pisa-engine/pyciff
Python bindings for the CIFF library at https://github.com/pisa-engine/ciff
pisa-engine/tokenizer
pisa-engine/docker
Docker image for PISA
pisa-engine/trecpp
A C++ parser for the TREC document format.
pisa-engine/wapopp
A C++ parser for the Washington Post (WaPo) format.
pisa-engine/ciff-hub
Hosting some useful CIFFs
pisa-engine/mln
An implementation of the Most-Likely-Next algorithm
pisa-engine/nyt-corpus-reader
A parser and MongoDB-backed store for searching the New York Times Annotated Corpus (LDC2008T19)
pisa-engine/nytpp
A C++ parser for the New York Times (NYT) format.
pisa-engine/pisa-engine.github.io
pisa-engine/pisa-jr
Minimal implementation of PISA in Rust
pisa-engine/search-benchmark-game
pisa-engine/standard-benchmark
Standard speed regression test for PISA
pisa-engine/trec-text-rs
A parser for the TREC Text collection format