proycon/colibri-core
Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, of either fixed or dynamic size) in a quick and memory-efficient way. At its core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
C++ · GPL-3.0
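To make the terms in the description concrete, here is a minimal pure-Python sketch of what n-grams and fixed-size skipgrams look like for a short token sequence. It does not use the Colibri Core API at all: the function names `ngrams` and `skipgrams` and the `GAP` placeholder are assumptions made for this illustration, and the sketch makes no attempt at the memory efficiency the library is built for.

```python
from itertools import combinations

# Placeholder standing in for one skipped token (chosen for this sketch only).
GAP = "{*}"


def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skipgrams(tokens, n):
    """Return every n-token window with one or more inner tokens replaced by GAP.

    The first and last token of each window are always kept, so every gap
    has a fixed size and is bounded by real tokens on both sides.
    """
    result = []
    for window in ngrams(tokens, n):
        inner = range(1, n - 1)  # positions that are allowed to be skipped
        for k in range(1, len(inner) + 1):
            for skipped in combinations(inner, k):
                result.append(
                    tuple(GAP if i in skipped else tok
                          for i, tok in enumerate(window))
                )
    return result


tokens = "to be or not to be".split()
print(ngrams(tokens, 2))
print(skipgrams(tokens, 3))
```

Gaps of dynamic size (which several issue titles below call flexgrams) would need a variable-width match instead of a fixed placeholder per token, and are left out of this sketch.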
Issues
Package for Alpine Linux
#45 opened by proycon - 1
Implement begin-of-sentence and end-of-sentence markers in pattern model training
#11 opened by proycon - 2
Can't compile on CentOS 6.6
#22 opened by andreasvc - 3
Wrong threshold in model.filter
#39 opened by svetlana21 - 2
Problems compiling with anaconda
#43 opened by redadmiral - 5
Unable to load large corpora into memory because PatternPointer length can't exceed 2^32 bytes (32 bit size descriptor)
#42 opened by proycon - 1
Missing data in indexed model on large data set; yields much lower counts than unindexed model on the same data with the same parameters!
#41 opened by proycon - 4
how to expose colibri-ngrams from Python API?
#38 opened by mikkokotila - 2
Error with Tibetan Unicode
#37 opened by ngawangtrinley - 0
Implement ability to filter on (n)PMI for getleftneighbours(), getleftcooc(), etc..
#33 opened by proycon - 3
Clean warnings in v2
#3 opened by proycon - 1
buildpattern() does not raise an exception when unknown tokens are presented in the input and allowunknown=false (default)!
#25 opened by proycon - 1
Provide vocabulary file
#2 opened by naiaden - 3
Load corpora with mmap
#23 opened by andreasvc - 3
No flexgram support in IndexedPatternModel.getsubchildren() / getsubparents() yet
#19 opened by proycon - 0
Lower-order n-grams not pruned when training with skipgrams, minlength > 1 and t > 1
#15 opened by proycon - 1
Implement more efficient algorithms for the search and extraction of pre-specified skipgrams and flexgrams
#9 opened by proycon - 1
getskipcontent() broken in v2
#10 opened by proycon - 1
Python Tutorial needs an update for v2
#6 opened by proycon - 1
Check in advance for reverse index when training skipgrams with IndexedPatternModel
#4 opened by proycon - 1