savkov
I am an NLP scientist and leader interested in large language models and Python.
Babylon Health, London
Pinned Repositories
bioeval
CoNLL-2000 style evaluation of data using BIO and BEISO representation for multi-token entities (i.e. chunks).
bratutils
A collection of utilities for manipulating data and calculating inter-annotator agreement in brat annotation files.
corrsim
Code for the papers "Correlation Coefficients and Semantic Textual Similarity" (NAACL-HLT 2019) and "Correlations between Word Vector Sets" (EMNLP-IJCNLP 2019).
crfppftvec
Simplifies the CRF++ feature template notation
fuzzymax
Code for the paper "Don't Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors" (ICLR 2019).
harvey-corpus
Syntactic chunk and semantic entity annotations, with guidelines, for the Harvey corpus of primary care text.
hmrb
A sequence rule engine
LABPipe
Linguistic Processing Line for Bulgarian
primock57
Dataset of 57 mock medical primary care consultations: audio, consultation notes, human utterance-level transcripts.
simba
Semantic similarity measures from Babylon Health
savkov's Repositories
savkov/bunch
A Bunch is a Python dictionary that provides attribute-style access (a la JavaScript objects).
savkov/Spearmint
Spearmint Bayesian optimization codebase
savkov/LABPipe
Linguistic Processing Line for Bulgarian