savkov
I am an NLP scientist and leader interested in large language models and Python.
Babylon Health, London
Pinned Repositories
bioeval
CoNLL-2000 style evaluation of data using BIO and BEISO representation for multi-token entities (i.e. chunks).
bratutils
A collection of utilities for manipulating data and calculating inter-annotator agreement in brat annotation files.
corrsim
Code for the papers: Correlation Coefficients and Semantic Textual Similarity, NAACL-HLT 2019 & Correlations between Word Vector Sets, EMNLP-IJCNLP 2019.
crfppftvec
Simplifies the CRF++ feature template notation
fuzzymax
Code for the paper: Don't Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors, ICLR 2019.
harvey-corpus
Syntactic chunks and semantic entities annotations and guidelines for the Harvey corpus of primary care text.
hmrb
A sequence rule engine
LABPipe
Linguistic Processing Line for Bulgarian
primock57
Dataset of 57 mock medical primary care consultations: audio, consultation notes, human utterance-level transcripts.
simba
Semantic similarity measures from Babylon Health
savkov's Repositories
savkov/bratutils
A collection of utilities for manipulating data and calculating inter-annotator agreement in brat annotation files.
savkov/harvey-corpus
Syntactic chunks and semantic entities annotations and guidelines for the Harvey corpus of primary care text.
savkov/bioeval
CoNLL-2000 style evaluation of data using BIO and BEISO representation for multi-token entities (i.e. chunks).
savkov/crfppftvec
Simplifies the CRF++ feature template notation
savkov/qsutils
A utility library and a collection of scripts to process and filter the output of SGE's qstat command
savkov/planchet
Your large data processing personal assistant
savkov/alpine-pandas
Alpine with pandas docker
savkov/GoogleScraper
A Python module to scrape several search engines (like Google, Yandex, Bing, DuckDuckGo, Baidu and others) by using proxies (socks4/5, http proxy) and with many different IPs, including asynchronous networking support (very fast).
savkov/savkov.github.io
Personal page
savkov/corrsim
Code for the papers: Correlation Coefficients and Semantic Textual Similarity, NAACL-HLT 2019 & Correlations between Word Vector Sets, EMNLP-IJCNLP 2019.
savkov/fuzzymax
Code for the paper: Don't Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors, ICLR 2019.
savkov/hmrb
A sequence rule engine
savkov/primock57
Dataset of 57 mock medical primary care consultations: audio, consultation notes, human utterance-level transcripts.
savkov/simba
Semantic similarity measures from Babylon Health
savkov/BootstrapSplit
Resampling using Bootstrapping
savkov/cape-webservices
Entrypoint for all backend cape webservices
savkov/ChocolateBrownie
IPython pygment style and stylesheet
savkov/dlbook_notation
LaTeX files for the Deep Learning book notation
savkov/eval-word-vectors
Easy to use scripts for evaluating word vectors on a variety of tasks.
savkov/freebie
Free news and social media source database for information extraction
savkov/friggeri-cv
A LaTeX curriculum vitae/resume template
savkov/issue-sync
A tool for synchronizing issue tracking between GitHub and JIRA
savkov/MedAffix
Scraper for medical affixes from Wikipedia
savkov/mg2p
Multilingual grapheme-to-phoneme conversion
savkov/quay-docs
Documentation for Quay.io
savkov/randhy
Hypothesis testing with approximate randomisation
savkov/slack-export
A python slack exporter
savkov/sling
SLING - A natural language frame semantics parser
savkov/spaCy
💫 Industrial-strength Natural Language Processing (NLP) with Python and Cython
savkov/streamlit-example
Example Streamlit app that you can fork to test out share.streamlit.io