sdadas
Full-stack developer with machine learning and distributed computing experience. Half software engineer, half data scientist. Likes JVM, Python and TypeScript.
Warsaw
Pinned Repositories
commoncrawl-downloader
Application for downloading text data from Common Crawl
fsbrowser
Fast desktop client for Hadoop Distributed File System
gitdmp
A tool for automatic export of commits from git repositories
polish-nlp-resources
Pre-trained models and language resources for Natural Language Processing in Polish
polish-roberta
RoBERTa models for Polish
polish-sentence-evaluation
Evaluation of Sentence Representations in Polish
spring2ts
Generate TypeScript REST client directly from Spring MVC application source
taggraph
Tag visualization for wykop.pl
Tensorflow-Tutorials
Bare-bones introduction to machine learning, from linear regression to convolutional neural networks, using TensorFlow.
warsaw-transport
A visualization of Warsaw public transport
sdadas's Repositories
sdadas/polish-nlp-resources
Pre-trained models and language resources for Natural Language Processing in Polish
sdadas/polish-roberta
RoBERTa models for Polish
sdadas/warsaw-transport
A visualization of Warsaw public transport
sdadas/fsbrowser
Fast desktop client for Hadoop Distributed File System
sdadas/polish-sentence-evaluation
Evaluation of Sentence Representations in Polish
sdadas/spring2ts
Generate TypeScript REST client directly from Spring MVC application source
sdadas/commoncrawl-downloader
Application for downloading text data from Common Crawl
sdadas/gitdmp
A tool for automatic export of commits from git repositories
sdadas/RankGPT
Is ChatGPT Good at Search? LLMs as Re-Ranking Agents
sdadas/scinote
A personal bibliography manager and paper recommendation engine
sdadas/vwsd
Code for SemEval 2023 Task 1: Visual Word Sense Disambiguation
sdadas/yast
Yet Another Sequence Tagging library
sdadas/DiPS
NAACL 2019: Submodular optimization-based diverse paraphrasing and its effectiveness in data augmentation
sdadas/elasticsearch-analysis-morfologik
Morfologik Polish Lemmatizer plugin for Elasticsearch
sdadas/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
sdadas/fake-smtp-server
A simple SMTP server for testing purposes. Emails are stored in an in-memory database and rendered in a web UI.
sdadas/genre-docker
sdadas/jasypt-intellij-plugin
Spring Boot Jasypt IntelliJ plugin
sdadas/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
sdadas/LASER
Language-Agnostic SEntence Representations
sdadas/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
sdadas/nested-ner-2019-bert
Implementation of Nested Named Entity Recognition using BERT
sdadas/pawls
Software that makes labeling PDFs easy.
sdadas/pirb
sdadas/polish-simple-analyzer
sdadas/sentence-transformers-qa-example
sdadas/simple-translator
sdadas/splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
sdadas/tevatron
Tevatron - A flexible toolkit for dense retrieval research and development.
sdadas/wiki-index
Simple full text indexing for Wikipedia