kobikun's Repositories
kobikun/wiki
wiki documents
kobikun/dastrie
Static Double Array Trie (DASTrie)
kobikun/ake-datasets
Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.
kobikun/ConvLab
kobikun/CTranslate2
Fast inference engine for OpenNMT models
kobikun/dotfiles
kobikun/extractor-wiki-data
extracting multilingual data from wiki-data
kobikun/gym
gym for exercising
kobikun/Ivory
A Hadoop toolkit for web-scale information retrieval research
kobikun/kenlm
KenLM: Faster and Smaller Language Model Queries
kobikun/kobikun
Config files for my GitHub profile.
kobikun/korean-sentence-splitter
Split Korean text into sentences using a heuristic algorithm.
kobikun/MPNet
MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf
kobikun/NER
Technical report on standardizing Korean named-entity definitions and labels, and a named-entity morpheme corpus built on it
kobikun/niben
Japanese study (nihongo benkyo) with a chatbot
kobikun/nltk
NLTK Source
kobikun/numpy-study
numpy-study
kobikun/OpenNMT-py
Open-Source Neural Machine Translation in PyTorch http://opennmt.net/
kobikun/opensubtitles-parser
download, extract, parse and tokenize the opensubtitles dataset with this script
kobikun/RealChar
🎙️🤖Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation with AI everywhere (mobile, web and terminal) using LLM OpenAI GPT3.5/4, Anthropic Claude2, Chroma Vector DB, Whisper Speech2Text, ElevenLabs Text2Speech🎙️🤖
kobikun/sentence-transformers
Sentence Embeddings with BERT & XLNet
kobikun/simstring
SimString
kobikun/study
study ipython
kobikun/subtitle_chatpair
chat pairs from subtitles
kobikun/subword-nmt
Subword Neural Machine Translation
kobikun/text
kobikun/TIL
TIL