Pinned Repositories
awesome-bert-japanese
📝 A list of pre-trained BERT models for Japanese with word/subword tokenization + vocabulary construction algorithm information
bert_mlm
eda_nlp
Data augmentation for NLP, presented at EMNLP 2019
generic-pretrained-GEC
Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model.
lm_selector
m2scorer_python3
sumeval
Well tested & Multi-language evaluation framework for text summarization.
wikihow_japanese
Katsumata420's Repositories
Katsumata420/generic-pretrained-GEC
Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model.
Katsumata420/eda_nlp
Data augmentation for NLP, presented at EMNLP 2019
Katsumata420/sumeval
Well tested & Multi-language evaluation framework for text summarization.
Katsumata420/AutoPhrase
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
Katsumata420/beir-gpu
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Katsumata420/BLINK
Entity Linker solution
Katsumata420/DiffCSE
Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings"
Katsumata420/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Katsumata420/GPT-GEC
Grammatical Error Correction using GPT-3.5
Katsumata420/ja-vicuna-qa-benchmark
Katsumata420/JGLUE
JGLUE: Japanese General Language Understanding Evaluation
Katsumata420/Katsumata420.github.io
webpage
Katsumata420/leetcode
My answers for LeetCode
Katsumata420/llm-jp-dpo
Katsumata420/llm-jp-sft
Katsumata420/LLM-offline-inference
Katsumata420/lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
Katsumata420/luke
LUKE -- Language Understanding with Knowledge-based Embeddings
Katsumata420/ml-system-in-actions
machine learning system examples
Katsumata420/mlflow
Open source platform for the machine learning lifecycle
Katsumata420/post-specialisation
Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources
Katsumata420/SBERT-Sagemaker
Training scripts for Sentence-BERT using Amazon SageMaker.
Katsumata420/self-instruct
Aligning pretrained language models with instruction data generated by themselves.
Katsumata420/sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
Katsumata420/SentEval
A python tool for evaluating the quality of sentence embeddings.
Katsumata420/SimCSE
EMNLP'2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
Katsumata420/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Katsumata420/UCTopic
An easy-to-use tool for phrase encoding and topic mining (unsupervised aspect extraction); Code base for ACL 2022 paper, UCTopic: Unsupervised Contrastive Learning for Phrase Representations and Topic Mining.
Katsumata420/wikipedia_search
Retrieve similar Wikipedia articles from a document collection
Katsumata420/WizardLM