Pinned Repositories
bunkai
Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器)
ditto
Code for the paper "Deep Entity Matching with Pre-trained Language Models"
ginza
A Japanese NLP library built on the spaCy framework and based on Universal Dependencies
HappyDB
A corpus of 100,000 happy moments
jrte-corpus
Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020)
opiniondigest
OpinionDigest: A Simple Framework for Opinion Summarization (ACL 2020)
sato
Code and data for Sato https://arxiv.org/abs/1911.06311.
SubjQA
A question-answering dataset with a focus on subjective information
t5-japanese
Code to pre-train Japanese T5 models
vecscan
Megagon Labs's Repositories
megagonlabs/ginza
A Japanese NLP library built on the spaCy framework and based on Universal Dependencies
megagonlabs/ditto
Code for the paper "Deep Entity Matching with Pre-trained Language Models"
megagonlabs/bunkai
Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器)
megagonlabs/sato
Code and data for Sato https://arxiv.org/abs/1911.06311.
megagonlabs/opiniondigest
OpinionDigest: A Simple Framework for Opinion Summarization (ACL 2020)
megagonlabs/vecscan
megagonlabs/SubjQA
A question-answering dataset with a focus on subjective information
megagonlabs/asdc
Accommodation Search Dialog Corpus (宿泊施設探索対話コーパス)
megagonlabs/instruction_ja
Japanese instruction data (日本語指示データ)
megagonlabs/cocosum
🥥 Code & Data for Comparative Opinion Summarization via Collaborative Decoding (Iso et al; Findings of ACL 2022)
megagonlabs/starmie
Resources for PVLDB 2023 submission
megagonlabs/zett
🙈 Code for Zero-shot Triplet Extraction by Template Infilling (Kim et al; IJCNLP-AACL 2023)
megagonlabs/meganno-client
megagonlabs/llm-longeval
💵 Code for Less is More for Long Document Summary Evaluation by LLMs (Wu, Iso et al; EACL 2024)
megagonlabs/xatu
🕊️ Code and Data for XATU: A Fine-grained Instruction-based Benchmark for Explainable Text Updates (Zhang et al; LREC-COLING 2024)
megagonlabs/CMDBench
Data and Code for CMDBench experiments
megagonlabs/magneton
Repository of the Magneton framework for authoring interaction-aware and customizable widgets.
megagonlabs/witqa
megagonlabs/minun
Evaluating Counterfactual Explanations for Entity Matching
megagonlabs/watchog
Code for the SIGMOD 2024 paper "Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation"
megagonlabs/MCR
megagonlabs/pilota
✈ SCUD generator (解釈文生成器)
megagonlabs/quasi_japanese_reviews
Quasi Japanese Reviews (擬似レビューデータ)
megagonlabs/autotemplate
🧩 Code for AutoTemplate: A Simple Recipe for Lexically Constrained Text Generation (Iso; INLG 2024)
megagonlabs/meganno-service
megagonlabs/meganno-ui
megagonlabs/Megatron-LM
Ongoing research training transformer models at scale
megagonlabs/napa
🍷 Code for Noisy Pairing and Partial Supervision for Stylized Opinion Summarization (Iso et al; INLG 2024)
megagonlabs/rjdb
megagonlabs/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.