zaemyung

Pinned Repositories

crawl-naver-news-and-comments
Crawling the most-read news articles per day over the years (with comments)
Language: Python | 1 4 01
crawl-reuters
A simple Scrapy script for crawling Reuters news articles (Python 3)
Language: Python | 13 4 55
dots_public
Language: Shell | 10
PDTB-discourse-relation-classifier
Language: Python | 0 2 20
sentsplit
A flexible sentence segmentation library using a CRF model and regex rules
Language: Python | 22 4 125
streamlit-tutorial
A simple tutorial script on Streamlit using the Iris Dataset
Language: Python | 12 2 02
Visualizing-Cross-Lingual-Discourse-Relations
Code for the paper "Visualizing Cross-Lingual Discourse Relations in Multilingual TED Corpora" at CODI 2021 @ EMNLP 2021
Language: Python | 2 2 10
wikiextractor
A tool for extracting plain text from Wikipedia dumps
Language: Python | 17 1 08

zaemyung's Repositories

zaemyung/sentsplit
A flexible sentence segmentation library using a CRF model and regex rules
Language: Python | 22 4 125
zaemyung/streamlit-tutorial
A simple tutorial script on Streamlit using the Iris Dataset
Language: Python | 12 2 02
zaemyung/Visualizing-Cross-Lingual-Discourse-Relations
Code for the paper "Visualizing Cross-Lingual Discourse Relations in Multilingual TED Corpora" at CODI 2021 @ EMNLP 2021
Language: Python | 2 2 10
zaemyung/crawl-naver-news-and-comments
Crawling the most-read news articles per day over the years (with comments)
Language: Python | 1 4 01
zaemyung/dots_public
Language: Shell | 10
zaemyung/PDTB-discourse-relation-classifier
Language: Python | 0 2 20
zaemyung/bertviz
Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)
zaemyung/ContraPro
Contrastive evaluation of pronoun translation in neural machine translation
Language: Perl | 1 0
zaemyung/Cornell-Conversational-Analysis-Toolkit
ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use of the toolkit on these datasets.
Language: Jupyter Notebook | 1 0
zaemyung/Creative-Commons-Markdown
Markdown-formatted Creative Commons licenses
1 0
zaemyung/disaster_tweets
1 0
zaemyung/Discourse-Phenomena-in-Document-level-Neural-Machine-Translation
Datasets for "A Test Suite for Evaluating Discourse Phenomena in Document-level Neural Machine Translation" accepted by Proceedings of the Second International Workshop of Discourse Processing
1 0
zaemyung/DMRST_Parser
An implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".
Language: Python | 0 0
zaemyung/dockerfiles
Language: Dockerfile | 2 12
zaemyung/good-translation-wrong-in-context
This is a repository with the data and code for the ACL 2019 paper "When a Good Translation is Wrong in Context: ..." and the EMNLP 2019 paper "Context-Aware Monolingual Repair for Neural Machine Translation"
Language: Ruby | 1 0
zaemyung/google-research
Google Research
Language: Jupyter Notebook | 1 01
zaemyung/kmeans_pytorch
K-means using PyTorch
Language: Jupyter Notebook | 1 0
zaemyung/korean_wordlist
Korean wordlist
Language: Python | 1 0
zaemyung/language-programmes
Language: Jupyter Notebook | 0 0
zaemyung/Large-contrastive-pronoun-testset-EN-FR
Language: PLSQL | 1 0
zaemyung/mtdlc
Library for parsing document-level corpora for machine translation
2 0
zaemyung/Pytorch-Sequence-Bucket-Iterator
A minimal sampler example for bucketing sequences of similar lengths in PyTorch, based on @TrentBrick's script: https://gist.github.com/TrentBrick/bac21af244e7c772dc8651ab9c58328c
Language: Python | 1 0
zaemyung/Shallow-Discourse-Annotation-for-Chinese-TED-Talks
Datasets for "Shallow Discourse Annotation for Chinese TED Talks", accepted at LREC 2020
zaemyung/st-annotated-text
A simple component to display annotated text in Streamlit apps.
zaemyung/Ted-MDB-Annotations
1 0
zaemyung/transformer-lm
Transformer language model (GPT-2) with sentencepiece tokenizer
Language: Python | 1 0
zaemyung/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
Language: Python | 1 0
zaemyung/utils
simple scripts that make life easier...
Language: Shell | 2 0
zaemyung/weightedWWL
Learning subtree pattern importance for WL-based graph kernels
Language: Python | 0 0
zaemyung/zaemyung.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics
Language: HTML