Pinned Repositories
anago
Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging, and so on.
ASER
ASER (activities, states, events, and their relations): a large-scale eventuality knowledge graph extracted from more than 11 billion tokens of unstructured text.
AutoPhrase
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
BERT-NER
Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
biobert
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
blockchain_bootcamp
commonsenseqa
CoreNLP
Stanford CoreNLP: A Java suite of core NLP tools.
flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
NewBioNer
xhuang28's Repositories
xhuang28/NewBioNer
xhuang28/anago
Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging, and so on.
xhuang28/ASER
ASER (activities, states, events, and their relations): a large-scale eventuality knowledge graph extracted from more than 11 billion tokens of unstructured text.
xhuang28/AutoPhrase
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
xhuang28/BERT-NER
Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
xhuang28/biobert
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
xhuang28/blockchain_bootcamp
xhuang28/commonsenseqa
xhuang28/CoreNLP
Stanford CoreNLP: A Java suite of core NLP tools.
xhuang28/flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
xhuang28/GCDT
Code for the paper: GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling
xhuang28/giza-pp
GIZA++ is a statistical machine translation toolkit that is used to train IBM Models 1-5 and an HMM word alignment model. This package also contains the source for the mkcls tool which generates the word classes necessary for training some of the alignment models.
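The heart of this pipeline, IBM Model 1, can be sketched as a short EM loop (a toy illustration on a hypothetical three-sentence bitext, not GIZA++ code; real training uses large corpora and the higher models for reordering and fertility):

```python
from collections import defaultdict

# Hypothetical toy parallel corpus (German -> English).
corpus = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]

# Uniform initialization of translation probabilities t(e|f).
f_vocab = {f for fs, _ in corpus for f in fs}
e_vocab = {e for _, es in corpus for e in es}
t = {(e, f): 1.0 / len(e_vocab) for e in e_vocab for f in f_vocab}

for _ in range(10):  # EM iterations
    count = defaultdict(float)  # expected counts c(e, f)
    total = defaultdict(float)  # expected counts summed per source word f
    for fs, es in corpus:
        for e in es:
            # E-step: distribute each target word's mass over source words.
            z = sum(t[(e, f)] for f in fs)
            for f in fs:
                c = t[(e, f)] / z
                count[(e, f)] += c
                total[f] += c
    # M-step: renormalize into new translation probabilities.
    for (e, f) in t:
        t[(e, f)] = count[(e, f)] / total[f] if total[f] else 0.0
```

After a few iterations the co-occurrence statistics disambiguate the pairs: "haus" concentrates its probability on "house" rather than "the", even though both co-occur with it.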
xhuang28/iclr2016
Python code for training all models in the ICLR paper "Towards Universal Paraphrastic Sentence Embeddings". These models achieve strong performance on semantic similarity tasks without any training or tuning on those tasks' training data. They also produce features that are at least as discriminative as skip-thought vectors for semantic similarity tasks. The code can additionally achieve state-of-the-art results on entailment and sentiment tasks.
xhuang28/is-xhuang1994
Distinguish Bots from Humans on Twitter
xhuang28/LM-LSTM-CRF
Empower Sequence Labeling with Task-Aware Language Model
xhuang28/MACROSCORE
MACROSCORE project at ISI - Micro Feature Extraction direction
xhuang28/mmner
Massively Multilingual Transfer for NER
xhuang28/mrc-for-flat-nested-ner
The code for "A Unified MRC Framework for Named Entity Recognition"
xhuang28/NER-GRN
Code for our AAAI2019 paper "GRN: Gated Relation Network to Enhance Convolutional Neural Network for Named Entity Recognition"
xhuang28/nlproc-cookbook
xhuang28/nltk_contrib
NLTK Contrib
xhuang28/OntoNotes-5.0-NER-BIO
A BIO formatted Named Entity Recognition data set extracted from the OntoNotes 5.0 release.
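As a quick illustration of the BIO scheme this data set uses (a minimal sketch, not code from the repository): each token carries a tag B-TYPE (beginning of an entity), I-TYPE (inside one), or O (outside any entity), and a B tag followed by matching I tags forms one entity span.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, entity_text) spans."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = (tag[2:], [token])   # start a new entity
            spans.append(current)
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)       # continue the open entity
        else:                              # "O" or a stray I- tag
            current = None
    return [(etype, " ".join(words)) for etype, words in spans]

tokens = ["Barack", "Obama", "visited", "New", "York", "."]
tags   = ["B-PERSON", "I-PERSON", "O", "B-GPE", "I-GPE", "O"]
print(bio_to_spans(tokens, tags))
# -> [('PERSON', 'Barack Obama'), ('GPE', 'New York')]
```

The entity types shown (PERSON, GPE) follow the OntoNotes inventory; the tokens are made up for the example.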
xhuang28/para-nmt-50m
Pre-trained models and code and data to train and use models from "Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations"
xhuang28/python-wordsegment
English word segmentation, written in pure Python and based on a trillion-word corpus.
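The underlying idea can be sketched with the standard library alone (a toy unigram model with made-up counts; the real library ships unigram and bigram counts derived from its trillion-word corpus): score each candidate split by the product of word probabilities and pick the best split with memoized recursion.

```python
from functools import lru_cache

# Hypothetical tiny unigram model; real counts come from a large corpus.
COUNTS = {"this": 80, "is": 60, "a": 50, "test": 40, "the": 90, "me": 30}
TOTAL = sum(COUNTS.values())

def score(word):
    # Unseen words get a penalty that grows harsher with length.
    return COUNTS.get(word, 0) / TOTAL or 1e-9 / 10 ** len(word)

def segment(text):
    @lru_cache(maxsize=None)
    def best(i):
        # Best (probability, words) segmentation of text[i:].
        if i == len(text):
            return (1.0, [])
        candidates = []
        for j in range(i + 1, len(text) + 1):
            p, rest = best(j)
            candidates.append((score(text[i:j]) * p, [text[i:j]] + rest))
        return max(candidates)
    return best(0)[1]

print(segment("thisisatest"))  # -> ['this', 'is', 'a', 'test']
```

For real use the library itself exposes a similar interface (load the corpus counts, then segment a string); the sketch above only shows the dynamic-programming core.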
xhuang28/scpn
Syntactically controlled paraphrase networks
xhuang28/semi-supervised-baselines
Code for "Strong Baselines for Neural Semi-supervised Learning under Domain Shift" (Ruder & Plank, 2018 ACL)
xhuang28/Vanilla_NER
Vanilla Sequence Labeling w. Char-LSTM-CRF