yg37's Stars
utterworks/fast-bert
Super easy library for BERT based NLP models
ICLRandD/Blackstone
⚫ A spaCy pipeline and model for NLP on unstructured legal text.
infinilabs/analysis-pinyin
🛵 This Pinyin Analysis plugin is used to do conversion between Chinese characters and Pinyin.
tensorflow/text
Making text a first-class citizen in TensorFlow.
UniversalDependencies/UD_French-GSD
google/TensorNetwork
A library for easy and efficient manipulation of tensor networks.
chanzuckerberg/MedMentions
A corpus of Biomedical papers annotated with mentions of UMLS entities.
allenai/scispacy
A full spaCy pipeline and models for scientific/biomedical documents.
stanford-oval/genie-toolkit
The Genie open source kit for voice assistants (formerly known as Almond)
bollu/bollu.github.io
code + contents of my website, and programming life
uber-research/parallax
Tool for interactive embeddings visualization
google-research/mixmatch
deeplearning4j/deeplearning4j-examples
Deeplearning4j Examples (DL4J, DL4J Spark, DataVec)
tensorflow/examples
TensorFlow examples
tylerneylon/explacy
A small tool that EXPLains spACY parse results. See what I did there?
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
re-search/DocProduct
Medical Q&A with Deep Language Models
benedekrozemberczki/graph2vec
A parallel implementation of "graph2vec: Learning Distributed Representations of Graphs" (MLGWorkshop 2017).
google-research-datasets/wiki-atomic-edits
A dataset of atomic Wikipedia edits, each an insertion or deletion of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
msg-systems/holmes-extractor
Information extraction from English and German texts based on predicate logic
EtienneAb3d/OpenNeuroSpell
OpenNeuroSpell contains parts of NeuroSpell (http://neurospell.com/en.php) released as open source. More code will be published as soon as the proprietary parts have been rewritten.
HazyResearch/fonduer-tutorials
A collection of simple tutorials for using Fonduer
first20hours/google-10000-english
This repo contains a list of the 10,000 most common English words in order of frequency, as determined by n-gram frequency analysis of Google's Trillion Word Corpus.
facebookresearch/spreadingvectors
Open source implementation of "Spreading Vectors for Similarity Search"
wikipedia2vec/wikipedia2vec
A tool for learning vector representations of words and entities from Wikipedia
ines/spacy-graphql
🤹‍♀️ Query spaCy's linguistic annotations using GraphQL
ahalterman/multiuser_prodigy
Running Prodigy for a team of annotators
RGBz/aws-s3-class-loader
A Java ClassLoader implementation that yanks classes directly from an Amazon Web Services S3 bucket.
almond-sh/almond
A Scala kernel for Jupyter
Ethonwu/Apriori-Python
A frequent itemset mining program implemented in Python