Pinned Repositories
aurora
Multilingual, Multimodal, Multidomain model based on StarCoderPlus and BakLLaVA
aurora-m
Adapting StarCoderPlus for Multimodal Experts
KeyedVectorsANN
Gensim word2vec + PySparNN ANN + Trimmed GoogleNewsVec = Fast and lightweight NLP tool
M3rlin
Multilingual, Multimodal, Multidomain (M3) Model
M3rlin-fmengine
M3 Training Using FMengine
MDEL
Multi-Domain Expert Learning
muliwai
Experimental PII framework
rio
Text pre-processing for NLP datasets
sungai
Sample multilingual data and tools for creating the data, used for multilingual NLP research
riverbed
Tools for content data mining and NLP at scale
huu4ontocord's Repositories
huu4ontocord/MDEL
Multi-Domain Expert Learning
huu4ontocord/rio
Text pre-processing for NLP datasets
huu4ontocord/aurora
Multilingual, Multimodal, Multidomain model based on StarCoderPlus and BakLLaVA
huu4ontocord/KeyedVectorsANN
Gensim word2vec + PySparNN ANN + Trimmed GoogleNewsVec = Fast and lightweight NLP tool
huu4ontocord/muliwai
Experimental PII framework
huu4ontocord/sungai
Sample multilingual data and tools for creating the data, used for multilingual NLP research
huu4ontocord/aurora-m
Adapting StarCoderPlus for Multimodal Experts
huu4ontocord/M3rlin
Multilingual, Multimodal, Multidomain (M3) Model
huu4ontocord/M3rlin-fmengine
M3 Training Using FMengine
huu4ontocord/create_pii_dataset
huu4ontocord/data_tool_experiments
huu4ontocord/data_tooling
How should we store and serve the dataset?
huu4ontocord/hpj.py
Simple Python to Javascript translator with an emphasis on readability of generated code.
huu4ontocord/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
huu4ontocord/neox_alpaca_generate
huu4ontocord/oftf
One File Text Filter
huu4ontocord/pii_processing
PII Processing code to clean up BigScience datasets. Reference implementation for the PII Hackathon
huu4ontocord/summarize
Summarize. is a Streamlit application that performs automatic text summarization using both extractive and abstractive models.
huu4ontocord/tevatron
Tevatron - A flexible toolkit for dense retrieval research and development.
huu4ontocord/Viet-Mistral
Vietnamese Mistral