gentaiscool
Researcher @ Capital One AI Foundations. Natural Language Processing, Speech, Multilingual, Code-switching, Dialogue
Capital One AI Foundations, New York
gentaiscool's Stars
meta-llama/llama
Inference code for Llama models
Nyandwi/machine_learning_complete
A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
state-spaces/s4
Structured state space sequence models
IndoNLP/nusa-crowd
A collaborative project to collect datasets in Indonesian languages.
LAION-AI/Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
shauryr/ACL-anthology-corpus
This repository provides details and links to the ACL anthology corpus/collection including .bib, .pdf and grobid extractions of the pdfs
srush/do-we-need-attention
forestagostinelli/DeepCubeA
Code for DeepCubeA, a Deep Reinforcement Learning algorithm that can learn to solve the Rubik's cube.
ExpressAI/DataLab
The unified platform for data-related resources.
nlp-uoregon/Okapi
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
IndoNLP/nusax
High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)
bloomberg/minilmv2.bb
Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188)
gentaiscool/indonesian-nlp
A curated list of research papers and resources on Indonesian languages
bloomberg/kbir_keybart
Experimental code used in pre-training the KBIR and KeyBART models
IndoNLP/nusa-writes
NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.
bltlab/mot
Multilingual Open Text
yogisalomo/english-speaker-friendly-korean-companies
Repository to aggregate data about Korean companies that work with English as an official language or accept non-Korean-speaking members
bltlab/paranames
ParaNames: A multilingual resource for parallel names
HLTCHKUST/KnowExpert
The implementation of the paper "Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters".
Southeast-Asia-NLP/LLM-Code-Mixing
Can LLMs generate code-mixed sentences through zero-shot prompting?
aparnadutta/code-mixed-lid
Word-level language identification for Bangla-English code-mixed social media data, using a BiLSTM with subword embeddings.
IndoNLP/nusa-catalogue
Dataset Catalogue Homepage for Indonesian Languages
neulab/globalbench
GlobalBench: A Benchmark for Global Progress in Language Technology
kongaskristjan/rubik
Solve a Rubik's Cube with neural networks
bharathichezhiyan/HopeEDI
HopeEDI: A Multilingual Hope Speech Detection Dataset for Equality, Diversity, and Inclusion
Genius1237/numpy-gpt2
holylovenia/emoji-GAN
HKUST's ELEC5680/COMP5214 Advanced Deep Learning Architectures Assignment 3
IndoNLP/.github
Landing page
IndoNLP/indonlp.github.io
wenliangdai/Weakly-Supervised-Multitask-MAR
Weakly-supervised Multitask Multimodal Affect Recognition.