oscmansan's Stars
xai-org/grok-1
Grok open release
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization and Q&A, plus a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama for WhatsApp & Messenger.
stas00/ml-engineering
Machine Learning Engineering Open Book
huggingface/trl
Train transformer language models with reinforcement learning.
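For orientation, a minimal sketch of supervised fine-tuning with trl's SFTTrainer; constructor arguments shift between trl releases, so the exact names here are assumptions to check against your installed version.

```python
# Minimal SFT sketch with trl; argument names vary by trl version,
# so treat this as illustrative, not canonical usage.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Any instruction-style dataset with a text/messages column works here.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # example causal-LM checkpoint
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-out", max_steps=100),
)
trainer.train()
```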
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
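The official repo ships its own sampling scripts; as a quick hedged alternative, the released DiT-XL/2 weights can also be sampled through diffusers' DiTPipeline, assuming diffusers fits your setup:

```python
# Sketch: class-conditional sampling with the released DiT-XL/2 weights
# via diffusers' DiTPipeline (an alternative to the repo's own sample.py).
import torch
from diffusers import DiTPipeline

pipe = DiTPipeline.from_pretrained("facebook/DiT-XL-2-256", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# Map human-readable ImageNet class names to class ids.
class_ids = pipe.get_label_ids(["golden retriever"])
image = pipe(class_labels=class_ids, num_inference_steps=25).images[0]
image.save("dit_sample.png")
```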
ml-explore/mlx-examples
Examples in the MLX framework
pytorch-labs/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
pytorch/torchtune
PyTorch native post-training library
openai/grok
openai/transformer-debugger
apple/ml-mgie
facebookresearch/jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
NVlabs/VILA
VILA: a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
stanford-crfm/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in HEIM (https://arxiv.org/abs/2311.04287) and vision-language models in VHELM (https://arxiv.org/abs/2410.07112).
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
facebookresearch/MetaCLIP
ICLR 2024 Spotlight: curation/training code, metadata, distribution, and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
unum-cloud/uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
merveenoyan/smol-vision
Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜
stas00/the-art-of-debugging
The Art of Debugging
cloneofsimo/minSDXL
Huggingface-compatible SDXL Unet implementation that is readily hackable
huggingface/open-muse
Open reproduction of MUSE for fast text2image generation.
linzhiqiu/t2v_metrics
Evaluating text-to-image/video/3D models with VQAScore
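A minimal sketch of scoring an image-text pair with VQAScore, following the shape of the repo's README; the model name and call signature are assumptions that may differ in your installed release.

```python
# Sketch: scoring image-text alignment with VQAScore from t2v_metrics.
# Model name and call signature follow the repo's README and are
# assumptions; verify against your installed version.
import t2v_metrics

scorer = t2v_metrics.VQAScore(model="clip-flant5-xxl")
scores = scorer(
    images=["images/example.png"],  # hypothetical local image path
    texts=["someone talks on the phone angrily while another person sits happily"],
)
print(scores)  # higher score = better text-image alignment
```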
NVIDIA/Megatron-Energon
Megatron's multi-modal data loader
YiyangZhou/LURE
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
j-min/DSG
Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024)
YiyangZhou/POVID
[arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
openai/dalle3-eval-samples
Text-to-image samples collected for the evaluation of DALL-E 3 in the whitepaper.
YiyangZhou/CSR
[NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models
DavidMChan/aloha
A new reliable, localizable, and generalizable metric for hallucination detection in image captioning models.