seungju-k1m's Stars
meta-llama/llama
Inference code for Llama models
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
jax-ml/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
google/sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
mistralai/mistral-inference
Official inference library for Mistral models
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
vikhyat/moondream
tiny vision language model
google/flax
Flax is a neural network library for JAX that is designed for flexibility.
princeton-nlp/tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
luosiallen/latent-consistency-model
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
facebookresearch/Pearl
A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.
noahshinn/reflexion
[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning
EgoAlpha/prompt-in-context-learning
Awesome resources for in-context learning and prompt engineering: mastery of LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.
Farama-Foundation/chatarena
ChatArena (or Chat Arena) is a library of multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.
google-deepmind/concordia
A library for generative social simulation
grok-ai/nn-template
Generic template to bootstrap your PyTorch project.
kvablack/ddpo-pytorch
DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support
lqtrung1998/mwp_ReFT
nicklashansen/tdmpc2
Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"
zhuyiche/llava-phi
yingchengyang/Reinforcement-Learning-Papers
Related papers for reinforcement learning, including classic papers and latest papers in top conferences
Farama-Foundation/Minari
A standard format for offline reinforcement learning datasets, with popular reference datasets and related utilities
mihirp1998/AlignProp
AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample- and compute-efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
abdulhaim/LMRL-Gym
tinker495/jax-baseline
Jax-Baseline is a Reinforcement Learning implementation using JAX and Flax/Haiku libraries, mirroring the functionality of Stable-Baselines.
moripiri/Reinforcement-Learning-on-FrozenLake
Reinforcement Learning Algorithms in FrozenLake-v1
seungju-k1m/CommonRoad