YX-S-Z

This guy is too lazy to describe himself

YX-S-Z's Stars

xai-org/grok-1
Grok open release
Language: Python 50.2k 608 2208.4k
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language: Python 40.1k 397 3266.6k
karpathy/LLM101n
LLM101n: Let's build a Storyteller
32.7k 3k 01.8k
huggingface/trl
Train transformer language models with reinforcement learning.
Language: Python 12.5k 82 1.6k1.7k
Farama-Foundation/Gymnasium
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
Language: Python 8.6k 52 505957
thu-ml/tianshou
An elegant PyTorch deep reinforcement learning library.
Language: Python 8.3k 91 7591.1k
vwxyzjn/cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Language: Python 6.5k 40 192716
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Language: Python 5.7k 34 536551
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Language: Python 5.6k 49 463427
google-deepmind/alphageometry
Language: Python 4.4k 58 138501
xjdr-alt/entropix
Entropy Based Sampling and Parallel CoT Decoding
Language: Python 3.3k 72 40319
datamllab/rlcard
Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
Language: Python 3.1k 77 200654
pytorch/rl
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
Language: Python 2.6k 42 672347
Farama-Foundation/Arcade-Learning-Environment
The Arcade Learning Environment (ALE) -- a platform for AI research.
Language: C++ 2.2k 81 261438
apple/axlearn
An Extensible Deep Learning Library
Language: Python 2k 66 29303
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Language: Python 1.9k 23 75128
xlang-ai/OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Language: Python 1.7k 32 82201
Ma-Lab-Berkeley/CRATE
Code for CRATE (Coding RAte reduction TransformEr).
Language: Python 1.2k 20 2295
Genesis-Embodied-AI/RoboGen
A generative and self-guided robotic agent that endlessly proposes and masters new skills.
Language: Python 938 18 3588
llava-rlhf/LLaVA-RLHF
Aligning LMMs with Factually Augmented RLHF
Language: Python 353 8 4125
RL4VLM/RL4VLM
Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
Language: Jupyter Notebook 322 6 2821
nexusflowai/NexusRaven
NexusRaven-13B, a new SOTA Open-Source LLM for function calling. This repo contains everything for reproducing our evaluation on NexusRaven-13B and baselines.
Language: Python 313 9 623
tianyi-lab/HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Language: Python 270 5 138
brentyi/egoallo
Estimating Body and Hand Motion in an Ego-sensed World
Language: Python 178 5 416
young-geng/scalax
A simple library for scaling up JAX programs
Language: Python 134 8 210
moka-manipulation/moka
MOKA: Open-World Robotic Manipulation through Mark-based Visual Prompting (RSS 2024)
Language: Python 71 3 18
rail-berkeley/fmb
Language: Python 48 9 41
young-geng/mintext
Minimal but scalable implementation of large language models in JAX
Language: Python 34 3 00
efrick2002/Starling
Language: Jupyter Notebook 70
FengdiC/OTTD
Language: Python 10