ziyan-wang98's Stars
ollama/ollama
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
karpathy/LLM101n
LLM101n: Let's build a Storyteller
lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
openai/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
SakanaAI/AI-Scientist
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
NeoVertex1/SuperPrompt
SuperPrompt is an attempt to engineer prompts that might help us understand AI agents.
isaac-sim/IsaacLab
Unified framework for robot learning built on NVIDIA Isaac Sim
isaac-sim/IsaacGymEnvs
Isaac Gym Reinforcement Learning Environments
eloialonso/diamond
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
RayeRen/acad-homepage.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
PufferAI/PufferLib
Simplifying reinforcement learning for complex game environments
google-deepmind/rlax
facebookresearch/nle
The NetHack Learning Environment
lafmdp/Awesome-Papers-Autonomous-Agent
A collection of recent papers on building autonomous agents. Two topics are included: RL-based and LLM-based agents.
Cledersonbc/tic-tac-toe-minimax
Minimax is an AI algorithm.
youssefHosni/Awesome-AI-Data-Guided-Projects
A curated list of data science & AI guided projects to start building your portfolio
ParisNeo/ollama_proxy_server
A proxy server for multiple ollama instances with Key security
Farama-Foundation/MAgent2
An engine for high performance multi-agent environments with very large numbers of agents, along with a set of reference environments
WindyLab/LLM-RL-Papers
Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome.
kying18/tic-tac-toe
Tic-tac-toe AI using minimax
geochri/AlphaZero_Chess
PyTorch implementation of AlphaZero Chess from scratch
michaelnny/alpha_zero
A PyTorch implementation of DeepMind's AlphaZero agent to play Go and Gomoku board games
WeihaoTan/TWOSOME
Implementation of TWOSOME
luchris429/JaxLife
An Open-Ended Agentic Simulator
pickxiguapi/Uni-RLHF-Platform
Uni-RLHF platform for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024)
mgerstgrasser/super
suPER is a collaborative multi-agent RL algorithm
ninell-oldenburg/social-contracts
serenabooth/reward-design-perils
WeihaoTan/gym-macro-overcooked