TaciturnMute's Stars
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
meta-llama/llama3
The official Meta Llama 3 GitHub site
joonspk-research/generative_agents
Generative Agents: Interactive Simulacra of Human Behavior
bulletphysics/bullet3
Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
RUCAIBox/LLMSurvey
The official GitHub page for the survey paper "A Survey of Large Language Models".
nlpxucan/WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
DLR-RM/stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
kingoflolz/mesh-transformer-jax
Model parallel transformers in JAX and Haiku
codertimo/BERT-pytorch
PyTorch implementation of Google AI's 2018 BERT
google-deepmind/alphageometry
baichuan-inc/Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
suragnair/alpha-zero-general
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
AI4Finance-Foundation/ElegantRL
Massively Parallel Deep Reinforcement Learning. 🔥
higgsfield-ai/higgsfield
Fault-tolerant, highly scalable GPU orchestration, and a machine learning framework designed for training models with billions to trillions of parameters
higgsfield/RL-Adventure
PyTorch implementation of DQN / DDQN / prioritized replay / noisy networks / distributional values / Rainbow / hierarchical RL
DLR-RM/rl-baselines3-zoo
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
MetaGLM/FinGLM
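FinGLM: dedicated to building an open, public-interest, long-lasting financial large language model project, using open source and openness to advance "AI + finance".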
sfujim/TD3
Author's PyTorch implementation of TD3 for OpenAI gym tasks
haarnoja/sac
Soft Actor-Critic
lucidrains/mixture-of-experts
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
araffin/rl-tutorial-jnrr19
Stable-Baselines tutorial for Journées Nationales de la Recherche en Robotique 2019
higgsfield/np-hard-deep-reinforcement-learning
PyTorch neural combinatorial optimization
DongChen06/MARL_CAVs
MARL for Autonomous Vehicles
lipengyuer/DataScience
polixir/NeoRL
Python interface for accessing the near real-world offline reinforcement learning (NeoRL) benchmark datasets
TianHongZXY/CoRe
[ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models
ghdrl95/stock_experiment_multimodal
Experiment source code for 'A Deep Multimodal Reinforcement Learning System Combined with CNN and LSTM for Stock Trading'
yhc582825016/NLP4math
Materials related to natural language processing and reinforcement learning
bofenghuang/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.