reasondk

reasondk's Stars

karpathy/llm.c
LLM training in simple, raw C/CUDA
Language: Cuda 25.1k 252 1412.9k
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
Language: Jupyter Notebook 14k 99 181.1k
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Language: Python 13.1k 170 247910
numba/numba
NumPy aware dynamic Python compiler using LLVM
Language: Python 10.1k 198 5.3k1.1k
NVIDIA/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Language: Python 8.5k 100 1.2k1.4k
tensorlayer/TensorLayer
Deep Learning and Reinforcement Learning Library for Scientists and Engineers
Language: Python 7.3k 457 4671.6k
yandexdataschool/Practical_RL
A course in reinforcement learning in the wild
Language: Jupyter Notebook 6k 210 1861.7k
keras-rl/keras-rl
Deep Reinforcement Learning for Keras.
Language: Python 5.5k 200 2411.4k
google-research/timesfm
TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
Language: Python 4.2k 41 138367
CrazyBoyM/llama3-Chinese-chat
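Llama3 and Llama3.1 Chinese repository (companion book in progress... interesting weights from community and vendor fine-tunes and modified versions, plus tutorial videos and docs on training, inference, evaluation, and deployment)
Language: Python 4.1k 46 54339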
sweetice/Deep-reinforcement-learning-with-pytorch
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....
Language: Python 4.1k 36 35861
wangshusen/DRL
Deep Reinforcement Learning
3.5k 42 55595
opendilab/DI-engine
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
Language: Python 3.2k 23 215388
NeuronDance/DeepRL
Deep Reinforcement Learning Lab, a platform designed to make DRL technology and fun for everyone
2.4k 100 6583
XinJingHao/DRL-Pytorch
Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)
Language: Python 1.8k 10 9216
AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
Language: Python 1.6k 39 102310
coreylynch/async-rl
TensorFlow + Keras + OpenAI Gym implementation of 1-step Q Learning from "Asynchronous Methods for Deep Reinforcement Learning"
Language: Python 1k 68 24173
luchris429/purejaxrl
Really Fast End-to-End Jax RL Implementations
Language: Python 797 13 2468
cbyn/bitpredict
Machine learning for high frequency bitcoin price prediction
Language: Python 757 98 0220
wdndev/llama3-from-scratch-zh
Implementing llama3 from scratch, Chinese edition
Language: Jupyter Notebook 633 2 267
marload/DeepRL-TensorFlow2
🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2
Language: Python 605 19 8141
google-deepmind/dqn_zoo
DQN Zoo is a collection of reference implementations of reinforcement learning agents developed at DeepMind based on the Deep Q-Network (DQN) agent.
Language: Python 463 18 2381
jaromiru/AI-blog
Accompanying repository for Let's make a DQN / A3C series.
Language: Python 395 23 20173
cyoon1729/deep-Q-networks
Implementations of algorithms from the Q-learning family. Implementations include: DQN, DDQN, Dueling DQN, PER+DQN, Noisy DQN, C51
Language: Jupyter Notebook 283 6 382
THINK989/Real-Time-Stock-Market-Prediction-using-Ensemble-DL-and-Rainbow-DQN
Language: Python 188 6 441
Kaixhin/NoisyNet-A3C
Noisy Networks for Exploration
Language: Python 186 10 626
CrazyBoyM/llama2-Chinese-chat
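The first Chinese llama2 13b model (Base + Chinese dialogue SFT, enabling fluent multi-turn natural-language interaction between humans and the model)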
89 3 67
LuEE-C/PPO-Keras
My implementation of the Proximal Policy Optimisation algorithm using Keras as a backend
Language: Python 88 6 1424
cocolico14/N-step-Dueling-DDQN-PER-Pacman
Using N-step dueling DDQN with PER for playing Pacman game
Language: Python 22 3 13
Jannik0/RUG_ReinforcementLearning
Language: Python 3 0 01