Pinned Repositories
alpha-zero-gomoku
A Multi-threaded Implementation of AlphaZero (C++)
Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
cuda-neural-network
Convolutional Neural Network with CUDA (MNIST 99.23%)
deep-reinforcement-learning-notes
Deep Reinforcement Learning Notes
mini-interpreter
A Simple Scripting Language
mini-os-kernel
A mini Unix-Like OS kernel
noisy-mappo
Multi-agent PPO with noise (97% win rates on Hard scenarios of SMAC)
pymarl2
Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)
reinforcement-learning-wechat-jump
Reinforcement Learning for WeChat Jump
OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
hijkzzz's Repositories
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
hijkzzz/pymarl2
Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)
hijkzzz/alpha-zero-gomoku
A Multi-threaded Implementation of AlphaZero (C++)
hijkzzz/cuda-neural-network
Convolutional Neural Network with CUDA (MNIST 99.23%)
hijkzzz/deep-reinforcement-learning-notes
Deep Reinforcement Learning Notes
hijkzzz/mini-os-kernel
A mini Unix-Like OS kernel
hijkzzz/reinforcement-learning-wechat-jump
Reinforcement Learning for WeChat Jump
hijkzzz/mini-interpreter
A Simple Scripting Language
hijkzzz/prisma
Prisma
hijkzzz/dht-crawler
A DHT Crawler based on Goroutines
hijkzzz/web-server
A Web Server designed with Reactor I/O Model
hijkzzz/noisy-mappo
Multi-agent PPO with noise (97% win rates on Hard scenarios of SMAC)
hijkzzz/deep-learning-notes
Deep Learning Notes
hijkzzz/reinforcement-learning-trading-robot
Trading Robot based on LSTM-PPO
hijkzzz/awesome-RLHF
A curated list of reinforcement learning from human feedback (RLHF) resources (continually updated)
hijkzzz/dotfiles
Configuration files
hijkzzz/hijkzzz.github.io
Homepage
hijkzzz/Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
hijkzzz/leetcode
LeetCode & LintCode
hijkzzz/Awesome-LLM-Long-Context-Modeling
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
hijkzzz/2025
hijkzzz/hijkzzz
hijkzzz/llamafia.github.io
hijkzzz/mame-street-fighter-3-ai
Reinforcement Learning for Street Fighter III: 3rd Strike
hijkzzz/NTU-Thesis-LaTeX-Template
🎓 Unofficial LaTeX templates for graduate theses (both master's theses and doctoral dissertations) at National Taiwan University.
hijkzzz/reinforcement-learning.pytorch
Reinforcement Learning Library
hijkzzz/staging
iclr-blogposts.github.io/staging
hijkzzz/termux-jupyter
Termux init script