LAKan233

LAKan233's Stars

hiyouga/LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
Language:Python30.9k 196 4.8k3.8k
bbfamily/abu
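Abu quantitative trading system (stocks, options, futures, Bitcoin, machine learning): an open-source, Python-based framework for quantitative trading and quantitative investment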
Language:Python11.8k 743 03.7k
AI4Finance-Foundation/FinRL
FinRL: Financial Reinforcement Learning. 🔥
Language:Jupyter Notebook9.7k 201 7132.3k
eriklindernoren/Keras-GAN
Keras implementations of Generative Adversarial Networks.
Language:Python9.2k 275 2253.1k
vwxyzjn/cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Language:Python5.3k 36 182603
AI4Finance-Foundation/ElegantRL
Massively Parallel Deep Reinforcement Learning. 🔥
Language:Python3.6k 51 252833
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
3.2k 61 3200
shibing624/MedicalGPT
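MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing methods including incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.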
Language:Python3.2k 38 388488
plotly/dash-sample-apps
Open-source demos hosted on Dash Gallery
Language:Jupyter Notebook3.1k 82 2053k
allenai/RL4LMs
A modular RL library to fine-tune language models to human preferences
Language:Python2.2k 25 55191
letianzj/QuantResearch
Quantitative analysis, strategies and backtests
Language:Jupyter Notebook1.9k 64 4410
opendilab/PPOxFamily
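PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: 8 lessons to help you sort out the algorithm theory, untangle the code logic, and master practical applications of decision AI)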
Language:Python1.9k 16 17171
openai/prm800k
800,000 step-level correctness labels on LLM solutions to MATH problems
Language:Python1.4k 117 1692
zonechen1994/CV_Interview
I hope this repo can help you a lot!
1.2k 14 5213
tensorflow/gan
Tooling for GANs in TensorFlow
Language:Jupyter Notebook927 47 32246
PyPatel/Options-Trading-Strategies-in-Python
Developing Options Trading Strategies using Technical Indicators and Quantitative Methods
Language:Python790 59 4227
shariqiqbal2810/MAAC
Code for "Actor-Attention-Critic for Multi-Agent Reinforcement Learning" ICML 2019
Language:Python664 7 38173
MorvanZhou/pytorch-A3C
Simple A3C implementation with pytorch + multiprocessing
Language:Python607 14 27142
xuehy/pytorch-maddpg
A pytorch implementation of MADDPG (multi-agent deep deterministic policy gradient)
Language:Python605 12 20122
tinyzqh/light_mappo
Lightweight version of MAPPO to help you quickly migrate to your local environment.
Language:Python468 1 2179
online-books/moyu
🐟 Slack off online to relieve stress. Have you slacked off today?
Language:JavaScript343 5 3430
poloclub/dodrio
Exploring attention weights in transformer-based models with linguistic knowledge.
Language:Svelte343 6 930
SunnyGJing/t5-pegasus-chinese
Summarization and coreference resolution based on Google's T5 Chinese generative model, with support for batch generation and multiprocessing
Language:Python214 4 2634
SCHENLIU/longformer-chinese
Chinese version of Longformer
Language:Python108 3 1814
CMACH508/DeepTrader
Language:Python80 2 729
Improbable-AI/eipo
Official codebase for Redeeming Intrinsic Rewards via Constrained Policy Optimization
Language:Python77 7 26
SparkJiao/llama-pipeline-parallel
A prototype repo for hybrid training of pipeline parallel and distributed data parallel with comments on core code snippets. Feel free to copy code and launch discussions about the problems you have encountered.
Language:Python45 1 62
Y-B-Class-Projects/Human-Fall-Detection
Human Falling Detection
Language:Python45 1 213
YukiYasuda2718/rl-bsmodel-with-costs
Option hedging strategies are investigated using two reinforcement learning algorithms: deep Q network and deep deterministic policy gradient.
Language:Jupyter Notebook19 1 02
christopher-hma/STOCKS_TRADING_RL
Language:Python2