xtma

Ph.D. from Tsinghua University. Interested in reinforcement learning and agents.

Tsinghua University

Pinned Repositories

apo
Average-Reward Reinforcement Learning with Trust Region Methods
Language: Python | 6 1 03
dsac
Distributional Soft Actor Critic
Language: Python | 50 1 510
msvpo
The official implementation of "Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning"
Language: Python | 2 1 00
PGPortfolio
PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf).
Language: Python | 0 1 00
pytorch-a2c-ppo-acktr-gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Language: Python | 1 1 00
pytorch_car_caring
Reinforcement Learning for Gym CarRacing-v0 with PyTorch
Language: Python | 152 5 437
ray-maddpg
MADDPG implementation with Ray
Language: Python | 1 2 00
simple-pytorch-rl
Reinforcement Learning Methods with PyTorch
Language: Python | 38 1 314
vimrc
The ultimate Vim configuration: vimrc
Language: Vim script | 1 0 00
xtma.github.io
Language: SCSS | 1 1 04

xtma's Repositories

xtma/pytorch_car_caring
Reinforcement Learning for Gym CarRacing-v0 with PyTorch
Language: Python | 152 5 437
xtma/dsac
Distributional Soft Actor Critic
Language: Python | 50 1 510
xtma/simple-pytorch-rl
Reinforcement Learning Methods with PyTorch
Language: Python | 38 1 314
xtma/apo
Average-Reward Reinforcement Learning with Trust Region Methods
Language: Python | 6 1 03
xtma/msvpo
The official implementation of "Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning"
Language: Python | 2 1 00
xtma/pytorch-a2c-ppo-acktr-gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Language: Python | 1 1 00
xtma/ray-maddpg
MADDPG implementation with Ray
Language: Python | 1 2 00
xtma/vimrc
The ultimate Vim configuration: vimrc
Language: Vim script | 1 0 00
xtma/xtma.github.io
Language: SCSS | 1 1 04
xtma/PGPortfolio
PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf).
Language: Python | 0 1 00
xtma/rl-portfolio-management
Attempting to replicate "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" https://arxiv.org/abs/1706.10059 (and an openai gym environment)
Language: Jupyter Notebook | 0 1 01
xtma/rlpyt
Reinforcement Learning in PyTorch
Language: Python | 0 0
xtma/self-play-pong
RoboSchool Pong in Self-Play Mode
Language: Python | 1 0
xtma/VEM
Codes accompanying the paper "Offline Reinforcement Learning with Value-Based Episodic Memory" (ICLR 2022 https://arxiv.org/abs/2110.09796)
Language: Python | 0 0
