Pinned Repositories
agents
TF-Agents: A reliable, scalable, and easy-to-use TensorFlow library for Contextual Bandits and Reinforcement Learning.
awesome-deep-rl
For deep RL and the future of AI.
azure-cli-cheatsheet
Azure CLI Cheatsheet
BERT-pytorch
Google AI 2018 BERT PyTorch implementation
c-planning
cs162-group
HIR
MADE
NovelD
TEMPERA
tianjunz's Repositories
tianjunz/HIR
tianjunz/TEMPERA
tianjunz/NovelD
tianjunz/MADE
tianjunz/agents
TF-Agents: A reliable, scalable, and easy-to-use TensorFlow library for Contextual Bandits and Reinforcement Learning.
tianjunz/awesome-deep-rl
For deep RL and the future of AI.
tianjunz/azure-cli-cheatsheet
Azure CLI Cheatsheet
tianjunz/c-planning
tianjunz/DeepSpeedExamples
Example models using DeepSpeed
tianjunz/dreamerv2
Mastering Atari with Discrete World Models
tianjunz/guidance
A guidance language for controlling large language models.
tianjunz/gym
A toolkit for developing and comparing reinforcement learning algorithms.
tianjunz/Learn_Prompting
tianjunz/marLo
Multi Agent Reinforcement Learning using MalmÖ
tianjunz/MemGPT
Create LLM agents with long-term memory and custom tools 📚🦙
tianjunz/metaseq
Repo for external large-scale work
tianjunz/ml-agents
Unity Machine Learning Agents Toolkit
tianjunz/my-offlinerl
tianjunz/ort
Accelerate PyTorch models with ONNX Runtime
tianjunz/overcooked_ai
A benchmark environment for fully cooperative multi-agent performance.
tianjunz/poet
ML model training for edge devices
tianjunz/pymarl
Python Multi-Agent Reinforcement Learning framework
tianjunz/python
Official Python client library for kubernetes
tianjunz/pytorch-a2c-ppo-acktr-gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
tianjunz/PyTorch-GAN
PyTorch implementations of Generative Adversarial Networks.
tianjunz/raft
tianjunz/softlearning
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.
tianjunz/tianjunz.github.io
tianjunz/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
tianjunz/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs