Pinned Repositories
alpaca-lora
Code for reproducing the Stanford Alpaca InstructLLaMA result on consumer hardware
bogaaak.github.io
deepspeed-sagemaker-example
dominance-filters
Filter actions using (cumulative) dominance
examples
TensorFlow examples
GazeHeuristic
Explore effectiveness and failure modes of the Gaze Heuristic
gradio
Create UIs for your machine learning model in Python in 3 minutes
pytetris
q-learning-introduction
Code for the talk "A gentle introduction to Q-learning in Python" held at the PyData Bristol Meetup on July 18, 2019.
stew-tetris
Shrinkage Toward Equal Weights in Tetris
janmaltel's Repositories
janmaltel/pytetris
janmaltel/stew-tetris
Shrinkage Toward Equal Weights in Tetris
janmaltel/q-learning-introduction
Code for the talk "A gentle introduction to Q-learning in Python" held at the PyData Bristol Meetup on July 18, 2019.
janmaltel/alpaca-lora
Code for reproducing the Stanford Alpaca InstructLLaMA result on consumer hardware
janmaltel/bogaaak.github.io
janmaltel/deepspeed-sagemaker-example
janmaltel/dominance-filters
Filter actions using (cumulative) dominance
janmaltel/examples
TensorFlow examples
janmaltel/GazeHeuristic
Explore effectiveness and failure modes of the Gaze Heuristic
janmaltel/gradio
Create UIs for your machine learning model in Python in 3 minutes
janmaltel/gym
A toolkit for developing and comparing reinforcement learning algorithms.
janmaltel/gym-feature-gridworld
A simple gridworld environment where state-action pairs have a simple feature representation. Uses the OpenAI Gym format.
janmaltel/models
Models and examples built with TensorFlow
janmaltel/ProMP
ProMP: Proximal Meta-Policy Search
janmaltel/pytorch-maml-rl
Reinforcement Learning with Model-Agnostic Meta-Learning in PyTorch
janmaltel/rand_param_envs
Random parameter environments using gym 0.7.4 and mujoco-py 0.5.7
janmaltel/rl-baselines-zoo
A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
janmaltel/rl-visualizations
Simple visualizations of basic RL algorithms in a simple gridworld. Written in Python; the visualizations can be viewed directly in the respective Jupyter notebooks.
janmaltel/stable-baselines
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
janmaltel/stew
Shrinkage Toward Equal Weights
janmaltel/td-gammon
Implementation of TD-Gammon in TensorFlow.
janmaltel/template
This is the repository for the Distill web framework
janmaltel/tetris
A Tetris implementation tailored for use in reinforcement learning applications.
janmaltel/toy-data
Create toy data for linear or deep machine learning models
janmaltel/v97
Proceedings of ICML 2019
janmaltel/weightagnostic.github.io
Repository for an interactive article