saschaschramm

Germany

Pinned Repositories

best-of-n-sampling
Toy example for best-of-n-sampling
7 2 01
chatgpt
Analysis of OpenAI's ChatGPT
Language: Jupyter Notebook
144 8 415
github-copilot
Analysis of the GitHub Copilot extension
Language: Python
148 4 311
LearningFromDemonstration
This project is a simplified implementation of the learning from demonstration algorithm developed by OpenAI.
Language: Python
3 2 00
MonteCarloTreeSearch
This project applies Monte Carlo Tree Search (MCTS) to a simple grid world.
Language: Python
11 2 12
PopulationBasedTraining
Asynchronous optimisation algorithm to optimise a population of models and their hyperparameters.
Language: Python
5 3 00
self-delusion
Delusions in sequence models for interaction and control
Language: Python
0 2 00
SwiftReinforce
Implementation of the REINFORCE algorithm using Swift for TensorFlow.
Language: Swift
11 3 61
TemporalDifferenceLearning
Temporal-difference learning is a method for estimating the values of all states by sampling the environment. It updates the current estimate of a state's value using previously learned estimates of other states (bootstrapping).
Language: Python
3 2 13
tiny-chatgpt
Researching the reinforcement learning algorithm of ChatGPT
Language: Jupyter Notebook
2 3 00

saschaschramm's Repositories

saschaschramm/github-copilot
Analysis of the GitHub Copilot extension
Language: Python
148 4 311
saschaschramm/chatgpt
Analysis of OpenAI's ChatGPT
Language: Jupyter Notebook
144 8 415
saschaschramm/MonteCarloTreeSearch
This project applies Monte Carlo Tree Search (MCTS) to a simple grid world.
Language: Python
11 2 12
saschaschramm/SwiftReinforce
Implementation of the REINFORCE algorithm using Swift for TensorFlow.
Language: Swift
11 3 61
saschaschramm/best-of-n-sampling
Toy example for best-of-n-sampling
7 2 01
saschaschramm/PopulationBasedTraining
Asynchronous optimisation algorithm to optimise a population of models and their hyperparameters.
Language: Python
5 3 00
saschaschramm/LearningFromDemonstration
This project is a simplified implementation of the learning from demonstration algorithm developed by OpenAI.
Language: Python
3 2 00
saschaschramm/TemporalDifferenceLearning
Temporal-difference learning is a method for estimating the values of all states by sampling the environment. It updates the current estimate of a state's value using previously learned estimates of other states (bootstrapping).
Language: Python
3 2 13
saschaschramm/tiny-chatgpt
Researching the reinforcement learning algorithm of ChatGPT
Language: Jupyter Notebook
2 3 00
saschaschramm/autopilot-for-code
ChatGPT can develop, set up, and run a complete web application
1 2 00
saschaschramm/chatgpt-eval-plugin
Very simple example of a ChatGPT plugin
Language: Python
1 3 01
saschaschramm/diff-gpt
Incremental algorithm for program synthesis
Language: Python
1 3 0
saschaschramm/MoveToBeacon
Application of reinforcement learning to StarCraft.
Language: Python
1 2 02
saschaschramm/Pong
Application of different reinforcement learning algorithms to the Atari game Pong.
Language: Python
1 3 0
saschaschramm/sc2-evals
Evaluation of GPT-4 on StarCraft II
Language: Python
1 2 0
saschaschramm/slowloris
Language: Python
1 3 0
saschaschramm/self-delusion
Delusions in sequence models for interaction and control
Language: Python
0 2 00
saschaschramm/A2C
Synchronous implementation of the A3C algorithm.
Language: Python
3 0
saschaschramm/codex
Evaluating the Codex language model from OpenAI
2 0
saschaschramm/language-models
Language models
Language: Python
3 0
saschaschramm/LSTM
Shows how the BasicLSTMCell is implemented internally in TensorFlow.
Language: Python
2 01
saschaschramm/mabuc
Bandits with unobserved confounders
Language: Jupyter Notebook
2 0
saschaschramm/mlflow
Open source platform for the machine learning lifecycle
Language: Python
1 0
saschaschramm/msteams-tts
Text-to-Speech for Microsoft Teams
Language: Python
1 0
saschaschramm/MultiArmedBandits
Application of the stochastic gradient ascent algorithm to the multi-armed bandit problem.
Language: Swift
2 0
saschaschramm/PolicyGradientMethods
Reinforcement learning methods that learn a parameterized policy. These methods learn by approximating the gradient of a performance measure with respect to its policy parameters.
Language: Python
2 0
saschaschramm/pysc2
StarCraft II Learning Environment
Language: Python
1 0
saschaschramm/QLearning
Implementation of the Q-Learning algorithm.
Language: Python
2 0
saschaschramm/ReinforcementLearningBasics
Basics of Reinforcement Learning.
Language: Python
2 0
saschaschramm/unobserved-confounders
Simple example of unobserved confounders and language models
3 0
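
The TemporalDifferenceLearning description above mentions bootstrapping: updating a state's value estimate from the current estimate of the next state. As a rough illustration only, and not the code from that repository, here is a minimal tabular TD(0) sketch. The environment (a five-state random walk with a reward of 1 for exiting on the right), the function name `td0_random_walk`, and all parameter values are assumptions chosen for this example.

```python
import random


def td0_random_walk(num_states=5, episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values of a random-walk chain with tabular TD(0).

    Hypothetical toy environment: states 1..num_states, terminal states at
    both ends, reward 1.0 only when the walk exits on the right.
    """
    rng = random.Random(seed)
    # Index 0 and index num_states + 1 are terminal; their value stays 0.
    values = [0.0] * (num_states + 2)
    for _ in range(episodes):
        state = (num_states + 1) // 2  # start each episode in the middle
        while 0 < state < num_states + 1:
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == num_states + 1 else 0.0
            # Bootstrapping: the target uses the current estimate of next_state
            # instead of waiting for the full return of the episode.
            td_target = reward + gamma * values[next_state]
            values[state] += alpha * (td_target - values[state])
            state = next_state
    return values[1:-1]


if __name__ == "__main__":
    for index, value in enumerate(td0_random_walk(), start=1):
        print(f"state {index}: {value:.2f}")
```

For this toy chain the true value of state i is i/6, so the estimates should settle near those fractions, with some noise from the constant step size.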