RobertKirk

PhD student at @ucl-dark. Interested in understanding LLM fine-tuning, AI safety and (super)alignment.

@ucl-dark
London

Pinned Repositories

minihack
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
Language: Python 484 13 4059
rlfh-gen-div
This is code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity"
Language: Python 38 3 35
dotfiles
A collection of personal scripts, aliases and the like from my personal software engineering practice
Language: Vim script 2 0 00
Graph-Comonads-from-Pebble-Games
Master Thesis code: Implementing Game Comonads in Finite Model Theory using Dependent Types in Idris
Language: Idris 3 0 00
roam-solarized-theme
A strict solarized Roam Research theme
Language: CSS 2 1 00
roam-tools
A small but growing collection of tools for Roam Research
Language: Python 4 2 01
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language: Python 0 0 00
tinystories-wrappers
Code for the TinyStories experiments from "Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks".
Language: Jupyter Notebook 5 2 22
tmux-ram
Plug and play RAM percentage and icon indicator for Tmux
Language: Shell 1 0 01

RobertKirk's Repositories

RobertKirk/tinystories-wrappers
Code for the TinyStories experiments from "Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks".
Language: Jupyter Notebook 5 2 22
RobertKirk/roam-tools
A small but growing collection of tools for Roam Research
Language: Python 4 2 01
RobertKirk/Graph-Comonads-from-Pebble-Games
Master Thesis code: Implementing Game Comonads in Finite Model Theory using Dependent Types in Idris
Language: Idris 3 0 00
RobertKirk/dotfiles
A collection of personal scripts, aliases and the like from my personal software engineering practice
Language: Vim script 2 0 00
RobertKirk/roam-solarized-theme
A strict solarized Roam Research theme
Language: CSS 2 1 00
RobertKirk/tmux-ram
Plug and play RAM percentage and icon indicator for Tmux
Language: Shell 1 0 01
RobertKirk/client
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
Language: Python 0 0 00
RobertKirk/DeepRLAlgos
A collection of my own implementations of a variety of DeepRL Algorithms
Language: Jupyter Notebook 0 0 00
RobertKirk/phasic-policy-gradient
Code for the paper "Phasic Policy Gradient"
Language: Python 00
RobertKirk/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language: Python 0 0 00
RobertKirk/check_pdb_hook
Pre-commit hook to check for exposed PDB statements in Python files
Language: Python 0 0
RobertKirk/dmcontrol-generalization-benchmark
DMControl Generalization Benchmark
Language: Python 0 0
RobertKirk/dmenu
My personal dmenu fork
Language: C 0 0
RobertKirk/dwm
My personal fork of dwm
Language: C 0 0
RobertKirk/homebrew-neovim-nightly
Homebrew Cask tap for nightly neovim
Language: Ruby 0 0
RobertKirk/marge-bot
A merge-bot for GitLab
Language: Python 0 0
RobertKirk/nle
The NetHack Learning Environment
Language: C 0 0
RobertKirk/rlfh-gen-div
This is code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity"
RobertKirk/RobertKirk.github.io
Personal blog
Language: SCSS 1 0
RobertKirk/RSSPlaylister
Language: TypeScript 0 0
RobertKirk/scholar-alert-digest
Aggregate unread emails from Google Scholar alerts
Language: Go 0 0
RobertKirk/st
My fork of Simple terminal, with some patches and colours applied.
Language: C 0 0
RobertKirk/surfingkeys-conf
A SurfingKeys configuration which adds 200+ key mappings for 17+ unique sites and OmniBar search suggestions for 45+ sites
Language: JavaScript 0 0
RobertKirk/trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
Language: Python 0 01
RobertKirk/voyager
🚀 Secure HAProxy Ingress Controller for Kubernetes
Language: Go 0 0
RobertKirk/weak-to-strong
Language: Python 0 0
