Pinned Repositories
autoascend
The first-place solution for the NeurIPS 2021 NetHack Challenge -- https://www.aicrowd.com/challenges/neurips-2021-the-nethack-challenge
baba-is-ai
crafter
Benchmarking the Spectrum of Agent Capabilities
dungeonsdata-neurips2022
Dataset instructions and tutorials for submission to NeurIPS 2022
fast_inference
finetuning-RL-as-CL
how-to-use-plgrid
LLAMA-compression
llm_debate
Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"
sample-factory
High-throughput synchronous and asynchronous reinforcement learning
BartekCupial's Repositories
BartekCupial/sample-factory
High-throughput synchronous and asynchronous reinforcement learning
BartekCupial/finetuning-RL-as-CL
BartekCupial/how-to-use-plgrid
BartekCupial/LLAMA-compression
BartekCupial/autoascend
The first-place solution for the NeurIPS 2021 NetHack Challenge -- https://www.aicrowd.com/challenges/neurips-2021-the-nethack-challenge
BartekCupial/baba-is-ai
BartekCupial/crafter
Benchmarking the Spectrum of Agent Capabilities
BartekCupial/dungeonsdata-neurips2022
Dataset instructions and tutorials for submission to NeurIPS 2022
BartekCupial/fast_inference
BartekCupial/llm_debate
Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"
BartekCupial/minihack
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
BartekCupial/nle-code-wrapper
BartekCupial/nle-dashboard
BartekCupial/nle-demo
BartekCupial/nle-utils
BartekCupial/sample-pretrain
BartekCupial/CodeXGlue-defects
BartekCupial/BartekCupial.github.io
BartekCupial/Finetune-RL-as-CL
BartekCupial/Minigrid
Simple and easily configurable grid world environments for reinforcement learning
BartekCupial/nle-language-wrapper
NetHack Learning Environment wrapper for a language interface
BartekCupial/publications_2024
IDEAS scientific achievements
BartekCupial/rl-starter-files
RL starter files to immediately train, visualize, and evaluate an agent without writing a single line of code
BartekCupial/torch-ac
Recurrent and multi-process PyTorch implementation of the deep reinforcement learning Actor-Critic algorithms A2C and PPO