dschaehi
I am a postdoc in the Knowledge Technology Group, University of Hamburg. I am interested in building robust neural architectures.
University of Hamburg, Hamburg, Germany
Pinned Repositories
capturing-word-order
Capturing Word Order in Averaging Based Sentence Embeddings
CLEVRER
PyTorch implementation of ICLR 2020 paper "CLEVRER: CoLlision Events for Video REpresentation and Reasoning"
commenting.sty
dfg
A LaTeX template for a basic DFG (Deutsche Forschungsgemeinschaft, German Research Foundation) grant proposal.
google-research
Google Research
inductive_counting_with_LMs
This work provides extensive empirical results on training LMs to count. We find that while traditional RNNs achieve inductive counting trivially, Transformers have to rely on positional embeddings to count out of domain. Modern RNNs (e.g., RWKV, Mamba) also largely underperform traditional RNNs in generalizing counting inductively.
looped_transformer
neural_networks_chomsky_hierarchy
Neural Networks and the Chomsky Hierarchy
nn-zero-to-hero
Neural Networks: Zero to Hero
dschaehi's Repositories
dschaehi/capturing-word-order
Capturing Word Order in Averaging Based Sentence Embeddings
dschaehi/commenting.sty
dschaehi/CLEVRER
PyTorch implementation of ICLR 2020 paper "CLEVRER: CoLlision Events for Video REpresentation and Reasoning"
dschaehi/neural_networks_chomsky_hierarchy
Neural Networks and the Chomsky Hierarchy
dschaehi/nn-zero-to-hero
Neural Networks: Zero to Hero
dschaehi/dfg
A LaTeX template for a basic DFG (Deutsche Forschungsgemeinschaft, German Research Foundation) grant proposal.
dschaehi/google-research
Google Research
dschaehi/inductive_counting_with_LMs
This work provides extensive empirical results on training LMs to count. We find that while traditional RNNs achieve inductive counting trivially, Transformers have to rely on positional embeddings to count out of domain. Modern RNNs (e.g., RWKV, Mamba) also largely underperform traditional RNNs in generalizing counting inductively.
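The inductive-counting setup can be illustrated with a toy data split: train on short token runs, then probe the model on much longer, out-of-domain runs. This is a hedged sketch only; the task encoding and length ranges here are assumptions for illustration, not the repository's actual format.

```python
import random

# Hypothetical counting task: map a run of 'a' tokens to its count.
# The string format is an illustrative assumption, not the paper's setup.
def make_example(length):
    """E.g. length=3 -> ('a a a', '3')."""
    return " ".join(["a"] * length), str(length)

def make_split(lengths, n_examples):
    """Sample examples whose lengths are drawn from the given range."""
    return [make_example(random.choice(list(lengths))) for _ in range(n_examples)]

# In-domain training lengths vs. longer out-of-domain test lengths:
# inductive generalization means accuracy should survive the length gap.
train = make_split(range(1, 11), 100)    # lengths 1..10
test = make_split(range(50, 101), 20)    # lengths 50..100
```

A model that counts inductively (as traditional RNNs do here) stays accurate on the longer test split, whereas one leaning on positional embeddings degrades beyond the training lengths.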
dschaehi/looped_transformer
dschaehi/micrograd
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
dschaehi/soho
Minimalist Hugo theme based on Hyde
dschaehi/xplique
👋 Xplique is a Neural Networks Explainability Toolbox
dschaehi/recurrent-chunked-models-regular-languages
Code of "Recurrent Transformers Trade-off Parallelism for Length Generalization on Regular Languages"