dschaehi
I am a postdoc in the Knowledge Technology Group, University of Hamburg. I am interested in building robust neural architectures.
University of Hamburg, Hamburg, Germany
Pinned Repositories
capturing-word-order
Capturing Word Order in Averaging Based Sentence Embeddings
CLEVRER
PyTorch implementation of ICLR 2020 paper "CLEVRER: CoLlision Events for Video REpresentation and Reasoning"
commenting.sty
dfg
A LaTeX template for a basic DFG (Deutsche Forschungsgemeinschaft, German Research Foundation) grant proposal.
google-research
Google Research
inductive_counting_with_LMs
This work provides extensive empirical results on training LMs to count. We find that while traditional RNNs achieve inductive counting trivially, Transformers have to rely on positional embeddings to count out of domain. Modern RNNs (e.g., RWKV, Mamba) also largely underperform traditional RNNs in generalizing counting inductively.
looped_transformer
neural_networks_chomsky_hierarchy
Neural Networks and the Chomsky Hierarchy
nn-zero-to-hero
Neural Networks: Zero to Hero
dschaehi's Repositories
dschaehi/capturing-word-order
Capturing Word Order in Averaging Based Sentence Embeddings
dschaehi/commenting.sty
dschaehi/CLEVRER
PyTorch implementation of ICLR 2020 paper "CLEVRER: CoLlision Events for Video REpresentation and Reasoning"
dschaehi/neural_networks_chomsky_hierarchy
Neural Networks and the Chomsky Hierarchy
dschaehi/nn-zero-to-hero
Neural Networks: Zero to Hero
dschaehi/dfg
A LaTeX template for a basic DFG (Deutsche Forschungsgemeinschaft, German Research Foundation) grant proposal.
dschaehi/google-research
Google Research
dschaehi/inductive_counting_with_LMs
This work provides extensive empirical results on training LMs to count. We find that while traditional RNNs achieve inductive counting trivially, Transformers have to rely on positional embeddings to count out of domain. Modern RNNs (e.g., RWKV, Mamba) also largely underperform traditional RNNs in generalizing counting inductively.
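The inductive-counting setup can be illustrated with a toy data split: train on short token runs, then probe the model on much longer, out-of-domain runs. This is a hedged sketch only; the task encoding and length ranges here are assumptions for illustration, not the repository's actual format.

```python
import random

# Hypothetical counting task: map a run of 'a' tokens to its count.
# The string format is an illustrative assumption, not the paper's setup.
def make_example(length):
    """E.g. length=3 -> ('a a a', '3')."""
    return " ".join(["a"] * length), str(length)

def make_split(lengths, n_examples):
    """Sample examples whose lengths are drawn from the given range."""
    return [make_example(random.choice(list(lengths))) for _ in range(n_examples)]

# In-domain training lengths vs. longer out-of-domain test lengths:
# inductive generalization means accuracy should survive the length gap.
train = make_split(range(1, 11), 100)    # lengths 1..10
test = make_split(range(50, 101), 20)    # lengths 50..100
```

A model that counts inductively (as traditional RNNs do here) stays accurate on the longer test split, whereas one leaning on positional embeddings degrades beyond the training lengths.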
dschaehi/looped_transformer
dschaehi/micrograd
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
dschaehi/soho
Minimalist Hugo theme based on Hyde
dschaehi/xplique
👋 Xplique is a Neural Networks Explainability Toolbox
dschaehi/recurrent-chunked-models-regular-languages
Code of "Recurrent Transformers Trade-off Parallelism for Length Generalization on Regular Languages"