SkAndMl/gpt-variations
Code for the paper "Towards smaller, faster decoder-only transformers: Architectural variants and their implications"
Python · Apache-2.0