attention-is-all-you-need
There are 221 repositories under the attention-is-all-you-need topic.
jadore801120/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
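As a point of reference for the implementations listed on this page, here is a minimal sketch (not taken from any of these repositories) of the scaled dot-product attention at the core of "Attention Is All You Need", Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, written in PyTorch:

import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5      # (batch, heads, len_q, len_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)                 # attention weights over keys
    return weights @ v, weights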
Kyubyong/transformer
A TensorFlow Implementation of the Transformer: Attention Is All You Need
linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
awslabs/sockeye
Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch
gordicaleksa/pytorch-original-transformer
My implementation of the original Transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise hard-to-grasp concepts. IWSLT pretrained models are currently included.
Separius/awesome-fast-attention
A list of efficient attention modules
brightmart/bert_language_understanding
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
kaituoxu/Speech-Transformer
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
hkproj/pytorch-transformer
An implementation of "Attention Is All You Need"
lsdefine/attention-is-all-you-need-keras
A Keras+TensorFlow Implementation of the Transformer: Attention Is All You Need
kyegomez/LongNet
Implementation of the plug-and-play attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
sooftware/kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
FreedomIntelligence/TextClassificationBenchmark
A Benchmark of Text Classification in PyTorch
jayparks/transformer
A Pytorch Implementation of "Attention is All You Need" and "Weighted Transformer Network for Machine Translation"
lvapeab/nmt-keras
Neural Machine Translation with Keras
feifeibear/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for long-context Transformer model training and inference
kyegomez/CM3Leon
An open-source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multimodal AI that uses just a decoder to generate both text and images
kyegomez/ScreenAI
Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding"
sled-group/InfEdit
[CVPR 2024] Official implementation of the paper "Inversion-Free Image Editing with Natural Language"
kyegomez/PALM-E
Implementation of "PaLM-E: An Embodied Multimodal Language Model"
sgrvinod/a-PyTorch-Tutorial-to-Transformers
Attention Is All You Need | a PyTorch Tutorial to Transformers
leviswind/pytorch-transformer
A PyTorch implementation of "Attention Is All You Need"
hkproj/transformer-from-scratch-notes
Notes about "Attention is all you need" video (https://www.youtube.com/watch?v=bCz4OMemCcA)
brandokoch/attention-is-all-you-need-paper
Implementation of the original Transformer paper: Vaswani, Ashish, et al. "Attention Is All You Need." Advances in Neural Information Processing Systems, 2017.
kyegomez/RT-X
Pytorch implementation of the models RT-1-X and RT-2-X from the paper: "Open X-Embodiment: Robotic Learning Datasets and RT-X Models"
kyegomez/MambaTransformer
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
Shuijing725/CrowdNav_Prediction_AttnGraph
[ICRA 2023] Intention Aware Robot Crowd Navigation with Attention-Based Interaction Graph
kyegomez/Jamba
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
guillaume-chevalier/Linear-Attention-Recurrent-Neural-Network
A recurrent attention module consisting of an LSTM cell that can query its own past cell states by means of windowed multi-head attention. The formulas are derived from the BN-LSTM and the Transformer network. The LARNN cell with attention can be used inside a loop over the cell state, just like any other RNN cell.
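A rough sketch of the idea described above, using a standard nn.LSTMCell and nn.MultiheadAttention; the way the attention output is mixed back into the cell state below is an assumption, and the actual LARNN formulas differ in detail:

import torch
import torch.nn as nn

class WindowedAttentionLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size, num_heads=4, window=16):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.window = window

    def forward(self, x, state, past_cells):
        # x: (batch, input_size); state: (h, c); past_cells: list of past cell states
        h, c = self.cell(x, state)
        if past_cells:
            mem = torch.stack(past_cells[-self.window:], dim=1)  # (batch, window, hidden)
            ctx, _ = self.attn(h.unsqueeze(1), mem, mem)          # query past cells with h
            c = c + ctx.squeeze(1)                                # assumed mixing step
        past_cells.append(c)
        return h, c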
OutofAi/StableFace
Build your own Face App with Stable Diffusion 2.1
tnq177/transformers_without_tears
Transformers without Tears: Improving the Normalization of Self-Attention
jshuadvd/LongRoPE
Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens"
alexivaner/Deep-Learning-Based-Radio-Signal-Classification
Final Project for AI Wireless
johnsmithm/multi-heads-attention-image-classification
Multi-head attention for image classification
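A hedged sketch of this general approach (the patch size, embedding dimension, and mean-pooling below are assumptions, not the repository's exact model): run multi-head self-attention over image patches, then classify the pooled result.

import torch
import torch.nn as nn

class TinyAttentionClassifier(nn.Module):
    def __init__(self, patch_dim=16 * 16 * 3, embed_dim=128, num_heads=4, num_classes=10):
        super().__init__()
        self.embed = nn.Linear(patch_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, patches):
        # patches: (batch, num_patches, patch_dim), e.g. an image cut into 16x16 tiles
        x = self.embed(patches)
        x, _ = self.attn(x, x, x)         # self-attention across patches
        return self.head(x.mean(dim=1))   # mean-pool over patches, then classify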
kyegomez/Kosmos2.5
My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"