multi-head-attention
There are 37 repositories under the multi-head-attention topic.
sooftware/attentions
PyTorch implementations of several attention mechanisms for deep learning researchers.
imperial-qore/TranAD
[VLDB'22] Anomaly Detection using Transformers, self-conditioning and adversarial training.
anicolson/DeepXi
Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
poloclub/dodrio
Exploring attention weights in transformer-based models with linguistic knowledge.
Rintarooo/VRP_DRL_MHA
"Attention, Learn to Solve Routing Problems!"[Kool+, 2019], Capacitated Vehicle Routing Problem solver
monk1337/Various-Attention-mechanisms
This repository contains various types of attention mechanisms, such as Bahdanau, soft, additive, and hierarchical attention, in PyTorch, TensorFlow, and Keras.
datnnt1997/multi-head_self-attention
A faster PyTorch implementation of multi-head self-attention.
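Since several entries in this list implement this module directly, a minimal PyTorch sketch of multi-head self-attention may help orient readers. It is an illustrative baseline, not this repository's code, and d_model/n_heads are assumed example values.

```python
# Minimal multi-head self-attention sketch (illustrative, not this repo's code).
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model=512, n_heads=8):  # assumed example sizes
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # fused Q, K, V projections
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq, d_model)
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # split each projection into heads: (batch, heads, seq, d_head)
        q, k, v = (z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
                   for z in (q, k, v))
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5  # scaled dot products
        ctx = scores.softmax(dim=-1) @ v                       # weighted values
        ctx = ctx.transpose(1, 2).reshape(b, t, -1)            # merge heads
        return self.out(ctx)
```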
zhaocq-nlp/Attention-Visualization
Visualization for simple attention and Google's multi-head attention.
youngbin-ro/Multi2OIE
Multi^2OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT (Findings of ACL: EMNLP 2020)
ShaneTian/Att-Induction
Attention-based Induction Networks for Few-Shot Text Classification
JacobHanimann/scDINO
Self-Supervised Vision Transformers for multiplexed imaging datasets
knotgrass/attention
Several types of attention modules written in PyTorch.
engelnico/point-transformer
This is the official repository of the original Point Transformer architecture.
IParraMartin/An-Explanation-Is-All-You-Need
The original Transformer implemented from scratch, with informative comments on each block.
Bruce-Lee-LY/flash_attention_inference
Benchmarks of the C++ interfaces of FlashAttention and FlashAttention-2 in large language model (LLM) inference scenarios.
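As background for what such benchmarks exercise: since PyTorch 2.0 the same class of fused kernels is reachable from Python via torch.nn.functional.scaled_dot_product_attention, which can dispatch to a FlashAttention kernel on supported GPUs. A sketch with illustrative shapes, unrelated to this repo's C++ interface:

```python
# Illustrative call into PyTorch's fused attention (PyTorch >= 2.0), which may
# dispatch to a FlashAttention kernel on supported CUDA GPUs.
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)  # (batch, heads, seq, d_head)
k, v = torch.randn_like(q), torch.randn_like(q)
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```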
shifop/datagrand_bert
5th-place solution code for the 2019 Datagrand Cup information extraction competition.
jack57lee/Diversify-MHA
EMNLP 2018: Multi-Head Attention with Disagreement Regularization; NAACL 2019: Information Aggregation for Multi-Head Attention with Routing-by-Agreement
Zminghua/SentEncoding
Sentence encoder and training code for Mean-Max AAE
Bruce-Lee-LY/decoding_attention
Decoding Attention optimizes multi-head attention (MHA) with CUDA cores for the decoding stage of LLM inference.
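To make the decoding-stage setting concrete: at each step a single new query attends over the cached keys and values of all previous tokens. A plain PyTorch reference of that computation (the repo itself implements this in CUDA; shapes are assumptions):

```python
# Reference decode-step attention: one query token vs. the KV cache
# (plain PyTorch for clarity; the repo above implements this with CUDA cores).
import torch

def decode_step_attention(q, k_cache, v_cache):
    # q: (batch, heads, 1, d_head); k_cache/v_cache: (batch, heads, seq, d_head)
    scores = q @ k_cache.transpose(-2, -1) / q.shape[-1] ** 0.5  # (batch, heads, 1, seq)
    return scores.softmax(dim=-1) @ v_cache                      # (batch, heads, 1, d_head)
```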
M-e-r-c-u-r-y/pytorch-transformers
Collection of different types of transformers for learning purposes
shreyas-kowshik/nlp4if
Code for the runner-up entry in the English subtask of the Shared Task on Fighting the COVID-19 Infodemic, NLP4IF workshop, NAACL'21.
tranquoctrinh/Image-Captioning-EfficientNet-Transformer
Image captioning with an EfficientNet encoder and a Transformer decoder combined with the attention mechanism.
YigitTurali/HydraViT
A PyTorch implementation of HydraViT, an adaptive multi-branch transformer for multi-label disease classification from chest X-ray images. The repository provides code to train and evaluate the model on the NIH Chest X-ray dataset.
pi-tau/transformer
The Transformer model implemented from scratch using PyTorch. The model uses weight sharing between the embedding layers and the pre-softmax linear layer. Training on the Multi30k machine translation task is shown.
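The weight sharing mentioned here is straightforward to express; a minimal sketch, assuming illustrative vocab_size/d_model values rather than this repo's configuration:

```python
# Tying the token embedding to the pre-softmax projection: both reuse one
# (vocab_size, d_model) matrix (sizes here are assumed, not the repo's).
import torch.nn as nn

vocab_size, d_model = 32000, 512
embedding = nn.Embedding(vocab_size, d_model)
lm_head = nn.Linear(d_model, vocab_size, bias=False)
lm_head.weight = embedding.weight  # shared parameter, updated once per step
```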
tanishqgautam/Transformers
PyTorch implementation of Transformers.
AIMedLab/DeepCE
Code and datasets for the paper "A deep learning framework for high-throughput mechanism-driven phenotype compound screening and its application to COVID-19 drug repurposing", published in Nature Machine Intelligence in 2021.
gazelle93/Attention-Various-Positional-Encoding
This project implements the scaled dot-product attention layer and the multi-head attention layer with various positional encoding methods.
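For reference, the fixed sinusoidal encoding from "Attention Is All You Need" is one of the positional encoding methods such a project typically covers; a minimal sketch, illustrative rather than this repo's code:

```python
# Sinusoidal positional encoding sketch: even dims get sin, odd dims get cos
# (assumes d_model is even).
import math
import torch

def sinusoidal_encoding(seq_len, d_model):
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)       # (seq_len, 1)
    freq = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                     * (-math.log(10000.0) / d_model))                  # (d_model/2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(pos * freq)
    pe[:, 1::2] = torch.cos(pos * freq)
    return pe                                                           # (seq_len, d_model)
```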
dev-geof/final-state-transformer
A machine learning development toolkit built on Transformer encoder architectures, tailored for high-energy physics and particle-collision event analysis.
liaoyanqing666/transformer_pytorch
A complete implementation of the original Transformer.
navreeetkaur/learn-to-pay-attention
TensorFlow implementation of AlexNet with a multi-headed attention mechanism.
SpydazWebAI-NLP/BasicNeuralNetWork2023
A basic multi-layered neural network with attention masking features.
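Assuming "attention masking" here means the usual causal mask (an assumption; the repo may mean something broader), the core idea fits in a few lines:

```python
# Causal attention masking sketch: future positions are set to -inf so the
# softmax assigns them zero weight (assumed meaning of "masking features").
import torch

seq_len = 5
scores = torch.randn(seq_len, seq_len)                 # raw attention logits
mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
attn = scores.masked_fill(mask, float("-inf")).softmax(dim=-1)
```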
young-zonglin/yangzl-deep-text-matching
Text matching using several deep models.
sushantkumar23/nano-gpt
A simple character-level Transformer.
tate8/translator
Transformer translator website with multithreaded web server in Rust
TmohamedashrafT/vision-transformer-implementation
This repository contains code implementing the Vision Transformer (ViT) model for image classification.
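The distinctive ViT step is turning an image into a token sequence; a common sketch uses a strided convolution to embed non-overlapping patches (patch_size/d_model are assumed example values, not this repo's configuration):

```python
# ViT-style patch embedding sketch: a stride-p convolution cuts the image into
# non-overlapping p x p patches and projects each to a d_model-dim token.
import torch
import torch.nn as nn

patch_size, d_model = 16, 768                         # assumed example values
patch_embed = nn.Conv2d(3, d_model, kernel_size=patch_size, stride=patch_size)
img = torch.randn(1, 3, 224, 224)
tokens = patch_embed(img).flatten(2).transpose(1, 2)  # (1, 196, 768)
```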
sajith-rahim/transformer-classifier
A Transformer classifier implemented from scratch.