Issues
CUDA version
#117 opened by jiaji-huang - 12
ERROR: Could not build wheels for pytorch-fast-transformers, which is required to install pyproject.toml-based projects
#128 opened by ouusan - 0
TypeError: canonicalize_version() got an unexpected keyword argument 'strip_trailing_zero'
#132 opened by luispintoc - 1
Full Attention does not sum to 1
#131 opened by yourj4m - 2
Can't officially save Linear Attention model
#114 opened by maulberto3 - 0
Installation error
#92 opened by davidliujiafeng - 1
ImportError
#127 opened by PaulaTeeuwen - 1
causal-linear does not use attn_mask?
#105 opened by davidliujiafeng - 4
Provenance of algorithms
#126 opened by taibai123abc - 1
Got different results for the same batch
#123 opened by gaoshan2006 - 4
Installation error on Linux
#112 opened by xxmlala - 27
Windows installation - building wheel error
#121 opened by BenoitDalFerro - 3
Windows installation - Building wheel
#106 opened by MaximeHoude - 3
pip install and C++ compilation error, then name 'compute_hashes_cuda' is not defined
#89 opened by nikjetchev - 0
Understanding how to define key, query and value for the cross attention calculation
#119 opened by neuronphysics - 0
Example for NLP
#118 opened by Bachstelze - 2
Training Language Model
#107 opened by lucasnfe - 2
Speed of recurrent model
#116 opened by mads-oestergaard - 0
Can you offer pre-built code for Linux?
#115 opened by li-car-fei - 3
Runtime error on causal_product_cpu on GCC/G++ 11
#110 opened by lsisoft - 1
Any decoder example?
#113 opened by ahmedraza1996 - 2
Installation failed on Windows
#97 opened by WRKULOL - 1
Parallel complexity of Linear Attention is O(N)?
#108 opened by haozheji - 1
Causal attention is cheating by looking into the future
#111 opened by jogardi - 0
How is the causal mask constructed when training a batched model with linear causal attention?
#109 opened by Howuhh - 9
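Several of the issues above concern how causality is enforced during batched training. As a rough illustration only (a toy NumPy sketch, not the library's actual implementation), the standard causal mask is just a lower-triangular boolean matrix so that position i can attend only to positions j <= i:

```python
import numpy as np

def causal_mask(seq_len):
    # Lower-triangular boolean mask: position i may attend to j <= i only.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

mask = causal_mask(4)
# Row i has exactly i + 1 True entries, so no position sees the future.
assert mask[0].sum() == 1 and mask[3].sum() == 4
assert not mask[0, 1]  # position 0 cannot attend to position 1
```

In a batched setting the same (seq_len, seq_len) mask is typically broadcast across the batch and head dimensions.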
Huggingface Bert vs. Fast Transformer full attention
#100 opened by lipmem - 5
Quick start raises a ModuleNotFoundError
#99 opened by CaoYiqingT - 1
CUDA error: CUBLAS_STATUS_INVALID_VALUE
#104 opened by huu4ontocord - 2
Mask and QK not of the same shape?
#101 opened by Baldwin-disso - 0
Can't import causal_product_cuda
#96 opened by 15805383399 - 1
Support for clustered attention
#95 opened by TianhaoFu - 1
TypeError: forward() missing 3 required positional arguments: 'attn_mask', 'query_lengths', and 'key_lengths'
#94 opened by TianhaoFu - 2
Make fast-transformers JIT Compilable
#88 opened by AndriyMulyar - 3
CUDA version and CausalDotProduct time
#83 opened by caffeinetoomuch - 7
Tips and tricks for training linear_att
#84 opened by gaceladri - 3
Where is the sum operation of KV?
#82 opened by Yogurt928 - 3
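The last question (and the earlier ones on O(N) complexity and causal-linear masking) can be illustrated with a toy NumPy sketch. Assumptions: a single head, and the feature map phi(x) = elu(x) + 1 from the linear-attention paper; this is not the library's CUDA kernel. The "sum of KV" is the running outer-product state S_i = sum over j <= i of phi(k_j) v_j^T, and accumulating it sequentially is what makes causal linear attention O(N) instead of O(N^2):

```python
import numpy as np

def elu_feature_map(x):
    # phi(x) = elu(x) + 1, a positive feature map used for linear attention
    return np.where(x > 0, x + 1.0, np.exp(x))

def causal_linear_attention(Q, K, V):
    # Q, K, V: (seq_len, dim). Runs in O(N) by accumulating the KV
    # outer-product sum instead of materializing an N x N attention matrix.
    Qf, Kf = elu_feature_map(Q), elu_feature_map(K)
    S = np.zeros((Kf.shape[1], V.shape[1]))  # running sum of phi(k_j) v_j^T
    z = np.zeros(Kf.shape[1])                # running sum of phi(k_j), for normalization
    out = np.empty_like(V)
    for i in range(Q.shape[0]):
        S += np.outer(Kf[i], V[i])           # the "sum operation of KV"
        z += Kf[i]
        out[i] = Qf[i] @ S / (Qf[i] @ z)
    return out

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((5, 4)) for _ in range(3))
out = causal_linear_attention(Q, K, V)

# Sanity check: the recurrence matches the quadratic masked formulation.
A = np.tril(elu_feature_map(Q) @ elu_feature_map(K).T)
assert np.allclose(out, (A @ V) / A.sum(axis=1, keepdims=True))
```

Because only S and z are carried between steps, the same recurrence is what enables the recurrent (token-by-token) inference mode discussed in the issues above.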