epfml/landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
Python · Apache-2.0
Issues
- Fine-tuned Weights Doesn't Work (#16, opened by baotruyenthach, 0 comments)
- Question about training stability (#14, opened by meme-virus, 0 comments)
- Peft module (#10, opened by NicolasMejiaPetit, 0 comments)
- How much VRAM do you need to run Inference? (#9, opened by FFFiend, 3 comments)
- hello! i reached out on twitter before the release and received the link here when you guys dropped this (#5, opened by Alignment-Lab-AI, 1 comment)
- Readme clarification (#3, opened by StrangeTcy, 0 comments)
- model_name_or_path defaults (#4, opened by StrangeTcy)