Issues
- is it possible to use groq (llama 3.1 70b) api to compress text instead of running your own model? (#20, opened by sprappcom, 1 comment)
- how to make n_gpu_layers work with cuda? (#19, opened by sprappcom, 1 comment; see the GPU-offload sketch after this list)
- ever tried tinyllama or smaller llamas? (#16, opened by sprappcom, 1 comment)
- possible to provide benchmark for these? (#18, opened by sprappcom, 2 comments)
- Does not work on all files because of utf-8 error (#13, opened by secemp9, 7 comments)
- Using different models like Phi-3 (#5, opened by CyberTimon, 2 comments)
- ollama version? (#14, opened by sokoow, 15 comments)
- Post compression ratios (#3, opened by lee-b, 1 comment)
- compare with brotli (#11, opened by jyrkialakuijala, 5 comments)
- Interesting side effect of decompression - original training data extraction (#10, opened by bigattichouse, 1 comment)
- Question (#7, opened by dillfrescott, 6 comments)
- Perform compression in batches for texts exceeding the 8192 token limit of llama3. (#1, opened by dillfrescott, 2 comments; see the batching sketch after this list)
- Gibberish produced on 1 word spaces (#2, opened by P3GLEG)