Issues
aria2c 'magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA' not working
#87 opened by Nolyzlel - 0
Question regarding EnergonAI repo
#114 opened by philipp-fischer - 7
12GB card
#109 opened by arthurwolf - 0
Quick Question
#112 opened by ArgusK17 - 9
Downloading the 7B model seems stuck
#104 opened by guanlinz - 2
About rotary embedding in LLaMA
#83 opened by irasin - 0
How to run an interactive mode in Jupyter?
#111 opened by myrainbowandsky - 3
Quantize Original LLaMA Model Files
#60 opened by htcml - 1
no module named llama
#108 opened by Cooper-Ji - 1
NVMLError_NoPermission: Insufficient Permissions
#106 opened by sz2three - 0
Download watchdog kicking in? (M1 Mac)
#103 opened by kryt - 6
Cannot run on Mac with Python 3.11.3
#81 opened by kornhill - 13
Downloading gets stuck in an infinite loop
#56 opened by jarimustonen - 0
RuntimeError: Error(s) in loading state_dict for LLaMAForCausalLM: Unexpected key(s) in state_dict:
#102 opened by ZealHua - 4
RecursionError running llama.download
#90 opened by anyangpeng - 5
Why is params.json empty?
#93 opened by ItsCRC - 1
Strange characters
#82 opened by webpolis - 0
RecursionError: maximum recursion depth exceeded while calling a Python object
#101 opened by Vaibhav11002 - 0
shape mismatch error
#100 opened by Celppu - 0
Parameter incorrect when I run the make command
#98 opened by GameDevKitY - 4
GPTQ GitHub
#97 opened by austinmw - 0
Try Modular - Mojo
#96 opened by eznix86 - 0
Randomly get shape mismatch error
#95 opened by vedantroy - 0
Does this include the GPTQ quantization tricks?
#94 opened by vedantroy - 0
Quantize issue
#92 opened by ZenekZombie - 1
Is it possible to quantize a locally converted model instead of downloading from Hugging Face?
#91 opened by chigkim - 2
downloading file to pyllama_data/30B/consolidated.00.pth ...please wait for a few minutes ...
#88 opened by Nolyzlel - 1
Apply Delta failed
#85 opened by majidbhatti - 2
Already quantized to 4-bit and got the model pyllama-7B4b.pt, but it cannot run on an RTX 3080; reports torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB (GPU 0; 10.00 GiB total capacity; 9.24 GiB already allocated;
#57 opened by elven2016 - 1
Can't see progress bar
#66 opened by rahulvigneswaran - 0
Killed
#62 opened by javierp183 - 1
A question about single-GPU inference
#74 opened by TitleZ99 - 0
Inference error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 18: invalid continuation byte
#80 opened by MaiziXiao - 6
Quantizing LLaMA 7B: the MD5 value and model size do not match the values in the README
#73 opened by balcklive - 1
Quantized version link suspect
#78 opened by thistleknot - 2
Can't Load Quantized Model with GPTQ-for-LLaMa
#75 opened by chigkim - 0
Document whether it works with CPU / macOS
#69 opened by ikamensh - 0
Has black formatting been considered?
#65 opened by tanitna - 0
Error trying Quantize 7B model to 8-bit
#55 opened by guoti777