Issues
aria2c 'magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA' not working
#87 opened by Nolyzlel - 0
Question regarding EnergonAI repo
#114 opened by philipp-fischer - 7
12GB card
#109 opened by arthurwolf - 0
Quick Question
#112 opened by ArgusK17 - 9
Downloading the 7B model seems stuck
#104 opened by guanlinz - 2
About rotary embedding in LLaMA
#83 opened by irasin - 0
How to run an interactive mode in Jupyter?
#111 opened by myrainbowandsky - 3
Quantize Original LLaMA Model Files
#60 opened by htcml - 1
no module named llama
#108 opened by Cooper-Ji - 1
NVMLError_NoPermission: Insufficient Permissions
#106 opened by sz2three - 0
Download watchdog kicking in? (M1 Mac)
#103 opened by kryt - 6
Cannot run on Mac with Python 3.11.3
#81 opened by kornhill - 13
Downloading gets stuck in an infinite loop
#56 opened by jarimustonen - 0
RuntimeError: Error(s) in loading state_dict for LLaMAForCausalLM: Unexpected key(s) in state_dict:
#102 opened by ZealHua - 4
RecursionError running llama.download
#90 opened by anyangpeng - 5
Why is params.json empty?
#93 opened by ItsCRC - 1
Strange characters
#82 opened by webpolis - 0
RecursionError: maximum recursion depth exceeded while calling a Python object
#101 opened by Vaibhav11002 - 0
shape mismatch error
#100 opened by Celppu - 0
Parameter incorrect when I run the make command
#98 opened by GameDevKitY - 4
GPTQ GitHub
#97 opened by austinmw - 0
Try Modular - Mojo
#96 opened by eznix86 - 0
Randomly get shape mismatch error
#95 opened by vedantroy - 0
Does this include the GPTQ quantization tricks?
#94 opened by vedantroy - 0
Quantize issue
#92 opened by ZenekZombie - 1
Is it possible to quantize a locally converted model instead of downloading from Hugging Face?
#91 opened by chigkim - 2
downloading file to pyllama_data/30B/consolidated.00.pth ...please wait for a few minutes ...
#88 opened by Nolyzlel - 1
Apply Delta failed
#85 opened by majidbhatti - 2
Already quantized to 4-bit and got the model pyllama-7B4b.pt, but it cannot run on an RTX 3080; reports torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB (GPU 0; 10.00 GiB total capacity; 9.24 GiB already allocated;
#57 opened by elven2016 - 1
Can't see progress bar
#66 opened by rahulvigneswaran - 0
Killed
#62 opened by javierp183 - 1
A question about single-GPU inference
#74 opened by TitleZ99 - 0
Inference error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 18: invalid continuation byte
#80 opened by MaiziXiao - 6
Quantizing LLaMA 7B: the MD5 value and model size do not match the values in the README
#73 opened by balcklive - 1
Quantized version link suspect
#78 opened by thistleknot - 2
Can't Load Quantized Model with GPTQ-for-LLaMa
#75 opened by chigkim - 0
Document whether it works with CPU / macOS
#69 opened by ikamensh - 0
Has black formatting been considered?
#65 opened by tanitna - 0
Error trying Quantize 7B model to 8-bit
#55 opened by guoti777