tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Python · Apache-2.0
Issues
- 3
ValueError: Trying to set a tensor of shape torch.Size([32769536]) in "weight" (which has shape torch.Size([32001, 4096])), this looks incorrect.
#319 opened by daidaiershidi - 2
SFT Mistral;
#317 opened by feiying12343 - 1
- 3
weight_diff.py state_dict_recovered[key].add_(state_dict_raw[key]) RuntimeError: The size of tensor a (32001) must match the size of tensor b (32000) at non-singleton dimension 0
#304 opened by gaodexiaozheng - 0
RuntimeError: The size of tensor a (65539072) must match the size of tensor b (262156288) at non-singleton dimension 0
#316 opened by YuyangJ0 - 0
Tensors of the same index must be on the same device and the same dtype except `step` tensors that can be CPU and float32 notwithstanding
#315 opened by wurevvc - 0
train.py fails with TypeError: Object of type Tensor is not JSON serializable
#314 opened by khayamgondal - 1
openai version
#312 opened by cswangxiaowei - 0
Cuda OOM during training
#308 opened by hychaochao - 4
Finetune with A100 40G
#280 opened by jianchaoji - 0
- 0
Problems generating my own data offline
#305 opened by JieDengsc - 1
- 0
How to get the model
#302 opened by Guo-Chenxu - 0
Can you release your evaluation code and data?
#301 opened by mshich1 - 2
Loss will suddenly turn 0 during SFT
#298 opened by zhangyx0417 - 3
Question about padding the input sequence
#294 opened by BaleChen - 0
- 1
Confusion about instruction task
#299 opened by mitchelldehaven - 2
TypeError: 'type' object is not subscriptable
#287 opened by WYXG233 - 0
Utilize regen.json in finetuning
#297 opened by Yijia-Xiao - 0
Wonder how to inference after finetuning.
#295 opened by 5taku - 0
- 0
How to provide extra context as a pdf file?
#292 opened by gamerjazzar - 0
- 5
Training bug for 13b, 30b, and 65b
#285 opened by alexgshaw - 1
BUG: "labels" information leakage into "input_ids" fields - incorrect attention_mask
#290 opened by Nsigma-Bill - 1
incorrect model_max_length
#289 opened by joemkwon - 0
Why do we pass both question and answer as input to the model during training?
#281 opened by ruiyigan - 0
DeepSpeed compilation (cpu_adam issue)
#288 opened by JohnTailor - 2
The OOM problem caused by the Transformers version
#278 opened by kiseliu - 1
- 1
How to fine-tune the model with limited resources?
#270 opened by GivanTsai - 0
where did the code define the wandb?
#286 opened by applepieiris - 0
Location of Log Files for the model?
#284 opened by harshaelon - 0
How to classify all the data?
#283 opened by rayrayraykk - 0
- 0
How to finetune using the customizer data?
#279 opened by JustinZou1 - 1
Inquiry about license
#276 opened by CallMeDek - 0
why I save two pytorch_model.bin with same size
#277 opened by qwjaskzxl - 2
- 0
Why the model I got after finetune is not good
#275 opened by wyzhhhh - 2
- 0
Can fine-tuning run on multi machine distributedly?
#269 opened by ovasty - 1
Number of trainable parameters is less than 7B
#266 opened by yuanzhedong - 0
I have trained 10000 records from alpaca_data.json; but encountered an unrecognized response
#267 opened by gugongerguo - 0
Why are 28 of the outputs empty?
#265 opened by tetratorus