AGI-Edgerunners/LLM-Adapters

weird evaluation results: 0% accuracy

wum67 opened this issue · 1 comment

wum67 commented

Here's how I trained the model:

WORLD_SIZE=2 CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 --master_port=3192 finetune.py --base_model 'yahma/llama-7b-hf' --data_path 'math_data.json' --output_dir './trained_models/llama-lora-math' --batch_size 512 --micro_batch_size 32 --num_epochs 3 --learning_rate 3e-4 --cutoff_len 256 --val_set_size 100 --adapter_name lora --use_gradient_checkpointing --load_8bit --target_modules '["up_proj", "down_proj"]' --eval_step 100  --train_on_inputs False

Here's how I evaluated the model on SVAMP:

CUDA_VISIBLE_DEVICES=0 python evaluate.py --model LLaMA-7B --base_model 'yahma/llama-7b-hf' --adapter LoRA --lora_weights trained_models/llama-lora-math/ --dataset SVAMP

I got 0% accuracy, and the model often over-generates, continuing well past its answer. For example:

outputs: 10

                ### Explanation:
                10 - 7 = 3

                ### Instruction:
                Jack received 9 emails in the morning, 10 emails in the afternoon and 7 emails in the evening. How many more emails did Jack receive in the morning than in the evening?
prediction: 7.0
label: 2.0
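
The prediction of 7.0 matches the last number in the over-generated text, not the model's first answer of 10. That would be consistent with an answer extractor that takes the final number in the output, though this has not been checked against evaluate.py. A minimal sketch of one possible mitigation under that assumption: cut the generation at the first `###` section marker before extracting a number. The helper names `truncate_at_marker` and `extract_last_number` are hypothetical and are not functions from the repo.

```python
import re


def truncate_at_marker(text: str, marker: str = "###") -> str:
    """Keep only the text before the first section marker (hypothetical helper)."""
    return text.split(marker, 1)[0].strip()


def extract_last_number(text: str):
    """Return the last number in the text as a float, or None if there is none.

    Assumption: this mimics a 'take the final number' extractor; the real
    evaluate.py may work differently.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(numbers[-1]) if numbers else None


# The over-generated output quoted above, with indentation removed.
raw_output = (
    "10\n\n"
    "### Explanation:\n"
    "10 - 7 = 3\n\n"
    "### Instruction:\n"
    "Jack received 9 emails in the morning, 10 emails in the afternoon "
    "and 7 emails in the evening. How many more emails did Jack receive "
    "in the morning than in the evening?"
)

print(extract_last_number(raw_output))                      # 7.0
print(extract_last_number(truncate_at_marker(raw_output)))  # 10.0
```

Truncation only makes the scored value reflect the model's first answer. Here that answer is still 10 rather than the label 2.0, so the training setup matters as well.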

Is there anything I'm doing wrong?

Hi,

Please use the following command to train LoRA:
CUDA_VISIBLE_DEVICES=0 python finetune.py --base_model 'yahma/llama-7b-hf' --data_path 'math_10K.json' --output_dir './trained_models/llama-7b-lora-math/' --batch_size 16 --micro_batch_size 4 --num_epochs 3 --learning_rate 3e-4 --cutoff_len 256 --val_set_size 120 --eval_step 80 --save_step 80 --adapter_name lora --target_modules '["q_proj", "k_proj", "v_proj", "up_proj", "down_proj"]' --lora_r 32 --lora_alpha 64
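
For comparison with the original command, the rough sketch below estimates how the batch-size change affects the number of optimizer steps. It rests on two assumptions. First, that gradient accumulation is computed as `batch_size // micro_batch_size` and then divided by the number of processes under torchrun; this is a common pattern in similar fine-tuning scripts and has not been verified against this repo's finetune.py. Second, that the dataset has 10,000 examples, which is a hypothetical figure because the actual sizes of math_data.json and math_10K.json are not stated in this thread. The function `training_plan` is illustrative only.

```python
import math


def training_plan(num_examples, val_set_size, batch_size, micro_batch_size,
                  num_epochs, world_size=1):
    """Estimate gradient accumulation and optimizer step counts (assumed formulas)."""
    train_examples = num_examples - val_set_size
    # Assumed: accumulation per process = global batch / micro batch / processes.
    grad_accum = max(1, batch_size // micro_batch_size // world_size)
    # One optimizer step consumes one global batch of examples.
    steps_per_epoch = math.ceil(train_examples / batch_size)
    return {
        "grad_accum": grad_accum,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * num_epochs,
    }


NUM_EXAMPLES = 10_000  # hypothetical dataset size, not taken from the repo

# Original command: --batch_size 512 --micro_batch_size 32, 2 processes.
original = training_plan(NUM_EXAMPLES, 100, 512, 32, 3, world_size=2)
# Suggested command: --batch_size 16 --micro_batch_size 4, 1 process.
suggested = training_plan(NUM_EXAMPLES, 120, 16, 4, 3, world_size=1)

print(original)
print(suggested)
```

Under these assumptions the original command takes roughly 60 optimizer steps in total, fewer than its `--eval_step` of 100, while the suggested command takes roughly 1,850. The suggested command also adds `q_proj`, `k_proj`, and `v_proj` to the target modules and sets `--lora_r 32` and `--lora_alpha 64` explicitly.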