Retrieval-augmented baselines - Huggingface models

Question

mozhu621 opened this issue 10 months ago · 4 comments

When I run
python run_baseline_refactor.py
I get this error:
python: can't open file 'run_baseline_refactor.py': [Errno 2] No such file or directory
This Python file doesn't exist. I assume the script is still run_baseline_lm.py, right? Apart from that, I'm getting very low results when I run it. Could you share the exact command line you used? This is the command I ran:
python run_baseline_lm.py \
  --model_name meta-llama/Llama-2-7b-hf \
  --input_file eval_data/health_claims_processed.jsonl \
  --max_new_tokens 100 --metric match \
  --result_fp RESULT_FILE_PATH --task qa \
  --mode retrieval \
  --prompt_name "prompt_no_input_retrieval"
overall result: 0.0070921985815602835

Answer 1 · 2024-03-20T11:12:29.000Z

I also ran the PubHealth example:
python run_baseline_lm.py \
  --model_name meta-llama/Llama-2-7b-hf \
  --input_file eval_data/health_claims_processed.jsonl \
  --max_new_tokens 20 \
  --metric accuracy \
  --result_fp llama2_7b_pubhealth_results.json \
  --task fever
overall result: 0.1702127659574468
This is very different from Table 1 in the paper, and I don't know what is going wrong.

Answer 2 · 2024-04-16T02:54:19.000Z

I get the same result on PubHealth, which is lower than the result in the paper.

Answer 3 · 2024-05-08T13:46:54.000Z

Has this issue been resolved?

Answer 4 · 2024-06-06T02:06:26.000Z

Has this issue been resolved? I encountered the same problem.