likenneth/honest_llama

Inference-Time Intervention: Eliciting Truthful Answers from a Language Model

Python · MIT

Issues

Clarification on where to intervene
#42 opened 5 months ago by CheongWoong
4
Will you publish the Llama3 fine-tuned model?
#39 opened 5 months ago by aryopg
4
Will you publish the Llama3 fine-tuned model?
#40 opened 5 months ago by aryopg
1
Question about the insight behind ασθ
#37 opened 6 months ago by NieSYsc20
1
Issues related to reproducing the results of the paper
#35 opened 7 months ago by wytbwytb
3
Discrepancy in Reproducing Results with llama-7B on TriviaQA Dataset
#30 opened 7 months ago by XianfengJiao
5
Code for CCS
#33 opened 7 months ago by itsmemala
1
Clarifications on table 5 of paper
#34 opened 7 months ago by itsmemala
4
Why does memory accumulate and ultimately cause overflow when running get_activations.py?
#26 opened 9 months ago by Renpf2022
4
Inquiry on the GPT-judge cost and potential substitutions
#23 opened 9 months ago by night-chen
3
Cannot replicate results on judge and info metric
#20 opened 9 months ago by fabrahman
6
Query regarding dimensions of activations
#32 opened a year ago by itsmemala
1
Interesting work! Providing an additional convenient way of reproducing the results!
#31 opened a year ago by frankaging
1
How is the 40% gap between generation accuracy and probe accuracy mentioned in the paper calculated?
#28 opened a year ago by DLiquor
2
Issue in validate_2fold.py: ordering CSV by Hugging Face order
#27 opened a year ago by tianlwang
7
Does ITI support Qwen?
#25 opened a year ago by menghonghan
1
False Answer in OamPatel/iti_trivia_qa_val
#22 opened a year ago by Vicent0205
3
Support for Llama 2 series models?
#21 opened a year ago by skykiseki
2
Why did you choose 'tqa_gen_end_q' rather than 'tqa_mc2' to compute std?
#19 opened a year ago by thyywr759
1
Question: does ITI support other kinds of models like MPT or Baichuan?
#18 opened a year ago by Mewral
1
What is self_attn.head_out?
#7 opened 2 years ago by JianqiaoLu
3
Query on result of Vicuna and Alpaca
#17 opened a year ago by jongjyh
2
Why do we use `llama.*` instead of HuggingFace's llama?
#15 opened a year ago by RylanSchaeffer
1
Inquiry about visualizing results in the paper
#4 opened 2 years ago by jongjyh
3
Comparison with other tuning methods
#14 opened 2 years ago by FLLLIGHT
1
Cannot find truthful_qa.py
#13 opened 2 years ago by LebronX
3
Disagreement about TruthfulQA results
#5 opened 2 years ago by Vicent0205
3
Saving the model after shifting activations
#12 opened 2 years ago by A-Raafat
1
Difference between tqa_gen_end_q and tqa_gen?
#11 opened 2 years ago by jongjyh
1
Which part of the paper does tqa_gen_end_q correspond to?
#8 opened 2 years ago by CaoYiqingT
2
How to use tqa_gen and tqa_gen_end_q?
#9 opened 2 years ago by CaoYiqingT
2
The result of the code doesn't match the result in the paper
#10 opened 2 years ago by CaoYiqingT
2
Potential Data Leakage in Probes Training
#6 opened 2 years ago by jongjyh
7
Validation code seems to have only few-shot settings
#2 opened 2 years ago by Maxlinn
3
Request for the GPT-4-generated false answers for NQ/TriviaQA
#1 opened 2 years ago by voidism
6
Inquiry about an equation in the paper
#3 opened 2 years ago by jongjyh
1