Another Question about Default Transformer Decoding Config
chujiezheng opened this issue
I found that in `transformers`, the default `top_k` in the `generate()` method is set to 50. So if we want to run strictly controlled experiments, we should explicitly pass `top_k=0` to the `generate()` method.

However, I did not see this explicit configuration in the `tune_temp` and `tune_topp` parts of `attack.py`. I am not sure whether this would affect the experimental results...
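For concreteness, here is a minimal sketch of the difference (the model name, prompt, and generation lengths are illustrative; only the `top_k` handling is the point):

```python
# Minimal sketch: with Hugging Face transformers, sampling without an explicit
# top_k silently applies the default top_k=50; passing top_k=0 disables the
# top-k filter so the model samples from the full vocabulary.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative model choice
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Hello, my name is", return_tensors="pt")

# Default behavior: top-k filtering with k=50 is applied implicitly.
default_out = model.generate(
    **inputs, do_sample=True, temperature=0.9, max_new_tokens=20
)

# Strictly controlled version: disable top-k so only temperature matters.
controlled_out = model.generate(
    **inputs, do_sample=True, temperature=0.9, top_k=0, max_new_tokens=20
)
print(tokenizer.decode(controlled_out[0], skip_special_tokens=True))
```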
Hi Chujie,

I appreciate your input. You make a fair point that, to isolate the effects of temperature and top-p sampling from the default top-k filtering, it may be beneficial to keep `top_k` fixed at 0 (i.e., sampling from the full vocabulary) while varying the other two hyperparameters.

However, our current analysis primarily examines how minor deviations from the default configuration impact performance, so we opted to modify a single parameter at a time while keeping everything else at its default setting.
Still, in response to your suggestion, I will provide additional results in a follow-up where we fix `top_k` at 0.
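A hypothetical sketch of what that follow-up setup could look like (the sweep ranges, prompt, and model below are illustrative and not taken from attack.py):

```python
# Hypothetical sketch: sweep temperature and top_p separately while pinning
# top_k=0, so the default top-k filter never confounds either sweep.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative model choice
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Hello, my name is", return_tensors="pt")

for temp in [0.25, 0.5, 0.75, 1.0]:       # illustrative temperature sweep
    model.generate(**inputs, do_sample=True, temperature=temp,
                   top_k=0, max_new_tokens=20)

for top_p in [0.25, 0.5, 0.75, 1.0]:      # illustrative top-p sweep
    model.generate(**inputs, do_sample=True, top_p=top_p,
                   top_k=0, max_new_tokens=20)
```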