UKPLab/gpl

GPL with a low-performing CE


Does it make sense to train a model with GPL when the cross-encoder (CE) used for pseudo-labeling performs poorly on the domain dataset (i.e., when using the CE directly for IR tasks on that dataset gives poor results)? I would expect the GPL-trained model to perform poorly as well, since the CE's performance is the upper bound on what GPL can achieve.

If my reasoning is correct, is there a way to deal with this shortcoming?
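
For reference, here is a minimal sketch of how I check the premise, i.e. how I measure the CE "directly for IR" by reranking a candidate list and computing nDCG@k against labeled relevance. All names here (`rerank`, `ndcg_at_k`, `toy_overlap_score`) are hypothetical helpers written for illustration and are not part of the gpl package; `toy_overlap_score` is only a stand-in so the snippet runs without downloading a model, and the nDCG variant uses linear gain.

```python
import math
from typing import Callable, Dict, List


def rerank(query: str,
           candidates: Dict[str, str],
           score_fn: Callable[[str, str], float]) -> List[str]:
    """Return candidate doc ids sorted by descending score_fn(query, text)."""
    return sorted(candidates,
                  key=lambda doc_id: score_fn(query, candidates[doc_id]),
                  reverse=True)


def ndcg_at_k(ranked_ids: List[str],
              relevance: Dict[str, int],
              k: int = 10) -> float:
    """nDCG@k with linear gain: rel / log2(rank + 2), rank starting at 0."""
    dcg = sum(relevance.get(doc_id, 0) / math.log2(rank + 2)
              for rank, doc_id in enumerate(ranked_ids[:k]))
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def toy_overlap_score(query: str, passage: str) -> float:
    """Stand-in scorer (NOT a real CE): number of shared lowercase tokens."""
    return float(len(set(query.lower().split()) & set(passage.lower().split())))


if __name__ == "__main__":
    query = "domain adaptation for dense retrieval"
    candidates = {
        "d1": "cats sleep a lot",
        "d2": "dense retrieval domain adaptation",
        "d3": "domain adaptation",
    }
    relevance = {"d2": 1}  # hypothetical relevance judgments for this query
    ranking = rerank(query, candidates, toy_overlap_score)
    print(ranking, ndcg_at_k(ranking, relevance, k=10))
```

In a real check, `score_fn` would wrap the actual CE (for example, a call to `CrossEncoder.predict` from sentence-transformers on the `(query, passage)` pair), and the score would be averaged over all test queries of the domain dataset.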