Semi-supervised ASR: CoT thread

Using CoT to approximate the JS divergence. Setting: half of si284 is used as unlabeled text and the other half as unlabeled speech (the same split as in the semi-supervised speech recognition setting). Train from epoch 25.
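
For reference, the quantity being approximated can be written down directly for small discrete distributions. The sketch below computes the exact JS divergence between two categorical distributions `p` and `q` via the mixture `m = (p + q) / 2`. It is only an illustration of the target quantity, not the training code from this thread: the function names are hypothetical, and it assumes "CoT" here means cooperative training, where a mediator model would estimate `m` instead of it being computed in closed form.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions given as lists.

    Terms with p_i == 0 contribute 0 by convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: 0.5*KL(p || m) + 0.5*KL(q || m), m = (p + q)/2.

    Because m_i > 0 whenever p_i > 0 or q_i > 0, the KL terms are always finite.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Toy distributions over a 3-symbol vocabulary (made-up values).
    p = [0.7, 0.2, 0.1]
    q = [0.1, 0.3, 0.6]
    print(js_divergence(p, q))
```

The JS divergence is symmetric, equals zero when the two distributions match, and is bounded above by `log 2` (in nats), which it reaches when the distributions have disjoint support.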