Training and implementation details for ChatLaw2
jacky-rui commented
I am currently exploring the ChatLaw models and I have a few questions regarding their training schemes and roles within the ensemble model.
- Could you please provide detailed information about the training schemes used for ChatLaw2_plain and ChatLaw2E_plain? Specifically, I am interested in the datasets, preprocessing steps, model architectures, and any fine-tuning techniques applied.
- Additionally, I would like to understand the role that ChatLaw2_plain and ChatLaw2E_plain play within the ChatLaw2_MOE (Mixture of Experts) model. How do these models interact and contribute to the overall performance of ChatLaw2_MOE?
Thank you in advance for your assistance. I look forward to your response.