Why do we have fewer dialogs per speaker than the expected 100? (v. 3.1)
Closed this issue · 0 comments
nicolay-r commented
Related to #19.
Here is the log information:
Source of the problem:
This happens because we first select the speakers based on the number of utterances attributed to them:
https://github.com/nicolay-r/chatbot_experiments/blob/3b83e2e8730a6d3d5b59e82671001ba135598dc9/my_s3_dataset_0_create.py#L10-L12
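Then, at the dataset-writing stage, we filter out some of those utterances, so a selected speaker can end up with fewer entries than the count that was used to select them: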
https://github.com/nicolay-r/chatbot_experiments/blob/3b83e2e8730a6d3d5b59e82671001ba135598dc9/my_s3_dataset_0_create.py#L28-L30
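
A minimal sketch of the mismatch, not the actual code from `my_s3_dataset_0_create.py`. All names here (`select_speakers`, `keep_utterance`, `write_dataset`), the toy data, and the word-count filter are assumptions made for illustration. It shows how selecting speakers on raw counts and filtering afterwards leaves fewer entries per speaker, and one possible fix: applying the same filter before counting.

```python
from collections import Counter

# Toy data: (speaker, utterance) pairs. Purely illustrative.
utterances = [
    ("alice", "hi"),
    ("alice", "how are you"),
    ("alice", "ok"),
    ("bob", "see you later"),
    ("bob", "good morning to you"),
    ("bob", "fine, thanks a lot"),
]


def keep_utterance(text):
    # Hypothetical write-time filter: drop one-word utterances.
    return len(text.split()) >= 2


def select_speakers(items, min_count):
    # Select speakers that have at least `min_count` utterances in `items`.
    counts = Counter(speaker for speaker, _ in items)
    return {s for s, c in counts.items() if c >= min_count}


def write_dataset(items, speakers):
    # Writing stage: keep selected speakers only, then apply the filter.
    return [(s, t) for s, t in items if s in speakers and keep_utterance(t)]


# Current order: select on raw counts, filter later.
selected = select_speakers(utterances, min_count=3)
written = write_dataset(utterances, selected)
per_speaker = Counter(s for s, _ in written)

# Possible fix: count only the utterances that survive the filter.
filtered = [(s, t) for s, t in utterances if keep_utterance(t)]
selected_fixed = select_speakers(filtered, min_count=3)
per_speaker_fixed = Counter(s for s, _ in write_dataset(utterances, selected_fixed))
```

In this toy case `alice` passes selection with 3 raw utterances but only 1 reaches the written dataset. With the filter applied first, only `bob` is selected and every selected speaker keeps at least `min_count` entries.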