Dataset splits

Question

JunMa11 opened this issue 4 years ago · 5 comments

JunMa11 commented 4 years ago

Dear @liuquande,

Thanks for sharing the great work.

Could you please share the data split files in Table 2?

For example, which cases in each site are used for training and which for testing?

Best,
Jun

Answer 1 · 2021-05-10T05:11:30.000Z

Hi Jun,

Thanks for the interest.

We use all the data from each site, either for training or for testing.
Taking Site A as target domain for instance, we will use all data from site B-F for training and all data from site A for testing.

Answer 2 · 2021-05-10T14:39:54.000Z

Hi @liuquande,

Thank you very much for your reply.

"Taking Site A as target domain for instance, we will use all data from site B-F for training and all data from site A for testing."

This is the intra-site setting, right?

Q1. How about the training and testing data in the DeepAll setting?

"with some outlier cases excluded to provide general internal performance on each site"

Q2. What are these outlier cases in each site?

Looking forward to your reply :)

Kindest regards,
Jun

Answer 3 · 2021-05-12T05:46:22.000Z

Hi Jun,

"Taking Site A as target domain for instance, we will use all data from site B-F for training and all data from site A for testing."

This actually denotes the DeepAll setting; the Intra-site setting denotes training and testing on the same site.
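To make the two settings concrete, here is a minimal sketch of how the splits could be built. It is only an illustration, not the code used for the paper: the function names, the `cases_by_site` mapping, and the `test_fraction` value are all assumptions, and the actual intra-site train/test ratio is not stated in this thread.

```python
from typing import Dict, List, Tuple


def deepall_split(
    cases_by_site: Dict[str, List[str]], target: str
) -> Tuple[List[str], List[str]]:
    """DeepAll: train on all cases from every other site, test on all target cases."""
    train = [
        case
        for site, cases in cases_by_site.items()
        if site != target
        for case in cases
    ]
    test = list(cases_by_site[target])
    return train, test


def intra_site_split(
    cases_by_site: Dict[str, List[str]], site: str, test_fraction: float = 0.2
) -> Tuple[List[str], List[str]]:
    """Intra-site: train and test on the same site.

    test_fraction is a hypothetical value chosen for illustration only.
    Cases are sorted and the tail is held out so the split is deterministic.
    """
    cases = sorted(cases_by_site[site])
    n_test = max(1, round(len(cases) * test_fraction))
    return cases[:-n_test], cases[-n_test:]
```

With sites A to F, calling `deepall_split(cases_by_site, "A")` would return every case from B to F as training data and every case from A as test data.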

For Q2, we noticed that in the Intra-site setting, the model developed on Site X sometimes does not perform well on certain testing cases from Site X (with Dice below 20%, if I remember correctly). We think the reason could be that the distribution of that particular testing case does not fit well with the data distribution learned at Site X, and we regard such cases as outlier cases.
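A rough sketch of how such cases could be flagged from per-case Dice scores is shown below. It assumes the 20% threshold recalled above, and the helper names `dice_score` and `split_outliers` are hypothetical. Treating two empty masks as a perfect match is also an assumption of this sketch, and it does not reproduce the actual list of excluded cases.

```python
from typing import Dict, List, Sequence, Tuple


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def split_outliers(
    dice_by_case: Dict[str, float], threshold: float = 0.2
) -> Tuple[Dict[str, float], List[str]]:
    """Separate cases whose Dice falls below the threshold from the rest."""
    kept = {case: d for case, d in dice_by_case.items() if d >= threshold}
    outliers = sorted(case for case, d in dice_by_case.items() if d < threshold)
    return kept, outliers
```

For example, `split_outliers({"X01": 0.85, "X02": 0.12})` would keep `X01` and flag `X02` as an outlier.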

Answer 4 · 2021-05-12T06:06:40.000Z

Hi @liuquande,

Got it. Thank you very much for your kind reply.

Answer 5 · 2022-06-07T05:42:58.000Z

Could you share which cases are excluded?
Looking forward to your reply.