imsb-uke/scGAN

How to reproduce the results of Figure 3

Closed this issue · 5 comments

Hi,
It's hard for me to reproduce the results of Figure 3 in your paper. In my experiments, the F1 score of Augmented is much higher than Training and Upsampled, but when the percentage of cluster 2 training cells drops below 1%, the F1 score decreases sharply.
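For reference, my evaluation is roughly along these lines (the classifier choice, the file path and the "cluster" column name are placeholders from my own setup, not taken from your repository):

```python
# Rough sketch of my F1 evaluation; the path, the obs column name and the
# classifier choice are my own placeholders, not from the scGAN repo.
import scanpy as sc
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

adata = sc.read_h5ad("training_data.h5ad")                 # placeholder path
X = adata.X.toarray() if hasattr(adata.X, "toarray") else adata.X
y = (adata.obs["cluster"] == "2").astype(int)              # 1 = cluster 2, 0 = rest

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("F1 for cluster 2:", f1_score(y_test, clf.predict(X_test)))
```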
Could you kindly release the code for this experiment?

Hello Hantao,
Sorry for the late reply. I have started digging out some of the code that was used to produce those results.
Did you manage to reproduce it?

Thanks for your reply.
To reproduce Figure 3, I have one question: do the cluster ratios have to be in descending order? At first, I took the preprocessed h5ad file, downsampled cluster 1 directly, and used the resulting file to train a new network, so the cluster ratios in parameters.json are out of order. I found that the network does not converge, especially when the downsampling rate is below 1%.
Now I have renamed the cluster IDs so that the cluster ratios are in descending order, and it seems to work well; I get results similar to the paper's.
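Concretely, my downsampling step and the ratios I write into parameters.json look roughly like this (the path and the "cluster" obs column name are placeholders from my setup, not from your code):

```python
# Rough sketch of my downsampling step; the path and the obs column name
# are placeholders from my own setup, not from the scGAN code.
import numpy as np
import scanpy as sc

adata = sc.read_h5ad("preprocessed.h5ad")                  # placeholder path
keep_frac = 0.005                                          # e.g. keep 0.5% of cluster 1

is_cluster1 = (adata.obs["cluster"] == "1").values
cluster1_idx = np.where(is_cluster1)[0]
rng = np.random.default_rng(0)
kept = rng.choice(cluster1_idx,
                  size=max(1, int(keep_frac * cluster1_idx.size)),
                  replace=False)

mask = ~is_cluster1
mask[kept] = True                                          # keep only the sampled cluster-1 cells
adata_down = adata[mask].copy()
adata_down.write_h5ad("downsampled.h5ad")

# Cluster ratios in the original cluster order, for parameters.json
# (assumes the cluster labels are numeric strings).
counts = adata_down.obs["cluster"].value_counts()
clusters_ratios = [counts.get(c, 0) / adata_down.n_obs
                   for c in sorted(counts.index, key=int)]
print(clusters_ratios)
```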
It would help a lot if you could provide the code, so I can check whether I have reproduced it correctly.
I also wonder how to produce Figure 1a. I tried to draw the t-SNE figure with scanpy, but it seems hard to get figures as beautiful as the ones in your paper. If you could provide the corresponding code, that would help me a lot.

Unfortunately, the person who ran those experiments is no longer a member of our institute, but I hope I'll be able to answer your questions.
I dug up the JSON files from those experiments, and the cluster ratios are not in descending order, so there should be no need to reindex the clusters. For instance, with 0.5% of cluster 1, we get the following ratios: "clusters_ratios": [0.6109940910022927, 0.001416668220030943, 0.12330605625664064, 0.08807575446902902, 0.08565250619792346, 0.07599679385613362, 0.006803735530411766, 0.006244524390925868, 0.001174343392920387, 0.00033552668369153915].
I'm surprised that you experienced convergence problems. Even though we didn't report those results in the paper, we went as low as 0.05% without any training issues.

For the t-SNE plots from the paper, there was quite a bit of fine-tuning with matplotlib, saving in vector format AND manual editing in Illustrator for the colors, transparencies and the legends... It was very tedious, I have to say.
I can always send you some plotting functions we used if you would like (if so, please send me your e-mail address), but it will still look pretty different from the final version.
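To give you a rough idea in the meantime, the core of it is just scanpy's t-SNE coordinates and a hand-styled matplotlib scatter saved in vector format, something along these lines (the path, the "cluster" column and the styling values are placeholders, not the actual scripts):

```python
# Rough sketch only; the path, the obs column name and the styling values are
# placeholders -- the real figures were heavily hand-tuned afterwards.
import matplotlib.pyplot as plt
import scanpy as sc

adata = sc.read_h5ad("preprocessed.h5ad")                  # placeholder path
sc.tl.tsne(adata)                                          # stores adata.obsm["X_tsne"]

fig, ax = plt.subplots(figsize=(5, 5))
labels = adata.obs["cluster"].astype(str)
for cluster in sorted(labels.unique()):
    coords = adata.obsm["X_tsne"][(labels == cluster).values]
    ax.scatter(coords[:, 0], coords[:, 1], s=2, alpha=0.6,
               label=cluster, rasterized=True)             # rasterize points, keep text as vectors
ax.set_axis_off()
ax.legend(markerscale=4, frameon=False, loc="center left", bbox_to_anchor=(1, 0.5))
fig.savefig("tsne.pdf", bbox_inches="tight")               # vector output for later editing
```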

Thanks for your reply.
I am also quite surprised by the convergence problem. Maybe I made some mistake in the experiment.
I'm also quite interested in the plotting functions.

I sent you an email with some code for plotting.