TencentAILabHealthcare/scBERT

Can zero-valued vectors be trained?

zhuzn opened this issue · 1 comments

zhuzn commented

Thank you for sharing this useful tool. How are zero-valued vectors handled in the pre-training model, and can they be trained?

Please see the discussion section of the paper: "Third, the efficiency of masking during pretraining is another point worth optimizing. The current masking strategy in scBERT is simplified with non-zero masking. With the zero-inflated input [45], the model might be inclined to output all zeroes for the reconstruction task during pretraining. We therefore masked the non-zero values and calculated the loss based on the non-zero values during pretraining; however, masking only the non-zero values may lower the utilization of the single-cell data for pretraining, due to their minority. Advanced masking strategy tailored for single-cell data could be introduced to improve the computational efficiency of the masking process."
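
To make the quoted strategy concrete, here is a minimal pure-Python sketch of non-zero masking. It is not the scBERT code: the names `MASK_TOKEN`, `mask_nonzero`, and `masked_loss`, the masking probability, and the squared-error loss are all assumptions chosen for illustration (the actual loss function may differ). It only shows the idea that zero entries are never selected for masking and that the loss is computed on the masked non-zero positions alone.

```python
import random

MASK_TOKEN = -1  # hypothetical placeholder value for a masked position


def mask_nonzero(expr, mask_prob=0.15, rng=None):
    """Mask a random subset of the non-zero entries of one expression vector.

    Returns (masked_input, masked_positions). Zero entries are never masked.
    """
    if rng is None:
        rng = random.Random()
    masked = list(expr)
    positions = []
    for i, value in enumerate(expr):
        # Only non-zero values are candidates for masking.
        if value != 0 and rng.random() < mask_prob:
            masked[i] = MASK_TOKEN
            positions.append(i)
    return masked, positions


def masked_loss(predicted, target, positions):
    """Mean squared error over the masked positions only (illustrative loss)."""
    if not positions:
        return 0.0
    total = sum((predicted[i] - target[i]) ** 2 for i in positions)
    return total / len(positions)


# Toy zero-inflated vector: most entries are zero.
cell = [0, 0, 3, 0, 1, 0, 0, 5, 0, 2]
masked_cell, positions = mask_nonzero(cell, mask_prob=0.5, rng=random.Random(0))

# A model that always predicts zero is penalized wherever a non-zero value was masked.
all_zero_prediction = [0] * len(cell)
loss = masked_loss(all_zero_prediction, cell, positions)
```

Because the zeros are never masked and never enter the loss in this sketch, an all-zero prediction cannot score well on the reconstruction task. The trade-off is the one the paper notes: only the minority of non-zero entries contributes to the training signal.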