mahmoodlab/UNI

Problems with the code for model training


Hello author, thank you for selflessly sharing this great work. It is very inspiring to me, but I still have some questions:

  1. I could not find any code for training the model in this GitHub repository. Is it not publicly available, or is it somewhere I did not notice?
  2. Does your training process strictly follow DINOv2? Is the input to the model a patch image cut from a WSI, or an entire WSI?

Thank you again for sharing, and I look forward to hearing from you.

Hi @DaIGaN2019

  1. See the model card on HuggingFace for more documentation on how UNI was trained and all the codebase dependencies; a loading sketch is included after this reply. The SSL code for training UNI is DINOv2. All downstream evaluation code was also used out of the box with minimal modifications, unless otherwise stated in the methods and supplement.

  2. The training process strictly follows DINOv2. See the attached configuration in the supplement, and the highlighted sentences in the methods section on where we deviated from DINOv2's default short training setup (we also made the local crop sizes slightly larger). For training UNI, we pass in patch images cut from the WSIs (see the attached screenshot in the methods section on how the training data was curated); a tiling sketch is included after this reply.

[Attached screenshots: the DINOv2 training configuration from the supplement and the highlighted methods text on training modifications and data curation]
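For reference, the model card documents loading UNI through timm and the Hugging Face Hub. A minimal sketch along those lines, assuming you have been granted access to the gated weights and log in with a User Access Token:

```python
import timm
import torch
from huggingface_hub import login

login()  # UNI is gated; prompts for your Hugging Face User Access Token

# Load the ViT-L/16 backbone with pretrained UNI weights
model = timm.create_model(
    "hf-hub:MahmoodLab/uni",
    pretrained=True,
    init_values=1e-5,       # enable LayerScale, matching the pretrained weights
    dynamic_img_size=True,  # accept input sizes other than 224x224
)
model.eval()

# Extract an embedding from one patch (dummy tensor in place of a real image)
patch = torch.rand(1, 3, 224, 224)
with torch.inference_mode():
    features = model(patch)
print(features.shape)  # (1, 1024) for the ViT-Large backbone
```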
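On item 2, a hedged sketch of what cutting patch images from a WSI can look like, using OpenSlide; the patch size, the level-0 tiling, and the crude background filter below are illustrative assumptions, not the authors' curation pipeline (that is described in the methods section):

```python
import os

import openslide


def tile_wsi(wsi_path: str, out_dir: str, patch_size: int = 256) -> None:
    """Cut non-overlapping patches from level 0 of a whole-slide image."""
    os.makedirs(out_dir, exist_ok=True)
    slide = openslide.OpenSlide(wsi_path)
    width, height = slide.dimensions  # level-0 size in pixels
    n = 0
    for y in range(0, height - patch_size + 1, patch_size):
        for x in range(0, width - patch_size + 1, patch_size):
            patch = slide.read_region((x, y), 0, (patch_size, patch_size)).convert("RGB")
            # Crude background filter (assumption): skip a patch whose darkest
            # grayscale pixel is still near-white, i.e. likely no tissue.
            if patch.convert("L").getextrema()[0] > 220:
                continue
            patch.save(os.path.join(out_dir, f"patch_{n:06d}.png"))
            n += 1


tile_wsi("slide.svs", "patches")  # paths are placeholders
```

The resulting folder of patch images is what a DINOv2-style dataset would then index for self-supervised training.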

Thank you very much for your answer, it's very helpful! I am a beginner with foundation models. Currently, I am using the training code provided in DINOv2 to train on my dataset from scratch. I would like to know if there is a simple interface that can directly reproduce the training process of UNI or fine-tune it on my own dataset? I checked HuggingFace, but only saw textual descriptions. Thank you again for your help, whether or not you can provide these interfaces. I look forward to your reply.
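As the thread notes, the repository itself does not ship a training interface; a lightweight alternative to full fine-tuning is a linear probe on frozen UNI features. A minimal sketch, with the class count and the dummy batch as placeholders for your own patch dataloader:

```python
import timm
import torch
import torch.nn as nn

# Frozen UNI backbone (requires gated Hugging Face access, as above)
backbone = timm.create_model(
    "hf-hub:MahmoodLab/uni",
    pretrained=True,
    init_values=1e-5,
    dynamic_img_size=True,
)
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False

num_classes = 2  # placeholder for your task
head = nn.Linear(backbone.num_features, num_classes)  # num_features = 1024
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on dummy data; swap in your patch dataloader
images = torch.rand(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
with torch.no_grad():
    feats = backbone(images)  # (8, 1024) frozen embeddings
optimizer.zero_grad()
loss = criterion(head(feats), labels)
loss.backward()
optimizer.step()
print(float(loss))
```

For reproducing the pretraining itself, the DINOv2 repository's own training entry point and configs remain the documented route, per the reply above.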