JialianW/TraDeS

about baseline in paper

dangnh0611 opened this issue · 1 comment

Hi, thanks for your work. While reading your paper, I had a few questions about the baseline:

  1. The baseline is built on CenterNet with an extra head that predicts the tracking offset map O_B. As mentioned in #36, does the backbone network take just a single image (the frame at time t) as input, instead of previous_frame + current_frame + previous_detection_heatmap as in CenterTrack? If so, I wonder how the offset map O_B can be predicted from just one frame. My initial guess is that the feature maps of frames t and t-T are computed independently, and the offset-map head is attached to an aggregated feature map (e.g. a simple stack/average/subtraction) of t-T and t. Is that right?
  2. In Fig. 4, since the baseline doesn't have O_C (w/o CVA), does that mean Baseline == Baseline + CVA, or that O_C == O_B in the baseline context?

Thanks for your interest! The baseline in the paper refers to CenterNet with just a simple tracking head added. In the baseline, O_B is predicted from the single current image only.
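
To make the answer concrete, here is a rough sketch of what "a tracking head on single-frame features" could look like. It is not the TraDeS code: plain Python lists stand in for tensors, a 1x1 convolution stands in for the real head, and the names `conv1x1`, `baseline_heads`, `feature_t`, and the head names `hm` and `tracking` are all hypothetical. The only point it illustrates is that the 2-channel offset map is computed from the features of frame t alone, with no features from frame t-T.

```python
def conv1x1(feature, weight, bias):
    """Apply a 1x1 convolution.

    feature: [C][H][W] feature map of ONE frame
    weight:  [K][C] per-output-channel weights
    bias:    [K] per-output-channel bias
    returns: [K][H][W] output map
    """
    channels = len(feature)
    height = len(feature[0])
    width = len(feature[0][0])
    out = []
    for k in range(len(weight)):
        plane = [
            [
                bias[k] + sum(weight[k][c] * feature[c][y][x] for c in range(channels))
                for x in range(width)
            ]
            for y in range(height)
        ]
        out.append(plane)
    return out


def baseline_heads(feature_t, heads):
    """Run every head on the current-frame features only (no t-T input)."""
    return {name: conv1x1(feature_t, w, b) for name, (w, b) in heads.items()}


# Toy current-frame feature map: C=4 channels, H=3, W=5, all ones.
feature_t = [[[1.0] * 5 for _ in range(3)] for _ in range(4)]

# Hypothetical heads: a 1-channel heatmap and a 2-channel (dx, dy) offset map.
heads = {
    "hm": ([[0.5] * 4], [0.0]),
    "tracking": ([[1.0] * 4, [-1.0] * 4], [0.0, 0.0]),
}

out = baseline_heads(feature_t, heads)
offset_map = out["tracking"]  # plays the role of O_B: shape [2][H][W]
```

In this sketch `offset_map[0][y][x]` and `offset_map[1][y][x]` would be read as the predicted x and y displacement for an object centered at `(x, y)`; how those offsets are then used for association is not covered here.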