transformer
a1wj1 opened this issue · 5 comments
Hello, may I ask what the input to the decoder of the transformer is? How does it differ from the input to the encoder?
Also, are collected bounding boxes that fall outside the camera view filtered out?
Hi!
- The input of the encoder includes the image feature of the front/left/right/focusing view and the LiDAR feature.
InterFuser/interfuser/timm/models/interfuser.py
Line 1024 in e4f0314
- The input of the decoder includes the query embeddings (waypoints, traffic sign, object density map) and the output of the encoder.
InterFuser/interfuser/timm/models/interfuser.py
Line 1025 in e4f0314
- You can also refer to the pipeline figure in our paper, which may answer questions like these.
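To make the encoder/decoder split above concrete, here is a dependency-free toy sketch, not the InterFuser code. It assumes the decoder's queries cross-attend to the encoder's token sequence, as in a standard transformer decoder. The token counts, the dimension `DIM`, and the helper names `cross_attention` and `random_tokens` are all hypothetical, and the real encoder's self-attention layers are replaced by an identity stand-in.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, memory):
    """Single-head dot-product attention: every query attends over all memory tokens."""
    dim = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, m)) / math.sqrt(dim) for m in memory]
        weights = softmax(scores)
        outputs.append([sum(w * m[i] for w, m in zip(weights, memory)) for i in range(dim)])
    return outputs


def random_tokens(rng, count, dim):
    """Hypothetical stand-in for learned features or embeddings."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
DIM = 8  # toy feature size (assumption)

# Encoder side: tokens from the current frame's sensors (counts are made up).
sensor_tokens = {
    "front": random_tokens(rng, 4, DIM),
    "left": random_tokens(rng, 3, DIM),
    "right": random_tokens(rng, 3, DIM),
    "focus": random_tokens(rng, 2, DIM),
    "lidar": random_tokens(rng, 4, DIM),
}
encoder_input = [tok for tokens in sensor_tokens.values() for tok in tokens]
memory = encoder_input  # identity stand-in for the encoder output

# Decoder side: one query per prediction slot (counts are made up).
query_counts = {"density_map": 6, "traffic": 1, "waypoints": 3}
queries = random_tokens(rng, sum(query_counts.values()), DIM)

hs = cross_attention(queries, memory)  # one output vector per query
print(len(memory), len(hs))  # prints "16 10"
```

The point of the sketch is the shape relationship: the decoder returns one vector per query, however many sensor tokens the encoder produced.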
> will the collected bounding boxes filter out the ones outside the camera?
No, we consider all the objects within a certain distance of the ego-car.
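As a rough illustration of that rule, the sketch below keeps every object within a distance threshold of the ego vehicle, whether or not a camera could see it. The `Actor` class, the ego-frame coordinates, and the 30 m threshold are assumptions made for the example; the thread does not state the actual distance used.

```python
import math
from dataclasses import dataclass


@dataclass
class Actor:
    name: str
    x: float  # metres ahead of the ego car (negative means behind)
    y: float  # metres to the side of the ego car


def objects_within_range(actors, max_distance):
    """Keep actors by distance to the ego car only; camera visibility is not checked."""
    return [a for a in actors if math.hypot(a.x, a.y) <= max_distance]


actors = [
    Actor("car_ahead", 12.0, 0.0),
    Actor("car_behind", -8.0, 1.5),   # behind the ego car, still kept
    Actor("far_truck", 60.0, -3.0),   # beyond the threshold, dropped
]

kept = objects_within_range(actors, max_distance=30.0)  # hypothetical threshold
print([a.name for a in kept])  # prints ['car_ahead', 'car_behind']
```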
OK. In addition, the input of the encoder is information about the current frame, while the input of the decoder is information about future frames, right?
> the input of the encoder is information about the current frame
Yes
> the input of the decoder is information about future frames
No, it mainly includes information about the current frame. The exception is the waypoints, which include some future prediction.
Hi,
Is there any significance to taking indices 401 to 411 from hs (the decoder output)? Must exactly these 10 features be used, or can I also take features from the start of hs?
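One way to read the 401 to 411 range: a decoder output sequence has one slot per query, so a slice of hs selects the outputs of one query group. The sketch below assumes a layout of 400 object-density-map queries, then 1 traffic query, then 10 waypoint queries. That ordering and those counts are an assumption, not something confirmed in this thread, so check them against interfuser.py. `QUERY_LAYOUT` and `slot_ranges` are hypothetical names.

```python
# Hypothetical query layout as (group name, number of queries), in order.
QUERY_LAYOUT = [("density_map", 400), ("traffic", 1), ("waypoints", 10)]


def slot_ranges(layout):
    """Map each query group to its half-open [start, stop) index range."""
    ranges = {}
    start = 0
    for name, count in layout:
        ranges[name] = (start, start + count)
        start += count
    return ranges


ranges = slot_ranges(QUERY_LAYOUT)

# Stand-in for the decoder outputs of one sample: one entry per query.
hs = [f"out_{i}" for i in range(411)]

start, stop = ranges["waypoints"]
waypoint_features = hs[start:stop]
print(ranges["waypoints"], len(waypoint_features))  # prints "(401, 411) 10"
```

Under this assumed layout, the first ten entries of hs would belong to the density-map group, so they would not be interchangeable with the waypoint slots.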