DSODSL Output tensor format

Question

anindya7 opened this issue 4 years ago · 1 comments

Hi, thank you for the well-written implementation of SSD text detection in Keras. I have been using the code from SL_end2end_predict.ipynb to get the detected characters along with their bounding boxes.

The detection model's output has shape (batches, 5461, 31). How can I retrieve the coordinates of a predicted box from it?

Thank you.

Answer 1 · 2020-08-17T19:42:11.000Z

Inferred from the docstring of the `decode` function in sl_utils.py:

```python
def decode(self, model_output, segment_threshold=0.55, link_threshold=0.35,
           top_k_segments=800, debug=False, debug_combining=False):
    """Decode local classification and regression results to combined bounding boxes.

    # Arguments
        model_output: Array with SegLink model output of shape
            (segments, 2 segment_label + 5 segment_offset + 2*8 inter_layer_links_label
            + 2*4 cross_layer_links_label)
        segment_threshold: Threshold for filtering segment confidence, float between 0 and 1.
        link_threshold: Threshold for filtering link confidence, float between 0 and 1.

    # Return
        Array with rboxes of shape (results, x + y + w + h + theta + confidence).
    """
```
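
The channel counts in that docstring add up to the 31 from the question: 2 + 5 + 2*8 + 2*4 = 31. The following is only a rough sketch of how one sample could be split into those groups and how a decoded row could be unpacked. It assumes the channels are laid out along the last axis in the order the docstring lists them, which the docstring does not explicitly guarantee. `split_channels` is a hypothetical helper, and the arrays are dummy stand-ins, not real model output.

```python
import numpy as np

# Channel counts as listed in the decode() docstring. Their order along the
# last axis is an ASSUMPTION based on the order the docstring lists them in.
N_SEG_LABEL = 2
N_SEG_OFFSET = 5
N_INTER_LINKS = 2 * 8
N_CROSS_LINKS = 2 * 4
assert N_SEG_LABEL + N_SEG_OFFSET + N_INTER_LINKS + N_CROSS_LINKS == 31


def split_channels(sample_output):
    """Hypothetical helper: split one sample of shape (segments, 31) into four groups."""
    # Split points after the segment labels, segment offsets and inter-layer links.
    bounds = np.cumsum([N_SEG_LABEL, N_SEG_OFFSET, N_INTER_LINKS])
    seg_label, seg_offset, inter_links, cross_links = np.split(
        sample_output, bounds, axis=-1
    )
    return seg_label, seg_offset, inter_links, cross_links


# Dummy stand-in for the model output, using the shape from the question.
batch_output = np.zeros((1, 5461, 31), dtype=np.float32)

# The docstring describes model_output without a batch axis, so take one sample.
seg_label, seg_offset, inter_links, cross_links = split_channels(batch_output[0])
print(seg_label.shape, seg_offset.shape, inter_links.shape, cross_links.shape)

# Dummy stand-in for the array decode() is documented to return:
# one row per result with the columns x, y, w, h, theta, confidence.
rboxes = np.array([[100.0, 50.0, 80.0, 20.0, 0.1, 0.9]])
x, y, w, h, theta, confidence = rboxes.T
```

The raw 5 segment_offset values are regression outputs for individual segments rather than final boxes, so in practice the box coordinates would come from the first five columns of whatever `decode` returns when it is called on a single sample such as `batch_output[0]`.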