edenai/edenai-apis

Google Object Localizer results parsed incorrectly

mockturtl opened this issue · 4 comments

Thanks for this tool!

There seems to be a problem parsing the response for an image detection API call with the google provider.

To reproduce

Google's documentation for the Cloud Vision API's "Detect multiple objects" feature has a live example (scroll to the bottom, "Try this method").

For the reference image, https://cloud.google.com/vision/docs/images/bicycle_example.png, it produces the following output:

  • Bicycle wheel (x2)
  • Bicycle
  • Picture frame
Correct output (raw Google response):
{
  "responses": [
    {
      "localizedObjectAnnotations": [
        {
          "mid": "/m/01bqk0",
          "name": "Bicycle wheel",
          "score": 0.94234306,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.31524897,
                "y": 0.78658724
              },
              {
                "x": 0.44186485,
                "y": 0.78658724
              },
              {
                "x": 0.44186485,
                "y": 0.9692919
              },
              {
                "x": 0.31524897,
                "y": 0.9692919
              }
            ]
          }
        },
        {
          "mid": "/m/01bqk0",
          "name": "Bicycle wheel",
          "score": 0.9337022,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.50342137,
                "y": 0.7553652
              },
              {
                "x": 0.6289583,
                "y": 0.7553652
              },
              {
                "x": 0.6289583,
                "y": 0.9428141
              },
              {
                "x": 0.50342137,
                "y": 0.9428141
              }
            ]
          }
        },
        {
          "mid": "/m/0199g",
          "name": "Bicycle",
          "score": 0.8973106,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.31594256,
                "y": 0.66489404
              },
              {
                "x": 0.63338375,
                "y": 0.66489404
              },
              {
                "x": 0.63338375,
                "y": 0.9687162
              },
              {
                "x": 0.31594256,
                "y": 0.9687162
              }
            ]
          }
        },
        {
          "mid": "/m/06z37_",
          "name": "Picture frame",
          "score": 0.7171168,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.7882889,
                "y": 0.16610023
              },
              {
                "x": 0.9662418,
                "y": 0.16610023
              },
              {
                "x": 0.9662418,
                "y": 0.3178568
              },
              {
                "x": 0.7882889,
                "y": 0.3178568
              }
            ]
          }
        }
      ]
    }
  ]
}
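
The same raw response can also be fetched outside the browser demo with a direct images:annotate call. A minimal sketch in Python (the API key is a placeholder, and maxResults is an arbitrary choice):

import requests

GOOGLE_API_KEY = "YOUR_GOOGLE_API_KEY"  # placeholder

# Ask the Vision API to localize objects in the publicly hosted example image.
body = {
    "requests": [
        {
            "image": {
                "source": {
                    "imageUri": "https://cloud.google.com/vision/docs/images/bicycle_example.png"
                }
            },
            "features": [{"type": "OBJECT_LOCALIZATION", "maxResults": 10}],
        }
    ]
}

resp = requests.post(
    f"https://vision.googleapis.com/v1/images:annotate?key={GOOGLE_API_KEY}",
    json=body,
)
print(resp.json())  # should match the raw response above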

When I pass the same image to Eden (with providers=google), the output has duplicate items.
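
For reference, the call was made roughly as follows. This is a sketch, not a verbatim reproduction: the endpoint and providers=google come from this report, but the file_url field name and the bearer-token header are assumptions, so check them against the current Eden AI docs.

import requests

EDEN_API_KEY = "YOUR_EDEN_AI_API_KEY"  # placeholder

response = requests.post(
    "https://api.edenai.run/v2/image/object_detection",
    headers={"Authorization": f"Bearer {EDEN_API_KEY}"},  # assumed auth scheme
    json={
        "providers": "google",
        # Assumed parameter name for a remotely hosted image.
        "file_url": "https://cloud.google.com/vision/docs/images/bicycle_example.png",
    },
)
print(response.json()["google"]["items"])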

  • In the "google" JSON object, the x_min, x_max, y_min, y_max fields appear to be populated incrementally, so each detected object shows up as four "copies" (one per bounding-polygon vertex). For each object, note that

    • the first "copy" has no logical size (x_min == x_max && y_min == y_max),
    • the second "copy" has no logical height (y_min == y_max),
    • the third and fourth "copies" are identical.
  • The "eden-ai" JSON object carries the same redundant "copies" of each item (three per object, the exact duplicate having been dropped).

A sketch of the mapping I would expect appears after the Eden AI output below.

Eden AI output:
{
  "google": {
    "status": "success",
    "items": [
      {
        "label": "Bicycle wheel",
        "confidence": 0.94234306,
        "x_min": 0.31524897,
        "x_max": 0.31524897,
        "y_min": 0.78658724,
        "y_max": 0.78658724
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.94234306,
        "x_min": 0.31524897,
        "x_max": 0.44186485,
        "y_min": 0.78658724,
        "y_max": 0.78658724
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.94234306,
        "x_min": 0.31524897,
        "x_max": 0.44186485,
        "y_min": 0.78658724,
        "y_max": 0.9692919
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.94234306,
        "x_min": 0.31524897,
        "x_max": 0.44186485,
        "y_min": 0.78658724,
        "y_max": 0.9692919
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.93370223,
        "x_min": 0.50342137,
        "x_max": 0.50342137,
        "y_min": 0.7553652,
        "y_max": 0.7553652
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.93370223,
        "x_min": 0.50342137,
        "x_max": 0.6289583,
        "y_min": 0.7553652,
        "y_max": 0.7553652
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.93370223,
        "x_min": 0.50342137,
        "x_max": 0.6289583,
        "y_min": 0.7553652,
        "y_max": 0.9428141
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.93370223,
        "x_min": 0.50342137,
        "x_max": 0.6289583,
        "y_min": 0.7553652,
        "y_max": 0.9428141
      },
      {
        "label": "Bicycle",
        "confidence": 0.89731073,
        "x_min": 0.31594256,
        "x_max": 0.31594256,
        "y_min": 0.66489404,
        "y_max": 0.66489404
      },
      {
        "label": "Bicycle",
        "confidence": 0.89731073,
        "x_min": 0.31594256,
        "x_max": 0.63338375,
        "y_min": 0.66489404,
        "y_max": 0.66489404
      },
      {
        "label": "Bicycle",
        "confidence": 0.89731073,
        "x_min": 0.31594256,
        "x_max": 0.63338375,
        "y_min": 0.66489404,
        "y_max": 0.9687162
      },
      {
        "label": "Bicycle",
        "confidence": 0.89731073,
        "x_min": 0.31594256,
        "x_max": 0.63338375,
        "y_min": 0.66489404,
        "y_max": 0.9687162
      },
      {
        "label": "Picture frame",
        "confidence": 0.7171168,
        "x_min": 0.7882889,
        "x_max": 0.7882889,
        "y_min": 0.16610023,
        "y_max": 0.16610023
      },
      {
        "label": "Picture frame",
        "confidence": 0.7171168,
        "x_min": 0.7882889,
        "x_max": 0.9662418,
        "y_min": 0.16610023,
        "y_max": 0.16610023
      },
      {
        "label": "Picture frame",
        "confidence": 0.7171168,
        "x_min": 0.7882889,
        "x_max": 0.9662418,
        "y_min": 0.16610023,
        "y_max": 0.3178568
      },
      {
        "label": "Picture frame",
        "confidence": 0.7171168,
        "x_min": 0.7882889,
        "x_max": 0.9662418,
        "y_min": 0.16610023,
        "y_max": 0.3178568
      }
    ],
    "cost": 0.00225
  },
  "eden-ai": {
    "status": "success",
    "items": [
      {
        "label": "Bicycle wheel",
        "confidence": 0.94234306,
        "x_min": 0.31524897,
        "x_max": 0.31524897,
        "y_min": 0.78658724,
        "y_max": 0.78658724
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.94234306,
        "x_min": 0.31524897,
        "x_max": 0.44186485,
        "y_min": 0.78658724,
        "y_max": 0.78658724
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.94234306,
        "x_min": 0.31524897,
        "x_max": 0.44186485,
        "y_min": 0.78658724,
        "y_max": 0.9692919
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.93370223,
        "x_min": 0.50342137,
        "x_max": 0.50342137,
        "y_min": 0.7553652,
        "y_max": 0.7553652
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.93370223,
        "x_min": 0.50342137,
        "x_max": 0.6289583,
        "y_min": 0.7553652,
        "y_max": 0.7553652
      },
      {
        "label": "Bicycle wheel",
        "confidence": 0.93370223,
        "x_min": 0.50342137,
        "x_max": 0.6289583,
        "y_min": 0.7553652,
        "y_max": 0.9428141
      },
      {
        "label": "Bicycle",
        "confidence": 0.89731073,
        "x_min": 0.31594256,
        "x_max": 0.31594256,
        "y_min": 0.66489404,
        "y_max": 0.66489404
      },
      {
        "label": "Bicycle",
        "confidence": 0.89731073,
        "x_min": 0.31594256,
        "x_max": 0.63338375,
        "y_min": 0.66489404,
        "y_max": 0.66489404
      },
      {
        "label": "Bicycle",
        "confidence": 0.89731073,
        "x_min": 0.31594256,
        "x_max": 0.63338375,
        "y_min": 0.66489404,
        "y_max": 0.9687162
      },
      {
        "label": "Picture frame",
        "confidence": 0.7171168,
        "x_min": 0.7882889,
        "x_max": 0.7882889,
        "y_min": 0.16610023,
        "y_max": 0.16610023
      },
      {
        "label": "Picture frame",
        "confidence": 0.7171168,
        "x_min": 0.7882889,
        "x_max": 0.9662418,
        "y_min": 0.16610023,
        "y_max": 0.16610023
      },
      {
        "label": "Picture frame",
        "confidence": 0.7171168,
        "x_min": 0.7882889,
        "x_max": 0.9662418,
        "y_min": 0.16610023,
        "y_max": 0.3178568
      }
    ]
  }
}
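
For comparison, here is a minimal sketch of the mapping I would expect: one output item per localizedObjectAnnotation, with the box taken as the min/max over all of its normalizedVertices. This is illustrative only, not the project's actual parsing code.

# One item per annotation; the bounding box is computed once over all
# normalized vertices (Google omits x or y when the value is 0, hence .get).
def parse_localized_objects(google_response: dict) -> list[dict]:
    items = []
    for resp in google_response.get("responses", []):
        for ann in resp.get("localizedObjectAnnotations", []):
            vertices = ann["boundingPoly"]["normalizedVertices"]
            xs = [v.get("x", 0.0) for v in vertices]
            ys = [v.get("y", 0.0) for v in vertices]
            items.append({
                "label": ann["name"],
                "confidence": ann["score"],
                "x_min": min(xs),
                "x_max": max(xs),
                "y_min": min(ys),
                "y_max": max(ys),
            })
    return items

Run against the raw Google response above, this yields exactly four items (two bicycle wheels, one bicycle, one picture frame), matching the live demo.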

Hello @mockturtl,
Thank you for pointing out this issue; we will investigate and hopefully fix it as soon as possible.
Best,

Hello @mockturtl,
We have fixed this issue; thank you for bringing it to our attention!
Best,

@floflokie @DninoAdnane Any idea when this fix will be deployed/released? (ref: 33ade09)

I still see the bug when I send requests to https://api.edenai.run/v2/image/object_detection, or visit https://app.edenai.run/bricks/image/object-detection with a clean browser cache.

Hello @mockturtl,
It should be deployed this week, probably within the next 2 or 3 days!