How are npy files in the directory "link_npy" generated?

Question

How are npy files in the directory "link_npy" generated?

Closed this issue 6 months ago · 1 comments

I printed npy[0], and the output is:

{
  'name': '101_alfonso',
  'bbox': tensor([[3.2000e+01, 0.0000e+00, 4.6000e-01, 7.3333e-02, 5.4000e-01],
                  [1.8000e+01, 9.6667e-02, 4.9667e-01, 1.5000e-01, 5.2333e-01],
                  [1.0000e+00, 1.9667e-01, 4.0000e-01, 2.3000e-01, 5.8667e-01],
                  [6.7000e+01, 2.5333e-01, 4.5667e-01, 3.5667e-01, 5.2000e-01],
                  [8.2000e+01, 2.5000e-01, 5.4333e-01, 2.8333e-01, 5.8333e-01],
                  [1.8000e+01, 2.9667e-01, 5.6333e-01, 3.2667e-01, 5.8333e-01],
                  [9.0000e+00, 3.4000e-01, 5.5000e-01, 3.8000e-01, 5.8667e-01],
                  [8.7000e+01, 2.8000e-01, 4.1333e-01, 3.1333e-01, 4.4667e-01],
                  [6.9000e+01, 3.9333e-01, 4.5667e-01, 4.4000e-01, 5.1333e-01],
                  [8.2000e+01, 4.3333e-01, 5.0667e-01, 4.7000e-01, 5.4333e-01],
                  [5.0000e+00, 4.8000e-01, 4.9000e-01, 5.2667e-01, 4.9667e-01],
                  [1.0000e+00, 5.7333e-01, 4.0667e-01, 5.9667e-01, 5.5000e-01],
                  [8.7000e+01, 6.1000e-01, 4.6333e-01, 6.5333e-01, 5.1333e-01],
                  [5.0000e+00, 6.7333e-01, 4.8667e-01, 7.0667e-01, 4.9333e-01],
                  [1.0000e+01, 7.2000e-01, 4.5667e-01, 7.6667e-01, 5.0667e-01],
                  [2.0000e+00, 7.5000e-01, 4.0333e-01, 7.8667e-01, 5.5000e-01],
                  [6.0000e+01, 8.0333e-01, 4.5667e-01, 8.5333e-01, 5.1333e-01],
                  [2.0000e+00, 8.1000e-01, 3.8667e-01, 8.7000e-01, 6.0667e-01],
                  [9.1000e+01, 9.0333e-01, 4.6000e-01, 9.5000e-01, 5.2333e-01],
                  [1.0000e+01, 9.5667e-01, 4.0333e-01, 1.0000e+00, 4.5000e-01]]), 
  'edge_type': tensor([[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 2, 0, 3, 0, 0, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 2, 0, 0, 0, 0, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 4],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0]])
}

I want to switch to my own dataset, whose annotations look like this:

[
  {
      "image": "xxxx.jpg",
      "annotations": [
        {
          "label": "a",
          "coordinates": {
            "x": 78,
            "y": 83.5,
            "width": 29.5,
            "height": 161
          }
        },
        {
          "label": "b",
          "coordinates": {
            "x": 120,
            "y": 68.5,
            "width": 31.5,
            "height": 137
          }
        },
        {
          "label": "c",
          "coordinates": {
            "x": 57.5,
            "y": 143,
            "width": 33.5,
            "height": 44.5
          }
        },
        ......
      ],
      "edges": ["'a', 'b', 'right'", "'b', 'c', 'sub'"]
  },
  ......
]

How should I preprocess my dataset so that it works with your code?
I'd appreciate it if you could help me with this.
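
One possible starting point for such a conversion is sketched below. It is not the repository's own preprocessing script, and it rests on several assumptions: that "x" and "y" are the box centre in pixels, that each row of 'bbox' is [symbol id, x1, y1, x2, y2] with coordinates scaled to 0..1 by the image size, that labels are unique within an image, and that the id mappings passed in (hypothetical here) really come from vocab.json. The helper name convert_record and its parameters are made up for illustration.

```python
import ast
import os


def convert_record(record, symbol_ids, relation_ids, image_size, inverse=None):
    """Turn one annotation record into a dict with 'name', 'bbox', 'edge_type'.

    symbol_ids / relation_ids: hypothetical label -> integer id mappings
    (the real ids would have to be taken from vocab.json).
    image_size: (width, height) in pixels, used for normalisation (assumption).
    inverse: optional mapping relation -> inverse relation, used to fill the
    mirrored matrix entry as the printed sample seems to do.
    """
    width, height = image_size
    annotations = record["annotations"]
    # ASSUMPTION: labels are unique per image, so they can identify nodes.
    index_of = {a["label"]: i for i, a in enumerate(annotations)}

    bbox = []
    for a in annotations:
        c = a["coordinates"]
        # ASSUMPTION: (x, y) is the box centre; convert to corner coordinates.
        x1 = (c["x"] - c["width"] / 2) / width
        y1 = (c["y"] - c["height"] / 2) / height
        x2 = (c["x"] + c["width"] / 2) / width
        y2 = (c["y"] + c["height"] / 2) / height
        bbox.append([float(symbol_ids[a["label"]]), x1, y1, x2, y2])

    n = len(annotations)
    edge_type = [[0] * n for _ in range(n)]  # 0 is taken to mean "no relation"
    for edge in record.get("edges", []):
        # Each edge is a string such as "'a', 'b', 'right'".
        src, dst, rel = ast.literal_eval(edge)
        i, j = index_of[src], index_of[dst]
        edge_type[i][j] = relation_ids[rel]
        if inverse and rel in inverse:
            edge_type[j][i] = relation_ids[inverse[rel]]

    name = os.path.splitext(record["image"])[0]
    return {"name": name, "bbox": bbox, "edge_type": edge_type}


record = {
    "image": "xxxx.jpg",
    "annotations": [
        {"label": "a", "coordinates": {"x": 78, "y": 83.5, "width": 29.5, "height": 161}},
        {"label": "b", "coordinates": {"x": 120, "y": 68.5, "width": 31.5, "height": 137}},
        {"label": "c", "coordinates": {"x": 57.5, "y": 143, "width": 33.5, "height": 44.5}},
    ],
    "edges": ["'a', 'b', 'right'", "'b', 'c', 'sub'"],
}
# Placeholder ids and image size, chosen only to make the example run.
symbol_ids = {"a": 1, "b": 2, "c": 3}
relation_ids = {"right": 1, "sub": 3}

sample = convert_record(record, symbol_ids, relation_ids, image_size=(200, 200))
print(sample["edge_type"])  # [[0, 1, 0], [0, 0, 3], [0, 0, 0]]
```

The result uses plain lists. To match the printed sample, 'bbox' and 'edge_type' would presumably still need to be wrapped with torch.tensor before the list of dicts is saved.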

Answer 1 · 2024-07-02T06:03:30.000Z

I changed how I printed it, and I now understand the data structure.
The symbol types and relation types are listed in vocab.json.

(dict_keys(['name', 'bbox', 'edge_type']),
 {'name': '2_em_3',
  'bbox': tensor([[71.0000,  0.0000,  0.1433,  0.2067,  0.8533],
                  [33.0000,  0.3267,  0.2133,  0.7567,  0.6367],
                  [72.0000,  0.9167,  0.1400,  1.0000,  0.8200]]),
  'edge_type': tensor([[0, 1, 0],
                       [2, 0, 1],
                       [0, 2, 0]])})
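
To make the structure above easier to read, here is a small sketch that unpacks one entry. It assumes, based only on the printed values, that each 'bbox' row is [symbol id, x1, y1, x2, y2] with normalised coordinates, and that 'edge_type'[i][j] holds the relation id from node i to node j, with 0 meaning no relation. The function names are hypothetical, and the ids would still need to be looked up in vocab.json.

```python
def to_list(value):
    """Accept either a tensor-like object (with .tolist()) or a nested list."""
    return value.tolist() if hasattr(value, "tolist") else value


def split_bbox(sample):
    """Return (symbol_id, [x1, y1, x2, y2]) per node (layout is an assumption)."""
    return [(int(row[0]), row[1:]) for row in to_list(sample["bbox"])]


def list_edges(sample):
    """Return (source_index, target_index, relation_id) for non-zero entries."""
    edges = []
    for i, row in enumerate(to_list(sample["edge_type"])):
        for j, rel in enumerate(row):
            if rel != 0:
                edges.append((i, j, int(rel)))
    return edges


# The entry printed above, written with plain lists instead of tensors.
sample = {
    "name": "2_em_3",
    "bbox": [[71.0, 0.0, 0.1433, 0.2067, 0.8533],
             [33.0, 0.3267, 0.2133, 0.7567, 0.6367],
             [72.0, 0.9167, 0.1400, 1.0, 0.8200]],
    "edge_type": [[0, 1, 0],
                  [2, 0, 1],
                  [0, 2, 0]],
}

print([symbol_id for symbol_id, _ in split_bbox(sample)])  # [71, 33, 72]
print(list_edges(sample))  # [(0, 1, 1), (1, 0, 2), (1, 2, 1), (2, 1, 2)]
```

In this entry every edge with id 1 from i to j is mirrored by an edge with id 2 from j to i, which suggests that the two ids are inverse relations. That is an inference from the printed data, not something the thread confirms.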