hkchengrex/Tracking-Anything-with-DEVA

something about data format

wkywwds opened this issue · 29 comments

If I want to replace the data format in "example/vipseg" with my own (as shown below), do I need to give the masked objects (shown in red) different IDs, that is, label them with different colors? And do I need to do this for every picture in my own dataset?
[image: 00037]

The masks should be read by PIL as index masks. See https://github.com/hkchengrex/XMem/blob/main/docs/PALETTE.md
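
For reference, an index mask stores one integer object ID per pixel; the palette only controls how those IDs are displayed. A minimal sketch of writing one with PIL (the sizes, IDs, and colors below are made up for illustration):

import numpy as np
from PIL import Image

# Toy index mask: 0 = background, 1 and 2 are two objects
mask = np.zeros((480, 640), dtype=np.uint8)
mask[100:200, 100:200] = 1
mask[250:350, 300:450] = 2

img = Image.fromarray(mask)  # single-channel 'L' image
# Attaching a palette converts it to mode 'P'; the colors are only
# for visualization -- the model reads the integer indices
img.putpalette([0, 0, 0, 255, 0, 0, 0, 255, 0])
img.save('00000.png')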

OK. But the objects with the red mask belong to the same class, and I would like DEVA to separate each of them, which means I want to mark them in different colors. In this case, do I have to give each object a different ID in the corresponding JSON file?

In this mode, we don't care about the class. It doesn't matter that they are in the same class. If you want to track them independently, label them independently.
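
As a concrete illustration (not from the thread): if your model outputs one binary mask per class, each connected blob can be assigned its own ID, e.g. with scipy.ndimage.label, assuming connected components actually correspond to distinct objects:

import numpy as np
from scipy import ndimage

# Toy binary mask containing two blobs of the same class
binary_mask = np.zeros((8, 8), dtype=bool)
binary_mask[1:3, 1:3] = True
binary_mask[5:7, 5:7] = True

# label() gives every connected component its own integer ID:
# 0 = background, 1..num_objects = independently tracked objects
index_mask, num_objects = ndimage.label(binary_mask)
index_mask = index_mask.astype(np.uint8)
print(num_objects)  # 2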

On a side note, this type of data seems very out-of-domain.

Fine. The data is CT imagery of ceramic matrix composites (CMCs), which are used in high-temperature components of aerospace vehicles. And again, I want to confirm: for the dataset required by DEVA, do I only need masks (from my own model) for a subset of the pictures, not all of them, with each object numbered in the corresponding JSON file, that is, different objects marked with different colors?

Yes.

[screenshot of console output]
Hi, where can I change the "Max allocated memory"?

Is the JSON file in the example necessary? I found that even without the JSON file, the program can still run.
[screenshot]

The json files allow users to propagate segment information (e.g., object classes) to the output. It is not strictly necessary.

Max allocated memory is just reporting the maximum amount of GPU memory allocated by PyTorch.
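
Presumably this is read from PyTorch's allocator statistics, e.g. (an assumption about the exact call; check the repo's source):

import torch

# Peak GPU memory allocated by PyTorch in this process, in MB
print(f'Max allocated memory (MB): {torch.cuda.max_memory_allocated() / (2 ** 20):.0f}')

It is a report rather than a setting, so there is nothing to change there.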

Why isn't the result ideal when I run my dataset with DEVA?
[image: 00000]
The mask is this:
[image: 00000]

I think your mask is an RGB image and not an index mask as mentioned above.

If so, how do I convert the RGB mask into an index mask?

> The masks should be read by PIL as index masks. See https://github.com/hkchengrex/XMem/blob/main/docs/PALETTE.md

Please see this reply from above.
Simply put, the underlying data structure should be a single-channel integer mask. You can verify this by reading the image with PIL and converting it to a numpy array.
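
A quick check along those lines (the path is hypothetical):

from PIL import Image
import numpy as np

mask = Image.open('mask.png')   # one of your mask files
print(mask.mode)                # should be 'P' or 'L', not 'RGB'
arr = np.array(mask)
print(arr.shape)                # should be (H, W), i.e. single-channel
print(np.unique(arr))           # the object IDs present, e.g. [0 1 2]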

Do you mean converting the colour mask into a palette mask? After converting to a palette mask with PIL, the result still seems unsatisfactory.
[screenshot]
[screenshot]

[image: 00003]

The conversion is wrong. You would need to find the unique colors in the image and remap the pixels.

The following is a response from Claude 3 Sonnet.

To find the unique colors in an RGB image and remap the corresponding pixels to an index mask using PIL (Python Imaging Library) and NumPy, you can follow these steps:

  1. Load the image using PIL and convert it to a NumPy array.
  2. Reshape the NumPy array to a 2D array, where each row represents a pixel and each column represents the R, G, and B values.
  3. Use NumPy's unique function to find the unique color combinations in the reshaped array.
  4. Create a mapping from each unique color combination to a unique index.
  5. Use the color-to-index mapping to look up the index of each pixel's color combination.
  6. Create a new NumPy array with the same shape as the original image, but with the indices from the previous step as the values.

Here's the code to implement this:

from PIL import Image
import numpy as np

# Load the mask image; masks should be stored losslessly (e.g., PNG) --
# JPEG compression introduces spurious colors
image = Image.open('image.png').convert('RGB')

# Convert the image to a NumPy array of shape (H, W, 3)
image_array = np.array(image)

# Reshape the array to a 2D array (pixels x RGB)
reshaped_array = image_array.reshape(-1, 3)

# Find unique color combinations
unique_colors = np.unique(reshaped_array, axis=0)

# Create a mapping from unique colors to indices
color_to_index = {tuple(color): index for index, color in enumerate(unique_colors)}

# Find the index of each pixel's color combination
indices = np.array([color_to_index[tuple(color)] for color in reshaped_array])

# Reshape the indices back to the original image shape
# (uint8 assumes fewer than 256 unique colors)
index_mask = indices.reshape(image_array.shape[0], image_array.shape[1]).astype(np.uint8)

In this code:

  • image_array is the NumPy array representation of the image.
  • reshaped_array is a 2D array where each row represents a pixel and each column represents the R, G, and B values.
  • unique_colors is a 2D array containing the unique color combinations in the image.
  • color_to_index is a dictionary that maps each unique color combination to a unique index.
  • indices is a 1D array containing the index of each pixel's color combination in the unique_colors array.
  • index_mask is a 2D array with the same shape as the original image, but with the indices from indices as the values.

After running this code, index_mask will contain the index mask, where each pixel value corresponds to the index of its color combination in the unique_colors array.

Note that this approach assumes that the image has a limited number of unique colors. For images with a large number of unique colors (e.g., high-resolution photographs), this method may not be efficient, and you might need to use alternative techniques, such as color quantization or clustering.
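
As an aside, NumPy can fold steps 3-5 into a single call: np.unique with return_inverse=True returns both the unique colors and each pixel's index into them, avoiding the per-pixel Python loop (a sketch with a hypothetical path):

import numpy as np
from PIL import Image

rgb = np.array(Image.open('mask.png').convert('RGB'))

# One call finds the unique colors and each pixel's index into them
unique_colors, inverse = np.unique(rgb.reshape(-1, 3), axis=0, return_inverse=True)
index_mask = inverse.reshape(rgb.shape[:2]).astype(np.uint8)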

I wrote a program based on the above code, but the effect is not very good. What is going wrong this time?
[screenshot]

[image: 00000]
[image: 00000]

Two observations:

  1. The cropped version is working better
  2. The input and output colors don't match

It seems to me that your mask input (or conversion) is still buggy.
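
One possible source of such a bug (an assumption on my part, not confirmed in this thread): if the color-to-index mapping is recomputed per frame with np.unique, the same object can receive a different index in different frames whenever the set of colors present changes. Fixing one mapping and applying it to every frame avoids this (a sketch; the colors and path are made up):

import numpy as np
from PIL import Image

# Hypothetical fixed mapping shared by all frames; background must map to 0
COLOR_TO_ID = {
    (0, 0, 0): 0,     # background
    (255, 0, 0): 1,   # object 1
    (0, 255, 0): 2,   # object 2
}

def rgb_mask_to_index(path):
    rgb = np.array(Image.open(path).convert('RGB'))
    index_mask = np.zeros(rgb.shape[:2], dtype=np.uint8)
    for color, obj_id in COLOR_TO_ID.items():
        index_mask[np.all(rgb == color, axis=-1)] = obj_id
    return index_mask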

After further testing, I found that the program runs normally even when the mask is RGB, and the fewer targets there are, the better the prediction. May I ask whether DEVA's prediction quality degrades when there is a larger number of targets?
[image: 00000]
[image: 00000]
[image: 00000]

Thank you for the update. It is possible that having many targets degrades the output (due to increased noise in memory matching), especially in out-of-domain cases like yours.

Is there a solution for this case?

  1. Can you provide all the output frames up to the point of failure so that we can have a closer look?
  2. What if you supply fewer objects but the image remains uncropped?

Hello, have you received the file I sent? Have you found the reason why DEVA doesn't work well on my dataset?

No, I haven't received anything.

UNet is used first, and then the connected components are filtered by an area threshold to form the mask. Only seven images have been tested.
IMG.zip

Sorry, I cannot check it right now. I'll get back to this later.

Can you check the "IMG.zip" file? If not, I'll send it to you by email.

Have you found the cause of the problem when DEVA tracks multiple targets?

I haven't had the time to test it yet.