keras-team/keras-cv

Division of data into training and validation set & COCO Metric Callback not working with Keras CV implementation as expected

Opened this issue · 2 comments

Filing this as an issue, since I just encountered it as well.

I had to make a slight edit to the callback, changing images, y_true = batch[0], batch[1] to images, y_true = batch['images'], batch['bounding_boxes'], since for some reason it errored out when I used batch[0]. I'm using Keras 3.1.1.

Discussed in #2126

Originally posted by Inshu32 November 6, 2023
I am trying to implement a KerasCV-based pipeline to train a custom dataset using the https://keras.io/examples/vision/yolov8/ example. I have an object detection problem with 6 classes. I am facing two issues:
1. While dividing the dataset using take and skip, the data is split sequentially, so the first 2 classes end up in the validation set and the remaining 4 in the training set. This is a problem because the model is trained and evaluated on disjoint data. I used tf.data shuffle to mitigate this, but the split still doesn't guarantee that all classes are represented in both the training and validation sets.
2. While running Yolo.fit, I expect the algorithm to evaluate the predictions using a COCO metric callback, for which I am using the following function:

class EvaluateCOCOMetricsCallback(keras.callbacks.Callback):
    def __init__(self, data, save_path):
        super().__init__()
        self.data = data
        self.metrics = keras_cv.metrics.BoxCOCOMetrics(
            bounding_box_format="xyxy",
            evaluate_freq=1e9,
        )
        self.save_path = save_path
        self.best_map = -1.0

    def on_epoch_end(self, epoch, logs):
        self.metrics.reset_state()
        for batch in self.data:
            images, y_true = batch[0], batch[1]
            y_pred = self.model.predict(images, verbose=0)
            self.metrics.update_state(y_true, y_pred)

        metrics = self.metrics.result(force=True)
        logs.update(metrics)

        current_map = metrics["MaP"]
        if current_map > self.best_map:
            self.best_map = current_map
            self.model.save(self.save_path)  # Save the model when mAP improves

        return logs

Which produces the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node _wrapped__ConcatV2_N_13_device/job:localhost/replica:0/task:0/device:CPU:0}} ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [32,2,4] vs. shape[2] = [32,1,4] [Op:ConcatV2] name: concat

More detailed traceback:
File "/home/lib/python3.9/site-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 262, in _compute_result
    _box_concat(self.ground_truths),
File "/home/lib/python3.9/site-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 44, in _box_concat
    result[key] = tf.concat([b[key] for b in boxes], axis=0)
To my understanding, this is a problem with multiple bounding boxes in one image. A ragged tensor solves the problem during training with multiple bounding boxes. In the case above, I think there is only one predicted bounding box while the ground truth has 2 bounding boxes for the same image. How can I solve this problem?
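On the first issue (the sequential take/skip split), a minimal sketch of the usual fix: shuffle once with a fixed seed and reshuffle_each_iteration=False BEFORE calling take/skip, so both calls see the same randomized order and the splits are disjoint and class-mixed. The toy dataset, seed, and sizes below are illustrative assumptions, not taken from the original pipeline.

```python
import tensorflow as tf

num_examples = 100
# Toy stand-in dataset: (example_id, class_id) pairs, with the 6 classes
# appearing in sequential order, mimicking the problem described above.
ds = tf.data.Dataset.from_tensor_slices(
    (tf.range(num_examples), tf.repeat(tf.range(6), 17)[:num_examples])
)

# reshuffle_each_iteration=False (plus a fixed seed) freezes the shuffled
# order, so take() and skip() iterate over the SAME order and the two
# splits never overlap.
ds = ds.shuffle(buffer_size=num_examples, seed=42,
                reshuffle_each_iteration=False)

val_size = 20
val_ds = ds.take(val_size)
train_ds = ds.skip(val_size)

train_classes = {int(c) for _, c in train_ds}
val_classes = {int(c) for _, c in val_ds}
print(sorted(train_classes), sorted(val_classes))
```

Note that a random split still does not strictly guarantee every class appears in a small validation set; for that, a stratified split (grouping examples by class before sampling) would be needed.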

This error happens when box_coco_metrics.py concatenates the ground-truth bounding boxes, which can have different counts per image. Perhaps we should pad the smaller tensor with zeros so the concat works?
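The padding idea above can be sketched as follows: pad every [batch, num_boxes, 4] tensor up to the largest per-image box count (here with a -1 sentinel, which KerasCV conventionally uses for padded boxes) before concatenating, so tf.concat no longer fails on a mismatched dimension 1. The shapes mirror the error message; the helper name is hypothetical, not part of keras_cv.

```python
import tensorflow as tf

def pad_boxes(batches, sentinel=-1.0):
    """Pad each [batch, num_boxes, 4] tensor to a common num_boxes, then concat."""
    max_boxes = max(int(b.shape[1]) for b in batches)
    padded = []
    for b in batches:
        pad = max_boxes - int(b.shape[1])
        # Pad only the num_boxes axis (axis 1) with the sentinel value.
        padded.append(tf.pad(b, [[0, 0], [0, pad], [0, 0]],
                             constant_values=sentinel))
    return tf.concat(padded, axis=0)

a = tf.zeros([32, 2, 4])  # shape[0] from the error: 2 boxes per image
b = tf.zeros([32, 1, 4])  # shape[2] from the error: 1 box per image
out = pad_boxes([a, b])
print(out.shape)  # (64, 2, 4)
```

Alternatively, if your keras_cv version provides it, converting ragged ground-truth boxes with keras_cv.bounding_box.to_dense before passing them to the metric achieves the same dense, uniformly padded layout.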

@photown do you have a repro colab we can look at?