google-ai-edge/mediapipe

Packet timestamp mismatch - mediapipe does not recover

maikthomas opened this issue · 6 comments

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

macOS 14.6.1

MediaPipe Tasks SDK version

I tested with

  • @mediapipe/tasks-vision 0.10.18
  • @mediapipe/tasks-vision 0.10.15

Task name (e.g. Image classification, Gesture recognition etc.)

ImageSegmenter

Programming Language and version (e.g. C++, Python, Java)

JavaScript (Web)

Describe the actual behavior

MediaPipe keeps throwing the same error, with the same stale timestamps, when given new frames. It also offers no method to reset the timestamp.

This issue concerns the error below. I may open a separate issue about why the error occurs in the first place, but for now I want to address the fact that MediaPipe does not recover from it.
I forked and adapted a codepen to show this behaviour:
https://codepen.io/maikthomas/pen/GRVbWNb
In this pen, when using the webcam, on the 11th frame I deliberately pass a timestamp greater than the real one, then return to using the correct timestamps. Because that single oversized timestamp raised the graph's minimum expected timestamp, MediaPipe keeps reporting the same error on every subsequent frame.
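
A sketch of what the pen does (not the exact codepen code; segmenter and handleResult stand in for the pen's ImageSegmenter setup):

// Repro sketch: feed one timestamp far ahead of the real clock, then
// resume normal timestamps. Every later call now fails with the same
// "Packet timestamp mismatch" error.
let frameCount = 0;

function onFrame(video) {
  frameCount += 1;
  // On the 11th frame, jump one minute ahead of the real clock.
  const ts = frameCount === 11 ? Date.now() + 60_000 : Date.now();
  segmenter.segmentForVideo(video, ts, handleResult);
}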

Another possible case is when the timestamp is known to potentially go lower, for example when switching between input sources (some sources start their timestamps at 0). To avoid this I am currently using Date.now(), but it would be good to be able to reset the current minimum expected timestamp back to 0 when needed. The guard I use today is essentially the sketch below.
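
Roughly (nextTimestamp is my own helper, not a MediaPipe API):

// Always hand segmentForVideo a strictly increasing timestamp,
// regardless of what the input source reports.
let lastTimestamp = 0;

function nextTimestamp() {
  lastTimestamp = Math.max(lastTimestamp + 1, Date.now());
  return lastTimestamp;
}

segmenter.segmentForVideo(videoFrame, nextTimestamp(), handleResult);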

Describe the expected behaviour

MediaPipe uses the newly provided timestamp and recovers. Ideally it would also expose a method to reset the timestamp manually.
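
For illustration only, something like the following hypothetical API shape (nothing like this exists in @mediapipe/tasks-vision today):

// Hypothetical, NOT a real tasks-vision method:
segmenter.resetTimestamp(); // clear the graph's minimum expected timestamp
// ...after which timestamps could legally start over from 0:
segmenter.segmentForVideo(videoFrame, 0, handleResult);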

Standalone code/steps you may have used to try to get what you need

As a workaround for now I am calling segmentForVideo in a try/catch and recreating the ImageSegmenter when it fails, as sketched below.
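
Roughly like this, assuming vision and segmenterOptions are whatever the segmenter was originally created from (createFromOptions, segmentForVideo, and close are the real tasks-vision methods; the wiring is mine):

async function segmentSafely(videoFrame, timestampMs) {
  try {
    segmenter.segmentForVideo(videoFrame, timestampMs, handleResult);
  } catch (e) {
    // The graph never recovers once it has errored, so throw the
    // whole segmenter away and build a fresh one.
    segmenter.close();
    segmenter = await ImageSegmenter.createFromOptions(vision, segmenterOptions);
  }
}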

Other info / Complete Logs

vision_wasm_internal.js:10 E1120 09:57:18.478000 1895072 gl_graph_runner_internal_image.cc:68] Adding Image to stream image_in was not ok: INVALID_ARGUMENT: Graph has errors: 
Packet timestamp mismatch on a calculator receiving from stream "norm_rect". Current minimum expected timestamp is 1732096638495001 but received 1732096638472000. Are you using a custom InputStreamHandler? Note that some InputStreamHandlers allow timestamps that are not strictly monotonically increasing. See for example the ImmediateInputStreamHandler class comment. [type.googleapis.com/mediapipe.StatusList='\n\xf5\x02\x08\x03\x12\xf0\x02Packet timestamp mismatch on a calculator receiving from stream \"norm_rect\". Current minimum expected timestamp is 1732096638495001 but received 1732096638472000. Are you using a custom InputStreamHandler? Note that some InputStreamHandlers allow timestamps that are not strictly monotonically increasing. See for example the ImmediateInputStreamHandler class comment.']
=== Source Location Trace: ===
third_party/mediapipe/framework/input_stream_manager.cc:159

Hi @maikthomas,

We have a newer version available, 0.10.18. Can you try it and let us know the status?

Thank you!!

@kuaashish apologies, this was a mistake filling out the form; I have already tried with 0.10.18.
I was using 0.10.15 and upgraded to 0.10.18 and saw the same issue. I'll update this info in the issue description too.

o7si commented

Hi, @maikthomas

I encountered a similar issue when deploying MediaPipe code on a device. The device’s time was unstable and could even be manually modified by the user. If some data is cached within the pipeline for business processing, a restart could lead to data loss. A simple and effective strategy is to use an auto-increment counter as the input for the MediaPipe timestamp. This helps maintain consistency in processing when the system time is unreliable.
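
In the web API that might look like this (a sketch; the graph only requires that timestamps be strictly increasing):

// Counter-based timestamps: immune to an unstable or user-modified
// system clock, since the value never depends on wall-clock time.
let frameIndex = 0;

function onFrame(videoFrame) {
  segmenter.segmentForVideo(videoFrame, frameIndex++, handleResult);
}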

o7si commented

Hi, @kuaashish

I found the DisableTimestamps function in the code, and it seems to disable timestamp validation. However, I'm unsure about when DisableTimestamps is actually called. Is there any documentation that explains its usage?

Thank you!

ouj commented

I ran into the same problem with 0.10.14 when I was using videoFrame.timestamp. I had to switch to performance.now() for the timestamp and ignore the timestamp coming from the video frame.
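
Something like this sketch (assuming a WebCodecs VideoFrame source; VideoFrame.timestamp is in microseconds and relative to the source, so it can easily violate the graph's strictly-increasing expectation):

// Ignore the frame's own timestamp and use the page's monotonic
// clock instead, as MediaPipe's own web demos do.
function onVideoFrame(videoFrame) {
  segmenter.segmentForVideo(videoFrame, performance.now(), handleResult);
  videoFrame.close(); // release the frame once submitted
}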

MediaPipe 0.10.18, Windows 11, Python 3.10.15

ValueError: Graph has errors:
; Packet timestamp mismatch on a calculator receiving from stream "image". Current minimum expected timestamp is 3633298 but received 3599964. Are you using a custom InputStreamHandler? Note that some InputStreamHandlers allow timestamps that are not strictly monotonically increasing. See for example the ImmediateInputStreamHandler class comment.

Flask-SocketIO server:

import base64
from flask import Flask, render_template
from flask_socketio import SocketIO, send, emit
from flask_cors import CORS
import numpy as np
import cv2
import mediapipe as mp


app = Flask(__name__)
app.config['SECRET_KEY'] = 'secret!'
CORS(app, resources={r"/*": {"origins": "*"}})
socketio = SocketIO(app, cors_allowed_origins="*")

# MediaPipe setup: a single shared Pose instance handles every connection
mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose
pose = mp_pose.Pose(static_image_mode=False, smooth_landmarks=True)


def binary_to_gray_to_binary(binary_data):
    # Decode JPEG bytes, convert to grayscale, and re-encode as JPEG.
    nparr = np.frombuffer(binary_data, np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)

    gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    _, binary_data_out = cv2.imencode('.jpg', gray_img)

    return binary_data_out.tobytes()


def get_keypoints(frame, width, height):
    keypoints_list = []
    # Convert the BGR image to RGB.
    image_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    results = pose.process(image_rgb)

    if results.pose_landmarks:
        for landmark in results.pose_landmarks.landmark:
            keypoints_list.append((landmark.x * width, landmark.y * height))

    if len(keypoints_list) > 0:
        # Synthetic point 33: midpoint of the shoulders (landmarks 11, 12)
        temp_point_1_x = (keypoints_list[11][0] + keypoints_list[12][0]) / 2
        temp_point_1_y = (keypoints_list[11][1] + keypoints_list[12][1]) / 2

        # Synthetic point 34: midpoint of the mouth corners (landmarks 9, 10)
        temp_point_2_x = (keypoints_list[9][0] + keypoints_list[10][0]) / 2
        temp_point_2_y = (keypoints_list[9][1] + keypoints_list[10][1]) / 2
        keypoints_list.append((temp_point_1_x, temp_point_1_y))
        keypoints_list.append((temp_point_2_x, temp_point_2_y))

        return np.array(keypoints_list)
    return keypoints_list


def draw_skeleton_opencv(keypoints, width, height):
    image = np.zeros((height, width, 3), dtype=np.uint8)

    # Landmark pairs to connect; (33, 34) joins the two synthetic midpoints
    # appended in get_keypoints.
    connections = [
        (12, 14), (14, 16), (16, 22), (16, 18), (16, 20), (18, 20),
        (12, 24), (24, 26), (26, 28), (28, 30), (28, 32), (30, 32),
        (12, 11), (11, 23), (23, 24), (23, 25), (25, 27), (27, 29),
        (27, 31), (29, 31), (11, 13), (13, 15), (15, 21), (15, 17), (17, 19),
        # (10, 9), (8, 6), (6, 5), (5, 4), (4, 0), (0, 1), (1, 2), (2, 3), (3, 7),
        (33, 34)
    ]
    
    for connection in connections:
        pt1 = keypoints[connection[0]]
        pt2 = keypoints[connection[1]]
        cv2.line(image, (int(pt1[0]), int(pt1[1])), (int(pt2[0]), int(pt2[1])), (255, 0, 0), 50)    
    
    poly_points = [keypoints[11], keypoints[12], keypoints[24], keypoints[23]]
    
    if all(point[0] != 0 and point[1] != 0 for point in poly_points):
        pts = np.array([[int(point[0]), int(point[1])] for point in poly_points], np.int32)
        pts = pts.reshape((-1, 1, 2))
        cv2.fillPoly(image, [pts], (255, 0, 0))
    
    for index, point in enumerate(keypoints):
        if point[0] != 0 and point[1] != 0:
            if index == 0:
                cv2.circle(image, (int(point[0]), int(point[1])), 60, (255, 0, 0), -1)
            elif index in (1,2,3,4,5,6,7,8,9,10,33,34):
                pass
            else:
                cv2.circle(image, (int(point[0]), int(point[1])), 10, (0, 0, 255), -1)

    return image


@app.route('/')
def index():
    return render_template('index.html')


@socketio.on('connect')
def test_connect():
    send('AAA')


@socketio.on('disconnect')
def test_disconnect():
    send('BBB')


@socketio.on('frame')
def handle_video_frame(json_data):

    data = json_data['blob']
    timestamp = json_data['timestamp']

    # NB: this timestamp is only printed. The legacy solutions API stamps
    # frames internally, so concurrent calls into the shared Pose instance
    # (Flask-SocketIO may handle frames in parallel) can produce the
    # timestamp-mismatch error shown above.
    print(f'Received frame at {timestamp}')
    
    # Convert the binary data to a numpy array
    np_array = np.frombuffer(data, np.uint8)
    frame = cv2.imdecode(np_array, cv2.IMREAD_COLOR)

    height, width = frame.shape[:2]

    keypoints_list = get_keypoints(frame, width=width, height=height)
    if len(keypoints_list) > 0:
        new_frame = draw_skeleton_opencv(keypoints_list, width=width, height=height)

        success, buffer = cv2.imencode('.jpg', new_frame)
        if success:
            base64_string = base64.b64encode(buffer).decode('utf-8')      

            emit('123', base64_string)


if __name__ == '__main__':
    socketio.run(app, host='0.0.0.0', port=9342)  # port must be an int

Vue client:

<template>
  <div class="box" style="display: flex;">
    <video id="video1" width="640" height="480" autoplay></video>

    <img id="video" width="640" height="480" />
  </div>
</template>
<script>
import { io } from 'socket.io-client'
export default {
  created() {},
  data() {
    return {
      list: ''
    }
  },
  mounted() {
    navigator.mediaDevices
      .getUserMedia({ video: true })
      .then(function (stream) {
        const video = document.getElementById('video1')
        video.srcObject = stream
        let sock = io('http://127.0.0.1:9342')
        sock.on('connect', () => {
          console.log('000,')
        })
        sock.emit('send_message', 'c')
        sock.on('123', (base64Data) => {
          // console.log(base64Data, '11')
          const imgElement = document.getElementById('video')
          imgElement.src = 'data:image/jpeg;base64,' + base64Data
        })
        
        video.addEventListener('play', function () {
          const canvas = document.createElement('canvas')
          const ctx = canvas.getContext('2d')
          canvas.width = video.videoWidth
          canvas.height = video.videoHeight
          function sendFrame() {
            ctx.drawImage(video, 0, 0, canvas.width, canvas.height)
            const timestamp = Date.now();
            canvas.toBlob(
              (blob) => {
                // sock.emit('frame', blob)
                sock.emit('frame', { blob, timestamp });
                requestAnimationFrame(sendFrame)
              },
              'image/jpeg',
              0.8
            )
          }
          sendFrame()
        })
      })
      .catch(function (err) {
        console.error('Error accessing webcam: ' + err)
      })
  },
  methods: {}
}
</script>
<style scoped lang="scss"></style>