Cross-platform, customizable ML solutions for live and streaming media.

Primary language: C++. License: Apache License 2.0.

MediaPipe2Osc

This repository contains MediaPipe2Osc, a modified version of Google's MediaPipe framework that sends motion tracking data as Open Sound Control (OSC) encoded datagram packets. This allows landmarks and other tracking information to be used in any OSC-compatible environment (e.g. Max/MSP, Python, Pure Data, C++ or Processing; see the oscexamples folder).

[Image: MediaPipe2Osc Max example]


Build Instructions

mediapipe2osc has been tested on macOS (a Windows version is on the way). These instructions are adapted from the MediaPipe installation page, which is the place to start if you want to try this on other platforms.

Download or clone mediapipe2osc.

In a terminal change into the mediapipe2osc folder:

cd <path to mediapipe2osc>

Now install Xcode, then install its Command Line Tools:

xcode-select --install

MediaPipe requires Bazel, OpenCV and FFmpeg. I'd recommend installing these with Homebrew, which can be installed as follows:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Now install Bazelisk:

brew install bazelisk

and OpenCV, which includes FFmpeg:

brew install opencv@3

There is a known issue with the glog dependency, so uninstall glog:

brew uninstall --ignore-dependencies glog

Now build the modified MediaPipe example application:

bazel build -c opt --cxxopt='-std=c++17' --define MEDIAPIPE_DISABLE_GPU=1 mediapipe/examples/desktop/hand_tracking:hand_tracking_cpu

And run:

GLOG_logtostderr=1 bazel-bin/mediapipe/examples/desktop/hand_tracking/hand_tracking_cpu --calculator_graph_config_file=mediapipe/graphs/hand_tracking/hand_tracking_desktop_live.pbtxt

OSC Dictionary

Landmarks are streamed as UDP datagrams in OSC format on port 8000.
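As a rough sketch of the receiving side, the Python below binds a UDP socket on port 8000 and reads the OSC address from each datagram, using only the standard library. The helper names (open_osc_socket, receive_packet, osc_address) are hypothetical, and binding to all interfaces is an assumption, since the destination host is not stated here.

```python
import socket

OSC_PORT = 8000  # port stated in this README


def open_osc_socket(host="0.0.0.0", port=OSC_PORT, timeout=None):
    """Bind a UDP socket on which OSC datagrams can be received.

    The host is an assumption: the README only gives the port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    sock.bind((host, port))
    return sock


def receive_packet(sock, bufsize=4096):
    """Block until one datagram arrives and return its raw bytes."""
    packet, _sender = sock.recvfrom(bufsize)
    return packet


def osc_address(packet):
    """Return the OSC address pattern at the start of a raw packet.

    OSC strings are ASCII and null-terminated, so the address is
    everything before the first zero byte.
    """
    end = packet.index(b"\x00")
    return packet[:end].decode("ascii")


# Example use (runs until interrupted):
#     sock = open_osc_socket()
#     while True:
#         print(osc_address(receive_packet(sock)))
```

With hand tracking running, the loop in the closing comment should print /left or /right for each packet received.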

The OSC address pattern is either /left or /right for detected hands, and is followed by 63 float32 arguments which are the x, y, z coordinates of the 21 landmarks shown below.

[Image: Hand landmarks]

That is, the first three arguments are the x, y and z coordinates of the WRIST landmark, the next three are those of the second landmark, and so on. For further details see the oscexamples folder.
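To make the argument layout concrete, here is a minimal sketch of decoding one raw packet into 21 (x, y, z) tuples with Python's struct module. It assumes each datagram holds a single OSC message rather than an OSC bundle, which is not confirmed here, and parse_hand_message is a hypothetical name.

```python
import struct

NUM_LANDMARKS = 21
NUM_VALUES = NUM_LANDMARKS * 3  # 63 float32 arguments


def _read_osc_string(packet, offset):
    """Read a null-terminated OSC string and return it with the next offset.

    OSC strings are padded with zero bytes to a multiple of 4 bytes.
    """
    end = packet.index(b"\x00", offset)
    text = packet[offset:end].decode("ascii")
    next_offset = (end + 4) & ~3
    return text, next_offset


def parse_hand_message(packet):
    """Decode one OSC message into (address, list of 21 (x, y, z) tuples).

    Assumes a single OSC message, not a bundle.
    """
    address, offset = _read_osc_string(packet, 0)
    if address not in ("/left", "/right"):
        raise ValueError(f"unexpected OSC address: {address!r}")
    type_tags, offset = _read_osc_string(packet, offset)
    if type_tags != "," + "f" * NUM_VALUES:
        raise ValueError(f"expected {NUM_VALUES} float arguments")
    # OSC float32 arguments are big-endian.
    values = struct.unpack_from(f">{NUM_VALUES}f", packet, offset)
    landmarks = [tuple(values[i:i + 3]) for i in range(0, NUM_VALUES, 3)]
    return address, landmarks
```

Index 0 of the returned list is the WRIST landmark; the remaining indices follow the numbering in the hand landmarks image.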

The x and y values are in the range 0.0 to 1.0 and represent the position of the landmark relative to the camera image width and height respectively. The z value is a depth estimate relative to the wrist and can be negative.
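Since x and y are normalised, mapping a landmark onto an image is a matter of scaling by the image size. The sketch below shows one way to do that; to_pixels is a hypothetical helper, the 640x480 size is only an example, and clamping to the image bounds is a defensive assumption.

```python
def to_pixels(landmark, image_width, image_height):
    """Convert a normalised (x, y, z) landmark to integer pixel coordinates.

    x is scaled by the image width and y by the image height. z is ignored
    because it is a relative depth, not an image coordinate.
    """
    x, y, _z = landmark
    px = int(x * image_width)
    py = int(y * image_height)
    # Clamp so that a value of exactly 1.0 still lands inside the image.
    px = max(0, min(px, image_width - 1))
    py = max(0, min(py, image_height - 1))
    return px, py


# Example: a landmark in the horizontal centre, a quarter of the way down.
print(to_pixels((0.5, 0.25, -0.1), 640, 480))
```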


Info

This project was developed by Tom Mitchell @teamaxe with support from:

Bristol and Bath Creative R+D, Creative Technologies Lab, MiMU Gloves, UWE Bristol, MediaPipe