luxonis/depthai-unity

Roadmap

gespona opened this issue · 54 comments

Why?

Sharing the roadmap with everyone is always good for transparency, and it is the best way to steer development so that each step brings the most value.

What?

All feedback is more than welcome!
We'd also love to know more about you and get some insights:

  • Which main platform are you using? (macOS, Windows, Linux)
  • Could you explain a bit more about your use case? What are you looking to build with the Unity plugin?
  • Which part of the Unity plugin do you find most interesting? (OAK for Creators, OAK for Developers, OAK for CV/AI)
  • Do you have experience with the OAK API/SDK? Which language do you use the most? (C/C++, Python)

Features and Roadmap:

v1.0.0 Initial Release to AssetStore (free):

  • No code approach. Just drag and drop some prefabs. Bullet-proof.
  • Access to OAK device streams:
    • RGB and depth (point cloud viz currently ongoing dev).
      • About point cloud: Add support for external libraries: PCL, ... and VFX integration.
    • Access to IMU if available and retrieve system information.
    • Record and Replay capability.
  • OAK Device Manager and multiple OAK devices support.
  • OAK For Unity Creators: High-Level API - Unlock “Myriad” applications with predefined and ready-to-use pretrained models (with depth support if you run OAK-D family):
    • Face Detector, 3D Head Pose Estimation, Face emotion, Body Pose (MoveNet), Hand Tracking and Gesture recognition (Blaze Hands), Object detector (Tiny Yolo v3/4/5 and MobileNet SSD). DeepLabv3 segmentation. Background removal, Depth (only OAK-D family) - point cloud visualization
    • Multiple camera demo.
    • Example of how to integrate your own pipeline
    • Integration with some paid assets (demos)
  • OAK For CV/AI/Robotics: Unity Virtual OAK Camera

Version 1.0 is just the tip of the iceberg: the foundations to build applications on top of. Beta testing will run along the way to get v1.0 ready for the Asset Store.

Some topics for next versions (big blocks without much detail)

  • More complex demos on top of pretrained models (see list below)
  • Send Unity virtual camera images to the OAK pipeline.

Are we missing any models / use cases / integrations with a specific paid asset? Please let us know.

  • Android support. Integration with AR.
  • End-to-end workflows for synthetic dataset generation, training and deployment.
  • Integration with Unity Robotics Hub (ROS) / SystemGraph / SensorSDK. Publish to ROS.
  • Integration with Unity Simulation and Unity Simulation Pro.
  • Integration with Unity ML-Agents.
  • Define custom pipelines inside Unity (visual tool).
  • Implement partial C# wrapper

How?

We're opening beta testing on the repo. The plan is to deliver the features gradually, following the order above, always covering each feature in full (Unity plugin lib + Unity project/demos) and on all platforms (Windows, macOS and Linux).

The following v1.0 features are ready:

  • Device Manager
  • Demo menu scene
  • Demos:
    • Device streams (RGB, depth), point cloud visualization using VFX
    • Object detector demo
    • Face detector demo
    • Body pose demo
    • Head pose demo
    • Face emotion demo
    • Hand tracking demo

Please tell us about your project !

More resources:
unity forums: https://forum.unity.com/threads/oak-for-unity-spatial-ai-meets-the-power-of-unity.1205764/
youtube unity plugin playlist: https://youtu.be/CSFOZLBV2RA?list=PLFzqMMJPSNSbsHp7QeJpOHrZu_1BAdDms

I use Windows. I'm looking forward to the Unity plugin so I can create games using human pose estimation. I don't have any experience with the OAK API/SDK 😅

My idea is a game where you can move (a lot). There will be objects falling from above and you can use any part of your body to repel them. The objective is to prevent the objects from touching the ground.

Hi @MiguelPereyraC
Thanks a lot for the feedback. Human pose is definitely going to be a great and fun use case with Unity :)
I've received more feedback showing interest in body pose, so it's prioritized.
The plugin comes with precompiled libraries, so you don't need to deal with C/C++ for the pretrained models.

Here is a sneak peek of body pose: https://youtu.be/cCeF87glJh4

Predefined/pretrained models are just prefabs you can drag and drop and build on top of. In this example I was recreating Squid Game (red light/green light).

In the meantime, here are some more resources:
discord: https://discord.gg/4hGT3AFPMZ
unity forums: https://forum.unity.com/threads/oak-for-unity-spatial-ai-meets-the-power-of-unity.1205764/
youtube unity plugin playlist: https://youtu.be/CSFOZLBV2RA?list=PLFzqMMJPSNSbsHp7QeJpOHrZu_1BAdDms

Any C# Examples?

Hi @acidhax! Could you share a bit more detail about your use case and the platform you're using?

Prototyping AR experiences in Unity

About C# examples: the Unity plugin library is based on depthai-core, but it's not a C# wrapper, in case that's what you're looking for. In the first stage, depthai pipelines are developed in C/C++ and exposed to Unity through a C# interface. In the longer term, the idea is to create the pipelines inside Unity.

@acidhax @gespona
I think we would actually be better off creating a C# wrapper (at least a partial one) alongside the Unity efforts, also to handle exceptions across languages better, etc.

If anyone has good recommendations for such a tool/library, let us know.
We currently use pybind11 for the Python wrapper, and the community-driven Java bindings use JavaCPP.

Yes, I agree. Exception management is currently a bit of a pain. A partial C# wrapper sounds like a good approach, and a similar approach would probably work for the mid-to-long-term goal of defining pipelines inside Unity with a visual node tool. I did a PoC in the past that basically exposed part of the core C++ objects for pipeline creation.

As Martin said, if you have any recommendations, feel free to share them here.

I just updated the roadmap, adding a partial C# wrapper for better integration on the Unity side.

I will do a small PoC with these two options:

https://github.com/mono/CppSharp
https://github.com/swig/swig

Please feel free to share other options or any experience with them :)

Is the Squid Game demo on the repo? I really want to try it out! 😁

Not yet, but it will be ;) I will include all the demo scenes you can find today in the weekly video updates (plus prebuilt binaries). Some paid assets won't be available; they will be replaced with free ones, and the documentation will explain how to integrate the paid assets if you want to reproduce a demo scene exactly.

To make the state of the repo clearer, I will update the How section with progress. You will also find it in the README on the "beta" branch. I will merge Windows 10 support first and then the rest of the platforms (Mac/Linux).

Quick update: the "beta" branch has been updated with basic streams and point cloud VFX for Windows. I'm working on macOS/Linux support. Almost all the feedback received so far is for Windows, so depending on further feedback I may move faster with pretrained models on Windows ;)

I'm an augmented reality developer for Windows Unity. I can not wait for the Oak for Unity plugin because of the possibilities these cameras have with range in general. Putting two up on opposite corners of the room would give me so many possibilities for controlling the AR environment with greater accuracy. The problem with most AR headsets is that they only track your hands when they see your hands. So setting up these cameras to interact with Unity would mean that my hands, body, pose, posture, etc would ALWAYS be tracked and available to use as peripherals

Hi @treyDavisMCC! I couldn't agree more ;) OAK + Unity = endless possibilities in terms of HCI / AR / interactive experiences. You can check a few experiments I've been doing in the YT playlist. In the past I also merged two OAK cameras into the same Unity 3D space.

If you're on Windows, please check the 'beta' branch. Right now you can get a small sneak peek of what's coming soon! ;)

@gespona I was actually watching your experiments. You're a much more experienced coder when it comes to machine learning. All I can think of are applications of other people's machine learning. You're out here creating realistic water flow with it. I really need to take a class on this or start working entry level at a company for this, because this is so over my head I do not think I could learn it by myself.

Thank you for your kind words ;) By the way, just to clarify, if you're talking about the hand tracking + depth + fluid demo: credit for the fluid simulation goes to Zibra AI Liquids ;) This is another nice thing about Unity: you can use great assets (free or paid). So in my experiments you'll often see me combining the plugin with other assets to give meaning to the demo (e.g. the airplane controller with Overcloud, the nautilus demo with CrestOcean, the photobooth demo, ...).

Anyway, I will add all the demos to the plugin, with instructions to download extra assets if needed.

I also plan to create a lot of tutorials to help everyone get the most out of the plugin and the potential of OAK devices, well beyond my demos (they're just the tip of the iceberg). And feel free to contact me on Discord if you need specific help.

@gespona I am struggling to set up the beta in Unity. I have tried cloning the repo into a folder to open in Unity, and I've tried cloning it into an existing Unity project. Neither worked for step two in the OAK for Unity section of your md.

Hi! I just tested again and everything looks good, but I opened this issue with the steps for you to follow: #5

I'm still working on the documentation (I'm pretty behind on this). After solving your issue I will update the docs with more detailed info and screenshots of the steps.

Successfully tested Beta Branch with Oak-D Lite on Windows 11 Enterprise v21H2 running on an XPS 13 9300 laptop.

New demo menu scene added. It makes it easier to navigate through the demos.

@gespona We are going to support OAK-D cameras in our rehabilitation system. We plan to use the OAK-D camera for professional (in-facility) use, but we are also thinking about home therapy. I wonder whether your plugin (especially body tracking) is planned to work with normal 2D webcams too?

@brontesprocessing This is a really interesting application of the technology and I would definitely be interested in working with you on it, potentially via the Djinn project I've got rolling at the moment. I've checked out your website and I can confirm from my own experience of the Oak-D sensors that they would fit your use case perfectly. And the sensor is cheap enough, particularly the entry-level sensor, to make home analysis practical and economical. It's something I'd be interested in discussing if you're still looking for help.

Hi @brontesprocessing,
First, thank you for sharing your project. The scope of the plugin is OAK devices. That said, when body tracking is released (very soon, I hope) I can support you in making a basic (2D) example work with a standard webcam. Please join Discord and let's keep in touch there ;)

Hi everyone! I just updated the repo with some improvements to the point cloud. Right now I'm slowing down a bit on the planned roadmap because the second part (synthetic datasets and simulation) got some priority, so I hope I will soon start giving some of that development back to the repo. That said, we are keeping the same priority list here, so body pose is coming soon!

By the way, we decided to promote the beta branch to main, so from now on just check the main branch.

Hi everyone! I just updated the repo with Body Pose.

Any update on when the Unity plugin will have the hand-tracking ability?

After reworking the preview modes a bit, I'm now working on head pose and hand tracking, revisiting the initial implementation you probably saw in the video (hand tracking + liquids). Hand tracking is by far the heaviest model to run on the camera, especially when supporting two hands, so I'm basically focused on performance now. Could you share a bit more about your application / use case?

The immediate use case is to use the camera to allow people to interact with a digital billboard. The board will contain various types of information and we'd like to allow people to use gestures such as grab, swipe, pinch, etc. to move and manipulate the data.

Hi,

First of all, great project! Really. I made a post on the Luxonis forum asking about the status of DepthAI for Unity, and now I came here to see this is still in the works, and that's awesome. I'm just going to copy-paste my post and formulate it as a question here:

I've been using DepthAI as a Python package together with my OAK-D camera for personal projects and I can say I enjoy the experience. I would like to extend those projects and as such am wondering whether the Unity roadmap is still on the radar for Luxonis. The initial forecast seemed very exciting:

v1.0.0 Initial Release to AssetStore (free):

  • No code approach. Just drag and drop some prefabs. Bullet-proof.
  • Access to OAK device streams:
    • RGB and depth (point cloud viz currently ongoing dev).
      • About point cloud: Add support for external libraries: PCL, ... and VFX integration.
    • Access to IMU if available and retrieve system information.
    • Record and Replay capability.
  • OAK Device Manager and multiple OAK devices support.
  • OAK For Unity Creators: High-Level API - Unlock “Myriad” applications with predefined and ready-to-use pretrained models (with depth support if you run OAK-D family):
    • Face Detector, 3D Head Pose Estimation, Face emotion, Body Pose (MoveNet), Hand Tracking and Gesture recognition (Blaze Hands), Object detector (Tiny Yolo v3/4/5 and MobileNet SSD). DeepLabv3 segmentation. Background removal, Depth (only OAK-D family) - point cloud visualization
    • Multiple camera demo.
    • Example how to integrate your own pipeline
    • Integration with some paid assets (demos)
  • OAK For CV/AI/Robotics: Unity Virtual OAK Camera

Do you think we can expect a release to the asset store in 2024, or is this not of a high priority right now?

Hi @AndreiMoraru123, if you clone this repo you'll find a Unity project with almost all of the roadmap implemented. Right now I'm working on the last examples: hand tracking and head pose.
Regarding the AssetStore: yes, I hope to publish after completing the examples (before EoY). That said, the main content is here in the repo, and under releases you also have some examples as .unitypackages.

@gespona

you'll find unity project with almost all the roadmap implemented. Right now working on the last examples: hand tracking and head pose.
Regarding AssetStore yes, I hope to publish after complete the examples (before EoY)

It seems that the last 5 months of commits have only been for updating the Readme file.... It has been over two years since the Kickstarter for the OAK-D Lite. Will the final collection of demos ever be fully implemented? Even that would still be falling extremely short of what the Luxonis team were implying would be available long before now.

#36 Reading this issue thread, the remaining demos have been in the works for over half a year. Can those of us who invested in the product based on what Luxonis were claiming during the Kickstarter please have a solid timeframe for, at the very least, the completion of these basic examples?

Thank you.

Hi all,

Just to mention that the head pose example will be available today.

We also got recurring feedback about how difficult it is to run quick experiments with other models, or to play in depth with device parameters, because the pipeline has to be developed in C++.

We're taking the opportunity with the last example (hand tracking) to put in place a Python bridge using the excellent repository https://github.com/geaxgx/depthai_hand_tracker

With this Python bridge it should be much easier to run experiments like https://github.com/luxonis/depthai-experiments before thinking about implementing them in C++.

We're also reworking the README to make it easier to find all the information on how to build the plugin library for all systems, with a view to starting the publishing process on the AssetStore.

Thank you all for your patience,

Hi @gespona, I was thinking it could be helpful to lay out some contribution templates and good first issues that need work for this project. I for one did not know about this:

we're taking the opportunity with the last example (hand tracking) to put in place a python bridge with excellent repository geaxgx/depthai_hand_tracker

Maybe there are more people who could help, and would like to see this completed, me included.

Hi @AndreiMoraru123
Definitely. Let me work on that too, and let me take the opportunity to repeat that everyone is more than welcome to contribute to this repository. In the past we had great contributions, for example helping to diagnose issues and bring good Linux support :)

By the way, I forgot to mention that the initial version of hand tracking using the Python bridge should be available next week. I will comment here so everyone is aware. Head pose is now available in the repo too.

@gespona Thanks for the updates.

While it would of course be wonderful if those in the community who have the time and are equipped to do so can share, e.g. mattfryed@0a545d5 - "Showing 2,143 changed files with 971,083 additions and 645 deletions." (here's looking at you @mattfryed ;) - some financial support from Luxonis for a second dedicated dev would be a more reliable way to assure us that the remaining necessary contributions will arrive within a reasonable timeframe, and will deliver on the multi-device combined functionality for Unity that was implied by the Luxonis team at the point of sale...

everyone is more than welcome to contribute on this repository

Great, but also, this is a code repo with an MIT license, for a commercial hardware product, that we all dropped cash for. Perhaps Luxonis could budget for an extra team member to work with you on development for Unity, beyond the basic examples..?

With that said, thank you and very much looking forward to the final hand tracker example this week :)

Hi @ricardofeynman

Thank you for your feedback.

Could you elaborate a bit more on your use cases? Especially "remaining necessary contributions will arrive within a reasonable timeframe, and will deliver on the multi-device combined functionality".

I would like to understand the full scope you're looking at

Thanks in advance,

@gespona Great, will do. When the final hand tracker example arrives (this week...?) we'll be back in touch to elaborate, thanks.

Edit: Forgot to ask, is the hand tracking branch ready to test at present?

@ricardofeynman It's currently merged into the development branch (so you can test there). I'm improving the example scene and the documentation a bit. I'll merge everything into main by the end of today.

[Image: handtrack_lowres] Example scene showing hand tracking results and some gesture-based control.

@gespona Good stuff! You fair cobbled the last examples together, but hey, we all at least have some cobble to finally test with, thank you :)

I'm guessing from the other issue currently still open that we aren't at the UVC compliant stage? Or is there a firmware update available somewhere so that we can use the devices as standalone 4k webcams, in OBS for example?

Also, as not to derail the focus of the roadmap update thread, I'll open some new issues where these topics can be more thoroughly discussed. I propose these titles (in no particular order):

  • Full Code Documentation - when will this be available?
  • Performance metrics for real world applications - moving beyond the tech demo.
  • MediaPipe / YOLO / MobileNet SSD / DeepLabv3 / MoveNet / Sentis: What pre-trained models and Unity-side integrations will pair best for performance?
  • Impact of third party asset reliance - e.g. Netly
  • Performance overheads for the currently required Python utilisation?
  • C# wrapper for depthai-core C++ library - Timeframe?
  • More complex demos on top of pretrained models - Timeframe?
  • v1.0.0 Initial Release to AssetStore (free) "No code approach. Just drag and drop some prefabs. Bullet-proof": - Timeframe?

I'll fill these in and post these on the Issues tab when time permits.

I would like to understand the full scope you're looking at

For the moment, could you please provide an example demonstrating the most performant approach for full body, head/face and hands tracking in combination? First on a single device, and then on multiple devices.

The main appeal of these devices was the prospect of reducing overheads by using AI models loaded on to the hardware, so we don't have a major frame rate reduction for real time applications. The current approach conversely appears to introduce a lot of additional overheads. We'd appreciate some assurances of when, concretely, Luxonis will be squaring that circle.

We have multiple units, so for the meantime we can fit models on to separate devices if required, but then of course we would need some guidance on how to ensure multiple devices work together in a unified manner to provide performant in-app results, beyond the bare bones tech demo. Thanks again!

@ricardofeynman all good points, and feel free to open issues to discuss them in detail. I kept the UVC issue open because other topics are being discussed there too, but UVC mode has been supported on OAK for a while. Example here: https://github.com/luxonis/depthai-python/blob/main/examples/ColorCamera/rgb_uvc.py

To answer in general and cover the rest of the points: first, I think it's a good idea to prioritize a full-body example now, for several reasons:

  • it's the most common request from other customers (other issues and requests)
  • it's an interesting experiment to see how far we can push the OAK S2 VPU. We always prioritize execution on the edge, which leaves host resources free
  • the first approach would use the Python bridge, mainly to build on top of the hand tracking application (which in fact has some body pre-focusing; read more details in the repo)
  • regarding Python overhead: the pipelines and AI run on the OAK in the same way as in C++ (also for hand tracking), so there's no big concern here. It's easy to run a comparison using the same pipeline in C++ and Python, though. Will do. Regarding Netly: it just handles TCP socket communication on the Unity C# side. That could easily be implemented from scratch, but it's a library used for years in production-grade projects (Mirror), so I thought it was worth using ;)
  • regarding multiple devices: it would now be pretty easy to integrate from Python, and probably also easy to implement this example in C++: https://github.com/luxonis/depthai-experiments/tree/master/gen2-multiple-devices/spatial-detection-fusion
  • regarding performance metrics in real-world applications: we may need to discuss this in more detail separately, but I can share (without disclosing details) that the Unity plugin is being used in ongoing development of real-world applications, including on very low hardware specs, and performance metrics / reliability (24x7 continuous run) are all good (in case you have any concern)
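
As a rough illustration of the kind of TCP hand-off mentioned in the Netly point above, here is a minimal Python sketch of sending JSON results over a socket. The wire format (a 4-byte big-endian length prefix followed by UTF-8 JSON) and the names `send_message`, `recv_message` and `roundtrip` are assumptions made for this example; they are not taken from the plugin, from Netly, or from depthai_hand_tracker, so check the repo for the actual protocol.

```python
import json
import socket
import struct


def send_message(sock, payload):
    """Send one JSON message, prefixed with its 4-byte big-endian length (assumed framing)."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack(">I", len(body)) + body)


def _recv_exact(sock, size):
    """Read exactly `size` bytes; raise if the peer closes the connection early."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one length-prefixed JSON message and decode it."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))


def roundtrip(payload):
    """Send `payload` through a local socket pair and return what the other end decodes."""
    sender, receiver = socket.socketpair()
    try:
        send_message(sender, payload)
        return recv_message(receiver)
    finally:
        sender.close()
        receiver.close()
```

The length prefix is there because TCP is a byte stream: without some framing, the receiving side cannot tell where one JSON result ends and the next begins. In a real setup the receiver would be the Unity C# side rather than a second Python socket.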

I will update the next priorities here soon and will do my best to give a solid timeframe. Meanwhile, I invite everyone to keep contributing to the roadmap discussion as usual :)

@gespona Awesome :)

I had thought UVC would require new firmware or at least drivers. I'll have to test again with OBS.

  • regarding performance metrics in real-world applications: we may need to discuss this in more detail separately, but I can share (without disclosing details) that the Unity plugin is being used in ongoing development of real-world applications, including on very low hardware specs, and performance metrics / reliability (24x7 continuous run) are all good (in case you have any concern)

As we fast approach the 'Terminators exist' phase of the 21st century, I remain optimistic the NDA you allude to does not pertain to mounting these devices on Killer AIBO's, flying or otherwise... :)

Thank you for the updates. Looking forward to the full body example with great anticipation!

@ricardofeynman
According to https://github.com/luxonis/depthai-core/releases you need depthai-core 2.22+ (that's the current version in the repo).

jacoos commented

Hello, the body example in Unity does not support multiple people, and the received JSON data is also single-person data. How can I detect multiple people? I want to create a multi-person sensory trigger project, but I don't know how to get started. Please advise me :-)

Hi @jacoos, yes, the body example inside Unity uses the MoveNet single-pose model, so only a single person is supported.

Regarding advice for multi-person: first I'd like to understand the project better. Do you need multiple body poses or just people detection? Could you explain your use case in more detail?

For running multi-person/multi-pose body tracking on RVC2, please check our forum for support:
https://discuss.luxonis.com/blog/1308-depthai-sdk-human-pose-estimation
https://discuss.luxonis.com/d/3349-multi-pose-body-tracking-models
https://discuss.luxonis.com/?q=multipose
https://discuss.luxonis.com/d/2240-multi-person-hand-tracker/2

Note that model support/conversion is independent of whether the model is used inside Unity or elsewhere.

I guess another option is to put a person detector first, crop each person and run MoveNet single pose on the crops, but this is more of a workaround and is likely limited in terms of performance.
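
The host-side part of that workaround (turning person detections into crop rectangles for a single-pose model) could be sketched roughly as follows. This is only an illustration: the detection dictionary layout, the `person_crops` name, the `person_label` default and the padding value are all assumptions for this example, not the plugin's or DepthAI's actual data format, and the pose inference on each crop is left out.

```python
def person_crops(detections, frame_w, frame_h, pad=0.1, person_label=0, min_conf=0.5):
    """Return pixel crop rectangles (x0, y0, x1, y1), one per confident person detection.

    `detections` is assumed to be a list of dicts with normalized [0, 1] box
    coordinates: {"label", "confidence", "xmin", "ymin", "xmax", "ymax"}.
    `person_label` depends on the detector's label map (hypothetical default).
    """
    crops = []
    for det in detections:
        if det["label"] != person_label or det["confidence"] < min_conf:
            continue
        width = det["xmax"] - det["xmin"]
        height = det["ymax"] - det["ymin"]
        # Grow the box by `pad` on each side so limbs near the edge are not cut off,
        # then clamp to the frame.
        x0 = max(0.0, det["xmin"] - pad * width)
        y0 = max(0.0, det["ymin"] - pad * height)
        x1 = min(1.0, det["xmax"] + pad * width)
        y1 = min(1.0, det["ymax"] + pad * height)
        crops.append((int(x0 * frame_w), int(y0 * frame_h),
                      int(x1 * frame_w), int(y1 * frame_h)))
    return crops
```

Each returned rectangle would then be resized to the pose model's input size and run through the single-pose model once per person, which is why the cost grows with the number of people in view.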

jacoos commented

Thank you very much for your reply. My project needs to identify the positions of multiple people, and one of them must be able to send commands by waving a hand gesture. I know that face recognition in Unity supports multiple people, but in Unity the bones cannot be bound to a fixed person. In this case, I want to determine the nearest person and bind the bones to them. How can I achieve this? I hope to receive a prompt.

@jacoos Thanks for sharing more details about your project.

If you only need body pose for the nearest person, one option could be to run a person detector (take a look at the object detector example), so in this case you can run the TinyYolo detector for multi-person detection with depth. Then just crop the person nearest to the camera and pass the crop to the MoveNet body pose estimation.
You can see a similar 2-stage approach in the emotion example (first detect faces, crop them, and then pass the crops to the emotion model).
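
The selection step of that 2-stage approach can be sketched roughly as follows. This is only an illustration in plain Python, not code from the plugin: the `Detection` record, the `nearest_person` and `crop_box` helpers, and the `margin` parameter are all hypothetical, and the sketch assumes the detector reports normalized boxes together with a depth value in millimetres.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Detection:
    """Hypothetical detection record: normalized box plus depth in millimetres."""
    label: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    z_mm: float  # distance from the camera; 0 is treated as "no valid depth"


def nearest_person(detections: Iterable[Detection]) -> Optional[Detection]:
    """Return the person detection closest to the camera, or None if there is none."""
    people = [d for d in detections if d.label == "person" and d.z_mm > 0]
    return min(people, key=lambda d: d.z_mm, default=None)


def crop_box(det: Detection, width: int, height: int,
             margin: float = 0.1) -> Tuple[int, int, int, int]:
    """Convert a normalized box to a padded pixel rectangle clamped to the frame."""
    pad_x = (det.xmax - det.xmin) * margin
    pad_y = (det.ymax - det.ymin) * margin
    x0 = max(0, int((det.xmin - pad_x) * width))
    y0 = max(0, int((det.ymin - pad_y) * height))
    x1 = min(width, int((det.xmax + pad_x) * width))
    y1 = min(height, int((det.ymax + pad_y) * height))
    return x0, y0, x1, y1
```

The resulting rectangle would then be used to crop the RGB frame before resizing it to the pose model's input size. The extra margin is a design choice so that limbs slightly outside the detector box are not cut off.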

Does that make sense for you? Thoughts?

[image: error]
Hello, I'm here again. The bone recognition scene runs normally after packaging, but after running for a period of time it produces an error message, and the program cannot capture where it occurs. Have you ever encountered this situation? I'm using Unity 2021.3 on Windows 10 22H2.

@jacoos

I haven't encountered this situation, but I will try to reproduce it on my end. Could you provide a bit more info about your setup?

  • Which camera are you using?
  • Are you using a USB3 cable / port?
  • How long does it take to reproduce the crash?
  • I understand this is happening with the binary, right?
  • How much memory/CPU does the system have?

Maybe it would be good to move this conversation to a dedicated issue instead of having it here in the roadmap. Would you mind opening an issue with this information?

  • Camera type is OAK-D
  • Yes, I use USB 3.0
  • The time is not fixed: 10 minutes or 1 hour
  • I didn't understand what you meant
  • My PC has 8 GB of memory; this program only uses 300 MB
    I also know one way to cause the crash: just unplug the USB cable. I checked the USB cable and there is no poor contact; it is inserted quite tightly, so I don't think this is the main problem.

@jacoos

Do you mean that after 10 minutes or 1 hour you unplug the USB cable? Unplugging the USB cable will certainly produce a crash, as the camera is communicating with the host. I understand the crash also happens when just running the program, right?

  • I didn't understand what you meant

I mean the crash happens with the built binary and does not reproduce inside the editor (?)

@jacoos btw, could you open an issue and attach the editor / player log?

Hello,
I just wanted to express my team's interest in this asset.

Main platform are you using? (MacOS, Win, Linux)

We primarily operate on Windows.

Could you explain bit more about your use case? What are you looking to build with the Unity plugin?

We are biomechanics researchers using full-body tracking and pose estimation in different apps/games with built-in biomechanics analysis. Historically, this type of work has been restricted to pre-recorded motions, but pose estimation software allows us to analyze movements in (near) real-time. Access to the camera's RGB and depth data as well as the pose estimation results (potentially converted to a consistent/universal format, regardless of the pose tool) and root (pelvis) position in world space, would be our primary needs. Ease of switching between different pose estimation software (MediaPipe, MMPose, etc.) and custom models within those software would also be very desirable for us. The ability to detect multiple people (incorporating YOLOPose? MMPose does this by default) would also be a major plus.
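
As a rough illustration of what a tool-independent root (pelvis) lookup could look like, here is a minimal sketch. It is not part of OAK For Unity: the `HIP_INDICES` table and the `pelvis_position` helper are hypothetical names, the pelvis is approximated as the midpoint of the two hip keypoints (an assumption, not an anatomical definition), and the keypoints are assumed to be already expressed in a common 3D space. The hip indices follow the native landmark order of MoveNet (COCO-17) and BlazePose (33 landmarks).

```python
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]

# (left_hip, right_hip) indices in each model's native landmark order.
HIP_INDICES: Dict[str, Tuple[int, int]] = {
    "movenet": (11, 12),    # COCO-17 keypoint order
    "blazepose": (23, 24),  # 33-landmark order
}


def pelvis_position(model: str, keypoints: Sequence[Point]) -> Point:
    """Approximate the root (pelvis) as the midpoint between the two hips."""
    left_idx, right_idx = HIP_INDICES[model]
    left, right = keypoints[left_idx], keypoints[right_idx]
    # Average each coordinate of the two hip keypoints.
    return tuple((a + b) / 2.0 for a, b in zip(left, right))
```

Keeping the per-model indices in one table means that adding another pose tool only requires a new entry, not a change to the code that consumes the root position.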

What part of the Unity plugin you find most interesting? (OAK for Creators, OAK for Developers, OAK for CV/AI)

Likely OAK for Developers since, as mentioned above, we would like to be able to use custom pretrained models and multiple different pose estimation software.

Do you have experience with OAK API/SDK? What platform are you using the most? (C/C++ , Python)

Our experience is very minimal. Our only use of the OAK API/SDK has been through an implementation of geaxgx's depthai_blazepose repo (https://github.com/geaxgx/depthai_blazepose) with the current OAKForUnity code to stream a video feed and corresponding BlazePose results to a Unity scene in (near) real-time. We have not yet written any of our own code that interfaces with the API/SDK. This repo mostly handles our needs, but removing the need for the Unity bridge and adding any of the features mentioned above that are missing would be a major improvement.

Hi @garrett-tuer! Thank you very much for your feedback and interest in OAK For Unity. Evaluating different pose estimation models, including multi-pose models, makes sense. As you probably read in the previous comments above, we're looking into some of those topics, taking into account RVC2 capabilities.