NVIDIAGameWorks/PhysX

Multithreading crash in PhysX 5.1 in CPU mode when using eNOTIFY_TOUCH_PERSISTS pair flag

StefanRehling opened this issue · 3 comments

Hello all,
I have an issue with multithreaded CPU collision detection in PhysX 5.1. I want to track contact points between complex geometries (modelled as kinematic bodies) in a simulation application for press tools. To that end, I set up two scene descriptions, one for GPU mode and one for CPU mode. If no capable GPU is installed in the user's system, the CPU mode shall be the fallback. Most settings of both are taken from the SDK examples. The GPU mode works without issues.

GPU description:

PxSceneDesc descGPU(m_pPhysX->getTolerancesScale());
descGPU.gravity = PxVec3(0.0f, 0.0f, 0.0f);
descGPU.kineKineFilteringMode = PxPairFilteringMode::eKEEP;
descGPU.broadPhaseType = PxBroadPhaseType::eGPU;
descGPU.flags |= PxSceneFlag::eENABLE_GPU_DYNAMICS;
descGPU.gpuMaxNumPartitions = 8;
descGPU.simulationEventCallback = m_pEventCallback;
descGPU.cudaContextManager = g_pCudaContextManager;
descGPU.flags |=  PxSceneFlag::eENABLE_PCM | PxSceneFlag::eREQUIRE_RW_LOCK;
descGPU.dynamicTreeRebuildRateHint = 500;
descGPU.solverType = PxSolverType::eTGS;
descGPU.cpuDispatcher = m_pDispatcher;
descGPU.filterShader = SimFilterShader;
descGPU.filterCallback = m_pEventCallback;

CPU description:

PxSceneDesc descCPU(m_pPhysX->getTolerancesScale());
descCPU.gravity = PxVec3(0.0f, 0.0f, 0.0f);
descCPU.kineKineFilteringMode = PxPairFilteringMode::eKEEP;
descCPU.broadPhaseType = PxBroadPhaseType::ePABP;
descCPU.simulationEventCallback = m_pEventCallback;
descCPU.flags |=  PxSceneFlag::eENABLE_PCM | PxSceneFlag::eREQUIRE_RW_LOCK;
descCPU.dynamicTreeRebuildRateHint = 500;
descCPU.solverType = PxSolverType::eTGS;
descCPU.cpuDispatcher = m_pDispatcher;
descCPU.filterShader = SimFilterShader;
descCPU.filterCallback = m_pEventCallback;

I set up the dispatcher with the number of threads to be used and left the default parameters as they were:

m_pDispatcher = PxDefaultCpuDispatcherCreate(2); //use two threads
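
Since the crash reported in this thread only shows up with more than one worker thread, it can help to route the thread count through a single switch while investigating. The following is a minimal sketch under that assumption; `chooseWorkerThreadCount` and its `forceSingleThread` parameter are hypothetical names and not part of PhysX.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Hypothetical helper (not part of PhysX): picks a worker-thread count for
// the CPU dispatcher. `requested` is what the application would like to use;
// `forceSingleThread` is a diagnostic switch for isolating threading bugs.
inline unsigned chooseWorkerThreadCount(unsigned requested, bool forceSingleThread) {
    assert(requested >= 1 && "at least one worker thread is expected");
    if (forceSingleThread) {
        return 1u;
    }
    // hardware_concurrency() may return 0 when the value is unknown.
    const unsigned hw = std::thread::hardware_concurrency();
    const unsigned limit = (hw == 0u) ? requested : hw;
    return std::min(requested, limit);
}
```

The result could then be passed on, for example as PxDefaultCpuDispatcherCreate(chooseWorkerThreadCount(2, false)).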

In my filter shader as well as in my event callback, I use the following flag combination for pair filtering:

pairFlags = PxPairFlag::eDETECT_DISCRETE_CONTACT
      | PxPairFlag::eNOTIFY_TOUCH_FOUND
      | PxPairFlag::eNOTIFY_TOUCH_LOST
      | PxPairFlag::eNOTIFY_TOUCH_PERSISTS
      | PxPairFlag::eNOTIFY_CONTACT_POINTS;

The following combination causes PhysX to crash:

  • eNOTIFY_TOUCH_PERSISTS set in pairFlags
  • thread count in PxDefaultCpuDispatcherCreate > 1 (no error with single threading)
  • CPU collision detection
  • High number of collision bodies

I use convex decomposition (VHACD) for approximate modelling of the complex part geometries I have to deal with. Thus, there are hundreds of PxShape objects which (potentially) collide with each other.

In the end, the error occurs inside the PxArray class.

[image: screenshot of the error location in PxArray]

The call stack is as follows:

[image: screenshot of the call stack]

My assumption is that two threads concurrently try to modify the array of persistent contact event pairs: one thread triggers a re-allocation of the memory, and the other thread then works on a memory block that has just become invalid.
I have had such issues several times in my own code. Or maybe I am wrong and there is a setting that I have configured incorrectly?
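
To make the suspected failure mode concrete: if several threads append to one shared growable array, a re-allocation by one thread can invalidate memory another thread is still using. One common way to avoid this is to give each worker its own buffer and merge afterwards. The sketch below illustrates only that general pattern with the standard library; `ContactPair` and `collectPairs` are hypothetical stand-ins, and nothing here reflects how PhysX handles this internally.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Stand-in for a persistent contact pair; not a PhysX type.
struct ContactPair {
    int shapeA;
    int shapeB;
};

// Each worker appends only to its own buffer, so a reallocation in one
// buffer can never invalidate memory another thread is using. The outer
// vector is sized before the workers start and is not resized while they
// run. The buffers are merged on the calling thread after all joins.
inline std::vector<ContactPair> collectPairs(std::size_t workerCount, int pairsPerWorker) {
    assert(workerCount > 0);
    std::vector<std::vector<ContactPair>> perThread(workerCount);
    std::vector<std::thread> workers;
    workers.reserve(workerCount);
    for (std::size_t w = 0; w < workerCount; ++w) {
        workers.emplace_back([&perThread, w, pairsPerWorker] {
            for (int i = 0; i < pairsPerWorker; ++i) {
                perThread[w].push_back(ContactPair{static_cast<int>(w), i});
            }
        });
    }
    for (std::thread& t : workers) {
        t.join();
    }
    std::vector<ContactPair> merged;
    for (const auto& buffer : perThread) {
        merged.insert(merged.end(), buffer.begin(), buffer.end());
    }
    return merged;
}
```

Protecting a single shared array with a mutex would be the alternative; per-thread buffers trade a merge step for lock-free appends.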

Could you please give me some advice?

Many thanks
Stefan

I cannot reproduce this so far. I tried a scene with a pile of objects, each of them using a convex decomposition. I enabled the same contact notification flags, used multiple threads with (as far as I can tell) the same setup as above, and it doesn't crash. I can look at the code in search of something wrong but I didn't see anything yet.
[image attached to the reply]

Hello Pierre, thanks for the investigation. Perhaps it is also relevant how I run the simulation.
I step the simulation once per timestep of my simulation system's engine, and I use PhysX only to perform the collision detection. I use setKinematicTarget on the PxActors which hold the PxShape objects of the convex decomposition (see code below):


//loop over all object instances and set actual positions on the collision PxActors
m_pScene->lockWrite(__FILE__, __LINE__);
for (auto& it : m_setCollsionObjectInstances)
{
    PxTransform tr = ...; //here the poses of the kinematic actors are queried from the simulation data model
    PxRigidDynamic* pActor = it.pActor;
    pActor->setGlobalPose(tr); 
    // setGlobalPose is needed here to actually place the actor at the desired position AT THE BEGINNING of the time step.
    // setKinematicTarget is the advised function, but with it the desired position is only reached at the end of the simulation step,
    // so the collision detection is done with the initial (one timestep behind) position.
    /* 
    Quote NVIDIA (https://github.com/NVIDIAGameWorks/PhysX/issues/255):
    This is the expected behaviour. It ensures consistent behaviour with kinematic and dynamic rigid bodies. Set kinematic target instructs the simulation to move the kinematic 
    to that pose in the next simulated frame. The actor doesn't immediately move to that location but, instead a velocity is calculated that will 
    result in the actor reaching the target at the end of the next time-step. Collision detection (including trigger processing) is performed using the initial poses of the actors, 
    the constraint solver is invoked and the rigid bodies are integrated by the velocities produced by the constraint solver. 
    You could get trigger events immediately by using setGlobalPose rather than setKinematicTarget but that would cause bad simulation behaviour 
    when your kinematic interacted with dynamic rigid bodies (e.g. friction would not work properly because the kinematic would have teleported 
    into a pose rather than moved continuously).        
    */
    
    // to make collision detection work, the call to setKinematicTarget(tr) is still necessary
    pActor->setKinematicTarget(tr);
}


m_pScene->simulate(info.dSimTimeInterval);   

m_pScene->fetchResults(true);
m_pScene->unlockWrite();
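
The manual lockWrite/unlockWrite pairing above leaves the scene locked if anything between the two calls returns early or throws. A scope guard avoids that. The following is a minimal sketch; `ScopedWriteLock` and `MockScene` are hypothetical names introduced here, and the template only assumes a type that offers `lockWrite(const char*, unsigned)` and `unlockWrite()`. The SDK may already provide an equivalent helper, so this is meant to show the idea rather than replace it.

```cpp
#include <cassert>

// Hypothetical RAII wrapper; works with any type that offers
// lockWrite(const char*, unsigned) and unlockWrite().
template <class Scene>
class ScopedWriteLock {
public:
    ScopedWriteLock(Scene& scene, const char* file, unsigned line) : m_scene(scene) {
        m_scene.lockWrite(file, line);
    }
    // The lock is released when the guard goes out of scope.
    ~ScopedWriteLock() { m_scene.unlockWrite(); }
    ScopedWriteLock(const ScopedWriteLock&) = delete;
    ScopedWriteLock& operator=(const ScopedWriteLock&) = delete;

private:
    Scene& m_scene;
};

// Minimal mock used only to demonstrate the guard; not a PhysX type.
struct MockScene {
    int lockDepth = 0;
    void lockWrite(const char*, unsigned) { ++lockDepth; }
    void unlockWrite() {
        assert(lockDepth > 0);
        --lockDepth;
    }
};
```

With such a guard, the pose updates, simulate and fetchResults would sit inside one block scope, and the unlock would happen automatically at its closing brace.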


std::chrono::steady_clock::time_point physx_profiler_end = std::chrono::steady_clock::now();
long lMicroSecs = std::chrono::duration_cast<std::chrono::microseconds>(physx_profiler_end - physx_profiler_start).count();
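
The timing behaviour described in the quoted NVIDIA comment can be illustrated with a one-dimensional toy model: collision detection sees the pose from the start of the step, a velocity is derived from the target, and the body arrives at the target at the end of the step. This is only a sketch of that description; `KinematicBody1D`, `StepResult` and `stepKinematic` are hypothetical names and do not model PhysX internals.

```cpp
#include <cassert>

// One-dimensional stand-in for a kinematic actor; not a PhysX type.
struct KinematicBody1D {
    double pose = 0.0;
    double target = 0.0;
};

struct StepResult {
    double collisionPose; // pose seen by collision detection during the step
    double velocity;      // velocity derived from the kinematic target
};

// Advances the body by one step of length dt, following the quoted
// description: collide at the initial pose, then move to the target.
inline StepResult stepKinematic(KinematicBody1D& body, double dt) {
    assert(dt > 0.0);
    StepResult result;
    result.collisionPose = body.pose;
    result.velocity = (body.target - body.pose) / dt;
    body.pose += result.velocity * dt;
    return result;
}
```

In this model, setting the pose to the target before the step (the analogue of calling setGlobalPose and then setKinematicTarget with the same transform) yields zero velocity and a collision pose equal to the target, which matches the workaround used in the loop above.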


What information / data would you additionally need?

Sorry, accidentally clicked close...