TGS and gaps between PxD6Joints

Question

alanjfs opened this issue 2 years ago · 12 comments

Hello PhysX team!

I've had a problem for many moons whereby any gap between two rigid bodies connected by a PxD6Joint (or any joint, it seems) would cause severe jitter in my simulations.

gaps_pvd.mp4

The problem gets worse with thinner radii and greater differences in mass, except when using PGS, where there seems to be no limit to how thin or short things can get.

PGS

pgs.mp4

I've managed to reproduce the problem using only these capsules, but the problem is much worse in more complex setups, where it is much less obvious that gaps are to blame. Here are a few PVDs to illustrate things.

PVD	Description
gaps_1_sphere.zip	Largest possible gap, a sphere
gaps_2_capsule.zip	Slightly less of a gap, with a short capsule
gaps_3_mass1.zip	Less gap, with a constant mass of 1 for each rigid body
gaps_4_pgs.zip	PGS, works in any scenario.

I really enjoy TGS for its strong drives and would much prefer to keep using it, but these jitters make the choice less easy. :( It's taken a while to get it to happen on such a small scale, and next time it happens in a more complex scenario I'll try and upload the PVD for it as well, in case there are two separate issues hiding underneath.

Any idea of what to do about this?

Answer 1 · 2022-02-25T14:24:02.000Z

Here's another example from the host application. The Pose Stiffness value is the PxD6Joint angular drive stiffness.

gaps.mp4

Answer 2 · 2022-02-25T14:31:51.000Z

I'll check out the repro and report back if I find a cause

Kier

Answer 3 · 2022-02-25T15:49:11.000Z

I am fairly sure this is a bug. I'm investigating now, but if you flip the actors in the joint connecting the static capsule to the sphere, it works correctly. I'll hopefully track down the error soon

Answer 4 · 2022-02-25T16:07:21.000Z

I can confirm that it does appear fixed when these two are swapped. What a relief. 😅 If you do find a way to avoid having to do that, that would be amazing!

Answer 5 · 2022-02-25T16:30:54.000Z

Working on trying to find out what the difference is. I'm hopeful we'll have a solution soon

Answer 6 · 2022-02-28T13:12:54.000Z

I was able to figure this out. This occurs if joint pre-processing is disabled. The reason is that, as the anchor point is offset from the rigid body significantly, the effective mass of the constraint becomes small and therefore a small force at the joint anchor causes a large change in angular velocity of the attached body. There are 3 Jacobians operating on the constraint. The Jacobians remain mostly independent when the joint frame follows the dynamic body (dynamic body is the first body in the pair), but there is significant overlap between the angular components when the frame remains in the space of the static body. The feedback loop between the constraints due to the large offset means that it takes a large number of solver iterations to converge on a decent result.

PhysX uses joint pre-processing in the PGS solver to overcome this, and it resolves the issue by making the 3 Jacobians independent. If you turn off joint pre-processing, you should see PGS begin to fail as well. With joint pre-processing, PGS gets the right result with 1 iteration.
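The convergence argument above can be illustrated with a toy model. This is not PhysX code: it is a minimal sketch of Gauss-Seidel sweeps over two constraint rows, where the off-diagonal term `a12` stands in for the overlap between the rows' Jacobians. The function name `gaussSeidelSweeps` and the 2x2 setup are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cmath>

// Runs Gauss-Seidel sweeps on the symmetric 2x2 system
//   [a11 a12] [x1]   [b1]
//   [a12 a22] [x2] = [b2]
// starting from x = 0, and returns the number of sweeps needed before the
// residual norm drops below tol (or maxSweeps if it never does).
int gaussSeidelSweeps(double a11, double a12, double a22,
                      double b1, double b2,
                      double tol, int maxSweeps)
{
    assert(a11 > 0.0 && a22 > 0.0);
    double x1 = 0.0;
    double x2 = 0.0;
    for (int sweep = 1; sweep <= maxSweeps; ++sweep)
    {
        // Solve each row in turn, using the latest value of the other unknown.
        x1 = (b1 - a12 * x2) / a11;
        x2 = (b2 - a12 * x1) / a22;

        const double r1 = b1 - (a11 * x1 + a12 * x2);
        const double r2 = b2 - (a12 * x1 + a22 * x2);
        if (std::sqrt(r1 * r1 + r2 * r2) < tol)
            return sweep;
    }
    return maxSweeps;
}
```

With `a12 = 0` (independent rows, which is what the pre-processing aims for) the loop exits after a single sweep. With unit diagonals and `a12 = 0.99` (strongly overlapping rows) it needs several hundred sweeps to reach a tolerance of `1e-6`.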

We do have joint pre-processing with the TGS solver, but we had disabled a portion of it that pre-processed these linear constraints. At the time, I was not convinced that this would work as intended because some of the properties were calculated on-the-fly and some used these pre-processed values. As a result, I disabled this step. I've been through with a fine-tooth comb verifying that it does produce the correct results and I am extremely relieved to say that the original code that was disabled with the TGS solver does exactly what it was intended to do.

To solve the issue, please go into DyConstraintSetup.cpp

in preprocessRows, find this line:

if(groupMajorId == 4 || (groupMajorId == 8 && preprocessLinear))

and remove the reference to preprocessLinear. This will enable linear constraint pre-processing when using the TGS solver. Please then check your original fail case and you should see that it resolves the issues.

Please note that soft constraints like springs do not get pre-processed, so they would still have issues with large joint offsets and would require either an increase in solver position iterations or an increase in inertia tensor to make them converge.

Answer 7 · 2022-02-28T13:56:43.000Z

I can confirm that this particular test case works! I'll keep an eye out in more complex situations, but this is already very promising. 😄

A few things to just strengthen my understanding:

1. Why wasn't this pre-processing necessary when flipping the actors?
2. Is there an alternative configuration that does not require pre-processing?

1. Flipped Actors

To the first question: swapping the places of parentRigid and childRigid also solved the problem. How come this doesn't need pre-processing?

PxD6Joint* pxjoint = PxD6JointCreate(*p.physics, parentRigid, parentFrame, childRigid, childFrame);

Initially, I thought it had to do with the base being eKINEMATIC but the same problem appears between the two tip rigids that are both dynamic.

2. Large Joint Offset Alternative

You mention the large joint offsets being more of a challenge for the solver; in a situation like the above, is there any alternative to having these large joint offsets?

For example, in this particular setup, one of the frames is identity with the other doing all of the heavy lifting. Would it help leaving the shape in the center of the rigid, and have both frames adjust to compensate?

Answer 8 · 2022-02-28T14:22:20.000Z

Here's what I had in mind for an alternative joint offset configuration, does this matter at all? 🤔 That is, having the PxTransform of the PxRigidDynamic (i.e. pose) sit closer to the center of mass, rather than at the joint position.

Answer 9 · 2022-02-28T14:28:05.000Z

The important component is the cross product of the joint offset (r) and the linear constraint direction (n): the angular component of the constraint is calculated from r × n. In the case of flipping the bodies, r and n are collinear (both along the x-axis of the dynamic body), so the cross product is zero. In this case, there's one Jacobian doing most of the work and the other 2 don't do anything. This Jacobian is fully independent of the other two. However, when we follow the frame of the static body, 2 Jacobians (equating to the X and Y world space axes) need to do work, and they are not independent because applying a force along X causes a torque that makes the body rotate around the Z-axis (all depending on the transform of the dynamic body). Similarly, applying a force along the Y axis causes a torque that makes the body rotate around Z. As the anchor is significantly offset from the COM of the bodies, small rotations of the dynamic body yield large changes in anchor position.

There are alternatives that are easier on the solver, but without pre-processing you are unlikely to end up with all axes independent in every case unless you attach constraints at the COM of the body. If the joint anchors are on the surface of the bodies (and not offset by a large distance), the problem is a lot easier for the solver. The problem gets harder as the inertia of the body becomes much smaller.
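The role of the cross product can be checked numerically. The following is a minimal sketch, not PhysX code: `Vec3`, `cross`, `dot` and `rowCoupling` are hypothetical helpers, only one dynamic body is considered, and its inertia is assumed to be isotropic (a single scalar) to keep the example short.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Coupling between two linear constraint rows with directions n1 and n2,
// both acting at an anchor offset r from the centre of mass:
//   (n1 . n2) / mass + (r x n1) . (r x n2) / inertia
// A result of zero means the two rows do not influence each other.
double rowCoupling(const Vec3& r, const Vec3& n1, const Vec3& n2,
                   double mass, double inertia)
{
    assert(mass > 0.0 && inertia > 0.0);
    const Vec3 angular1 = cross(r, n1); // angular part of row 1
    const Vec3 angular2 = cross(r, n2); // angular part of row 2
    return dot(n1, n2) / mass + dot(angular1, angular2) / inertia;
}
```

When r points along one of the constraint directions, for example r = (1, 0, 0) with rows along X and Y, the angular part of the X row vanishes and the coupling is zero. When r sits at 45 degrees between X and Y, both rows produce a torque about Z and the coupling becomes non-zero, growing with the square of the offset and shrinking with the inertia.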

Changing the location of the transform wouldn't make a difference. Changing the position of the COM of your body or moving the joint anchors closer to the COM of the bodies would make a difference. Artificially increasing the inertia tensor such that it coincided with a shape large enough to touch/include the joint anchor location would also make things easier on the solver.
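One way to pick such an inflated inertia is sketched below. The choice of a solid sphere whose radius reaches the joint anchor is an assumption of this sketch, not something prescribed above, and `solidSphereInertia` and `inflatedInertia` are hypothetical helper names. In PhysX the resulting value would typically be applied per axis through `PxRigidBody::setMassSpaceInertiaTensor`.

```cpp
#include <algorithm>
#include <cassert>

// Inertia of a solid sphere about any axis through its centre:
//   I = 2/5 * mass * radius^2
double solidSphereInertia(double mass, double radius)
{
    assert(mass > 0.0 && radius >= 0.0);
    return 0.4 * mass * radius * radius;
}

// Returns an inertia that is never smaller than the original one, raised if
// necessary to that of a solid sphere large enough to reach the joint anchor.
// anchorDistance is the distance from the centre of mass to the anchor.
double inflatedInertia(double mass, double originalInertia, double anchorDistance)
{
    return std::max(originalInertia, solidSphereInertia(mass, anchorDistance));
}
```

A thin body of mass 1 with an original inertia of 0.001 and an anchor 0.5 units from its COM would be raised to 0.1, while a body whose inertia already exceeds the sphere's is left unchanged.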

An alternative would be to use articulations, which would handle this case just fine, although they may require relatively small time-steps if the angular velocities become large.

Answer 10 · 2022-02-28T14:43:51.000Z

Awesome. I'm happy with this. Thanks @kstorey-nvidia ❤️

Answer 11 · 2022-03-14T19:25:06.000Z

An update on this, it has worked well!

Overall much more reliable and absolutely zero issues with gaps between any rigids.

Until today.

gaps_5_thereturn.pxd2

gaps5_thereckoning.mp4

As you can tell:

0 gaps, all is well
1 gap, things seem stable
2 gaps, trouble is brewing
3 gaps, chaos ensues

This is with TGS and PxD6Joint with locked linear limits. The problem goes away with PGS, except when distances grow even larger. With larger distances and PGS, I'm able to increase the number of position iterations to 256 and reduce the timestep to 5 ms or less to work around it.
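For reference, the timestep part of that workaround amounts to splitting each frame into substeps. This is a minimal sketch with hypothetical helper names (`substepCount`, `substepDt`); in a PhysX application each substep would be one `PxScene::simulate` call followed by `PxScene::fetchResults`.

```cpp
#include <cassert>
#include <cmath>

// Number of equal substeps needed so that no substep is longer than maxSubstep.
int substepCount(double frameDt, double maxSubstep)
{
    assert(frameDt > 0.0 && maxSubstep > 0.0);
    return static_cast<int>(std::ceil(frameDt / maxSubstep));
}

// Length of each substep when the frame is divided evenly.
double substepDt(double frameDt, double maxSubstep)
{
    return frameDt / substepCount(frameDt, maxSubstep);
}
```

A 60 Hz frame with a 5 ms cap is split into 4 substeps of roughly 4.17 ms each.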

I suspect (and would accept) that this isn't the same problem as before, or even a "bug" in the system, but rather that it is simply a difficult problem to solve. Is there anything I'm overlooking, is there a way of getting this type of setup to run smoothly?

Answer 12 · 2022-03-15T10:14:40.000Z

Hi Alan

I wasn't able to open up your PVD file for some reason, so I can't see the behaviour. However, I do have something you could try that I've got working locally that resolves your original issue without needing to switch around bodies in the pair.

Please go to DyConstraintSetup.cpp and find the preprocessRows function.

Find the line that looks like this:

if(groupMajorId == 4 || (groupMajorId == 8 && preprocessLinear))

and remove the check for preprocessLinear (or have the calling code pass true for it in the TGS solver).

The TGS solver disables pre-processing linear constraints, whereas the PGS solver doesn't. Part of the data used by TGS for linear constraints is computed on-the-fly and I was not at all confident that orthogonalization would work correctly so I disabled it for safety's sake. I went back to check this due to the issue you reported earlier and confirmed that the original code does actually work and the derived values computed match expectations.

You are right that these problems are very difficult to solve because the constraint anchors are so far offset from the COM of one of the bodies. This means that any forces applied at the anchor point (where work is done by the constraint) introduce a large torque acting on that body. This kind of offset also results in the effective mass ratio at the constraint becoming large because the effective mass of the body that is offset from the joint decreases.

At the moment, the best bets are as follows:
(1) Try the pre-processing change I suggested to see if it improves things
(2) Consider increasing the inertia on the bodies if you separate them like this. This would control the mass ratio better.
(3) Consider using an articulation. Articulations should handle this kind of large effective mass ratio more robustly.
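The effective mass ratio described above, and the effect of option (2), can be estimated with a simple scalar model. This is a sketch under stated assumptions rather than what the solver computes internally: the anchor offset is taken perpendicular to the constraint direction, the inertia is a single scalar about the relevant axis, and `effectiveMass` and `effectiveMassRatio` are hypothetical helper names.

```cpp
#include <cassert>

// Effective mass seen by a linear constraint row whose anchor sits a
// perpendicular distance `offset` from the body's centre of mass:
//   1 / m_eff = 1 / mass + offset^2 / inertia
double effectiveMass(double mass, double inertia, double offset)
{
    assert(mass > 0.0 && inertia > 0.0);
    return 1.0 / (1.0 / mass + offset * offset / inertia);
}

// Ratio between the larger and the smaller of two effective masses.
double effectiveMassRatio(double effMassA, double effMassB)
{
    assert(effMassA > 0.0 && effMassB > 0.0);
    return effMassA > effMassB ? effMassA / effMassB : effMassB / effMassA;
}
```

For two bodies of mass 1 and inertia 0.01, one anchored at its COM and the other at an offset of 1, the ratio is about 101. Raising the offset body's inertia to 1 brings the ratio down to 2.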