med-air/DEX

Coordinate Transformation and Workspace


Hello,

I am receiving a sensible Cartesian trajectory for the needle pick.
[screenshot: recorded Cartesian trajectory for the needle pick]
However, the Cartesian position doesn't quite match the joint positions. For example, providing the dVRK with the joint positions [0.41041827508953993 -0.5074103527953661 0.16461946887696666 0.48424603099967994 0.6265858251749896 -0.1466937918716302], I would expect to receive a position close to x = 0.0312652, y = 0.0035137, z = 0.11708. The learned trajectory, however, gives me x = 2.6260259151458745, y = 0.01377844344824547, z = 3.5077011585235596.

I assume this is due to the PyBullet coordinate frame. Could you provide some details on how to transform this into the robot coordinate system?

Additionally, the area in which the needle can be picked is confined by the workspace limits 2.5 < x < 3, -0.25 < y < 0.25, 3.426 < z < 3.776. Is it possible to extend this so that a needle could be picked at z = 0.0?

I would really appreciate your help.

Hi,

Happy to hear that you have recorded the trajectory successfully.

For problem 1 --- how to transform a pose from the world (PyBullet) coordinate frame to the robot coordinate frame --- please consider this function.
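In case it helps, here is a minimal sketch of such a transform, assuming the robot base pose can be read from PyBullet and that the simulation applies a uniform scale factor. The names `robot_body_id` and `scaling` are illustrative assumptions; the function linked above is the authoritative implementation.

```python
# Minimal sketch: express a PyBullet world-frame point in the robot base
# frame. `robot_body_id` and `scaling` are assumptions for illustration.
import numpy as np
import pybullet as p

def world_to_robot(pos_world, robot_body_id, scaling=1.0):
    base_pos, base_orn = p.getBasePositionAndOrientation(robot_body_id)
    # 3x3 rotation matrix mapping base-frame vectors into the world frame
    R = np.array(p.getMatrixFromQuaternion(base_orn)).reshape(3, 3)
    # Invert the rigid transform: p_base = R^T (p_world - t)
    p_base = R.T @ (np.asarray(pos_world) - np.asarray(base_pos))
    # Undo the simulation scale factor to recover metric robot coordinates
    return p_base / scaling
```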

For problem 2 --- how to adjust the limits of the workspace --- please consider taking a look at this part.

Please feel free to let me know if you have any further questions.

Hello,
Thank you for the quick reply.
The function you linked for problem 1 was exactly what I needed.

For problem 2, however, I already tried changing this part. But if, for example, I just overwrite workspace_limits[0].mean() + (np.random.rand() - 0.5) * 0.1 with 0.0, the robot will try to grasp the needle but is not able to, because the position is outside the workspace. Is it possible to extend or shift the workspace itself?

Hi,

I would suggest changing this line to adjust the limits of the workspace itself.

However, I am not 100% sure this would fit your intention. If not, please let me know.

Hello,

I found those workspace limits as well, but it seems they are not the ones that are actually used. If I print workspace_limits here, I receive 2.5 < x < 3, -0.25 < y < 0.25, 3.426 < z < 3.776.
Do you have an idea where these values might come from? Could they just be expressed in a different coordinate system?

Hi,

These limits sound sensible to me, as there is a scale factor applied to the WORKSPACE_LIMIT. Please check that.
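For illustration, here is a minimal sketch of how such a scale factor would reproduce the values you printed. The factor of 5 and the base limits below are assumptions inferred from the numbers quoted in this thread, not the repository's actual definitions.

```python
# Hedged sketch: the limits printed at runtime are the source-level limits
# multiplied by a scale factor. SCALING = 5.0 and the base limits are
# assumed values, chosen to be consistent with the numbers quoted above.
import numpy as np

SCALING = 5.0
base_limits = np.array([[0.50,   0.60],    # x_min, x_max
                        [-0.05,  0.05],    # y_min, y_max
                        [0.6852, 0.7552]]) # z_min, z_max

print(base_limits * SCALING)
# [[ 2.5    3.   ]
#  [-0.25   0.25 ]
#  [ 3.426  3.776]]
```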

Hello,
Thank you for your replies. Unfortunately, I'm still facing problems setting a different pick position, but I think the problem wasn't really the workspace after all. When I don't make any changes at all, I receive the following trajectories when running the evaluation.
[plot: trajectories with the original workspace and the original pick position]
Those look sensible. After that I tried setting a fixed position that must be within the normal workspace: x = workspace_limits[0].mean() + 0.1, y = workspace_limits[1].mean() + (-0.5) * 0.1, z = workspace_limits[2][0] + 0.01. Basically, I just removed the np.random.rand() calls. For this pick position I receive the following trajectories, which don't make sense:
[plot: trajectories with the original workspace and the fixed pick position]
Am I doing something wrong when setting a different pick position?

Hi,

There are several things you could pay attention to.

Firstly, the x-position in your code is set as x = workspace_limits[0].mean() + 0.1. The fixed +0.1 offset differs greatly from the original setting (np.random.rand() - 0.5) * 0.1, which only produces offsets in [-0.05, 0.05); your offset is twice the largest value the agent could have seen during training.

Secondly, such a difference will lead to poor policy performance, because the RL agent has only a slim chance of encountering a needle with x-position x = workspace_limits[0].mean() + 0.1 during training. As a result, the RL agent has not learned to pick the needle at this position; its ability to generalize beyond the training distribution is limited.

To summarize, you need to make sure that either (1) the initial position of the needle you set here lies in a range that is likely to be encountered during RL training, or (2) you change the needle's initialization range, for example by making it wider, and then train an RL agent on the changed settings so that the policy generalizes to it. Either option should yield a sensible trajectory from the RL policy.
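As a hedged illustration of option (2), a widened sampler might look like the sketch below. The variable names mirror the snippet quoted earlier in this thread; the `spread` value and the clipping step are illustrative assumptions.

```python
# Hedged sketch: widen the needle's initial-position sampling so that a
# retrained policy covers more pick positions. `spread` is illustrative;
# the snippet quoted in this thread corresponds to spread = 0.1.
import numpy as np

def sample_needle_position(workspace_limits, spread=0.3):
    # workspace_limits is assumed to be a (3, 2) array of [min, max] rows
    x = workspace_limits[0].mean() + (np.random.rand() - 0.5) * spread
    y = workspace_limits[1].mean() + (np.random.rand() - 0.5) * spread
    z = workspace_limits[2][0] + 0.01
    # Clip so that widened samples always stay inside the workspace
    return np.clip([x, y, z], workspace_limits[:, 0], workspace_limits[:, 1])
```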

Closing because of inactivity.