Questions about the script: action, prev, goal

Question

Questions about the script: action, prev, goal

nndei opened this issue 5 years ago · 5 comments

I have doubts about the code that I would like to solve before applying some modifications. Some I just don't know and some I have ideas about what they are.

I have seen that you clip the action vector at line 141 of PsmEnv in order to ensure its elements are bounded:

action = np.clip(action, self.action_space.low, self.action_space.high)

and later in PsmEnv_Position, this is copied and used to conduct position and gripper control:

action = action.copy()
                        [...]
			pos_ctrl, gripper_ctrl = action[:3], action[3]
			gripper_ctrl = (gripper_ctrl+1.0)/2.0

Q1: Is action in the first place randomized from the policy? Is it done in some other way?

I haven't understood why each time you perform an action in the _set_action function inside PsmEnv_Position (line 131), you apply a rotation control. I know it is done to set the quaternion of the EE's frame to the desired standard orientation, but I don't get how it may change, considering the actions and since you already specify it in the reset method.

temp_q =  quaternions.qmult([q_table[3], q_table[0], q_table[1], q_table[2]], [ 0.5, -0.5, -0.5,  0.5])
rot_ctrl = np.array([temp_q[1], temp_q[2], temp_q[3], temp_q[0]])

Then rot_ctrl is set as the quaternion with setPoseAtEE. So,
Q2: Why is rot control needed if the action is simply to change the end effector's position in space and open/close the gripper? Is it a precaution towards modifications?

Q3: What are the variables prev_ for and what's the meaning of prev? I see they get reinitialized in the methods but I don't see them used anywhere.

Q4: Why do you call _simulator_step inside the method _reset_sim in PsmEnv_Position e.g at lines 244, 269, 314. Is it necessary every time you apply a big change like turning off dynamics (244), removing grasped (269) and resetting the positions and orientations of object (314)?

Q5: In the goal sampling method, I don't understand why you randomize z off a value target_offset, why is it needed?

Q6: it is unclear to me which object exactly j=6 refers to in ArmPSM. I believe it is EE_virtual_handle but I confuse it with TT. We refer to j=6 and define fns like setPoseAtEE, so it surely refers to the EE, but I don't know which exact object.
If it refers to EE_virtual_handle, then does setting its position change the whole robot pose because it is parent of all the links?

Q7: checking the dV joints I have seen that the tool tip should rotate about 2 horizontal axes intersecting at J2_virtual_TOOL1, then have an additional opening-closing DoF. I don't get why in the hierarchy the TOOL has just 2 such axes/joints, 1 vertical and 1 horizontal, then the remaining one is split with the left and right reference, which should perform opening and closing. Is the opening and closing together with the last revolute DoF? I see how it may be done together, simply moving the 2 parts apart or together.

Q8: dynamics_enabled is set to False in the code, but does this mean that respondable and dynamic bodies in the scene are treated differently?

Thank you for your support and patience, Neri

Answer 1 · 2020-04-21T17:06:28.000Z

Q1: Is action in the first place randomized from the policy? Is it done in some other way?
I am confused by this question... A policy for this environment can be whatever you would like it to be, and it should input an action of the appropriate dimensions to the environment.

Q2: Why is rot control needed if the action is simply to change the end effector's position in space and open/close the gripper? Is it a precaution towards modifications?
The V-REP simulation uses an IK solver which requires a target orientation, so an orientation needs to be set every time when using it. If you just get the current EE orientation then set it, floating point errors will cause the orientation to drift.

Q3: What are the variables prev_ for and what's the meaning of prev? I see they get reinitialized in the methods but I don't see them used anywhere.
These were states I experimented with and found they did not help with the training, so I left them out of the return in the .step()

Q4: Why do you call _simulator_step inside the method _reset_sim in PsmEnv_Position e.g at lines 244, 269, 314. Is it necessary every time you apply a big change like turning off dynamics (244), removing grasped (269) and resetting the positions and orientations of object (314)?
Yes

Q5: In the goal sampling method, I don't understand why you randomize z off a value target_offset, why is it needed?
Was necessary when setting up and debugging the environment. It would need to be changed if the table were lowered or something.

Q6: it is unclear to me which object exactly j=6 refers to in ArmPSM. I believe it is EE_virtual_handle but I confuse it with TT. We refer to j=6 and define fns like setPoseAtEE, so it surely refers to the EE, but I don't know which exact object.
If it refers to EE_virtual_handle, then does setting its position change the whole robot pose because it is parent of all the links?
In ArmPSM.py I wrote a function called getPoseAtJoint(self, j, ignoreError = False, initialize = False) starting at line 140. This function gets the pose at joint j relative to the base frame if j > 0 and j < 6. If j <=0 it returns the base of the robot in the world frame. If j = 6, it returns the end-effector pose (which is also the 6th joint coordinate frame). If j > 7, it returns the tool tip pose which is defined at the end of the gripper.

The setPoseAtEE runs the IK solver and is completely separate from the getPoseAtJoint function. This function actually moves the robot to the input pose.

Q7: checking the dV joints I have seen that the tool tip should rotate about 2 horizontal axes intersecting at J2_virtual_TOOL1, then have an additional opening-closing DoF. I don't get why in the hierarchy the TOOL has just 2 such axes/joints, 1 vertical and 1 horizontal, then the remaining one is split with the left and right reference, which should perform opening and closing. Is the opening and closing together with the last revolute DoF? I see how it may be done together, simply moving the 2 parts apart or together.
The last joint and gripper are technically mixed joints on the real da Vinci. This simulation copies that by moving J3_dx_TOOL1 and J3_sx_TOOL1 independently. Some simple arithmetic is done such that they can be opened and closed together or actuate the last joint independently like the real da Vinci.

Q8: dynamics_enabled is set to False in the code, but does this mean that respondable and dynamic bodies in the scene are treated differently?
Yes, this is part of how V-REP/Coppelia does their simulation.

Answer 2 · 2020-04-21T17:19:05.000Z

Dear @bango123, your help is as always much appreciated.

I have not understood these two answers:
A1: To be more clear, I have not understood where you initially set that "action" I am talking about in the pre-question introduction. Where does the code take the vector "action" from? Is it updated every time-step?

A8: Does it mean that if I were to use the code as you put it and add an object, I wouldn't have to care about specifying if it is respondable or non-static or anything, because dynamics_enabled is set to False?
Like does turning it on consider such properties in the simulation and turning it off excludes them from its computations?

Best regards, thank you

Neri

Answer 3 · 2020-04-21T17:29:43.000Z

No problem @nndei!

A1: To be more clear, I have not understood where you initially set that "action" I am talking about in the pre-question introduction. Where does the code take the vector "action" from? Is it updated every time-step?

PSMEnv.py is a gym-wrapper which has a .step() function (defined on line 140). This function takes action as an input which is then passed to the other lines of code you are referring to. The action would be supplied from a policy as per the open-ai gym standard

A8: Does it mean that if I were to use the code as you put it and add an object, I wouldn't have to care about specifying if it is respondable or non-static or anything, because dynamics_enabled is set to False?

It depends on your end goal with the simulation. For now, this is just a position based simulation and dynamics are not considered. Please refer to https://www.coppeliarobotics.com/helpFiles/en/designingDynamicSimulations.htm for more information on how V-REP does dynamic vs static objects.

Answer 4 · 2020-04-21T17:35:54.000Z

PSMEnv.py is a gym-wrapper which has a .step() function (defined on line 140). This function takes action as an input which is then passed to the other lines of code you are referring to. The action would be supplied from a policy as per the open-ai gym standard

Great! Glad I had understood. I meant this in the first post.

It depends on your end goal with the simulation. For now, this is just a position based simulation and dynamics are not considered. Please refer to https://www.coppeliarobotics.com/helpFiles/en/designingDynamicSimulations.htm for more information on how V-REP does dynamic vs static objects.

I gave this a read, but I couldn't find much of a certain answer for my case. I just need to import another object and use a dummy on one of its vertices as goal.
I believe no dynamics would be needed as for your code. Think so?
Also, do you think that a mesh with a high content of triangles is problematic if dynamics are off?

Best regards, Neri

Answer 5 · 2020-04-23T16:03:07.000Z

I think no dynamics will be okay if it is stationary unless grasped by the PSM. Also, the more triangles you put in the mesh the longer each simulation time will take, even with no dynamics. The meshes are used for collision checking as well.