Handedness of coordinate system
sagadre opened this issue · 0 comments
Problem
The THOR metadata for agent pose and objects seems to live in Unity's native left-handed coordinate system. This may confuse people who are unfamiliar with Unity, since right-handed coordinate systems are much more common. For example, creating point clouds in world space with allenact.embodiedai.mapping.mapping_utils.point_cloud_utils.depth_frame_to_camera_space_xyz and allenact.embodiedai.mapping.mapping_utils.point_cloud_utils.camera_space_xyz_to_world_xyz and then visualizing them in MeshLab gives the appearance of a flip (as seen below). Ultimately this should be handled on the user side, but it should be clear that THOR uses a left-handed coordinate system. Updated THOR docs and docs for allenact.embodiedai.mapping.mapping_utils.point_cloud_utils would therefore be helpful.
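As a rough illustration of what handling this on the user side could look like, here is a minimal sketch that mirrors an (N, 3) array of points across one axis. Negating exactly one axis switches a frame's handedness. The helper name `left_to_right_handed` and the default choice of the z axis are assumptions made for illustration, not part of THOR or AllenAct; which axis to negate depends on the convention the downstream viewer expects.

```python
import numpy as np


def left_to_right_handed(points_xyz, flip_axis=2):
    """Mirror (N, 3) points across one axis (hypothetical helper).

    Negating a single axis changes the handedness of the frame.
    flip_axis=2 (z) is an assumed default, not a THOR convention.
    """
    points_xyz = np.asarray(points_xyz, dtype=np.float64)
    if points_xyz.ndim != 2 or points_xyz.shape[1] != 3:
        raise ValueError("expected an (N, 3) array of xyz points")
    flipped = points_xyz.copy()
    flipped[:, flip_axis] *= -1.0
    return flipped


# The sign of the determinant of the basis vectors flips along with
# the handedness, which gives a quick sanity check.
basis = np.eye(3)
flipped_basis = left_to_right_handed(basis)
det_before = np.linalg.det(basis)
det_after = np.linalg.det(flipped_basis)
```

In the reproduction script, something like this could be applied to `world_points.numpy()` (already N x 3 after the transpose) before exporting the point cloud.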
Steps to reproduce
from ai2thor.controller import Controller
import trimesh
import torch
from allenact.embodiedai.mapping.mapping_utils.point_cloud_utils import (
    depth_frame_to_camera_space_xyz,
    camera_space_xyz_to_world_xyz,
)
from PIL import Image

controller = Controller(
    renderDepthImage=True,
    renderInstanceSegmentation=True,
    width=672,
    height=672,
    visibilityDistance=20.0,
    fieldOfView=90,
    agentMode='locobot',
    rotateStepDegrees=30
)
event = controller.step(action="Done")

# Unproject the depth frame into camera-space points.
camera_space_xyz = depth_frame_to_camera_space_xyz(
    depth_frame=torch.as_tensor(event.depth_frame),
    mask=None,
    fov=90
)

# Agent pose as reported in the THOR metadata.
x = event.metadata['agent']['position']['x']
y = event.metadata['agent']['position']['y']
z = event.metadata['agent']['position']['z']

# Move the camera-space points into world space.
world_points = camera_space_xyz_to_world_xyz(
    camera_space_xyzs=camera_space_xyz,
    camera_world_xyz=torch.as_tensor([x, y, z]),
    rotation=event.metadata['agent']['rotation']['y'],
    horizon=event.metadata['agent']['cameraHorizon'],
)
world_points = torch.transpose(world_points, 0, 1)

# Color every point opaque black.
rgba_colors = torch.ones(world_points.shape[0], 4)
rgba_colors[:, :3] = 0.

# Export the point cloud and the matching RGB frame for comparison.
ply = trimesh.points.PointCloud(vertices=world_points.numpy(), colors=rgba_colors.numpy())
ply.export("dbg.ply")
Image.fromarray(event.frame).save('dbg.png')