How to train/test on my own dataset
changfali opened this issue · 30 comments
Hi,
I want to try IRON on my own datasets, but I don't know how to get cam_dict_norm.json, could you share with me this part of code? Thanks!
Hi, I used colmap to get cam_dict_norm.json. The results in exp_iron_stage1/ours/validations_fine/ are good (after 30000 iters),
but the results in exp_iron_stage1/ours/normals are not good (after 30000 iters):
Could it be because I didn't do anything to make sure the objects are inside the unit sphere, as the README requires ("Note we also assume the objects are inside the unit sphere.")?
Hi Kai,
I used the code "run_colmap.py" in Nerf++ to produce the cam_dict_norm.json, but the result is also bad, even for your dataset.
I ran "run_colmap.py" from Nerf++ on all the train and test images of "Xmen", then ran "camera_visualizer\visualize_cameras.py" from Nerf++ to visualize the cameras. Here is the result: the first picture shows the cameras for "Xmen" produced by "run_colmap.py" (the json file is xmen/posed_images/kai_cameras_normalized.json), and the second shows the cameras in the original "Xmen" dataset:
You can see the difference. After that I split the images in the directory "xmen/mvs/image" (an output of "run_colmap.py") into train and test sets myself and used them to train IRON. But after 15000 iters, normal.png still looked the same as at the beginning:
and the resulting mesh has 0 vertices and 0 faces:
while for the original data, the result after 2500 iters is already good:
Do you know which part could be the reason?
I have been troubled by this problem for a long time, and it would be a GREAT HELP for me!! THANKS!! @Kai-46
Hi @changfali , I'm sorry that you have spent a lot of time on this; camera conventions are always a pain when you work in this domain. It takes time to learn.
To help you, I'd like to ask a few questions in order to make sure we are on the same page:
- Did you happen to notice that colmap SfM outputs camera parameters as well as a sparse point cloud (https://github.com/Kai-46/nerfplusplus/blob/ebf2f3e75fd6c5dfc8c9d0b533800daaf17bd95f/colmap_runner/extract_sfm.py#L107), and that you can visualize the point cloud together with the cameras?
- Did you happen to understand how camera normalization (as well as point cloud normalization, and that they need to be done together) works?
- Did you happen to understand the technique of debugging poses with epipolar geometry visualization?
Totally fine if you haven't thought about these questions. But this would be helpful for me to answer your questions based on your familiarity in this area.
Best,
Kai
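For readers unfamiliar with the epipolar check mentioned above, here is a minimal sketch of the idea. It is not the inspect_epipolar_geometry.py script from Nerf++; the intrinsics, poses and 3D point are made-up toy values, and the helper names are hypothetical. If two W2C matrices are consistent, a pixel in image 2 must lie on the epipolar line induced by its match in image 1.

```python
import numpy as np


def skew(t):
    """Matrix form of the cross product: skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def fundamental_from_poses(K1, W2C1, K2, W2C2):
    """Fundamental matrix F with x2^T F x1 = 0 for corresponding pixels."""
    rel = W2C2 @ np.linalg.inv(W2C1)  # camera-1 frame -> camera-2 frame
    R, t = rel[:3, :3], rel[:3, 3]
    E = skew(t) @ R  # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)


def project(K, W2C, point):
    """Homogeneous pixel (u, v, 1) of a 3D world point."""
    uvw = K @ (W2C[:3, :3] @ point + W2C[:3, 3])
    return uvw / uvw[2]


# Toy setup; every value here is an assumption for illustration only.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
W2C1 = np.eye(4)
angle = 0.2
W2C2 = np.eye(4)
W2C2[:3, :3] = [[np.cos(angle), 0.0, np.sin(angle)],
                [0.0, 1.0, 0.0],
                [-np.sin(angle), 0.0, np.cos(angle)]]
W2C2[:3, 3] = [-0.8, 0.0, 0.1]

point = np.array([0.1, -0.2, 4.0])
x1, x2 = project(K, W2C1, point), project(K, W2C2, point)

F = fundamental_from_poses(K, W2C1, K, W2C2)
line2 = F @ x1  # epipolar line in image 2, as (a, b, c) with a*u + b*v + c = 0
pixel_distance = abs(x2 @ line2) / np.linalg.norm(line2[:2])
print("distance of x2 from its epipolar line (pixels):", pixel_distance)
```

With real data, x1 and x2 would be hand-picked matching pixels in two overlapping images; a large distance suggests the poses or the coordinate convention are wrong.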
Also struggling to get cam_dict_norm.json
Hi @Kai-46, thanks for the reply! I know that colmap SfM outputs camera parameters as well as a sparse point cloud, and that we can visualize the point cloud together with the cameras, but this link (https://github.com/Kai-46/nerfplusplus/blob/ebf2f3e75fd6c5dfc8c9d0b533800daaf17bd95f/colmap_runner/extract_sfm.py#L107) is not for the point cloud, but for the triangle mesh. So I visualized the cameras and the sparse point cloud without normalization:
right view:
top view:
For the cameras, the blue ones are the output of colmap (without normalization): ./xmen/posed_images/kai_cameras.json
and the green ones are from the dataset you provided (with normalization).
The sparse point cloud is from: ./xmen/posed_images/kai_points.ply
and here is the inspect_epipolar_geometry.py result:
Is this a bug in colmap or in the script "run_colmap.py"?
I found that you shared the complete output of the scripts on example data here: Kai-46/nerfplusplus#16, but the link no longer works. Could you please share the complete scripts so that we can get cam_dict_norm.json from our own data?
Thanks a lot!
best,
changfa
Hi, I can try to share a script with you later this week. But if you'd like to figure things out yourself, here are the steps, starting from the unnormalized cameras and sparse point cloud from Colmap: 1) remove the extreme outliers in the sparse point cloud using meshlab; 2) compute the oriented bounding box of the sparse point cloud using open3d; 3) compute a translation vector and a scale factor that move the center of the bounding box to the origin and shrink the diagonal of the bounding box to something smaller than 2, say 1.75; 4) use the translation vector and scale factor to normalize both the cameras and the point cloud. After this normalization, the point cloud (hence the object) will be inside the unit sphere.
Hi, here are some others results:
cameras (xmen/posed_images/kai_cameras_normalized.json) and colmap MVS result mesh (xmen/mvs/meshed_trim_3.ply with normalization using https://github.com/Kai-46/nerfplusplus/blob/ebf2f3e75fd6c5dfc8c9d0b533800daaf17bd95f/colmap_runner/extract_sfm.py#L107):
you can see the face part is not right, and the cameras are all in front of the scene.
here are the inspect_epipolar_geometry.py results:
My colmap version is 3.8, built from source following https://colmap.github.io/install.html on Ubuntu 20.04.
Hi @Kai-46, did you finish the script? I have no idea which step could be going wrong with the camera conventions.
Here it is. I hope the script is self-explanatory. Please keep in mind that the normalization must be applied to the point cloud and the cameras at the same time (it might be worth thinking about why this is the case); and when you inspect the epipolar geometry, try picking pairs of images that overlap in content rather than a front image and a back image (it might be worth thinking about the logic here too).
import numpy as np
import json
import copy
import open3d as o3d


def normalize_cam_dict(
    in_cam_dict_file, out_cam_dict_file, target_radius, in_geometry_file, out_geometry_file
):
    # estimate a translate and scale that centers the object from the sparse point cloud
    #! note if your sparse point cloud contains extreme outliers, you should remove the outliers in meshlab first,
    #! otherwise, the estimated bounding box is going to be too big.
    pcd = o3d.io.read_point_cloud(in_geometry_file)
    box = pcd.get_oriented_bounding_box()
    box_corners = np.asarray(box.get_box_points())  # [8, 3]
    box_center = np.mean(box_corners, axis=0, keepdims=True)  # [1, 3]
    dist = np.linalg.norm(box_corners - box_center, axis=1, keepdims=True)  # [8, 1]
    diagonal = np.max(dist) * 2.
    translate = -box_center.reshape((3, 1))
    scale = target_radius / (diagonal / 2.)

    # apply translate and scale to the sparse point cloud
    tf_translate = np.eye(4)
    tf_translate[:3, 3:4] = translate
    tf_scale = np.eye(4)
    tf_scale[:3, :3] *= scale
    tf = np.matmul(tf_scale, tf_translate)
    pcd_norm = pcd.transform(tf)
    o3d.io.write_point_cloud(out_geometry_file, pcd_norm)

    # apply translate and scale to the cameras
    with open(in_cam_dict_file) as fp:
        in_cam_dict = json.load(fp)

    def transform_pose(W2C, translate, scale):
        C2W = np.linalg.inv(W2C)
        cam_center = C2W[:3, 3]
        # translate is [3, 1]; flatten it so the sum stays a 3-vector
        cam_center = (cam_center + translate.reshape(3)) * scale
        C2W[:3, 3] = cam_center
        return np.linalg.inv(C2W)

    out_cam_dict = copy.deepcopy(in_cam_dict)
    for img_name in out_cam_dict:
        W2C = np.array(out_cam_dict[img_name]["W2C"]).reshape((4, 4))
        W2C = transform_pose(W2C, translate, scale)
        # the rotation block must stay a proper rotation
        assert np.isclose(np.linalg.det(W2C[:3, :3]), 1.0)
        out_cam_dict[img_name]["W2C"] = list(W2C.flatten())

    with open(out_cam_dict_file, "w") as fp:
        json.dump(out_cam_dict, fp, indent=2, sort_keys=True)


if __name__ == "__main__":
    # fill in your own paths
    in_cam_dict_file = ""
    out_cam_dict_file = ""
    in_geometry_file = ""
    out_geometry_file = ""
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, target_radius=0.8, in_geometry_file=in_geometry_file, out_geometry_file=out_geometry_file)
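One way to see why the point cloud and the cameras must be normalized together is the toy check below. It is only a sketch: the intrinsics, pose, point, translation and scale are made-up values, and normalize_pose is a hypothetical stand-in for the camera step of the script. Applying the same translate and scale to both leaves every projected pixel unchanged, while normalizing only the cameras moves the pixel.

```python
import numpy as np


def normalize_pose(W2C, translate, scale):
    """Shift and scale the camera center; the rotation is left untouched."""
    C2W = np.linalg.inv(W2C)
    C2W[:3, 3] = (C2W[:3, 3] + translate) * scale
    return np.linalg.inv(C2W)


def project(K, W2C, point):
    """Pixel (u, v) of a 3D world point."""
    uvw = K @ (W2C[:3, :3] @ point + W2C[:3, 3])
    return uvw[:2] / uvw[2]


# Toy values, assumed for illustration only.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
angle = 0.3
W2C = np.eye(4)
W2C[:3, :3] = [[np.cos(angle), 0.0, np.sin(angle)],
               [0.0, 1.0, 0.0],
               [-np.sin(angle), 0.0, np.cos(angle)]]
W2C[:3, 3] = [0.1, -0.2, 5.0]
point = np.array([0.5, 0.3, 1.0])
translate = np.array([-0.4, 0.1, -0.9])  # a plain 3-vector in this sketch
scale = 0.25

pixel_before = project(K, W2C, point)

W2C_norm = normalize_pose(W2C, translate, scale)
point_norm = (point + translate) * scale

pixel_both = project(K, W2C_norm, point_norm)    # cameras and points normalized
pixel_cameras_only = project(K, W2C_norm, point)  # points left unnormalized

print("before:            ", pixel_before)
print("both normalized:   ", pixel_both)
print("cameras only:      ", pixel_cameras_only)
```

The first two printed pixels agree, which is why the training images still match the normalized scene; the third does not.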
Hi, thanks for the script!! In my understanding, the normalization of both the cameras and the point cloud is meant to put them into a unit sphere (is this right?).
This is the result of the normalization:
But as you can see, the positions of the cameras are not right: all the cameras are in front of the object. This is why I picked a front image and a back image for inspecting the epipolar geometry. Images that do not overlap in content should not share keypoints, but in my case they did. Is this a SIFT mismatch? Can you test the Xmen data on your side? Again, my colmap version is 3.8, built from source following https://colmap.github.io/install.html on Ubuntu 20.04.
Please allow me to ask a simple question: how did you manage to reconstruct the back of X-men (as shown in your point cloud) if all your cameras are in the front?
Btw, the script I shared only normalizes the point cloud such that the object is inside the unit sphere; there is no guarantee that the normalized cameras are also inside the unit sphere.
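A quick numeric check of this point, as a sketch on synthetic data: the object is a random toy cloud, the cameras sit on a ring of radius 4, and an axis-aligned bounding box is used instead of open3d's oriented one to keep the example dependency-free (an assumption, not what the script does).

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(500, 3))  # toy object
angles = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
cam_centers = 4.0 * np.stack(
    [np.cos(angles), np.zeros(12), np.sin(angles)], axis=1
)  # toy camera ring around the object

# Translate/scale estimated from the point cloud only (axis-aligned box here).
target_radius = 0.8
box_min, box_max = points.min(axis=0), points.max(axis=0)
translate = -(box_min + box_max) / 2.0
scale = target_radius / (np.linalg.norm(box_max - box_min) / 2.0)

# The same transform is applied to the points and to the camera centers.
points_norm = (points + translate) * scale
cams_norm = (cam_centers + translate) * scale

max_point_radius = np.linalg.norm(points_norm, axis=1).max()
min_cam_radius = np.linalg.norm(cams_norm, axis=1).min()
print("largest point radius:  ", max_point_radius)  # at most target_radius
print("smallest camera radius:", min_cam_radius)    # can exceed 1
```

In this toy setup the object ends up within radius 0.8 while every camera stays outside the unit sphere, which is expected and not an error.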
Ignore the white camera cones, I am too lazy to delete them from the code so i set them to white.
This is what I got from running the script Kai provided.
- Delete the points that don't 'belong' to the model. The normalization calculation is based on the entire point cloud's furthest points forming a bounding box on the point cloud.
- I set the scale numerator to 0.5 to fit the pointcloud inside the sphere.
Do the results look reasonable?
This one looks more reasonable!
Btw, one side tip in case you don't know: you can visualize the bounding box in meshlab easily through "Render -> Show box corners". Showing the box also makes it easier to manually remove the extreme outliers in meshlab.
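The thread removes outliers by hand in meshlab. As a rough automated alternative (a swapped-in heuristic, not the method used here), one could drop the points farthest from the median before computing the bounding box. The function name and the 99th-percentile threshold are assumptions, and the result should still be inspected visually.

```python
import numpy as np


def drop_far_points(points, keep_percentile=99.0):
    """Keep points whose distance to the median center is within a percentile."""
    center = np.median(points, axis=0)
    dist = np.linalg.norm(points - center, axis=1)
    return points[dist <= np.percentile(dist, keep_percentile)]


# Synthetic cloud: a compact object plus a few extreme outliers.
rng = np.random.default_rng(0)
object_points = rng.normal(scale=0.5, size=(990, 3))
outliers = rng.uniform(50.0, 100.0, size=(10, 3))
cloud = np.vstack([object_points, outliers])

cleaned = drop_far_points(cloud)

diag_before = np.linalg.norm(cloud.max(axis=0) - cloud.min(axis=0))
diag_after = np.linalg.norm(cleaned.max(axis=0) - cleaned.min(axis=0))
print("bounding-box diagonal before:", diag_before)
print("bounding-box diagonal after: ", diag_after)
```

The diagonal shrinks sharply once the outliers are gone, which is what keeps the estimated scale from collapsing the object to a tiny blob.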
Glad to hear that!
So maybe changfali can get the results he's looking for if he starts by cropping away all the noisy points in his point cloud.
Please allow me to ask a dumb question: if I run the colmap GUI myself and get the cameras, images and points3D files in both bin and txt format, what should I pass as in_cam_dict_file = "" in the script?
You have to transform it into a json file with the coordinate convention they are using.
That can be done using part of the function extract_all_to_dir, from https://github.com/Kai-46/nerfplusplus/blob/master/colmap_runner/extract_sfm.py. Seems to be from line 97 to line 99. Just a reminder this one gives unnormalized values. I hope this helps~
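To illustrate what that conversion involves, here is a minimal sketch for a single image. It is not the Nerf++ parse_camera_dict function. It relies on COLMAP storing each image pose as a world-to-camera quaternion (QW, QX, QY, QZ) plus translation (TX, TY, TZ), and on the "W2C" key holding a flattened 4x4 matrix as in the normalization script above. The image name and pose values are hypothetical, and intrinsics are left out.

```python
import json

import numpy as np


def qvec_to_rotmat(qvec):
    """Rotation matrix from a unit quaternion given as (qw, qx, qy, qz)."""
    qw, qx, qy, qz = qvec
    return np.array([
        [1 - 2 * qy**2 - 2 * qz**2, 2 * qx * qy - 2 * qw * qz, 2 * qz * qx + 2 * qw * qy],
        [2 * qx * qy + 2 * qw * qz, 1 - 2 * qx**2 - 2 * qz**2, 2 * qy * qz - 2 * qw * qx],
        [2 * qz * qx - 2 * qw * qy, 2 * qy * qz + 2 * qw * qx, 1 - 2 * qx**2 - 2 * qy**2],
    ])


def colmap_pose_to_w2c(qvec, tvec):
    """4x4 world-to-camera matrix from a COLMAP image pose."""
    W2C = np.eye(4)
    W2C[:3, :3] = qvec_to_rotmat(qvec)
    W2C[:3, 3] = tvec
    return W2C


# Hypothetical pose: a 90-degree rotation about the z axis plus a translation.
qvec = np.array([np.sqrt(0.5), 0.0, 0.0, np.sqrt(0.5)])
tvec = np.array([0.1, 0.2, 3.0])
W2C = colmap_pose_to_w2c(qvec, tvec)

# One entry of the camera dictionary; "0001.jpg" is a made-up image name.
cam_dict = {"0001.jpg": {"W2C": W2C.flatten().tolist()}}
json_text = json.dumps(cam_dict, indent=2, sort_keys=True)
print(json_text)
```

The values produced this way are unnormalized, so they still need the joint camera and point-cloud normalization discussed earlier.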
I tried to run this
sparse_dir = r'E:\deep_learning_stuff\New_Folder\ngp5\instant-ngp\colmap_sparse\0'  # raw string, so the backslashes are not read as escape sequences
cameras, images, points3D = read_model(sparse_dir, ext)
camera_dict = parse_camera_dict(cameras, images)
with open(camera_dict_file, 'w') as fp:
    json.dump(camera_dict, fp, indent=2, sort_keys=True)
Sorry I am not a code guy, can you write a complete script with input and output ? Thanks - -
Hi, can you provide a script to convert colmap gui output to the camera convention used here ?
After running run_colmap.py I get kai_cameras.json, kai_cameras_normalized.json, kai_keypoints.json and kai_points.ply. Then I open the ply file in meshlab, remove the points that don't belong to the model, and save it over the original ply.
Then I run the script provided by Kai and get the following error:
  File "camera.py", line 63, in <module>
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, target_radius=0.8)
TypeError: normalize_cam_dict() missing 2 required positional arguments: 'in_geometry_file' and 'out_geometry_file'
in_geometry_file is the .ply file you edited. out_geometry_file is the name to save the output pointcloud file
I understand, but I don't know what to do...
I modified the last few lines:
in_geometry_file = "/home/michael/rd12_out/posed_images/kai_points.ply"
out_geometry_file = "/home/michael/rd12_out/posed_images/out.ply"
in_cam_dict_file = "/home/michael/rd12_out/posed_images/kai_cameras.json"
out_cam_dict_file = "/home/michael/rd12_out/posed_images/normalized.json"
normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, in_geometry_file, out_geometry_file, target_radius=0.8)
And it gives this error:
Traceback (most recent call last):
  File "camera.py", line 63, in <module>
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, in_geometry_file, out_geometry_file, target_radius=0.8)
TypeError: normalize_cam_dict() got multiple values for argument 'target_radius'
I tried to remove "target_radius" in the last line and replace target_radius with 0.8, but then I get:
  File "camera.py", line 64, in <module>
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, in_geometry_file, out_geometry_file)
TypeError: normalize_cam_dict() missing 1 required positional argument: 'out_geometry_file'
It must be some stupid mistake - -
run the function like this :
normalize_cam_dict(in_cam_dict_file, out_cam_dict_file,0.8, in_geometry_file, out_geometry_file)
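The earlier TypeError comes from argument order: target_radius is the third positional parameter, so a third positional value lands in it and the keyword then supplies it a second time. The sketch below reproduces this with a stub that has the same signature but does no work; the file names are placeholders. Passing every argument by keyword sidesteps the ordering question.

```python
def normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, target_radius,
                       in_geometry_file, out_geometry_file):
    """Stub with the same parameter order as the script; it only echoes its inputs."""
    return {
        "in_cam_dict_file": in_cam_dict_file,
        "out_cam_dict_file": out_cam_dict_file,
        "target_radius": target_radius,
        "in_geometry_file": in_geometry_file,
        "out_geometry_file": out_geometry_file,
    }


# Third positional value fills target_radius, then the keyword sets it again.
try:
    normalize_cam_dict("cams.json", "cams_norm.json", "in.ply", "out.ply",
                       target_radius=0.8)
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# Keyword arguments make the call independent of parameter order.
args = normalize_cam_dict(
    in_cam_dict_file="cams.json",
    out_cam_dict_file="cams_norm.json",
    target_radius=0.8,
    in_geometry_file="in.ply",
    out_geometry_file="out.ply",
)
print(args)
```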
Reading PLY: [========================================] 100%
Traceback (most recent call last):
  File "camera.py", line 64, in <module>
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, 0.8,in_geometry_file, out_geometry_file)
  File "camera.py", line 18, in normalize_cam_dict
    box = pcd.get_oriented_bounding_box()
AttributeError: 'open3d.open3d.geometry.PointCloud' object has no attribute 'get_oriented_bounding_box'
It seems something is wrong with open3d; I had installed it with pip install open3d-python.
Now that I have installed open3d with pip install open3d, it gives:
Traceback (most recent call last):
  File "camera.py", line 64, in <module>
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, 0.8,in_geometry_file, out_geometry_file)
  File "camera.py", line 29, in normalize_cam_dict
    tf_translate[:3, 3:4] = translate
ValueError: could not broadcast input array from shape (1,3) into shape (3,1)
I have tried installing open3d in many ways; they all lead to the same error.
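This ValueError is a NumPy shape mismatch rather than an open3d installation problem: a row vector of shape (1, 3) cannot be written into a column slot of shape (3, 1). A small sketch with a made-up box center reproduces the error and shows that reshaping to (3, 1) resolves it.

```python
import numpy as np

box_center = np.array([[1.0, 2.0, 3.0]])  # shape (1, 3), as with keepdims=True
translate = -box_center                   # still shape (1, 3)

tf_translate = np.eye(4)
try:
    tf_translate[:3, 3:4] = translate     # target slot has shape (3, 1)
except ValueError as err:
    message = str(err)
    print("ValueError:", message)

# Reshape the row vector into a column before assigning.
tf_translate[:3, 3:4] = translate.reshape((3, 1))
print(tf_translate)
```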
Sorry to bother you again, do you know how to solve the issue above ?
Hi @Michaelwhite34, sorry that there were some bugs in the code I shared earlier. I just created a self-contained camera normalization demo with examples included; you can simply use normalize_then_visualize_camera.py to convert COLMAP outputs to JSON files: https://www.icloud.com/iclouddrive/00f6DcB-NJNHH16r3o5BVcovg#camera_demo
Can the scripts run on Windows? First I ran the colmap GUI to get a folder sparse/0 with 3 bin files and 1 ini file. Then I ran normalize_then_visualize_cameras.py and got "'Please manually crop your sparse point cloud in meshlab to remove outliers!'". I know I should have run run_colmap.py first instead of using the colmap GUI, but that fails on Windows and I deleted my Ubuntu installation a long time ago.
(open3d) D:\nerfplusplus\colmap_runner>python run_colmap.py
Running sift matching...
Running cmd: D:/COLMAP-3.9.1-windows-cuda/COLMAP-3.9.1-windows-cuda/bin/colmap feature_extractor --database_path D:\New5\sfm\database.db --image_path D:/New_Folder4/
--ImageReader.single_camera 1 --ImageReader.camera_model SIMPLE_RADIAL
--SiftExtraction.max_image_size 5000 --SiftExtraction.estimate_affine_shape 0 --SiftExtraction.domain_size_pooling 1
--SiftExtraction.use_gpu 1 --SiftExtraction.max_num_features 16384
--SiftExtraction.gpu_index -1
Traceback (most recent call last):
  File "run_colmap.py", line 167, in <module>
    main(img_dir, out_dir, run_mvs=run_mvs)
  File "run_colmap.py", line 128, in main
    run_sift_matching(img_dir, db_file, remove_exist=False)
  File "run_colmap.py", line 39, in run_sift_matching
    bash_run(cmd)
  File "run_colmap.py", line 15, in bash_run
    subprocess.check_call(['/bin/bash', '-c', cmd])
  File "D:\anaconda3\envs\open3d\lib\subprocess.py", line 359, in check_call
    retcode = call(*popenargs, **kwargs)
  File "D:\anaconda3\envs\open3d\lib\subprocess.py", line 340, in call
    with Popen(*popenargs, **kwargs) as p:
  File "D:\anaconda3\envs\open3d\lib\subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "D:\anaconda3\envs\open3d\lib\subprocess.py", line 1327, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
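The traceback shows bash_run calling '/bin/bash', which does not exist on a stock Windows install, hence the FileNotFoundError. One possible workaround, sketched below, is to hand the command string to the platform's default shell instead. The name run_cmd is hypothetical, and this sketch has not been tested against the actual COLMAP commands, whose quoting and line continuations may need adjusting for cmd.exe.

```python
import subprocess
import sys


def run_cmd(cmd):
    """Run a command string in the default shell (cmd.exe on Windows, sh elsewhere)."""
    print("Running cmd:", cmd)
    return subprocess.check_call(cmd, shell=True)


# Harmless demo command: ask the current Python interpreter for its version.
return_code = run_cmd(f'"{sys.executable}" --version')
print("return code:", return_code)
```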
I finally figured it out, but there is still an error in the mesh generation phase.
global_step: 50000
loss.item(): 0.17388814687728882
img_loss.item(): 0.17244312167167664
img_l2_loss.item(): 0.08620146661996841
img_ssim_loss.item(): 0.08624166250228882
eik_loss.item(): 0.0014450259041041136
roughrange_loss.item(): 0.0
color_network_dict["point_light_network"].get_light().item(): 1.240997076034546
100%|██████████| 50001/50001 [2:19:22<00:00, 5.98it/s]
ic| f"Exporting mesh and materials to: {export_out_dir}": ('Exporting mesh and materials to: '
'./exp_iron_stage2/rabbit/mesh_and_materials_50000')
ic| 'Exporting mesh and uv...'
Traceback (most recent call last):
  File "render_surface.py", line 549, in <module>
    export_mesh_and_materials(export_out_dir, sdf_network, color_network_dict)
  File "render_surface.py", line 325, in export_mesh_and_materials
    export_mesh(sdf_fn, os.path.join(export_out_dir, "mesh.obj"))
  File "/workspace/iron/models/export_mesh.py", line 73, in export_mesh
    areas = np.array([c.area for c in components], dtype=np.float)
  File "/root/anaconda3/envs/iron/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(former_attrs[attr])
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
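As the error message itself suggests, replacing np.float with the builtin float in the failing line of export_mesh.py should avoid the crash on recent NumPy versions. A minimal sketch of the same pattern, using a made-up list of areas in place of the mesh components:

```python
import numpy as np

# Stand-in for [c.area for c in components]; the values are made up.
component_areas = [0.5, 2.0, 1.25]

# dtype=float is the drop-in replacement for the removed dtype=np.float.
areas = np.array(component_areas, dtype=float)
largest_component = int(np.argmax(areas))
print(areas.dtype, largest_component)
```

Pinning NumPy to a release that still ships the alias would presumably also work, at the cost of keeping an old dependency.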