The code can work normally, but the error between the estimated value and the real value is particularly large.
liaoyu1992 opened this issue · 10 comments
Here are the actual values and the estimated values.
The error between the estimated values and the real values is particularly large.
I checked and debugged the code. In the function static bool readPTZFeatureLocationAndDescriptors in btdtr_ptz_util.cpp, keypoint_data, descriptor_data and ptz_data are all Eigen::MatrixXf. Because the precision is not enough, the error after reading the decimal places is very large. I guess this is one of the reasons. Are there any other suggestions?
Thank you very much.
The result is worse than expected. Here are the main steps of the program:
- Extract SIFT features and save them to a .mat file (prepare_train_data_ptz.m). Make sure the features are from the background, not from players.
- Train a random forest and predict the pan/tilt (btdtr_ptz_test_soccer.cpp). At this step, you can check the precision of the predicted pan/tilt/zoom values for each SIFT feature. There should be enough inliers.
- Estimate the pan-tilt-zoom values of the camera.
I suspect there is something wrong in steps 1 and 2.
Eigen::MatrixXf is not a problem because 'float' is accurate enough for SIFT.
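To make the precision argument concrete, here is a minimal sketch that measures the relative error introduced by storing a value as a 32-bit float. The function name `floatRelativeError` and the sample value are hypothetical and not part of the repository. A float has a 24-bit significand, so the relative rounding error stays below about 6e-8.

```cpp
#include <cassert>
#include <cmath>

// Relative error introduced by storing a double value as a 32-bit float.
// Hypothetical helper for illustration only; not part of the repository.
double floatRelativeError(double value) {
    assert(value != 0.0);  // relative error is undefined for zero
    const float stored = static_cast<float>(value);
    return std::fabs(static_cast<double>(stored) - value) / std::fabs(value);
}
```

For example, `floatRelativeError(12.3456789)` (an assumed pan angle in degrees) is far smaller than any error that would matter for a pan/tilt estimate, so a large PTZ error has to come from somewhere else.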
Hope that helps.
This is about extracting the SIFT features and saving them to a .mat file (prepare_train_data_ptz.m).
As shown in the figure, there are a lot of feature points in the auditorium, but very few on the playing field. Is this correct?
Also, how big a data set is best for training? Should it contain many different shots or a single shot? What are the best values of the training parameters in BTDTRTreeParameter?
From this image, the feature points are good enough, because the random forest and RANSAC will remove some outliers (the ones in the auditorium).
If you use the World Cup dataset, you can match the images using their pan angles. Images with similar pan angles have a large overlap, so more good feature matches are kept as training examples.
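A minimal sketch of the pairing idea, assuming the pan angle of each image is known in degrees. The function name `pairImagesByPan` and the threshold `maxPanDiff` are hypothetical and would need tuning for a real dataset.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Return index pairs (i, j) with i < j whose pan angles differ by at most
// maxPanDiff degrees. Such pairs are likely to overlap and are good
// candidates for feature matching. Hypothetical helper, not repository code.
std::vector<std::pair<std::size_t, std::size_t>>
pairImagesByPan(const std::vector<double>& panDegrees, double maxPanDiff) {
    assert(maxPanDiff >= 0.0);
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (std::size_t i = 0; i < panDegrees.size(); ++i) {
        for (std::size_t j = i + 1; j < panDegrees.size(); ++j) {
            if (std::fabs(panDegrees[i] - panDegrees[j]) <= maxPanDiff) {
                pairs.emplace_back(i, j);
            }
        }
    }
    return pairs;
}
```

With pan angles {10, 12, 30, 31.5} and a threshold of 3 degrees, only images 0 and 1 and images 2 and 3 are paired.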
The training parameters are here:
sampled_frame_num 15
pp_x 640
pp_y 360
tree_num 5
max_tree_depth 20
max_balanced_depth 4
max_sample_num 1000
min_leaf_node 1
min_split_node 1
candidate_dim_num 6
candidate_threshold_num 10
min_split_node_std_dev 0.1
verbose 0
verbose_leaf 0
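If you want to check these values programmatically, the following sketch parses text in the layout shown above, assuming one whitespace-separated name and numeric value per line. The function `parseParameters` is hypothetical and is not the reader used by the repository.

```cpp
#include <map>
#include <sstream>
#include <string>

// Parse "name value" lines into a map. Lines that do not contain a name
// followed by a number are skipped. Hypothetical helper for illustration.
std::map<std::string, double> parseParameters(const std::string& text) {
    std::map<std::string, double> params;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string name;
        double value = 0.0;
        if (fields >> name >> value) {
            params[name] = value;
        }
    }
    return params;
}
```

Calling `parseParameters("tree_num 5\nmax_tree_depth 20\n")` gives a map in which `tree_num` is 5 and `max_tree_depth` is 20.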
The training set is a random sample of 50% of the images from two games (BRA vs. MEX and BRA vs. NED). The other 50% is used as testing data.
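A random split of this kind can be sketched as follows. The function name `splitTrainTest`, the fixed seed, and the use of image indices are assumptions for illustration. The original split was not necessarily produced this way.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Shuffle image indices 0..imageCount-1 and split them into a training
// part and a testing part. Hypothetical helper, not repository code.
std::pair<std::vector<std::size_t>, std::vector<std::size_t>>
splitTrainTest(std::size_t imageCount, double trainFraction, unsigned seed) {
    assert(trainFraction >= 0.0 && trainFraction <= 1.0);
    std::vector<std::size_t> indices(imageCount);
    std::iota(indices.begin(), indices.end(), std::size_t{0});
    std::mt19937 rng(seed);  // fixed seed keeps the split reproducible
    std::shuffle(indices.begin(), indices.end(), rng);
    const auto trainCount = static_cast<std::ptrdiff_t>(
        static_cast<double>(imageCount) * trainFraction);
    std::vector<std::size_t> train(indices.begin(), indices.begin() + trainCount);
    std::vector<std::size_t> test(indices.begin() + trainCount, indices.end());
    return {train, test};
}
```

`splitTrainTest(n, 0.5, seed)` returns two disjoint index lists that together cover all `n` images.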
Hope that helps.
I assume the table shows the ground truth PTZ and estimated PTZ. It looks good.
A new image should be from the same stadium and the same PTZ camera as the training set. To process it:
- extract SIFT features and save them as a .mat file
- use btdtr_ptz_test_soccer.cpp to predict the PTZ.
You can also modify the source code to make this easier.
If the new image is from another stadium or another camera, this method does not work.
The algorithm randomly selects two point features to get an initial PTZ camera. It generates many such hypotheses. From the output, it looks like all hypotheses failed. The error occurs because the point feature matching has too many outliers. You can draw these points in the image to verify their locations, for example, to check that they lie on static background objects.
This is expected because the second image has fewer point features. If you want to get the camera parameters for the whole video, an alternative is to use PTZ SLAM https://github.com/lood339/Pan-tilt-zoom-SLAM .
This is an unfinished project and I am still working on it.
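To see why a high outlier rate makes every hypothesis fail, the standard RANSAC estimate is useful. If a fraction w of the matches are inliers, a random two-point sample is outlier-free with probability w^2, and the number of samples needed to draw at least one such sample with a given confidence is log(1 - confidence) / log(1 - w^2). The sketch below computes this. The function name `requiredHypotheses` is hypothetical, and the repository may choose its hypothesis count differently.

```cpp
#include <cassert>
#include <cmath>

// Number of random minimal samples needed so that, with probability
// `confidence`, at least one sample contains only inliers.
// Hypothetical helper for illustration; not repository code.
int requiredHypotheses(double inlierRatio, int sampleSize, double confidence) {
    assert(inlierRatio > 0.0 && inlierRatio < 1.0);
    assert(sampleSize > 0);
    assert(confidence > 0.0 && confidence < 1.0);
    const double allInliers = std::pow(inlierRatio, sampleSize);
    return static_cast<int>(
        std::ceil(std::log(1.0 - confidence) / std::log(1.0 - allInliers)));
}
```

With two-point samples and 99% confidence, a 50% inlier ratio needs 17 hypotheses, while a 10% inlier ratio needs several hundred. If the number of hypotheses is fixed and the inlier ratio is very low, it becomes likely that no hypothesis is built from two good matches.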
OK, thank you!!