alexvbogdan/DeepCalib

Focal results interpretation and accuracy

Closed this issue · 6 comments

I have run the SingleNet classification model on a few test images. I obtain 260 as the focal result for most of my tests (but not all). For one of the tests, I know the photo was taken with a 26mm camera, and I got focal=260 out of your model. What is the unit of your focal results? Pixels? How do I convert that to a focal length in mm?
Thanks a lot for your research and help on this.

Can you give a few more details? For example, were the test images taken with the same camera? Did you take those pictures yourself or download them from somewhere? What were the sizes of the images?
Also, it might be useful to check #1, which describes how to use the predicted focal length. It seems that 260px is the prediction for a 299x299 image. If the network predicts 260px for each test image but their original sizes are different, then the real focal length will be different for each of them.

Yes, the units are pixels. The conversion formula is the following: f = (F / sensor_width) x w, where F is the focal length in mm, f is the focal length in px, sensor_width is the width of the sensor in mm and w is the image width in px (if height > width, use the height). It is usually computed for the x and y directions separately, but in our case we assumed that the focal length is the same in both directions.
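As a minimal sketch of the formula above: the function name `focal_mm_to_px` is only an illustration and not part of DeepCalib, and the example values (4.2 mm lens, 5.76 mm sensor width, 4032 x 3024 image) are assumptions chosen for the demonstration.

```python
def focal_mm_to_px(focal_mm, sensor_width_mm, image_width_px, image_height_px):
    """Convert a focal length from mm to pixels: f = (F / sensor_width) * w.

    Hypothetical helper, not part of the DeepCalib code base.
    sensor_width_mm is assumed to be the sensor dimension that matches the
    longer image side.
    """
    if sensor_width_mm <= 0:
        raise ValueError("sensor_width_mm must be positive")
    # Use the longer image side, as suggested for portrait images.
    longer_side_px = max(image_width_px, image_height_px)
    return focal_mm / sensor_width_mm * longer_side_px


# Assumed example values: 4.2 mm lens, 5.76 mm sensor, 4032 x 3024 image.
print(round(focal_mm_to_px(4.2, 5.76, 4032, 3024)))  # 2940
```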

Dear Tetsujinfr,

As described by Alex in the previous message, the focal length is provided in pixels. Thus, knowledge about the sensor (size of a pixel in mm) is required to convert the estimate from pixels to mm.

Additionally, it would be interesting to have some information about the images you aim to calibrate. Keep in mind that this network was trained for wide field-of-view cameras, and its performance is not guaranteed if your focal length is not covered by the training set (long focal lengths typically fall outside this range).

Best

Thanks for your replies. I attach 17 photos, most of which I found on the internet ([https://www.photographyblog.com/]), where the 35mm equivalent focal length was mentioned. This way I have tried to be consistent and not get fooled by crop factors and things like that. I ran the Single_net classification model with your weights; below are the results I obtained.
Please let me know if I am doing something nonsensical, and how I should interpret the focal results from the model inference. Ultimately, I am trying to infer camera intrinsic parameters for images with unknown source, like the "unknown_fov.jpg" image attached below.

Thanks a lot for your help and guidance.

```
Photo file                           Single_net classifier results   Stated focal length (35mm equivalent)
unknown_fov.jpg                      focal: 260  dist: 0.0           unknown
26mm_photo.jpg                       focal: 260  dist: 0.48          26mm
14mm_SigmaLens_photoCompareMED.jpg   focal: 180  dist: 0.2           14mm
16mm_SigmaLens_photoCompareMED.jpg   focal: 180  dist: 0.3           16mm
17mm_SigmaLens_photoCompareMED.jpg   focal: 350  dist: 0.68          17mm
19mm_SigmaLens_photoCompareMED.jpg   focal: 350  dist: 0.52          19mm
21mm_SigmaLens_photoCompareMED.jpg   focal: 350  dist: 0.88          21mm
24mm_SigmaLens_photoCompareMED.jpg   focal: 480  dist: 0.68          24mm
13mm_IP11_photo1MED.jpg              focal: 150  dist: 0.22          13mm
26mm_S9_photo1MED.jpg                focal: 290  dist: 0.28          26mm
31mm_S5_photo1MED.jpg                focal: 380  dist: 0.0           31mm
105mm_D780_photo1MED.jpg             focal: 480  dist: 0.0           105mm
14mm_SigmaLens_photo2MED             focal: 190  dist: 0.0           14mm
24mm_D780_photo1MED.jpg              focal: 290  dist: 0.78          24mm
33mm_CoolPixP950_photo1MED.jpg       focal: 320  dist: 0.04          33mm
43mm_CoolPixP950_photo1MED.jpg       focal: 290  dist: 0.14          43mm
85mm_D780_photo1MED.jpg              focal: 490  dist: 0.0           85mm
```

test_photos.zip

I have checked the images you provided. You are not doing anything wrong, except for how you interpret the results. As I mentioned above, #1 gives a simple guide on how to interpret the predicted focal length. I will make it clear using the example you provided.

The real focal length for a particular camera is calculated as follows: Real_focal = (Image_width / 299) * predicted_focal (if height > width, then use height). The width of 26mm_photo.jpg is 4032, which means that the real focal length in pixels for this image will be (4032 / 299) * 260 = 3506.
As you can see, the 105mm image and the 85mm image have the same size but different focal lengths in mm. However, our predictions for them are almost the same. This is the kind of error you might get using this network, as we never said the accuracy is 100%.
Anyway, the output is still meaningful, as the predicted focal length in pixels grows with the physical one in mm. In addition, none of the images you provided has strong visible distortion, and the network output for distortion is pretty close to 0 in most cases. Do not be misled by the output for 24mm_SigmaLens_photoCompareMED.jpg, since with a bigger focal length the distortion parameter has a weaker effect on the image. In terms of distortion output, 24mm_D780_photo1MED.jpg is clearly an outlier.
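The rescaling step can be sketched in a few lines of Python. The helper name `rescale_predicted_focal` is hypothetical, and the image height of 3024 px is an assumption (only the width of 4032 px was checked above).

```python
NETWORK_INPUT_SIZE = 299  # side length of the image the prediction refers to


def rescale_predicted_focal(predicted_focal_px, image_width_px, image_height_px):
    """Scale a focal length predicted for a 299x299 input to the original image.

    Hypothetical helper: Real_focal = (longer_side / 299) * predicted_focal.
    """
    longer_side_px = max(image_width_px, image_height_px)
    return longer_side_px / NETWORK_INPUT_SIZE * predicted_focal_px


# 26mm_photo.jpg: predicted focal 260, width 4032 (height 3024 is assumed).
print(round(rescale_predicted_focal(260, 4032, 3024)))  # 3506
```

Note that this only holds for the uncropped image: the scale factor assumes the full frame was resized to the network input.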

Regarding conversion from pixels to mm, you need to know the sensor width in mm.
I hope this answer helped.

By the way, I just checked one of the images, 26mm_photo.jpg. It says 26mm in the image name, but when I opened the properties of this image, the focal length provided by the camera (probably from the EXIF data) was 4.2mm. In this case, it is a phone camera, so it is clear that a 2.6cm focal length is impossible and 4.2mm makes sense. I also found that the sensor width is 5.76 mm.

From the formula in the previous comment, we can get the focal length in mm as F = (f x sensor_width) / w.
Substituting these numbers into the formula gives us 3506 * 5.76 / 4032 ~= 5mm, which is close to the 4.2mm reported by the camera. So, you can use this network to get an approximate number, but if some constraint is violated, e.g. the image is cropped, then the result will be incorrect.
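The same substitution, written as a small Python sketch (the function name `focal_px_to_mm` is hypothetical, not part of the repository):

```python
def focal_px_to_mm(focal_px, sensor_width_mm, image_width_px):
    """Convert a focal length from pixels to mm: F = (f * sensor_width) / w.

    Hypothetical helper; sensor_width_mm must come from the camera's
    specifications, since it cannot be recovered from the image itself.
    """
    if image_width_px <= 0:
        raise ValueError("image_width_px must be positive")
    return focal_px * sensor_width_mm / image_width_px


# Values from this thread: f = 3506 px, sensor width 5.76 mm, image width 4032 px.
print(round(focal_px_to_mm(3506, 5.76, 4032), 2))  # 5.01
```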

Thanks for taking the time to clarify all this, that is super useful. It is clear now, and I guess I need to strengthen my optics knowledge.
The 26mm photo file was marked 26mm because it is converted to a 35mm sensor (full frame) equivalence, but you are correct, it is an S7 phone with a real physical 4.2mm focal length. Actually, all the file names are marked with the full-frame equivalent focal length, so I am not sure whether I confused everything here. That would definitely blur the lines for smartphone photos, and possibly even for APS-C type DSLR camera sensors.
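For anyone comparing the file names with the model output, here is a rough sketch of the full-frame equivalence. It assumes a width-based crop factor with a 36 mm full-frame width; the equivalence is more commonly computed from the sensor diagonal, so the numbers shift a little for sensors with a different aspect ratio. Both function names are hypothetical.

```python
FULL_FRAME_WIDTH_MM = 36.0  # width of a 35mm "full frame" sensor


def equivalent_focal_35mm(focal_mm, sensor_width_mm):
    """35mm-equivalent focal length from the physical focal length.

    Assumption: width-based crop factor (36 mm / sensor width).
    """
    return focal_mm * FULL_FRAME_WIDTH_MM / sensor_width_mm


def equivalent_focal_35mm_from_px(focal_px, image_width_px):
    """35mm-equivalent focal length from a focal length in pixels.

    The sensor width cancels out, so only the ratio f / w is needed.
    Assumes an uncropped image whose width covers the full sensor width.
    """
    return focal_px / image_width_px * FULL_FRAME_WIDTH_MM


# Physical 4.2 mm lens on a 5.76 mm wide sensor (values from this thread).
print(round(equivalent_focal_35mm(4.2, 5.76), 2))  # 26.25
# Network prediction of 260 px for the 299 px wide network input.
print(round(equivalent_focal_35mm_from_px(260, 299), 1))  # 31.3
```

Under these assumptions the 4.2 mm lens does correspond to the 26 mm in the file name, and the prediction of 260 maps to roughly 31 mm equivalent, so the full-frame figures in the file names can be compared with the predictions without knowing each sensor width.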