OpenScanEu/OpenScan

crop() destroys quality and EXIF metadata

ExtReMLapin opened this issue · 12 comments

Meshroom uses the EXIF metadata.

With the crop() call commented out:

ExifTool Version Number         : 11.54
File Name                       : nocrop.jpg
Directory                       : .
File Size                       : 4.4 MB
File Modification Date/Time     : 2020:10:05 01:36:04+02:00
File Access Date/Time           : 2020:10:27 13:34:59+01:00
File Creation Date/Time         : 2020:10:27 13:34:35+01:00
File Permissions                : rw-rw-rw-
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg
Exif Byte Order                 : Big-endian (Motorola, MM)
Make                            : RaspberryPi
Camera Model Name               : RP_imx219
X Resolution                    : 72
Y Resolution                    : 72
Resolution Unit                 : inches
Modify Date                     : 2020:10:05 01:36:04
Y Cb Cr Positioning             : Centered
Exposure Time                   : 1/50
F Number                        : 2.0
Exposure Program                : Aperture-priority AE
ISO                             : 100
Exif Version                    : 0220
Date/Time Original              : 2020:10:05 01:36:04
Create Date                     : 2020:10:05 01:36:04
Components Configuration        : Y, Cb, Cr, -
Shutter Speed Value             : 1/50
Aperture Value                  : 2.0
Brightness Value                : 0.97
Max Aperture Value              : 2.0
Metering Mode                   : Center-weighted average
Flash                           : No Flash
Focal Length                    : 3.0 mm
Maker Note Unknown Text         : (Binary data 340 bytes, use -b option to extract)
Flashpix Version                : 0100
Color Space                     : sRGB
Exif Image Width                : 3280
Exif Image Height               : 2464
Interoperability Index          : R98 - DCF basic file (sRGB)
Exposure Mode                   : Auto
White Balance                   : Auto
Compression                     : JPEG (old-style)
Thumbnail Offset                : 1058
Thumbnail Length                : 24576
Image Width                     : 3280
Image Height                    : 2464
Encoding Process                : Baseline DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
Aperture                        : 2.0
Image Size                      : 3280x2464
Megapixels                      : 8.1
Shutter Speed                   : 1/50
Thumbnail Image                 : (Binary data 24576 bytes, use -b option to extract)
Focal Length                    : 3.0 mm
Light Value                     : 7.6

With the crop() call left in (not commented out):

File Name                       : crop.jpg
Directory                       : .
File Size                       : 698 kB
File Modification Date/Time     : 2020:10:05 01:34:34+02:00
File Access Date/Time           : 2020:10:27 13:35:16+01:00
File Creation Date/Time         : 2020:10:27 13:33:06+01:00
File Permissions                : rw-rw-rw-
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg
JFIF Version                    : 1.01
Resolution Unit                 : None
X Resolution                    : 1
Y Resolution                    : 1
Image Width                     : 2464
Image Height                    : 3280
Encoding Process                : Baseline DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
Image Size                      : 2464x3280
Megapixels                      : 8.1
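The difference between the two dumps (an Exif block in the first, only a JFIF header in the second) can also be checked without ExifTool. The following is a minimal sketch, not part of the OpenScan firmware, that walks the JPEG marker segments and reports whether an APP1 "Exif" segment is present. The function names are made up for illustration.

```python
import struct


def jpeg_app_segments(data):
    """Return (marker, identifier) pairs for the APPn segments found before the scan data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    found = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost sync with the marker stream, stop scanning
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan / end of image
            break
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if 0xE0 <= marker <= 0xEF:  # APP0 .. APP15
            payload = data[pos + 4:pos + 2 + length]
            ident = payload.split(b"\x00", 1)[0].decode("ascii", "replace")
            found.append((marker, ident))
        pos += 2 + length
    return found


def has_exif(data):
    """True if the file carries an APP1 segment identified as 'Exif'."""
    return any(m == 0xE1 and ident == "Exif" for m, ident in jpeg_app_segments(data))
```

Given the dumps above, `has_exif(open("nocrop.jpg", "rb").read())` would be expected to return True and the same call on crop.jpg False, assuming the files match what ExifTool reported.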

It also destroys quality even when no crop/resize is actually applied.

One good way would be to write the (raw?) pixels to a stream and encode them to a file only when needed, instead of saving the file (first lossy pass) and then re-encoding it on rotate/crop (second lossy pass).
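A minimal sketch of the stream idea, assuming a picamera-style camera object whose `capture(output, format=..., **options)` method writes into a file-like object. `capture_to_stream` is a hypothetical helper, not something that exists in the OpenScan code.

```python
import io


def capture_to_stream(camera, fmt="jpeg", **options):
    """Capture one frame into memory and return its bytes.

    `camera` is assumed to behave like picamera.PiCamera: capture() accepts a
    file-like output, a format name and format-specific keyword options.
    """
    stream = io.BytesIO()
    camera.capture(stream, format=fmt, **options)
    return stream.getvalue()


# On a Raspberry Pi this might be used as follows (not runnable elsewhere):
#
#   import picamera
#   with picamera.PiCamera() as camera:
#       jpeg_bytes = capture_to_stream(camera, "jpeg", quality=95)
#       with open("scan_0001.jpg", "wb") as f:  # hypothetical file name
#           f.write(jpeg_bytes)                 # bytes written unchanged, no second encode
```

Writing the captured bytes straight to disk means the image is encoded exactly once. Passing `fmt="rgb"` would return unencoded pixels instead, which could then be encoded in whatever format is wanted. That encoding step is left out here.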

It would also open the door to more file formats (RAW, DNG...).

Thank you for these very valid points! I am currently reworking the firmware and have removed the crop functionality, so there is no longer any unnecessary quality loss from double compression. I am still testing the new firmware, but I hope to release it within the next week. (Sorry, I had to take some time off, but now I am 100% back on the project :) See this list of the current changes: https://github.com/OpenScanEu/OpenScan/blob/master/update_universal.log

Unfortunately, the problem with the metadata still remains, as it is caused by the picamera Python library. From its documentation:

"If resize is specified, or use_video_port is True, Exif metadata will not be included in JPEG output. This is due to an underlying firmware limitation."
(https://picamera.readthedocs.io/en/release-1.10/api_camera.html)

I know that Meshroom asks for the metadata and will show a little icon next to the images, but Meshroom can still be used without metadata. I haven't seen any visible loss in quality when using an image set without metadata.

"Unfortunately, the problem with the metadata still remains, as it is caused by the picamera python library"

Well that's quite surprising, as right now, removing the crop() calls literally brings back the metadata.

Right now the use_video_port argument is not used, unless it is in the next version?

Ah yes, I was a bit unclear:
In the current version, crop() is used to crop and rotate the image, which means a second round of opening, manipulating and saving the file --> loss of quality and loss of metadata.
In the upcoming version, I am using the resize option of camera.capture: https://picamera.readthedocs.io/en/release-1.10/api_camera.html#picamera.camera.PiCamera.capture
This will also remove the metadata but should not decrease the overall quality (?!)
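Why a second save costs quality can be illustrated with a toy model. This is not real JPEG code: it only mimics the quantization step, where each value is rounded to a multiple of a step size. The two different step sizes are an assumption standing in for two encodes that do not share the same rounding grid (for example after a crop or rotation shifts the 8x8 blocks).

```python
import math


def quantize(value, step):
    """Round value to the nearest multiple of step (ties round up)."""
    return step * math.floor(value / step + 0.5)


def encode_once(value, step):
    """One lossy pass: a single rounding."""
    return quantize(value, step)


def encode_twice(value, first_step, second_step):
    """Two lossy passes: the second rounding works on already-rounded data."""
    return quantize(quantize(value, first_step), second_step)


original = 14
print(encode_once(original, 7))       # single pass with step 7
print(encode_twice(original, 10, 7))  # step 10 first, then step 7
```

Here the single pass keeps 14 unchanged, while the double pass ends at 7: the first rounding moves the value to 10, and the second rounding then picks a different multiple of 7 than it would have picked for the original value.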

Alright, understood !

Here is a side-by-side comparison of JPG vs. PNG. I really do not see any difference.

Here is a really interesting article about RAW vs. JPG: https://peterfalkingham.com/2020/05/22/photogrammetry-does-shooting-raw-or-jpg-make-a-difference/

JPG compression works on blocks of 8x8 pixels, which means that the smaller the object you're scanning, the bigger the compression problem becomes, because the losses make up a larger percentage of the object's pixels.

I haven't had the occasion to scan a small black object with double compression disabled, so I cannot give any input on that.
In the article written by Peter, he's not scanning dark, homogeneous-ish objects.

Here is an example of where it's a big issue (but keep in mind there was a crop() call, so it ended up being compressed twice).

It's clearly not a straight line because of JPEG compression.

It was an object I needed to edit and re-print with VERY HIGH accuracy.

As the double JPEG compression bug is fixed, the next version will probably do the job as it is.

I see what you mean.
Anyway, there is another option to increase the quality of the scan output. I assume that you used some kind of powder to cover the surface? Powder tends to give quite uneven coverage: some areas are fully covered, others are not covered at all. I am a big fan of chalk spray or, when it comes to more delicate objects, scanning spray (namely Aesub White or Blue). Here is one example of a 5 cm tall figurine sprayed with Aesub White.
It took some practice to get such even coverage, but it is totally worth it compared to all the other methods I've tried.

I'll give chalk spray a try the next time I have to scan a small black object.

Right now I've been using chalk (not spray, just chalk) and baby powder, which is more affordable.

You can get chalk spray (marking spray) at the department store for around 3-5€ a bottle, which will last for 200+ benchys. Dry shampoo should work similarly, but I haven't tried it myself :)

That's good to know, because I only found chalk spray on Amazon for 20€.

Fixed