paolobarbolini/img-parts

Improper EXIF header using WebP set_exif function

akinsella opened this issue · 4 comments

Describe the bug

WebP EXIF chunk is not read successfully after being written with the library.

WebP specification mentions that exif chunk header value should be a 4 bytes value (EXIF). I can see in source code that chunk header value is 6 bytes long in lib.rs on line 82:

pub(crate) const EXIF_DATA_PREFIX: &[u8] = b"Exif\0\0";

After writing exif chunk on a WebP image, then on following image read the error "Invalid TIFF byte order" is being returned.

Reproduction

Source code below can be used to reproduce the issue. It will require 2 dependencies:

Dependencies:

image = { version = "0.24.7", features = ["webp-encoder", "avif-encoder"] }
kamadak-exif = "0.5.5"

Sample WebP file: (Ultra minimalist with 400 bytes):

Source code: (Code might not be optimal)

use image::EncodableLayout;
use exif::experimental::Writer;

use exif::parse_exif;
use std::{env, io::BufReader, fs::File};
use exif::Field;

use img_parts::ImageEXIF;

use std::io::prelude::*;

pub fn convert_to_bytes(exif_fields: &[Field]) -> Result<Vec<u8>, exif::Error> {
    let mut writer = Writer::new();
    let mut buf = std::io::Cursor::new(Vec::new());

    exif_fields.iter().for_each(|f| {
        writer.push_field(f);
    });

    writer.write(&mut buf, false)?;

    Ok(buf.into_inner())
}

fn main() {
    let args = env::args().collect::<Vec<String>>();

    println!("args: {:?}", args);

    let input_file_path = &args[1];
    let output_file_path = &args[2];
    println!("File path - file_path: {:?}", input_file_path);
    println!("File path - output_file: {:?}", output_file_path);

    let file = File::open(input_file_path).expect("failed to open file");
    let read_result = exif::Reader::new().read_from_container(&mut BufReader::new(&file));
    let exif = read_result.unwrap();

    let exif_data_buf: Vec<u8> = exif.buf().to_vec();
    println!("exif_bytes: {:?}", exif_data_buf);
    let exif_data = exif_data_buf.as_bytes();
    let exif_fields = parse_exif(exif_data).ok().unwrap().0;

    for f in exif_fields.iter() {
        println!("exif_data: {:?}", f);
    }

    let file = File::open(input_file_path).expect("failed to open file");
    let mut reader = BufReader::new(file);
    let buffer = reader.fill_buf().unwrap();

    let buffer_length = buffer.len();
    println!("buffer_length: {:?}", buffer_length);

    let buf = buffer.to_owned();
    let bytes = buf.into();

    let mut webp = img_parts::webp::WebP::from_bytes(bytes).unwrap();
    let output_exif_bytes = convert_to_bytes(&exif_fields).unwrap().to_vec();
    println!("exif_bytes: {:?}", output_exif_bytes);

    let exif_bytes = Some(output_exif_bytes.into());
    let exif_bytes_unwrapped = exif_bytes.clone().unwrap();
    println!("exif_bytes: {:?}", exif_bytes_unwrapped);

    webp.set_exif(exif_bytes);
    let mut output_file = std::fs::File::create(output_file_path).unwrap();
    let result = webp.encoder().write_to(&mut output_file).unwrap();

    println!("Result size: {}", result);
}

Execute this code a first time (after adding it in examples directory):

RUST_BACKTRACE=1 cargo run --example webp-exif ~/Downloads/exif.webp ~/Downloads/exif-out.webp

Then execute it a second time :

RUST_BACKTRACE=1 cargo run --example webp-exif ~/Downloads/exif-out.webp ~/Downloads/exif-out-2.webp

The 2nd execution should trigger an error.

It can be verified as well thanks to exiftool binary (brew install exiftool) that mentions a warning which is not present for initial file:

❯ exiftool ~/Downloads/exif-out.webp

ExifTool Version Number : 12.70
File Name : exif-out.webp
Directory : /Users/alexis.kinsella/Downloads
File Size : 406 bytes
File Modification Date/Time : 2023:12:05 21:55:03+01:00
File Access Date/Time : 2023:12:05 21:55:05+01:00
File Inode Change Date/Time : 2023:12:05 21:55:03+01:00
File Permissions : -rw-r--r--
File Type : Extended WEBP
File Type Extension : webp
MIME Type : image/webp
WebP Flags : EXIF
Image Width : 15
Image Height : 7
VP8 Version : 0 (bicubic reconstruction, normal loop)
Horizontal Scale : 0
Vertical Scale : 0
Warning : [minor] Improper EXIF header
Exif Byte Order : Big-endian (Motorola, MM)
Image Description : WebP test
X Resolution : 72
Y Resolution : 72
Resolution Unit : inches
Software : GIMP + implex + kamadak-exif
Exif Version : 0232
Image Size : 15x7
Megapixels : 0.000105

Expected behavior

It should write exif chunk successfully into a WebP image, and not prevent reading of exif data on the newly generated image with an exif library such as kamada-exif.

Proposed solution

Change of constant EXIF_DATA_PREFIXvalue from 6 bytes to 4 bytes seems to work as expected:

pub(crate) const EXIF_DATA_PREFIX: &[u8] = b"Exif";

Change of EXIF_DATA_PREFIX value to a 4bytes value has been tested and seems to behave successfully at least for the purpose of writing exif into a WebP image. However the overall impact of this change has not been thoroughly assessed in all possible ways. Other use of this value in the codebase might be problematic or not.

Additional context

All in all, I am not sure whether it is expected to have a 6 bytes long constant of 4 bytes long or if I missed some key point. It seems to me that value should be 4 bytes long and not 6 bytes long based on specification (metadata section): Metadata

Ok, the proposed fix finally does not seem to be working as expected as exiftool does not parse anymore the EXIF chunk, even thought it is still able to identify and extract chunk with the following command: exiftool -b ~/Downloads/exif-out.webp.

Comparing binary hex of image, I wonder if header shouldn't be empty as I finally got a satisfying result as shown below.
Resulting image with exif section rewritten seems to be correct when header remains empty.

Here are some screenshots comparing hexdump of initial image:

CleanShot 20231206-075449

Then the image modified with img-parts with EXIF_DATA_PREFIX: &[u8] = b"Exif\0\0":

CleanShot 20231206-075628

Then the image modified with img-parts with EXIF_DATA_PREFIX: &[u8] = b"":

CleanShot 20231206-075501

WebP specification states that Chunk header should be 4 bytes long with uppercase. I still misunderstand the purpose of a 6 bytes long header with 4 mixed case characters (Exif instead of EXIF) following by 2 bytes of value 0.

I was able to find additional images with this header: Exif\0\0. exiftool says it has "Improper EXIF header" as well, which does not help to make sure whether it is ok or not.
kamadak-exif returns Invalid TIFF byte order for such images and eventually fails to parse EXIF.
Would you have some reference documents for WebP EXIF format that you relied on for the development of the library that would help to understand what should be expected ?

Workarounded the issue.