How do you resize FFHQ? Is it PIL's bicubic?
Hi, how do you resize FFHQ? Do you use PIL's bicubic implementation? That is, img.resize((target_size, target_size), Image.Resampling.BICUBIC)?
I down-scaled FFHQ in advance and saved the results to a folder, using OpenCV bicubic down-sampling. The code I used to do so is not included in this GitHub repo. I simply loaded the images with a for loop, downsampled them, and saved the results.
The line of code you mentioned has no effect unless you choose to load images whose size differs from the size you wish to use for training.
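In other words, the only case where that line matters is something like this (a rough sketch with hypothetical names and paths, not the repo's actual loader):

from PIL import Image

target_size = 512  # training resolution (hypothetical value for illustration)
img = Image.open("some_image.png")  # placeholder path

# The resize only matters when the stored image size differs from the training size;
# otherwise the image is already at target_size and the line changes nothing.
if img.size != (target_size, target_size):
    img = img.resize((target_size, target_size), Image.Resampling.BICUBIC)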
Hi @ohayonguy , I am concerned that pre-processing differences (resizing) will harm reproducibility. Furthermore, I recommend PIL's resize implementation, since it processes images with anti-aliasing; please refer to clean-fid.
I am sorry, I wrote that I used OpenCV, but that's a mistake. I checked my code, and I actually did use PIL's implementation. Here is the code I used to down-scale FFHQ:
import os
from concurrent.futures import ProcessPoolExecutor, as_completed

from PIL import Image
from tqdm import tqdm


def downscale_image(input_path, output_path, size=(512, 512)):
    # Down-scale a single image with PIL's bicubic filter and save it.
    with Image.open(input_path) as img:
        img_resized = img.resize(size, Image.BICUBIC)
        img_resized.save(output_path)
    return input_path


def process_images(input_folder, output_folder, size=(512, 512)):
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    image_files = [f for f in os.listdir(input_folder) if f.lower().endswith(('.png', '.jpg', '.jpeg', '.gif', '.bmp'))]
    total_images = len(image_files)

    # Down-scale all images in parallel across worker processes.
    with ProcessPoolExecutor() as executor:
        futures = []
        for image_file in image_files:
            input_path = os.path.join(input_folder, image_file)
            output_path = os.path.join(output_folder, image_file)
            futures.append(executor.submit(downscale_image, input_path, output_path, size))

        with tqdm(total=total_images, desc="Processing Images", unit="image") as pbar:
            for future in as_completed(futures):
                processed_file = future.result()  # This will raise any exceptions that occurred during processing
                pbar.update(1)
                pbar.set_postfix(file=os.path.basename(processed_file), refresh=False)


process_images(
    input_folder="./data/ffhq1024",
    output_folder="./data/ffhq512",
    size=(512, 512),
)
Note that this is only for resizing the ground-truth images. To resize degraded images (e.g., low-resolution ones) back to 512x512, we use the bilinear kernel, which is common practice.
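For example (just a sketch, not taken verbatim from our code), bringing a low-resolution input back to 512x512 with the bilinear kernel can look like this in torch:

import torch
import torch.nn.functional as F

# A degraded (low-resolution) input of shape (batch, channels, 128, 128); values in [0, 1].
lr = torch.rand(1, 3, 128, 128)

# Upsample the degraded input back to the training resolution with the bilinear kernel.
lr_up = F.interpolate(lr, size=(512, 512), mode="bilinear", align_corners=False)
print(lr_up.shape)  # torch.Size([1, 3, 512, 512])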
I disagree that resizing FFHQ using bilinear instead of bicubic will cause significant differences in training. It will just mildly change the distribution of the ground-truth images. Regardless, we did mention explicitly in the paper that we used the bicubic kernel, so I am not sure what you mean by "harm reproducibility." Could you please elaborate? :)
> I disagree that resizing FFHQ using bilinear instead of bicubic will cause significant differences in training. It will just mildly change the distribution of the ground-truth images
Yes, it should only have a mild influence on visual results, but as stated in the clean-fid paper, the choice of image-processing library can affect metrics like FID significantly. In terms of reproducibility, your work is very good already, thanks! I just suggest adding instructions on how to resize FFHQ to further enhance it.
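For reference, computing FID with clean-fid is a one-liner (a minimal sketch; the folder paths are placeholders):

from cleanfid import fid

# Folder paths are placeholders; clean-fid applies its own consistent preprocessing.
score = fid.compute_fid("./data/ffhq512", "./results/restored")
print("FID:", score)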
@Luciennnnnnn Ahh, I see what you mean. Though, clean-fid is quite an old package, and the image you shared showing the antialiasing differences caused by different packages is from 3 years ago. Do you think the same issues will arise with more recent versions of torch (e.g., >=2), which do implement antialiasing for the bilinear and bicubic kernels by default?
I will add the instructions on resizing FFHQ! Many thanks for pointing this out! Appreciate it :)
torchvision has already resolved its aliasing issues and now aligns with PIL's behavior in recent versions. However, OpenCV may still keep its original behavior. If you're interested in exploring this further, I encourage you to test it out; it's simple to do. By the way, just mentioning what you use in your experiments is enough for users.
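For example, a quick check like this (a sketch; the image path is a placeholder) compares PIL's bicubic downscale with torchvision's antialiased one:

import numpy as np
from PIL import Image
from torchvision.transforms import InterpolationMode
from torchvision.transforms import functional as TF

img = Image.open("some_ffhq_image.png").convert("RGB")  # placeholder path

# PIL bicubic downscale (what the preprocessing script above uses).
pil_small = np.asarray(img.resize((512, 512), Image.BICUBIC), dtype=np.float32)

# torchvision bicubic downscale with antialiasing enabled.
tv_small = TF.resize(TF.to_tensor(img), [512, 512],
                     interpolation=InterpolationMode.BICUBIC, antialias=True)
tv_small = tv_small.permute(1, 2, 0).numpy() * 255.0

# If recent torchvision matches PIL, this difference should be small.
print("max abs difference:", np.abs(pil_small - tv_small).max())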