How do you resize FFHQ? Is it PIL's bicubic?
Hi, how do you resize FFHQ? Do you use PIL's bicubic implementation? That is, img.resize((target_size, target_size), Image.Resampling.BICUBIC)?
I down-scaled FFHQ in advance and saved the results to a folder, using OpenCV bicubic down-sampling. The code I used to do so is not included in this GitHub repo. I simply loaded the images with a for loop, downsampled them, and saved the results.
The line of code you mentioned has no effect unless you choose to load images whose size differs from the size you wish to use for training.
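In other words, the only case where that line matters is something like this (a rough sketch with hypothetical names and paths, not the repo's actual loader):

from PIL import Image

target_size = 512  # training resolution (hypothetical value for illustration)
img = Image.open("some_image.png")  # placeholder path

# The resize only matters when the stored image size differs from the training size;
# otherwise the image is already at target_size and the line changes nothing.
if img.size != (target_size, target_size):
    img = img.resize((target_size, target_size), Image.Resampling.BICUBIC)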
Hi @ohayonguy , I am concerned that pre-processing differences (resizing) will harm reproducibility. Furthermore, I recommend PIL's resize implementation, since it processes images with anti-aliasing; please refer to clean-fid.
I am sorry, I wrote that I used OpenCV, but that's a mistake. I checked my code, and I actually did use PIL's implementation. Here is the code I used to down-scale FFHQ:
import os
from concurrent.futures import ProcessPoolExecutor, as_completed

from PIL import Image
from tqdm import tqdm


def downscale_image(input_path, output_path, size=(512, 512)):
    # Down-scale a single image with PIL's bicubic filter and save it.
    with Image.open(input_path) as img:
        img_resized = img.resize(size, Image.BICUBIC)
        img_resized.save(output_path)
    return input_path


def process_images(input_folder, output_folder, size=(512, 512)):
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    image_files = [f for f in os.listdir(input_folder) if f.lower().endswith(('.png', '.jpg', '.jpeg', '.gif', '.bmp'))]
    total_images = len(image_files)

    # Down-scale all images in parallel across worker processes.
    with ProcessPoolExecutor() as executor:
        futures = []
        for image_file in image_files:
            input_path = os.path.join(input_folder, image_file)
            output_path = os.path.join(output_folder, image_file)
            futures.append(executor.submit(downscale_image, input_path, output_path, size))

        with tqdm(total=total_images, desc="Processing Images", unit="image") as pbar:
            for future in as_completed(futures):
                processed_file = future.result()  # This will raise any exceptions that occurred during processing
                pbar.update(1)
                pbar.set_postfix(file=os.path.basename(processed_file), refresh=False)


process_images(
    input_folder="./data/ffhq1024",
    output_folder="./data/ffhq512",
    size=(512, 512),
)
Note that this is only for resizing the ground-truth images. To resize degraded images (e.g., low-resolution ones) back to 512x512, we use the bilinear kernel, which is common practice.
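For example (just a sketch, not taken verbatim from our code), bringing a low-resolution input back to 512x512 with the bilinear kernel can look like this in torch:

import torch
import torch.nn.functional as F

# A degraded (low-resolution) input of shape (batch, channels, 128, 128); values in [0, 1].
lr = torch.rand(1, 3, 128, 128)

# Upsample the degraded input back to the training resolution with the bilinear kernel.
lr_up = F.interpolate(lr, size=(512, 512), mode="bilinear", align_corners=False)
print(lr_up.shape)  # torch.Size([1, 3, 512, 512])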
I disagree that resizing FFHQ using bilinear instead of bicubic will cause significant differences in training. It will just mildly change the distribution of the ground-truth images. Regardless, we did mention explicitly in the paper that we used the bicubic kernel, so I am not sure what you mean by "harm reproducibility." Could you please elaborate? :)
> I disagree that resizing FFHQ using bilinear instead of bicubic will cause significant differences in training. It will just mildly change the distribution of the ground-truth images
Yes, it should only have a mild influence on visual results, but as stated in the clean-fid paper, the choice of image-processing library can affect metrics like FID significantly. In terms of reproducibility, your work is very good already, thanks! I just suggest adding instructions on how to resize FFHQ to further enhance it.
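For reference, computing FID with clean-fid is a one-liner (a minimal sketch; the folder paths are placeholders):

from cleanfid import fid

# Folder paths are placeholders; clean-fid applies its own consistent preprocessing.
score = fid.compute_fid("./data/ffhq512", "./results/restored")
print("FID:", score)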
@Luciennnnnnn Ahh, I see what you mean. Though, clean-fid is quite an old package, and the image you shared showing the antialiasing differences caused by different packages is from 3 years ago. Do you think the same issues will arise with more recent versions of torch (e.g., >=2), which do implement antialiasing for the bilinear and bicubic kernels by default?
I will add the instructions on resizing FFHQ! Many thanks for pointing this out! Appreciate it :)
torchvision has already resolved its aliasing issues and now aligns with PIL's behavior in recent versions. However, OpenCV may still keep its original behavior. If you're interested in exploring this further, I encourage you to test it out; it's simple to do. By the way, just mentioning what you use in your experiments is enough for users.
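For example, a quick check like this (a sketch; the image path is a placeholder) compares PIL's bicubic downscale with torchvision's antialiased one:

import numpy as np
from PIL import Image
from torchvision.transforms import InterpolationMode
from torchvision.transforms import functional as TF

img = Image.open("some_ffhq_image.png").convert("RGB")  # placeholder path

# PIL bicubic downscale (what the preprocessing script above uses).
pil_small = np.asarray(img.resize((512, 512), Image.BICUBIC), dtype=np.float32)

# torchvision bicubic downscale with antialiasing enabled.
tv_small = TF.resize(TF.to_tensor(img), [512, 512],
                     interpolation=InterpolationMode.BICUBIC, antialias=True)
tv_small = tv_small.permute(1, 2, 0).numpy() * 255.0

# If recent torchvision matches PIL, this difference should be small.
print("max abs difference:", np.abs(pil_small - tv_small).max())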