elixir-waffle/waffle

Add a post-transform step/callback

jcelliott opened this issue · 10 comments

I'm looking for a way to extract some metadata about the file after the transform step has run. Currently, the store implementation deletes the temp file immediately after storing it. Is there a way to get access to that file today, or is there a feature we could add to do that? The only option I can see with the current implementation is to re-download the file and process it afterwards.

It depends on what kind of metadata you want.

If you just want the size of a file, you can make a simple HEAD request and check the response headers.
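For reference, the HEAD request idea could look roughly like the sketch below. It uses only `:httpc` from OTP; the module name `RemoteFileSize` and its functions are hypothetical and not part of Waffle. It assumes the file URL is publicly readable (a private S3 object would need a signed URL), and depending on the OTP version, HTTPS URLs may need explicit `ssl` options in the third argument of `:httpc.request/4`.

```elixir
defmodule RemoteFileSize do
  # Hypothetical helper, not part of Waffle.

  # Issues a HEAD request and returns {:ok, size_in_bytes} or {:error, reason}.
  def fetch(url) when is_binary(url) do
    {:ok, _} = Application.ensure_all_started(:inets)
    {:ok, _} = Application.ensure_all_started(:ssl)

    case :httpc.request(:head, {String.to_charlist(url), []}, [], []) do
      {:ok, {{_version, 200, _reason}, headers, _body}} -> content_length(headers)
      {:ok, {{_version, status, _reason}, _headers, _body}} -> {:error, {:http_status, status}}
      {:error, reason} -> {:error, reason}
    end
  end

  # Finds the Content-Length header (case-insensitive) in a list of
  # {name, value} pairs, given as charlists or binaries.
  def content_length(headers) do
    Enum.find_value(headers, {:error, :no_content_length}, fn {name, value} ->
      if String.downcase(to_string(name)) == "content-length" do
        {:ok, String.to_integer(to_string(value))}
      end
    end)
  end
end
```

This only covers the file size. Anything that needs the file contents still requires the file itself.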

Yeah, I thought of that, but unfortunately it's more complex than that. I need access to the actual file. It just seems unfortunate to have to re-download the file when we theoretically already have access to it.

Right now there is no straightforward way to skip the cleanup action.

Basically, I see two other options:

  • process the file up front and let Waffle manage only the upload step
  • define a custom transformation and make your custom logic part of that transformation

Like this:

def transform(:thumb, _) do
  {:elixir, "run optimise.exs"}
end
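The first option (processing up front) could be sketched as below. Everything here is hypothetical: `UpFrontProcessing`, `transform_fun`, and `store_fun` are placeholder names, and `store_fun` stands in for whatever wraps your uploader's store call. The point is only that the metadata is read from the transformed file before it is handed off and cleaned up.

```elixir
defmodule UpFrontProcessing do
  # Hypothetical helper, not part of Waffle.
  #
  # transform_fun: (input_path, output_path -> :ok | {:error, term})
  # store_fun:     (path -> {:ok, term} | {:error, term}), assumed to wrap
  #                the uploader's store call with no further transformation.
  def run(input_path, transform_fun, store_fun) do
    output_path =
      Path.join(
        System.tmp_dir!(),
        "upfront-#{System.unique_integer([:positive])}#{Path.extname(input_path)}"
      )

    try do
      with :ok <- transform_fun.(input_path, output_path),
           # Read metadata while the transformed file is still on disk.
           {:ok, %File.Stat{size: size}} <- File.stat(output_path),
           {:ok, stored} <- store_fun.(output_path) do
        {:ok, stored, %{size: size}}
      end
    after
      # Remove the temporary file whether or not the upload succeeded.
      File.rm(output_path)
    end
  end
end
```

Other metadata (audio duration, image dimensions) would be collected at the same point as `File.stat/1`, using whichever tool you already rely on.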

I looked into the custom transformations, but I didn't see any straightforward way to get information out of the transform step. I do like the up-front processing idea though, I might try that. Is this something that might have wide enough appeal to add to the library itself? The only other related ticket I found was just for getting the file size, which you can do easily with a HEAD request like you mentioned.

We would love this feature for changelog.com where we're using Waffle to upload mp3 files and the transform step to write ID3 tags.

After we perform the transformation we want to get the new file size and the duration of the audio to be stored alongside the episode in our database. Our current setup uses local storage, so I find the path to the transformed file and do my processing on that. But we're switching to S3 storage for our uploads and it looks like I'd have to re-download each mp3 (50-100 MB) to get this info for now.

I'll look into the pre-process idea, but having an opportunity to hook in after the transform step would be nice for sure.

caspg commented

How could we get the width and height of the resized images? Would I have to download them (from s3 in my case) and read dimensions? Is there any way to avoid fetching files from s3?

> How could we get the width and height of the resized images?

I'm not sure I get the issue. I think you already know the dimensions of the resized image. If you mean the original one, you can take the image source and parse its dimensions before passing it to Waffle.

A second way would be to encode the dimensions in the filename directly during the initial upload process.

caspg commented

I'm resizing using -resize 2000\> which resizes larger images to fit this width and keeps aspect ratio for height. Because of that I can't know the dimension of the output file (maybe I'm wrong?).

> A second way would be to encode the dimensions in the filename directly during the initial upload process.

But with my example above, I would still need to get the dimensions of the output file in the first place, right?

I get it, but knowing the original file's dimensions, you can calculate the dimensions of the output file yourself.
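A minimal sketch of that calculation for the `-resize 2000\>` case, assuming the usual ImageMagick behaviour for a width-only geometry with the `>` flag (shrink only when the image is wider than the limit, keep the aspect ratio). The module and function names are hypothetical, and ImageMagick's own rounding may differ by a pixel in edge cases.

```elixir
defmodule ResizeDimensions do
  # Hypothetical helper, not part of Waffle.
  #
  # Predicts the output {width, height} of a shrink-only resize to max_width.
  def shrink_to_width(width, height, max_width)
      when is_integer(width) and is_integer(height) and is_integer(max_width) and
             width > 0 and height > 0 and max_width > 0 do
    if width > max_width do
      # Scale the height by the same factor as the width, never below 1 px.
      {max_width, max(round(height * max_width / width), 1)}
    else
      # Images already within the limit are left untouched.
      {width, height}
    end
  end
end
```

For example, `ResizeDimensions.shrink_to_width(4000, 3000, 2000)` gives `{2000, 1500}`, while a 1200x800 image keeps its size.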

caspg commented

@achempion oh yeah, you are right. Thanks :)