samastur/image-diet

Refactor/rewrite image-diet

I am really unhappy with the way I wrote image-diet. It bothers me that it supports only EasyThumbnails and filesystem storage. It bothers me that the tools and their parameters are baked in, so you can't add a new tool or change its parameters without mucking around in the code. This also makes it impossible to create a test suite that finds the combination of tools and settings that works best in your scenario. It's a mess.

So I would like to rewrite/refactor it with the following goals:

  • still easy to get going (dead simple support for filesystem storage)
  • can also support other storage backends and thumbnail apps with a minimum of fuss
  • adding new tools should be easy, as should changing their parameters
  • changing configuration should require no programming knowledge

It would be nice if the current version's settings still worked, but that is not a requirement.

Ideas on how to do this

Django has supported custom storage backends for a while now, and the new version of image-diet would come with two components: a storage mixin that squeezes images as they flow to storage, and a filesystem backend that uses it, so that it serves both as an example of mixin use and as an easy default choice. The mixin will still require access to a filesystem (it can be a temporary one), since at least some optimizing tools can only work with files.
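A rough sketch of what such a mixin could look like. It hooks into `_save(name, content)`, which is the method Django storage backends implement, but to stay runnable without Django it is demonstrated here on a stand-in in-memory storage. The names `DietMixin`, `squeeze_file`, `InMemoryStorage` and `DietInMemoryStorage` are made up for illustration, and the squeezing step is a no-op placeholder.

```python
import os
import tempfile


class DietMixin:
    """Squeeze file content on its way to the underlying storage's _save()."""

    def squeeze_file(self, path):
        # Placeholder (hypothetical hook): a real version would run the
        # configured optimizing tools on the file at `path`, in place.
        return path

    def _save(self, name, content):
        # Some tools only work on real files, so spill the content to a
        # temporary file that keeps the original extension.
        suffix = os.path.splitext(name)[1]
        fd, tmp_path = tempfile.mkstemp(suffix=suffix)
        try:
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(content.read())
            self.squeeze_file(tmp_path)
            # Hand the (possibly smaller) file on to the real storage.
            with open(tmp_path, "rb") as optimized:
                return super()._save(name, optimized)
        finally:
            os.remove(tmp_path)


class InMemoryStorage:
    """Stand-in for a storage backend; NOT a Django class."""

    def __init__(self):
        self.files = {}

    def _save(self, name, content):
        self.files[name] = content.read()
        return name


class DietInMemoryStorage(DietMixin, InMemoryStorage):
    """The mixin goes first so its _save() wraps the backend's."""
```

With a real Django backend the same pattern would apply, with the mixin listed before the storage class; the reopened file would probably need to be wrapped in `django.core.files.File` before being passed on, since Django backends expect its interface.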

The processing pipeline should know as little as possible about images or processing. Its main purpose would be to identify the type of image (file), select the right pipeline definition, and manage the processing pipeline (run tools in the order specified in the configuration file and provide seamless handling of stdin/file inputs).
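The three responsibilities could be sketched roughly as follows. This is only an illustration under assumptions: the configuration layout (`tools`, `pipelines`, `command`, `input`) and the function names are invented here, and type detection is reduced to checking a few well-known file signatures.

```python
import subprocess

# Leading bytes ("magic numbers") of a few common image formats.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF8": "gif",
}


def identify(data):
    """Return a type name for the given leading bytes, or None if unknown."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return None


def run_tool(tool, path):
    """Run one tool on the file at `path`, hiding the stdin/file difference."""
    if tool["input"] == "stdin":
        # Tool reads from stdin and writes the result to stdout.
        with open(path, "rb") as source:
            data = source.read()
        result = subprocess.run(
            tool["command"], input=data, stdout=subprocess.PIPE, check=True
        )
        with open(path, "wb") as target:
            target.write(result.stdout)
    else:
        # Tool takes a file name and modifies the file in place.
        subprocess.run(tool["command"] + [path], check=True)


def process(path, config):
    """Identify the file, pick its pipeline and run the tools in order."""
    with open(path, "rb") as source:
        kind = identify(source.read(16))
    for tool_name in config["pipelines"].get(kind, []):
        run_tool(config["tools"][tool_name], path)
    return kind
```

Nothing in `process` is specific to images: an unknown type simply gets an empty pipeline and the file is left untouched.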

Everything else should be stored in a configuration file (a default one that could be overridden by the user's). The configuration file would store information about the tools themselves: location, configuration options and type (whether a tool reads data from stdin or from a file). It would also define a pipeline for each image type (which tools are used for processing, and in what order). I am not sure which format would be better: Python (dicts & co.) or something like YAML.
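To make the Python-dict option concrete, here is one possible shape for a default configuration and a user override. The keys (`tools`, `pipelines`, `command`, `input`) and the `merge_config` helper are assumptions for illustration only; the tool commands are examples that assume the programs are on `PATH`, and the flags should be checked against each tool's manual.

```python
import copy

DEFAULT_CONFIG = {
    "tools": {
        # "command" holds the location plus options; "input" is the tool type.
        "optipng": {"command": ["optipng", "-o5"], "input": "file"},
        "jpegtran": {"command": ["jpegtran", "-optimize"], "input": "stdin"},
    },
    "pipelines": {
        # Tools run in the order listed for each file type.
        "png": ["optipng"],
        "jpeg": ["jpegtran"],
    },
}


def merge_config(default, user):
    """Return a new config where user entries override default ones.

    Merging is per section: a user-supplied tool or pipeline replaces the
    default entry with the same name and leaves all others alone.
    """
    merged = copy.deepcopy(default)
    for section, entries in user.items():
        merged.setdefault(section, {}).update(copy.deepcopy(entries))
    return merged


# A user who only wants a faster optipng run overrides that one entry.
USER_CONFIG = {
    "tools": {"optipng": {"command": ["optipng", "-o2"], "input": "file"}},
}
CONFIG = merge_config(DEFAULT_CONFIG, USER_CONFIG)
```

A YAML file with the same nesting would load into an identical structure, so the choice of format would mostly affect who can comfortably edit it.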

Adding a new tool or even a new image type could then be done entirely through configuration. All it would take is adding a tool definition and a new type section with its pipeline.
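As an illustration, adding GIF support might amount to nothing more than the two entries below, plus a sanity check that every pipeline refers to a defined tool. The layout and the `validate_config` helper are hypothetical, and the gifsicle invocation is an example to be verified against its manual.

```python
# User-supplied additions: one new tool and one new type section.
GIF_SUPPORT = {
    "tools": {
        "gifsicle": {"command": ["gifsicle", "--batch", "-O2"], "input": "file"},
    },
    "pipelines": {
        "gif": ["gifsicle"],
    },
}


def validate_config(config):
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    tools = config.get("tools", {})
    for name, tool in tools.items():
        if not tool.get("command"):
            problems.append(f"tool {name!r}: missing command")
        if tool.get("input") not in ("stdin", "file"):
            problems.append(f"tool {name!r}: input must be 'stdin' or 'file'")
    for kind, pipeline in config.get("pipelines", {}).items():
        for name in pipeline:
            if name not in tools:
                problems.append(f"pipeline {kind!r}: unknown tool {name!r}")
    return problems
```

Reporting all problems at once, rather than failing on the first, seems friendlier to users who edit the configuration without programming knowledge.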

If it were done well, it would not even care whether the files in question are images, and could in principle process other files too (but that is not its goal).

Thoughts, suggestions and criticism are very welcome.

This has more or less already happened with pyimagediet and image-diet2 (both of which are already on PyPI).

They still need documentation and some polishing, but they are clearly the way forward.