imgdataopt: A C repository from pts

README for imgdataopt: raster (bitmap) image data size optimizer
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
imgdataopt is a command-line tool that converts PNG and PNM raster (bitmap)
image formats to each other, doing some lossless size optimizations such as
converting RGB to grayscale or indexed (palette), converting high bit depth
to low, and saving the PNG file with a high flate compression level.
imgdataopt is written in ANSI C (C89). For much slower alternatives which
produce smaller PNG files, see PNGOUT, zopflipng, optipng, ECT, advpng.

How to use:

* Compile it using `make', or use one of the released binary executables on
  http://github.com/pts/imgdataopt/releases .

* Run the conversion on the command line (terminal window) as:

    ./imgdataopt input.img output.img

  The input file format is autodetected.

  The output file format is derived from the filename extension of output.img.

  The output filename can be the same as the input filename.

  You can pass command-line flags in front of input.img, for example, do
  this to get grayscale output:

    ./imgdataopt -s:grays input.img output.img

* There is no GUI.

Features and comparison
~~~~~~~~~~~~~~~~~~~~~~~
* imgdataopt is a command-line tool; there is no GUI.

* imgdataopt is relatively fast. It's much faster than the PNG optimizers
  PNGOUT, zopflipng, optipng, ECT, advpng. As a tradeoff, imgdataopt produces
  slightly larger PNG output than those.

* imgdataopt does only lossless processing: it keeps all the colors intact,
  and it retains the original width and height.

* imgdataopt is idempotent: running it again on the output with the same
  flags produces an output image file identical to the first output.

* imgdataopt is small: it's less than 2500 lines of C code in a single file
  (excluding zlib). Similar image conversion tools such as sam2p and
  ImageMagick have much larger codebases, and some are implemented in C++
  (thus they may be more complicated to understand and they may have more
  library dependencies).

* imgdataopt is portable: it is written in ANSI C (C89), and it's ported
  officially to Linux (i386 and amd64 etc.), macOS 10.5 or later, and Windows
  (Win32, Windows 95 or later).

* imgdataopt compiles and works with a wide variety of programming language
  versions and compilers:

  * C89: gcc -ansi
  * C99: gcc -std=c99
  * C11: gcc -std=c11
  * C++98: g++ -ansi
  * C++11: g++ -std=c++0x
  * C++14: g++ -std=c++1y
  * gcc -m32 (i386) and gcc -m64 (amd64)
  * gcc-4.1 ... gcc-4.4 ... gcc-4.8 ... gcc-7.3 ...
  * g++-4.1 ... g++-4.4 ... g++-4.8 ... g++-7.3 ...
  * i686-w64-mingw32-gcc (gcc 4.8.2)
  * clang-3.0 ... clang-3.5 ...
  * clang++-3.0 ... clang++-3.5 ...
  * o32-clang and o64-clang on macOS
  * tcc 0.9.25 ... 0.9.26 ...

* imgdataopt is self-contained: it doesn't have any external dependencies
  other than libc (the C standard library) and zlib (-lz). There is also a
  copy of zlib bundled with imgdataopt in zlib_src. Other tools such as
  ImageMagick have dozens of library dependencies, with libpng (-lpng) as a
  minimum. sam2p also depends on the external tool png22pnm (or
  pngtopnm).

* imgdataopt doesn't support all features of image file formats when
  reading, e.g. it is not able to read interlaced PNG, it ignores PNG gamma
  correction and transparency, and it is not able to read ASCII PNM.

* imgdataopt ignores and strips metadata (such as comments and digital
  camera info such as EXIF).

* imgdataopt doesn't support transparency or alpha channel (e.g. RGBA): it
  can read and write only images without an alpha channel, and all pixels
  are assumed to have an implicit alpha value of 255 (fully opaque). sam2p
  supports paletted images with transparency, i.e. the pixel alpha values of
  0 (fully transparent) and 255 (fully opaque), but no arbitrary alpha
  values.

* imgdataopt has some tests (see the png_test directory) for various bit
  depths and predictors.

* imgdataopt uses a reasonable amount of memory: it uses about 300 000 + 3 *
  width * (height + 6) bytes of memory (including the code size). That
  is, it can keep an uncompressed RGB8 version of the image in memory. 300
  kB is needed for the code, the ZIP compression window (of 32 kB) and
  other buffers. (In fact, it uses even less memory: the multiplier 3 will
  be only 1 if the input image has the colorspace Gray or Indexed.)

* imgdataopt can read temporary PNG files generated by pdfsizeopt.

* imgdataopt can convert RGB images to grayscale and indexed (palette) etc.,
  and does it by default.

* imgdataopt can convert an 8-bits-per-sample image to 4, 2 and 1, and does
  it by default, picking the smallest value. (It also converts
  4-bits-per-sample to 2 or 1, and 2-bits-per-sample to 1 if possible.)

* imgdataopt can write PNG files with smart predictor selection for each
  row. It does it by default, just like sam2p does it. The explicit
  command-line flag is -c:zip:25:9 for both imgdataopt and sam2p.

* imgdataopt can write PNG files with the None predictor in each row
  (like sam2p with the -c:zip:10:9 flag).

* imgdataopt can write PNG-like files without a per-row predictor specified.
  This is compatible with PDF /Filter /FlateDecode without any /Predictor.
  To get this non-conforming PNG output, use `imgdataopt -j:00
  -c:zip:1:9'. (sam2p ignores -j:00, and writes a PNG with the regular
  predictor byte in front of each row.)

* imgdataopt works successfully even if the input image data is truncated,
  has the wrong checksum, or is too long. In this case it will succeed and
  use color 0 for the rest of the image. (This is used for processing
  bad image objects in pdfsizeopt.)

* imgdataopt can write PNG-like files with RGB and bit depth smaller than 8.
  To get this, use `imgdataopt -j:00'. (sam2p ignores -j:00 and upgrades the
  bit depth to 8.)
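
The memory estimate in the feature list above can be worked through as a
small C sketch. The function name estimate_memory_bytes and its multiplier
parameter are hypothetical (they are not part of imgdataopt); the formula is
the one stated above, with the multiplier 3 for RGB input and 1 for Gray or
Indexed input.

```c
/* Hypothetical helper, not taken from imgdataopt: evaluates the estimate
 * 300 000 + multiplier * width * (height + 6) bytes from the feature list.
 * multiplier is 3 for RGB input, 1 for Gray or Indexed input.
 */
unsigned long estimate_memory_bytes(unsigned long width,
                                    unsigned long height,
                                    unsigned multiplier) {
  return 300000UL + multiplier * width * (height + 6UL);
}
```

For a 1000 x 1000 image this gives 3 318 000 bytes for RGB input and
1 306 000 bytes for Gray or Indexed input.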
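
Converting RGB to grayscale is lossless only if every pixel has equal red,
green and blue samples. A minimal sketch of such a check follows; the
function name rgb8_is_gray is hypothetical, and the code illustrates the
idea rather than the actual implementation in imgdataopt.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if every RGB8 pixel has R == G == B (so
 * the image can be stored as grayscale without changing any color), and 0
 * otherwise. rgb points to 3 * pixel_count bytes.
 */
int rgb8_is_gray(const unsigned char *rgb, size_t pixel_count) {
  size_t i;
  for (i = 0; i < pixel_count; ++i) {
    const unsigned char *p = rgb + 3 * i;
    if (p[0] != p[1] || p[1] != p[2]) return 0;
  }
  return 1;
}
```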
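
The bit depth reduction described above can be illustrated for grayscale
samples. The sketch below assumes the usual PNG scaling, in which an n-bit
sample v stands for the 8-bit value v * 255 / (2^n - 1), so an 8-bit sample
fits in 4 bits if it is a multiple of 17, in 2 bits if it is a multiple of
85, and in 1 bit if it is 0 or 255. The function name min_gray_bit_depth is
hypothetical, and imgdataopt may implement this differently (for indexed
images the palette size matters instead).

```c
#include <stddef.h>

/* Hypothetical helper: returns the smallest bit depth (1, 2, 4 or 8) at
 * which all 8-bit gray samples can be represented exactly.
 */
int min_gray_bit_depth(const unsigned char *samples, size_t count) {
  int depth = 1;
  size_t i;
  for (i = 0; i < count && depth < 8; ++i) {
    unsigned char v = samples[i];
    if (v % 17 != 0) {
      depth = 8;  /* Not one of 0x00, 0x11, ..., 0xff. */
    } else if (depth < 4 && v % 85 != 0) {
      depth = 4;  /* Not one of 0x00, 0x55, 0xaa, 0xff. */
    } else if (depth < 2 && v % 255 != 0) {
      depth = 2;  /* Not 0x00 or 0xff. */
    }
  }
  return depth;
}
```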
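
To illustrate what per-row predictor selection means, here is a sketch of
the minimum-sum-of-absolute-differences heuristic suggested by the PNG
specification, restricted to three of the five PNG predictors (None, Sub
and Up). This README does not say which heuristic imgdataopt uses, so treat
this as an example of the general technique only; the names abs_cost and
pick_row_filter are hypothetical.

```c
#include <stddef.h>

/* Absolute value of a filtered byte interpreted as signed (-128..127). */
static unsigned long abs_cost(unsigned char b) {
  return b < 128 ? (unsigned long)b : 256UL - b;
}

/* Hypothetical helper: returns the PNG predictor (0=None, 1=Sub, 2=Up)
 * whose filtered row has the smallest sum of absolute values. cur and prev
 * have row_size bytes each; prev is the previous unfiltered row (all zeros
 * for the first row); bpp is the number of bytes per pixel (at least 1).
 */
int pick_row_filter(const unsigned char *cur, const unsigned char *prev,
                    size_t row_size, size_t bpp) {
  unsigned long cost[3] = {0, 0, 0};
  size_t i;
  int best = 0, f;
  for (i = 0; i < row_size; ++i) {
    unsigned char left = i >= bpp ? cur[i - bpp] : 0;
    cost[0] += abs_cost(cur[i]);
    cost[1] += abs_cost((unsigned char)(cur[i] - left));
    cost[2] += abs_cost((unsigned char)(cur[i] - prev[i]));
  }
  for (f = 1; f < 3; ++f) {
    if (cost[f] < cost[best]) best = f;
  }
  return best;
}
```

A smooth horizontal gradient tends to select Sub, and a row identical to
the previous one selects Up.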
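
The recovery from truncated or too long image data described above can be
sketched as follows: copy as many decoded bytes as fit, ignore any excess,
and fill the missing rest with zero bytes (color 0). The function name
fit_image_data and its signature are hypothetical; this is an illustration
under these assumptions, not the code used by imgdataopt.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fills out (expected_size bytes) from decoded
 * (decoded_size bytes). Excess input is ignored, missing input is replaced
 * by zero bytes. Returns the number of bytes that had to be zero-filled.
 */
size_t fit_image_data(unsigned char *out, size_t expected_size,
                      const unsigned char *decoded, size_t decoded_size) {
  size_t used = decoded_size < expected_size ? decoded_size : expected_size;
  memcpy(out, decoded, used);
  memset(out + used, 0, expected_size - used);
  return expected_size - used;
}
```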

Motivation
~~~~~~~~~~
imgdataopt was designed as a fast backend tool for image size optimization
in pdfsizeopt, as a replacement of sam2p. pdfsizeopt will find images
embedded in the PDF, create a PNG file for each, call image optimizers such
as imgdataopt, and replace the PDF image object data with the output of the
optimizers.

The reasons why a replacement of sam2p was designed:

* As discussed in pts/pdfsizeopt#51, it's hard to implement
  recovery from truncated, too long or bad-checksum image data with sam2p.

* As discussed in pts/pdfsizeopt#52, it's hard to make sam2p
  find the right tool (png22pnm or pngtopnm) to parse PNG files. One way
  to solve this is to make sam2p able to read PNG files directly. The other
  way is to write a brand new tool (i.e. imgdataopt) which can do it.

* sam2p has too many features not needed by pdfsizeopt.

* sam2p is implemented in C++, and it has a somewhat brittle build system. A
  small and simple tool written in C such as imgdataopt is much more
  future-compatible.

See also pts/pdfsizeopt#51 for more comments on
the motivation and initial design of imgdataopt.

Copyright
~~~~~~~~~
imgdataopt is written and owned by Péter Szabó <pts@fazekas.hu> (except for
the zlib_src directory).

imgdataopt may be used, modified and redistributed only under the terms of the
GNU General Public License, version 2 or newer, found at

  http://www.fsf.org/licenses/gpl.html

__END__