UB-Mannheim/tesseract

libwebp in windows executable

dhouse13 opened this issue · 5 comments

Current Behavior

When installing the Windows versions provided here (https://github.com/UB-Mannheim/tesseract/wiki), the newest version (5.3.1.20230401) contains libwebp 1.3.0, which has a zero-day vulnerability.

Similarly, the version we use (5.0.0-alpha.20190708) contains libwebp 0.6.1, which also has a zero-day vulnerability.

We do not use the WebP functionality (directly), but removing the DLL or replacing it with a fixed version causes failures.

Expected Behavior

Update the supported versions of the Tesseract Windows installer to include a non-vulnerable version of libwebp.

Suggested Fix

No response

tesseract -v

tesseract v5.3.1.20230401
leptonica-1.83.1
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.4) : libpng 1.6.39 : libtiff 4.5.0 : zlib 1.2.13 : libwebp 1.3.0 : libopenjp2 2.5.0
Found AVX512BW
Found AVX512F
Found AVX512VNNI
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found libarchive 3.6.2 zlib/1.2.13 liblzma/5.2.9 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.2
Found libcurl/8.0.1 Schannel zlib/1.2.13 brotli/1.0.9 zstd/1.5.4 libidn2/2.3.4 libpsl/0.21.2 (+libidn2/2.3.3) libssh2/1.10.0


tesseract v5.0.0-alpha.20190708
leptonica-1.78.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
Found AVX512BW
Found AVX512F
Found AVX2
Found AVX
Found SSE
Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5

Operating System

Windows 11

Other Operating System

No response

uname -a

No response

Compiler

N/A, using your pre-built installers

CPU

N/A

Virtualization / Containers

n/a

Other Information

Details on how we can compile our own version with a good copy of libwebp could also solve the problem, particularly on the version we currently use (5.0.0-alpha.20190708)

stweil commented

Replacing the relevant DLL by a newer compatible one (from msys2) should work.

And the code is only used for WebP images which are still very rare. It is possible to avoid or minimize the risk if you either don't process such images or only process WebP images from trusted sources.
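The mitigation described above (not handing WebP files to Tesseract unless they come from a trusted source) can be sketched as a small pre-filter. This is only an illustration: it assumes that checking the 12-byte RIFF/WEBP container header is enough to recognise WebP input, and the function names `is_webp` and `split_webp` are made up for this example.

```python
def is_webp(path):
    """Return True if the file starts with a RIFF container tagged WEBP."""
    with open(path, "rb") as f:
        header = f.read(12)
    # WebP files begin with "RIFF", a 4-byte size field, then "WEBP".
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WEBP"


def split_webp(paths):
    """Split paths into (non-WebP files, WebP files to skip or review)."""
    accepted, skipped = [], []
    for path in paths:
        (skipped if is_webp(path) else accepted).append(path)
    return accepted, skipped
```

Only the files in the first list would then be passed on to OCR. Checking the content rather than the file extension matters here, because a renamed WebP file would still be decoded by libwebp.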

stweil commented

Details on how we can compile our own version with a good copy of libwebp could also solve the problem, particularly on the version we currently use (5.0.0-alpha.20190708)

The build script make-installer.sh is part of the sources. And all installer versions are tagged, for example release v5.0.0-alpha.20190708. You still have to get a working Debian build environment which requires some work. Use the cross build GitHub action as a starting point.

Replacing the relevant DLL by a newer compatible one (from msys2) should work.

And the code is only used for WebP images which are still very rare. It is possible to avoid or minimize the risk if you either don't process such images or only process WebP images from trusted sources.

Sadly, we tried this and it broke our tooling, which is why we are looking at other solutions.

We discovered why replacing the libwebp-7 DLL did not work: in the newest version (1.3.2), libwebp-7 has an additional dependency, libsharpyuv-0.dll. We are still testing, but adding that DLL does seem to solve our problem. We used the files from here: https://packages.msys2.org/package/mingw-w64-x86_64-libwebp
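A quick way to catch this kind of missing dependency after swapping DLLs is to check that both files sit in the Tesseract install directory. The sketch below is a minimal example under assumptions: the helper name `missing_dlls` is hypothetical, the DLL names are the two mentioned in this thread, and it only checks for presence, not for version or loadability.

```python
from pathlib import Path

# DLL names taken from this thread; other builds may need a different set.
REQUIRED_DLLS = ("libwebp-7.dll", "libsharpyuv-0.dll")


def missing_dlls(install_dir, required=REQUIRED_DLLS):
    """Return the required DLL names that are not present in install_dir."""
    # Windows file names are case-insensitive, so compare in lower case.
    present = {p.name.lower() for p in Path(install_dir).iterdir() if p.is_file()}
    return [name for name in required if name.lower() not in present]
```

Calling `missing_dlls()` with the directory that contains `tesseract.exe` returns an empty list when both files are there.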

stweil commented

I think this issue was fixed by the installer for Tesseract 5.3.3. Please reopen if it still exists.
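To confirm what a given install actually reports, the libwebp version can be read from the `tesseract -v` output shown in the issue. This is a rough sketch: the function names are invented for the example, and the threshold of 1.3.2 is an assumption based on the version mentioned earlier in this thread, so adjust it to whatever your security advisory requires.

```python
import re

# Assumed minimum acceptable version, taken from the discussion above.
MIN_LIBWEBP = (1, 3, 2)


def libwebp_version(version_text):
    """Extract the libwebp version from `tesseract -v` output, or None."""
    match = re.search(r"libwebp\s+(\d+(?:\.\d+)+)", version_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_outdated(version_text, minimum=MIN_LIBWEBP):
    """Return True if libwebp is reported and older than the minimum."""
    version = libwebp_version(version_text)
    return version is not None and version < minimum
```

Feeding it the library line from the 5.3.1.20230401 output above yields `(1, 3, 0)`, which is below the assumed threshold. Note that a DLL swapped in by hand may not be reflected the same way as a version the installer was built against, so treat the result as a hint rather than proof.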