CERT-Polska/mwdb-core

Certain files cause error code 500 on upload

yankovs opened this issue · 4 comments

Environment information

  • MWDB version (from /about):
  • Installation method:
    • mwdb.cert.pl service
    • From PyPi (pip install mwdb-core)
    • From docker-compose
    • Other (please explain)
  • Plugins installed: None

Behaviour of the bug (what happened?)

Some files cause error 500 when trying to upload:

[screenshot of the error]

On the web side, the error is:

2023-07-04 09:33:48 6:33:48 AM [vite] http proxy error at /api/file:
2023-07-04 09:33:48 Error: socket hang up
2023-07-04 09:33:48     at connResetException (node:internal/errors:705:14)
2023-07-04 09:33:48     at Socket.socketOnEnd (node:_http_client:518:23)
2023-07-04 09:33:48     at Socket.emit (node:events:525:35)
2023-07-04 09:33:48     at endReadableNT (node:internal/streams/readable:1358:12)
2023-07-04 09:33:48     at processTicksAndRejections (node:internal/process/task_queues:83:21)

And on the backend side, unfortunately it just crashes and produces no trace:

2023-07-04 09:33:48 [2023-07-04 06:33:48 +0000] [1] [WARNING] Worker with pid 5221 was terminated due to signal 11
2023-07-04 09:33:48 [2023-07-04 06:33:48 +0000] [5437] [INFO] Booting worker with pid: 5437

Signal 11 is SIGSEGV, so the worker seems to hit a segmentation fault somehow, maybe related to nginx.
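For reference, the signal number in the worker warning can be decoded with Python's signal module. This is only a sketch: parse_worker_death and its regular expression are hypothetical helpers written against the single log line quoted above, not part of MWDB or its web server. Signal numbers are platform-dependent; on Linux, 11 maps to SIGSEGV.

```python
import re
import signal

# Hypothetical pattern, matched against the warning line quoted in this issue.
WORKER_DEATH_RE = re.compile(
    r"Worker with pid (\d+) was terminated due to signal (\d+)"
)


def parse_worker_death(line):
    """Return (pid, signal_name) for a worker-termination line, else None."""
    match = WORKER_DEATH_RE.search(line)
    if match is None:
        return None
    pid = int(match.group(1))
    signum = int(match.group(2))
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Number is not a known signal on this platform.
        name = f"signal {signum}"
    return pid, name


line = (
    "[2023-07-04 06:33:48 +0000] [1] [WARNING] "
    "Worker with pid 5221 was terminated due to signal 11"
)
print(parse_worker_death(line))
```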

A short list of such files; I'll update it if I find more:

  • 5c695a37c4eb17703e1d4b95b8c2366bcead07171d3ccb22c091a77bee9c9c81

Expected behaviour

File upload works

Reproduction Steps

Get the sample and try to upload it

psrok1 commented

Confirmed on v2.8.2 as well, possibly caused by libmagic.

psrok1 commented

Ok, so this is caused by different behavior of libmagic on glibc and musl (Alpine).

The file has an invalid Total Editing Time, which libmagic reports as *Bad* 0x000000bb2bb31ea9:

bad-file-5c695a37c4eb17703e1d4b95b8c2366bcead07171d3ccb22c091a77bee9c9c81: Composite Document File V2 Document, Little Endian, Os: Windows, Version 5.0, Code page: 1252, Title: WinCC-Graphics-Document, Comments: Saved with Version: K6.0.2.0Saved with Version: K6.0.2.8Saved with Version: K6.0.3.0Saved with Version: K6.0.4.0Saved with Version: V6.2 incl. SP2Saved with Version: V6.2 incl. SP3 incl. HF12Saved with Version: V7.0 incl. SP3, Revision Number: 619, Total Editing Time: *Bad* 0x000000bb2bb31ea9, Last Saved Time/Date: Wed Jun 21 08:30:38 2023, Create Time/Date: Fri Feb  8 10:14:53 2002, Number of Pages: 1, Number of Words: 0, Number of Characters: 0, Name of Creating Application: Grafexe, 0x80000002: 0

*Bad* is generated when ctime in cdf_ctime returns NULL (https://github.com/file/file/blob/master/src/cdf_time.c#L169).

So libmagic relies on glibc validating the result and returning NULL on overflow: https://github.com/lattera/glibc/blob/master/time/asctime.c#L53

Unfortunately, musl behaves differently: it calls a_crash, which executes an HLT instruction and causes SIGSEGV: https://git.musl-libc.org/cgit/musl/tree/src/time/asctime_r.c#n23

(gdb) x/i 0x00007ffff7f50931
   0x7ffff7f50931 <cdf_ctime+17>:	test   %rax,%rax
(gdb) x/i 0x00007ffff7fbd2df
   0x7ffff7fbd2df <ctime_r+34>:	add    $0x40,%rsp
(gdb) x/i $rip
=> 0x7ffff7fbd0c1 <asctime_r+150>:	hlt
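
To illustrate the overflow, here is a rough Python check of whether a seconds value still fits the fixed 26-byte asctime buffer, i.e. whether the resulting year has at most four digits. It assumes the hex value printed after *Bad* is the seconds count that was passed to ctime. It also works in UTC while ctime uses local time, so values within a day of the boundary may be classified differently. fits_asctime_buffer is a hypothetical helper, not libmagic or MWDB code.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def fits_asctime_buffer(epoch_seconds):
    """Return True if the UTC date for epoch_seconds has a year in 1..9999.

    "Sun Sep 16 01:03:52 1973\n" plus the terminating NUL is exactly
    26 bytes, so a five-digit year no longer fits. Python's datetime has
    the same 1..9999 year range and raises OverflowError outside it.
    """
    try:
        EPOCH + timedelta(seconds=epoch_seconds)
    except OverflowError:
        return False
    return True


# Value shown after *Bad* in the libmagic output quoted in this issue
# (assumed to be seconds since the Unix epoch).
bad_value = int("0x000000bb2bb31ea9", 16)
print(fits_asctime_buffer(0))
print(fits_asctime_buffer(bad_value))
```

Under that assumption, the value from the sample lands far beyond year 9999, which is the case where glibc returns NULL and musl crashes.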

The crucial parts of the cdf_ctime code have not changed for at least 11 years, so downgrading or upgrading libmagic won't help: file/file@a5faca5

psrok1 commented

Submitted a bug report to file: https://bugs.astron.com/view.php?id=465

psrok1 commented

The fix is on master, so this will work in the next release of file: file/file@a457672

A memory-safe libmagic replacement was already discussed, but we haven't found anything good (#671).

Let's see what the future will bring...