onekey-sec/unblob

seek in file_utils.py - Python int too large to convert to C ssize_t

steven-hh-ding opened this issue · 4 comments

Describe the bug
Thanks for the very nicely built package. We hit an integer overflow error when unpacking a KVM image, probably due to a malformed chunk. Unfortunately we cannot share the image here yet, but I suspect it is related to #22.

To Reproduce
Steps to reproduce the behavior:

  1. Launch unblob with command unblob image_file_name (a proprietary kvm image)
Error details
2023-06-21 01:22.20 [error    ] Unhandled Exception during chunk calculation handler=elf64 pid=3706 severity=<Severity.ERROR: 'ERROR'> start_offset=0x2cf8a8
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/unblob/finder.py", line 35, in _calculate_chunk
    return handler.calculate_chunk(file, real_offset)
  File "/opt/conda/lib/python3.9/site-packages/unblob/handlers/executable/elf.py", line 267, in calculate_chunk
    end_offset = self.get_signed_kernel_module_end_offset(file, end_offset)
  File "/opt/conda/lib/python3.9/site-packages/unblob/handlers/executable/elf.py", line 230, in get_signed_kernel_module_end_offset
    file.seek(end_offset, io.SEEK_SET)
  File "/opt/conda/lib/python3.9/site-packages/unblob/file_utils.py", line 38, in seek
    super().seek(pos, whence)
OverflowError: Python int too large to convert to C ssize_t

Environment information (please complete the following information):

  • OS: Ubuntu Linux
  • Unblob with all extractor dependencies
  • Python 3.9.16
  • Ubuntu 22.04.2 LTS

I have a few questions since it's affecting the ELF kernel parser:

  • Can you share the architecture of the KVM guest image?
  • Would you be OK with sharing the whole unblob report (using the --report option), or is the metadata also considered sensitive?

The culprit is get_end_offset in the ELF handler (https://github.com/onekey-sec/unblob/blob/main/unblob/handlers/executable/elf.py#L199) returning a value larger than what ssize_t can handle.
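For anyone following along, the failure mode can be reproduced in isolation. This is only a minimal sketch: `io.BytesIO` stands in for unblob's own file class, `try_seek` is a hypothetical helper, and `sys.maxsize` is used as the largest value a C `ssize_t` can hold on the running platform.

```python
import io
import sys


def try_seek(stream, offset):
    """Return True if the seek succeeds, False if the offset overflows ssize_t.

    Hypothetical helper for illustration only; it is not part of unblob.
    """
    try:
        stream.seek(offset, io.SEEK_SET)
    except OverflowError:
        # CPython converts the offset to a C ssize_t before seeking,
        # so any value above sys.maxsize is rejected here.
        return False
    return True


stream = io.BytesIO(b"\x7fELF")
print(try_seek(stream, 16))               # small offset, fits in ssize_t
print(try_seek(stream, sys.maxsize + 1))  # one past the largest ssize_t
```

An offset this large cannot come from a well-formed ELF header, which is why a corrupted or misdetected header is the most likely source.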

It's difficult to see exactly what the problem is without the ELF sample.

Of course an easy fix would be to do this:

diff --git a/unblob/handlers/executable/elf.py b/unblob/handlers/executable/elf.py
index 1b61519..2775882 100644
--- a/unblob/handlers/executable/elf.py
+++ b/unblob/handlers/executable/elf.py
@@ -226,8 +226,10 @@ class _ELFBase(StructHandler):
         #   - a custom footer value '~~Module signature appended~\n~'
         # we check if a valid kernel module signature is present after the ELF file
         # and returns an end_offset that includes that whole signature part.
-
-        file.seek(end_offset, io.SEEK_SET)
+        try:
+            file.seek(end_offset, io.SEEK_SET)
+        except OverflowError:
+            return end_offset
         for footer_offset in iterate_patterns(file, KERNEL_MODULE_SIGNATURE_FOOTER):
             file.seek(
                 footer_offset - KERNEL_MODULE_SIGNATURE_INFO_LEN,

But I'd rather understand the issue in depth than implement fixes like this.
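One alternative to catching the exception would be to reject implausible offsets before seeking at all. The sketch below is an assumption about what such a check could look like, not the fix that was applied: `offset_is_within_file` is a hypothetical helper, and it assumes the handler can afford one extra seek to learn the file size.

```python
import io


def offset_is_within_file(file, offset):
    """Check that `offset` lies between 0 and the end of `file`.

    Hypothetical helper for illustration; not part of unblob.
    The current position of `file` is restored before returning.
    """
    current = file.tell()
    try:
        file_size = file.seek(0, io.SEEK_END)
    finally:
        file.seek(current, io.SEEK_SET)
    # Plain integer comparison never overflows in Python, so an absurdly
    # large offset is rejected without ever being passed to seek().
    return 0 <= offset <= file_size


blob = io.BytesIO(b"\x7fELF" + b"\x00" * 60)
print(offset_is_within_file(blob, 64))     # end of the 64-byte blob
print(offset_is_within_file(blob, 2**70))  # far past the end
```

A check like this would also catch offsets that fit in `ssize_t` but still point past the end of the file, which the `try`/`except` version lets through.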

@steven-hh-ding we're actually tracking a similar bug internally, so I'll reproduce it here and see if the fix also solves the bug you reported.

@qkaiser Thanks for the prompt reply! Since it is a KVM image, we mount it and extract the file system instead, so that's fine for now. I think that ELF file is corrupted. Thanks a lot for your help.