onekey-sec/unblob

`7z` fails to extract initramfs

NiklasGollenstede opened this issue · 8 comments

Describe the bug
I was browsing Nix projects when I found this one and thought "that's cool, should be useful to inspect initrds", but unfortunately that does not currently work very well.

The below command tries to extract a zstd-compressed NixOS initramfs with prepended Intel CPU microcode, but:

  • 7z fails to extract the main CPIO archive, but unblob still exits with 0/success.
  • The prepended microcode CPIO archive is reported to be found, but then missing in the output.
  • There are unknown chunks that are really just zero-padding.

To Reproduce

 nix run github:onekey-sec/unblob/23.5.31 -- $( nix build --no-link --print-out-paths github:srid/nixos-config/1a6879bbd1c0f87f67533a7b91bc438e042b3bf6#nixosConfigurations.actual.config.system.build.initialRamdisk )/initrd
Command output
2023-08-10 23:58.54 [info     ] Start processing file          file=/nix/store/msmx1ylsyhxk6hx3p4nz39vqi2gkzn3j-initrd-linux-6.1.43/initrd pid=3806009
2023-08-10 23:58.54 [warning  ] Found unknown Chunks           chunks=[0x6f1200-0x6f1800] pid=3806014
2023-08-10 23:58.54 [info     ] Extracting unknown chunk       chunk=0x6f1200-0x6f1800 path=initrd_extract/7279104-7280640.unknown pid=3806014
2023-08-10 23:58.54 [info     ] Extracting valid chunk         chunk=0x6f1800-0x12284cb path=initrd_extract/7280640-19039435.zstd pid=3806014
2023-08-10 23:58.54 [info     ] Extracting valid chunk         chunk=0x0-0x6f1200 path=initrd_extract/0-7279104.cpio_portable_ascii pid=3806014
2023-08-10 23:58.54 [warning  ] Found unknown Chunks           chunks=[0x19b3000-0x19b4000] pid=3806016
2023-08-10 23:58.54 [info     ] Extracting unknown chunk       chunk=0x19b3000-0x19b4000 path=initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/26947584-26951680.unknown pid=3806016
2023-08-10 23:58.54 [info     ] Extracting valid chunk         chunk=0x0-0x19b3000 path=initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/0-26947584.cpio_portable_ascii pid=3806016
2023-08-10 23:58.54 [error    ] Extract command failed         command=7z x -y /tmp/initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/0-26947584.cpio_portable_ascii -o/tmp/initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/0-26947584.cpio_portable_ascii_extract exit_code=0x2 pid=3806016 severity=<Severity.WARNING: 'WARNING'> stderr=
ERRORS:
There are data after the end of archive

ERROR: There are some data after the end of the payload data : 0-26947584
 stdout=
7-Zip [64] 17.05 : Copyright (c) 1999-2021 Igor Pavlov : 2017-08-28
p7zip Version 17.05 (locale=C,Utf16=off,HugeFiles=on,64 bits,16 CPUs x64)

Scanning the drive for archives:
1 file, 26947584 bytes (26 MiB)

Extracting archive: /tmp/initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/0-26947584.cpio_portable_ascii
--
Path = /tmp/initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/0-26947584.cpio_portable_ascii
Type = xz
ERRORS:
There are data after the end of archive
Offset = 35432
Physical Size = 5228
Tail Size = 26906924
Method = LZMA2:21
Streams = 1
Blocks = 1


Sub items Errors: 1

Archives with Errors: 1

Open Errors: 1

Sub items Errors: 1

tree -s ./initrd_extract
[          4]  initrd_extract
├── [       1536]  7279104-7280640.unknown
└── [          4]  7280640-19039435.zstd_extract
    ├── [   26951680]  zstd.uncompressed
    └── [          5]  zstd.uncompressed_extract
        ├── [   26947584]  0-26947584.cpio_portable_ascii
        ├── [          3]  0-26947584.cpio_portable_ascii_extract
        │   └── [      27568]  0-26947584
        └── [       4096]  26947584-26951680.unknown

Expected behavior

  • 7z not to fail / something else (like cpio) to extract the archive.
  • unblob to exit non-zero upon sub-command failure.
  • The microcode in the output tree.
  • unknown blocks that are entirely zero to be called zero-padding or something like that.

Environment information

  • Nix 2.13.3 on NixOS 23.05
  • (all other versions are pinned via nix flakes, see the above commands)

Thanks for the very detailed report, @NiklasGollenstede!

7z not to fail / something else (like cpio) to extract the archive.

7z is failing due to unblob miscalculating the CPIO chunk. We will look into it.

unblob to exit non-zero upon sub-command failure.

It should be non-zero according to get_exit_code_from_reports. We will look into it.

The microcode in the output tree.

It will be present if you run unblob with the -k option.

unknown blocks that are entirely zero to be called zero-padding or something like that.

Agree 100%. We're tracking this at #263 and have a draft branch for it.
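For illustration, the check behind such a "zero-padding" label can be sketched in a few lines of Python. This is not unblob's implementation; the function name `chunk_is_zero_padding` and its parameters are hypothetical, and the only assumption is that a chunk is described by a start and end offset into the file, as in the log lines above (e.g. 0x6f1200-0x6f1800).

```python
import io
from typing import BinaryIO


def chunk_is_zero_padding(
    file: BinaryIO, start: int, end: int, block_size: int = 64 * 1024
) -> bool:
    """Return True if bytes [start, end) of `file` exist and are all NUL.

    Hypothetical helper for illustration only, not part of unblob.
    """
    if end <= start:
        return False  # an empty range is not padding
    file.seek(start)
    remaining = end - start
    while remaining > 0:
        block = file.read(min(block_size, remaining))
        if not block:
            return False  # file is shorter than the chunk claims
        if block.count(0) != len(block):
            return False  # found a non-zero byte
        remaining -= len(block)
    return True


if __name__ == "__main__":
    # 4 bytes of data, 1536 bytes of zeros, then more data
    sample = io.BytesIO(b"\x01" * 4 + b"\x00" * 1536 + b"\x02")
    print(chunk_is_zero_padding(sample, 4, 4 + 1536))
```

Reading in blocks keeps memory use flat even for large unknown chunks.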

Sounds good, thanks!

The microcode in the output tree.

It will be present if you run unblob with the -k option.

Well, that keeps the prepended CPIO archive. But that archive contains a file, which apparently gets completely ignored. Running with -k and then cpio -idv < initrd_extract/0-7279104.cpio_portable_ascii extracts that file:

kernel/x86/microcode/GenuineIntel.bin
14217 blocks

I guess the expected result (with -k) would be:

[          7]  initrd_extract/
├── [    7279104]  0-7279104.cpio_portable_ascii
├── [          3]  0-7279104.cpio_portable_ascii_extract/
│   └── [          3]  kernel/
│       └── [          3]  x86/
│           └── [          3]  microcode/
│               └── [    7278592]  GenuineIntel.bin
├── [       1536]  7279104-7280640.zero-padding
├── [   11758795]  7280640-19039435.zstd
└── [          4]  7280640-19039435.zstd_extract/
    ├── [   26951680]  zstd.uncompressed
    └── [          5]  zstd.uncompressed_extract/
        ├── [   26947584]  0-26947584.cpio_portable_ascii
        ├── [         ??]  0-26947584.cpio_portable_ascii_extract/
        │   └── ...
        └── [       4096]  26947584-26951680.zero-padding

7z will not create the extraction directory if the source file name does not follow some convention which I still don't fully comprehend (name.cpio is OK, name.cpio.truncated is OK, but name.cpio.ext is not).

Quick update: I'll probably write a CPIO extractor since we're already parsing the entries anyway, should not take long with the recent addition of the Filesystem API.
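To give an idea of what such an extractor has to parse, here is a minimal sketch of a reader for the "newc" CPIO layout (magic 070701), which is the format the Linux kernel expects for an initramfs. It is not the code from unblob; the names `Entry` and `iter_newc_entries` are hypothetical, and the sketch only lists entries and their data in memory instead of writing them to disk.

```python
import io
from typing import BinaryIO, Iterator, NamedTuple

NEWC_MAGIC = b"070701"
HEADER_SIZE = 110  # 6-byte magic + 13 fields of 8 hex digits each
TRAILER = "TRAILER!!!"  # name of the entry that ends the archive


class Entry(NamedTuple):
    name: str
    mode: int    # file type and permission bits
    data: bytes  # file content; for symlinks, the link target


def _pad4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def iter_newc_entries(file: BinaryIO) -> Iterator[Entry]:
    """Yield entries from a newc archive that starts at the current position."""
    while True:
        header = file.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE or header[:6] != NEWC_MAGIC:
            raise ValueError("missing or truncated newc header")
        # Field order: ino, mode, uid, gid, nlink, mtime, filesize,
        # devmajor, devminor, rdevmajor, rdevminor, namesize, check
        fields = [int(header[6 + 8 * i : 14 + 8 * i], 16) for i in range(13)]
        mode, filesize, namesize = fields[1], fields[6], fields[11]
        # Header plus NUL-terminated name is padded to a 4-byte boundary.
        raw_name = file.read(_pad4(HEADER_SIZE + namesize) - HEADER_SIZE)
        name = raw_name[:namesize].rstrip(b"\x00").decode("utf-8", "surrogateescape")
        if name == TRAILER:
            return
        # File data is padded to a 4-byte boundary as well.
        data = file.read(_pad4(filesize))[:filesize]
        yield Entry(name, mode, data)
```

A real extractor additionally has to create directories, symlinks and device nodes according to the mode bits, and must reject entry names that would escape the extraction directory (absolute paths or `..` components).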

@NiklasGollenstede I opened a pull request to handle this; it will be reviewed over the coming weeks.

These are the results I'm getting with your sample and that branch:

.
├── 0-7279104.cpio_portable_ascii
├── 0-7279104.cpio_portable_ascii_extract
│   └── kernel
│       └── x86
│           └── microcode
│               └── GenuineIntel.bin
├── 7279104-7280640.unknown
├── 7280640-19039435.zstd
└── 7280640-19039435.zstd_extract
    ├── zstd.uncompressed
    └── zstd.uncompressed_extract
        ├── 0-26947584.cpio_portable_ascii
        ├── 0-26947584.cpio_portable_ascii_extract
        │   ├── dev
        │   ├── etc
        │   │   ├── mdadm.conf -> ../nix/store/ivzdqwmjb3g5cddb0l3kakqpym53n4sk-mdadm.conf
        │   │   └── modprobe.d
        │   │       ├── debian.conf -> ../../nix/store/1hvskwda7r1spasqqg4ascjngqpnp0qw-kmod-debian-aliases.conf-22-1.1
        │   │       ├── nixos.conf -> ../../nix/store/fg1iypr8qlc4li832bsnqsv2182wjkmb-etc-modprobe.d-nixos.conf
        │   │       └── ubuntu.conf -> ../../nix/store/pyzxg3hb6r88l7bqfya22q002sbchfxi-initrd-kmod-blacklist-ubuntu
        │   ├── init -> nix/store/320svbpp3f9mhdc4xr0p1n2gm3nfwzv1-stage-1-init.sh
        │   └── nix
        │       └── store
        │           ├── 1hvskwda7r1spasqqg4ascjngqpnp0qw-kmod-debian-aliases.conf-22-1.1
        │           ├── 320svbpp3f9mhdc4xr0p1n2gm3nfwzv1-stage-1-init.sh
        │           └── 5cxd4ywn7sis9h5yibxfc6bwvjz15af9-linux-6.1.43-modules-shrunk
        └── 26947584-26951680.unknown

13 directories, 14 files

Don't hesitate to give it a try.

$ nix run github:onekey-sec/unblob/a5536446208f749c9df77f3d5a07528933e9e418 -- $( nix build --no-link --print-out-paths github:srid/nixos-config/1a6879bbd1c0f87f67533a7b91bc438e042b3bf6#nixosConfigurations.actual.config.system.build.initialRamdisk )/initrd
╭──────────────── unblob (23.8.11) ────────────────╮
│ Extracted files: 5                               │{{1}}
│ Extracted directories: 12                        │{{2}}
│ Extracted links: 5                               │
│ Extraction directory size: 50.82 MB              │
│ Chunks identification ratio: 99.99%              │
╰──────────────────── Summary ─────────────────────╯
            Chunks distribution
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┓
┃ Chunk type          ┃   Size   ┃ Ratio  ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━┩
│ CPIO_PORTABLE_ASCII │ 32.64 MB │ 74.42% │{{3}}
│ ZSTD                │ 11.21 MB │ 25.57% │{{3}}
│ UNKNOWN             │ 5.50 KB  │ 0.01%  │
└─────────────────────┴──────────┴────────┘
       Encountered errors
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Severity       ┃ Name         ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ Severity.ERROR │ UnknownError │{{4}}
└────────────────┴──────────────┘
$ echo $?
1
$ tree -sF initrd_extract/
initrd_extract/
|-- [          3]  0-7279104.cpio_portable_ascii_extract/
|   `-- [          3]  kernel/
|       `-- [          3]  x86/
|           `-- [          3]  microcode/
|               `-- [    7278592]  GenuineIntel.bin {{1}}
|-- [       1536]  7279104-7280640.unknown {{1}}
`-- [          4]  7280640-19039435.zstd_extract/
    |-- [   26951680]  zstd.uncompressed {{5}}
    `-- [          5]  zstd.uncompressed_extract/
        |-- [   26947584]  0-26947584.cpio_portable_ascii
        |-- [          6]  0-26947584.cpio_portable_ascii_extract/
        |   |-- [          2]  dev/
        |   |-- [          4]  etc/
        |   |   |-- [         56]  mdadm.conf -> ../nix/store/ivzdqwmjb3g5cddb0l3kakqpym53n4sk-mdadm.conf
        |   |   `-- [          5]  modprobe.d/
        |   |       |-- [         80]  debian.conf -> ../../nix/store/1hvskwda7r1spasqqg4ascjngqpnp0qw-kmod-debian-aliases.conf-22-1.1
        |   |       |-- [         74]  nixos.conf -> ../../nix/store/fg1iypr8qlc4li832bsnqsv2182wjkmb-etc-modprobe.d-nixos.conf
        |   |       `-- [         77]  ubuntu.conf -> ../../nix/store/pyzxg3hb6r88l7bqfya22q002sbchfxi-initrd-kmod-blacklist-ubuntu
        |   |-- [         58]  init -> nix/store/320svbpp3f9mhdc4xr0p1n2gm3nfwzv1-stage-1-init.sh
        |   `-- [          3]  nix/
        |       `-- [          5]  store/
        |           |-- [        655]  1hvskwda7r1spasqqg4ascjngqpnp0qw-kmod-debian-aliases.conf-22-1.1 {{1}}
        |           |-- [      20667]  320svbpp3f9mhdc4xr0p1n2gm3nfwzv1-stage-1-init.sh {{1}}
        |           `-- [          2]  5cxd4ywn7sis9h5yibxfc6bwvjz15af9-linux-6.1.43-modules-shrunk/
        `-- [       4096]  26947584-26951680.unknown {{1}}

13 directories, 12 files

That looks better. It seems to be handling the first archive correctly!
But then there is still the/an error, and most of the files from the nested archive were not extracted.

Some further nitpickiness (largely unrelated to this overall issue):

  1. I only see 3 extracted files. Do the unknown chunks count as files? I don't think they are "files" in that sense. (They result in regular files in the output, but semantically they are not files in the archive.)
  2. Similarly, there are 14 dirs in the output tree (incl. top-level), 9 of which are within *.cpio_portable_ascii_extract dirs (i.e. were encoded in the input).
  3. The CPIO (largely) was inside the ZSTD. I don't think it makes much sense to express their relative sizes as shares of a whole (which one?).
  4. Good. But knowing at least which extraction was attempted and failed would be nice. I know to expect that there are things missing in the output, but not where.
  5. It seems to me that without the -k option, unblob removes blobs that it processed successfully. Why is zstd.uncompressed still there? It was split into 0-26947584.cpio_portable_ascii and 26947584-26951680.unknown and should then be done, no?
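Regarding point 3, the figures in the "Chunks distribution" table are consistent with simply summing every chunk at every nesting level, so the "whole" appears to be the sum of all chunk sizes rather than the size of the input file. The short sketch below redoes that arithmetic from the offsets in the chunk file names above; it is a reconstruction from the printed numbers, not a statement about how unblob computes them internally.

```python
# (chunk type, start offset, end offset), taken from the chunk file names
chunks = [
    ("CPIO_PORTABLE_ASCII", 0, 7279104),        # prepended microcode archive
    ("UNKNOWN", 7279104, 7280640),
    ("ZSTD", 7280640, 19039435),
    ("CPIO_PORTABLE_ASCII", 0, 26947584),       # inside zstd.uncompressed
    ("UNKNOWN", 26947584, 26951680),            # inside zstd.uncompressed
]

totals = {}
for kind, start, end in chunks:
    totals[kind] = totals.get(kind, 0) + (end - start)

grand_total = sum(totals.values())  # counts the nested CPIO on top of the ZSTD
ratios = {kind: round(100 * size / grand_total, 2) for kind, size in totals.items()}

for kind, size in totals.items():
    print(f"{kind:<20} {size:>10} bytes  {ratios[kind]:>6.2f}%")
```

With these inputs the ratios come out as 74.42%, 25.57% and 0.01%, matching the table, which means the compressed ZSTD chunk and its decompressed CPIO content are both counted toward the same total.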

@NiklasGollenstede converted to a discussion at #650 so everyone in the team can chime in. Thanks for taking the time to write this, by the way.

Closing this issue since CPIO is properly extracted now. The discussion on console output is kept open for further exchanges.