nicoboss/nsz

XCI with title id bbb-h-aacca can't be compressed/decompressed

popy2k14 opened this issue · 9 comments

Hey guys.

I tried to compress the game with title id "bbb-h-aacca".
Curiously, the compression takes just a few seconds and the game shrank from 1,75 GB to 1,38 GB, but no progress bar was shown during compression.
When I decompress it, the resulting XCI is the same size as the shrunken XCZ.
I have compared them with a binary compare program and it says they are binary identical!

Failed to extract TitleID/Version from filename "1 2 SWitch - bbb-h-aacca.xcz". Use -p to extract from Cnmt.
Decompressing ....xcz -> ....xci
[ADDING] secure 0 bytes to NSP
[ADDING] fbc46f4b672d686bcc618ec3c0c37753.nca 1480868352 bytes to NSP
[VERIFIED] fbc46f4b672d686bcc618ec3c0c37753.nca
[ADDING] ce6606d93e67c89b7c75bb83c89e7507.nca 184320 bytes to NSP
[VERIFIED] ce6606d93e67c89b7c75bb83c89e7507.nca
[ADDING] 55c76a09e5bdc9a2bfd0a9ac74be3a33.nca 278528 bytes to NSP
[VERIFIED] 55c76a09e5bdc9a2bfd0a9ac74be3a33.nca
[ADDING] c7ce40219ce34c22c388e915a856bd69.cnmt.nca 4096 bytes to NSP

That's the log from decompressing.

PS: The file name has spaces in it, maybe this is the issue?

I looked into it and this indeed is a huge issue. Some NCA files inside some XCI files have an offset of 0x4200 instead of 0x4000 (like the fbc46f4b672d686bcc618ec3c0c37753.nca inside "1 2 Switch").

When designing the NSZ file format we decided to hardcode the offset of 0x4000, as every NCA except CNMT and system update NCAs (which have 0xC00) seemed to have this offset. This holds for all NCA files inside NSP containers, but unfortunately it doesn't hold for all NCA files found inside an XCI. Because of this, such NCA files are seen as unpacked by isNcaPacked because of fs[0].offset != 0x4000 and are skipped.
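To illustrate the check described above, here is a minimal sketch in Python. It is not the actual NSZ source: the Section class and the function name is_nca_packed_strict are hypothetical stand-ins, and only the comparison against 0x4000 is taken from this thread.

```python
from dataclasses import dataclass

# Offset the NSZ format originally hardcoded as the start of the first section.
HARDCODED_SECTION_OFFSET = 0x4000


@dataclass
class Section:
    """Hypothetical stand-in for one NCA section entry."""
    offset: int  # byte offset of the section inside the NCA


def is_nca_packed_strict(sections):
    """Simplified model of the old behaviour: an NCA only counts as
    compressible if its first section starts at exactly 0x4000."""
    return bool(sections) and sections[0].offset == HARDCODED_SECTION_OFFSET


# A typical NCA with the first section at 0x4000 is accepted.
print(is_nca_packed_strict([Section(0x4000)]))  # True
# An NCA like the one in "1 2 Switch" (first section at 0x4200) is skipped.
print(is_nca_packed_strict([Section(0x4200)]))  # False
```

With a check like this, every NCA in the affected XCI is skipped, which matches the log above where nothing is actually compressed.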

In the game "1 2 Switch" this leads to no NCA being compressed. The reduction in size you see is because NSZ deletes useless files like system updates and trims the filler space at the end of the XCI. That's also why it has the same size after decompressing it again. The decompressed file is still smaller than the original game because I don't restore useless files or untrim the game while decompressing.

This is such a fundamental file format issue that I have to discuss it with blawar and others that implemented the xcz file format in their software. Fixing this issue should under no circumstances break backwards compatibility, as the xcz file format is designed to always keep backwards compatibility. However, this will most likely break forwards compatibility, meaning all current versions of software which implemented xcz won't be able to read files created after this fix without being updated.

The old NSPZ/XCIZ file format had full offset support and even decrypted the data before the offset of the first partition but then the idea was unfortunately scrapped from the NSZ file format when it was quite young in order to reduce complexity. I guess we have to introduce it now.

Relevant quotes of the conversation from Oct 11, 2019:

Blawar:

I am also not terribly happy about hard coding the 0x4000 as the start of the NCA section, as not all NCA's do this (some are 0xC00 such as cnmt and system update NCA's). Since we do not decrypt the NCA header or anything, i think we can compress from 0xC00 onward and then just not run any crypto over 0xC00 to 0x4000 (memcpy basically).

nicoboss:

It really bothers me that NSZ doesn't decrypt the empty space after the actual header. I don't see any reason why we should keep this whole area encrypted.

Blawar:

The space between 0xC00 and 0x4000 isn't always empty space. In the program NCA it is part of the NCA header (it decrypts with the header key), in other NCA's, it is not part of the header and cannot be decrypted with the header key. To avoid all of this, i just put in the spec that only 0x4000 sized headers are supported.
The other reason is that I am caught between preservationists, who actively resist modifying NCA's. Any byte I touch, becomes more of a power, so the less I touch, the more they will accept it. The preservationists generally control distribution so it's important for them to be on board.

Thanks a lot for your findings!
When you have a test version ready (after discussing), I am here to test.

We found a way to implement this without changing the NSZ file format specification.

If the offset of the first section is larger than 0x4000, we just start compressing from 0x4000.
If the offset of the first section is smaller than 0x4000, we store its first part uncompressed and then the NSZ header followed by the compressed rest of the section.

That way the NSZ header is always at 0x4000 and the first block always starts after the NSZ header.
What's even more beautiful is that this way all the complex logic is inside the compressor and the decompressor theoretically doesn't need to be changed at all. However, because probably nobody made the effort of implementing the specification that strictly, existing decompressors still need to be changed, but at least only very slightly.
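The two cases above can be sketched as a small Python function. This is only an illustration of the byte ranges involved, not the actual NSZ implementation: the name plan_split and the returned dictionary keys are hypothetical, and the sketch does not model how the bytes are decrypted or how the NSZ header itself is written.

```python
NSZ_HEADER_OFFSET = 0x4000  # per the thread, the NSZ header always sits here


def plan_split(first_section_offset: int, nca_size: int) -> dict:
    """Return which byte ranges of the original NCA are stored as-is and
    which are compressed, for a given offset of the first section."""
    if nca_size <= NSZ_HEADER_OFFSET:
        raise ValueError("sketch only covers NCAs larger than 0x4000 bytes")
    if not 0 <= first_section_offset < nca_size:
        raise ValueError("first section offset must lie inside the NCA")
    return {
        # Everything before 0x4000 is stored uncompressed in both cases.
        "stored_plain": (0, NSZ_HEADER_OFFSET),
        # Compression always starts at 0x4000, right after the NSZ header.
        "compressed": (NSZ_HEADER_OFFSET, nca_size),
        # Offset > 0x4000: the gap before the section is compressed as well.
        "gap_bytes_compressed": max(0, first_section_offset - NSZ_HEADER_OFFSET),
        # Offset < 0x4000: the first part of the section stays uncompressed.
        "section_bytes_stored_plain": max(0, NSZ_HEADER_OFFSET - first_section_offset),
    }


# First section at 0x4200, as in "1 2 Switch": 0x200 gap bytes get compressed.
print(hex(plan_split(0x4200, 0x10000)["gap_bytes_compressed"]))  # 0x200
# First section at 0xC00: 0x3400 bytes of the section stay uncompressed.
print(hex(plan_split(0xC00, 0x10000)["section_bytes_stored_plain"]))  # 0x3400
```

In both cases the split point is the same, which is why a decompressor that strictly follows the specification would not need to know which case it is dealing with.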

I just tested a block compressed "1 2 Switch", which has the offset of the first section larger than 0x4000, on Tinfoil and it worked perfectly fine. I'm very impressed how well this is made. I would never have thought this would just work. The same code is also used by SX Installer, Goldbricks, OG Tinfoil and Awoo according to blawar, so everybody should be fine.

The case where the offset of the first section is smaller than 0x4000 will most likely not work, but there currently isn't any known game having such an NCA of the type PROGRAM or PUBLICDATA anyway.

Because there currently isn't any known game having an NCA of the type PROGRAM or PUBLICDATA whose first section offset is smaller than 0x4000, I simulated such a case by compressing every type of NCA with a total size larger than 0x4000. Guess what, it worked! That's unbelievable! That Tinfoil code must be so clean.

Nice that you found the issue and fixed it in a backwards compatible way.
Is it possible to get a test version?
I can give you feedback on it before release.

NSZ is written in Python, so every commit can just be downloaded and extracted once all dependencies are installed. Running NSZ directly from source is quite easy, and the portable Windows builds and pip releases only exist for convenience. Just follow the guide below and you should be able to test the latest commit. If you run into any problems while following this guide just let me know. Latest commit 5b3afac should be quite stable. In case you want to use it to compress all your XCIs, I recommend always using -V for XCI compression, as I only tested the highly complex fix of this issue with “1 2 Switch”. As long as -V doesn’t flag files as corrupted, the resulting XCZ files will be clean. I also recommend that you test some of them on Tinfoil, even though Tinfoil should support them. Thanks a lot for wanting to help test this. Once you’re done, please let me know how it went and if any other issues occurred.

To try out latest commit on Windows:

  1. Install https://www.python.org/ftp/python/3.7.6/python-3.7.6-amd64.exe if you don't have Python already. For python 3.8 and later also install Buildtools for Visual Studio 2019 from https://visualstudio.microsoft.com/de/downloads/ if you don't have Visual Studio installed already.
  2. Download and extract https://github.com/nicoboss/nsz/archive/master.zip
  3. Go to your extracted folder and shift right click => "Open PowerShell Window here"
  4. Enter "pip install -r requirements.txt"
  5. Enter "python .\nsz.py" to launch it.

(clap) (clap) (clap) (clap) bravo! Happy to see such a moment of enlightenment.

@nicoboss sorry for getting your hopes up.
New Year's Eve is over and I have started work again, so sadly I don't have much free time to test it soon.