JesusFreke/smali

Dex file size increases by ~50% without making changes (DexFileFactory.loadDexFile then DexFileFactory.writeDexFile)

I'm using "DexFileFactory" class to read and write dex files.
A simple example:

DexFile dexFile = DexFileFactory.loadDexFile(inputPath, null);
DexFileFactory.writeDexFile(outputPath, dexFile);

Just loading a dex file and rewriting it (without any modifications) sometimes increases its size by up to 50%.
What are the reasons for this increase in size and are there any ways to decrease the file’s size when writing it?
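To put a number on the growth, a small helper along these lines could compare the two files. This is only a sketch: the class name DexSizeReport and its methods are hypothetical and not part of dexlib2, and it uses only the Java standard library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of dexlib2): reports how much a rewritten file grew.
public class DexSizeReport {
    /** Percentage by which `after` is larger than `before` (negative if it shrank). */
    public static double percentGrowth(long before, long after) {
        if (before <= 0) {
            throw new IllegalArgumentException("before must be positive");
        }
        return 100.0 * (after - before) / before;
    }

    /** Same calculation, reading the sizes of two files on disk. */
    public static double percentGrowth(Path before, Path after) throws IOException {
        return percentGrowth(Files.size(before), Files.size(after));
    }
}
```

Calling percentGrowth on the input path and the output path after the round trip gives the relative increase directly.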

Post the before/after dex files and I'll take a look.

I'm attaching one example here. The size increase is about 40% in this case.
classes.zip

Hi, I'm following this issue as well.
Not sure what you mean by "modern optimizing compiler". Is there a way to reduce the changes in size between the initial dex file and the one written?
Thanks

@katzdan after I get the brisket going on the smoker, I will post a detailed explanation. I deleted my far too simple reply (I'm just waking up and was being a bit lazy). I do think there is something else going on here too, but I need to run the files through my disassembler to look closer.

but yes, there is a lot of duplicate data in a dex file, and a lot of data is pointed to with pointers/offsets.

For example, if two classes have identical debug sections, both can point to the same data for their debug section.
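The sharing idea can be illustrated with a toy model. This is a rough sketch and not how smali or any real dex writer is implemented: DedupPool is a hypothetical class that treats the data section as a flat run of bytes and hands back an offset for each blob, writing a blob only the first time it is seen.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical, not dexlib2): a data section that stores each
// distinct blob once and returns the offset where it lives.
public class DedupPool {
    // ByteBuffer compares by content, so identical blobs map to the same key.
    private final Map<ByteBuffer, Integer> offsets = new HashMap<>();
    private int size = 0;

    /** Returns the offset of the blob, appending it only if it is not already stored. */
    public int intern(byte[] blob) {
        ByteBuffer key = ByteBuffer.wrap(blob.clone());
        Integer existing = offsets.get(key);
        if (existing != null) {
            return existing; // duplicate: reuse the earlier copy
        }
        int offset = size;
        offsets.put(key, offset);
        size += blob.length;
        return offset;
    }

    /** Total number of bytes actually stored. */
    public int size() {
        return size;
    }
}
```

In this model, two classes with byte-identical debug data get the same offset back, so the data is counted once. A writer that skips this step stores the same bytes twice.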

@katzdan @marwan-bushara

The after file is 3322156 bytes longer.
The debug section accounts for 46276 of those bytes.

TL;DR: at least a good chunk of the difference is because smali does not employ some of the space-saving tricks that some compilers use, such as not writing duplicate data and pointing multiple references at the same data.

Is there any hope of reducing the size of the written dex in the future, perhaps as an optimization feature?

@katzdan there are a number of ways to do that, it would be fairly easy.

The easiest route would be to just write a script that parses the smali files and removes debug information and dead code.
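A minimal sketch of the debug-stripping half of such a script might look like the following. It is an illustration only: SmaliDebugStripper is a hypothetical class, it works line by line on disassembled smali text rather than truly parsing it, and it assumes the method-level debug directives are .line, .local, .end local, .restart local, .prologue and .epilogue. It leaves .source and .param alone and makes no attempt at dead code removal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical line-based filter for smali text (not part of smali/baksmali).
public class SmaliDebugStripper {
    // Assumed set of method-level debug directives. Note ".local " ends with a
    // space so that ".locals N" (the register count) is never matched.
    private static final String[] DEBUG_PREFIXES = {
        ".line ", ".local ", ".end local", ".restart local", ".prologue", ".epilogue"
    };

    /** True if the line, ignoring indentation, starts with a debug directive. */
    static boolean isDebugDirective(String line) {
        String trimmed = line.trim();
        for (String prefix : DEBUG_PREFIXES) {
            if (trimmed.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the smali text with debug directive lines removed. */
    public static String strip(String smali) {
        List<String> kept = new ArrayList<>();
        // Limit -1 keeps trailing empty strings, preserving a final newline.
        for (String line : smali.split("\n", -1)) {
            if (!isDebugDirective(line)) {
                kept.add(line);
            }
        }
        return String.join("\n", kept);
    }
}
```

The filtered files would then need to be reassembled with smali to produce the smaller dex. As far as I know, baksmali also has an option to omit debug info during disassembly, which may make a script like this unnecessary for that part.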