gildas-lormeau/SingleFileZ

How to manually modify the zip.html file?

onoter opened this issue · 8 comments

I wanted to manually modify a saved zip.html file. I tried to unzip it, modify the contents, and re-zip it, but the resulting file can't be opened correctly; it only shows random characters.

I'm curious what this 'self-extracting zip file' is. The only 'self-extracting zip file' I know of is an .exe file on Windows, which is certainly not the case here.

And most importantly, how to re-zip it manually?

I think you need to make your changes, then just open index.html and save the rendered page with SingleFileZ.

It's not a plain zip file but an HTML/zip hybrid file. So either you need a zip editor that edits the zip data stream in place without rewriting the rest of the file, or you would have to separate the zip part from the HTML part, edit the zip part, and then put the two back together. That may be possible by determining the offsets with a hex editor and then some clever split/concatenate work with binutils, but it's only a theory ;)

Maybe someone with more experience could write some tools for that...

This should be the offset, detectable by searching for "PK" in ASCII mode:
Screenshot from 2024-09-08 16-42-43
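The "PK" search can also be scripted instead of done in a hex editor. The following is a minimal sketch, assuming bash and a `grep` that supports `-o`, `-b`, `-a` and `-F` (GNU grep does). The function name `find_zip_offset` and the file name `page.zip.html` are made up for illustration. It looks for the 4-byte ZIP local file header signature (`PK` followed by bytes 0x03 0x04) rather than just "PK", which makes a false match inside the HTML part less likely but not impossible.

```shell
# find_zip_offset FILE
# Prints the byte offset of the first ZIP local file header signature
# ("PK" + 0x03 0x04) found in FILE. Returns 1 if no signature is found.
# Assumption: the first match is the start of the ZIP part.
find_zip_offset() {
  local file=$1 sig offset
  sig=$(printf 'PK\003\004')
  # -o: print only the match, -b: prefix it with its byte offset,
  # -a: treat binary data as text, -F: fixed string instead of a regex.
  offset=$(LC_ALL=C grep -obaF -- "$sig" "$file" | head -n 1 | cut -d: -f1)
  [ -n "$offset" ] || return 1
  printf '%s\n' "$offset"
}

# Hypothetical usage:
#   find_zip_offset page.zip.html
```

If the first match really is the start of the ZIP part, the printed number is the value to pass to `dd` as `count=` and `skip=`.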

/tmp $ dd if=How\ to\ manually\ modify\ the\ zip.html\ file_\ ·\ Issue\ #184\ ·\ gildas-lormeau_SingleFileZ\ \(9_8_2024\ 4_40_44\ PM\).zip.html of=How\ to\ manually\ modify\ the\ zip.html\ file_\ ·\ Issue\ #184\ ·\ gildas-lormeau_SingleFileZ\ \(9_8_2024\ 4_40_44\ PM\).zip.html.part1 bs=1 count=57564
57564+0 records in
57564+0 records out
57564 bytes (58 kB, 56 KiB) copied, 0,122914 s, 468 kB/s
/tmp $ dd if=How\ to\ manually\ modify\ the\ zip.html\ file_\ ·\ Issue\ #184\ ·\ gildas-lormeau_SingleFileZ\ \(9_8_2024\ 4_40_44\ PM\).zip.html of=How\ to\ manually\ modify\ the\ zip.html\ file_\ ·\ Issue\ #184\ ·\ gildas-lormeau_SingleFileZ\ \(9_8_2024\ 4_40_44\ PM\).zip.html.part2 bs=1 skip=57564
1334105+0 records in
1334105+0 records out
1334105 bytes (1,3 MB, 1,3 MiB) copied, 2,4241 s, 550 kB/s
/tmp $ file How\ to\ manually\ modify\ the\ zip.html\ file_\ ·\ Issue\ #184\ ·\ gildas-lormeau_SingleFileZ\ \(9_8_2024\ 4_40_44\ PM\).zip.html*
How to manually modify the zip.html file_ · Issue #184 · gildas-lormeau_SingleFileZ (9_8_2024 4_40_44 PM).zip.html:       data
How to manually modify the zip.html file_ · Issue #184 · gildas-lormeau_SingleFileZ (9_8_2024 4_40_44 PM).zip.html.part1: HTML document, UTF-8 Unicode text, with very long lines
How to manually modify the zip.html file_ · Issue #184 · gildas-lormeau_SingleFileZ (9_8_2024 4_40_44 PM).zip.html.part2: Zip archive data, at least v2.0 to extract
/tmp $ 
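The `dd` calls above use `bs=1`, which copies one byte per read and is slow on larger files. As a sketch of an alternative, assuming bash and a `head` that supports `-c` (GNU and BSD do), the same split can be done with `head` and `tail`. The function name `split_at_offset` and the file name `page.zip.html` are hypothetical.

```shell
# split_at_offset FILE OFFSET
# Writes the first OFFSET bytes of FILE to FILE.part1 (the HTML part)
# and everything after them to FILE.part2 (the ZIP part).
split_at_offset() {
  local file=$1 offset=$2
  head -c "$offset" "$file" > "$file.part1"
  # tail -c +N starts output at byte N, counting from 1, hence offset + 1.
  tail -c "+$((offset + 1))" "$file" > "$file.part2"
}

# Hypothetical usage, with the offset found in this thread:
#   split_at_offset page.zip.html 57564
# Check that concatenating the unmodified parts reproduces the original:
#   cat page.zip.html.part1 page.zip.html.part2 | cmp - page.zip.html
```

The `cmp` check only confirms that the split itself is lossless. It says nothing about whether a modified ZIP part will still load once it is glued back on.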

OK, I split it, modified the zip part, and put it back together, but now it is not loading the page anymore. So there seems to be some more magic in there than I can determine. Maybe some expert on that html.zip hybrid format could give a hint ;)

So it looks like there is at least a third part, but I don't have the desire to play around with it any more right now...
Screenshot from 2024-09-08 17-05-10

Also, it seems like there are comments within the zip data stream that would be lost when unzipping and re-zipping the data.

@nanderer It's actually a long story. You can find some info about the format in this presentation: https://github.com/gildas-lormeau/Polyglot-HTML-ZIP-PNG. I guess that the entry offsets in the ZIP data are incorrect in your file. The extra data is used for Chromium and WebKit-based browsers to recover corrupted data when reading the ZIP payload from the DOM.
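One way to check the guess about incorrect offsets is to compare where the ZIP end-of-central-directory record says the central directory starts with where it actually sits in the file. The sketch below is an assumption-heavy diagnostic, not part of SingleFileZ. It assumes bash, a little-endian host (for `od -tu4`), no ZIP64, that the last `PK` 0x05 0x06 signature in the file is the real end record, and that this record directly follows the central directory. The function name `zip_offset_shift` is made up.

```shell
# zip_offset_shift FILE
# Prints the central directory offset recorded in the end-of-central-directory
# (EOCD) record, the position where the directory actually starts, and the
# difference between the two.
zip_offset_shift() {
  local file=$1 sig eocd size recorded actual
  sig=$(printf 'PK\005\006')
  # Byte offset of the last EOCD signature in the file.
  eocd=$(LC_ALL=C grep -obaF -- "$sig" "$file" | tail -n 1 | cut -d: -f1)
  [ -n "$eocd" ] || return 1
  # EOCD layout: bytes 12..15 = central directory size,
  # bytes 16..19 = central directory offset (both little-endian uint32).
  size=$(od -An -tu4 -j "$((eocd + 12))" -N 4 "$file" | tr -d ' ')
  recorded=$(od -An -tu4 -j "$((eocd + 16))" -N 4 "$file" | tr -d ' ')
  # Assumption: the central directory ends exactly where the EOCD begins.
  actual=$((eocd - size))
  printf 'recorded=%s actual=%s shift=%s\n' "$recorded" "$actual" "$((actual - recorded))"
}
```

A non-zero `shift` on a reassembled file would suggest that the offsets were written relative to the start of the ZIP part rather than the start of the whole file. Compare the result with the output for an untouched file saved by SingleFileZ before drawing conclusions. Info-ZIP's `zip -A` is meant to adjust offsets after data has been prepended to an archive, but whether that is sufficient for this format is not confirmed in this thread.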

Thanks for looking into it. What would be the correct offset for the file, and how do I determine it?

$ sha256sum How\ to\ manually\ modify\ the\ zip.html\ file_\ ·\ Issue\ #184\ ·\ gildas-lormeau_SingleFileZ\ \(9_8_2024\ 4_40_44\ PM\).zip.html 
209a76e1de8d8da9ce9fbdbe34c1fa4e12d4757eb5c941c7435de6e24f3221b2  How to manually modify the zip.html file_ · Issue #184 · gildas-lormeau_SingleFileZ (9_8_2024 4_40_44 PM).zip.html

How to manually modify the zip.html file_ · Issue #184 · gildas-lormeau_SingleFileZ (9_8_2024 4_40_44 PM).zip.html.zip

Does that sfz-extra-data depend on the zip part? If so, how can it be regenerated?