Word found unreadable content in "report.odt" (and workaround)
- Given a template.odt that MS Word opens without error,
- Use odf-report 0.7.0 to generate report.odt
- Opening report.odt in MS Word (Office 365, v. 16.36) produces the following error:
Word found unreadable content in "report.odt". Do you want to recover the contents of this document?
Workaround
Interestingly, after unzipping and re-zipping the report, MS Word does not complain!
unzip report.odt -d odt_contents
cd odt_contents
zip -r ../report_rezipped.odt .
# `report_rezipped.odt` can be opened in MS Word without complaint.
zip -h
Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
Zip 3.0 (July 5th 2008).
Maybe rubyzip is doing something that MS Word doesn't like, and re-zipping with "official" zip fixes it?
Now that I have report.odt and report_rezipped.odt, and I can compare them to each other, what should I be looking for? Would you like to see an unzip -l on each?
bundle | egrep -e '(rubyzip|odf-report|mime-types|nokogiri)'
Using nokogiri 1.10.9
Using mime-types-data 3.2019.1009
Using mime-types 3.3.1
Using rubyzip 2.3.0
Using odf-report 0.7.0
I think this is a problem similar to #104. I haven't had the time to look into it, but this week will be OSS week, :-) I'll have a look
Hi Sandro.
I think this is a problem similar to #104.
My gut says, similar maybe, but not exactly the same. #104 describes a problem in META-INF/manifest.xml. However, when I compare the contents of my report.odt (which Word dislikes) with my report_rezipped.odt (which Word likes) there is no difference in manifest.xml.
unzip -vl report.odt
Length Method Size Cmpr Date Time CRC-32 Name
-------- ------ ------- ---- ---------- ----- -------- ----
...
962 Defl:N 266 72% 04-27-2020 20:30 7f41a5b1 META-INF/manifest.xml
...
unzip -vl report_rezipped.odt
...
962 Defl:N 266 72% 04-27-2020 20:30 7f41a5b1 META-INF/manifest.xml
...
Note the identical CRC-32 checksum.
I suspect we're using rubyzip incorrectly, but don't have more details yet.
I haven't had the time to look into it, but this week will be OSS week, :-) I'll have a look
Thanks, and please let me know if I can help in any way.
Well, I couldn't find any obvious culprits. I'm doing the same rubyzip handling I always did, except editing the MANIFEST.XML for the repeated images.
I still think #104 is related. Although the OP found a difference in MANIFEST.XML, he removed the "\n" at the end and had to rezip the file. I suspect the rezipping solved the problem, not the removal.
I'm gonna try to find a Windows machine to run some tests (I'm a Mac user).
Could #67 be related also?
Although the OP found a difference in MANIFEST.XML, he removed the "\n" at the end and had to rezip the file. I suspect the rezipping solved the problem, not the removal.
Yeah, it's possible the newline is a red herring.
I'm gonna try to find a Windows machine to run some tests (I'm a Mac user).
Actually, in my steps-to-reproduce above, I was using Word for Mac. So, you can reproduce this on a Mac.
Could #67 be related also?
I'll try the binread patch and see if I can still reproduce the issue.
Could #67 be related also?
I'll try the binread patch and see if I can still reproduce the issue.
After applying this patch, I'm still able to reproduce this issue. Probably, the patch didn't help because I have no images in my template file. 🤦
I think I nailed it. Will be releasing 0.7.2 shortly.
0.7.2 seems to work for me. I can remove my Rezipper class 🎉 Thanks Sandro.