Corrupted File Error

Question

Closed this issue 4 months ago · 4 comments

Reporting an Issue Here

Expected Behavior

After PdfReader.Open is called on the PDF file, the file is expected to open for reading.

Actual Behavior

It throws an error in the program below.
It would be nice to have some kind of repair functionality, or the ability to open this PDF, since it opens just fine in PDF viewer programs.

Steps to Reproduce the Behavior

Run attached solution and it will error out.
Zip attached. PDF files also attached that it errors out on.
PDFsharp.IssueSubmissionTemplate.zip
TestUser%20%202%202023W2.pdf
TestUser202023W2.pdf

Answer 1 · 2024-07-22T06:53:10.000Z

Tried to open the PDF with Adobe Reader and got this:

If the file does not open with Adobe Reader, then I assume there is something wrong with the file.

Looks like some sort of archive file containing several files. Should open fine with PDFsharp if your code removes the extra headers and trailers before sending the contents to PDFsharp.

Answer 2 · 2024-07-22T09:16:58.000Z

A valid PDF file starts with %PDF-x.y, e.g. %PDF-1.5. Your file starts with
PK�� ô0òXœb4kV?� V?� � FormW2_TestUser2_782477.pdf%PDF-1.5
(open it in Notepad++)
The first PDF ends at line 1929 with %%EOF. On the next line a new PDF file begins:
PK�� ô0òX”§ë¾@?� @?� � FormW2_estUser2_782478.pdf%PDF-1.5
Your file seems to be some kind of concatenation of 7 PDF files. I have never seen this before. Interestingly, some browsers can open it but Adobe Reader cannot. I tried to extract the first two PDF parts with Notepad++, but Adobe Reader still cannot open the extracted files.
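The header check described above can also be done programmatically instead of in a text editor. The following is a minimal sketch in Python, not anything PDFsharp provides; `sniff_container` is a hypothetical helper name. It relies only on two well-known signatures: a PDF begins with `%PDF-`, and a ZIP local file header begins with the bytes `PK\x03\x04`.

```python
def sniff_container(data: bytes) -> str:
    """Classify a byte string by its leading signature (hypothetical helper)."""
    # A ZIP local file header starts with the bytes "PK", 0x03, 0x04.
    if data.startswith(b"PK\x03\x04"):
        return "zip"
    # A PDF file starts with a "%PDF-x.y" header, e.g. "%PDF-1.5".
    if data.startswith(b"%PDF-"):
        return "pdf"
    return "unknown"
```

Reading the first few bytes of the downloaded file in binary mode and passing them to this function would tell a caller whether the file needs unpacking before it is handed to a PDF library.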

What tool produces this file?

Answer 3 · 2024-07-22T12:33:18.000Z

This is produced using Tax1099 through their GeneratePDF API. I will see if some pre-processing alleviates the issue.

Answer 4 · 2024-07-22T22:44:43.000Z

Actually, both of the attached PDF files are ZIP files.
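Since the files are ZIP archives, the pre-processing mentioned earlier in the thread amounts to unpacking the archive and passing each contained PDF on individually. Below is a minimal sketch of that step using Python's standard `zipfile` module; `extract_pdfs` is a hypothetical helper name, and the sketch assumes the whole archive fits in memory. In a .NET program the same idea would likely use `System.IO.Compression` and copy each entry into a seekable stream before giving it to PDFsharp, but that is an assumption and not something confirmed in this thread.

```python
import io
import zipfile


def extract_pdfs(archive_bytes: bytes) -> dict:
    """Return {entry name: PDF bytes} for every PDF entry in a ZIP archive.

    Hypothetical helper: entries that do not start with a PDF header are skipped.
    """
    pdfs = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            payload = archive.read(info)
            # Keep only entries that really begin with a "%PDF-" header.
            if payload.startswith(b"%PDF-"):
                pdfs[info.filename] = payload
    return pdfs
```

Unpacking through a ZIP library, instead of cutting the parts out by hand in a text editor, leaves the binary content of each PDF byte-for-byte intact.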