allenai/pdf-component-library

Remove generated hash file from library bundle

Closed this issue · 6 comments

The PDF library imported into S2 is not being initialized with a web worker. pdfjs uses a web worker to process most time-consuming tasks asynchronously. Without one, we see a delay when rendering the PDF reader page, and at times the page silently fails to load.

The PDF component library package contains some hash files. These are the worker files, which need to be referenced when the library is initialized in S2. They are the first two files in this screenshot:

image

I have a hunch that instead of removing this hash file, we will need to update initPdfWorker() to point workerSrc to it instead. It will probably need to be a different value for dev vs prod builds (webpack updates).
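A rough sketch of that idea, assuming the worker file ends up at a different path in each build. The `BuildMode` type, the `setWorkerSrcForBuild` helper, and both paths are hypothetical placeholders, not the library's actual output locations; only `workerSrc` comes from pdfjs.

```typescript
// Hypothetical build modes; a real setup would likely read this from the bundler config.
type BuildMode = "development" | "production";

// Assumed worker locations, for illustration only.
const WORKER_PATHS: Record<BuildMode, string> = {
  development: "/pdf.worker.js",
  production: "/static/pdf.worker.min.js",
};

// Minimal shape of the pdfjs option this helper touches.
interface WorkerOptions {
  workerSrc: string;
}

// Point workerSrc at the worker file for the given build mode and return the value set.
function setWorkerSrcForBuild(options: WorkerOptions, mode: BuildMode): string {
  options.workerSrc = WORKER_PATHS[mode];
  return options.workerSrc;
}
```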

Partially tested this by moving the hash file to the top level of the demo. The "no worker" error disappears and the PDF renders successfully.

Possibly relevant: ReactPdfImport.js in the Scholar project. Not sure why this exists.

Fixed this issue in PDF CL v0.0.9 by pointing the pdfjs workerSrc to a CDN instead of loading the worker through Webpack. This fixes the following issues:

  • Unexpected hash file produced in dist bundle
  • Fake worker warning on PDF load
  • (Expected, but will not be tested until the prod deploy) The PDF page sometimes needed to be refreshed in order to load
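
For reference, the CDN approach can be sketched roughly as follows. This is not the code from v0.0.9: the cdnjs URL pattern, the `PdfjsLike` interface, and the function names are assumptions for illustration, and the real initPdfWorker() may differ. `version` and `GlobalWorkerOptions.workerSrc` are the existing pdfjs fields.

```typescript
// Minimal shape of the pdfjs object that this sketch relies on.
interface PdfjsLike {
  version: string;
  GlobalWorkerOptions: { workerSrc: string };
}

// Build a CDN URL for the worker matching the given pdfjs version.
// The cdnjs host and file name are assumed; the issue does not say which CDN is used.
function buildCdnWorkerSrc(version: string): string {
  return `https://cdnjs.cloudflare.com/ajax/libs/pdf.js/${version}/pdf.worker.min.js`;
}

// Point pdfjs at the CDN-hosted worker so no worker file is emitted by the bundler.
function initPdfWorkerFromCdn(pdfjs: PdfjsLike): void {
  pdfjs.GlobalWorkerOptions.workerSrc = buildCdnWorkerSrc(pdfjs.version);
}
```

Deriving the URL from `pdfjs.version` keeps the worker in step with the API version, which pdfjs checks when the worker starts.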

New library output:
image

Created a ticket to add a test that checks the CDN this now relies on is up: https://github.com/allenai/scholar/issues/30552
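
Such a check might look roughly like the sketch below. It is only a guess at what the ticket has in mind: `isWorkerCdnUp` and `HeadFetcher` are hypothetical names, and the fetch function is passed in so the check can run against a stub.

```typescript
// Minimal fetch-like signature; the global fetch satisfies it.
type HeadFetcher = (url: string, init: { method: string }) => Promise<{ ok: boolean }>;

// Resolve to true when the worker URL answers a HEAD request with a 2xx status.
// Network failures resolve to false rather than rejecting.
async function isWorkerCdnUp(workerUrl: string, fetcher: HeadFetcher): Promise<boolean> {
  try {
    const response = await fetcher(workerUrl, { method: "HEAD" });
    return response.ok;
  } catch {
    return false;
  }
}
```

In a real test the second argument would be the global `fetch`, and the first would be whatever value workerSrc is set to.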