dasch-swiss/sipi

When a PDF in /tmp is requested, Sipi redirects to info.json

benjamingeer opened this issue · 10 comments

Steps to reproduce:

  1. Check out the wip/1202-upload-non-images branch in Knora.

  2. In one terminal window, cd Knora/sipi and start Sipi using the instructions in that directory in README.md under "With Docker".

  3. In another terminal window, upload a PDF file to Sipi using Knora's upload.lua:

$ cd Knora/webapi/_test_data/test_route/files/
$ curl -F "file=@minimal.pdf;filename=minimal.pdf" http://0.0.0.0:1024/upload?token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJLbm9yYSIsInN1YiI6Imh0dHA6Ly9yZGZoLmNoL3VzZXJzLzlYQkNyRFYzU1JhN2tTMVd3eW5CNFEiLCJhdWQiOlsiS25vcmEiLCJTaXBpIl0sImV4cCI6NDY5NTE5MzYwNSwiaWF0IjoxNTQxNTkzNjA1LCJqdGkiOiJsZmdreWJqRlM5Q1NiV19NeVA0SGV3IiwiZm9vIjoiYmFyIn0.qPMJjv8tVOM7KKDxR4Dmdz_kB0FzTOtJBYHSp62Dilk
{
   "uploadedFiles": [
      {
         "temporaryUrl": "http://0.0.0.0:1024/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf",
         "internalFilename": "9sIYG85Qjen-EirGUiR6J0K.pdf",
         "fileType": "document",
         "originalFilename": "minimal.pdf"
      }
   ]
}

This line appears in Sipi's console output, showing that the file was uploaded successfully to Sipi's tmp directory:

sipi            | Sipi: upload.lua: wrote non-image file to /sipi/images/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf

  4. Request the PDF file from Sipi's tmp directory:

$ curl -v http://0.0.0.0:1024/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf
*   Trying 0.0.0.0...
* TCP_NODELAY set
* Connected to 0.0.0.0 (127.0.0.1) port 1024 (#0)
> GET /tmp/9sIYG85Qjen-EirGUiR6J0K.pdf HTTP/1.1
> Host: 0.0.0.0:1024
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 303 See other
< Content-Type: text/plain
< Connection: close
< Location: http://0.0.0.0:1024/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf/info.json
< Content-Length: 73
< 
* Closing connection 0
Redirect to http://0.0.0.0:1024/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf/info.json

This line appears in Sipi's console output, confirming that it returned a redirect:

sipi            | Sipi: GET: redirect to http://0.0.0.0:1024/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf/info.json
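
To check for this symptom without curl, a minimal Python sketch follows. The function names `fetch_without_redirect` and `is_info_json_redirect` are hypothetical helpers made up for illustration; they are not part of Sipi or Knora. The sketch assumes Sipi is reachable over plain HTTP at the URL returned by the upload. The 303 to `info.json` resembles what the IIIF Image API recommends for an image's base URI, but that reading is an assumption and is not confirmed in this thread.

```python
import http.client
from urllib.parse import urlsplit


def fetch_without_redirect(url):
    """GET the URL once and return (status, Location header) without following redirects."""
    parts = urlsplit(url)
    # http.client never follows redirects on its own, so the 303 stays visible.
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=10)
    try:
        conn.request("GET", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_info_json_redirect(url, status, location):
    """True if the response is a 303 pointing at <url>/info.json, as in the transcript above."""
    return status == 303 and location == url.rstrip("/") + "/info.json"


if __name__ == "__main__":
    # URL taken from the upload response above; requires a running Sipi instance.
    tmp_url = "http://0.0.0.0:1024/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf"
    status, location = fetch_without_redirect(tmp_url)
    print(status, location, is_info_json_redirect(tmp_url, status, location))
```

With the behaviour reported here, the check would be true for the uploaded PDF; once the file is served directly, it should be false.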

If I issue the GET request in a browser instead of with curl, this appears in the browser:

Internal Server Error: Sipi image error at [/sipi/src/SipiImage.cpp: 373]: Could not read file /sipi/images/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf

And this appears in Sipi's console output:

sipi            | Sipi: GET /tmp/9sIYG85Qjen-EirGUiR6J0K.pdf: file /sipi/images/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf
sipi            | Sipi: GET: redirect to http://0.0.0.0:1024/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf/info.json
sipi            | Sipi: ERROR IN TIFF! Module: /sipi/images/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf
sipi            | Sipi: Not a TIFF or MDI file, bad magic number 20517 (0x5025)
sipi            | Sipi: GET /tmp/9sIYG85Qjen-EirGUiR6J0K.pdf/info.json failed (Internal Server Error): Sipi image error at [/sipi/src/SipiImage.cpp: 373]: Could not read file /sipi/images/tmp/9sIYG85Qjen-EirGUiR6J0K.pdf

But the PDF file is there in the filesystem under Knora/sipi/images/tmp.
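
The magic number in the TIFF error above suggests that Sipi tried to parse the PDF as an image rather than failing to find it. A PDF starts with the bytes `%PDF`, and reading the first two bytes as a little-endian 16-bit integer gives exactly the value in the log. The sketch below only illustrates that arithmetic. `tiff_style_magic` is a hypothetical helper, and the little-endian read is an assumption about how the value in the message was produced.

```python
import struct


def tiff_style_magic(first_bytes: bytes) -> int:
    """Interpret the first two bytes as a little-endian unsigned 16-bit integer."""
    (magic,) = struct.unpack("<H", first_bytes[:2])
    return magic


# Every PDF file begins with "%PDF"; '%' is 0x25 and 'P' is 0x50.
pdf_magic = tiff_style_magic(b"%PDF")
print(pdf_magic, hex(pdf_magic))  # 20517 0x5025

# For comparison, a little-endian TIFF begins with "II".
print(hex(tiff_style_magic(b"II*\x00")))  # 0x4949
```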

Needed by dasch-swiss/dsp-api#1206.
Needed by dasch-swiss/dsp-api#1202.

The redirect also happens if you upload a PNG image and request it from tmp.

@benjamingeer Finally I'm working on it!

@benjamingeer @subotic If I start Sipi with the local image:

../../Sipi/build/sipi --config config/sipi.knora-local-config.lua

it works as it should. Could it be that the Sipi Docker image is somehow outdated, or that it uses a different config file?

> Could it be that the Sipi Docker image is somehow outdated, or that it uses a different config file?

Maybe, and probably. We don’t have a satisfactory way of making sure that the Sipi config is updated across the different repositories. It is currently a manual and error-prone process.

It would help if the Sipi documentation had config migration notes.

I can’t compile Sipi from source because of #299. @subotic, in that issue you said that I should use the Docker image, and that it would take the config file from Knora’s sipi directory.

The currently configured (supported) Sipi image used in the Knora develop branch is dhlabbasel/sipi:v1.4.3 (see here). I didn't have time to update Sipi to the newest version yet. For the newest Sipi to work with Knora, you would probably need to update the config in Knora's sipi directory. Also, it would be good to change https://github.com/dhlab-basel/Knora/blob/fb7adba749bf6b37ed36f49f2d6cb3f553b92382/project/Dependencies.scala#L52.

To be able to develop Knora, I need to be able to run the current development version of Sipi. This means that I either need to be able to compile it from source, or I need to have it in a Docker image. @lrosenth what would you suggest?

There is a published Docker image of each develop version: https://cloud.docker.com/u/dhlabbasel/repository/docker/dhlabbasel/sipi/tags

You can change the tag in the sipi/docker-compose.yml in your Knora branch. If you need Sipi from a branch other than develop or any of the releases, then you would need to compile it locally. The only sure way to compile it locally (if building under macOS fails) is to build inside Docker (see the docs). Alternatively, you could build a Docker image locally (see .travis.yml).

@subotic That works, thank you! I didn't know about these different Docker images for Sipi; it would help to mention this in the Sipi docs.