authenticvision/digitalsoul

Failed to write image cache / Image uploads fail

Closed issue · 1 comment

Not sure if this is an issue yet, but it is certainly a warning.
After deploying v0.0.3 to our Kubernetes cluster and visiting the NFT overview of RDFC, I get a warning in the logs about the image cache:

kubectl --namespace metaanchor-digitalsoul logs -f digitalsoul-59f44bdf9c-sr2xr
- ready started server on 0.0.0.0:8080, url: http://localhost:8080


- warn "next" should not be imported directly, imported in /srv/app/.next/server/pages/index.js
See more info here: https://nextjs.org/docs/messages/import-next
Warning: For production Image Optimization with Next.js, the optional 'sharp' package is strongly recommended. Run 'npm i sharp', and Next.js will use it automatically for Image Optimization.
Read more: https://nextjs.org/docs/messages/sharp-missing-in-production
Failed to write image to cache gbguDfDgRFyqDd1eqCRG9oFfVAnkcTKm3Ye4VNAI9xQ= [Error: ENOENT: no such file or directory, mkdir '/srv/app/.next/cache/images'] {
  errno: -2,
  code: 'ENOENT',
  syscall: 'mkdir',
  path: '/srv/app/.next/cache/images'
}

I tried uploading an image:
[image]

The upload proceeds to 100%, but then the frontend gets stuck. The logs reveal:

TypeError: Cannot read properties of null (reading 'id')
    at getServerSideProps (/srv/app/.next/server/pages/contracts.js:239:33)
TypeError: Cannot read properties of null (reading 'id')
    at getServerSideProps (/srv/app/.next/server/pages/contracts.js:239:33)
TypeError: Cannot read properties of null (reading 'id')
    at getServerSideProps (/srv/app/.next/server/pages/contracts.js:239:33)
TypeError: Cannot read properties of null (reading 'id')
    at getServerSideProps (/srv/app/.next/server/pages/contracts.js:239:33)
[Error: EROFS: read-only file system, open '/tmp/upload_b31972bc95c8abd1d9bee5ba08d07272'] {
  errno: -30,
  code: 'EROFS',
  syscall: 'open',
  path: '/tmp/upload_b31972bc95c8abd1d9bee5ba08d07272'
}
Error: ENOENT: no such file or directory, open '/tmp/upload_b31972bc95c8abd1d9bee5ba08d07272'
    at Object.openSync (node:fs:603:3)
    at readFileSync (node:fs:471:35)
    at /srv/app/.next/server/pages/api/internal/assets/[anchor].js:281:77
    at IncomingForm.<anonymous> (/srv/app/node_modules/formidable-serverless/lib/index.js:33:9)
    at IncomingForm.emit (node:events:517:28)
    at IncomingForm._maybeEnd (/srv/app/node_modules/formidable-serverless/node_modules/formidable/lib/incoming_form.js:563:8)
    at /srv/app/node_modules/formidable-serverless/node_modules/formidable/lib/incoming_form.js:242:12
    at /srv/app/node_modules/formidable-serverless/node_modules/formidable/lib/file.js:79:5
    at process.processTicksAndRejections (node:internal/process/task_queues:81:21) {
  errno: -2,
  syscall: 'open',
  code: 'ENOENT',
  path: '/tmp/upload_b31972bc95c8abd1d9bee5ba08d07272'
}

Root causes
The file upload writes its temporary files to /tmp/, which is not part of the mounted /srv/app/nftdata volume. In a production environment (Kubernetes), the logs show that this filesystem is read-only (EROFS), so writing to /tmp/* is not possible.

The same likely applies to /srv/app/.next, where the image cache could not be created.
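
To confirm which of these paths the container can actually write to, a small probe could be run inside the pod. This is a minimal sketch and not part of digitalsoul: the function name canWriteTo is hypothetical, and the paths are simply the ones that appear in the logs above.

```typescript
import { mkdtempSync, rmdirSync } from "node:fs";
import { join } from "node:path";

// Returns true if a directory can be created (and removed again) inside `dir`.
// A missing parent (ENOENT), a read-only mount (EROFS) and missing
// permissions (EACCES) all result in false.
export function canWriteTo(dir: string): boolean {
  try {
    const probe = mkdtempSync(join(dir, ".probe-"));
    rmdirSync(probe);
    return true;
  } catch {
    return false;
  }
}

// Paths taken from the logs above; adjust them to your deployment.
for (const dir of ["/tmp", "/srv/app/.next/cache", "/srv/app/nftdata"]) {
  console.log(`${dir}: ${canWriteTo(dir) ? "writable" : "NOT writable"}`);
}
```

Actually creating a directory is used here instead of only checking permissions, because that exercises the same mkdir/open calls that fail in the logs.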

Possible solutions

  • Work without temporary files
  • Use /srv/app/nftdata/tmp/ as the temporary directory, since this volume needs to be mounted anyway
  • Create two separate volumes, /tmp and /srv/app/.next, as temporary filesystems in production as well

There is a conceptual issue with the trivial solution of mounting tmpfs volumes: K8s's memory accounting adds any tmpfs memory to your container's resource usage:

The memory limit for the Pod or container can also apply to pages in memory backed volumes, such as an emptyDir. The kubelet tracks tmpfs emptyDir volumes as container memory use, rather than as local ephemeral storage.

If the user uploads a large file that exceeds your resource quota, then K8s will kill and restart your entire container mid-upload.
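
As a rough illustration of that accounting, assuming the simplified model that the application's own memory plus everything stored in tmpfs is compared against the container's memory limit (all numbers below are hypothetical, and the function name is made up for this example):

```typescript
// Simplified model: tmpfs pages are counted as container memory,
// so they are added to the application's own usage. Values are in MiB.
export function exceedsMemoryLimit(
  limitMiB: number,
  appMemoryMiB: number,
  tmpfsMiB: number,
): boolean {
  return appMemoryMiB + tmpfsMiB > limitMiB;
}

// Hypothetical pod with a 512 MiB limit and an app using 300 MiB:
console.log(exceedsMemoryLimit(512, 300, 100)); // false (400 MiB in total)
console.log(exceedsMemoryLimit(512, 300, 250)); // true (550 MiB in total)
```

In the second case, a single 250 MiB upload buffered in a tmpfs-backed /tmp would be enough to push the container over its limit.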

I'd suggest the following:

  • First move the application's code out of /srv, so that there is a clean separation between code (read-only) and data (read-write). FHS-standard locations would be /opt/digitalsoul (if node_modules contains architecture-specific files) or /usr/share/digitalsoul/app (if it's 100% interpreted code).
  • Cosmetic, adding to the previous point: Move /srv/app/nftdata one folder up, to /srv/nftdata (can be done by editing the deployment yaml)
  • Create /srv/nftdata/tmp like in your solution, and have your application place in-progress uploads there so that they are backed by physical storage.
  • Move, don't copy, the file once an upload is completed. Within the same filesystem a move is only a rename and happens in near-zero time, so your application is even faster than if you were to write to /tmp.
  • Create a routine that runs regularly and deletes files older than, say, 24 hours from the temp folder.
  • Ensure that the folder name of .next is user-controllable, e.g. through an env var
  • If the contents of .next need to be persisted or if they are large, then also place it into the nftdata folder by setting the env var. Ensure that user uploads live in a separate folder and cannot collide with application-internal data.
  • Otherwise, if the contents of .next are small or ephemeral, contact me to have /tmp backed by a Linux tmpfs, and set the env var so that the Next.js files are placed into /tmp/next.
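
The "move, don't copy" and cleanup points above could look roughly like the following. This is a sketch under assumptions, not the digitalsoul implementation: the function names finalizeUpload and purgeStaleUploads are hypothetical, the file name is assumed to be sanitized already, and both directories are assumed to live on the same mounted volume (for example /srv/nftdata/tmp and a sibling folder below /srv/nftdata).

```typescript
import { mkdirSync, readdirSync, renameSync, statSync, unlinkSync } from "node:fs";
import { join } from "node:path";

// Moves a finished upload from the temp folder to its final location.
// renameSync only relinks the file, so no data is copied; it throws EXDEV
// if source and target are on different filesystems.
export function finalizeUpload(tmpPath: string, finalDir: string, fileName: string): string {
  mkdirSync(finalDir, { recursive: true });
  const target = join(finalDir, fileName); // fileName is assumed to be sanitized
  renameSync(tmpPath, target);
  return target;
}

// Deletes regular files in `tmpDir` that were last modified more than
// `maxAgeMs` milliseconds ago and returns the names of the removed files.
export function purgeStaleUploads(
  tmpDir: string,
  maxAgeMs: number,
  now: number = Date.now(),
): string[] {
  const removed: string[] = [];
  for (const name of readdirSync(tmpDir)) {
    const path = join(tmpDir, name);
    const info = statSync(path);
    if (info.isFile() && now - info.mtimeMs > maxAgeMs) {
      unlinkSync(path);
      removed.push(name);
    }
  }
  return removed;
}
```

The cleanup could then be scheduled from the server process, for instance with setInterval(() => purgeStaleUploads("/srv/nftdata/tmp", 24 * 60 * 60 * 1000), 60 * 60 * 1000), or run as a separate Kubernetes CronJob. Which of the two fits better depends on how the deployment is set up.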