Flowko/website-shot

Trying to take screenshots of medical studies and failing with an error.

Leopere opened this issue · 6 comments

webshot_webshot.1.z90jb671hvo4@macmini1    | $ nuxt start
webshot_webshot.1.z90jb671hvo4@macmini1    | ℹ Listening on: http://10.0.1.187:3000/
webshot_webshot.1.z90jb671hvo4@macmini1    | /
webshot_webshot.1.z90jb671hvo4@macmini1    | /api/screenshot
webshot_webshot.1.z90jb671hvo4@macmini1    | net::ERR_ABORTED at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7244430/
webshot_webshot.1.z90jb671hvo4@macmini1    |
webshot_webshot.1.z90jb671hvo4@macmini1    |   at navigate (node_modules/puppeteer/lib/cjs/puppeteer/common/FrameManager.js:195:23)
webshot_webshot.1.z90jb671hvo4@macmini1    |   at processTicksAndRejections (node:internal/process/task_queues:96:5)
webshot_webshot.1.z90jb671hvo4@macmini1    |   at async FrameManager.navigateFrame (node_modules/puppeteer/lib/cjs/puppeteer/common/FrameManager.js:171:21)
webshot_webshot.1.z90jb671hvo4@macmini1    |   at async Frame.goto (node_modules/puppeteer/lib/cjs/puppeteer/common/FrameManager.js:584:16)
webshot_webshot.1.z90jb671hvo4@macmini1    |   at async Page.goto (node_modules/puppeteer/lib/cjs/puppeteer/common/Page.js:1109:16)
webshot_webshot.1.z90jb671hvo4@macmini1    |   at async internalCaptureWebsiteCore (server-middleware/capture.js:295:3)
webshot_webshot.1.z90jb671hvo4@macmini1    |   at async internalCaptureWebsite (server-middleware/capture.js:168:12)
webshot_webshot.1.z90jb671hvo4@macmini1    |   at async server-middleware/website-shot.js:167:5

I pulled this log from a failed attempt to take a screenshot of a medical journal article. Moments afterwards I used the same instance to take a screenshot of this Git repository with full success. Not sure what's wrong.
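For anyone triaging a similar log, here is a minimal sketch of pulling the Chromium network error code out of a navigation error message shaped like the one above. `parseNavigationError` is a hypothetical helper, not part of website-shot or Puppeteer; it only assumes the message looks like `net::ERR_ABORTED at <url>`, as it does in this log.

```javascript
// Hypothetical helper (not part of website-shot or Puppeteer).
// Extracts the network error code and URL from a message shaped like
// "net::ERR_ABORTED at https://example.com/".
function parseNavigationError(err) {
  const message = err && err.message ? String(err.message) : '';
  const match = message.match(/(net::ERR_[A-Z0-9_]+)(?: at (\S+))?/);
  if (!match) {
    // Not a Chromium network error (or a shape this sketch does not know).
    return null;
  }
  return { code: match[1], url: match[2] || null };
}
```

Logging `code` separately makes it easier to tell a blocked or aborted request apart from, say, a DNS failure, which would carry a different `net::ERR_` code.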

Hi, can you send the options you had selected as well?

Though even after the fix, you will still run into this, because they are blocking the requests:
image

Can you test the latest version, please, and let me know if all is good?

Absolutely. Also, here's more feedback.
image
At least now it's not throwing an error in the console.

$ docker service logs -f webshot_webshot
webshot_webshot.1.vstdf41tchtf@macmini1    | yarn run v1.22.19
webshot_webshot.1.vstdf41tchtf@macmini1    | $ nuxt start
webshot_webshot.1.vstdf41tchtf@macmini1    | ℹ Listening on: http://10.0.1.189:3000/
webshot_webshot.1.vstdf41tchtf@macmini1    | /
webshot_webshot.1.vstdf41tchtf@macmini1    | /api/screenshot
webshot_webshot.1.vstdf41tchtf@macmini1    | /api/screenshot

Docker swarm stack file (Compose)

version: "3.9"
services:
  webshot:
    image: flowko1/website-shot:latest
    dns:
      - "1.1.1.1"
    volumes:
      - /redacted/data:/usr/src/website-shot/screenshots
    deploy:
      replicas: 1
      placement:
        constraints:
          # - node.labels.role == db
          # - node.hostname == redacted
          - node.labels.home-rack == true
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.webshot.tls=true"
        - "traefik.http.services.webshot.loadbalancer.server.port=3000"
        - "traefik.http.routers.webshot.rule=Host(`webshot.redacted.com`)"
        - "traefik.http.routers.webshot.entrypoints=websecure"
        - "traefik.http.routers.webshot.tls.certresolver=letsencryptresolver"
        - "traefik.http.routers.webshot.service=webshot"
        - "traefik.docker.network=traefik"
        - 'traefik.http.routers.webshot.middlewares=authelia@docker'
    networks:
      - traefik

networks:
  traefik:
    external: true

That's probably due to restrictions they have on that website. I will see if #43 solves the issue.

Yep, adding a user agent seems to fix that.
image
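For reference, a minimal sketch of what setting a user agent before navigation can look like with Puppeteer. This is not necessarily how #43 or website-shot implements it: `gotoWithUserAgent` is a hypothetical helper, the user-agent string is a placeholder, and the `waitUntil` value is an assumption. It relies only on `page.setUserAgent()` and `page.goto()`, which are part of Puppeteer's `Page` API.

```javascript
// Sketch only: `page` is assumed to be a Puppeteer Page. The default
// user-agent string is a placeholder, not the value website-shot uses.
const DEFAULT_USER_AGENT =
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 ' +
  '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';

// Hypothetical helper: override the user agent, then navigate.
async function gotoWithUserAgent(page, url, userAgent = DEFAULT_USER_AGENT) {
  // Set the user agent first so the very first request already carries it.
  await page.setUserAgent(userAgent);
  // 'networkidle2' is an assumed choice; use whatever wait condition fits.
  return page.goto(url, { waitUntil: 'networkidle2' });
}
```

With a real browser, `page` would come from `puppeteer.launch()` followed by `browser.newPage()`. Sites that reject the default headless user agent may still block on other signals, so this is a first thing to try rather than a guaranteed fix.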