gatsbyjs/gatsby-source-wordpress-experimental

Error fetching media, with no clear message about where it is coming from

DanielRiveraHQ opened this issue · 21 comments

I’m having an issue with gatsby develop and gatsby build when they pull assets from WordPress. It seems to be related to a missing asset. Usually the terminal reports the 404 along with the URL that caused it, but not this time. Does anyone have a clue? Here is a screenshot of the error:

Screen Shot 2020-12-17 at 6 27 34 PM

I'm using:
"gatsby-source-wordpress-experimental": "^5.0.2"
WPGatsby 0.8.0
WP GraphQL 1.0.3

@jasonbahl said he could reproduce the error but did not have a fix, and suggested posting it here so it can be looked into further.

In case anyone needs access to the repository, ping me, and I can add you as a collaborator.

I am facing the same issue and would like to start working on it. How would you prefer to handle it?

  1. The media node is not created, and the content ignores the missing media
  2. The content is ignored and the page is not created

Probably just the specific node, and maybe a notification to know where the error is happening. What do you think?

Also, I'm new at this, but if there is any other way I can help, let me know.

We might need advice from somebody who maintains this project. This seems to be a big decision.

I am having the same issue. I was able to pull the data successfully with the gatsby-source-wordpress@v3 plugin, and gatsby-source-graphql also worked for me. For now I'm going to use gatsby-source-graphql and eventually switch over to gatsby-source-wordpress-experimental when the kinks are worked out. According to the experimental docs, migrating from gatsby-source-graphql to gatsby-source-wordpress-experimental requires less work than migrating from v3.

With the latest build of gatsby-source-wordpress-experimental and WPGatsby 0.9, the issue is no longer reproducible.
@DanielRiveraHQ & @ryancybul can you please confirm so I can close the issue?

Hi @cstefanache, thanks for the quick response! I'm still running into the same issue after updating gatsby-source-wordpress-experimental and making sure my WP plugins WP Gatsby, WP GraphQL, and WP GraphiQL are all up to date. Below is the console log when I run Gatsby Build. I did not add any additional code to this project other than the gatsby-config.js where I just added the gatsby-source-wordpress-experimental plugin with just the graphql endpoint URL. I don't know if it matters or not but I have about 150 images in my media library and am using the Media Library Categories and reSmush.it Image Optimizer plugins as well. These plugins work with gatsby-source-wordpress@v3.

⋊> ~/c/S/wordpress-tutorial-site on master ⨯ gatsby develop  09:14:06
success open and validate gatsby-configs - 0.090s
success load plugins - 1.318s
success onPreInit - 0.054s
success initialize cache - 0.026s
success copy gatsby files - 0.126s
success onPreBootstrap - 0.023s
success  gatsby-source-wordpress  ensuring plugin requirements are met - 0.834s
⠀
info  gatsby-source-wordpress 

        This is either your first build or the cache was cleared.
        Please wait while your WordPress data is synced to your Gatsby cache.

        Maybe now's a good time to get up and stretch? :D

success  gatsby-source-wordpress  ingest WPGraphQL schema - 1.461s
success createSchemaCustomization - 2.356s
success  gatsby-source-wordpress  fetch root fields - 0.562s
success  gatsby-source-wordpress  User - 0.550s - fetched 2
success  gatsby-source-wordpress  ContentType - 0.836s - fetched 4
success  gatsby-source-wordpress  Comment - 0.992s - fetched 0
success  gatsby-source-wordpress  MenuItem - 1.025s - fetched 0
success  gatsby-source-wordpress  Category - 1.159s - fetched 4
success  gatsby-source-wordpress  Menu - 1.157s - fetched 0
success  gatsby-source-wordpress  Tag - 1.254s - fetched 1
success  gatsby-source-wordpress  Taxonomy - 1.270s - fetched 3
success  gatsby-source-wordpress  Page - 1.434s - fetched 5
success  gatsby-source-wordpress  Interview - 1.539s - fetched 10
success  gatsby-source-wordpress  UserRole - 1.396s - fetched 0
success  gatsby-source-wordpress  PostFormat - 1.560s - fetched 0
success  gatsby-source-wordpress  Post - 1.663s - fetched 6

 ERROR 

HTTPError: Response code 404 (Not Found)
    at Request._onResponseBase (/Users/ryancybul/code/Stacey Rozich/wordpress-tutorial-site/node_modules/gat
sby-source-wordpress-experimental/node_modules/got/dist/source/core/index.js:896:31)
    at Request._onResponse (/Users/ryancybul/code/Stacey Rozich/wordpress-tutorial-site/node_modules/gatsby-
source-wordpress-experimental/node_modules/got/dist/source/core/index.js:931:24)
    at ClientRequest.<anonymous> (/Users/ryancybul/code/Stacey Rozich/wordpress-tutorial-site/node_modules/g
atsby-source-wordpress-experimental/node_modules/got/dist/source/core/index.js:945:23)
    at Object.onceWrapper (events.js:418:26)
    at ClientRequest.emit (events.js:323:22)
    at ClientRequest.origin.emit (/Users/ryancybul/code/Stacey
Rozich/wordpress-tutorial-site/node_modules/@szmarczak/http-timer/dist/source/index.js:39:20)
    at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:603:27)
    at HTTPParser.parserOnHeadersComplete (_http_common.js:119:17)
    at Socket.socketOnData (_http_client.js:476:22)
    at Socket.emit (events.js:311:20)
    at addChunk (_stream_readable.js:294:12)
    at readableAddChunk (_stream_readable.js:275:11)
    at Socket.Readable.push (_stream_readable.js:209:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:186:23) {
  name: 'HTTPError',
  code: undefined,
  timings: {
    start: 1609164922490,
    socket: 1609164922490,
    lookup: 1609164922653,
    connect: 1609164922653,
    secureConnect: undefined,
    upload: 1609164922653,
    response: 1609164922783,
    end: 1609164922787,
    error: undefined,
    abort: undefined,
    phases: {
      wait: 0,
      dns: 163,
      tcp: 0,
      tls: undefined,
      request: 0,
      firstByte: 130,
      download: 4,
      total: 297
    }
  }
}


 ERROR 

The "url" argument must be of type string. Received undefined



  Error: TypeError [ERR_INVALID_ARG_TYPE]: The "url" argument must be of type string. Received undefined
  
  - validators.js:117 validateString
    internal/validators.js:117:11
  
  - create-remote-media-item-node.js:26 getMediaItemEditLink
⋊> ~/c/S/wordpress-tutorial-site on master ⨯ gatsby develop  09:15:53
success open and validate gatsby-configs - 0.079s
success load plugins - 1.669s
success onPreInit - 0.052s
success initialize cache - 0.036s
success copy gatsby files - 0.207s
success onPreBootstrap - 0.022s
success  gatsby-source-wordpress  ensuring plugin requirements are met - 0.823s
⠀
info  gatsby-source-wordpress 

        This is either your first build or the cache was cleared.
        Please wait while your WordPress data is synced to your Gatsby cache.

        Maybe now's a good time to get up and stretch? :D

success  gatsby-source-wordpress  ingest WPGraphQL schema - 1.424s
success createSchemaCustomization - 2.502s
success  gatsby-source-wordpress  fetch root fields - 0.580s
success  gatsby-source-wordpress  Taxonomy - 0.546s - fetched 3
success  gatsby-source-wordpress  Tag - 0.677s - fetched 1
success  gatsby-source-wordpress  Post - 0.848s - fetched 6
success  gatsby-source-wordpress  Page - 0.940s - fetched 5
success  gatsby-source-wordpress  ContentType - 1.021s - fetched 4
success  gatsby-source-wordpress  Comment - 1.216s - fetched 0
success  gatsby-source-wordpress  Interview - 1.240s - fetched 10
success  gatsby-source-wordpress  PostFormat - 1.397s - fetched 0
success  gatsby-source-wordpress  Menu - 1.417s - fetched 0
success  gatsby-source-wordpress  Category - 1.492s - fetched 4
success  gatsby-source-wordpress  MenuItem - 1.475s - fetched 0
success  gatsby-source-wordpress  UserRole - 1.452s - fetched 0
success  gatsby-source-wordpress  User - 1.432s - fetched 2

 ERROR 

HTTPError: Response code 404 (Not Found)
    at Request._onResponseBase (/Users/ryancybul/code/Stacey Rozich/wordpress-tutorial-site/node_modules/gat
sby-source-wordpress-experimental/node_modules/got/dist/source/core/index.js:896:31)
    at Request._onResponse (/Users/ryancybul/code/Stacey Rozich/wordpress-tutorial-site/node_modules/gatsby-
source-wordpress-experimental/node_modules/got/dist/source/core/index.js:931:24)
    at ClientRequest.<anonymous> (/Users/ryancybul/code/Stacey Rozich/wordpress-tutorial-site/node_modules/g
atsby-source-wordpress-experimental/node_modules/got/dist/source/core/index.js:945:23)
    at Object.onceWrapper (events.js:418:26)
    at ClientRequest.emit (events.js:323:22)
    at ClientRequest.origin.emit (/Users/ryancybul/code/Stacey
Rozich/wordpress-tutorial-site/node_modules/@szmarczak/http-timer/dist/source/index.js:39:20)
    at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:603:27)
    at HTTPParser.parserOnHeadersComplete (_http_common.js:119:17)
    at Socket.socketOnData (_http_client.js:476:22)
    at Socket.emit (events.js:311:20)
    at addChunk (_stream_readable.js:294:12)
    at readableAddChunk (_stream_readable.js:275:11)
    at Socket.Readable.push (_stream_readable.js:209:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:186:23) {
  name: 'HTTPError',
  code: undefined,
  timings: {
    start: 1609164973126,
    socket: 1609164973126,
    lookup: 1609164973211,
    connect: 1609164973211,
    secureConnect: undefined,
    upload: 1609164973212,
    response: 1609164973350,
    end: 1609164973351,
    error: undefined,
    abort: undefined,
    phases: {
      wait: 0,
      dns: 85,
      tcp: 0,
      tls: undefined,
      request: 1,
      firstByte: 138,
      download: 1,
      total: 225
    }
  }
}


 ERROR 

The "url" argument must be of type string. Received undefined



  Error: TypeError [ERR_INVALID_ARG_TYPE]: The "url" argument must be of type string. Received undefined
  
  - validators.js:117 validateString
    internal/validators.js:117:11
  
  - create-remote-media-item-node.js:26 getMediaItemEditLink
    [wordpress-tutorial-site]/[gatsby-source-wordpress-experimental]/src/steps/source-nodes/create-nodes/cre
ate-remote-media-item-node.js:26:38
  
  - create-remote-media-item-node.js:39 errorPanicker
    [wordpress-tutorial-site]/[gatsby-source-wordpress-experimental]/src/steps/source-nodes/create-nodes/cre
ate-remote-media-item-node.js:39:19
  
  - create-remote-media-item-node.js:281 Object.onRetry
    [wordpress-tutorial-site]/[gatsby-source-wordpress-experimental]/src/steps/source-nodes/create-nodes/cre
ate-remote-media-item-node.js:281:9
  
  - index.js:33 onError
    [wordpress-tutorial-site]/[async-retry]/lib/index.js:33:17
  
  - index.js:50 catchIt
    [wordpress-tutorial-site]/[async-retry]/lib/index.js:50:11
  
  - task_queues.js:97 processTicksAndRejections
    internal/process/task_queues.js:97:5
  

not finished source and transform nodes - 4.952s
not finished  gatsby-source-wordpress  fetching nodes - 4.943s - 40 total
not finished  gatsby-source-wordpress  creating nodes - 3.350s
not finished  gatsby-source-wordpress  MediaItem - 3.356s - fetched 5
not finished Downloading remote files - 3.035s
not finished Generating image thumbnails - 2.046s

Exactly the same error here. I have thousands of pictures on my WordPress site, and it looks like the crash is caused by only a few of them. I am trying to get more logs about this problem, but it takes a very long time because every gatsby develop run starts from the beginning.

success open and validate gatsby-configs - 0.104s
warn Warning: there are unknown plugin options for "gatsby-source-wordpress-experimental": debug.graphql.copyHtmlResponseOnError
Please open an issue at ghub.io/gatsby-source-wordpress-experimental if you believe this option is valid.
success load plugins - 2.407s
success onPreInit - 0.044s
success initialize cache - 0.031s
success copy gatsby files - 0.153s
success onPreBootstrap - 0.026s

 ERROR 

(node:93196) Warning: Setting the NODE_TLS_REJECT_UNAUTHORIZED environment variable to '0' makes TLS connections and HTTPS requests insecure by disabling certificate verification.

success  gatsby-source-wordpress  ensuring plugin requirements are met - 7.437s
⠀
info  gatsby-source-wordpress 

	This is either your first build or the cache was cleared.
	Please wait while your WordPress data is synced to your Gatsby cache.

	Maybe now's a good time to get up and stretch? :D

success  gatsby-source-wordpress  writing GraphQL queries to disk at ./WordPress/GraphQL/ - 0.456s
success  gatsby-source-wordpress  ingest WPGraphQL schema - 7.957s
success createSchemaCustomization - 15.705s
success  gatsby-source-wordpress  fetch root fields - 2.632s
success  gatsby-source-wordpress  ContentType - 4.164s - fetched 3
success  gatsby-source-wordpress  Category - 5.980s - fetched 10
success  gatsby-source-wordpress  Comment - 21.332s - fetched 310
success  gatsby-source-wordpress  Menu - 2.718s - fetched 2
success  gatsby-source-wordpress  MenuItem - 2.895s - fetched 18
success  gatsby-source-wordpress  PostFormat - 4.450s - fetched 7
success  gatsby-source-wordpress  Page - 7.526s - fetched 17
success  gatsby-source-wordpress  Post - 137.028s - fetched 599
success  gatsby-source-wordpress  Taxonomy - 2.465s - fetched 3
success  gatsby-source-wordpress  UserRole - 3.622s - fetched 0
success  gatsby-source-wordpress  Tag - 19.253s - fetched 125
success  gatsby-source-wordpress  User - 3.367s - fetched 5

 ERROR 

RequestError: getaddrinfo ENOTFOUND www.terresceltes.net
    at ClientRequest.<anonymous> (/Users/olivierthierry/dev/terresceltes/gatsby/node_modules/gatsby-source-wordpress-experimental/node_modules/got/dist/source/core/index.js:953:111)
    at Object.onceWrapper (events.js:422:26)
    at ClientRequest.emit (events.js:327:22)
    at ClientRequest.EventEmitter.emit (domain.js:485:12)
    at ClientRequest.origin.emit
(/Users/olivierthierry/dev/terresceltes/gatsby/node_modules/gatsby-source-wordpress-experimental/node_modules/@szmarczak/http-timer/dist/source/index.js:39:20)
    at TLSSocket.socketErrorListener (_http_client.js:432:9)
    at TLSSocket.emit (events.js:315:20)
    at TLSSocket.EventEmitter.emit (domain.js:485:12)
    at emitErrorNT (internal/streams/destroy.js:84:8)
    at processTicksAndRejections (internal/process/task_queues.js:84:21)
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:66:26) {
  code: 'ENOTFOUND',
  timings: {
    start: 1609174041600,
    socket: 1609174041600,
    lookup: 1609174041604,
    connect: undefined,
    secureConnect: undefined,
    upload: undefined,
    response: undefined,
    end: undefined,
    error: 1609174041611,
    abort: undefined,
    phases: {
      wait: 0,
      dns: 4,
      tcp: undefined,
      tls: undefined,
      request: undefined,
      firstByte: undefined,
      download: undefined,
      total: 11
    }
  }
}


 ERROR 

HTTPError: Response code 404 (Not Found)
    at Request._onResponseBase (/Users/olivierthierry/dev/terresceltes/gatsby/node_modules/gatsby-source-wordpress-experimental/node_modules/got/dist/source/core/index.js:896:31)
    at Request._onResponse (/Users/olivierthierry/dev/terresceltes/gatsby/node_modules/gatsby-source-wordpress-experimental/node_modules/got/dist/source/core/index.js:931:24)
    at ClientRequest.<anonymous> (/Users/olivierthierry/dev/terresceltes/gatsby/node_modules/gatsby-source-wordpress-experimental/node_modules/got/dist/source/core/index.js:945:23)
    at Object.onceWrapper (events.js:422:26)
    at ClientRequest.emit (events.js:327:22)
    at ClientRequest.EventEmitter.emit (domain.js:485:12)
    at ClientRequest.origin.emit
(/Users/olivierthierry/dev/terresceltes/gatsby/node_modules/gatsby-source-wordpress-experimental/node_modules/@szmarczak/http-timer/dist/source/index.js:39:20)
    at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:603:27)
    at HTTPParser.parserOnHeadersComplete (_http_common.js:117:17)
    at Socket.socketOnData (_http_client.js:472:22)
    at Socket.emit (events.js:315:20)
    at Socket.EventEmitter.emit (domain.js:485:12)
    at addChunk (_stream_readable.js:297:12)
    at readableAddChunk (_stream_readable.js:273:9)
    at Socket.Readable.push (_stream_readable.js:214:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:186:23) {
  code: undefined,
  timings: {
    start: 1609174967042,
    socket: 1609174967042,
    lookup: 1609174967043,
    connect: 1609174967060,
    secureConnect: undefined,
    upload: 1609174967060,
    response: 1609174967129,
    end: 1609174967764,
    error: undefined,
    abort: undefined,
    phases: {
      wait: 0,
      dns: 1,
      tcp: 17,
      tls: undefined,
      request: 0,
      firstByte: 69,
      download: 635,
      total: 722
    }
  }
}


 ERROR 

The "url" argument must be of type string. Received undefined



  Error: TypeError [ERR_INVALID_ARG_TYPE]: The "url" argument must be of type string. Received undefined
  
  - validators.js:120 validateString
    internal/validators.js:120:11
  
  - create-remote-media-item-node.js:26 getMediaItemEditLink
    [gatsby]/[gatsby-source-wordpress-experimental]/src/steps/source-nodes/create-nodes/create-remote-media-item-node.js:26:38
  
  - create-remote-media-item-node.js:39 errorPanicker
    [gatsby]/[gatsby-source-wordpress-experimental]/src/steps/source-nodes/create-nodes/create-remote-media-item-node.js:39:19
  
  - create-remote-media-item-node.js:281 Object.onRetry
    [gatsby]/[gatsby-source-wordpress-experimental]/src/steps/source-nodes/create-nodes/create-remote-media-item-node.js:281:9
  
  - index.js:33 onError
    [gatsby]/[async-retry]/lib/index.js:33:17
  
  - index.js:50 catchIt
    [gatsby]/[async-retry]/lib/index.js:50:11
  
  - runMicrotasks
  
  - task_queues.js:97 processTicksAndRejections
    internal/process/task_queues.js:97:5
  

not finished source and transform nodes - 1894.402s
not finished  gatsby-source-wordpress  fetching nodes - 1894.279s - 5574 total
not finished  gatsby-source-wordpress  creating nodes - 1710.330s
not finished  gatsby-source-wordpress  MediaItem - 1710.334s - fetched 4475
not finished Downloading remote files - 1706.984s
not finished Generating image thumbnails - 1703.417s

Yes, I experience the same issue: it fails when creating MediaItem nodes. This is what I see the moment before the error kicks in:
image

Right after that I get the exact same error as @ryancybul and @chawax. I was looking into node_modules/gatsby-source-wordpress-experimental/dist/steps/source-nodes/create-nodes/create-remote-media-item-node.js and it seems the error is at line 44: node.link seems to be undefined:

const getMediaItemEditLink = node => {
  const {
    protocol,
    hostname
  } = _url.default.parse(node.link); // <- node.link is undefined

  const editUrl = `${protocol}//${hostname}/wp-admin/upload.php?item=${node.databaseId}`;
  return editUrl;
};

I was able to bypass the issue for testing purposes by changing the line to } = _url.default.parse(node.link + "");, which coerces the undefined link to a string so the parse no longer throws. I got a bunch of 404 errors for all my images, obviously, but the develop build passes.
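
A less invasive variant of this bypass could return nothing when the link is missing, rather than parsing the string "undefined". The following is only a sketch under that assumption: `getMediaItemEditLinkSafe` is a hypothetical name, not part of the plugin, it uses Node's built-in `URL` class instead of the legacy `url.parse`, and it is not necessarily how the maintainers would handle the case.

```javascript
// Hypothetical helper, not part of gatsby-source-wordpress-experimental.
const getMediaItemEditLinkSafe = node => {
  // Assumption: without a string link, no edit link can be built.
  if (!node || typeof node.link !== "string") {
    return null;
  }

  let parsed;
  try {
    parsed = new URL(node.link);
  } catch (e) {
    // new URL() throws a TypeError for strings that are not absolute URLs.
    return null;
  }

  // parsed.protocol already includes the trailing colon, e.g. "https:".
  return `${parsed.protocol}//${parsed.hostname}/wp-admin/upload.php?item=${node.databaseId}`;
};

console.log(getMediaItemEditLinkSafe({ link: "https://example.com/photo/", databaseId: 42 }));
console.log(getMediaItemEditLinkSafe({ databaseId: 42 }));
```

The caller would then have to treat a `null` edit link as "unknown" when building its error message.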

Too bad that it's not working, but this is experimental, so I should expect issues down the line.

Same issue here. I've added mine in #384.

Hi everyone! Is anyone able to provide a reproduction repo and /graphql endpoint I can use to debug this?

Thanks

I've sent you what I could via PM on Slack.

I had this problem because there was an image with a broken link in some of my posts. Correcting the URL made the problem disappear.

To find the URL that threw the 404 error, I wrapped the body of the getMediaItemEditLink function in a try/catch in the node_modules/gatsby-source-wordpress-experimental/dist/steps/source-nodes/create-nodes/create-remote-media-item-node.js file. Something like this:

const getMediaItemEditLink = node => {
  try {
    const {
      protocol,
      hostname
    } = _url.default.parse(node.link);
  
    const editUrl = `${protocol}//${hostname}/wp-admin/upload.php?item=${node.databaseId}`;
    return editUrl;
  } catch (e) {
    // Log the whole node so the failing media item can be identified.
    console.log(node);
  }
};

The node.mediaItemUrl property contains the URL that caused the problem.

Maybe this piece of code should be wrapped in such a way that it produces a warning rather than an error?
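
One way to read that suggestion, as a sketch only: catch the failed download, log a warning that names the offending mediaItemUrl, and skip that node so the rest of the build can continue. `createMediaItemOrWarn`, `fetchFile`, and `warn` are hypothetical names; the plugin's real download and reporter calls are not shown in this thread, so they are injected here, which also lets the example run without network access.

```javascript
// Hypothetical wrapper; fetchFile and warn are stand-ins, not plugin APIs.
const createMediaItemOrWarn = async (node, { fetchFile, warn }) => {
  try {
    return await fetchFile(node.mediaItemUrl);
  } catch (error) {
    // Report which URL failed instead of aborting the whole build.
    warn(`Skipping MediaItem ${node.databaseId}: ${error.message} (${node.mediaItemUrl})`);
    return null;
  }
};

// Fake fetcher that simulates a 404 for one URL (assumption for the demo).
const fakeFetch = async url => {
  if (url.endsWith("/missing.jpg")) {
    throw new Error("Response code 404 (Not Found)");
  }
  return { url };
};

const demo = async () => {
  const nodes = [
    { databaseId: 1, mediaItemUrl: "https://example.com/ok.jpg" },
    { databaseId: 2, mediaItemUrl: "https://example.com/missing.jpg" },
  ];
  const results = await Promise.all(
    nodes.map(node =>
      createMediaItemOrWarn(node, { fetchFile: fakeFetch, warn: console.warn })
    )
  );
  // Keep only the media items that were fetched successfully.
  return results.filter(Boolean);
};

demo().then(files => console.log(`${files.length} file(s) fetched`));
```

Whether a missing file should be a warning or a hard failure is a policy decision, so a real implementation would probably make it configurable.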

I'm having a look at this now. I've been out on holidays which is why this one sat so long. I'll keep yall updated!

I've identified the problem (and a few others) and there should be a release out for this tomorrow. Thanks everyone for your patience here!

Great news, thanks!

Great, thank you @TylerBarnes!

A fix has been published in gatsby-source-wordpress-experimental@6.1.0! If you're still having problems, feel free to comment here or re-open this issue.
Thanks!

Thanks @TylerBarnes, the issue here is solved. However, the server has not been starting for me since 6.1.0. When I downgrade to 6.0.0 and apply my bypass to make it build, the server starts again. Something changed that introduced a new issue that I am experiencing: #404

@TylerBarnes It's working on my end. Thank you so much for the fix!

I propose closing this issue.

Awesome :D great to hear it's working for yall. Let me know if you run into any other problems!