szymonkaliski/archivist

'archivist fetch' fails


I'm having no luck running 'fetch' using archivist-pinterest-crawl.

With loginMethod set to cookies, the fetch fails without any error message (although I do get a security popup requesting access to Chrome's data):

[archivist-pinterest-crawl] crawling profile https://pinterest.com/watzmarius
[archivist-pinterest-crawl] 0 crawled pins, exiting

With loginMethod set to password, I get an error; it appears to fail while looking for a DOM selector:

[archivist-pinterest-crawl] fetching error Error: No node found for selector: [data-test-id=login-button] > button

I looked at the Pinterest page the script is parsing, and the DIV it's looking for seems to have data-test-id="loginButton". I'm not sure if that's the issue.

My config:

{
  "archivist-pinterest-crawl": {
    "loginMethod": "password",
    "username": "mariuswatz@gmail.com",
    "password": "XXXX",
    "profile": "watzmarius"
  }
}

I changed the following code in crawler.js to reflect the current selector of the login button:

await page.click("[data-test-id=login-button] > button");

To:

await page.click("div[data-test-id='simple-login-button'] > button");
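Since Pinterest seems to rename this attribute from time to time, a more tolerant approach might be to try several candidate selectors in order. The following is only a rough sketch, not code from archivist-pinterest-crawl: `clickFirstMatching` and `LOGIN_BUTTON_SELECTORS` are hypothetical names, and `page` is assumed to be a Puppeteer `Page` (only `page.$` and `ElementHandle.click` are used).

```javascript
// Hypothetical helper, not part of archivist-pinterest-crawl.
// Candidate selectors, newest first; both are taken from this issue.
const LOGIN_BUTTON_SELECTORS = [
  "div[data-test-id='simple-login-button'] > button", // selector that worked for me
  "[data-test-id=login-button] > button", // selector the crawler used originally
];

// Clicks the first selector that matches and returns it.
// Assumes `page` exposes Puppeteer's `page.$`, which resolves to null on no match.
async function clickFirstMatching(page, selectors) {
  for (const selector of selectors) {
    const handle = await page.$(selector);
    if (handle) {
      await handle.click();
      return selector;
    }
  }
  // Nothing matched: fail loudly instead of continuing with a logged-out session.
  throw new Error(`No node found for any selector: ${selectors.join(", ")}`);
}
```

In crawler.js this would presumably replace the single `page.click(...)` call with `await clickFirstMatching(page, LOGIN_BUTTON_SELECTORS);`.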

Now it seems to get past the login, or at least it gives no errors. However, I get the same result as I got with 'cookies' previously:

[archivist-pinterest-crawl] crawling profile https://pinterest.com/watzmarius
[archivist-pinterest-crawl] 0 crawled pins, exiting

One more change, and I got it working:

In crawler.js I changed:

await page.goto(profileUrl, { waitUntil: "networkidle2" });

To:

await page.goto(profileUrl + "/boards", { waitUntil: "networkidle2" });
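If the profile URL in the config might end with a slash, plain concatenation would produce a double slash. A small sketch of a safer way to build the URL, assuming `profileUrl` is a plain string as in crawler.js; `boardsUrl` is a hypothetical helper name, not something the package provides:

```javascript
// Hypothetical helper, not part of archivist-pinterest-crawl.
// Strips any trailing slashes before appending the "/boards" path segment.
function boardsUrl(profileUrl) {
  return profileUrl.replace(/\/+$/, "") + "/boards";
}

// Intended use inside the crawler (requires a Puppeteer page):
// await page.goto(boardsUrl(profileUrl), { waitUntil: "networkidle2" });
```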

After that I was able to crawl all my pins successfully. Thanks for writing the tool!

Hey, sorry for the long wait. This should be fixed in archivist-pinterest-crawl v1.3.0 (just published: https://www.npmjs.com/package/archivist-pinterest-crawl/v/1.3.0)