news.ycombinator.com Refuses to load the script

Question

ThinkDigitalSoftware opened this issue 7 years ago · 17 comments

ThinkDigitalSoftware commented 7 years ago

This error occurs when trying to follow the initial tutorial and clicking the artoo bookmark:
VM4037:1 Refused to load the script 'https://medialab.github.io/artoo/public/dist/artoo-latest.min.js' because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ https://cdnjs.cloudflare.com/".

Answer 1 · 2018-04-04T08:40:03.000Z

Hum... That's unfortunate, but Hacker News just updated its headers to include a Content-Security-Policy, which forbids loading scripts from arbitrary origins. You'll have to use a browser extension that bypasses those headers, and I should probably find another site as example in my docs :)
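
For readers wondering why the script is refused: the browser compares the script's URL against the sources listed in the page's `script-src` directive. Below is a deliberately simplified JavaScript sketch of that comparison. `isScriptAllowed` is a hypothetical helper written for this thread; it only handles `'self'` and plain URL sources (no wildcards, nonces, or hashes) and is not how any browser actually implements CSP.

```javascript
// Simplified model of CSP source matching. Hypothetical helper, for illustration only.
function isScriptAllowed(policy, scriptUrl, pageOrigin) {
  const script = new URL(scriptUrl);
  // Drop the directive name ("script-src") and keep the source expressions.
  const sources = policy.trim().split(/\s+/).slice(1);

  return sources.some(function (source) {
    if (source === "'self'") return script.origin === pageOrigin;
    // Keywords such as 'unsafe-inline' never allow an external script URL.
    if (source.startsWith("'")) return false;

    const allowed = new URL(source);
    if (allowed.origin !== script.origin) return false;
    // A source path ending in "/" acts as a prefix, otherwise it must match exactly.
    return allowed.pathname.endsWith('/')
      ? script.pathname.startsWith(allowed.pathname)
      : script.pathname === allowed.pathname;
  });
}

// The policy quoted in the error message above.
const policy = "script-src 'self' 'unsafe-inline' https://www.google.com/recaptcha/ " +
  "https://www.gstatic.com/recaptcha/ https://cdnjs.cloudflare.com/";

const artooAllowed = isScriptAllowed(
  policy,
  'https://medialab.github.io/artoo/public/dist/artoo-latest.min.js',
  'https://news.ycombinator.com'
);
console.log(artooAllowed); // false
```

Since `medialab.github.io` is not among the listed sources, the check fails, which matches the error reported in the question.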

Answer 2 · 2018-04-04T15:38:15.000Z

No worries. I figured as much. Thanks for the response. Where can I ask for help with using artoo that's unrelated to this issue?

Answer 3 · 2018-04-04T15:48:24.000Z

Well here seems to be a good place to do so :)

Answer 4 · 2018-04-04T15:56:04.000Z

Awesome. I'm trying to select items by class name under a certain tag, but I can't find a way to do so with artoo. The best I get is selecting all the p tags, but that gives me more results I can't use than ones I can. Let me post an example.

Answer 5 · 2018-04-04T15:58:33.000Z

To select items by tag + class, here is what you need to write in CSS:

tagname.class

So, using artoo, you'd probably do something of the kind:

artoo.scrape('tagname.class', ...);

Answer 6 · 2018-04-04T17:17:22.000Z

Oh! OK, I was putting a space... Thank you. It would also help if you could continue using your site as the example, so we can stay on the page while we work through the tutorial :)

Answer 7 · 2018-04-04T17:35:56.000Z

OK, so on this page, I'm running
artoo.scrape('li.card-btn square ', { text: {sel: 'span', method: 'text'}, url: {sel: 'a', attr: 'href'} });
and I'm getting an empty array. I isolated the element that's on the page and pasted it on this pastebin service:
https://dpaste.de/Oz5n
What I want to pull out from the page is results that look like this:

{
    name: 'Yelena M Stepanenko',
    address: 'Spc 157'
}

What am I doing wrong? I also realize that the selector is wrong. I haven't gotten to that part yet, and I have no CSS background. I'm more of a desktop programmer, so it's a little slower for me to figure this out. Thanks for your patience.

Answer 8 · 2018-04-04T17:49:08.000Z

The selector should be li.card-btn.square, since you are trying to match two classes.

Answer 9 · 2018-04-04T17:53:14.000Z

Could you type it out? I honestly don't understand it. I just need to get it to match once and I can figure the rest out.

Answer 10 · 2018-04-04T17:59:29.000Z

artoo.scrape('li.card-btn.square', { text: {sel: 'span', method: 'text'}, url: {sel: 'a', attr: 'href'} });

Answer 11 · 2018-04-04T18:01:39.000Z

Oh, you're saying the card-button square is listed as two classes in the html? That's because of the space that's in the class name?

Answer 12 · 2018-04-04T18:04:53.000Z

Yes. You have several classes listed in your example. You should probably do a quick HTML/CSS tutorial before scraping. It will definitely help you achieve your goals. Scraping is basically HTML/CSS reverse engineering.
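
To make the point about spaces concrete: an HTML `class` attribute is a space-separated list of class names, so `class="card-btn square"` carries two classes, and the compound selector `li.card-btn.square` requires both of them on the same `li`. A minimal JavaScript sketch of that rule follows. `matchesCompound` is a hypothetical helper working on plain objects (not artoo's or the browser's real matcher), and the sample element is assumed from the discussion rather than copied from the pastebin.

```javascript
// Hypothetical, simplified matcher for selectors of the form "tag.class1.class2".
function matchesCompound(element, selector) {
  const parts = selector.split('.');
  const tag = parts[0];
  const requiredClasses = parts.slice(1);
  // The class attribute is a whitespace-separated list of class names.
  const classes = element.className.trim().split(/\s+/);

  return (tag === '' || tag === element.tag) &&
    requiredClasses.every(function (name) { return classes.includes(name); });
}

// Assumed shape of the element discussed in the thread.
const item = { tag: 'li', className: 'card-btn square' };

console.log(matchesCompound(item, 'li.card-btn.square')); // true
console.log(matchesCompound(item, 'li.card-btn'));        // true
console.log(matchesCompound(item, 'li.card-btn.round'));  // false
```

With a space, as in `li.card-btn square`, CSS instead looks for a `<square>` element nested inside an `li.card-btn`, which would explain the empty array from the earlier call.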

Answer 13 · 2018-04-04T18:07:35.000Z

You're amazing, thank you. I'll do more research on this

Answer 14 · 2018-04-12T04:41:36.000Z

Just going to close this. Researching jQuery and CSS taught me a lot about selectors!

Answer 15 · 2019-03-09T16:53:22.000Z

"I should probably find another site as example in my docs"

Please do @Yomguithereal -- I need a working example as a springboard to jump further. thx.

Answer 16 · 2019-03-09T17:29:20.000Z

How about echojs.com?

Answer 17 · 2019-03-09T17:39:18.000Z

Yeah, super.

While you are at it changing the scraping code, please throw in some comments as well, as you helped me before:

artoo.ajaxSpider(

  // This function is an iterator.
  // Its aim is to return the next url to fetch or false if you want to stop
  //-- 'i' is the index in the iteration of urls
  //-- '$data' is the jQuery-parsed data of the last fetched url
  function(i, $data) {

    // nextUrl is a function that takes a jQuery selector and returns
    // the next url to fetch

    // If !i then, we are only starting the spider meaning that the next url
    // is available on the current page rather than the last fetched one.
    return nextUrl(!i ? artoo.$(document) : $data);
  },

  // Spider's settings
  {

    // We want to fetch a maximum of two pages
    limit: 2,

    // We are going to scrape the pages using the scrape definition written above in the doc example
    scrape: scraper,

    // We want to concatenate results so we have [title1, title2, title3, title4]
    // rather than [[title1, title2], [title3, title4]]
    concat: true,

    // Final callback fired when the spider retrieved everything
    //-- 'data' is the scraped data
    done: function(data) {
      artoo.log.debug('Finished retrieving data. Downloading...');
      artoo.savePrettyJson(
        frontpage.concat(data),
        {filename: 'hacker_news.json'}
      );
    }
  }
);

thx
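
The `concat: true` comment in the spider settings above describes a change in the shape of the results. The plain JavaScript sketch below only illustrates that shape with made-up titles; it does not call artoo and says nothing about how artoo implements the option internally.

```javascript
// Hypothetical per-page results, standing in for what a scraper might return.
const pages = [
  ['title1', 'title2'], // results scraped from the first fetched page
  ['title3', 'title4']  // results scraped from the second fetched page
];

// Without concatenation, the results stay grouped by page.
const nested = pages;

// With concatenation, every page's results are merged into one flat array.
const flat = pages.reduce(function (all, page) { return all.concat(page); }, []);

console.log(JSON.stringify(nested)); // [["title1","title2"],["title3","title4"]]
console.log(JSON.stringify(flat));   // ["title1","title2","title3","title4"]
```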