Index.html downloaded only

Question

arch-user-france1 opened this issue 2 years ago · 3 comments

I assumed this would download the whole web page, including the resources it relies on.

However, when I run cairn nau.ch, it only downloads index.html; the pictures are still loaded from nau.ch.

Is it supposed to work like that?

Answer 1 · 2022-08-09T14:23:09.000Z

nau.ch uses JavaScript to load content dynamically. cairn's support for such pages is currently insufficient, and we will need to improve it.

Answer 2 · 2022-08-10T07:03:50.000Z

Ah, okay. But I see links in the HTML that still point to the original nau.ch, and if I change some wget options it works, I think...
I also wondered why a "popular" site doesn't have a robots.txt.
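
For anyone who wants to check the same thing on their own saved page, here is a minimal sketch using only the Python standard library. It lists resource references in an HTML string that still carry a host name, meaning they were not rewritten to local files. The names `RemoteRefFinder` and `remote_references` and the tag list are assumptions made for illustration; nothing here is part of cairn.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tag -> attribute that usually carries a resource URL.
# Assumption: this short list is illustrative, not exhaustive (srcset, CSS url() etc. are ignored).
RESOURCE_ATTRS = {"img": "src", "script": "src", "link": "href", "source": "src"}


class RemoteRefFinder(HTMLParser):
    """Collects resource URLs that still name a remote host."""

    def __init__(self):
        super().__init__()
        self.remote = []

    def handle_starttag(self, tag, attrs):
        wanted = RESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            # A non-empty netloc means the reference is absolute or
            # protocol-relative, so the browser will fetch it remotely.
            if name == wanted and value and urlparse(value).netloc:
                self.remote.append(value)


def remote_references(html):
    """Return resource URLs in `html` that point at a remote host, in document order."""
    finder = RemoteRefFinder()
    finder.feed(html)
    return finder.remote
```

Running `remote_references(open("index.html", encoding="utf-8").read())` on the downloaded file should show which pictures and scripts would still be fetched from the live site. Ordinary `<a href>` links are deliberately ignored, since those are navigation rather than embedded resources.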

Answer 3 · 2022-09-09T06:21:16.000Z

I have solved this for my needs using Firefox's geckodriver and Python.
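
For readers looking for a starting point, here is a rough sketch of what such an approach could look like. It is not the exact script used here. It assumes the third-party Selenium package (`pip install selenium`) with Firefox and geckodriver available; the function names `make_firefox_driver` and `save_rendered_html` are hypothetical.

```python
from pathlib import Path


def make_firefox_driver():
    """Start headless Firefox through geckodriver (assumes Selenium is installed)."""
    # Imported lazily so the module still loads on machines without Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")
    return webdriver.Firefox(options=options)


def save_rendered_html(url, out_path, driver_factory=make_firefox_driver):
    """Load `url` in a browser, then write the DOM as it looks after scripts ran.

    Returns the number of characters written.
    """
    driver = driver_factory()
    try:
        driver.get(url)
        html = driver.page_source
    finally:
        # Always close the browser, even if loading the page failed.
        driver.quit()
    Path(out_path).write_text(html, encoding="utf-8")
    return len(html)
```

A call such as `save_rendered_html("https://nau.ch", "index.rendered.html")` would save the rendered markup. Content that loads after the initial page load may need an explicit wait before reading `page_source`, and images and stylesheets would still have to be fetched separately.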