wabarc/cairn

Index.html downloaded only

arch-user-france1 opened this issue · 3 comments

I assumed cairn would download the whole webpage, including the resources it relies on.

However, if I run cairn nau.ch, it only downloads index.html; the pictures are still loaded from nau.ch.

Is that the intended behavior?

nau.ch is a web page that uses JavaScript to load content dynamically. cairn's support for such pages is currently insufficient, and we will need to improve it.

Ah, okay. But I see links in the HTML that still point to the original nau.ch, and if I change some wget options it works, I think...
I also wonder why such a "popular" site doesn't have a robots.txt.
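
For anyone who wants to check this on their own saved copy, here is a minimal sketch that lists the links in an HTML string that still point at the original host. It uses only the Python standard library. The function name `find_remote_links`, the attribute list, and the sample markup are made up for illustration; none of this is part of cairn.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Attributes that commonly carry URLs (an assumption, not an exhaustive list).
URL_ATTRS = {"src", "href", "poster"}


class RemoteLinkFinder(HTMLParser):
    """Collect attribute values that still point at a given host."""

    def __init__(self, host):
        super().__init__()
        self.host = host
        self.remote = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in URL_ATTRS and value:
                # Scheme-relative URLs ("//host/path") also yield a netloc.
                netloc = urlparse(value).netloc.lower()
                if netloc == self.host or netloc.endswith("." + self.host):
                    self.remote.append((tag, value))


def find_remote_links(html, host):
    """Return (tag, url) pairs whose URL is on `host` or one of its subdomains."""
    finder = RemoteLinkFinder(host.lower())
    finder.feed(html)
    return finder.remote


if __name__ == "__main__":
    # Hypothetical markup, only to show the three cases.
    sample = """
    <img src="https://www.nau.ch/logo.png">
    <link href="//nau.ch/site.css" rel="stylesheet">
    <script src="assets/app.js"></script>
    """
    for tag, url in find_remote_links(sample, "nau.ch"):
        print(tag, url)
```

Relative paths such as `assets/app.js` are not reported, since they would resolve against the local copy. Anything the page requests from JavaScript at runtime will not show up here either, because only the static markup is inspected.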

I solved this for my needs using Firefox's geckodriver and Python.
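
The script itself was not posted, so the following is only a rough sketch of what such an approach might look like, assuming Selenium's Python bindings with Firefox and geckodriver installed. The helper names `normalize_url` and `save_rendered_page`, the default `https://` scheme, and the output file name are assumptions for illustration.

```python
import sys
from pathlib import Path


def normalize_url(target):
    """Add a scheme if missing; driver.get() needs a full URL."""
    if target.startswith(("http://", "https://")):
        return target
    return "https://" + target  # assumption: default to HTTPS


def save_rendered_page(target, out_path="index.html"):
    """Load the page in headless Firefox and save the rendered HTML."""
    # Imported lazily so normalize_url() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")  # run Firefox without a window
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(normalize_url(target))
        # page_source reflects the DOM after the scripts that have run so far;
        # content loaded later (for example on scroll) may still be missing.
        html = driver.page_source
    finally:
        driver.quit()
    Path(out_path).write_text(html, encoding="utf-8")
    return out_path


if __name__ == "__main__":
    if len(sys.argv) > 1:
        print(save_rendered_page(sys.argv[1]))
```

Note that this saves only the rendered HTML. Images and stylesheets referenced by absolute URL would still be fetched from the original site unless they are downloaded and the links rewritten separately.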