JuliaWeb/Gumbo.jl

What to do? ERROR: automatic download failed (error: 2148270088): http://gazeta.pl


julia> using Gumbo

julia> using AbstractTrees

julia> using StringEncodings

julia> getpage(url) = parsehtml(String(read(download(url))))
getpage (generic function with 1 method)
julia> text_only(doc::HTMLDocument) = text_only(doc.root)
text_only (generic function with 2 methods)

julia> text_only(frag) = join([text(leaf) for leaf in Leaves(frag) if leaf isa HTMLText], " ")
text_only (generic function with 2 methods)

julia> get_page_text(url) = text_only(getpage(url))
get_page_text (generic function with 1 method)

julia> doc=parsehtml(decode(read(download("http://gazeta.pl")), "iso-8859-2"))
ERROR: automatic download failed (error: 2148270088): http://gazeta.pl
Stacktrace:
[1] download(::String, ::String) at .\interactiveutil.jl:598
[2] download(::String) at .\interactiveutil.jl:632

Paul

This is definitely not a problem with Gumbo. I'm not sure where that "automatic download failed" error is coming from, but Gumbo doesn't download HTML from the web; it only parses HTML once you have it in memory. Your stack trace points at `download` in interactiveutil.jl, so I'm guessing something is going wrong in that download function, wherever it came from.
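To check this yourself, it may help to run the two steps separately. The sketch below is only an illustration: `fetch_html` is a hypothetical helper name (it is not part of Gumbo), and it assumes a Julia version that has `@warn` and a working `download`. The first part parses a string that is already in memory, which is all Gumbo does. The second part wraps the download so that a failure is reported before any parsing happens.

```julia
using Gumbo

# Parsing only needs a string that is already in memory; no network access is involved.
html = "<html><body><p>Hello</p></body></html>"
doc = parsehtml(html)

# Hypothetical helper (not part of Gumbo): do the download on its own,
# so a failure here is clearly separate from parsing.
function fetch_html(url)
    try
        path = download(url)        # returns the path of a temporary file
        return String(read(path))   # file contents as a String
    catch err
        @warn "download failed before any parsing happened" url exception = err
        return nothing
    end
end
```

If `fetch_html("http://gazeta.pl")` returns `nothing` while `parsehtml(html)` works on the literal string, the failure is in the download step and not in Gumbo.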