How to extract the contents of elements such as div
How can I extract the contents of elements such as div?
I would like to get the "body" of every div.
julia> divy = []  # must exist before the loop pushes into it
julia> for elem in preorder(body)
           # println(elem)
           if typeof(elem) == HTMLElement{:div}
               push!(divy, elem)
           end
       end
julia> unique(divy)
289-element Array{Any,1}:
Gumbo.HTMLElement{:div}
Gumbo.HTMLElement{:div}
Gumbo.HTMLElement{:div}
Gumbo.HTMLElement{:div}
Gumbo.HTMLElement{:div}
Gumbo.HTMLElement{:div}
Paul
If you have an HTMLElement elem, children(elem) will return the array of its child nodes. So concretely, in your example, children(divy[1]) would get the children of the first div, etc. Does that work for you?
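A minimal sketch of that suggestion, for reference. The sample HTML string and the helper name collect_divs! are made up for illustration; the traversal is written by hand with children so that it does not depend on preorder being available in the installed Gumbo version.

```julia
using Gumbo

# Hypothetical sample page, not the page from this issue.
doc = parsehtml("<html><body><div><p>Hello</p><span>world</span></div><div>second</div></body></html>")

# Walk the tree recursively and collect every div element.
# collect_divs! is an illustrative helper, not part of Gumbo.
function collect_divs!(found, node)
    if node isa HTMLElement
        if node isa HTMLElement{:div}
            push!(found, node)
        end
        for child in children(node)
            collect_divs!(found, child)
        end
    end
    return found
end

divs = collect_divs!(HTMLNode[], doc.root)

# Child nodes of the first div (here a p and a span element).
first_children = children(divs[1])
```

Keeping the div elements themselves in the array, rather than their children, leaves both options open: divs[i] prints the whole element and children(divs[i]) gives its child nodes.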
OK, now I have this (with children added):
divy = []
for elem in preorder(body)
    # println(elem)
    if typeof(elem) == HTMLElement{:div}
        push!(divy, children(elem))
    end
end
If I now call divy[1], I get:
julia> divy[1]
3-element Array{Gumbo.HTMLNode,1}:
Gumbo.HTMLElement{:p}
Gumbo.HTMLElement{:span}
Gumbo.HTMLElement{:script}
But if I call divy[1:1], I get the body of the first div.
Question: is this the best way to get the body of elements?
julia> divy[1:1]
1-element Array{Any,1}:
Gumbo.HTMLNode[Gumbo.HTMLElement{:p}:
Prawdopodobnie dawno nie było Cię w Wirtualnej Polsce. Zobacz jak się zmieniła!
,Gumbo.HTMLElement{:span}:
,Gumbo.HTMLElement{:script}:
]
OK, I see :)
After running
divy = []
for elem in preorder(body)
    # println(elem)
    if typeof(elem) == HTMLElement{:div}
        push!(divy, elem)
    end
end
divy[1] returns the body of this div. Nice.
julia> divy[1]
Gumbo.HTMLElement{:div}:
Prawdopodobnie dawno nie było Cię w Wirtualnej Polsce. Zobacz jak się zmieniła!
<script> $('.mail-info--close').click(function () { $('.mail-info').slideUp(); }); (function($){var getUrlParam=function(paramName){var regEx=new RegExp("[?& ]"+paramName+"=([^&#]*)"),currSearchUrl=document.location.search,resultsArray=currSearchUrl.match(regEx);if(resultsArray){return resultsArray[1]}return false},wzp=getUrlParam("wzp");if (wzp=='wp'){$('.mail-info--content p').prepend('Wylogowano z Poczty WP. ');$('aside.mail-info').show();} else if (wzp=='o2') {$('.mail-info--content p').prepend('Wylogowano z Poczty o2. ');$('aside.mail-info').show();};return}(WP.$)); </script>
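The printed div above also contains the inline script. If only the readable text is wanted, one possible approach is to collect the HTMLText nodes and skip script and style elements. This is a rough sketch: the helper names collect_text! and visible_text and the sample HTML are hypothetical, and it assumes HTMLText nodes expose their content in a text field.

```julia
using Gumbo

# Gather the stripped text of every HTMLText node under `node`,
# ignoring anything inside <script> or <style> elements.
# collect_text! and visible_text are illustrative helpers, not Gumbo functions.
function collect_text!(parts, node)
    if node isa HTMLText
        push!(parts, strip(node.text))
    elseif node isa HTMLElement{:script} || node isa HTMLElement{:style}
        # skip script and style content entirely
    elseif node isa HTMLElement
        for child in children(node)
            collect_text!(parts, child)
        end
    end
    return parts
end

# Join the non-empty text pieces with single spaces.
visible_text(node) = join(filter(!isempty, collect_text!(String[], node)), " ")

# Hypothetical sample page, not the page from this issue.
doc = parsehtml("<html><body><div><p>Hello</p><script>var x = 1;</script><span>world</span></div></body></html>")

body = children(doc.root)[2]      # Gumbo always produces head and body
first_div = children(body)[1]
println(visible_text(first_div))
```

The same function can be applied to each element of divy when that array holds the div elements themselves.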