How to parse element attributes
Closed this issue · 4 comments
AlainPilon commented
I want this script to return an array of hashes, each with a name and a url. But since the iterator returns what is INSIDE the a tag, I can't figure out how to read the href attribute of the tag itself.
require 'wombat'

video_url = 'https://vimeo.com/26594942'

result = Wombat.crawl do
  base_url video_url + "/likes"
  path "/"

  likers "css=.browse_people li a", :iterator do
    name "css=p.title"
    url "[href]", :html do |link|
      link
    end
  end
end

puts result
felipecsl commented
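You have to use XPath for that. Inside the iterator the current node is the a tag itself, so './@href' selects its own href attribute. Something like url({ xpath: './@href' }) should work. Note the parentheses: without them Ruby parses the braces as a block rather than a hash.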
felipecsl commented
Here is an example of something I did in the past:
products "css=.list-view>li", :iterator do
  thumb({ xpath: ".//img/@src" })
  url({ xpath: ".//a[1]/@href" })
  details({ css: "h3.list-view-item-title a:first-child" }, :follow) do
  end
end
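To see why a relative XPath such as './@href' reaches the attribute of the node being iterated, here is a minimal sketch that does not use Wombat at all. It relies on Ruby's REXML library and on made-up sample markup (the SAMPLE constant, the extract_likers helper, and the /alice and /bob links are all hypothetical, not taken from the real Vimeo page). It only illustrates the XPath idea; Wombat's own evaluation may differ in detail.

```ruby
require 'rexml/document'

# Hypothetical markup standing in for the list of likers (not the real page).
SAMPLE = <<~HTML
  <ul class="browse_people">
    <li><a href="/alice"><p class="title">Alice</p></a></li>
    <li><a href="/bob"><p class="title">Bob</p></a></li>
  </ul>
HTML

# Hypothetical helper: returns an array of { name:, url: } hashes.
def extract_likers(markup)
  doc = REXML::Document.new(markup)

  # Visit each <a> in turn, much like the :iterator block in the thread.
  REXML::XPath.match(doc, '//li/a').map do |link|
    {
      # Child lookup: the text of the <p class="title"> inside the link.
      name: REXML::XPath.first(link, "./p[@class='title']").text,
      # Attribute lookup: './@href' reads the href of the current node itself.
      url: REXML::XPath.first(link, './@href').value
    }
  end
end

likers = extract_likers(SAMPLE)
puts likers.inspect
```

Applied to the original script, the same idea would mean replacing the url "[href]", :html block with url({ xpath: './@href' }), as suggested above.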
AlainPilon commented
Thanks, guess I will have to learn about xpath now!