Empty array with multiple sites
Hey!
I have the following code:
var sites = [
  {
    "url": "https://www.origo.hu/index.html",
    "selector": "a.news-title",
    "side": "right"
  },
  {
    "url": "https://index.hu",
    "selector": "h1.cikkcim a",
    "side": "left"
  }
]
for (site of sites) {
  superagent
    .get(site.url)
    .end((err, res) => {
      htmlMiner(res.text, {
        _each_: site.selector,
        text: function(arg) {
          return arg.$scope.text();
        },
        href: function(arg) {
          return arg.$scope.attr('href');
        }
      })
    });
}
Parsing a single site always works. But with two sites, it returns the results for one site and an empty array for the other.
Why? I guess it doesn't parse it fast enough? How can I make it wait and then print the results?
Can you help me?
Thank you
Hi @daaniiieel,
htmlMiner is a synchronous task, which means it parses the full HTML before it returns, so parsing speed is not the problem.
The issue here is the combination of a for-of loop with an asynchronous task.
for (site of sites) {
  superagent
    .get(site.url)
    .end((err, res) => {
      console.log(site.selector)
    });
}
You might expect this code to print a.news-title and then h1.cikkcim a, but it outputs h1.cikkcim a twice.
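Without a declaration, site is a single implicit global variable shared by every iteration, so by the time the callbacks run the loop has already finished and site points to the last element.
If you can use ES2015 syntax in your Node.js project, you can easily solve the issue by declaring the loop variable with let or const, which creates a new binding for each iteration, instead of relying on the implicit variable declaration.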
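To see the difference in isolation, here is a minimal sketch that needs neither superagent nor htmlMiner. The makeQueue helper is a hypothetical stand-in for an asynchronous request: it stores the callbacks and runs them only after the loop has finished, which is roughly when the .end() callbacks fire. A var declared before the loop is used here to mimic the single shared binding of the implicit global.

```javascript
// Hypothetical stand-in for an async request: callbacks are queued and
// run only after the loop is done, similar to when .end() callbacks fire.
function makeQueue() {
  const pending = [];
  return {
    defer: (callback) => pending.push(callback),
    flush: () => pending.forEach((callback) => callback()),
  };
}

const sites = [
  { selector: 'a.news-title' },
  { selector: 'h1.cikkcim a' },
];

// One binding shared by all iterations (what the implicit global amounts to).
function withSharedBinding() {
  const queue = makeQueue();
  const seen = [];
  var site;
  for (site of sites) {
    queue.defer(() => seen.push(site.selector));
  }
  queue.flush(); // every callback now reads the same, final value of `site`
  return seen;
}

// A fresh binding per iteration, so each callback keeps its own `site`.
function withConstBinding() {
  const queue = makeQueue();
  const seen = [];
  for (const site of sites) {
    queue.defer(() => seen.push(site.selector));
  }
  queue.flush();
  return seen;
}

console.log(withSharedBinding()); // [ 'h1.cikkcim a', 'h1.cikkcim a' ]
console.log(withConstBinding());  // [ 'a.news-title', 'h1.cikkcim a' ]
```

Applied to your code, the change looks like this: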
- for (site of sites) {
+ for (const site of sites) {
    superagent
      .get(site.url)
      .end((err, res) => {
        const json = htmlMiner(res.text, {
          _each_: site.selector,
          text: function(arg) {
            return arg.$scope.text();
          },
          href: function(arg) {
            return arg.$scope.attr('href');
          }
        })
        console.log(json)
      });
  }
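As for waiting until every site is done before printing, one possible approach is to collect one promise per site and print after Promise.all resolves. This is only a sketch: collectResults, fetchText and parse are hypothetical names, not part of superagent or htmlMiner, and the demo uses fake stand-ins so it runs without network access.

```javascript
// Hypothetical helper. fetchText(url) must return a Promise of the page HTML,
// and parse(html, site) must return the extracted items for that site.
function collectResults(sites, fetchText, parse) {
  // `site` is a parameter of the map callback, so each request keeps its own.
  return Promise.all(
    sites.map((site) =>
      fetchText(site.url).then((html) => ({
        url: site.url,
        items: parse(html, site),
      }))
    )
  );
}

// Stand-ins for the real request and parser (assumptions for the demo only).
const demoSites = [
  { url: 'https://example.com/a', selector: 'a.news-title' },
  { url: 'https://example.com/b', selector: 'h1.cikkcim a' },
];
const fakeFetch = (url) => Promise.resolve(`<html>${url}</html>`);
const fakeParse = (html, site) => [`${site.selector} in ${html}`];

collectResults(demoSites, fakeFetch, fakeParse).then((results) => {
  // Runs once, after all sites have been fetched and parsed.
  console.log(results);
});
```

In your project, fetchText would wrap the superagent request and return res.text, and parse would call htmlMiner with site.selector. Results come back in the same order as the input array, regardless of which request finishes first.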
Let me know if this solves your issue.