Crawler broken?
yashha opened this issue · 8 comments
Also noticed.
Does this happen for all characters?
No idea what happened. Just restart the app
Unfortunately both my backdoor and my crystal ball are broken 😉
So let me ask a couple more questions; maybe we can figure this out together :)
- The server crashed and I had to move the whole website and everything else to another instance
- I didn't copy the CSV data (because I thought it would generate itself from the database) and just linked the DB.
- I noticed that the Twitter graphs weren't working, so I recovered an old email @marcusnovotny sent me with some CSVs, copied them to the CSV folder on the server and bam: everything was "back to normal"
This is already something I don't quite understand... What do you need the CSVs for? Is it telling the app that they should look for the data on the database?
Now second thing is:
- I log into the DB to see if there are new tweets. I just performed a findOne on charactersentiments:
> db.charactersentiments.findOne()
{
    "_id" : ObjectId("570179d679429d1e795c2492"),
    "name" : "Robb Reyne",
    "slug" : "Robb_Reyne",
    "total" : 22,
    "positive" : 0,
    "negative" : 0,
    "popularity" : 0,
    "heat" : 22,
    "updated" : ISODate("2016-05-08T13:06:34.475Z")
}
- As you see, there is an update that dates a couple days back (5, when the server crashed).
- I look for the guy on got.show ( https://got.show/characters/Robb%20Reyne ) but he has no twitter graph (although there should be 22 useless ones :D )
- I search for someone else, hyper:
> db.charactersentiments.findOne({"name":"Petyr Baelish"})
{
    "_id" : ObjectId("570179d679429d1e795c2a63"),
    "name" : "Petyr Baelish",
    "slug" : "Petyr_Baelish",
    "total" : 21599,
    "positive" : 7177,
    "negative" : 3349,
    "popularity" : 3828,
    "heat" : 21599,
    "updated" : ISODate("2016-05-12T08:51:49.124Z")
}
- I see it was updated today (which is good), though the website https://got.show/characters/Petyr%20Baelish shows no updates since Apr 10
Any ideas?
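One way to narrow this down could be to compare the `updated` timestamp from the DB with the modification time of the character's CSV. This is only a rough sketch: the helper name `checkCsvFreshness` and the idea that there is one CSV file per character are assumptions, not something confirmed in this thread.

```javascript
const fs = require("fs");

// Hypothetical helper: compares a CSV's mtime with the DB's `updated` value.
// `csvPath` is an assumed location; adjust it to wherever the CSVs really live.
function checkCsvFreshness(csvPath, dbUpdated) {
  let csvModified = null;
  try {
    csvModified = fs.statSync(csvPath).mtime;
  } catch (err) {
    // A missing file is a valid finding; anything else is unexpected.
    if (err.code !== "ENOENT") throw err;
  }
  return {
    csvPath,
    missing: csvModified === null,
    // Stale if the file is missing or older than the last DB update.
    stale: csvModified === null || csvModified < dbUpdated,
    csvModified,
    dbUpdated,
  };
}
```

Calling it with the `updated` value from the `findOne()` output (as a `Date`) would show whether the file on disk lags behind the DB.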
> I didn't copy the CSV data (because I thought it would generate itself from the database) and just linked the DB.
Yes it should do so on the first start and on every update to that character afterwards.
> This is already something I don't quite understand... What do you need the CSVs for? Is it telling the app that they should look for the data on the database?
Because aggregating the tweets on-access is not an option. It involves some heavy I/O in both the DB itself and transferring the tweets to the server for analysis.
The CSVs are some kind of cached result. The server then just has to serve static files.
Remember the difference between the cached and the non-cached API (your LRZ guy showed me some graphs yesterday)? The difference here should be a few orders of magnitude larger 😉
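To illustrate the caching idea, here is a minimal sketch of aggregating tweets once and writing the result as a static CSV. It is not the crawler's actual code: the tweet shape (`created`, `sentiment`), the CSV columns, and the `<slug>.csv` file name are all assumptions made for the example.

```javascript
const fs = require("fs");
const path = require("path");

// Assumed tweet shape: { created: Date, sentiment: number }.
// sentiment > 0 counts as positive, < 0 as negative (an assumption).
function aggregateToCsv(tweets) {
  const perDay = new Map();
  for (const tweet of tweets) {
    const day = tweet.created.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
    const counts = perDay.get(day) || { total: 0, positive: 0, negative: 0 };
    counts.total += 1;
    if (tweet.sentiment > 0) counts.positive += 1;
    else if (tweet.sentiment < 0) counts.negative += 1;
    perDay.set(day, counts);
  }
  // One row per day, sorted chronologically.
  const rows = [...perDay.keys()].sort().map((day) => {
    const c = perDay.get(day);
    return `${day},${c.total},${c.positive},${c.negative}`;
  });
  return ["date,total,positive,negative", ...rows].join("\n") + "\n";
}

// Writes the aggregate to <csvDir>/<slug>.csv (hypothetical layout),
// so the web server can serve it as a static file.
function writeCsvCache(csvDir, slug, tweets) {
  const file = path.join(csvDir, `${slug}.csv`);
  fs.writeFileSync(file, aggregateToCsv(tweets));
  return file;
}
```

The expensive part (reading and analysing every tweet) runs only when the crawler updates a character; each page view then costs a single static file read.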
My assumption is that the crawler cannot write the CSVs for some reason. Check file permissions, logs, etc.
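A quick way to test the permission theory might be a small probe like the one below, run as the same user the crawler runs as. The function name `checkWritable` is made up for this sketch, and the CSV directory path has to be supplied by whoever runs it.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical probe: reports whether the current user can write into `dir`.
function checkWritable(dir) {
  try {
    // Permission check first, then an actual write to catch
    // cases such as a read-only mount or a full disk.
    fs.accessSync(dir, fs.constants.W_OK);
    const probe = path.join(dir, `.write-probe-${process.pid}`);
    fs.writeFileSync(probe, "ok");
    fs.unlinkSync(probe);
    return { writable: true, reason: null };
  } catch (err) {
    // err.code is e.g. "EACCES" (permission denied) or "ENOENT" (missing dir).
    return { writable: false, reason: err.code || err.message };
  }
}
```

If this reports `EACCES` or `ENOENT` for the CSV folder, the crawler most likely fails at the same point.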
@julienschmidt checked :( Seems to be all working! Pffffff
Maybe it makes sense to run an instance locally with the same DB and at times copy over the CSVs?
status? @julienschmidt @sacdallago :)
Fixed (was not an ez-pz problem to solve)