Feature: Limit Fredy to specific hours for deployment on a server
tsteffek opened this issue · 6 comments
Hi!
Great project you got here. It really took my flat searching to the next level.
So, about this feature. My thinking was: if I only have 1k requests on Immoscout per month (because 20€ per month for 10k isn't exactly a bargain), it would be ideal to only scrape Immoscout when I was able to respond anyway. Obviously it's useful for a lot more applications, like reducing Fredy's spam during times you can't answer anyway and so on.
The best thing: I already implemented it over at my fork.
The not so best thing: I don't really have the time for thorough testing and stuff, so it's probably pretty rough around the edges.
Things to improve I've noticed:
- validation: currently it's possible to have begin time be after end time. This leads to never-working hours. Someone would have to either throw an error for this or consider that the end time might be the next day.
- Similar issue with midnight, as the date pickers display that as 0:00.
- design: I've just thrown two inputs in there, they don't really fit the theme.
- editing: currently you can't, you have to delete and reenter. I was a bit lazy there.
- actual testing: I mean, it works, but, you know how it is.
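The validation point above (begin after end, and midnight shown as 0:00) can be handled by treating a begin time later than the end time as a window that wraps past midnight. What follows is only a minimal sketch of that idea, not the code from the fork: the function names `parse_hhmm` and `is_within_working_hours` are hypothetical, and treating begin == end as "always on" is an assumption.

```python
from datetime import time


def parse_hhmm(value: str) -> time:
    """Parse 'H:MM' or 'HH:MM' (e.g. '0:00') into a time; raises ValueError on bad input."""
    hours, minutes = value.split(":")
    return time(int(hours), int(minutes))


def is_within_working_hours(now: time, begin: time, end: time) -> bool:
    """Return True if `now` falls inside the window [begin, end)."""
    if begin == end:
        # Assumption: identical begin and end means "no restriction".
        return True
    if begin < end:
        # Ordinary same-day window, e.g. 08:00 to 22:00.
        return begin <= now < end
    # begin > end: the end time belongs to the next day, e.g. 22:00 to 06:00,
    # or 08:00 to 0:00 meaning "until midnight".
    return now >= begin or now < end
```

With this reading, a window of 8:00 to 0:00 stays active until midnight instead of never being active, and 22:00 to 6:00 covers the night.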
I realize you might have some financial stakes in actually getting people to buy ScrapingAnt memberships, so I'm not mad if this doesn't interest you.
If you're interested, hit me up with how I can help you transition the changes to your repo. Since it's not quite finished it would probably be ideal to set up a PR to a feature-branch on your side.
Hi @tsteffek
first of all, thanks for contributing, this is the heart and soul of open source and every contribution/idea is highly appreciated :)
Let me answer this step by step:
> because 20€ per month for 10k isn't exactly a bargain

If you use the 10% voucher (written in the readme) and you convert dollars to euros, then you pay ~15 EUR per 10k requests, which I find extremely fair.
> if I only have 1k requests on Immoscout per month

If you run the job every hour for 30 days in a month, that's 720 requests. If you have 2 jobs, you can still run them every 1h 25mins (roughly) without breaking the rate limit...
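The arithmetic behind these numbers can be checked with a few lines. This is just a back-of-the-envelope sketch: the function name is made up, and it assumes a 30-day month and that every job run costs exactly one request.

```python
def min_interval_minutes(monthly_quota: int, jobs: int, days: int = 30) -> float:
    """Shortest interval (in minutes) at which `jobs` jobs can run within the quota.

    Assumes each run of each job costs exactly one request.
    """
    minutes_in_month = days * 24 * 60
    runs_per_job = monthly_quota / jobs
    return minutes_in_month / runs_per_job


hourly_requests = 24 * 30                      # one job, once per hour
two_jobs = min_interval_minutes(1000, jobs=2)  # about 86 minutes
one_job = min_interval_minutes(1000, jobs=1)   # about 43 minutes
print(hourly_requests, two_jobs, one_job)
```

So two jobs fit into the free 1k requests at an interval of a bit under an hour and a half, while a single job could run roughly every 43 minutes.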
> I realize you might have some financial stakes in actually getting people to buy ScrapingAnt memberships
No I don't get any money. Not at all. It would be nice to have some sort of compensation for my work, but I don't rely on that tbh ;)
In general, I'm not sure I like the idea of working hours. Fredy is meant to do the "searching job" for you, running as often as possible to give you the best chance to find the apartment you're looking for. Sometimes, the best apartments are being added to the service(s) during off hours (trust me, I've done a lot of research on this).
I also don't really understand the part of "spam". I'm not sure how specific your search params are, but even if I run Fredy with very generic params, I'm not getting more than 10-15 offers a day...
Lastly, I want to emphasize that I want to support the folks at ScrapingAnt. They provide their service mostly for free, but that doesn't mean that it's free for them. Each request is a headless browser that's being loaded. Every request costs them real money, so if you need more than 1000 requests for some reason, I'd highly recommend paying the 15 EUR for as long as you're searching for a flat. You're helping the community with this. (Again, I'm not getting any money here, it's just my opinion.)
That being said, if you still want to implement "working hours", I'd suggest not adding it to the UI. I don't think this is a feature that many users would use, so I'd say it's better suited to the config.json.
wdyt?
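To make the config-level idea concrete, here is a rough sketch of how such a setting could be read. The key names `workingHours`, `from` and `to` are assumptions chosen for illustration and not a confirmed part of Fredy's config.json; a missing or empty entry is taken to mean "run around the clock".

```python
import json
from typing import Optional, Tuple


def read_working_hours(config_text: str) -> Optional[Tuple[str, str]]:
    """Return (begin, end) as strings, or None if no working hours are configured.

    The 'workingHours' / 'from' / 'to' keys are hypothetical names.
    """
    config = json.loads(config_text)
    hours = config.get("workingHours") or {}
    begin, end = hours.get("from"), hours.get("to")
    if not begin or not end:
        # Nothing (or only half a window) configured: no restriction.
        return None
    return begin, end


example = '{"interval": 60, "workingHours": {"from": "07:00", "to": "22:00"}}'
print(read_working_hours(example))
```

Keeping the setting optional means existing config files would keep working unchanged.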
Right, first off, I'm a fellow Programmer so I know how much hard work flows into a service like ScrapingAnt and I'm all for supporting them. And if that is your reasoning to not include the feature, that is completely fine with me.
Yet I want to elaborate on my side a bit. I'm currently studying and living on less than ~750€ a month (that's before rent), so I'm naturally wondering whether I have to pay 15€ per month, especially with a new flat coming up which will require furniture, maybe a kitchen... However, that's not all. If I were using the 10k requests, I'd gladly pay. But with one Immoscout search running every 30 minutes I end up at 1,440 requests a month, which is 440 over the free 1k, so it's 15€ for 440 requests.
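For reference, the 440 figure and the effect of pausing for part of the day can be worked out as follows. This is an illustrative sketch only: the helper name is invented, a 30-day month and one request per run are assumed, and the 16 active hours per day is an assumed example of a working-hours window.

```python
FREE_QUOTA = 1000  # free requests per month mentioned in this thread


def monthly_requests(interval_minutes: int, active_hours_per_day: int = 24, days: int = 30) -> int:
    """Requests per month for one job, assuming one request per run."""
    runs_per_day = active_hours_per_day * 60 // interval_minutes
    return runs_per_day * days


around_the_clock = monthly_requests(30)                      # 48 runs/day * 30 days
overage = around_the_clock - FREE_QUOTA                      # requests beyond the free tier
with_pause = monthly_requests(30, active_hours_per_day=16)   # assumed 8-hour nightly pause
print(around_the_clock, overage, with_pause)
```

Under these assumptions, a 30-minute interval restricted to 16 hours a day stays just below the free quota.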
Ok, apart from my poor ass, why is it universally useful? I've set up a Telegram group with my girlfriend that Fredy posts to. You can't configure Telegram to mute a group between specific hours, e.g. while we sleep, and it's a hassle to mute it for the next 8 hours every evening. So instead I set up Fredy to not post during those hours. Yes, there might be an interesting flat appearing at 2 am, but best case I'm sleeping, worst case Fredy just woke me up, and seriously, I'm still not going to apply for a flat at 2 am when I've already been asleep. So either way it was a wasted look-up (for both me and the ScrapingAnt guys). Additionally, come morning I'll have to sift through the night's offers instead of just getting a collection of the latest offers still online.
Altogether I think that's a valid use-case for a lot of people.
About not implementing it to the UI, I already did. Like I said, it's all already there on my fork (branch working_hours), you just have to grab it and maybe write a test for it. Go check it out. I'd love to have the time to complete the PR and finish it myself (maybe in some weeks), but I'm afraid I'll have to leave it in your hands from here on out. If you only want to pick the backend part and move it from db (I've implemented it per-job) to config, fine by me.
(P.S.: about the spam part.. we're searching in Berlin and set up 6 providers. We're getting at least 1 post every half hour, most of the time it's 3-4. Some are duplicates because people insert their offers at e.g. both Immonet and Immowelt, filtering those might be another interesting feature request.)
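The duplicate filtering mentioned in the P.S. could look roughly like the sketch below. It is purely hypothetical: the field names `title`, `price` and `size` and the idea of matching on a normalized title plus price and size are assumptions, not how Fredy or any provider actually represents listings.

```python
from typing import Dict, Iterable, List, Tuple


def listing_key(listing: Dict[str, object]) -> Tuple[str, object, object]:
    """Build a comparison key: lowercased, whitespace-normalized title plus price and size."""
    title = " ".join(str(listing.get("title", "")).lower().split())
    return (title, listing.get("price"), listing.get("size"))


def drop_duplicates(listings: Iterable[Dict[str, object]]) -> List[Dict[str, object]]:
    """Keep the first listing seen for each key, preserving the original order."""
    seen = set()
    unique = []
    for listing in listings:
        key = listing_key(listing)
        if key not in seen:
            seen.add(key)
            unique.append(listing)
    return unique


offers = [
    {"provider": "immonet", "title": "Bright 2-room flat", "price": 950, "size": 58},
    {"provider": "immowelt", "title": "bright  2-room FLAT", "price": 950, "size": 58},
    {"provider": "immowelt", "title": "Studio near park", "price": 700, "size": 32},
]
print([offer["provider"] for offer in drop_duplicates(offers)])
```

A key this strict will miss reposts with edited titles or prices, so it trades recall for a low risk of hiding distinct flats.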
@tsteffek I assume you don't have the time to check this, so I've merged it. Hopefully you like the feature(s).
Yea exactly, sorry. Looks nice. Setting that on config level is probably fine, if we're assuming people use that to not be bothered while sleeping.
On the other hand I'm always for maximum flexibility and was actually wondering whether the interval should be switched to Job level as well - apparently there are offers in Berlin which stay online for only 3-5 minutes, so one could adjust that for different sites/locations. That might be personal preference tho.
Interesting idea