Scheduling breaks completely on long intervals
fweb-dev opened this issue · 47 comments
I stumbled upon toad-scheduler while looking for an improved alternative to the native setInterval and I'm liking the simplicity of this project, great job on that aspect.
However this library doesn't fix a crucial core problem of vanilla Timers: The timeout length must fit into a signed 32-bit integer.
It's a quite frankly pretty unnecessary limit of NodeJS and most browsers, you can not set a timeout for longer than
2 ^ 31 - 1 = 2 147 483 647
milliseconds. Now that seems like a lot but take this example:
const { ToadScheduler, SimpleIntervalJob, Task } = require('toad-scheduler')
const scheduler = new ToadScheduler()
let myTask = new Task('someString', () => { console.log("I've waited a long time for this") })
let myJob = new SimpleIntervalJob({ hours: 720, }, myTask) // or { days: 30, }
scheduler.addSimpleIntervalJob(myJob)
The goal of this code would be executing the Task every 30 days, which is a reasonable interval time for server sided applications in my opinion. However this will instantly break with this warning:
(node:4229) TimeoutOverflowWarning: 2592000000 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.
That's because neither setTimeout or setInterval can handle anything bigger than 24.85 days of a timeout. Now since this module also uses these functions without modification, the same limits apply and the timeout gets set to 1ms right after the warning, meaning your console will be spammed into oblivion when running this.
Now I realize this scheduler is more focused on shorter, more repetitive tasks but I still think this is a fix-worthy issue. Especially because this shouldn't be extremely hard to fix once you understand where this limit applies.
I would probably write a wrapper function around the native setInterval and make sure that
- A - No delay value higher than the limit of 32-bit integer is ever passed to setInterval or setTimeout, as this will create the node warning and mess up the Interval completely
- B - Too big delay values are capsuled in multiple intervals that keep rescheduling themselves until the desired delay is met, then call the callback function and do it all over again
Not anytime soon as I'm pretty busy with other projects atm but some time sure, would be happy to help
Maybe you can think of a more elegant solution aswell, this is just what I thought about off the top of my head without looking deeply into anything
@fweb-dev I've made the library throw an error in such case in 1.4.0 to reduce the amount of surprise, but it would be great to have a proper fix. I don't think I'll have time to implement it myself in the nearby future, though.
btw, another interesting thing about using setInterval
: https://stackoverflow.com/questions/8173580/setinterval-timing-slowly-drifts-away-from-staying-accurate
Not sure if it actually matters in majority of cases, but worth keeping in mind. Probably a more accurate auto-compensating engine should be created in the future.
Oooo looks interesting. So the issue is with setTimeout having a value limit? I looked at the suggestion about how to fix it, and "A." makes sense to me about checking whether the value is greater and throwing an error, it looks like you already did that?
For the second suggestion B, could you explain to me what is meant by "big delays are capsuled in multiple intervals that keep rescheduling themselves"?
Thanks :)
@AndreWillomitzer Yeah, part 1 is done.
Part 2 works like this:
Imagine that limit for a single interval is 100 msecs, and user is trying to schedule 400 msecs. We will create a wrapper that will be counting hops remaining (3) and then schedule itself to fire in 100 msecs. After it is fired, it will decrease amount of hops remaining and then schedule itself to fire in 100 msecs. Repeat until 0 hops are left. Then original task is fired, and a new wrapper with 3 hops is created.
Additional details - I would prefer to keep SimpleIntervalJob simple, so creating LongIntervalJob for that purpose is preferable. Probably SimpleIntervalEngine could be reused (and renamed to IntervalEngine that supports both), but I'm not sure about that yet, feel free to add new engine if needed.
Hmmmmm well adding a new engine seems a bit "out of my scope" if that makes sense. I get what you mean about counting the intervals remaining.
I am just not sure at all how this method would differ from the SimpleIntervalJob file other than the "start()" method handling the const time being greater than the limit.
I guess this is a basic question as well but what do you mean by "firing a task" and "create a wrapper"?
@AndreWillomitzer Yeah, avoiding creating new engine would be preferable :)
other than the "start()" method handling the const time being greater than the limit.`
pretty much this :)
"firing a task"
Executing payload of a task, provided by user for a job. Job defined when something happens. Task defines what happens.
"create a wrapper"
Before we can execute original Task as defined by user, we need to be creating and executing interim tasks that would be doing the time-keeping, checking if it's already time to execute original task. That's what I mean by wrapper, some higher-level job that keeps track of how much time passed since original task creation.
Well, I see what the issue is... could you give an example of creating interim tasks?
I know that we need to compare the current time, with the time they want to execute the task and if it's 0 then it's time.
But given that I'm new (as of this morning) to this project, could you provide an example of creating a "higher-level-job" to manage something?
I appreciate you talking to me about this by the way!
@AndreWillomitzer Take a look at this: https://github.com/kibertoad/toad-scheduler/blob/main/lib/engines/simple-interval/SimpleIntervalJob.ts
This is a job. It gets created with a schedule (telling when task is to be fired), and a task (telling which logic to execute).
The way I see it, if you get a schedule that results in exceeding the limit, within constructor you would have to create a new job calling new SimpleIntervalJob()
, and passing to it a schedule of maximum supported length
and a newly constructed task that would include aforementioned logic of checking how many hops are left, and creating a new SimpleIntervalJob in case there are still hops left. Note that this would result in job having a child job as a field, and calling start and stop operations should be invoked on a child job when called.
Please let me know if you would like more detailed explanation of any of those steps!
Thanks that helps! In which constructor are you referring to for calling "new SimpleIntervalJob()"? I'm also a little confused about how to tell if the schedule is exceeding the limit.
A job constructor receives schedule and task. Schedule has "toMSecs()" method which you use to check how many miliseconds to set the job interval execution to right? So as we said before, I can check if "time" variable is greater than 28 days, if it is, I should... call the "new SimpleIntervalJob()" inside where exactly?
Sorry haha this is my first attempt at actually contributing so I want to understand and ask a lot of questions first :)
constructor(schedule: SimpleIntervalSchedule, task: Task | AsyncTask, id?: string) {
super(id)
this.schedule = schedule
this.task = task
}
This is the constructor that I have in mind, I think this is the best place to evaluate if job is over limit and child jobs need to be created.
There is already logic for determining too long jobs, actually! In the same file, see
// See https://github.com/kibertoad/toad-scheduler/issues/24
if (time >= 2147483647) {
throw new Error(
'Due to setInterval limitations, no intervals longer than 24.85 days can be scheduled correctly. toad-scheduler will eventually include a workaround for this, but for now your schedule is likely to break.'
)
}
Okay so I would call "new SimpleIntervalJob()" in the constructor? Like:
constructor(schedule: SimpleIntervalSchedule, task: Task | AsyncTask, id?: string) {
super(id)
if(this.schedule.toMSecs() > MAX JOB LENGTH){
}else{
this.schedule = schedule
this.task = task
}
}
Or do you mean that I should create the new SimpleIntervalJob() inside the place where the error is now?
No, inside the constructor, just reuse existing logic for determining when interval is too long. Do you think you are proficient enough with JS to write the wrapper task yourself, or would you prefer me to do it? I don't mind taking part of the task, if you create a branch and do the basic structure for it, but also happy to guide you through it.
Hmmmmm well, I do want to help in some way if I can so it would be cool if we could collaborate like that. I would say JS is my most proficient language I am just a student still so sometimes it requires a bit of extra "umph" and explanation.
It's part of why I was bummed out on Colorette because I feel like I totally could have fixed that issue.
I want to learn what I can, but in the case of this issue I am just not sure which part to start with to get going on it. When you say "reuse existing logic in constructor", do you mean this:
constructor(schedule: SimpleIntervalSchedule, task: Task | AsyncTask, id?: string) {
super(id)
this.schedule = schedule
let scheduleTime = this.schedule.toMsecs();
if(scheduleTime >= 2147483647){
//not sure what to do here exactly.
}
this.task = task
}
Well maybe I will attempt to tackle this, I actually technically have 1 month to fix a "bigger" issue, so I can always crack away at it and see how it goes. My professor recommended this. He said just dig in and work on it to get a better idea of the issue at hand.
If there is no rush to fix this, I would like to try. No promises, sir!!! :)
Thanks for being patient.
Andre
When you say "reuse existing logic in constructor", do you mean this:
Yes, exactly. And inside would be that logic that I mentioned, creating a child job with wrapper task in it, which I guess we would need to discuss in more details :). Do you understand how it's supposed to work on a high level?
I actually technically have 1 month to fix a "bigger" issue
Sounds good! And in this case this one is definitely more of a bigger issue than Colorette one was :D
Right, that sounds like a good approach. So I'd use new SimpleIntervalJob() inside the logic. The part I am not sure about is the "wrapper task". I would pass this job a schedule and the task with the logic included to check hops. You mentioned earlier:
"Imagine that limit for a single interval is 100 msecs, and user is trying to schedule 400 msecs. We will create a wrapper that will be counting hops remaining (3) and then schedule itself to fire in 100 msecs."
The idea is that you'd have to execute the internal job creation/hop thing 4 times once per 100ms interval. I just don't quite know how it would look. Would I use setInterval() with the max length of time to "schedule itself to fire"?
Thanks :D
@AndreWillomitzer Create a task which knows that we need to do 4 hops. Create job that triggers in 100 ms. Task fires, sees there are hops to be done, creates another task with 3 hops, creates another job, we keep going.
Ideally we shouldn't use any external setIntervals, just rely on existing job/task mechanism.
I think I need to Fork and play around with the tool to see how the task/job mechanism works better.
I appreciate you letting me try to fix this. My prof says it's a cool bug to work on haha.
Heya sir it's me, sorry got busy with school :)
I was wondering, how can I test any changes I make to the code? For obvious reasons, if I "npm i" it uses the published package and not my test one. So, if I want to try out the task/jobs with my changes, how do I do that?
Forked the repo/cloned it and had a peek around and then got stuck about how I'd test it.
Thanks!!!
@kibertoad Hello sorry if you're busy Igor.
I was able to test the app and got the counter working with a console.log.
So far this change inside SimpleIntervalJob still gives the error so that's the first step:
But as you can see I'm still a little confused about the "hops" and how to get it to recognize how many seconds we are over the limit and how to do the wrapper haha.
I need to wrack my brain about our whole "hops" conversation too. I guess it's like task-ception because I'm creating a task inside the Job engine file lol. Do you have any tips?
Sorry, missed your previous message. I think best approach would be through tests. There is an existing one for long intervals, you can adjust it to check if it triggers the expected amount of times instead throwing an error.
And yes, it is exactly task-ception. The way I see it, we would create a new task on each hop, and store number of remaining hops in it.
I see, thanks! I assume this is the file you meant? https://github.com/kibertoad/toad-scheduler/blob/main/test/SimpleIntervalJob.spec.ts#L249
I've used Jest once before in my life in school... how can I run a test I create?
I assume instead of throwing the error I'd open up the arrow function with { }, and inside that simple job I'd create a task (correct me if I'm wrong)?
Where I'm still lost is what am I supposed to do with this "hop" knowledge? How do I get the number of hops? I don't quite understand how spawning the child jobs/tasks will allow the user to have their job execute every 30 days for example haha. Could you help me understand that?
Thanks again for being cool and I think I'm slowly understanding :)
I see, thanks! I assume this is the file you meant? https://github.com/kibertoad/toad-scheduler/blob/main/test/SimpleIntervalJob.spec.ts#L249
Yup, that is the spot!
https://stackoverflow.com/questions/42827054/how-do-i-run-a-single-test-using-jest
I assume instead of throwing the error I'd open up the arrow function with { }, and inside that simple job I'd create a task (correct me if I'm wrong)?
I think constructor would be more appropriate place for that logic than where we throw an error now, but I might be wrong. Worth experimenting with it a bit.
I don't quite understand how spawning the child jobs/tasks will allow the user to have their job execute every 30 days for example
If you want to trigger something in 10 seconds, you can create one task for 1 second, then after that one finishes another one for one second, all the way till 10 seconds pass. That's how you can break one period that is too long into a series of shorter periods.
How do I get the number of hops?
I've though about it some more, actually having just one child task is sufficient. What you can do is this:
const initialHops = 10
let hopsLeft = initialHops
const task = new Task('simple task', () => {
hopsLeft--
if (hopsLeft === 0) {
// fire the real payload and reset the amount of hops
}
})
Hmmmm... so if I'm understanding you correctly, I should maybe take the amount of time in milliseconds that they're OVER the setInterval limit, and use that as my way to find the initial hops.
If they schedule 25 days like in the example, I need to convert that to milliseconds (using your function) and then find out the difference between that and the limit so I can "eat" that time with the tasks like you mentioned? Then when I don't have any more time to eat, I fire the original task like you have in the if statement.
Does that sound accurate? One thing I'm not sure about is how often I would fire the "time eater" task,and what did you mean by reset the amount of hops (inside the if).
Thanks :D
If they schedule 25 days like in the example, I need to convert that to milliseconds (using your function) and then find out the difference between that and the limit so I can "eat" that time with the tasks like you mentioned?
Yeah, I think that's correct. And obviously you would need to split that time to be eaten over supported interval repeated N times.
One thing I'm not sure about is how often I would fire the "time eater" task
You would have to calculate that. E. g. if setInterval limit is 1 day, and your time is 6 days 4 hours, you invoke timeeating task 6 times, and then fire real task in 4 hours.
what did you mean by reset the amount of hops (inside the if).
After real task was triggered, you need to wait for 6 days and 4 hours again, so you need to start the timeeating task all over.
hey @kibertoad hope all is well. We had Thanksgiving this weekend and I also got to commit to Jasmine framework so that was sick. I will be focusing on this more in the next 2 weeks as I hope it to be my last pull request of my 4 :D
Just letting ya know so you don't think I forgot about the project haha
Hi there sir, I was wondering if you could take a look at this test. As of now I was wondering if a while loop is a good solution. My professor had a great idea of using "polling" with my while loop. Basically, get the time in the future by taking the date.now() + however many milliseconds they schedule. Then do a while loop checking if the current time is still less than the future time we wanted to execute the task, and if it is, we waste some time.
Let me know what you think and I'm not sure how to do "scheduleMs" as I cannot actually pass schedule to the toMsecs() function.
@kibertoad
Thanks :)
it('correctly handles very large intervals', () => {
//a job is scheduled every "x" days/minutes/hours, and fires a task every interval.
//I need to see how long they schedule for and create tasks/jobs until the time between the max and how long after the max they schedule it for has allotted.
let counter = 0
const scheduler = new ToadScheduler()
let scheduleMs = Date.now() + toMsecs(scheduler) //find the time of the future date to execute the task.
const task = new Task('original task', () => {
console.log('testing the task fired')
})
while (Date.now() < scheduleMs) {
const timeEater = new Task('time eating task', () => {
console.log('doing some time wasting...')
})
const timeJob = new SimpleIntervalJob(
{
runImmediately: true,
},
timeEater
)
//maybe addSimpleIntervalJob(timeJob)???
}
// fire the real payload and reset the amount of hops
const job = new SimpleIntervalJob(
{
runImmediately: true,
},
task
)
scheduler.addSimpleIntervalJob(job)
counter++ //this is how we know the original payload fired.
/* expect(() => scheduler.addSimpleIntervalJob(job)).toThrow(/can be scheduled correctly/) */
expect(counter).toBe(1)
scheduler.stop()
})
@kibertoad Hey sir are ya there? Sorry for radio silence I've been job interviewing for co-op's (internships) and also doing classwork. I have next week off if you would be able to help me finish this issue.
All the best,
Andre :)
Sorry for the delay in answering. Looking at it now.
Can you explain what the problem that you have is? Considering that timeeating logic needs to exist inside the toadscheduler itself (probably inside the SimpleIntervalJob, but maybe somewhat higher), you should have access to the real schedule in there.
@kibertoad I think I made a tiny bit of progress...
constructor(schedule: SimpleIntervalSchedule, task: Task | AsyncTask, id?: string) {
super(id)
let scheduleMs = toMsecs(schedule) //find the time of the future date to execute the task.
while (scheduleMs > 2147483647) {
//while the scheduled time is greater than the allowed maximum
scheduleMs = toMsecs(schedule) //keep poling every time the loop runs.
const timeEater = new Task('time eating task', () => {
console.log('doing some time wasting...')
})
const timeJob = new SimpleIntervalJob(
{
seconds: 1,
},
timeEater
)
scheduler.addSimpleIntervalJob(timeJob)
}
// fire the real payload
const job = new SimpleIntervalJob(
{
runImmediately: true,
},
task
)
scheduler.addSimpleIntervalJob(job)
this.schedule = schedule
this.task = task
}
I think this is a good way to do it where I get the schedule time, and then check if it's greater than the max, and keep checking every time the loop executes. Maybe a bit inefficient but it could work...?
Thanks for being patient and I guess ignore my other question because I was trying to do the coding in the test file, when doing it inside the SimpleIntervalJob would have been smarter.
Edited Later: I think I should fire the original task after scheduleMs, not instantly. Because after the loop, the scheduleMs SHOULD be guaranteed to be less than the maximum right?
Let me know what ya think :D appreciate it my dude.
@kibertoad Okay kind of think we're getting close sir. In this version, I check how much they're over by in MS. Then I run a while loop, and reduce the time by 1000ms every time the loop runs, and also execute some dummy task every 1 second.
I had an issue with this method though, it fails all the tests when I run npm test because it says "maximum call size stack exceeded" so I believe I am running the loop too many times. Do you have a suggestion for how to limit this execution?
export class SimpleIntervalJob extends Job {
private timer?: Timeout
private readonly schedule: SimpleIntervalSchedule
private readonly task: Task | AsyncTask
constructor(schedule: SimpleIntervalSchedule, task: Task | AsyncTask, id?: string) {
super(id)
let scheduleMs = 2147483647 + toMsecs(schedule) //find the tptal of the schedule time and the limit.
while (scheduleMs > 2147483647) {
//while the scheduled time is greater than the allowed maximum
scheduleMs = scheduleMs - 1000 //minus 1000ms
const timeEater = new Task('time eating task', () => {
console.log('doing some time wasting...')
})
const timeJob = new SimpleIntervalJob(
{
seconds: 1,
},
timeEater
)
scheduler.addSimpleIntervalJob(timeJob)
}
// fire the real payload
const job = new SimpleIntervalJob(
{
milliseconds: scheduleMs, //scheduleMs should be 2147483647 I think?
},
task
)
scheduler.addSimpleIntervalJob(job)
this.schedule = schedule
this.task = task
}
@kibertoad Would you recommend maybe doing things in start() function rather than constructor? Sorry brother I am just really really stuck now.
@AndreWillomitzer I think problem is that you create an infinite loop, when constructor keeps invoking another constructor ad infinitum - you only ever need to create one timeeating task at a time, and only create the next one after first one concludes and if target time was not reached yet.
@AndreWillomitzer I think problem is that you create an infinite loop, when constructor keeps invoking another constructor ad infinitum - you only ever need to create one timeeating task at a time, and only create the next one after first one concludes and if target time was not reached yet.
Yeah I think it's a infinite loop problem. Hmmmmmm so, instead of inside my while loop where do you think I should create this initial timeeating task?
I am also somewhat unsure of if the constructor is the right place, because didn't you have the "time" check inside start() function before?
@kibertoad I am sort of stuck about where to create the timeEating task instead of the while loop, and also once I create this timeEating task in the constructor, do I need to create a job with it and then addSimpleIntervalJob()?
I am not totally sure what you meant by "create one timeeating task at a time". The idea of doing it if the target time is not reached is what I wanted to do using my while loop but that seems to overload the stack.
@AndreWillomitzer You can't create all of your timeeating tasks immediately. See our previous discussion. You create one. You wait for it to finish. Then it checks if we are already there and can fire out original task, or not. If not, we create another timeeating tasks and keep repeating the process. There should be no while loop.
I think it might be easier to create an internal job within a job, and store it there, without explicitly calling addSimpleIntervalJob, and just proxy all calls to outer job to be invoked on inner job if it exists.
@kibertoad Hmmmmm... I wonder about this way? I check if the MS is over, and how much by, and then inside the timeeating task I subtract how much I'm over by and fire the dummy task after the "overTime" has elapsed.
constructor(schedule: SimpleIntervalSchedule, task: Task | AsyncTask, id?: string) {
super(id)
let scheduleMs = toMsecs(schedule) //find the tptal of the schedule time and the limit.
if (scheduleMs > 2147483647) {
const overTime = scheduleMs - 2147483647 //find out how much we're over by
const timeEater = new Task('time eating task', () => {
scheduleMs = scheduleMs - overTime
console.log('doing some time wasting...')
})
const timeJob = new SimpleIntervalJob(
{
milliseconds: overTime,
},
timeEater
)
scheduler.addSimpleIntervalJob(timeJob)
}
if (scheduleMs <= 2147483647) {
// fire the real payload
const job = new SimpleIntervalJob(
{
milliseconds: scheduleMs, //scheduleMs should be 2147483647 I think?
},
task
)
scheduler.addSimpleIntervalJob(job)
}
this.schedule = schedule
this.task = task
}
I super appreciate you helping me along by the way, as I'm sure you could have flown through this yourself easy enough!
@kibertoad Even without the loop I still get the max stack size error. One thing I am confused about is how can I recursively create the tasks without running into this issue?
I understand what you mean by check if we're over the limit, and create a task. But how would I check if it's time to fire the original task from inside my dummy task? I thought about using
if(Date.now() - scheduleMs === 0){
}
It's really the recursion of creating the timeeating task that's confusing me a lot if you can try to help me understand that part(again, sorry, I know you tried before). Mostly I learned if you want to do something many times you use a loop but in this case I can't so I'm really stuck on how to make the task over and over until I shouldn't anymore without a loop.
@kibertoad Hi sir, made some progress I believe. It passes 4 of the tests but for some reason it's hanging on my long interval test. Based on this new constructor code and my test, do you have any tips on why it's stalling out on the command line?
Here is the new constructor code and the test case.
Constructor
constructor(schedule: SimpleIntervalSchedule, task: Task | AsyncTask, id?: string) {
super(id)
const futureDate = Date.now() + toMsecs(schedule) //find when in the future we want to execute.
let scheduleMs = toMsecs(schedule) //find the tptal of the schedule time and the limit.
const overTime = scheduleMs - 2147483647 //find out how much we're over by
const timeEater = new Task('time eating task', () => {
scheduleMs = scheduleMs - overTime
console.log('doing some time wasting...')
if (Date.now() === futureDate) {
//if our current date matches the time when we wanted to execute
// fire the real payload
const job = new SimpleIntervalJob(
{
runImmediately: true, //scheduleMs should be 2147483647 I think?
},
task
)
scheduler.addSimpleIntervalJob(job)
} else {
const timeJob = new SimpleIntervalJob(
{
milliseconds: overTime,
},
timeEater
)
scheduler.addSimpleIntervalJob(timeJob)
}
})
this.schedule = schedule
this.task = task
}
Test Case
it('correctly handles very large intervals', () => {
let counter = 0
const scheduler = new ToadScheduler()
const task = new Task('simple task', () => {
counter++
})
const job = new SimpleIntervalJob(
{
days: 25,
},
task
)
scheduler.addSimpleIntervalJob(job)
expect(counter).toBe(0)
jest.advanceTimersByTime(2159999999) //just shy of 25 days
expect(counter).toBe(0)
jest.advanceTimersByTime(1) //making it 25 days
expect(counter).toBe(1)
scheduler.stop()
})
@kibertoad Hey! So my prediction was right, we have to do another PR for a repository by Nov 19th. If you can think of any moderately tough/interesting thing I could do for a PR let me know :)
Could be something like splitting things into modules or writing some new tests for Jest. Whatever is needed I suppose.
@AndreWillomitzer A couple of ideas off the top of my head:
- Browser tests using Karma;
- cron-based job type using cron-parser;
- Support for passing custom
loggingErrorHandler
.
@kibertoad Hmmmm could you give me an idea of what 3. might entail? Sounds cool. I see there's a new Logger.ts file (not sure if that was there when I worked on this honestly). Thanks!
@kibertoad I'm actually interested in 1 and 3 both but it depends what you think is more appropriate to where I'm at skill level haha. I think you have an idea how my brain works now after our adventures. I've learned a lot so far.
The testing one with Karma sounds interesting but what would the tests be compared to what we already have with Jest?
I wrote a blog about the tool for my class it has 524 views if you want to see hehe. https://dev.to/andrewillomitzer/setinterval-and-the-32-bit-debacle-5bnm
Thanks and let me know what ya think <3