keikoproj/active-monitor

Support cron-like expressions as an alternative to repeatAfterSec param

davemasselink opened this issue · 16 comments

Currently, active-monitor repeats the workflow submissions based on the repeatAfterSec spec parameter.

However, it would be more flexible if we also allowed users to specify how often they want the workflow to run using a cron-like expression.

@pzou1974 had the great suggestion to look at this library to help with parsing cron expressions: https://godoc.org/gopkg.in/robfig/cron.v2


If you want to get involved, check out the
contributing guide, then reach out to us on Slack so we can see how to get you started.

@codetamaracode, would you be interested in contributing to this issue?

@davemasselink it seems interesting

@malaDev, I had initially wanted to give @codetamaracode a chance to contribute to this task (she had contributed to another task which someone else had already begun work on so I promised her 1st dibs on some other tasks)

However, I haven't heard back from her in 1 week. Therefore, you can go ahead and contribute to this issue, if you're still interested and able to do so.

If so, just comment back here that you're taking it and I'll set you as the assignee. Thanks!

okey i will take it then

@malaDev, fantastic... I'll mark you as the assignee. And let me know if you'd like to schedule a phone/video chat to discuss a bit more as you begin.

here is how I'm planning to implement this new option

  • I will keep repeatAfterSec as the main spec of the loop
  • instead of specifying it explicitly it could be set from a new flag schedule which will set the value of repeatAfterSec. Maybe setting an Unmarshal method on HealthCheckSpec struct type would be enough here hopefully
  • both spec should never be set together, should emit error if so

In that case, there will not be major changes in the code base I guess

I already setup my environment, unit test running fine so far, but i will probably need some help somehow, especially about the global flow, could I be added in slack group?

@malaDev, I think your plan sounds great, overall

  • keeping repeatAfterSec as the main mechanism to define loop period is ideal
  • like you say, the repeatAfterSec and schedule attributes shouldn't both be provided... but if they are, we could either err OR maybe even better yet, indicate a warning log message in the controller code saying that since both are set, schedule will be ignored and repeatAfterSec will be honored
  • I do also like the idea of taking a provided schedule and immediately translating that to a repeatAfterSec value so that that attr can still be consistently present and honored.
    • However, I suspect this may need to be a "fuzzy" mapping. Cron strings can allow one to specify a time period such as "first of every month". But because of different length months, this may not cleanly be able to be mapped into a number of seconds. But, I think that's not too big a deal as long as docs are also updated to indicate that the cron-like expression will be mapped to a period using common durations (ex: 1 month is 30 days, etc)

Regarding the slack channel... I think you should be able to add yourself. Are you able to access the channel using this link? https://keikoproj.slack.com/app_redirect?channel=active-monitor

You'll likely need to create a new account. But the intention is that anyone from the community can join the slack channels without need for approval.

Finally, we haven't started using the slack channel too much yet. If you'd be up for it, we could also schedule a 30-60min window for you and I to have a video conf call where I could help get you up and running and give you some more background and context details.

a conf call will be great as it seems to be a bit complicated. It will be necessary to have some more knowledge about the whole system in order to implement something that fit to this request correctly, especially about the 3rd point you mentioned above

I should be available between 9pm-11pm GMT+3. just let me know
In meanwhile I will continue investigating the code

@malaDev, sorry for the delayed response... been sick for a few days :(

would you be up for a chat MondayTuesday night at 10pm your time (GMT+3)?

if so and if you can get the bluejeans client installed, I'll catch you at bluejeans.com/4082561212

let me know if you run into any trouble, otherwise I'll plan to talk with you soon

actually, will need to shift to Tuesday at 10pm your time

sorry, @malaDev, have been sick for the past few days and won't be able to chat with you in a few minutes as scheduled. I'll reach out again shortly to try to reschedule.

hey @malaDev, sorry to need to push out our mtg a couple times there... but I'm mostly back to normal after an annoying cold and ready to chat. You had earlier said that 9-11pm GMT+3 was a time window that often worked well for you. How about today, Wednesday, at 10:30pm GMT+3?

if so, I'll see you then on the same bluejeans chat channel I proposed before. I'll plan to sync with you then, but let me know if another day/time works better for you. thanks!

Hello @davemasselink I was away for few days, just back now. We should be able to start as soon as possible. Hopefully tomorrow same hour will work for you. bluejeans.com/4082561212 10pm

hey there @irzhywau, sure! let's meet at 10pm (UTC+3) on Wednesday. I'll be online then to chat with you.

(whoops, just flipped back to this browser tab and see I had some notes entered into the comment box but never clicked the button to comment. these questions came from the conversation which @irzhywau and I had a week or so ago)

Questions:

  • how to determine initial time the healthcheck will run? (not immediately)
  • for non constant period (ex: every month), should we fuzzily say 1mo = 30 days or recompute the repeatAfterSec

awesome! thanks to @irzhywau / @malaDev and @RaviHari for some fantastic brainstorming and then development on this feature :)