yorkie-team/yorkie

Housekeeping structure improvement

fourjae opened this issue · 3 comments

What would you like to be added:

Yorkie needs to support cluster mode and several background tasks, so the Housekeeping execution structure needs to be improved.

func (h *Housekeeping) run() {

Why is this needed:

Currently, Housekeeping runs as a background routine, independent of RPC handling, at regular intervals.

  • The housekeeping task currently only has logic to remove inactive clients, but other logic may be added later. What structure should we introduce to support that?

    • Beyond Yorkie itself, it would be a good idea to look at how other servers run batch or background routines and apply those approaches to Yorkie.

  • In Yorkie's cluster mode, I think problems will arise if multiple nodes run the same housekeeping logic redundantly at the same time.

    • A master node is elected through voting. To prevent two nodes from being elected as master, the election works by acquiring a lock in MongoDB. One option is to have only the master node run housekeeping afterwards.

    • Kubernetes uses Jobs to execute necessary housekeeping tasks.

    • In standalone mode, there may be a way to set up a cron job, or to detect a specific event and run a script.

I think we need to research good practices, weigh their pros and cons, and come up with a method that suits the Yorkie project.

A master node is elected through voting. To prevent two nodes from being elected as master, the election works by acquiring a lock in MongoDB. One option is to have only the master node run housekeeping afterwards.

For a master-slave structure based on leader election, I have implemented leader election using MongoDB: #529

Kubernetes uses Jobs to execute necessary housekeeping tasks.

I think we can use a CronJob to run the housekeeping process regularly, or a simple Job to run housekeeping on demand (this would require an API for triggering it). For implementation, we could add a yorkie CLI command such as yorkie housekeeping start that activates only the housekeeping mechanism of the yorkie server.

In standalone mode, there may be a way to set up a cron job, or to detect a specific event and run a script.

For standalone mode, either a cron job or an event-based job would be well suited to running several background tasks. For implementation, https://github.com/go-co-op/gocron would do the work for the cron job.

I read #529 and understood the comments and source code changes. Very interesting!

But I still have some questions.

  1. If housekeeping is currently the only background work, I think it would be better to have only the master node, chosen through leader election, perform it, rather than using a CronJob. What are the benefits of using a CronJob?

  2. I think additional improvements are needed not only at the infrastructure level but also at the code level.
    Looking at the housekeeping loop in the current run() function, only the deactivateCandidates task is performed at certain intervals. If other tasks are added here, they must all run at the same time.
    If another task is added and it has to run separately from deactivateCandidates, how should I improve the code? Should I create a separate run() function for each task?

If housekeeping is currently the only background work, I think it would be better to have only the master node, chosen through leader election, perform it, rather than using a CronJob. What are the benefits of using a CronJob?

I think a CronJob lets us remove the housekeeping and leader-election overhead from each server in the cluster, which frees the servers to handle more workload.

If we run housekeeping only on the master node, it will become a huge burden for that node as the cluster gets bigger (elected nodes might die because of the overhead, which could lead to something like an election storm).

I think additional improvements are needed not only at the infrastructure level but also at the code level.
Looking at the housekeeping loop in the current run() function, only the deactivateCandidates task is performed at certain intervals. If other tasks are added here, they must all run at the same time.
If another task is added and it has to run separately from deactivateCandidates, how should I improve the code? Should I create a separate run() function for each task?

I agree that we should improve the current housekeeping structure. One simple idea is to use server/backend/background.go as a reference and create several background goroutines, one per task.

Another idea is to implement something like a DAG workflow to schedule and run background processes that have dependencies on each other. I found something similar to this idea: https://github.com/fieldryand/goflow