volcano-sh/volcano

[Discussion] Migrate volcano-sh/scheduler into volcano-sh/volcano

k82cn opened this issue · 20 comments

k82cn commented

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

Description:

I propose migrating volcano-sh/scheduler into volcano-sh/volcano, because:

  1. Several features require changes to both repos, e.g. DelayPodCreation
  2. When there's a fix in volcano-sh/scheduler, it has to be cherry-picked into volcano-sh/volcano; if any are missed, there may be quality issues.
  3. We also need to re-submit the scheduler part to kube-batch, which means we have to maintain 3 repos :(

After the migration, we will

  1. add a note that the scheduler part is built on kube-batch, to follow the license requirement
  2. make sure the scheduler part uses the same API as kube-batch
  3. any others?
k82cn commented

If you have any comments or suggestions, please let me know :)

SGTM. I have a question here:

How do we migrate scheduler into the repo? Are we using a git submodule, or moving the code in scheduler into volcano directly?

k82cn commented

Are we using a git submodule, or moving the code in scheduler into volcano directly?

Both are OK with me. My original idea is to move the code into volcano directly, which is easier to manage; and when we donate volcano to CNCF or Kubernetes, it's also easier to migrate (but we may lose the commit history of scheduler). If we use git submodules, we might have volcano-sh/controllers, volcano-sh/schedulers, volcano-sh/apis and so on; we can work together on the volcano-sh/apis part to make sure related projects are on the same page :)

@k82cn Right now scheduler is a fork of kube-batch. If we move it into this repo, it may be hard to rebase onto upstream, IMO. I am not sure about the relationship between the upstream kube-batch and volcano scheduler. If we do not need to rebase the upstream, then both ways work.

k82cn commented

I am not sure about the relationship between the upstream kube-batch and volcano scheduler.

They are almost the same, with minor differences because of release cycles. Some features require interaction between the controller and the scheduler; the scheduler part will be migrated to kube-batch because of its scope, and we'll make sure the scheduler part is NOT bound to the volcano job.

If we do not need to rebase the upstream, then both ways work.

We do rebase manually right now :)

Then both work for me

Some questions :)

Several features require changes to both repos, e.g. DelayPodCreation

Right now we need to modify code in both volcano-sh/scheduler and volcano-sh/volcano. After migrating volcano-sh/scheduler into volcano-sh/volcano, we will still need to modify both the scheduler code and the related volcano code, just in different directories instead of different repos. Is there any major difference?

When there's a fix in volcano-sh/scheduler, it has to be cherry-picked into volcano-sh/volcano; if any are missed, there may be quality issues.

Most of the code in volcano-sh/scheduler and volcano-sh/volcano is independent. Is there much code that needs to be cherry-picked?

k82cn commented

we will still need to modify both the scheduler code and the related volcano code, just in different directories instead of different repos. Is there any major difference?

For now, volcano includes scheduler as a vendored dependency for release and e2e tests, so every PR in scheduler is cherry-picked into volcano-sh/volcano. If a change involves both sides, we need to review the PR in scheduler, bump the vendored copy in volcano, and then review the PR for the other part in volcano. Another option is git submodules, as Ce suggested, but the PRs would still have to be reviewed in different repos.

/cc @mrbobbytables @jeefy, who are familiar with the k8s process and may have some suggestions :)

LGTM

git submodules are a little tricky.

jeefy commented

I might be missing the full picture, so I'm sorry. :(

I feel like cherry-picking upstream commits into v/scheduler is the wrong choice. Also, I feel like until Volcano has a permanent home (i.e. CNCF donation), the scheduler code should remain in k-sigs/kube-batch.

Is there a technical or a licensing issue vendoring k-sigs/kube-batch?

k82cn commented

Is there a technical or a licensing issue vendoring k-sigs/kube-batch?

The issue is that we're going to modify v/scheduler for the Volcano release, and the release cycles of volcano and kube-batch may be different.

For now, we fork kube-batch as v/scheduler, which takes a lot of work to update the vendored copy (from v/scheduler to v/volcano); so I opened this discussion to see how to reduce that effort.

Sorry for the late reply; I thought I had submitted comments before.

I can see some benefits of hosting the scheduler code in tree, but we need to be careful to make sure that code changes happening in two places are trackable and easy to sync in both directions.

Manually copying files or cherry-picking commits into the tree would make the history massive, which is not recommended.

I've created an example PR (#264) to show how things would look if we decide to host the scheduler code in-tree. The code is checked in by scripts (check out this for details), and we can use similar commands to sync changes back to upstream if necessary.

With regard to the Kubernetes processes, the general goal is to establish a single source of truth. Items that aren't managed in their own repo tend to be handled via the staging directory. It serves as the source of truth for a slew of repos that are updated via the publishing bot.

The issue is that we're going to modify v/scheduler for the Volcano release, and the release cycles of volcano and kube-batch may be different.

I can't speak to the differences or upcoming changes between v/scheduler and kube-batch, but it seems like a good goal to try to bring those in line, so that kube-batch itself is referenced as the single source of truth (at least for scheduling-related items) and pulled in via vendor. As it's a sub-project, it doesn't have to adhere to the standard Kubernetes release cycle and should generally be able to align with a cadence that is usable by volcano or other projects. If the releases are hard to manage, you could also reference a specific commit after the needed feature(s) are merged.

If v/scheduler is going to diverge a fair amount and become more tightly coupled to volcano, I'd lean towards @kevin-wangzefeng's suggestion, a git submodule (@gaocegege's suggestion), or, if the in-tree code should be the source of truth, the publishing bot. Folks touching the code in the scheduler sub-directory should be cognizant that the code will (or may) be pushed upstream, and they should stage their commits wisely for easier import.

k82cn commented

If v/scheduler is going to diverge a fair amount and become more tightly coupled to volcano

That's the reason I opened this discussion; and it seems @kevin-wangzefeng's suggestion is simpler for other contributors.

k82cn commented

If there are no objections, we'd like to follow @kevin-wangzefeng's suggestion to make the process simpler for other contributors :)

To summarize:

  • To manage code in-tree for a better daily development experience:

    • Automation for bidirectional code sync, so anyone can do it when necessary -- The scripts are ready in #287.
    • Identify whether any extra changes are needed for checking in the code -- Created a new PR #288 with the scripts submitted in #287.
  • To follow licensing compliance:

    • Add a description in the main repo readme to clarify the scheduler code copyright (a major requirement of the Apache 2.0 License) -- We can do it once the scheduler code is in.
    • Integrate fossa into the project CI, to make sure the whole project is compliant with the license requirements of its dependencies -- We can do this in parallel; it actually doesn't depend on whether we decide to manage the scheduler code in-tree or not.

We can timebox lazy consensus to this Friday 23:59 Beijing Time.

k82cn commented

@asifdxtreme, please help with "Integrate fossa to project CI".

k82cn commented

/close

All tasks are done.

@k82cn: Closing this issue.
