Unify computation of number of parallel jobs
tobiasdiez opened this issue · 18 comments
Currently, the number of jobs used for parallel operations is re-computed and set in various places. In this ticket, we unify this to one central place in sage.env, which is then reused across the whole infrastructure. The only exception is the script build/bin/sage-build-num-threads which is only needed in the build of the sage-conf package. This can probably be improved as well, but we leave it for a follow-up ticket.
Changes in detail:
- Remove SAGE_NUM_THREADS_PARALLEL since it was only used for the docbuild.
- Introduce the method
thread_countinsage.envthat based on the environment variableSAGE_NUM_THREADSand the number of CPUs computes the number of parallel jobs sage should use. - Use this method (or its cached value
THREAD_COUNT) everywhere where previouslySAGE_NUM_THREADShas been used. - Remove a few places that also calculated the number of CPUs or number of parallel jobs.
- In particular, remove the computation of the number of threads to use in the make files. This is now exclusively handled through
sage.env. - Don't specify
SAGE_NUM_THREADSon CI, but let this be calculated using the number of CPUs available.
CC: @mkoeppe @jhpalmieri @tscrim @nthiery @fchapoton
Component: build
Branch/Commit: public/build/num_threads @ af4c3e2
Issue created by migration from https://trac.sagemath.org/ticket/33317
Branch pushed to git repo; I updated commit sha1. New commits:
af4c3e2 | Fix style |
If I'm reading the old code right, the default number of threads was 2, but you've changed it to 10. At least one reason to keep the default low was for the case of shared machines, where a single user should only use many cores at a time intentionally.
Replying to @jhpalmieri:
Does this duplicate #30639? If so, let's close #30639, since there is no branch there.
Yes, removing src/bin/sage-num-threads.py is indeed done as part of this ticket. Thanks for pointing this out.
Replying to @jhpalmieri:
If I'm reading the old code right, the default number of threads was 2, but you've changed it to 10. At least one reason to keep the default low was for the case of shared machines, where a single user should only use many cores at a time intentionally.
Yes, you are right. Some of the defaults changed as part of this refactoring since the defaults used where not homogeneous. Personally, I would say that we should try to use a high parallelization as the default, because most systems today have more than 8 cores (especially developer machines). People in a shared environment can still overwrite this of course. But if you think another default is more reasonable, this can be easily changed.
ALL_CAPS variables in sage.env should be strings, not integers.
Is it really necessary to introduce THREAD_COUNT as a variable? Having a function that returns this value should be enough.
Also, I don't see why the words num_threads (which developers are familiar with) should be replaced by thread_count.
Also, the point of build/bin/sage-build-num-threads (called in build/make/install) is that it initializes the number of threads in a build context based on the user's use of make -jNUM.
Please don't delete this.
Replying to @tobiasdiez:
The only exception is the script
build/bin/sage-build-num-threadswhich is only needed in the build of thesage-confpackage.
... no, that's not what it does
Replying to @tobiasdiez:
Replying to @jhpalmieri:
If I'm reading the old code right, the default number of threads was 2, but you've changed it to 10. At least one reason to keep the default low was for the case of shared machines, where a single user should only use many cores at a time intentionally.
Yes, you are right. Some of the defaults changed as part of this refactoring since the defaults used where not homogeneous. Personally, I would say that we should try to use a high parallelization as the default, because most systems today have more than 8 cores (especially developer machines). People in a shared environment can still overwrite this of course.
I'm concerned with mathematicians not knowing that they should change it and so accidentally overloading a system. We should not rely on users knowing what they're doing.
Replying to @jhpalmieri:
I'm concerned with mathematicians not knowing that they
should change it and so accidentally overloading a system.
We should not rely on users knowing what they're doing.
+1
Users who need speed will figure out how to change
a low default to something higher.
They can set the number of threads to use
for parallel computations in their init.sage.
Replying to @slel:
Replying to @jhpalmieri:
I'm concerned with mathematicians not knowing that they
should change it and so accidentally overloading a system.
We should not rely on users knowing what they're doing.+1
Users who need speed will figure out how to change
a low default to something higher.They can set the number of threads to use
for parallel computations in theirinit.sage.
+1 as well. We could very likely upset a sysadmin, especially with such a change where even a more knowledgeable user might not be paying so close attention to the Sage development cycle.
Replying to @mkoeppe:
Also, I don't see why the words
num_threads(which developers are familiar with) should be replaced bythread_count.
I took the cpu_count function from python as a blueprint, but can change the naming of course if you think num_threads is easier / more common.
Replying to @mkoeppe:
Also, the point of
build/bin/sage-build-num-threads(called in build/make/install) is that it initializes the number of threads in a build context based on the user's use ofmake -jNUM.
Please don't delete this.
I didn't delete this script, but only the one in src/bin.
Replying to @tscrim:
Replying to @slel:
Replying to @jhpalmieri:
I'm concerned with mathematicians not knowing that they
should change it and so accidentally overloading a system.
We should not rely on users knowing what they're doing.+1
Users who need speed will figure out how to change
a low default to something higher.They can set the number of threads to use
for parallel computations in theirinit.sage.+1 as well. We could very likely upset a sysadmin, especially with such a change where even a more knowledgeable user might not be paying so close attention to the Sage development cycle.
I think you misunderstood me. I didn't change the user-facing multiprocessing for calculation, i.e. what ncpus in sage/parallel returns (at least if I don't overlook something). This still first returns SAGE_NUM_THREADS if set, and only as a fallback uses the number of cores. What I changed was that other parallel processes follow the same strategy. This change should only impact devs that compile sage and run doctests. I'm not sure if it really makes sense to be more strict there than with user-facing calculations. I would expect that devs have more knowledge that they might need to restrict parallel computations in order to not negatively impact their system.
Anyway, if the special treatment of compilation and doctests is indeed preferred, what would be a good default? No parallelization at all if the user doesn't specify SAGE_NUM_THREADS?
Replying to @tobiasdiez:
Replying to @mkoeppe:
Also, the point of
build/bin/sage-build-num-threads(called in build/make/install) is that it initializes the number of threads in a build context based on the user's use ofmake -jNUM.
Please don't delete this.I didn't delete this script, but only the one in src/bin.
You deleted its use in build/make/install
Replying to @tobiasdiez:
Replying to @mkoeppe:
Also, I don't see why the words
num_threads(which developers are familiar with) should be replaced bythread_count.I [...] can change the naming of course
Yes, please change it back.