JuliaMolSim/DFTK.jl

Setting a >1 threading preference will trigger a recompilation and make it fail

Closed this issue · 2 comments

Reproduction steps:

  • Install DFTK, open julia and run using DFTK. It precompiles fine. Close julia.
  • Now open julia -t2. Run: using DFTK; DFTK.setup_threading(; n_DFTK=2). Close julia.
  • Now open julia -t2 again and run using DFTK. It has to precompile again, and will fail with the following error:
ERROR: The following 1 direct dependency failed to precompile:

DFTK

Failed to precompile DFTK [acf6eb54-70d9-11e9-0013-234b7a5f5337] to "/home/bruno/.julia/compiled/v1.11/DFTK/jl_onZ99l".
ERROR: LoadError: You set your preference for DFTK threads using set_DFTK_threads!, but only ran julia with 1 threads.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] get_DFTK_threads()
    @ DFTK ~/.julia/packages/DFTK/xNYFf/src/common/threading.jl:50
  [3] parallel_loop_over_range(fun::DFTK.var"#287#289"{Matrix{ComplexF64}, DFTK.DftHamiltonianBlock, Matrix{ComplexF64}, Array{Float64, 3}, Bool}, range::UnitRange{Int64}, storages::Vector{@NamedTuple{ψ_reals::Array{ComplexF64, 3}}})
    @ DFTK ~/.julia/packages/DFTK/xNYFf/src/common/threading.jl:72
  [4] macro expansion
    @ ~/.julia/packages/DFTK/xNYFf/src/terms/Hamiltonian.jl:144 [inlined]
  [5] mul!(Hψ::Matrix{ComplexF64}, H::DFTK.DftHamiltonianBlock, ψ::Matrix{ComplexF64})
    @ DFTK ~/.julia/packages/TimerOutputs/NRdsv/src/TimerOutput.jl:237
  [6] macro expansion
    @ ~/.julia/packages/DFTK/xNYFf/src/eigen/lobpcg_hyper_impl.jl:350 [inlined]
  [7] LOBPCG(A::DFTK.DftHamiltonianBlock, X::Matrix{ComplexF64}, B::LinearAlgebra.UniformScaling{Bool}, precon::DFTK.PreconditionerTPA{Float64}, tol::Float64, maxiter::Int64; miniter::Int64, ortho_tol::Float64, n_conv_check::Int64, display_progress::Bool)
    @ DFTK ~/.julia/packages/TimerOutputs/NRdsv/src/TimerOutput.jl:237
[other stuff below]

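It looks like the precompilation workload runs with only one thread, so the stored preference of 2 DFTK threads exceeds what is available and get_DFTK_threads() errors.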

A workaround might be to configure the ratio of DFTK threads to total Julia threads instead of an absolute number of threads. E.g. DFTK.setup_threading(; n_DFTK=c) would mean that DFTK uses max(1, floor(Int, c * Threads.nthreads())) threads for some float c between 0 and 1.
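A minimal sketch of the ratio idea, to make the arithmetic concrete. The helper name threads_from_ratio and the n_julia keyword are hypothetical and not part of DFTK; the sketch also assumes the ratio is restricted to (0, 1] and rounded down.

```julia
# Hypothetical helper, not part of DFTK: turn a thread ratio into a thread count.
# `n_julia` defaults to the number of threads Julia was started with.
function threads_from_ratio(c::Real; n_julia::Integer=Threads.nthreads())
    0 < c <= 1 || throw(ArgumentError("ratio must lie in (0, 1], got $c"))
    # Round down, but never go below one thread, so a single-threaded
    # session (such as the precompilation workload) always gets a valid count.
    return max(1, floor(Int, c * n_julia))
end
```

With this rule, threads_from_ratio(0.5; n_julia=4) gives 2, while the same preference in a single-threaded session gives 1 instead of raising an error.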

I think the best solution is to make the number of DFTK threads a runtime setting rather than a compile-time one. Not sure how easy this is in the current setup, though.
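For illustration only, a runtime setting could look roughly like the sketch below. All names here (N_THREADS, set_n_threads!, effective_threads) are hypothetical and do not reflect how DFTK stores or reads its threading configuration; the point is just that the value is resolved at call time and clamped to the threads actually available.

```julia
# Hypothetical runtime setting, not DFTK's implementation.
# The requested thread count lives in a mutable Ref instead of a
# compile-time preference, so changing it needs no recompilation.
const N_THREADS = Ref(1)

function set_n_threads!(n::Integer)
    n >= 1 || throw(ArgumentError("thread count must be at least 1, got $n"))
    N_THREADS[] = n
    return n
end

# Resolve the setting when it is needed and clamp it to what the current
# session provides, rather than erroring when fewer threads are available.
effective_threads(; available::Integer=Threads.nthreads()) =
    min(N_THREADS[], available)
```

Under this design, a request for 2 threads in a session with only one thread would silently fall back to 1. Whether a silent fallback or a warning is preferable is a design choice the sketch leaves open.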