Function `assemble_vector` crashes with `LinearizedFESpace` when running on multiple processors
wei3li opened this issue · 3 comments
Hi @amartinhuertas,
The following code crashes when run in parallel (here with 2 processors):
using Gridap, GridapDistributed, GridapP4est, PartitionedArrays

with_mpi() do distribute
  order = 2
  parts = (1, 2)
  ranks = distribute(LinearIndices((prod(parts),)))
  coarse_model = CartesianDiscreteModel((0, 1, 0, 1), (1, 1))
  model = OctreeDistributedDiscreteModel(ranks, coarse_model, 3)
  reffe = ReferenceFE(lagrangian, Float64, order)
  degree = 2 * order + 1
  space_args = Dict(:conformity => :H1, :dirichlet_tags => "boundary")
  # U is built on the original octree model ...
  U = TrialFESpace(FESpace(model, reffe; space_args...), 1.0)
  # ... while V comes with its own linearized model
  V, lin_model = Gridap.LinearizedFESpace(model, reffe; space_args...)
  Ω = Triangulation(lin_model)
  dΩ = Measure(Ω, (degree + order - 1) ÷ order)
  a(u, v) = ∫(∇(v) ⋅ ∇(u))dΩ
  # Assembling contributions over Ω against U triggers the assertion error below
  assemble_vector(a(get_fe_basis(U), zero(V)), U)
end
The error trace is:
┌ Error:
│ exception =
│ AssertionError: length(cell_vals) == length(cell_rows)
│ Stacktrace:
│ [1] _numeric_loop_vector!(...)
│ @ Gridap.FESpaces ~/Programs/.julia/packages/Gridap/pKtni/src/FESpaces/SparseMatrixAssemblers.jl:282
│ [2] numeric_loop_vector!(...)
│ @ Gridap.FESpaces ~/Programs/.julia/packages/Gridap/pKtni/src/FESpaces/SparseMatrixAssemblers.jl:274
│ [3] map(...)
│ @ PartitionedArrays ~/Programs/.julia/packages/PartitionedArrays/py6uo/src/mpi_array.jl:237
│ [4] numeric_loop_vector!(...)
│ @ GridapDistributed ~/Programs/.julia/packages/GridapDistributed/a9WXb/src/FESpaces.jl:609
│ [5] assemble_vector(...)
│ @ Gridap.FESpaces ~/Programs/.julia/packages/Gridap/pKtni/src/FESpaces/SparseMatrixAssemblers.jl:43
│ [6] assemble_vector(...)
│ @ Gridap.FESpaces ~/Programs/.julia/packages/Gridap/pKtni/src/FESpaces/Assemblers.jl:325
│ [7] assemble_vector(...)
│ @ Gridap.FESpaces ~/Programs/.julia/packages/Gridap/pKtni/src/FESpaces/Assemblers.jl:352
│ [8] (::var"#1#2")(distribute::PartitionedArrays.var"#88#89"{MPI.Comm, Bool})
│ @ Main ~/Documents/mpi-examples/assemble_vector_issue.jl:22
│ [9] with_mpi(f::var"#1#2"; comm::MPI.Comm, duplicate_comm::Bool)
│ @ PartitionedArrays ~/Programs/.julia/packages/PartitionedArrays/py6uo/src/mpi_array.jl:73
│ [10] with_mpi(f::Function)
│ @ PartitionedArrays ~/Programs/.julia/packages/PartitionedArrays/py6uo/src/mpi_array.jl:65
│ [11] top-level scope
│ @ ~/Documents/mpi-examples/assemble_vector_issue.jl:4
│ [12] include(mod::Module, _path::String)
│ @ Base ./Base.jl:418
│ [13] exec_options(opts::Base.JLOptions)
│ @ Base ./client.jl:292
│ [14] _start()
│ @ Base ./client.jl:495
└ @ PartitionedArrays ~/Programs/.julia/packages/PartitionedArrays/py6uo/src/mpi_array.jl:75
The numbers of cells in the linearized model and in the original model seem to differ. I think you can either
- only work with V, i.e. use the fe_basis from V instead of U (see the sketch below), or
- define U using lin_model, so that it is defined on the same triangulation as V.
Alternatively, you would need to define a glue between the two triangulations (which, I believe, is not automatic right now) that allows you to move from one to the other.
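For reference, a minimal sketch of the first option, reusing the names V, dΩ and a from the reproducer above (an illustration only, not a tested fix); the cell contributions and the assembly space then both live on the linearized model:

# Sketch of the first option: take the test basis from V and assemble against V
# (V, dΩ and a as defined in the reproducer above).
dv = get_fe_basis(V)   # test basis of the linearized space
uh = zero(V)           # FE function on the same space
b  = assemble_vector(a(dv, uh), V)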
I can reproduce the error in my local environment. I will take a look and get back to you.
@JordiManyer ... this is a very recent development (LinearizedFESpace on top of non-conforming meshes + parallel).
The error has to do with a mismatch between the local arrays that include ghosts and the local arrays that do not.
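To make the mismatch concrete, here is a hypothetical diagnostic (my own sketch, assuming the names Ω and U from the reproducer and that local_views applies to both; it is not part of the fix):

# Hypothetical diagnostic, not part of the fix: on each part, compare the number
# of cells of the local linearized triangulation (which produces the cell values)
# with the number of cell dof-id blocks of the local view of U (which produces
# the cell rows the assembler pairs them with).
map(local_views(Ω), local_views(U)) do trian, space
  @info "cells in Ω = $(num_cells(trian)), cell dof-id blocks in U = $(length(get_cell_dof_ids(space)))"
end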
Solved in gridap/Gridap.jl@fcf3301