omlins/ParallelStencil.jl

`@hide_communication` error when inside a module and using `CUDA.jl` backend

Closed this issue · 5 comments

Currently, the @hide_communication macro does not work when using it within a module, e.g. in the context of a package that depends on ParallelStencil (as done for example in AcousticWaveCPML.jl).

Error description

The following error pops up during precompilation of the package that depends on ParallelStencil:

...
ERROR: LoadError: Evaluation into the closed module `ParallelKernel` breaks incremental compilation because the side effects will not be permanent. This is likely due to some other module mutating `ParallelKernel` with `eval` during precompilation - don't do this.
Stacktrace:
  [1] eval
    @ ./boot.jl:368 [inlined]
  [2] eval
    @ ~/.julia/packages/ParallelStencil/3flwf/src/ParallelKernel/ParallelKernel.jl:37 [inlined]
  [3] extract_args(call::Expr, macroname::Symbol)
    @ ParallelStencil.ParallelKernel ~/.julia/packages/ParallelStencil/3flwf/src/ParallelKernel/shared.jl:100
  [4] hide_communication_cuda(boundary_width::Symbol, block::Expr)
    @ ParallelStencil.ParallelKernel ~/.julia/packages/ParallelStencil/3flwf/src/ParallelKernel/hide_communication.jl:118
  [5] hide_communication(::Symbol, ::Vararg{Union{Expr, Integer, Symbol}}; package::Symbol)
    @ ParallelStencil.ParallelKernel ~/.julia/packages/ParallelStencil/3flwf/src/ParallelKernel/hide_communication.jl:82
  [6] hide_communication(::Symbol, ::Vararg{Union{Expr, Integer, Symbol}})
    @ ParallelStencil.ParallelKernel ~/.julia/packages/ParallelStencil/3flwf/src/ParallelKernel/hide_communication.jl:81
  [7] var"@hide_communication"(__source__::LineNumberNode, __module__::Module, args::Vararg{Any})
    @ ParallelStencil.ParallelKernel ~/.julia/packages/ParallelStencil/3flwf/src/ParallelKernel/hide_communication.jl:68
...

This error only shows up when using @hide_communication inside a module where ParallelStencil was initialized with the CUDA.jl backend.

The cause of this error is the eval call inside the extract_args function that gets called inside the hide_communication_cuda function, currently implemented as:

extract_args(call::Expr, macroname::Symbol) = eval(substitute(deepcopy(call), macroname, Symbol("@get_args")))

This function is only called in the context of hide communication and the macroname is always Symbol("@parallel").

What this function does is replace the @parallel macro of the kernel call expression with the @get_args macro which is defined in the same file as extract_args like the following:

macro get_args(args...) return args end

and then calls eval to evaluate the new macro, hence getting a tuple with the arguments of the @parallel kernel call.
This eval call is problematic since it could change the ParallelStencil.ParallelKernel module from a third-party module while precompilation is occurring. Of course, no actual harm is done here, but Julia is (apparently?) not smart enough to notice this.

How to reproduce the error

Either:

  • clone AcousticWaveCPML.jl. Edit a multi-xPU shared solver (e.g. src/shared/acoustic_2D_multixPU.jl) and uncomment the @hide_communication call (e.g. line 160 and 172 in src/shared/acoustic_2D_multixPU.jl). Try to precompile the package and you should get the error.
  • write a package with a module that uses ParallelStencil inside of it and initialize ParallelStencil with the CUDA.jl backend. Write a @parallel kernel function and use @hide_communication to call that kernel function from within the module. Try to precompile the package and you should get the error.

Possible fixes

A possible fix is to call eval specifying the caller module to be the module where the @hide_communication macro is originally in. This does however require for the @get_args macro to be now accessible in the caller module, which is not in this case because it resides in the ParallelStencil.ParallelKernel module.

Another fix is to just replace the extract_args function with the following:

extract_args(call::Expr) = tuple(expr.args[3:end]...)

since we know that the only possible expression in expr will be a @parallel kernel call, we can extract the arguments of the call just like that without having to rely on eval. Maybe it would be nice to extract these arguments somewhere else (e.g. in the extract_calls function) where it is explicitly checked that expr is a @parallel kernel call.

This latter fix has been tested on both Base.Threads and CUDA.jl backends unit tests currently available in the package.

@GiackAloZ Thanks a lot for the issue and the detailed description! I'll try to fix it soon in a development branch. Is it urgent for you to have a development branch where it is fixed or do you anyway already have that in a fork of yours?

@omlins I am not in a hurry for a fix, but it would be great to have it in a development branch since I currently do not have a fork of mine.

Even better would be having a patch 0.6.1 where this is fixed, but I am probably asking for too much ;)

A patch would be good.

@GiackAloZ This is merged into main. You can install it as ] add ParallelStencil#main.
I can also do a patch for you - it will still take a while until main is released ( many new fundamental features will be included).

@GiackAloZ : patch released...