/ThreadsX.jl

Parallelized Base functions

Primary LanguageJuliaMIT LicenseMIT

Threads⨉: Parallelized Base functions

Dev GitHub Actions Aqua QA

tl;dr

Add prefix ThreadsX. to functions from Base to get some speedup, if supported. Example:

using ThreadsX
ThreadsX.sum(sin, 1:10_000)

To find out functions supported by ThreadsX.jl, just type ThreadsX. + TAB in the REPL:

julia> using ThreadsX

julia> ThreadsX.
MergeSort       any             findlast        mapreduce       sort
QuickSort       count           foreach         maximum         sort!
Set             extrema         issorted        minimum         sum
StableQuickSort findall         map             prod            unique
all             findfirst       map!            reduce

API

ThreadsX.jl is aiming at providing API compatible with Base functions to easily parallelize Julia programs.

All functions that exist directly under ThreadsX namespace are public API and they implement a subset of API provided by Base. Everything inside ThreadsX.Implementations is implementation detail. The public API functions of ThreadsX expect that the data structure and function(s) passed as argument are "thread-friendly" in the sense that operating on distinct elements in the given container from multiple tasks in parallel is safe. For example, ThreadsX.sum(f, array) assumes that executing f(::eltype(array)) and accessing elements as in array[i] from multiple threads is safe. In particular, this is the case if array is a Vector of immutable objects and f is a pure function in the sense it does not mutate any global objects. Note that it is not required and not recommended to use "thread-safe" array that protects accessing array[i] by a lock.

In addition to the Base API, all functions accept keyword argument basesize::Integer to configure the number of elements processed by each thread. A large value is useful for minimizing the overhead of using multiple threads. A small value is useful for load balancing when the time to process single item varies a lot from item to item. The default value of basesize for each function is currently an implementation detail.

ThreadsX.jl API is deterministic in the sense that the same input produces the same output, independent of how julia's task scheduler decide to execute the tasks. However, note that basesize is a part of the input which may be set based on Threads.nthreads(). To make the result of the computation independent of Threads.nthreads() value, basesize must be specified explicitly.

Limitations

  • Keyword argument dims is not supported yet.
  • (There are probably more.)

Implementations

Most of reduce-based functions are implemented as thin wrappers of Transducers.jl.

Custom collections can support ThreadsX.jl API by implementing SplittablesBase.jl interface.