JuliaIntervals/IntervalRootFinding.jl

Multithreading causes random errors

Shmuley95 opened this issue · 3 comments

The minimum working example below should always return 1. When using one thread, everything is fine, but as soon as I run the same code with multiple threads, random errors start creeping up.

using IntervalRootFinding
let
	results = []
	Threads.@threads for iter in 1:1000	
		f(x) = (x^-1)-1
		result = roots(f, 0.0001..10000)
		push!(results, result)
	end
	length(unique(results))
end

I hope you could help identify the issue or find a possible workaround. Thank you for your time!

Kolaru commented

Could you give more detail on how you run this code ? I have tried it with the last release and on master, and I don't get any error.

Those are my specs

julia> versioninfo()
Julia Version 1.8.3
Commit 0434deb161e (2022-11-14 20:14 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Xeon(R) CPU E5-1620 v2 @ 3.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, ivybridge)
  Threads: 4 on 8 virtual cores
Environment:
  LD_LIBRARY_PATH = :/home/benoit/lib
  JULIA_EDITOR = code

You will not get an error. Instead, you will get a random integer (on my windows pc with 6 cores around 5-7, on a friend's Macbook with 4 cores usually somewhat lower), which changes on consecutive runs. But the code should always return 1 deterministically, for every element in results should be equal to 1.

Kolaru commented

Oh I see. Looking at the "unique" results with full precision, I get :

julia> using IntervalArithmetic

julia> @format full
Display parameters:
- format: full
- decorations: false
- significant figures: 6

julia> rts = unique(results)
4-element Vector{Any}:
 Root{Interval{Float64}}[Root(Interval(0.9999999999999994, 1.0000000000000007), :unique)]
 Root{Interval{Float64}}[Root(Interval(0.9999999999999997, 1.0000000000000007), :unique)]
 Root{Interval{Float64}}[Root(Interval(0.9999999999999996, 1.0000000000000007), :unique)]
 Root{Interval{Float64}}[Root(Interval(0.9999999999999994, 1.0000000000000004), :unique)]

So it looks like by running on different processors we got slightly different numerical results (seems to be a 1 or 2 bits difference). It doesn't strike me as necessary a bug, since they are all correct results.

I am not an expert in those details, however.