tro3/ThreadPools.jl

Different behaviour when spawning to thread 1 or other threads

Closed this issue · 13 comments

Hi, I'm not sure if this is expected but this is the behaviour I'm seeing:

julia> ThreadPools.@tspawnat 1 twith(ThreadPools.StaticPool(3:8)) do pool
           @tthreads pool for i in 1:8
               sleep(0.1);println(i, Threads.threadid())
           end
       end
Task (runnable) @0x0000000132cab850
78
13
35
46
24
57
88
67
julia> ThreadPools.@tspawnat 2 twith(ThreadPools.StaticPool(3:8)) do pool
           @tthreads pool for i in 1:8
               sleep(0.1);println(i, Threads.threadid())
           end
       end
Task (runnable) @0x0000000132e96410
12
22
32
42
52
62
72
82

When spawning to thread 1, all of threads 3:8 can be used by the pool, but when spawning to thread 2, only thread 2 is used. Ideally I'd like to spawn to thread 2 and then use threads 3:8.

tro3 commented

Isn't that interesting? Let me take a look. I do recall seeing some code that could explain this, but I don't recall if it was mine or in Julia.

That being said, can I ask why you want the spawning itself to be done on a background thread? The spawning process is fast, so the primary doesn't really get loaded down. If I understand your use case, I may be able to tailor things a little better.

Yep, of course. I'm working on a microservice and keeping the primary thread free for HTTP comms. I've been doing this with something similar to WorkerUtilities, spawning long-running tasks to background threads. Now, in one of those background threads, I want to speed up a section of the code using @tthreads. I (think I) need @tthreads because I want to use all the other threads while keeping free both the primary thread and the threads that WorkerUtilities sends tasks to, since I think WorkerUtilities.init clashes with @threads.

There may be a better way of doing this, thanks for your help!

tro3 commented

Okay, this looks like an issue with my scheduler. I'll see if I can fix this up.

tro3 commented

Scratch that - scheduler is fine. I think it is my internal use of @threads in the implementation, which is where I think I saw the code that stirred the memory. I'll get this fixed up.

Interested to see what the fix is, I had a look but it was over my head!

tro3 commented

Okay, v1.2.1 is working its way through registration.

In other news, though, I'm thinking of modifying the lower-level API to make what you are doing a little easier. In particular, I think that having twith outside the macro just to handle context is kludgy. It seems to me that the dominant use case for more complex pooling would still be to shut down the pool once the macro has finished. So if we assume that all macros are their own context, we could move twith inside and rewrite the above as:

@tspawnat 2 begin
    @tthreads StaticPool(3) for i in 1:8
        sleep(0.1);  println(i, Threads.threadid())
    end
end

which is way cleaner. (Note that StaticPool(3) already translates to "threads 3 and up".) What do you think?

tro3 commented

The issue you found, BTW, was indeed a quirk of the Threads.@threads scheduler, which assumes it is being launched from the primary, and I had used it in the implementation. The fix was to not use it. :-)

tro3 commented

@tkf - would be interested in your thoughts on the above API change proposal, as well.

Amazing, thanks for sorting that, just tested and it's working well.

I think your suggested change to not need twith is good, at least for my use case. Is there any advantage to keeping a pool open for multiple uses of @tthreads? I don't know if anyone else has a use case for keeping the pool alive, but it seems pretty quick to create and close:

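using ThreadPools, BenchmarkTools

# f1: open the pool once and reuse it for all 100 @tthreads loops
function f1()
    twith(ThreadPools.StaticPool(3)) do pool
        for _ in 1:100
            @tthreads pool for i in 1:8
                1+1
            end
        end
    end
end

# f2: open and close a fresh pool around every @tthreads loop
function f2()
    for _ in 1:100
        twith(ThreadPools.StaticPool(3)) do pool
            @tthreads pool for i in 1:8
                1+1
            end
        end
    end
end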
julia> @btime f1()
  998.224 μs (10501 allocations: 1.18 MiB)
julia> @btime f2()
  986.835 μs (10600 allocations: 1.19 MiB)

> So if we assume that all macros are their own context

Please could you explain what you mean by that? Is it already assumed as @tthreads uses tforeach?

Also I wonder why Threads.@threads only farms out to the threads if called from the primary thread, it seems like intended design so there must be a reason for it!

tro3 commented

> Please could you explain what you mean by that? Is it already assumed as @tthreads uses tforeach?

By context, I mean the opening and closing of a resource. twith's only purpose, really, is to close the pool once the function is done. I agree with your assessment - opening and closing the pools does not involve much overhead, so forcing the user to establish the pool context around each macro call is not worth it. I'll move twith inside the macro for 1.3.
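To make the "context" idea concrete, here is a minimal sketch of the open/run/close pattern that a twith-style function follows. It is not the ThreadPools.jl implementation: `DemoPool` and `with_pool` are hypothetical names invented for illustration, and the "pool" is just a flag so that the example runs without any packages.

```julia
# Hypothetical stand-in for a pool; NOT the ThreadPools.jl type.
mutable struct DemoPool
    open::Bool
end

# Run `fn` with the pool, then close the pool even if `fn` throws.
# The function argument comes first so that do-block syntax works.
function with_pool(fn, pool::DemoPool)
    try
        return fn(pool)
    finally
        pool.open = false   # the "close" step of the context
    end
end

pool = DemoPool(true)
result = with_pool(pool) do p
    p.open ? "pool was open inside the block" : "pool was closed inside the block"
end

println(result)
println("pool open afterwards? ", pool.open)
```

Moving this wrapper inside the macro would mean the user only writes the loop, and the open and close steps happen around each macro call.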

> Also I wonder why Threads.@threads only farms out to the threads if called from the primary thread, it seems like intended design so there must be a reason for it!

Yep. The @threads scheduler from earlier versions of Julia implicitly assumed the scheduling thread was the primary. Looking at more recent versions, it was formalized and made explicit. No idea why.
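For anyone who wants to check how their own Julia version behaves, a small sketch like the following records which threads an @threads loop runs on, both when called directly and when called from inside a spawned task. `threads_used` is a helper name made up for this example, and the result depends on the Julia version and on how many threads Julia was started with, so no particular output is claimed here.

```julia
using Base.Threads

# Collect the ids of the threads that an @threads loop actually ran on.
# (`threads_used` is a hypothetical helper, not part of any library.)
function threads_used(n)
    ids = zeros(Int, n)
    @threads for i in 1:n
        ids[i] = threadid()   # each iteration writes to its own slot
    end
    return sort(unique(ids))
end

# Called directly from the current (usually primary) thread.
from_caller = threads_used(8)

# Called from inside a task that the scheduler may place on another thread.
from_task = fetch(Threads.@spawn threads_used(8))

println("called directly:            ", from_caller)
println("called from a spawned task: ", from_task)
```

If the second list collapses to a single thread id while the first does not, that would match the behaviour described in this issue.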

> By context, I mean the opening and closing of a resource.

Ah I'm with you, that makes sense yep.

> No idea why

I've asked on Slack out of curiosity.

From Slack:

> if the function you're calling implicitly does threading, and a user calls that function in a thread-ed for loop, you don't want to over spawn

> At a guess: @threads is designed for highly balanced workloads, whereas @spawn is not. So if you are using @threads on an outer layer, then you already know the inner layer is balanced.