Jeffail/tunny

Change example to set GOMAXPROCS to numCPUs+1

Opened this issue · 5 comments

You should really have an OS thread spare to handle non-busy work, like handling the HTTP requests.

Otherwise you are relying on Go's 'preemption' to give the HTTP handler a chance, which may or may not happen in a timely manner depending on the nature of your "heavy work".

Hey @fcntl, if you set GOMAXPROCS above numCPUs and run them all hot you'll get thread contention. I think depending on your tasks you are probably better off keeping GOMAXPROCS equal to or less than the number of logical cores and letting the runtime schedule those extra bits of work. I haven't actually tested or verified this though so I'm happy to be shown otherwise.

@Jeffail you are already creating a pool for heavy work with number of goroutines matching number of CPUs:

tunny.CreatePool(numCPUs, ...)

Which means when those goroutines start performing heavy work (the whole point of the pool in the first place), there will be no other OS threads to schedule the goroutine for the HTTP handler.

if you set GOMAXPROCS above numCPUs and run them all hot you'll get thread contention

But the entire point of your library is to control the number of threads running hot by using a thread pool. That is what limits the number of hot threads, not GOMAXPROCS. A goroutine for handling HTTP requests is not going to be a hot thread. A thread which is sleeping on a socket is not going to cause much contention.

I think depending on your tasks you are probably better off keeping GOMAXPROCS equal to or less than the number of logical cores and letting the runtime schedule those extra bits of work.

Which may or may not happen, depending on whether your busy threads yield to the runtime in a timely manner. If you have one extra OS thread for the goroutine handling the HTTP requests, it will be scheduled in time by the OS. Would you rather have a busy thread take a little longer to run, or start dropping HTTP requests because you can't respond to them? Setting GOMAXPROCS to less than the number of CPUs (or to one) only makes sense for an app that does purely IO-bound work, whereas your library is explicitly intended for pooling busy work.

I haven't actually tested or verified this though so I'm happy to be shown otherwise.

It's really easy: just write a program that does really busy, non-yielding work in the pool and see if you can still send HTTP requests after the pool becomes busy.

E.g. this program produces exactly the kind of behavior I am talking about:

package main

import (
    "fmt"
    "io/ioutil"
    "net/http"
    "runtime"

    "github.com/jeffail/tunny"
)

func main() {
    numCPUs := runtime.NumCPU()
    runtime.GOMAXPROCS(numCPUs)

    pool, _ := tunny.CreatePool(numCPUs, func(object interface{}) interface{} {
        input, _ := object.([]byte)

        // Do something that takes a lot of work
        var x, y, z int

        for x = 1; x < 1e6; x++ {
            for y = 1; y < 1e6; y++ {
                z = z + y/x
            }
        }

        return fmt.Sprintf("%s: %d", input, z)
    }).Open()

    defer pool.Close()

    http.HandleFunc("/work", func(w http.ResponseWriter, r *http.Request) {
        input, err := ioutil.ReadAll(r.Body)
        if err != nil {
            http.Error(w, "Internal error", http.StatusInternalServerError)
            return
        }

        // Send work to our pool
        fmt.Printf("Got a message, sending to pool: %s\n", input)
        result, _ := pool.SendWork(input)

        // The worker returns a string, so convert before writing
        w.Write([]byte(result.(string)))
    })

    http.ListenAndServe(":8080", nil)
}

On a 2-core machine, the output of sending a number of HTTP requests is quite different if you don't have a spare OS thread for the HTTP handling:

GOMAXPROCS=numCPU+1
$ ./worktest
Got a message, sending to pool: hello
Got a message, sending to pool: hello
Got a message, sending to pool: hello
Got a message, sending to pool: hello
Got a message, sending to pool: hello
Got a message, sending to pool: hello
Got a message, sending to pool: hello
Got a message, sending to pool: hello
Got a message, sending to pool: hello
Got a message, sending to pool: hello
^C

GOMAXPROCS=numCPU
$ ./worktest
Got a message, sending to pool: hello
Got a message, sending to pool: hello
^C

Ten curl requests were sent to each run using:
curl -d 'hello' localhost:8080/work &

Note that with GOMAXPROCS=numCPU only 2 HTTP requests are received - that's because the goroutine handling requests never gets any time. Make another OS thread available to it and it works just fine.

I think your library is really good because it means you don't have to rely on a GOMAXPROCS setting to control the amount of busy work you are doing, and you can have one or more OS threads dedicated to IO bound work instead.

Hey @fcntl, thanks for the example. I agree with your points. We're effectively comparing OS level thread scheduling versus Go runtime green thread scheduling, from your demo it looks like Go is more likely to starve goroutines in these cases. I'll update the example as you're right that it makes sense in this case. Thanks for the quick response, sorry that mine have been more spaced out :P

Yes, Go only reschedules goroutines on blocking IO and sometimes at function calls. Limiting GOMAXPROCS makes sense when you don't have a means of controlling busy goroutines and would otherwise have hundreds of busy threads fighting each other.

I was forced to learn this in great detail the hard way after porting a Python/Gevent app to Golang and trying to figure out how to make use of more CPU cores without creating contention or starving IO threads :P

@fcntl hey there! I was reading this issue and saw you made some good points. Would like to hear your thoughts on what I should set GOMAXPROCS and number of workers to for IO bound jobs (accessing a remote mysql database in my case). I made an issue here for it that you can reply to: #16 thanks!