Split an array/slice into n evenly sized chunks.
Inspired by Paul Di Gian's blog post: Split a slice or array in a defined number of chunks in golang
Requires Go 1.18 or later.
Add github.com/veggiemonk/batch to your go.mod file, then run the following command:
go mod tidy
Note: you might be better off just copying the function into your codebase. It is less than 10 lines of code.
See Go Proverbs for more details.
A little copying is better than a little dependency.
package main
import (
"fmt"
"github.com/veggiemonk/batch"
)
func main() {
s := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
// Split the slice into 3 even parts
chunks := batch.BatchSlice(s, 3)
// Print the chunks
fmt.Println(chunks)
// lengths: 3 | 3 | 4
// output: [[1 2 3] [4 5 6] [7 8 9 10]]
// batch sizes differ by at most 1 item,
// which spreads the load evenly amongst workers
}
batchID = uuid.New().String()
taskCount, _ = strconv.Atoi(os.Getenv("CLOUD_RUN_TASK_COUNT"))
taskIndex, _ = strconv.Atoi(os.Getenv("CLOUD_RUN_TASK_INDEX"))
tt, err := requestToTasks(request)
if err != nil {
return fmt.Errorf("failed to get list of tasks (id:%s): %w", batchID, err)
}
if len(tt) == 0 {
return fmt.Errorf("no tasks found (id:%s): %w", batchID, ErrNoTaskFound)
}
batches := batch.BatchSlice(tt, taskCount)
if taskIndex >= len(batches) || taskIndex < 0 {
return fmt.Errorf("index (%d) out of bounds (max: %d), (id:%s): %w", taskIndex, len(batches), batchID, ErrTaskIndexOutOfBounds)
}
b := batches[taskIndex]
err = process(b)
if err != nil {
return fmt.Errorf("failed to process batch (id:%s): %w", batchID, err)
}
Having evenly sized batches is useful when you want to distribute the workload evenly across multiple workers.
Instead of defining the size of each batch, we define the number of batches we want.
Here is a counterexample that uses a fixed chunk size:
package main
import "fmt"
func main() {
array := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
chunkSize := 4 // ceil(10 / 3): the fixed size needed to end up with 3 chunks
var result [][]int
for i := 0; i < len(array); i += chunkSize {
end := i + chunkSize
if end > len(array) {
end = len(array)
}
result = append(result, array[i:end])
}
fmt.Println(result)
// lengths: 4 | 4 | 2
// output: [[1 2 3 4] [5 6 7 8] [9 10]]
// the first 2 workers do double the work of the last worker.
}
This is not ideal when you want to distribute the workload evenly across multiple workers.