genaray/ZeroAllocJobScheduler

Undocumented value "amount" in Scheduler.Schedule

thygrrr opened this issue · 3 comments

None of the APIs document what "amount" does in the Scheduler.Schedule method. The internal methods give it a default value of 0, but 0 is not an allowed value in the user-facing API.

I would have interpreted it as the number of total work items that will be split into batches.

IJobParallelFor.Execute(int index) would then be the batch index.

However, this doesn't seem to be the case either. I'm getting astronomical (integer overflow) index*BatchSize values trying to chop a Memory<T> of ~0.25 million contiguous Ts into chunks of 16384 and execute them in parallel.

It also doesn't seem to be the case that amount is the number of batches. (Why do we need to specify a batch size then, anyway? There's quite a bit of logic inside the source code that seems to find a middle ground between the thread limits and chopping up the parallel work, but it doesn't seem to be working as intended.)
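To make the overflow concrete, here's the arithmetic in isolation (standalone numbers matching my setup, not the scheduler itself): if Execute(index) is actually invoked once per item rather than once per batch, the start-offset math wraps Int32 partway through the range.

```csharp
const int BatchSize = 16384;

// If Execute(index) runs once per item rather than once per batch,
// the batch-offset math overflows Int32 well before the end:
var index = 200_000;             // a plausible per-item index in a ~0.25M workload
var start = index * BatchSize;   // 3_276_800_000 wraps to -1_018_167_296

Console.WriteLine(start);        // the "astronomical" values I'm seeing
```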

Here's my use case:

Memory is the appropriate size, spanning precisely the entire workload.
It does not (and might never) divide perfectly by batch size; it could even be a prime number depending on application state.

    private class Parallel<C1> : IJobParallelFor
    {
        public RefAction_C<C1> Action;
        public JobHandle Handle;
        public Memory<C1> Memory { get; set; }

        public void Execute(int index)
        {
            var start = index * BatchSize;
            var length = Math.Min(BatchSize, Memory.Length - start);
            var span = Memory.Span.Slice(start, length);

            foreach (ref var c in span) Action(ref c);
        }
        
        public void Finish()
        {
        }

        public int ThreadCount { get; set; }
        public int BatchSize { get; set; }
    }

What is the amount I need to specify when scheduling the job?
job.Handle = Scheduler.Schedule(job, table.Count);

table.Count is 0.25 million (I want to schedule 4 jobs that each batch 0.25 million System.Numerics.Vector3s). The array is jagged, so it's not easily possible to batch them all together. job.BatchSize is 16384 (but any batch size causes issues, and the Scheduler always seems to end up waiting on the 4th job to complete with ThreadCount = 4).

If I try this, it just hangs on the last job for a long time and eventually crashes.
The job is a simple per-element cross product.

    public void JobParallel(RefAction_C<C> action, int chunkSize = int.MaxValue)
    {
        Archetypes.Lock();
        
        for (var i = 0; i < Tables.Count; i++)
        {
            var table = Tables[i];
            if (table.IsEmpty) continue; //tables currently never empty.

            var storage = table.GetStorage<C>(Identity.None);

            var job = (Parallel<C>) Jobs[i];
            job.Action = action;
            job.Memory = storage.AsMemory(0, table.Count);
            job.BatchSize = chunkSize;
            job.Handle = Scheduler.Schedule(job, table.Count / chunkSize + 1);
        }
        
        Scheduler.Flush();

        for (var i = 0; i < Tables.Count; i++) 
        {
            var job = (Parallel<C>) Jobs[i]; //ugh we really need that stackalloc ...
            Console.WriteLine($"Waiting for job {i} to complete.");
            job.Handle.Complete();
        }
        
        Archetypes.Unlock();
    }

Thanks! Some of the features might be reverted soon, especially the IJobParallelFor ones, since the work stealing is still quite buggy. However, @LilithSilver might be able to say a bit more about that topic.

As genaray mentioned, IJobParallelFor is unfortunately getting nuked in #29, since there are performance issues.

However I would like to add it back once I get some time to resolve those issues! At that point I'll make sure to properly document amount, and add some examples.

For this use case, Execute(i) is simply called on every index. amount is the number of indices, not the number of batches. You can actually ignore batch size entirely in your code. As written, your code is doing 0.25 million * 16384 operations.
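Put differently, under these per-index semantics each Execute call should touch exactly one element, with no start/length arithmetic at all. A minimal sketch, reusing the fields from your Parallel<C1> class above:

```csharp
public void Execute(int index)
{
    // index identifies a single element, not a batch, so there is
    // no offset math to overflow here.
    Action(ref Memory.Span[index]);
}
```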

Batch size is just a measure of how "complicated" your per-index work is. If you're running a ton of work on just a few indices, a small batch size might make sense; if you're barely doing anything per index, a very large batch size might make sense. That said, your batch size is unreasonably large: the algorithm won't be able to work-steal at all, because there's only ever one batch available per thread.

Again, though, IJobParallelFor is getting nuked. For now, if you want to run parallel jobs, I'd recommend spawning off Environment.ProcessorCount jobs manually and assigning them each an equal chunk of work.
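A rough sketch of that workaround, based on the code in this thread: Chunk<T> is a hypothetical helper, and the Scheduler.Schedule(IJob) / Flush() / Complete() calls are assumed to behave as in your snippets above, not taken from documented API.

```csharp
// Hypothetical plain IJob that processes one contiguous slice.
private class Chunk<T> : IJob
{
    public RefAction_C<T> Action;
    public Memory<T> Memory;

    public void Execute()
    {
        // Each job owns its own slice, so no batch or index math is needed.
        foreach (ref var item in Memory.Span) Action(ref item);
    }
}

// Usage: one job per logical processor, each with an equal chunk.
var workers = Environment.ProcessorCount;
var chunkSize = (table.Count + workers - 1) / workers; // ceiling division
var handles = new List<JobHandle>(workers);

for (var start = 0; start < table.Count; start += chunkSize)
{
    var length = Math.Min(chunkSize, table.Count - start);
    handles.Add(Scheduler.Schedule(new Chunk<C>
    {
        Action = action,
        Memory = storage.AsMemory(start, length),
    }));
}

Scheduler.Flush();
foreach (var handle in handles) handle.Complete();
```

To keep this allocation-free in steady state you'd pool the Chunk<C> instances the same way your Jobs[i] array already does, rather than newing them per call.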

Hopefully that makes sense. Let me know if you have any questions!