dmlc/parameter_server

Limit on the minibatch config in the async_sgd app?

Closed this issue · 2 comments

Hi,

I'm running async_sgd. I find that when I set the minibatch config to the size of the whole dataset, the program hangs. The worker log says "the scheduler is died, killing myself". Is there any limit on the minibatch size?

Thanks,
Cui

mli commented

I haven't tested a huge minibatch size before. async_sgd is not designed to run gradient descent, since the latter should use a different strategy to read data and pin it in memory if necessary.
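
To illustrate why a minibatch the size of the whole dataset amounts to gradient descent, here is a minimal Python sketch. It is not the async_sgd code: the function names, the toy dataset, and the use of logistic loss are all assumptions made for illustration, and it runs in a single process with no parameter server involved.

```python
import math

def logistic_gradient(w, batch):
    """Average logistic-loss gradient over (features, label) pairs, label in {0, 1}."""
    grad = [0.0] * len(w)
    for x, y in batch:
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        for j, xj in enumerate(x):
            grad[j] += (p - y) * xj
    return [g / len(batch) for g in grad]

def minibatch_sgd_epoch(w, data, batch_size, lr):
    """One pass over the data; returns (new weights, number of weight updates)."""
    updates = 0
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        grad = logistic_gradient(w, batch)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
        updates += 1
    return w, updates

def gradient_descent_step(w, data, lr):
    """A single full-batch gradient descent step over the entire dataset."""
    grad = logistic_gradient(w, data)
    return [wi - lr * gi for wi, gi in zip(w, grad)]

# Hypothetical toy dataset: (features, label).
data = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1), ([0.5, -1.0], 0)]
w0 = [0.0, 0.0]

w_small, n_small = minibatch_sgd_epoch(w0, data, batch_size=2, lr=0.1)
w_full, n_full = minibatch_sgd_epoch(w0, data, batch_size=len(data), lr=0.1)
w_gd = gradient_descent_step(w0, data, lr=0.1)
```

With `batch_size=len(data)` the epoch performs exactly one update, and that update is the same as a full gradient descent step. Forming that single batch means touching the entire dataset before any update happens, which is presumably why a dedicated data-reading strategy is a better fit than a streaming minibatch reader.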

Are you running a gradient descent benchmark? I can implement a version in a few days.

Yeah, we are trying to run it on the criteo click dataset.

Thanks,
Cui