Postpone loading individual datasets
mrkn opened this issue · 3 comments
I'm concerned that many datasets are loaded in advance, whether or not they are needed.
https://github.com/red-data-tools/red-datasets/blob/master/lib/datasets.rb#L3-L34
What do you think about postponing the loading of individual datasets?
Are you thinking about start-up time?
I agree that a shorter start-up time is better, but I don't want to force users to require each dataset explicitly, such as with require "datasets/iris".
We can use autoload for this case, but I don't like autoload much... For example, autoload doesn't work in Ractor.
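For reference, here is a minimal sketch of what the autoload approach would look like. It uses a hypothetical DemoDatasets module and a temporary file rather than the real red-datasets files, so the names and paths are assumptions for illustration only.

```ruby
require "tmpdir"

# Hypothetical stand-in module; this is NOT the real Datasets module.
module DemoDatasets
end

# Write a throwaway file that defines the constant we want to load lazily.
dir = Dir.mktmpdir
path = File.join(dir, "demo_iris.rb")
File.write(path, <<~RUBY)
  module DemoDatasets
    class Iris
    end
  end
RUBY

# Register the constant; nothing is loaded at this point.
DemoDatasets.autoload(:Iris, path)

pending_before = DemoDatasets.autoload?(:Iris) # the path, while still unloaded
iris_class = DemoDatasets::Iris                # first reference triggers the load
pending_after = DemoDatasets.autoload?(:Iris)  # nil once the file has been loaded
```

Module#autoload? reports the registered path only while the constant is still pending, which makes it a convenient way to confirm that the load was actually deferred.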
Do you have any ideas for how to implement this proposal?
I don't know how to overcome the problem with non-main Ractors. I guess we need a mechanism that lets non-main Ractors ask the main Ractor to load libraries.
I've implemented this.
We can postpone loading by requiring datasets/lazy:
require "datasets/lazy"
# Datasets::Iris isn't loaded yet
Datasets::Iris # Datasets::Iris is loaded now
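One way this kind of on-demand loading can be built without autoload is a const_missing hook backed by a registry of files. The sketch below illustrates that general technique only; LazyDemo, FEATURES, and register are hypothetical names, and this is not necessarily how datasets/lazy is implemented.

```ruby
require "tmpdir"

# Hypothetical module standing in for Datasets; not the red-datasets code.
module LazyDemo
  FEATURES = {} # constant name => file to require on first use

  def self.register(name, feature)
    FEATURES[name] = feature
  end

  # Called by Ruby when an unknown constant is referenced on this module.
  def self.const_missing(name)
    feature = FEATURES[name]
    return super unless feature

    require feature
    # Guard against a file that fails to define the constant,
    # which would otherwise recurse into const_missing forever.
    const_defined?(name, false) ? const_get(name, false) : super
  end
end

# Throwaway file that defines the lazily loaded class.
dir = Dir.mktmpdir
path = File.join(dir, "lazy_demo_iris.rb")
File.write(path, "module LazyDemo\n  class Iris\n  end\nend\n")
LazyDemo.register(:Iris, path)

loaded_before = LazyDemo.const_defined?(:Iris, false) # not loaded yet
iris = LazyDemo::Iris                                 # triggers const_missing
loaded_after = LazyDemo.const_defined?(:Iris, false)  # loaded now
```

With this shape, users still write only one require, and each dataset file is read the first time its constant is referenced.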