Special array response syntax?
bessey opened this issue ยท 12 comments
Hi, firstly great library, love every bit of the implementation! Was fascinating learning how batch-loader worked.
I have what I can only imagine to be a common use-case for BatchLoader; batching N+1 SQL 1:Many calls. All the examples in the docs appear to be Many:1, or at least X:1, and so they are resolving to single items. Let me explain in code, as all that is probably just confusing things.
# Lets say I have a house, and I want all the photos for that house.
# This doesn't work, because it resolves a single photo to a single house
BatchLoader.for(house.id).batch do |house_ids, loader|
Photo.where(owner_id: house_ids).each do |photo|
loader.call(photo.owner_id, photo)
end
end
# This doesn't work, because if a house has no photos, nil is returned instead of [].
# Also for such a common pattern, this seems like a lot of boilerplate
BatchLoader.for(house.id).batch do |house_ids, loader|
Photo.where(owner_id: house_ids)
.group_by(&:owner_id).each do |house_id, photos|
loader.call(house_id, photos)
end
end
# This would be my dream API... Note batch takes a default resolve value
BatchLoader.for(house.id).batch([]) do |house_ids, loader|
Photo.where(owner_id: house_ids).each do |photo|
# Here .add is used instead of .call to let BatchLoader know not to replace item
# but to call
# @item = (@item || @default) << item
loader.add(photo.owner_id, photo)
end
end
# Alternative API... explicit batch_array means loader API can stay the same, no default needed
BatchLoader.for(house.id).batch_array do |house_ids, loader|
Photo.where(owner_id: house_ids).each do |photo|
loader.call(photo.owner_id, photo)
end
end
@bessey your ideas are brilliant! ๐ฏ ๐
Currently I'm doing something like:
BatchLoader.for(house.id).batch do |house_ids, loader|
photos = Photo.where(owner_id: house_ids)
# a hash with an empty array as a default value
photos_by_house_id = photos.each_with_object(Hash.new { |h, k| h[k] = [] }) do |photo, memo|
memo[photo.owner_id] << photo
end
house_ids.each { |houst_id| loader.call(house_id, photos_by_house_id[house_id]) }
end
But that's a lot of code and it can be definitely simplified with BatchLoader!
There are, however, may be the cases when users would like to load not an array but a hash. For example:
loader.call(
house_id1,
{
photo1.id => photo1,
photo2.id => photo2,
photo3.id => photo3
}
)
Do you have any ideas of an API which will allow building custom objects (arrays, hashes, etc.) for M:N associations?
For example:
Photo.where(owner_id: house_ids).each do |photo|
loader.call_with_object(photo.owner_id, []) { |memo| memo << photo }
end
# [
# photo1,
# photo2,
# photo3
# ]
Photo.where(owner_id: house_ids).each do |photo|
loader.call_with_object(photo.owner_id, {}) { |memo| memo[photo.id] = photo }
end
# {
# photo1.id => photo1,
# photo2.id => photo2,
# photo3.id => photo3
# }
However, a lot of people probably need just an array or they can transform it before / later manually if they want.
This would be my dream API... Note batch takes a default resolve value
I like the idea!
# "default_value" can be a keyword argument. BatchLoader#batch already accepts "cache" option
BatchLoader.for(house.id).batch(default_value: []) do |house_ids, loader|
Photo.where(owner_id: house_ids).each do |photo|
# the method can be called "add"
# or "push" โ similarly to Array#push
# or "append" โ explicitly says that the value will be added at the end of the array
# which one would you prefer?
loader.append(photo.owner_id, photo)
end
end
Please let me know what do you think about it ^
Definitely more flexible, but still a little too much boilerplate for me to call it perfect!
I wonder if one could get the best of both worlds (flexibility, LOC at point of use) by moving this config into the BatchLoader class, essentially making this loader definition configurable
# existing implementation + new unused memo arg which gives item's CURRENT value
class BatchLoader
def initial_value
nil
end
def loader
-> (item, value, memo) { __executor_proxy.load(item: item, value: value) }
end
end
class ArrayBatchLoader < BatchLoader
# I'm using a block to work ensure we don't share mutable container across batches
def initial_value
[]
end
def loader
loader -> (item, value, memo) {
new_value = memo.concat([value])
# now that this is public API, would probably want to give it a nicer name than __executor_proxy
__executor_proxy.load(item: item, value: memo_value)
}
end
end
I imagine you could and would predefine ArrayBatchLoader and HashBatchLoader, as this API isn't that great for an end user to wrap their head around.
I accidentally posted my last comment while typing ๐
Updated it by adding a code example with a few suggestions
To clarify, in that example memo
would basically be:
def memo
if value_loaded?(item: item)
loaded_value(item: item)
else
initial_value
end
end
I wonder if one could get the best of both worlds (flexibility, LOC at point of use) by moving this config into the BatchLoader class
๐ got the idea. I personally don't want to force users to define custom BatchLoader classes based on their needs. I think that BatchLoader can provide a clear API which will be enough to use it as it is, at least for now.
I like your initial idea. Just a few suggestions to discuss:
- Add
default_value
as a keyword argument:batch(default_value: [])
- Think about the method name on the loader:
add
,push
orappend
. Internally, it'll require to start using a special class likeBatchLoader::Loader
for aloader
variable instead of lambdas.
Seems like you already took a look at the source code. If we agreed on the API and you'd like to contribute, feel free to open a PR. Even failing tests will be enough to start :)
Happy to contribute once we've agreed on at least some portion of the API. I can't personally say i have a usecase for the Hash syntax so I'm struggling to design an API for that, but maybe we can start with the Array option. What do you think of this?
BatchLoader.for(house.id).batch(default_value: []) do |house_ids, loader|
Photo.where(owner_id: house_ids).each do |photo|
loader.call(photo.owner_id, photo)
end
end
Essentially I'm thinking you just add default_value
and change the implementation of call to something like:
item ||= default_value # default nil
if item.respond_to?(:<<)
item << value
else
item = value
end
This absolutely doesn't handle the use case of Hash though, sorry!
Hey @bessey!
I like your initial idea. I think that using plain arrays is a much more common use case compared to hashes or other structures. People can always do the data transformations manually and just execute loader.call
.
However, it'll be great if we can add a support to make these custom data transformations easier, too. What do you think about this API:
- Load many-to-many associations:
BatchLoader.for(house.id).batch(default_value: []) do |house_ids, loader|
Photo.where(owner_id: house_ids).each do |photo|
loader.call(photo.owner_id) { |value| value << photo }
end
end
# => [photo1, photo2]
- Custom hash object:
BatchLoader.for(house.id).batch(default_value: {}) do |house_ids, loader|
Photo.where(owner_id: house_ids).each do |photo|
loader.call(photo.owner_id) { |value| value[photo.id] = photo }
end
end
# => { photo1.id => photo1, photo2.id => photo2 }
I think it's a quite simple and consistent API but probably doesn't cover all use cases. Fo example, immutable data structures. In this case, we can require returning a result object from the block (similarity to Array#inject
), not sure:
- Loading a single object at once
BatchLoader.for(house.id).batch(default_value: []) do |house_ids, loader|
Photo.where(owner_id: house_ids).each do |photo|
loader.call(photo.owner_id) { photo } # same as loader.call(photo.owner_id, photo)
end
end
# => photo2
- Immutable Hamster vector:
BatchLoader.for(house.id).batch(default_value: Hamster::Vector[]) do |house_ids, loader|
Photo.where(owner_id: house_ids).each do |photo|
loader.call(photo.owner_id) { |vector| vector.add(photo) } # returns a new object w/o mutating it
end
end
# => Hamster::Vector[photo1, photo2]
Please let me know what do you think about it.
I implemented the array parts, but hit a bug i don't understand around the .call {...}
syntax. Need your help there, I'm guessing its something to do with passing the __executor_proxy by value instead of reference as it was in the lambda.
Check out my callable branch, and see there's one failing test for nested batch loaders.
@bessey Looks great! ๐
I think that the test with the nested batch loaders fails because of the line:
raise ArgumentError, "Please pass a value or a block" if value == NULL_VALUE
The value
in this case is a nested BatchLoader
instance. Executing ==
forces loading the value
(not lazily). I think you can remove the check for now.
Additionally, if you want, you can move the logic from the ExecutorCallable#call
method to the existing loader
lambda since we don't introduce new methods and reuse the same call
. It's up to you :)
By the way, nice error handling!