Batch support
asg017 opened this issue · 2 comments
asg017 commented
Currently the rembed() function makes a new HTTP request for every item. For example, for this query:
select rembed('myModel', field)
from my_table;
If my_table has 100,000 rows, then 100,000 sequential HTTP requests would be sent.
This isn't ideal: most of these providers support multiple inputs in a single request, which would help with both rate limits and speed. But finding a good SQL API for this that works well with SQLite can be tricky.
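For example, OpenAI-style embedding endpoints accept an array of inputs per request (with a per-request cap, on the order of a couple thousand inputs for OpenAI), so those 100,000 rows could in principle be embedded in a few dozen requests instead of 100,000.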
A few different options:
Option 1: Table function with JSON array input
with subset as (
  select json_group_array(
    json_object(
      'id', rowid,
      'contents', my_table.field
    )
  ) as value
  from my_table
)
select
  rowid,
  embedding
from subset
join rembed_each('myModel', json(subset.value));
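Since providers typically cap how many inputs a single request can carry, the JSON array would likely need to be built in chunks rather than as one array for the whole table. A minimal sketch, assuming the same hypothetical rembed_each() table function and that it returns id and embedding columns (the chunk size of 100 is an arbitrary assumption):

with batches as (
  select
    json_group_array(
      json_object('id', rowid, 'contents', field)
    ) as value
  from my_table
  group by rowid / 100  -- assumed chunk size: ~100 inputs per HTTP request
)
select
  id,
  embedding
from batches
join rembed_each('myModel', json(batches.value));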
Option 2: input in (...) with serialized rembed_item()
select
  rowid,
  embedding
from rembed('myModel')
where inputs in (select rembed_item(rowid, field) from my_table);
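This would depend on the table-valued rembed() seeing the entire in (...) list at once. SQLite 3.38+ lets a virtual table consume an IN constraint in a single xFilter call via sqlite3_vtab_in(), so one batched HTTP request should be possible here, but the query shape is arguably less obvious than Option 1's explicit JSON array. (The serialization format produced by the hypothetical rembed_item() is left undefined in this sketch.)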
alexpaden commented
I want to see batch support added as I'm trying in-memory SQLite search. Re: my bad, I'm using TypeScript right now.
ajram23 commented
+n to this please!