JuliaComputing/OpenAPI.jl

Add callback function in `storefile()` to show download progress?

Opened this issue · 9 comments

It seems impossible because api_call() have received all bytes in RAM here.

I don't know if I'm thinking right. Is it possible to add a progress callback function?

Can you give me an example of How to use response stream?

# example asynchronous API that streams matching Pet instances into response_stream
findPetsByStatus(
    api::PetApi,
    response_stream::Channel,
    status::Vector{String}) -> (response_stream, http_response)

Can you give me an example of How to use response stream?

# example asynchronous API that streams matching Pet instances into response_stream
findPetsByStatus(
    api::PetApi,
    response_stream::Channel,
    status::Vector{String}) -> (response_stream, http_response)

I know... response object will simply put! into the channel.

chan =  Channel(2)
findPetsByStatus(api,chan;statu)
take!(chan)

It seems impossible because api_call() have received all bytes in RAM here.

I don't know if I'm thinking right. Is it possible to add a progress callback function?

I figured it out, storefile() just save not download response body. And it is suited for small files only, because all file content already save in RAM.

So if we want actual download progression, we need to add file download code in do_request(), use Downloads.download.

And for files downloads in OpenAPI definition:

          content:
            application/pdf:
              schema:
                type: string
                format: binary

The return type will be Vector{UInt8}, maybe we could let users download the file rather than save the response to RAM?

Yes, we need to invoke the underlying Downloads.download differently if we want to download a large file in streaming fashion. The APIs are generated to accept and return instances of structs/types. The streaming API is meant to stream multiple instances into a Channel. We do not have something that can stream raw bytes into an IO.

It is not clear yet how this should be structured. I can think of two ways:

  • If the response content type is a binary string, we could generate a different endpoint that allows the user to provide an IO to stream the response into. But detecting this may be tricky.
  • Enable the caller to provide an IO while making any API call, and have the response streamed into it.

I am leaning towards the second option ATM.

Enable the caller to provide an IO while making any API call, and have the response streamed into it.

I don't think providing IO is a good idea, this requires modifying the parameters list of many existing functions.

Maybe you want?

function download(api_call::Function;
    folder::AbstractString = pwd(),
    filename::Union{String,Nothing} = nothing,
    )::Tuple{Any,ApiResponse,String}
    # init io
    api_call(api,queries...;io)
    # rename file, move to target folder
end

First, codegen needs to add the IO parameter for ALL API call function:

function datasets_download(_api::Api, queries...; io)
    _ctx = _oacinternal_datasets_download(_api, queries...)
    return OpenAPI.Clients.exec(_ctx; io)
end

And client.jl needs some changes.

function exec(ctx::Ctx, stream_to::Union{Channel,Nothing}=nothing; io)
    stream = stream_to !== nothing
    resp, output = do_request(ctx, stream; stream_to=stream_to, io)
    ...
end
function do_request(ctx::Ctx, stream::Bool=false; stream_to::Union{Channel,Nothing}=nothing, io)
    ...
    # io exist
    Downloads.download(url;output=io)
    ...
   end

If the response content type is a binary string, we could generate a different endpoint that allows the user to provide an IO to stream the response into. But detecting this may be tricky.

Maybe this is a good solution. But no need to generate a different endpoint?

First, detecting this request is a file download (from ctx header)

Then, change return_type, make sure exec return filepath (string) and response_header, instead of response_body (Vector{UInt8}) and response_header.

Last, add the download code in do_request, and auto rename by response header.

The exposed API like:

    filepath,http_response = datasets_download(api,queries...)

In this way, there is no need to modify Codegen and many other function parameters list.

If users want to save the response result to a file. that's what savefile() do!

And one more question:

How to add progress_hook? Which method makes implementing hooks easy?

How to add progress_hook? Which method makes implementing hooks easy?

Downloads.download (via libcurl) allows some hooks to be registered for progress reporting.

Downloads.download (via libcurl) allows some hooks to be registered for progress reporting.

I mean, the order in which the functions are called is: generated_api -> OpenAPI.Clients.exec -> do_request -> Downloads.download.

Add progress callback function for Downloads.download, which means add callback function to generated_api, OpenAPI.Clients.exec, do_request argument list?

It is a little trivial and needs to modify many existing function argument lists.

I don't know if there is a better solution.