grpc/grpc-go

File upload (Blob streaming)

labkode opened this issue · 13 comments

Is it possible to easily stream binary blobs with gRPC?

With an HTTP/1.1 server I can pipe the request.Body to a Writer easily.

The only way I've found to stream a big blob with gRPC is to create a chunk protobuf message

message DataChunk {
    bytes data = 1;
}

and to create a streaming endpoint on the server.

Then the client reads from a Reader into a small buffer, and each filled buffer is streamed through the gRPC connection as a DataChunk message.

It would be great to have the same Reader/Writer ease of use with gRPC streaming.

I've tried to find some info about file/blob uploads, and the only resource I've found is this article.

It shows file uploads being handled by HTTP and metadata by gRPC.

Is gRPC able/designed to handle these file uploads without using my poor approach?

gRPC isn't designed for arbitrary streams, only streams of messages. Your DataChunk is exactly what's supported.

@dsymonds Thanks for your quick response. We can close this issue.

So the suggestion is not to use gRPC for uploading arbitrary blobs of data? Is it better to use HTTP multipart requests?

-- Apologies for commenting on closed issues.

@c4milo I guess that everyone who needs to transfer big objects via gRPC will have to implement chunking on top of gRPC streaming.

@vitalyisaev2 do you have any suggestions about how to do this? I came up with an approach I'm not very proud of (using unsafe.Sizeof on unserialized messages and wild guesses).

Serializing beforehand in order to get the real size seems too wasteful since gRPC would encode the message again before sending.

Perhaps having a custom codec with a timer and flushing whenever it reaches a size limit or a timeout?

@c4milo I would suggest trying the following approach:

service BlobKeeper {
    rpc Put (stream PutRequest) returns (PutResponse);
}

message PutRequest {
    message Key {
        string key = 1;
    }
    message Chunk {
        bytes data = 1;
        int64 position = 2;
    }
    oneof value {
        Key key = 1;
        Chunk chunk = 2;
    }
}

On the client side you'll have to split your data into chunks and send them sequentially within a stream. The first message should probably carry some kind of metadata, such as the data key. On the server side you can do whatever you want: you may buffer the chunks, or you can write them to disk immediately.

One thing is for sure: you'll have to do a lot of manual work here, but it's quite straightforward.

@vitalyisaev2 nice, I had something very similar. I'll keep using this approach then. Thank you!

Is there any information about how people are doing this? Any libraries that already do this on top of gRPC? Multipart, as suggested?
Thx!

@cirocosta I'm interested in this as well!

I'm interested in this too.

Same here

I tried to implement chunking in Java and C++, and wrote some sample code and protobuf definitions. Hope it helps. Any feedback is appreciated.

@tzutalin
Never mind the previous post.
I have two more concerns:

  1. https://github.com/tzutalin/example-grpc/blob/0bde7f0ed4a9b0961429697307be88314734cbe5/java/src/main/java/UploadFileClient.java#L74
     This line needs to be changed to

     ByteString byteString = ByteString.copyFrom(buffer, 0, tmp);

     to make sure we don't waste space or change the original file size.

  2. When a file takes multiple chunks to send, we need to wait for a while after sending each chunk; otherwise the server will refuse to process the request.