gRPC + Image Upload

I want to create a simple gRPC endpoint through which a user can upload their picture. The protocol buffer declaration is the following:

message UploadImageRequest {
    AuthToken auth = 1;
    // An enum with either JPG or PNG
    FileType image_format = 2;
    // Image file as bytes
    bytes image = 3;
}

Is this approach to uploading (and receiving) pictures still OK, despite the warning in the gRPC documentation?

If not, is the better (standard) approach to upload pictures through a regular form and store the image file location instead?

Forenamed answered 23/1, 2016 at 22:4 Comment(2)
What warning in the gRPC documentation are you referring to? - Buyers
@EricAnderson "Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy." - developers.google.com/protocol-buffers/docs/techniques?hl=en - Forenamed

For large binary transfers, the standard approach is chunking. Chunking can serve two purposes:

  1. reduce the maximum amount of memory required to process each message
  2. provide a boundary for recovering partial uploads.

For your use case, #2 probably isn't necessary.

In gRPC, a client-streaming call allows for fairly natural chunking since it has flow control, pipelining, and is easy to maintain context in the client and server code. If you care about recovery of partial uploads, then bidirectional-streaming works well since the server can be responding with acknowledgements of progress that the client can use to resume.

Chunking using individual RPCs is also possible, but has more complications. When load balancing, the backend may be required to coordinate with other backends for each chunk. If you upload the chunks serially, network latency can slow the upload, because you spend most of the time waiting for responses from the server. You then either have to upload in parallel (but how many in parallel?) or increase the chunk size. But increasing the chunk size increases the memory required to process each chunk and makes the granularity for recovering failed uploads coarser. Parallel upload also requires the server to handle out-of-order uploads.
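To make the latency point concrete, here is a rough back-of-the-envelope estimate in Python. All numbers (a 10 MiB file, a 50 ms round trip, 10 MiB/s of bandwidth) and the function name are made up for illustration, and the model ignores per-message overhead:

```python
import math


def serial_upload_seconds(file_bytes, chunk_bytes, rtt_seconds, bandwidth_bytes_per_s):
    """Estimate upload time when every chunk is its own RPC, sent one at a time."""
    chunks = math.ceil(file_bytes / chunk_bytes)
    transfer = file_bytes / bandwidth_bytes_per_s  # time spent moving bytes
    waiting = chunks * rtt_seconds                 # one round trip per chunk
    return transfer + waiting


MIB = 1024 * 1024
# Hypothetical link: 50 ms round trip, 10 MiB/s.
small_chunks = serial_upload_seconds(10 * MIB, 64 * 1024, 0.05, 10 * MIB)
large_chunks = serial_upload_seconds(10 * MIB, 1 * MIB, 0.05, 10 * MIB)
print(f"64 KiB chunks: {small_chunks:.1f} s, 1 MiB chunks: {large_chunks:.1f} s")
```

With 64 KiB chunks the estimate is dominated by the 160 round trips; with 1 MiB chunks only 10 round trips remain, at the cost of buffering 1 MiB per message. A streaming call avoids the trade-off because the client does not wait for a response after each chunk.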

Buyers answered 24/1, 2016 at 23:1 Comment(2)
So you are suggesting a return of (stream bytes)? Is there any way to return both the metadata & the stream of bytes in one? - Leela
I'm suggesting responding with a message, with one field that is bytes. So you could have metadata as other fields. - Buyers

The solution in the question will not work for large files; it will only work for small images. The better, standard approach is chunking. gRPC has built-in support for streaming, so it is fairly easy to send a file in chunks:

syntax = 'proto3';

message UploadImageRequest {
    // One chunk of the image file
    bytes image = 1;
}

// Declared inside a service block
rpc UploadImage(stream UploadImageRequest) returns (Ack);

This way, streaming does the chunking for us.

Every language provides its own way to split a file into chunks of a given size.
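As one illustration of splitting a file into chunks, here is a minimal Python sketch. The helper name `iter_chunks` and the 64 KiB default are arbitrary choices for this example, not something gRPC prescribes:

```python
import io


def iter_chunks(stream, chunk_size=64 * 1024):
    """Yield successive byte chunks read from a binary file-like object."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of file
            break
        yield chunk


# Usage with an in-memory stand-in for an image file.
chunks = list(iter_chunks(io.BytesIO(b"x" * 150), chunk_size=64))
print([len(c) for c in chunks])
```

For a real file, pass `open(path, "rb")` instead of the `BytesIO` object. Because the helper is a generator, only one chunk is held in memory at a time.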

Things to take care of:

You need to implement the chunking logic yourself; streaming only makes sending the chunks natural. If you also want to send metadata, there are three approaches.

1: Use the structure below.

message UploadImageRequest{
    AuthToken auth = 1;
    FileType image_format = 2;
    bytes image = 3;
}

rpc UploadImage(stream UploadImageRequest) returns (Ack); 

Here image still carries the chunks. Send AuthToken and FileType with the first chunk only, and leave those fields unset in all later requests.
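A client for approach 1 could produce its request stream roughly as follows. This is only a sketch: the `UploadImageRequest` dataclass is a stand-in for the class that protoc would generate, the string-typed `auth` and `image_format` fields are simplifications, and `build_requests` is a hypothetical helper name:

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class UploadImageRequest:
    """Stand-in for the generated protobuf class (assumption for this sketch)."""
    auth: Optional[str] = None
    image_format: Optional[str] = None
    image: bytes = b""


def build_requests(data: bytes, auth: str, image_format: str,
                   chunk_size: int = 64 * 1024) -> Iterator[UploadImageRequest]:
    """Yield one request per chunk; only the first one carries the metadata."""
    if not data:
        # Still send the metadata so the server sees a well-formed upload.
        yield UploadImageRequest(auth=auth, image_format=image_format)
        return
    for offset in range(0, len(data), chunk_size):
        first = offset == 0
        yield UploadImageRequest(
            auth=auth if first else None,
            image_format=image_format if first else None,
            image=data[offset:offset + chunk_size],
        )


requests = list(build_requests(b"a" * 10, "token-123", "PNG", chunk_size=4))
```

On the server, the matching rule would be to read `auth` and `image_format` from the first message and ignore those fields in every later one.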

2: You can also use oneof, which is much easier. Each message then carries either the metadata or an image chunk, never both.

message UploadImageRequest {
    oneof test_oneof {
        Metadata meta = 2;
        bytes image = 1;
    }
}

message Metadata {
    AuthToken auth = 1;
    FileType image_format = 2;
}

rpc UploadImage(stream UploadImageRequest) returns (Ack); 
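On the receiving side, the oneof makes it straightforward to enforce "metadata first, then only image chunks". The sketch below uses plain dataclasses as stand-ins for the generated classes. The `WhichOneof` method imitates the one that generated Python protobuf messages provide, and `receive_upload` is a hypothetical helper, not part of gRPC:

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Metadata:
    """Stand-in for the generated Metadata message (fields simplified to str)."""
    auth: str
    image_format: str


@dataclass
class UploadImageRequest:
    """Stand-in for the generated message; exactly one field should be set."""
    meta: Optional[Metadata] = None
    image: Optional[bytes] = None

    def WhichOneof(self, name: str) -> Optional[str]:
        # Imitates the generated API: returns the name of the field that is set.
        if self.meta is not None:
            return "meta"
        if self.image is not None:
            return "image"
        return None


def receive_upload(requests: Iterable[UploadImageRequest]) -> Tuple[Metadata, bytes]:
    """Reassemble an upload, rejecting streams that break the expected order."""
    meta = None
    parts = []
    for index, request in enumerate(requests):
        kind = request.WhichOneof("test_oneof")
        if kind == "meta":
            if index != 0:
                raise ValueError("metadata must be the first message")
            meta = request.meta
        elif kind == "image":
            if meta is None:
                raise ValueError("image data arrived before metadata")
            parts.append(request.image)
        else:
            raise ValueError("request has neither metadata nor image data")
    if meta is None:
        raise ValueError("no metadata received")
    return meta, b"".join(parts)


stream = [
    UploadImageRequest(meta=Metadata(auth="token-123", image_format="PNG")),
    UploadImageRequest(image=b"abc"),
    UploadImageRequest(image=b"def"),
]
meta, image = receive_upload(stream)
```

In a real servicer, the same loop would run over the request iterator that gRPC passes to the client-streaming handler, and the errors would be reported through the call's status rather than raised as `ValueError`.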

3: Use the structure below, send the metadata in the first chunk, and send the data in the remaining chunks. You need to handle that distinction in code.

syntax = 'proto3';

message UploadImageRequest {
    // First message: metadata; later messages: image data
    bytes message = 1;
}

rpc UploadImage(stream UploadImageRequest) returns (Ack);

Lastly, for auth you can use headers (gRPC metadata) instead of sending the token in the message.
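In Python gRPC, headers are passed through the `metadata` argument of a stub call and read on the server with `context.invocation_metadata()`. The sketch below shows the idea. The `authorization` key, the `Bearer` prefix, and the helper names are assumptions for illustration, and `stub` is expected to be a generated client stub exposing `UploadImage`:

```python
def auth_metadata(token):
    """Build call metadata carrying the token (lowercase key, as in HTTP/2 headers)."""
    return (("authorization", f"Bearer {token}"),)


def upload(stub, requests, token):
    """Start the client-streaming call with the token sent as a header."""
    return stub.UploadImage(requests, metadata=auth_metadata(token))


def token_from_metadata(invocation_metadata):
    """Server side: extract the token from an iterable of (key, value) pairs."""
    for key, value in invocation_metadata:
        if key == "authorization" and value.startswith("Bearer "):
            return value[len("Bearer "):]
    return None
```

Inside a servicer method, the last helper would be called as `token_from_metadata(context.invocation_metadata())`. This keeps the token out of every message and lets the server reject an unauthenticated call before reading any chunk.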

Fruitless answered 20/11, 2017 at 5:32 Comment(2)
So in case 1, how can you be sure AuthToken and FileType are only sent once? - Intendant
What do you mean by sending only once? If you own the client, sending them once is under your control; if the client is a third party, they have to control it. In either case, you ignore both fields after the first chunk. If you want to strictly enforce that they are sent only once, then approach 2 is your option. - Fruitless
