Multiple unary RPC calls vs. long-running bidirectional streaming in gRPC?
I have a use case where many clients need to keep sending a lot of metrics to the server, almost perpetually. The server needs to store these events and process them later. I don't expect any kind of response from the server for these events.
I'm thinking of using gRPC for this. Initially I thought client-side streaming would do (similar to what Envoy does), but the issue is that client-side streaming cannot ensure reliable delivery at the application level (i.e., if the stream closes midway, there is no way to know how many of the sent messages were actually processed by the server), and I can't afford that.
My thought process is that I should go with either bidirectional streaming, with acks in the server stream, or multiple unary RPC calls (perhaps batching the events into a repeated field for performance).
Which of these would be better?
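For context, the two shapes being compared might look like this in a .proto file. This is only an illustrative sketch; all message, field, and service names here are hypothetical, not from any real API:

```proto
syntax = "proto3";

package metrics;

message Event {
  int64 sequence = 1;  // client-assigned, usable for ack/dedup tracking
  bytes payload  = 2;
}

// Option A: bidirectional streaming, with acks on the server stream.
message Ack {
  int64 acked_sequence = 1;  // highest sequence processed so far
}

// Option B: unary calls, batching events in a repeated field.
message EventBatch {
  repeated Event events = 1;
}
message BatchAck {
  int64 acked_sequence = 1;
}

service MetricsIngest {
  rpc StreamEvents(stream Event) returns (stream Ack);  // Option A
  rpc SendBatch(EventBatch) returns (BatchAck);         // Option B
}
```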

Teaspoon answered 26/6, 2019 at 6:57 Comment(0)
the issue is that client side streaming cannot ensure reliable delivery at application level (i.e. if the stream closed in between, how many messages that were sent were actually processed by the server) and I can't afford this

This implies you need a response. Even if the response is just an acknowledgement, it is still a response from gRPC's perspective.

The general approach should be "use unary" unless streaming solves a large enough problem to justify its complexity costs. I discussed this at CloudNativeCon NA 2018 (slides and a video of the talk are available).

For example, if you have multiple backends, then each unary RPC may be sent to a different backend. That may cause high overhead for those various backends to synchronize themselves. A streaming RPC chooses a backend at the beginning and keeps using the same backend, so streaming might reduce the frequency of backend synchronization and allow higher performance in the service implementation. But streaming adds complexity when errors occur, and in this case it makes the RPCs long-lived, which makes them more complicated to load-balance. So you need to weigh whether the added complexity from streaming/long-lived RPCs provides a large enough benefit to your application.

We don't generally recommend using streaming RPCs for higher gRPC performance. It is true that sending a message on a stream is faster than a new unary RPC, but the improvement is fixed and has higher complexity. Instead, we recommend using streaming RPCs when it would provide higher application (your code) performance or lower application complexity.
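As a sketch of the "unary with acks" approach this answer leans toward: the client batches events, tags each batch with a sequence number, and retries until acknowledged; the server deduplicates on the sequence number so retries are idempotent. This is plain Python standing in for gRPC-generated stubs; all names are hypothetical.

```python
# Illustrative sketch: reliable delivery over unary-style calls with
# client-side batching, retries, and server-side deduplication.
# (Plain Python standing in for gRPC stubs; names are hypothetical.)

class Server:
    def __init__(self):
        self.stored = []           # durably "processed" events
        self.seen_batches = set()  # dedup on batch sequence number

    def send_batch(self, batch_seq, events):
        """Unary handler: idempotent thanks to the dedup set."""
        if batch_seq not in self.seen_batches:
            self.stored.extend(events)
            self.seen_batches.add(batch_seq)
        return {"acked": batch_seq}  # the response doubles as the ack


class FlakyServer(Server):
    """Drops the first attempt of every batch to force a retry."""
    def __init__(self):
        super().__init__()
        self.attempts = {}

    def send_batch(self, batch_seq, events):
        self.attempts[batch_seq] = self.attempts.get(batch_seq, 0) + 1
        if self.attempts[batch_seq] == 1:
            raise ConnectionError("simulated transport failure")
        return super().send_batch(batch_seq, events)


def send_reliably(server, events, batch_size=3, max_retries=5):
    """Client: batch events and retry each batch until acked."""
    for seq, start in enumerate(range(0, len(events), batch_size)):
        batch = events[start:start + batch_size]
        for _attempt in range(max_retries):
            try:
                ack = server.send_batch(seq, batch)
                assert ack["acked"] == seq
                break
            except ConnectionError:
                continue  # unacked batch: safe to resend (idempotent)
        else:
            raise RuntimeError(f"batch {seq} never acknowledged")


server = FlakyServer()
send_reliably(server, list(range(10)))
print(server.stored)  # every event delivered exactly once, in order
```

The key design point is that the ack is just the unary response, so "reliable delivery" falls out of ordinary request/response semantics plus an idempotency key, with no stream state to reconcile after a failure.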

Halfon answered 26/6, 2019 at 14:16 Comment(1)
Thanks for the answer, Eric. I had already watched your talk; it's partly what led me to suspect that streaming might be overkill for my use case and to ask this question. – Teaspoon
Streams ensure that messages are delivered in the order they were sent; this means that if there are concurrent messages, the ordering itself becomes a bottleneck (head-of-line blocking).

Google's gRPC team advises against using streams over unary calls for performance. There have been arguments that streams should theoretically have lower overhead, but that does not seem to hold up in practice.

For a low number of concurrent requests, both have comparable latencies. For higher loads, however, unary calls are much more performant.

There is no apparent reason to prefer streams over unary calls, given that streams come with additional problems:

  • Poor latency when we have concurrent requests
  • Complex implementation at the application level
  • Lack of load balancing: the client will connect with one server and ignore any new servers
  • Poor resilience to network interruptions (even a brief TCP interruption fails the whole stream)
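The first bullet can be illustrated with a toy model: on a single ordered stream, every message must wait for the ones before it, while independent unary calls can complete in parallel. Plain Python with hypothetical processing times:

```python
# Toy model of head-of-line blocking on an ordered stream vs.
# independent unary calls. Processing times (in ms) are hypothetical.

def stream_completion_times(processing_times):
    """One ordered stream: message i finishes after all earlier ones."""
    finished, elapsed = [], 0
    for t in processing_times:
        elapsed += t
        finished.append(elapsed)
    return finished

def unary_completion_times(processing_times):
    """Fully concurrent unary calls: each finishes after only its own work."""
    return list(processing_times)

times = [1, 5, 1]  # the second message is slow

print(stream_completion_times(times))  # [1, 6, 7] -- later messages wait
print(unary_completion_times(times))   # [1, 5, 1] -- no waiting
```

The unary numbers assume perfect concurrency (each call served independently, e.g. by different backends), which is the best case; the stream numbers are the worst case of strict in-order processing. Real systems land between the two, but the slow-message penalty on the stream is the point.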

Some benchmarks here: https://nshnt.medium.com/using-grpc-streams-for-unary-calls-cd64a1638c8a

Situated answered 28/12, 2020 at 18:59 Comment(3)
Could you elaborate more on the 2nd and 3rd bullet points? For 'complex implementation', I thought we could just follow the code from the documentation to set up streaming, and our application could easily implement code to read the received data. For 'lack of load balancing', do we actually need each client to connect to new servers? I thought the client just has to form one connection with one server to stream data. – Neo
Sure, @Neo. About point 2: re-establishing the stream on error (when a server instance goes down) has to be done at the application level, and mapping a client request to its response also has to be done at the application level. For example, if we send two requests concurrently for getSalary({employeeId: uint32}) and the response is {amount: uint64, currency: Currency}, then we need to map the requested employee id to its response. This either makes you add extra fields to your response (changing the contract because of an internal implementation detail) or add extra logic to handle the mapping. – Situated
Point 3: we may use two types of load balancing with gRPC, server-side or client-side, and this applies to both. Multiple unary calls from a client are evenly distributed among the gRPC backends (good load balancing), but once a streaming connection is established with a single backend, all load from that client goes to that specific server (bad load balancing). This also makes streaming suitable when we need transactions, but bad otherwise. – Situated
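The request/response mapping problem described in the comment above is usually solved with a client-chosen correlation id that the server echoes back, so responses on a shared stream can arrive in any order. A minimal sketch in plain Python; the field names (correlation_id, body) are hypothetical:

```python
# Sketch: correlating out-of-order responses on a shared stream.
# The client tags each request with an id; the server echoes it back,
# so responses can be matched regardless of arrival order.

import itertools

class StreamClient:
    def __init__(self):
        self._next_id = itertools.count(1)
        self._pending = {}  # correlation id -> original request

    def send(self, request):
        """Wrap a request in a wire message carrying a correlation id."""
        cid = next(self._next_id)
        self._pending[cid] = request
        return {"correlation_id": cid, **request}

    def receive(self, response):
        """Match a response back to the request that caused it."""
        request = self._pending.pop(response["correlation_id"])
        return request, response["body"]

client = StreamClient()
m1 = client.send({"employee_id": 7})
m2 = client.send({"employee_id": 9})

# The server replies to the second request first; the id pairs them up.
req, body = client.receive({"correlation_id": m2["correlation_id"],
                            "body": {"amount": 900}})
print(req)  # {'employee_id': 9}
```

This is exactly the "extra fields and extra logic" the comment warns about: the correlation id leaks a transport concern into the message contract, which unary calls avoid because each response is already tied to its request.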