Data Sharing between micro services

Asked 13/1, 2017 at 17:52 Answered 25/1, 2017 at 22:16

Solved design-patterns architecture microservices aws-api-gateway data-sharing

Current Architecture:

Problem:

We have a two-step flow between frontend and backend layers.

First step: The frontend validates an input I1 from the user on microservice 1 (MS1)
Second step: The frontend submits I1 and more information to the microservice 2

The micro service 2 (MS2) needs to validates the integrity of I1 as it is coming from the frontend. How to do avoid a new query to MS1? What's the best approach?

Flows that I'm trying to optimize removing the steps 1.3 and 2.3

Flow 1:

1.1 The User X requests data (MS2_Data) from MS2
1.2 The User X persists data (MS2_Data + MS1_Data) on MS1
1.3 The MS1 check the integrity of MS2_Data using a B2B HTTP request
1.4 The MS1 use MS2_Data and MS1_Data to persist and Database 1 and build the HTTP response.

Flow 2:

2.1 The User X already has data (MS2_Data) stored on local/session storage
2.2 The User X persists data (MS2_Data + MS1_Data) on MS1
2.3 The MS1 check the integrity of MS2_Data using a B2B HTTP request
2.4 The MS1 use MS2_Data and MS1_Data to persist and Database 1 and build the HTTP response.

Approach

One possible approach is to use a B2B HTTP request between MS2 and MS1 but we would be duplicating the validation in the first step. Another approach will be duplicating data from MS1 to MS2. however this is prohibitive due to the amount of data and it's volatility nature. Duplication does not seem to be a viable option.

A more suitable solution is my opinion will the frontend to have the responsibility to fetch all the information required by the micro service 1 on the micro service 2 and delivered it to the micro service 2. This will avoid all this B2B HTTP requests.

The problem is how the micro service 1 can trust the information sent by the frontend. Perhaps using JWT to somehow sign the data from the micro service 1 and the micro service 2 will be able to verify the message.

Note Every time the micro service 2 needs information from the micro service 1 a B2B http request is performed. (The HTTP request use ETAG and Cache Control: max-age). How to avoid this?

Architecture Goal

The micro service 1 needs the data from the micro service 2 on demand to be able to persist MS1_Data and MS2_Data on MS1 database, so the ASYNC approach using a broker does not apply here.

My question is if exists a design pattern, best practice or a framework to enable this kind of thrust communication.

The downside of the current architecture is the number of B2B HTTP requests that are performed between each micro services. Even if I use a cache-control mechanism the response time of each micro service will be affected. The response time of each micro services is critical. The goal here is to archive a better performance and some how use the frontend as a gateway to distribute data across several micro services but using a thrust communication.

MS2_Data is just an Entity SID like product SID or vendor SID that the MS1 must use to maintain data integrity.

Possible Solution

The idea is to use the gateway as an api gateway request processing that will cache some HTTP response from MS1 and MS2 and use them as a response to MS2 SDK and MS1 SDK. This way no communication (SYNC OR ASYNC) is made directly between MS1 and MS2 and data duplication is also avoided.

Of course the above solution is just for shared UUID/GUID across micro services. For full data, an event bus is used to distribute events and data across micro services in an asynchronous way (Event sourcing pattern).

Inspiration: https://aws.amazon.com/api-gateway/ and https://getkong.org/

Related questions and documentation:

Nagpur answered 13/1, 2017 at 17:52 Comment(9)

What's the downside of the current solution? It's simpler and each service just has to return the data that is asked of it, instead of having to get into the message authentication business. – Jerrine 13/1, 2017 at 18:28

The downside is the number of B2B HTTP requests that are performed between each micro services. Even if I use a cache-control mechanism the response time of each micro service will be affected. The response time of each microservices is critical. The goal here is to archive a better performance and some how use the frontend as a gateway to distribute data across several microservices but using a thrust communication. @Jerrine what do you think? – Nagpur 13/1, 2017 at 18:42

I don't see how you are reducing overall traffic, you're just rearranging it and adding in the complication of encryption. – Jerrine 13/1, 2017 at 19:12

Can you describe dynamic behaviour of illustrated comonents in domain-specific terms? E.g. "User X queries profile data and needs to query both services A and B". Because currently it is not very clear what those b2b requests are about. – Bandmaster 14/1, 2017 at 16:45

@Bandmaster I updated the question with more details regarding my current architecture and what I'm trying to archive. Please take a look at the flows sections. – Nagpur 16/1, 2017 at 10:27

@Bandmaster please check my updated question and the section possible solution. What do you think? – Nagpur 25/1, 2017 at 22:18

@Jerrine please check my updated question and the section possible solution. What do you think? – Nagpur 25/1, 2017 at 22:18

Denormalization and async communication is a best practice in a modern distributed fault tolerant systems. Try to accept it, otherwise you will end up building another monolith and fragile thing. – Bandmaster 26/1, 2017 at 5:14

@Bandmaster I updated my possible solution. Of course the above solution is just for shared UUID/GUID across micro services. For full data, an event bus is used to distribute events and data across micro services in an asynchronous way (Event sourcing pattern). The main goal is always to avoid data duplication. Denormalization is archived with bounded context (This is critical of course). What do you think? – Nagpur 26/1, 2017 at 10:17

Check the section Possible Solution on my question:

Inspiration: https://aws.amazon.com/api-gateway/ and https://getkong.org/

Nagpur answered 25/1, 2017 at 22:16 Comment(1)

can help on this one? #44084419 – Bacteria 20/5, 2017 at 12:52

From the question and comments I understand that you are trying to rearrange the blocks to enhance the performance of the system. As described by the diagrams you suggest that instead of microservice1 querying microservice2 the gateway would query microservice2 and then query microservice1 providing it the information from microservice2.

As such I don't see how this would significantly increase the system performance but rather the change seems to just move the logic around.

To remedy the situation the performance of the critical microservice2 should be enhanced. It can be done by profiling and optimizing the microservice2 software (vertical scaling) and/or you can introduce load balancing (horizontal scaling) and execute microservice2 on multiple servers. The design pattern to be utilized in this case is Service Load Balancing pattern.

Unwilling answered 14/1, 2017 at 10:57 Comment(2)

I updated the question with more details regarding my current architecture and what I'm trying to archive. I'm already using load balancing and vertical scaling on each micro services. Please take a look at the flows sections. – Nagpur 16/1, 2017 at 10:2

please check my updated question and the section possible solution. What do you think? – Nagpur 25/1, 2017 at 22:19

It's difficult to judge the viability of any solution without looking "inside" the boxes, however:

If the only thing you care about here is stopping the frontend from potentially tampering with the data, you can create a sort of "signature" of the packet of data sent by MS2 to the frontend and propagate the signature to MS1 together with the packet. The signature can be a hash of the packet contatenated with a pseudorandom number generated in a deterministic way from a seed shared by the microservices (so MS1 can recreate the same pseudorandom number as MS2 without the need for an additional B2B HTTP request, and then verify the integrity of the packet).
The first idea that comes to my mind is to verify whether the ownership of the data could be modified. If MS1 must frequently access a subset of data from MS2 it MAY be possible to move the ownership of that subset from MS2 to MS1.
In an ideal world the microservices should be completely standalone, each one with it's own persistence layer and a replication system in place. You say that a broker is not a viable solution, so what about a shared data layer?

Hope it helps!

Toby answered 13/1, 2017 at 19:50 Comment(2)

I updated the question with more details regarding my current architecture and what I'm trying to archive. Can you provide more information/links regarding. Best answer so far! – Nagpur 16/1, 2017 at 10:4

please check my updated question and the section possible solution. What do you think? – Nagpur 25/1, 2017 at 22:19

You can consider changing your sync way of b2b communication to the async one using publish-subscribe pattern. In that situation, services operation will be more independent and you may not need to perform b2b requests all the time.

The way you make it faster in distributed system is through denormalization. If ms2data changes rarely, e.g. you read it more than rewrite, you have to duplicate it across services. By doing this you will reduce latency and temporal coupling. Coupling aspect may be even more important than the speed in many situations.

If ms2data is an information about product, then ms2 should publish ProductCreated event containing ms2data to a bus. Ms1 should be subscribed to this event and store ms2data in its own database. Now, anytime ms1 requires ms2data it will just read it locally without a need to perform requests to ms2. This is what temporal decoupling means. When you following this pattern, your solution becomes more fault tolerant and shutting down ms2 will not influence ms1 in any way.

Consider reading a good series of articles that describe the problems behind sync communication in microservices architecture.

Related SO questions here and here discussing pretty similar problems, consider taking a look.

Bandmaster answered 14/1, 2017 at 16:42 Comment(4)

I updated the question with more details regarding my current architecture and what I'm trying to archive. The async approach can not be used here because: The micro service 1 needs the data from the micro service 2 on demand to be able to persist MS1_Data and MS2_Data on MS1 database, so the ASYNC approach using a broker does not apply here. – Nagpur 16/1, 2017 at 10:0

thank you Leonel, take a look at my updated answer based on new information provided. Unfortunately you have not specified the meaning of data1 and data2, so it is hard to give more meaningful answer. Is data an image? Is it authentication key? Is it user profile? Still not clear. – Bandmaster 16/1, 2017 at 15:54

MS2_Data is just an Entity SID like product SID or vendor SID that the MS1 must use to maintain data integrity. – Nagpur 17/1, 2017 at 10:32

Then you should consider storing this id in both services. I've updated my answer with the detailed description about how to do that. – Bandmaster 17/1, 2017 at 16:36

Recommended topics

Hot tags