Microservices. Event-based architecture.
Download the source code
First, let us look at how the components of this small microservices application communicate.
The React application communicates via network requests with a Posts Service and a Comments Service. Both are Express-based applications. We store all our data in memory, so we do not have to worry about databases or anything like that.
After cloning the project, check out the downsideone branch to reproduce the problem we are going to talk about next.
git checkout downsideone
The downside is that for every single post we load, we make one request to the Comments Service to get all the comments associated with that post. With N posts on the page, that is N requests to the Comments Service on top of the request for the posts themselves, which is incredibly inefficient.
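To make the cost concrete, here is a minimal sketch that simulates the two services as plain in-memory functions and counts the requests the client makes. All names and sample data are hypothetical and are not taken from the project; real calls would go over HTTP.

```javascript
// Hypothetical in-memory stand-ins for the data held by the two services.
const posts = [
  { id: "p1", title: "First post" },
  { id: "p2", title: "Second post" },
  { id: "p3", title: "Third post" },
];
const commentsByPostId = {
  p1: [{ id: "c1", content: "Nice" }],
  p2: [],
  p3: [{ id: "c2", content: "Thanks" }, { id: "c3", content: "Agreed" }],
};

let requestCount = 0;

// Stands in for the request that lists all posts (Posts Service).
function getPosts() {
  requestCount += 1;
  return posts;
}

// Stands in for the request that lists the comments of one post (Comments Service).
function getComments(postId) {
  requestCount += 1;
  return commentsByPostId[postId] || [];
}

// What the client does today: one request for the posts,
// then one extra request per post for its comments.
function loadPage() {
  return getPosts().map((post) => ({
    ...post,
    comments: getComments(post.id),
  }));
}

const page = loadPage();
console.log(`${page.length} posts needed ${requestCount} requests`);
// prints "3 posts needed 4 requests"
```

The request count grows linearly with the number of posts, which is the behaviour visible in the browser's network tab.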
To see the problem live, start the project by executing these commands in the directory of each service.
Afterwards, you can see the result in your browser by inspecting the network requests.
In a monolithic architecture, this problem has an easy solution.
So how do we solve this problem in a microservices architecture?
In a previous post, Data management between services, we looked at the different styles of communication in this environment. Let’s go through the options for solving this problem.
Synchronous vs asynchronous: when to choose one over the other?
Both types of communication have benefits and drawbacks. Asynchronous communication is hard to get right but offers loose coupling, while synchronous communication brings tight coupling but is simple to use and debug. It is very common to find both in the same application.
Microservices Sync communication
This is an option, but it is not the one we use in our case.
The post mentioned above describes the downsides of this type of communication.
It is easy to understand, but those downsides make it a poor fit for our case.
An Async Solution
For our case we will use asynchronous communication. Let’s see how it works.
First, the Posts Service will emit an event any time a post is created:
Likewise, the Comments Service will emit an event any time a comment is created, and the Query Service assembles all of the posts and comments into an efficient data structure:
As a result, the Query Service has zero dependencies on other services and will be extremely fast.
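As a rough sketch, the Query Service might fold both kinds of events into a single structure like this. The event names PostCreated and CommentCreated and the payload fields are assumptions made for illustration, and in the real service this logic would sit behind an Express route.

```javascript
// Posts keyed by id, each with its comments embedded.
const posts = {};

// Applies a single event to the in-memory structure.
function handleEvent(type, data) {
  if (type === "PostCreated") {
    const { id, title } = data;
    posts[id] = { id, title, comments: [] };
  }
  if (type === "CommentCreated") {
    const { id, content, postId } = data;
    const post = posts[postId];
    // Ignore comments that refer to a post we have not seen.
    if (post) post.comments.push({ id, content });
  }
}

handleEvent("PostCreated", { id: "p1", title: "Hello" });
handleEvent("CommentCreated", { id: "c1", content: "First!", postId: "p1" });
handleEvent("CommentCreated", { id: "c2", content: "Second", postId: "p1" });

// The client can now read every post with its comments in one request.
console.log(JSON.stringify(posts, null, 2));
```

Because the comments are already nested under their post, serving the page is a single lookup with no calls to the other services.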
If we create several posts and comments and then stop both services to simulate a failure, the information is still available.
To stop the services, press the following key combination in the terminal of each service you want to stop (posts and comments):
Ctrl + C
Then, we still have the information available:
Asynchronous communication has downsides as well, but it is the better solution for us.
We have already talked about it in another blog post, Async communication (second way)
Event Bus for Microservices
First, we must clarify that this application is for teaching purposes only. The Event Bus that we are going to create will not implement the vast majority of the features a real bus has. Our objective is to get a really good idea of what is going on behind the scenes, and that is why we wrote this Event Bus implementation.
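The core idea of such a bus fits in a few lines. The sketch below is an assumption about its shape, not the project's code: each subscriber is a plain function standing in for an HTTP POST to a service, and the names are hypothetical.

```javascript
// A deliberately tiny event bus: it only forwards events to subscribers.
function createEventBus() {
  const subscribers = [];
  return {
    subscribe(handler) {
      subscribers.push(handler);
    },
    // Forwards the event to every subscriber and reports how many received it.
    publish(event) {
      let delivered = 0;
      for (const handler of subscribers) {
        try {
          handler(event);
          delivered += 1;
        } catch (err) {
          // A service that is down must not block delivery to the others.
        }
      }
      return delivered;
    },
  };
}

const bus = createEventBus();
const received = [];
bus.subscribe((event) => received.push(`query:${event.type}`));
bus.subscribe(() => {
  throw new Error("service is down");
});
bus.subscribe((event) => received.push(`comments:${event.type}`));

const delivered = bus.publish({ type: "PostCreated", data: { id: "p1" } });
console.log(delivered, received);
```

Note that nothing is retried or stored here, which is exactly the kind of feature a production bus would add.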
Although comment moderation would be very easy to implement in the existing Comments Service, let’s assume we need a new service for it. This feature looks simple, but there is some surprising hidden complexity here. It is, of course, simpler in a monolith.
It might take a long time for the new service to moderate a comment. Therefore, as soon as the comment is created, it must be persisted in the Query Service with the pending status so that the user can be informed that their comment is waiting to be moderated.
So that the Query Service stays totally independent of the other services, the Comments Service takes care of all the business logic around the comment.
Whenever the Comments Service updates a comment in response to a specialized event, it emits a single, very generic event called simply comment updated. The Query Service only listens for update events, without worrying about what the update was or trying to interpret it, so it never runs any specialized business logic. In short, the Query Service is limited to recording the updated information.
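The division of labour can be sketched as follows. The event names CommentModerated and CommentUpdated, the field names, and the "approved" status are assumptions for illustration. Only the pending status comes from the text above.

```javascript
// --- Comments Service: owns the business logic ---
const comments = {
  c1: { id: "c1", postId: "p1", content: "Hi", status: "pending" },
};

// Handles a specialized event and returns the generic event to emit.
function onCommentModerated({ id, status }) {
  const comment = comments[id];
  comment.status = status;
  return { type: "CommentUpdated", data: { ...comment } };
}

// --- Query Service: records whatever it is told ---
const posts = {
  p1: { id: "p1", comments: [{ id: "c1", content: "Hi", status: "pending" }] },
};

function onCommentUpdated(data) {
  const { postId, ...fields } = data;
  const stored = posts[postId].comments.find((c) => c.id === fields.id);
  // Copy every field as-is; no interpretation of what changed.
  Object.assign(stored, fields);
}

const updateEvent = onCommentModerated({ id: "c1", status: "approved" });
onCommentUpdated(updateEvent.data);
console.log(posts.p1.comments[0].status);
```

If a new kind of change is added later (for example editing the content), only the Comments Service needs a new handler. The Query Service keeps copying fields.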
To illustrate all this, we will set up a filter in the Moderation Service. This filter rejects every comment that includes the word orange.
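A minimal sketch of such a filter is shown below. The status values "approved" and "rejected" and the CommentModerated event name are assumptions, and the check is case-sensitive as written.

```javascript
// Returns the status the Moderation Service would assign to a comment.
// Case-sensitive: only the lowercase word "orange" triggers a rejection.
function moderate(content) {
  return content.includes("orange") ? "rejected" : "approved";
}

// Turns an incoming comment into the event the service would emit.
function onCommentCreated(comment) {
  return {
    type: "CommentModerated",
    data: { ...comment, status: moderate(comment.content) },
  };
}

const rejected = onCommentCreated({ id: "c1", postId: "p1", content: "I like orange juice" });
const approved = onCommentCreated({ id: "c2", postId: "p1", content: "Nice post" });
console.log(rejected.data.status, approved.data.status);
// prints "rejected approved"
```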
Let’s simulate a comment that is not moderated immediately after it is created. Imagine that a person is in charge of the task and that it can take 5 minutes, 2 hours, a day or more, because it requires human intervention.
To do this, we stop the Moderation Service.
Ctrl + C
When we create a comment, the screen tells us that it is waiting for moderation.
If we start the service again, we find a problem. The event was sent to the Moderation Service while it was stopped (a temporary interruption), and when the service starts again the event does not reach it: it has been lost. The whole application is now a bit out of sync.
After restarting the Moderation Service, the last comment created remains waiting for moderation.
To reproduce this problem, check out the moderationservice branch:
git checkout moderationservice
Microservices: Dealing with Missing Events
The first scenario to consider: what happens to our application when a service goes down for some period of time?
First, the Moderation Service stops working for a while, so events B and C don’t reach their destination. A little later the Moderation Service recovers, but the events have been lost. The service has no way of finding out that these events happened, let alone of retrieving them.
Events possibly being missed while a service is experiencing some amount of downtime.
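The scenario can be reproduced in a few lines. This is a simulation with hypothetical names. Events A and D are added for illustration around the B and C mentioned above, and the "bus" here is a single function that keeps no record of what it sends.

```javascript
// A service that only receives events while it is up.
function createService() {
  return {
    up: true,
    received: [],
    receive(event) {
      if (!this.up) throw new Error("connection refused");
      this.received.push(event);
    },
  };
}

// A bus that forwards events but keeps no record of them.
function publish(service, event) {
  try {
    service.receive(event);
  } catch (err) {
    // Nothing is stored, so the event is gone for good.
  }
}

const moderation = createService();
publish(moderation, "A");
moderation.up = false; // the service goes down
publish(moderation, "B");
publish(moderation, "C");
moderation.up = true; // the service recovers
publish(moderation, "D");

console.log(moderation.received); // A and D only; B and C were lost
```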
The second scenario we must consider is the following: we create and add a service a year later, or we start a service much later than the others. In other words, we bring services online in the future.
How do we manage this? We must synchronize the new service with the rest of the application by retrieving all past events and processing them in the service.
The possible solutions are the following.
First, let’s imagine that we have been running the Posts and Comments services for some time, and a year later we decide to add the Query Service.
As a first option, the Query Service can make synchronous requests when it starts: one to the Posts Service to obtain all the posts that exist so far, and another to the Comments Service to obtain all the comments that exist so far.
We already have the code in production, and this change would force us to implement all of these calls in the Query Service and to create an endpoint in each existing service to respond to them. If many services are involved in a change of this type, we must modify all of them. We consider this a downside.
Moreover, these requests would be made only when the Query Service starts, to synchronize it with the application. From then on it would just emit and receive events to stay synchronized, and the requests would never be made again. So we would be writing code whose only function is to bring the Query Service online.
Direct Database Access
Another possible solution would be direct access to the databases. This would be an exception to the rule we discussed in Data management between services: each service must have a private database.
Instead of making a synchronous request to each service, the Query Service queries that service’s database directly. The upside is that the Query Service can run its own queries and work out all the data it needs from the posts database and the comments database. Once all this data is synchronized, the Query Service can start listening for events.
However, this approach has a downside. If the services use different types of databases (MongoDB, PostgreSQL, etc.), the code in the Query Service must take all these differences into account. That is potentially a lot of extra code to write.
Finally, as a third option we have a solution that we can call Store Events. This is the approach commonly used in practice. We give the Query Service access to all the events that have happened in the past. In addition to emitting and receiving events, the Event Bus stores all events internally in some kind of data structure or database (in our learning application we use memory).
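A minimal sketch of a bus that keeps such a log is shown below. It is an illustration under assumptions: subscribers are plain functions standing in for HTTP calls, the event names are hypothetical, and the log lives in an array.

```javascript
function createEventBus(subscribers) {
  const events = []; // in-memory log, oldest first
  return {
    publish(event) {
      // Store first, so the event survives even if delivery fails.
      events.push(event);
      for (const deliver of subscribers) {
        try {
          deliver(event);
        } catch (err) {
          // The subscriber is down; it can catch up later from the log.
        }
      }
    },
    // What the GET /events endpoint would return: a copy of the log.
    getEvents() {
      return events.slice();
    },
  };
}

const seenByQuery = [];
const bus = createEventBus([
  (event) => seenByQuery.push(event.type),
  () => {
    throw new Error("moderation service is down");
  },
]);

bus.publish({ type: "PostCreated", data: { id: "p1" } });
bus.publish({ type: "CommentCreated", data: { id: "c1", postId: "p1" } });

console.log(bus.getEvents().length); // 2: both events are kept
```

Returning a copy from getEvents keeps callers from mutating the log by accident.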
In this way, we solve the issue of bringing services online in the future. The Query Service code shows how the events are retrieved when the service starts.
const res = await axios.get("http://localhost:4005/events");
Not only does it solve the issue of bringing services online in the future, it also solves the problem of events being missed while a service is experiencing downtime. When the Moderation Service starts, it retrieves past events.
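Once the stored events have been fetched, catching up amounts to replaying them through the same handler used for live events. The sketch below assumes that the response body (res.data in the axios call) is an array of { type, data } objects, oldest first. The handler and event names are hypothetical.

```javascript
// Feeds stored events, oldest first, through the live-event handler.
function replay(events, handleEvent) {
  for (const event of events) {
    handleEvent(event.type, event.data);
  }
  return events.length;
}

// Hypothetical Query Service state and handler.
const posts = {};
function handleEvent(type, data) {
  if (type === "PostCreated") posts[data.id] = { id: data.id, comments: [] };
  if (type === "CommentCreated") posts[data.postId].comments.push(data.id);
}

// Stands in for the array returned by the Event Bus.
const storedEvents = [
  { type: "PostCreated", data: { id: "p1" } },
  { type: "CommentCreated", data: { id: "c1", postId: "p1" } },
  { type: "CommentCreated", data: { id: "c2", postId: "p1" } },
];

const processed = replay(storedEvents, handleEvent);
console.log(processed, JSON.stringify(posts));
```

Reusing one handler for both replayed and live events means there is no separate synchronization code to maintain.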
To check that all these microservices are working properly, we can stop one service or another and create posts and comments. When the service is started or restarted, all the events created in the meantime are recovered and the application is correctly synchronized with everything that happened in the past.