Tag: microservices

Microservices architecture – orchestrator, choreography, hybrid… which approach to use?

So you want to build microservices. You have identified properly isolated domains and boundaries for each one of them, and now it’s time to figure out how to make them interact with each other… This is when you ask yourself: should we use an orchestrator approach? A choreography approach? Mmmm, maybe a hybrid solution?

In this post, I’m going to talk about the different microservices architectures and try to answer those questions, so that when you have to make the call you at least have an idea of the tradeoffs of each one.

The orchestrator architecture

[Diagram: Orchestrator architecture]

In the orchestrator architecture, our users will probably hit an API gateway, which then triggers an event on the orchestrator. The orchestrator service is in charge of executing the business logic by making requests to the other microservices while keeping track of the event status. It is a centralized controller, and the requests are often synchronous.
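To make this more concrete, here is a minimal TypeScript sketch of what one orchestrated flow could look like. The service URLs, endpoint names, and payload shapes are made up for illustration; a real orchestrator would also persist the event status somewhere.

// Minimal orchestrator sketch: it executes the business logic step by step,
// calling each microservice synchronously and keeping track of the status.
// Service URLs and payloads are hypothetical.
async function handleOrderCreated(event: { orderId: string; userId: string }) {
  const status: Record<string, string> = {};

  // Step 1: reserve stock in the inventory service
  const stock = await fetch("http://inventory-service/reservations", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ orderId: event.orderId }),
  });
  status.inventory = stock.ok ? "reserved" : "failed";
  if (!stock.ok) return status; // the orchestrator decides what to do on failure

  // Step 2: charge the customer in the payment service
  const payment = await fetch("http://payment-service/charges", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ orderId: event.orderId, userId: event.userId }),
  });
  status.payment = payment.ok ? "charged" : "failed";

  return status; // the whole flow and its state live in one place
}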

Let’s dive into the pros and cons of the orchestrator approach:

Pros

  • Business logic is “hardcoded” and tangible
  • Easy request flow and status tracking

Cons

  • Tightly coupled services and dependencies between them
  • Single point of failure

The choreography architecture

[Diagram: Choreography architecture]

The choreography architecture, often called the message broker or reactive architecture, is reactive and loosely coupled. Here, a message/event broker receives the requests and the microservices subscribe to them. Each one of them is triggered when a certain message appears in the broker. If needed, after a service is triggered by a message it can also talk to other services directly, to chain different flows. Messaging here is asynchronous, following a publisher/subscriber pattern.
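As a rough illustration, here is a tiny TypeScript sketch of the publish/subscribe idea, using Node’s built-in EventEmitter as a stand-in for a real message broker; the event names and payloads are made up.

import { EventEmitter } from "events";

// Stand-in for a real broker (Kafka, RabbitMQ, a cloud pub/sub service, ...)
const broker = new EventEmitter();

// Each microservice subscribes to the messages it cares about and reacts
// independently; no central component coordinates the flow.
broker.on("order.created", (order: { orderId: string }) => {
  // inventory service: reserve stock, then publish its own event
  broker.emit("stock.reserved", { orderId: order.orderId });
});

broker.on("stock.reserved", (msg: { orderId: string }) => {
  // payment service: charge the customer once stock is reserved
  broker.emit("payment.charged", { orderId: msg.orderId });
});

// The flow starts by publishing a message, not by calling a service directly
broker.emit("order.created", { orderId: "123" });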

It has some advantages over the orchestrator architecture, but it comes with other problems. Let’s dive into them:

Pros

  • Loosely coupled services
  • Isolated, independent packages of services
  • No single point of failure

Cons

  • Monitoring and observability are quite complex
  • Difficult request flow control (timeouts, retries, errors)
  • Hard to keep end-to-end knowledge of the system (where does the business logic live?)

Benefits and tradeoffs of both approaches

A typical tradeoff of the orchestrator approach is that you end up with tightly coupled services; on this topic, the choreography approach is the clear winner.

The single point of failure vs. no single point of failure would score another point for the choreography architecture, yet it is something that can be mitigated with the underlying cloud architecture and some good programming skills. So, no clear winner here.

Monitoring, observability, and flow control are big benefits of the orchestrator approach, so it is the clear winner there. Something as simple as setting a timeout for a request is a pain to do in the choreography architecture, and don’t try to find out where your system is failing in a choreography architecture: it is a nightmare unless you have spent a lot of time properly monitoring the system.

However, there are other benefits and tradeoffs that are almost never mentioned…
In an orchestrator architecture, the orchestrator itself contains the business logic. It is there. You can read the actual code. On the other hand, in the choreography architecture, you have to read a document or look at a diagram, because only a few people really know the whole end-to-end architecture. This becomes an issue when, for example, someone leaves the company; understanding complex end-to-end choreographies can be a real problem.

The choreography approach is often a popular choice among “cloud-native architects”. This is because most cloud providers have some really good event/messaging systems, and there are a lot of tools with buzzword names that support it. However, for developers (yes, those who will actually write the code), thinking about a system asynchronously is quite difficult and requires an important mind shift.

The hybrid approach

With all these benefits and tradeoffs of both architectures, there is no clear standard path to choose. Both have their own pros and cons, and the tradeoffs of each approach are different. But if you look closer, they kind of complement each other. If we could have a mix of the benefits from the orchestrator approach and the choreography approach, it would be ideal!

Enter the hybrid approach. There are many ways you can implement a mix of both architectures, but let’s focus on the following example:

[Diagram: Hybrid architecture]

In the hybrid architecture, we combine the two things: an orchestrator and an event tracker. The requests are received by both, and the orchestrator starts executing the business logic while keeping track of the request in sync with the event tracker. This allows microservices B, C, and D to do decoupled things based on the event subscription/status, while the orchestrator still orchestrates the whole flow and directly controls microservices E and F.

With this mix, we can have tightly coupled services when sequentiality is needed, and loosely coupled services when we need to do parallel things “in the background”.
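A very rough TypeScript sketch of this idea, again with made-up service names and Node’s EventEmitter standing in for the event tracker/broker:

import { EventEmitter } from "events";

// Stand-in for the event tracker / message broker
const eventTracker = new EventEmitter();

// Loosely coupled services (B, C, D in the diagram) react to published events in parallel
eventTracker.on("order.created", (order: { orderId: string }) => {
  // e.g. send a notification, update analytics, warm a cache...
  console.log("background work for", order.orderId);
});

// The orchestrator keeps the sequential, business-critical part under control
async function orchestrateOrder(order: { orderId: string }) {
  // Publish the event so the decoupled services can do their thing "in the background"...
  eventTracker.emit("order.created", order);

  // ...while the orchestrator directly calls the services that must run in order (E, F)
  await fetch("http://payment-service/charges", { method: "POST", body: JSON.stringify(order) });
  await fetch("http://shipping-service/shipments", { method: "POST", body: JSON.stringify(order) });
}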

When to use each one?

The orchestrator approach is good when you have a business-critical sequential flow. There, the orchestrator can handle the “what if” cases, for example, when one microservice needs to return successfully before a second microservice is called.

The choreography approach is good when you need to do a lot of things in parallel; it allows for faster processing times and it scales easily. For example, job processing in the background.

The hybrid approach is more flexible: here you can have parallel asynchronous processing while still keeping track of the business logic where it is needed. For example, job processing in the background, but only after a logical sequence of services has completed.

Conclusion

Whatever approach you decide to follow, take it as a guide, but don’t be limited by it.
There are perfectly fitting use cases for each approach, and none is strictly better than the others; I prefer the hybrid approach because of its flexibility.

Handling batch operations with REST APIs

So, you created your REST API following the best practices, named your endpoints accordingly, used the correct HTTP verbs, and everything is working well.

For example, you create users by making a POST /users call, get a list of them using GET /users, or get a single one by doing GET /users/:userId.

Awesome!

Developers are happy, customers using the API are happy, what a beautiful world!

Until, someone comes and says: “Hey, I need to import 10000 users”. Ouch.

Your first thought might be, “well… just do a for loop and make a POST for each one of them, I don’t care if it takes half an hour”

That may be a solution in most cases because, as I said, you built your API following the best practices and your underlying cloud infrastructure is horizontally scalable: you can create as many instances as your credit card allows, with almost unlimited computing resources like CPU or RAM.
But there is something that doesn’t scale as well: networking.
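That naive approach could look something like the following sketch (the endpoint is hypothetical): 10000 users means 10000 separate network calls.

// Naive approach: one POST per user, thousands of sequential network calls
async function importUsers(users: { username: string; password: string }[]) {
  for (const user of users) {
    await fetch("https://api.example.com/users", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(user),
    });
  }
}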

Why is networking an issue?

Networking, or rather the number of calls you need to make, is a bottleneck: each network call needs to negotiate a complicated protocol like TCP and find its way through an unreliable global network of routers and switches.

Some clients may face the issue of having a hard limit on outbound connections. (As I did, for an API I built myself, which triggered this post.)
Usually, an outbound connection from within a system uses SNAT, which means one TCP port needs to be used per request. Poorly built or complex systems may have a really low number of available (allocatable) TCP ports.

How can we fix it?

The solution to this networking issue is to have some way of sending multiple items in a single call. That way, we can make fewer requests with more data, as opposed to making a single request per user.

But this means we need to make changes to our beautifully designed API. Not only that, it also means you need to start asking yourself a lot of new questions. For example, what happens if I (as an API) receive an array of users, and processing one of them fails due to the lack of a required property? What should the response code be? Do I return 200? 400? Do I return an array of response codes?

That goes completely against the best practices we followed. So we need to find a better way.

What options do we have?

We have several options to fix this issue. Let’s enumerate them:

  • Change your contract to accept arrays in the body
  • Change your server-side code to accept multiple body formats
  • Rename your endpoints
  • Create a new endpoint for arrays
  • Create an endpoint for receiving batches (for each entity)
  • Create a new batch endpoint

Now, let’s take a deeper dive into each one of the options and talk about why you should or shouldn’t use it:

Change your contract to accept arrays in the body

This might be the first thing that comes to your mind: hey, let’s just accept an array instead of an object, as we do nowadays.

So instead of:

POST /users
{
  "username": "Diego",
  "password": "123456"
}

You would do:

POST /users
[
  {
    "username": "Diego",
    "password": "123456"
  }
]

That sounds nice, but it is an anti-pattern and, worse, if you already have your API published with customers using it, it is a breaking change. It is not backward-compatible; it is a contract change.

You could still use this approach if you change the version, like POST /v2/users.

Change your server-side code to accept multiple body formats

If you created your API using… let’s say node, express, and swagger/openAPI, you might be tempted to fork the swagger library and modify its code.

This is the worst option of all (I believe). It means you need to change the libraries you used to create the API and make them “smarter” so they don’t explode and instead route the traffic differently if an array is received instead of an object.

This is, again, an anti-pattern. DON’T DO THIS PLEASE.

Rename your endpoints

What if, instead of users, we rename it to user and then make the users endpoint accept an array and user accept a single object…

This is another anti-pattern; entity names should be plural. (Not to mention that this is also a breaking change.)

Create a new endpoint for arrays

Ok, now things are taking shape… what if we do POST /usersArray?

It is a new endpoint, and at first glance it is not an anti-pattern, but it looks fishy. I wouldn’t recommend this approach since it can become inconsistent quickly. Although it may be a good “quick fix”, it is not a long-term solution.

Create an endpoint for receiving batches (for each entity)

We can do POST /users/batch

Hey, this kind of looks nice. You are exposing a new endpoint for the POST method, and you will need to add logic in your controller to run a batch job over the received array, but it is not an anti-pattern and it is one of the recommended ways to go.
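For example, a minimal Express handler for such an endpoint could look like the sketch below; it assumes the body is a plain array of users, and createUser is a placeholder for your existing creation logic.

import express from "express";

const app = express();
app.use(express.json());

// Placeholder for your existing user-creation logic (database call, etc.)
async function createUser(user: { username: string; password?: string }) {
  return { userId: "generated-id", username: user.username };
}

// POST /users/batch receives an array of users and processes them in one call
app.post("/users/batch", async (req, res) => {
  const users = req.body as { username: string; password?: string }[];
  const results: { responseCode: string; responseBody: unknown }[] = [];

  for (const user of users) {
    if (!user.password) {
      results.push({ responseCode: "400", responseBody: { error: "Missing password" } });
      continue;
    }
    results.push({ responseCode: "200", responseBody: await createUser(user) });
  }

  res.status(200).json(results);
});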

Create a new batch endpoint

This takes the previous approach to the next level, providing a good long-term re-usable solution.

Instead of POST /users/batch you can do POST /batch/users

This slight order change means a huge difference in the backend. If your API is built from microservices, it means a completely new batch microservice. If not, it means a completely new controller, like the users one, but called batch.

The purpose of this controller is to run batch jobs against the API endpoints.

So we can do:

POST /batch/users
[
  {
    "method": "POST",
    "body": {
      "username": "Diego",
      "password": "123456"
    }
  },
  {
    "method": "POST",
    "body": {
      "username": "Diego2"
    }
  }
]

And the response code of the batch endpoint will almost always be 200, but the response body can contain an array with each individual response code:

[
  {
    "responseCode": "200",
    "responseBody": {
      "userId": "12839",
      "username": "Diego"
    }
  },
  {
    "responseCode": "400",
    "responseBody": {
      "error": "Missing password"
    }
  }
]
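A minimal Express sketch of such a batch controller is shown below. The handlers map is a simplification: a real implementation might dispatch to your existing controllers or make internal calls to the actual endpoints.

import express from "express";

const app = express();
app.use(express.json());

// Simplified dispatch table: "METHOD entity" -> handler (stand-in for your real controllers)
const handlers: Record<string, (body: any) => Promise<{ responseCode: string; responseBody: unknown }>> = {
  "POST users": async (body) => {
    if (!body.password) {
      return { responseCode: "400", responseBody: { error: "Missing password" } };
    }
    return { responseCode: "200", responseBody: { userId: "12839", username: body.username } };
  },
};

// POST /batch/:entity runs every item of the array against the matching handler
app.post("/batch/:entity", async (req, res) => {
  const items = req.body as { method: string; body: any }[];
  const results: { responseCode: string; responseBody: unknown }[] = [];

  for (const item of items) {
    const handler = handlers[`${item.method} ${req.params.entity}`];
    if (!handler) {
      results.push({ responseCode: "404", responseBody: { error: "Unknown operation" } });
      continue;
    }
    results.push(await handler(item.body));
  }

  // The batch call itself succeeds; each individual result carries its own code
  res.status(200).json(results);
});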

You can take this even further by allowing async jobs: add an ID to each batch job, and then you can query that ID to get the results.

Conclusion

Of all the options, only three are not anti-patterns: you can change your contract and accept arrays in a newer version of your API if you are willing to accept the risk of an inconsistent experience, or you can use either of the last two options to build a batch endpoint.

I tend to choose the last one (building a batch microservice/controller) as the “best” option, but it really depends on your API, the business context, and some other factors.

Microservices contract and versioning

What is it? Why is it important?

This is a series of posts I’m doing about designing microservices. Stay tuned; I will link them up when they are all ready!

When you are designing a microservices architecture, whether you are using a REST or messaging approach for communication between microservices, you have to design the APIs/messages and how the microservices will interact with each other.

One of the most important aspects of microservices architectures is the ability to work on and deploy a microservice completely independently of the others. To achieve this, each microservice must provide a well-defined, versioned contract.

The microservices contract

A microservice contract is an agreement between the service and its clients. The main goal is that you can make changes to the service without affecting the clients, regardless of whether the clients are aware of those changes or not.

Even if you spent a lot of time designing the initial contract, the API will, for certain, need to change over time.

When the time comes to update an API, it is important to understand the difference between breaking and non-breaking changes, when a major release is required, and when to retire an old version.

When the changes are small, for example adding a new parameter to the API, and that parameter isn’t business-critical, the clients should be able to keep consuming the API the old way, without sending or expecting to receive that parameter, and the server should fill in the blanks with default values.
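A small TypeScript sketch of that idea, with a made-up optional locale parameter added to the user creation payload:

// v1 clients send { username, password }; a later revision adds an optional "locale".
// Old clients keep working unchanged because the server fills in the default.
interface CreateUserRequest {
  username: string;
  password: string;
  locale?: string; // new, non-business-critical, optional parameter
}

function normalizeRequest(body: CreateUserRequest): Required<CreateUserRequest> {
  return {
    ...body,
    locale: body.locale ?? "en-US", // server fills the blank with a default value
  };
}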

However, if you are making a major, backward-incompatible change to the API, you will need to maintain the old version for some time, because you as a service cannot force your clients to update immediately.
If you are using a REST approach, one way is to add a version number in the path, for example /app/v1/service and /app/v2/service. This way you can have two or more versions of your microservice available.
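With Express, for instance, keeping both versions available could be as simple as mounting two routers side by side (the router contents here are illustrative):

import express from "express";

const app = express();

const v1Router = express.Router();
v1Router.post("/service", (req, res) => res.json({ version: 1 }));

const v2Router = express.Router();
v2Router.post("/service", (req, res) => res.json({ version: 2 })); // breaking changes live here

// Both versions stay available while clients migrate
app.use("/app/v1", v1Router);
app.use("/app/v2", v2Router);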

That is key to understand: if you are not making a breaking change, there is no need for a new version of your contract.