Overview of the circuit breaker in Hystrix

Libraries provided by Netflix usually look simple, but once you dive in, you realize they are quite complicated. In this article, I want to explain the behavior and usage of the circuit breaker pattern as implemented in Hystrix.

Wikipedia defines it as follows: "Circuit breaker is used to detect failures and encapsulates logic of preventing a failure to reoccur constantly (during maintenance, temporary external system failure or unexpected system difficulties)." What does that mean?
For example, think of the circuit as a connection between two services. It operates normally (that is, stays closed) while the connection is stable and the target service is healthy. We can execute our requests (say 100 per second) without any problems.
But when the target service is down (which we learn after a few unsuccessful attempts), it no longer makes sense to keep trying 100 times per second. So we open the circuit and serve responses from a predefined fallback, because we know that the service won't reply.

How is it implemented in Hystrix?

We start with a closed circuit. All requests are processed, and Hystrix gathers metrics: the number of processed requests, their execution time, and their final status. As long as everything is up and running and there are no network issues, the circuit breaker (CB) stays closed.
If we execute fewer than a minimum number of requests (circuitBreakerRequestVolumeThreshold property, default 20) in a given time window (metricsRollingStatisticalWindowInMilliseconds property, default 10 seconds), the status will not change, no matter how many of them fail. If the minimum is met and the percentage of failed requests reaches the error threshold (circuitBreakerErrorThresholdPercentage, default 50%), Hystrix opens the CB. From then on, the next request is redirected straight to the fallback.
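The decision described above can be sketched in a few lines of plain Java. This is a simplified model, not Hystrix's own code: the class and method names are made up for illustration, the two constants are the defaults quoted in the text, and I assume that hitting the threshold exactly is enough to open the circuit.

```java
// Simplified model of the "should the circuit open?" decision.
// TripDecisionSketch and shouldOpen are hypothetical names, not part of Hystrix.
final class TripDecisionSketch {
    // Defaults quoted in the article.
    static final int REQUEST_VOLUME_THRESHOLD = 20;
    static final int ERROR_THRESHOLD_PERCENTAGE = 50;

    /** Decides for the counts collected in the current rolling window. */
    static boolean shouldOpen(int totalRequests, int failedRequests) {
        if (totalRequests < REQUEST_VOLUME_THRESHOLD) {
            return false; // too little traffic to decide anything
        }
        int errorPercentage = failedRequests * 100 / totalRequests;
        // Assumption of this sketch: reaching the threshold exactly opens the circuit.
        return errorPercentage >= ERROR_THRESHOLD_PERCENTAGE;
    }

    public static void main(String[] args) {
        System.out.println(shouldOpen(10, 10)); // false: below the volume threshold
        System.out.println(shouldOpen(20, 5));  // false: only 25% failed
        System.out.println(shouldOpen(20, 10)); // true: 50% failed
    }
}
```

Note the first case: ten failures out of ten requests still leave the circuit closed, because the window has not seen enough traffic.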

But how do we know when the service is back online? Once a period called the 'sleep window' (circuitBreakerSleepWindowInMilliseconds, default 5 seconds) has passed since the circuit opened, Hystrix lets a single request execute normally. Launching that request changes the CB status to half-open. Depending on the result of this request, the CB status changes to:
- closed, if the request finished successfully,
- open, if the request failed.
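
The transitions between closed, open and half-open can be modelled as a tiny state machine. The sketch below is my own simplification, not the Hystrix implementation: all names are hypothetical, time is passed in explicitly so the behavior is easy to follow, and the class is not thread-safe.

```java
// Simplified model of the closed / open / half-open cycle.
// SleepWindowBreakerSketch is a hypothetical class, not part of Hystrix.
final class SleepWindowBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final long sleepWindowMillis;
    private State state = State.CLOSED;
    private long openedAtMillis;

    SleepWindowBreakerSketch(long sleepWindowMillis) {
        this.sleepWindowMillis = sleepWindowMillis;
    }

    State state() {
        return state;
    }

    /** Called when the failure statistics say the circuit should open. */
    void trip(long nowMillis) {
        state = State.OPEN;
        openedAtMillis = nowMillis;
    }

    /** Returns true if the real command may run, false if the fallback must be used. */
    boolean allowRequest(long nowMillis) {
        if (state == State.CLOSED) {
            return true;
        }
        if (state == State.OPEN && nowMillis - openedAtMillis >= sleepWindowMillis) {
            state = State.HALF_OPEN; // let exactly one trial request through
            return true;
        }
        return false; // still sleeping, or a trial request is already in flight
    }

    /** Reports the outcome of the single trial request. */
    void onTrialResult(boolean success, long nowMillis) {
        if (state != State.HALF_OPEN) {
            return; // nothing to decide outside the half-open state
        }
        if (success) {
            state = State.CLOSED;
        } else {
            trip(nowMillis); // back to open; the sleep window starts again
        }
    }

    public static void main(String[] args) {
        SleepWindowBreakerSketch cb = new SleepWindowBreakerSketch(5000);
        cb.trip(0);
        System.out.println(cb.allowRequest(1000)); // false: inside the sleep window
        System.out.println(cb.allowRequest(5000)); // true: the trial request
        cb.onTrialResult(true, 5100);
        System.out.println(cb.state());            // CLOSED
    }
}
```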

Now, to be fully correct, I need to clarify one thing. Strictly speaking, Hystrix has no notion of a "request." It operates on "commands," which of course can be implemented to execute a RESTful request.
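
To make the idea of a command more concrete, here is a stripped-down stand-in written in plain Java. It is not the real com.netflix.hystrix.HystrixCommand class: the method names run(), getFallback() and execute() mirror the ones Hystrix uses, but both classes below are hypothetical, and the "remote call" is faked with a boolean flag.

```java
import java.io.IOException;

// Simplified stand-in for the command idea; NOT com.netflix.hystrix.HystrixCommand.
abstract class SketchCommand<T> {
    /** The protected work, e.g. an HTTP call to another service. */
    protected abstract T run() throws Exception;

    /** The value served when run() fails. */
    protected abstract T getFallback();

    /** Runs the command and falls back on any failure. */
    final T execute() {
        try {
            return run();
        } catch (Exception e) {
            return getFallback();
        }
    }
}

// Hypothetical command wrapping a "GET /users" call.
final class GetUsersSketch extends SketchCommand<String> {
    private final boolean serviceUp; // stands in for a real remote service

    GetUsersSketch(boolean serviceUp) {
        this.serviceUp = serviceUp;
    }

    @Override
    protected String run() throws Exception {
        if (!serviceUp) {
            throw new IOException("user service unavailable");
        }
        return "alice,bob";
    }

    @Override
    protected String getFallback() {
        return "no users (fallback)";
    }

    public static void main(String[] args) {
        System.out.println(new GetUsersSketch(true).execute());  // alice,bob
        System.out.println(new GetUsersSketch(false).execute()); // no users (fallback)
    }
}
```

The real library adds the circuit breaker, timeouts and thread isolation around this basic shape; the sketch only shows the split between the protected work and the fallback.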


But when I decide to use Hystrix in my application, does that mean there will be just one circuit breaker? No! There are usually several instances, and the one used for a given command is determined by a key, which is the name of the command. So is there a naming convention? Not really, because it depends on what exactly we want to achieve, that is, on how we want to group our commands.

If, for example, you're afraid that a particular REST endpoint can fail regardless of the physical host we hit, let's say "GET /users", it's a good idea to name the command "getUsers". Then all calls to userService1, userService2, ..., userServiceN will be protected by the same circuit breaker.

When it matters more which host you are talking to (for example, because we use client-side load balancing with sticky sessions), the above name won't help. Then it's better to name the command after the host: "userService1", "userService2", etc. That way an outage of one instance won't affect calls to the others. There is just one small problem: command properties (like a timeout) are set per command name, so all requests to userService1 must share the same configuration. Some endpoints often take longer to finish than others, so this looks like a serious limitation.

There is a third possibility: merge both approaches and use a name like "userService1#getUsers". It solves the problem of different settings, but it is less effective than the second solution, because every circuit for a given host has to open and close independently.
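
The effect of the three naming strategies is easiest to see with a small registry that hands out one breaker per key. This is only an illustration in plain Java: the registry, the Breaker placeholder and the three helper methods are hypothetical and say nothing about how Hystrix stores its breakers internally.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry that hands out one breaker per command name.
final class BreakerRegistrySketch {
    /** Placeholder for the per-key breaker state. */
    static final class Breaker {
        final String key;

        Breaker(String key) {
            this.key = key;
        }
    }

    private final Map<String, Breaker> breakers = new ConcurrentHashMap<>();

    /** Returns the breaker for the key, creating it on first use. */
    Breaker forKey(String key) {
        return breakers.computeIfAbsent(key, Breaker::new);
    }

    // The three naming strategies discussed in the article.
    static String byEndpoint(String host, String endpoint) {
        return endpoint; // one breaker per endpoint, shared by all hosts
    }

    static String byHost(String host, String endpoint) {
        return host; // one breaker per host, shared by all endpoints
    }

    static String byHostAndEndpoint(String host, String endpoint) {
        return host + "#" + endpoint; // one breaker per host and endpoint pair
    }

    public static void main(String[] args) {
        BreakerRegistrySketch registry = new BreakerRegistrySketch();
        Breaker a = registry.forKey(byEndpoint("userService1", "getUsers"));
        Breaker b = registry.forKey(byEndpoint("userService2", "getUsers"));
        System.out.println(a == b); // true: both hosts share the "getUsers" breaker

        Breaker c = registry.forKey(byHost("userService1", "getUsers"));
        Breaker d = registry.forKey(byHost("userService2", "getUsers"));
        System.out.println(c == d); // false: each host gets its own breaker
    }
}
```

With the combined key, two endpoints on the same host end up with two separate breakers, which is exactly why an outage of that host has to be detected once per endpoint.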

The best solution would be to use a custom key for circuit breaker resolution; unfortunately, that is still not possible.

