How Cloud Resiliency Is Organized in Microsoft Azure
First, let’s get familiar with a few key terms, starting with regions. A region is a logical name given to a group of data centers that are located relatively close to each other. Regions are organized under geographies; a geography is a boundary defined by political, regulatory, and location factors.
Within a region that supports them, there are multiple zones. Each zone has at least one data center, along with its own dedicated power supply and network service. Zones contain multiple fault domains and update domains. A fault domain identifies a group of hardware that shares common cooling, power, and network service; we can think of a fault domain as a server rack. An update domain identifies a group of hardware that receives fabric-level updates at the same time. Azure ensures that two or more update domains are never updated at the same time, and the same policy applies at the zone level.
With these commonly used terms in place, let’s discuss a few of the resiliency options in Azure.
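To keep the hierarchy straight, it can help to picture it as nested data. The following is only a sketch in Python: the class names and the example values (such as `example-region` and `dc-a`) are made up for illustration and are not real Azure identifiers or an Azure API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Zone:
    name: str
    datacenters: List[str]  # a zone has at least one data center

    def __post_init__(self) -> None:
        if not self.datacenters:
            raise ValueError("a zone needs at least one data center")


@dataclass
class Region:
    name: str
    zones: List[Zone] = field(default_factory=list)


@dataclass
class Geography:
    name: str
    regions: List[Region] = field(default_factory=list)


def count_datacenters(geography: Geography) -> int:
    """Count every data center under a geography, across all regions and zones."""
    return sum(
        len(zone.datacenters)
        for region in geography.regions
        for zone in region.zones
    )


# Hypothetical example values, not real Azure names.
geo = Geography(
    "Example Geography",
    [
        Region(
            "example-region",
            [
                Zone("1", ["dc-a"]),
                Zone("2", ["dc-b", "dc-c"]),
                Zone("3", ["dc-d"]),
            ],
        )
    ],
)

print(count_datacenters(geo))
```

Fault domains and update domains would sit one level further down, inside each zone; they are left out here to keep the sketch short.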
Let’s start with availability sets. An availability set is a logical group that spans a few fault domains and update domains. The recommended setting is to include 2-3 fault domains and 5-20 update domains within an availability set. Once the set is created, we can place our services in it. Users cannot select which fault or update domain a service is placed in; Azure handles that part, which is why Azure highly recommends creating separate availability sets for different tasks. With two or more virtual machines in an availability set, Azure's SLA guarantees 99.95% availability.
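To make the fault domain and update domain idea concrete, here is a minimal sketch in Python. It assumes a simple round-robin placement, which is a simplification for illustration only; Azure's actual placement logic is internal and not something users control. The function names `assign_domains` and `survivors_if_fault_domain_fails` are hypothetical.

```python
from typing import List, Tuple


def assign_domains(
    vm_count: int, fault_domains: int = 2, update_domains: int = 5
) -> List[Tuple[int, int]]:
    """Return a (fault_domain, update_domain) pair for each VM.

    Assumption: VMs are spread round-robin over both kinds of domain.
    """
    return [(i % fault_domains, i % update_domains) for i in range(vm_count)]


def survivors_if_fault_domain_fails(
    placements: List[Tuple[int, int]], failed_fault_domain: int
) -> int:
    """Count the VMs that keep running when one fault domain goes down."""
    return sum(1 for fd, _ in placements if fd != failed_fault_domain)


placements = assign_domains(6)
for vm_index, (fd, ud) in enumerate(placements):
    print(f"VM {vm_index}: fault domain {fd}, update domain {ud}")

print(survivors_if_fault_domain_fails(placements, failed_fault_domain=0))
```

With six VMs, two fault domains, and five update domains, losing one rack (fault domain) still leaves half of the VMs running, and a fabric update that touches one update domain at a time takes out at most two of them.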
Next, let’s talk about availability zones. This concept is all about placing your services and resources in multiple zones, so that your service can keep operating as normal in the event of a zone-level disruption. Azure's SLA guarantees 99.99% availability when virtual machines are deployed across two or more availability zones in the same region. Regions that support availability zones contain at least three zones, and under current configurations a zonal service is deployed to one of those three. Customers select the availability zone themselves, so when architecting your applications you have to be really careful about where you place each service.
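An availability percentage is easier to compare once it is turned into allowed downtime. The short Python sketch below does that arithmetic, using Azure's published virtual machine SLA figures of 99.95% (two or more VMs in an availability set) and 99.99% (VMs spread across two or more availability zones). The function name `max_downtime_minutes_per_year` is made up for this example, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def max_downtime_minutes_per_year(availability_percent: float) -> float:
    """Convert an availability percentage into allowed downtime per year."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, sla in [("availability set", 99.95), ("availability zones", 99.99)]:
    minutes = max_downtime_minutes_per_year(sla)
    print(f"{label}: {sla}% allows about {minutes:.1f} minutes of downtime per year")
```

Moving from 99.95% to 99.99% cuts the allowed downtime by a factor of five, from roughly four and a half hours to under an hour per year.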
Next, we have paired regions. If you want resiliency at the regional level, this option is for you. A whole Azure region could fail mainly because of natural disasters or war, but this type of situation occurs rarely, and normally you have enough time to react before an event of that scale happens.
These are the commonly used resiliency options. However, there are many more that we can use with Azure, and I hope to discuss them in future posts.


Nice write-up. As per my understanding, in cloud computing, applications involve many types of resources and depend on internal and external services. Issues can arise from failures in those services or from defective software. So resiliency should also cover how these failures can be detected and recovered from. Has Azure come up with a mechanism to address this context too?
Yes Dulanga, there are various offerings in Azure to address the different situations you mentioned. For example, we can use Azure Application Insights to identify the behavior of applications running in the cloud. Azure Monitor is another service we can use to monitor infrastructure and applications running on Azure, and there is Security Center to monitor security-related activities. Azure also offers best practices and recommendations through Azure Advisor.
As a leading cloud service provider, it is good that Azure has addressed the issues related to availability, ensuring high availability of its services.
Yes, it is a must. Because the cloud business is so competitive, all cloud providers need to come up with their own availability strategies to stay in business.
It's very informative. Keep it up!
Thanks for your supportive feedback :)