24 May SERVICE LEVEL OBJECTIVES
Before setting a Service Level Objective (SLO), we first need to understand the trade-off between risk and availability.
You might expect your service provider to build a 100% reliable service. But beyond a certain level of reliability, say 99.9%, customers are largely indifferent to further increases, while pushing reliability from 99.9% to 99.99% can multiply your operational costs. On top of that, you will slow down the rollout of new features and deployments.
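To make the gap between 99.9% and 99.99% concrete, here is a minimal Python sketch that converts an availability target into the downtime it permits. The function name `allowed_downtime` and the 30-day window are assumptions chosen for illustration, not part of any standard tooling.

```python
from datetime import timedelta


def allowed_downtime(availability_pct: float, window: timedelta) -> timedelta:
    """Return the downtime an availability target permits over a window.

    availability_pct is a percentage such as 99.9 (hypothetical helper).
    """
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    # The unavailable fraction of the window is whatever the target leaves over.
    return window * (1 - availability_pct / 100)


if __name__ == "__main__":
    month = timedelta(days=30)  # assumed evaluation window
    for target in (99.9, 99.99):
        print(f"{target}% over 30 days allows {allowed_downtime(target, month)} of downtime")
```

Over 30 days, 99.9% leaves roughly 43 minutes of downtime, while 99.99% leaves only about 4 minutes. That tenfold reduction is what drives the extra cost.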
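Why is it that users are indifferent to this increase in reliability?
Because beyond that point, the user's experience depends mostly on factors outside the service, such as the user's own network and device, which are often less reliable than the service itself.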
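A rough way to see this is to treat the user's network and the service as components in series, so the availability the user perceives is the product of the two. The sketch below assumes the components fail independently and uses a made-up figure of 99% for the user's network. Both are simplifying assumptions.

```python
import math


def end_to_end_availability(*components: float) -> float:
    """Availability seen by the user when every component must be up.

    Each argument is a fraction between 0 and 1. Assumes independent failures.
    """
    for value in components:
        if not 0 <= value <= 1:
            raise ValueError("each availability must be between 0 and 1")
    return math.prod(components)


if __name__ == "__main__":
    user_network = 0.99  # assumed figure, for illustration only
    for service in (0.999, 0.9999):
        perceived = end_to_end_availability(user_network, service)
        print(f"service {service:.2%} -> user perceives {perceived:.4%}")
```

Under these assumptions, raising the service from 99.9% to 99.99% moves the perceived availability by less than a tenth of a percentage point, because the user's own network dominates the result.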
So, if you are a product manager, it is important to strike the right balance between risk and availability. You should not hold back new features that would make your customers happy just to avoid the risk of unavailability.
How to strike this balance between risk and availability?
This is where the collaboration between product managers and site reliability engineers (SREs) comes into the picture.
Next, we explain the terms that help us predict the availability of a service and plan our new releases.
What is a Service Level Indicator (SLI)?
An SLI is a quantitative measure of some aspect of the service, such as availability or latency, and it is what we use to check compliance with the SLO. SLIs must be based on key performance indicators that matter to the user. To stay within the objective, the measured SLI has to meet or exceed the SLO target. We evaluate whether the system has been running within the SLO by aggregating the SLI over a time window such as the past week or month. If it falls short of the SLO, we have to act, for example by adjusting the load on the workers or running a new instance of the service.
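As a minimal sketch of that evaluation, the Python below computes an availability SLI as the ratio of good events to total events in a window and compares it with an SLO target. The names `WindowCounts`, `sli`, and `meets_slo`, and the sample numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowCounts:
    """Event counts aggregated over one evaluation window (e.g. the past week)."""
    good: int   # requests or probes that succeeded
    total: int  # all requests or probes in the window


def sli(counts: WindowCounts) -> float:
    """Availability SLI: the fraction of events in the window that were good."""
    if counts.total <= 0:
        raise ValueError("the window must contain at least one event")
    return counts.good / counts.total


def meets_slo(counts: WindowCounts, slo_target: float) -> bool:
    """True when the measured SLI meets or exceeds the SLO target."""
    return sli(counts) >= slo_target


if __name__ == "__main__":
    last_week = WindowCounts(good=999_500, total=1_000_000)  # sample data
    print(f"SLI = {sli(last_week):.4%}, within 99.9% SLO: {meets_slo(last_week, 0.999)}")
```

With the sample counts, the SLI is 99.95%, which is within a 99.9% objective. A window with only 998,000 good events out of 1,000,000 would fall short and call for corrective action.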
What is a Service Level Agreement (SLA)?
A service level agreement is a contract between a service provider and its users, based on performance metrics. An SLA comprises both the performance metrics and the consequences of failing to meet them, typically some form of penalty such as a refund.
What is a Service Level Objective (SLO)?
An SLO is a numerical target for the reliability of the service, set on metrics such as response time and uptime. SLOs may sound similar to SLAs, but an SLO only helps DevOps teams state their goals clearly; it is not a legal contract between providers and users.
It is important to understand that when defining an SLO, the target value for the SLI should be the lowest reliability that is still acceptable to users.
Why is this value so important? As noted earlier, if you aim for higher reliability than you need, you will not only incur higher operational costs but also slow down development, which will make your customers unhappy.
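One common way to put this trade-off into practice is to treat the gap between 100% and the SLO target as a budget of tolerated failures (often called an error budget) and to check how much of it is left before shipping risky changes. The sketch below is an illustration under that assumption; the function name and the sample counts are hypothetical.

```python
def error_budget_remaining(slo_target: float, good: int, total: int) -> float:
    """Fraction of the window's failure budget that is still unspent.

    1.0 means no failures so far; 0.0 means the budget is exactly used up;
    a negative value means the SLO has been missed for this window.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total <= 0 or not 0 <= good <= total:
        raise ValueError("counts must satisfy 0 <= good <= total and total > 0")
    allowed_bad = (1 - slo_target) * total  # failures the SLO tolerates
    actual_bad = total - good               # failures actually observed
    return 1 - actual_bad / allowed_bad


if __name__ == "__main__":
    remaining = error_budget_remaining(0.999, good=999_500, total=1_000_000)
    print(f"budget remaining: {remaining:.0%}")
    print("ok to release" if remaining > 0 else "hold releases and stabilise")
```

A stricter SLO shrinks the budget, leaving less room for the failures that new releases inevitably cause. This is why the target should be no higher than users actually need.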
Conclusion
SLIs are quantitative measurements, such as the proportion of successful probes in a given period. SLOs set the targets those SLIs must meet, and so determine how much of the time an SLI is allowed to fall short. SLAs are business agreements between users and service providers, with consequences for failing to deliver the promised service.