Setting up the first SLO

10 May 2023

News, Observability

This is the final part of the three-part series “The path to your first SLO”.

In part 1 and part 2 of this series, we discussed the basics of what to observe and how to get the relevant metrics. This time we take a quick look at how to set up a simple service availability monitoring SLO with Nobl9 and SolarWinds Pingdom.

Nobl9 is used to build the SLO dashboards in this example because it supports many data sources, with the option to gather metrics either through the Nobl9 agent or through direct API integration. To save some time, we use SolarWinds Pingdom to run a script and monitor a web URL.

You can set up a free trial of SolarWinds Pingdom as well as a 30-day trial of Nobl9. Together they give you a nice playground for this exercise. Point Pingdom at an important service URL and run a simple availability test once per minute. SolarWinds Pingdom will return uptime as well as service response time metrics for you.
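To make concrete what a per-minute availability test measures, here is a minimal sketch of a single check in Python. It is not how Pingdom is implemented; the names `check_once` and `CheckResult`, the 10-second timeout, and the rule that any 2xx or 3xx status counts as "up" are assumptions made for illustration.

```python
import time
import urllib.request
from typing import Callable, NamedTuple


class CheckResult(NamedTuple):
    up: bool                 # did the endpoint answer successfully?
    response_time_s: float   # wall-clock time of the attempt, in seconds


def check_once(url: str, timeout_s: float = 10.0,
               opener: Callable = urllib.request.urlopen) -> CheckResult:
    """Request the URL once and report availability and response time.

    `opener` is injectable so the function can be exercised without a network.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout_s) as response:
            # Assumption: any 2xx/3xx status counts as "up".
            up = 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError (4xx/5xx) and timeouts are all OSError subclasses.
        up = False
    return CheckResult(up, time.monotonic() - start)
```

Calling `check_once` against your service URL once a minute and storing each result yields the same two series the exercise relies on: an up/down signal and a response time.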

Setting up the data source in Nobl9 to connect to SolarWinds Pingdom is a breeze, so we will not repeat the details here. If you are interested, you can follow this nice tutorial. At the end you get a nice dashboard similar to this.

We set an SLO target of 99% of responses from this slow endpoint completing within 8 seconds over a 1-day rolling window (Satisfactory), and a second, similar SLO target of 99% for a response time within 6 seconds (Optimal). Based on these, we can comfortably commit to an SLA with the end user of response time < 8s for 99% of the time over a 1-day rolling window, while still leaving some room for system downtime or for pushing new releases to production.
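The arithmetic behind these two objectives can be sketched in a few lines of Python. This is only an illustration of the calculation, not Nobl9's implementation. It assumes one check per minute (1,440 checks in a 1-day window), counts a check as good when its response time is at or below the threshold, and uses made-up sample data; the names `LatencySLO` and `evaluate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencySLO:
    name: str
    threshold_s: float  # a check is "good" if it responds within this many seconds
    target: float       # required fraction of good checks, e.g. 0.99


def evaluate(slo: LatencySLO, response_times_s: Sequence[float]) -> dict:
    """Evaluate one SLO over a window of per-check response times (seconds)."""
    total = len(response_times_s)
    if total == 0:
        raise ValueError("window contains no checks")
    good = sum(1 for t in response_times_s if t <= slo.threshold_s)
    sli = good / total
    return {
        "sli": sli,                                  # observed fraction of good checks
        "met": sli >= slo.target,                    # is the objective currently met?
        "bad_checks": total - good,                  # checks that breached the threshold
        "allowed_bad": (1.0 - slo.target) * total,   # error budget, in checks
    }


# Illustrative 1-day window: 1,410 fast checks, 20 slowish ones, 10 very slow ones.
window = [3.0] * 1410 + [7.0] * 20 + [9.0] * 10

satisfactory = evaluate(LatencySLO("Satisfactory", 8.0, 0.99), window)
optimal = evaluate(LatencySLO("Optimal", 6.0, 0.99), window)
```

With a 99% target and per-minute checks, the error budget is about 14 bad minutes per day. In this sample window the Satisfactory objective holds (10 bad checks), while the Optimal objective is breached (30 bad checks), which is the kind of gap that leaves headroom between the SLO you track and the SLA you commit to.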

This just scratches the surface of how to use SLOs for service reliability tracking. You can also build composite SLOs, set alerts, or change the time windows of an SLO. Of course, you can build all these dashboards with other tools, but Nobl9 can make your life a bit easier. The whole process can also be set up as SLO as code with Terraform or OpenSLO.

We hope you enjoyed the series “The path to your first SLO”. If you need to revisit your observability practice, feel free to reach out and talk to us. Our team of consultants at Vsceptre can help you with different aspects of your observability journey, from monitoring, log aggregation, DevOps integration, and SRE practice to data consolidation.
New to SLO?
#SLOconf is a free, virtual event focused on #SLOs! 🔥
Whether you work in SRE, DevOps, or Ops, or as a developer, SLOconf is the perfect platform to share insights and ideas on the latest trends and developments in SRE/SLO.
Vsceptre is a sponsor at SLOconf 2023, hosted by Nobl9! 📢
For more details & speaker lineup, register here: 👇
www.sloconf.com
