How to obtain the metrics for SLO tracking

5 May 2023

News, Observability

This is part 2 of the three-part series “The path to your first SLO”.

Once you have a clear understanding of which metrics your SLOs need, the next question is how to collect them. Metrics can generally be obtained in the following ways.

Metrics from commercial APM/observability tools
Commercial monitoring tools offer a one-stop shop for gathering infrastructure metrics from machine agents and cluster agents; on public cloud platforms, these metrics come from services such as AWS CloudWatch or Azure Monitor. For application metrics, the same APM agent can usually generate application-specific metrics derived from logs or traces. These typically feed the four golden signals we discussed in the previous post.
You can build dashboards on these metrics directly in the commercial APM tool, or retrieve the metrics through its APIs for external consumption.

Metrics from siloed data sources
These usually come from systems in your organization that have no full-featured APM tool in place. Today it is common for applications to expose Prometheus-compatible metrics (a de facto standard format that lets Prometheus scrape metrics for monitoring) to cover simple monitoring use cases. Grafana then fills the gap for time series visualization and alerting.
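To make the idea of a Prometheus-compatible endpoint concrete, here is a minimal sketch that uses only the Python standard library to serve a counter in the Prometheus text exposition format. The metric name http_requests_total, the status label, the in-memory REQUESTS dictionary and port 8000 are assumptions chosen for illustration; a real service would more likely use an official Prometheus client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory counter: HTTP status code -> number of requests seen.
REQUESTS = {"200": 0, "500": 0}


def record_request(status: str) -> None:
    """Increment the counter for one handled request."""
    REQUESTS[status] = REQUESTS.get(status, 0) + 1


def render_metrics() -> str:
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for status, count in sorted(REQUESTS.items()):
        lines.append(f'http_requests_total{{status="{status}"}} {count}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current counters on /metrics for a scraper to collect."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Start the metrics endpoint (blocks until interrupted)."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint; Prometheus can then be configured to scrape /metrics on that port, and Grafana can chart the resulting time series.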

Metrics from E2E testing and synthetic tools
Synthetic testing tools can provide valuable insight into service availability, response time and customer experience. These metrics can also serve as a data source for SLOs. Commercial solutions include ThousandEyes and SolarWinds Pingdom.
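As a rough illustration of how synthetic probe results turn into SLO inputs, the sketch below computes availability and a latency percentile from a list of probe outcomes. The ProbeResult type and its fields are hypothetical and do not reflect the export format of any particular tool; the percentile uses the simple nearest-rank method over successful probes only.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class ProbeResult:
    """One synthetic check (hypothetical shape, not a vendor format)."""
    ok: bool            # did the check succeed?
    latency_ms: float   # measured response time in milliseconds


def availability(results: List[ProbeResult]) -> float:
    """Fraction of probes that succeeded."""
    if not results:
        raise ValueError("no probe results")
    return sum(1 for r in results if r.ok) / len(results)


def latency_percentile(results: List[ProbeResult], pct: float) -> float:
    """Nearest-rank percentile of latency over successful probes."""
    latencies = sorted(r.latency_ms for r in results if r.ok)
    if not latencies:
        raise ValueError("no successful probes")
    rank = max(1, math.ceil(pct / 100 * len(latencies)))
    return latencies[rank - 1]


probes = [
    ProbeResult(True, 100.0),
    ProbeResult(True, 200.0),
    ProbeResult(False, 300.0),
    ProbeResult(True, 400.0),
]
print(availability(probes))            # 0.75
print(latency_percentile(probes, 50))  # 200.0
```

Availability computed this way maps directly onto an availability SLO, while the percentile can back a latency SLO such as "95% of checks complete within a given threshold".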

If you have multiple monitoring tools, it is wise to consolidate all these metrics into a single dashboard for alerting and visualization. Tools such as Grafana and Nobl9 can:
Consolidate metrics from multiple tools into a single dashboard.
Offer pre-built SLO dashboards, or the flexibility to build your own easily.
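To show what consolidation makes possible, here is a minimal sketch that combines good/total event counts from several sources into one availability figure and compares it with an SLO target. The source names, the counts and the 99.5% target are made up for illustration, and simply summing counts across sources is an assumption; tools such as Grafana or Nobl9 have their own ways of defining and combining indicators.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceCounts:
    """Good and total event counts reported by one metric source."""
    source: str   # hypothetical label, e.g. "apm" or "synthetic"
    good: int
    total: int


def combined_sli(counts: List[SourceCounts]) -> float:
    """Overall success ratio across all sources (assumes counts are additive)."""
    total = sum(c.total for c in counts)
    if total == 0:
        raise ValueError("no events recorded")
    return sum(c.good for c in counts) / total


def error_budget_remaining(sli: float, target: float) -> float:
    """Fraction of the error budget left: 1.0 is untouched, 0.0 is exhausted,
    and a negative value means the SLO target has been missed."""
    allowed = 1.0 - target
    if allowed <= 0:
        raise ValueError("target must be below 1.0")
    return 1.0 - (1.0 - sli) / allowed


counts = [
    SourceCounts("apm", good=9990, total=10000),
    SourceCounts("synthetic", good=995, total=1000),
]
sli = combined_sli(counts)
remaining = error_budget_remaining(sli, target=0.995)
print(f"SLI: {sli:.4%}, error budget remaining: {remaining:.1%}")
```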
In summary, the goal is to streamline the process of obtaining metrics from multiple systems so that you can quickly realize the benefits of SLO tracking. In the next article, we will look at a real example of using Nobl9 for a simple service availability SLO.

New to SLO?
#SLOconf is a free, virtual event focused on #SLOs! 🔥
Whether you work in SRE, DevOps or Ops, or you are a developer, SLOconf is the perfect platform to share insights and ideas on the latest trends and developments in SRE and SLOs.
Vsceptre is a sponsor at SLOconf 2023, hosted by Nobl9! 📢
For more details & speaker lineup, register here: 👇
www.sloconf.com
