Monitoring your systems and infrastructure is essential to delivering the results your customers expect. Software development is a very dynamic process, which is why collecting metrics, monitoring, and alerting are indispensable for delivering top-notch products.
What are software development metrics
We bet you think that here, we are going to discuss those standard metrics that everybody is talking about:
- Customer experience;
- Time to market;
- Team satisfaction;
- On-time delivery;
- Software quality.
Certainly, they are important, but we aren’t going to discuss them here. Why? Because you can find information about them almost anywhere. Instead, let us concentrate on less common metrics, the ones that we use at Mad Devs.
Do we use the standard metrics? Of course, we do! But our aim is not just to monitor the team’s performance. We aim at delivering value to our customers (value in our case means the expected results and smoothly working products). And we do it by using rather custom software development metrics.
Business metrics are tracked with information radiators
Not so long ago, developers (or programmers, as we might better call them) were responsible only for the technical part of the project. Later, with the evolution of DevOps culture, production became a priority.
The business part, however, was left without attention for a long time. Many companies ignore it even now.
It doesn’t work like this anymore. We develop products for businesses. Therefore, the earlier we understand the purpose of our client's business and the business value our products bring, the faster we deliver value to our customers.
We use information radiators to track business metrics.
An information radiator is a generic term. It can be a display or a chart that visualizes the flow of work and any bottlenecks or obstacles, and enables everybody to see what the team is working on at the moment.
It sounds pretty generic, doesn’t it? What about checking some examples?
In everyday life, study, or work, many things can be used as an information radiator. For example:
- A board with sticky notes;
- A whiteboard with notes;
- A tool.
And much more. Anything that can be used to collect information and display it in a clear and understandable form can serve as a radiator.
What radiators do we use?
To tell the truth, we use many types of radiators at different stages of work. We use them as a means to collect and display information for further analysis.
If we talk about a radiator as an officially accepted term in the team, then it is a tool. For every project, the tool has different settings and collects and displays different data. When we develop a radiator, we consider the project and business specifics, the data that might help keep the project moving and improving, and anything else that can help the team and the business perform as efficiently as possible. Most often, radiators are presented to the team as messages posted to the general communication channel in Slack or another messenger.
To make the idea clearer, let us have a look at some examples. You will see how different our radiators can be and how they are built.
The first example is the radiator used by our marketing team. It is designed to collect and display data that help our marketers find out whether the current strategies work and which components of our marketing strategy need to be improved.
The marketing radiator helps us monitor, among other things, the website traffic and its fluctuations. This way, we can detect and analyze the factors that influence the traffic and optimize our marketing approaches. It is fun to observe how a single blog post or a social media publication makes the number of website visitors skyrocket or plunge. However, along with having fun, we do serious work based on the data displayed by the radiator. We can adjust the frequency of publications, track the trends, and plan how to repeat the success stories and avoid failures.
Website traffic is not the only metric we can get, though. Some more examples are the following:
- Number of users who visited the website;
- Traffic value fluctuations within a specific period;
- Top countries from where our visitors opened the website;
- Devices from which the website was opened;
- The best pages;
- The best posts, and so on.
Basically, our marketing radiator provides us with all the data needed to keep the website at the top of the search, attract leads, and in general inform the world about us and our activities.
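To illustrate how such a message could be assembled, here is a minimal sketch. It is not our actual radiator: the record keys (`user`, `country`, `device`, `page`) and the function name `build_radiator_message` are assumptions made for this example, and the resulting text would still have to be posted to a messenger by some other means.

```python
from collections import Counter


def build_radiator_message(visits):
    """Summarize raw visit records into a short text message.

    Each visit is a dict with hypothetical keys: 'user', 'country',
    'device', and 'page'.
    """
    unique_users = len({v["user"] for v in visits})
    top_countries = Counter(v["country"] for v in visits).most_common(3)
    devices = Counter(v["device"] for v in visits).most_common()
    top_pages = Counter(v["page"] for v in visits).most_common(3)

    def fmt(pairs):
        # Render [("KG", 2), ("US", 1)] as "KG (2), US (1)".
        return ", ".join(f"{name} ({count})" for name, count in pairs)

    lines = [
        f"Visits: {len(visits)}",
        f"Unique users: {unique_users}",
        f"Top countries: {fmt(top_countries)}",
        f"Devices: {fmt(devices)}",
        f"Best pages: {fmt(top_pages)}",
    ]
    return "\n".join(lines)


# Illustrative data only.
sample_visits = [
    {"user": "u1", "country": "KG", "device": "mobile", "page": "/blog/metrics"},
    {"user": "u2", "country": "US", "device": "desktop", "page": "/blog/metrics"},
    {"user": "u1", "country": "KG", "device": "mobile", "page": "/"},
]
message = build_radiator_message(sample_visits)
print(message)
```

Keeping the summary as plain text makes it easy to send the same message to any channel the team already reads.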
Other radiators? There are many of them!
For example, if a team is working on a payment solution, their radiator will display data relevant to that specific project. It can be:
- Number of people who used a specific payment method;
- Turnover for a specific period;
- Buyer and seller names, if any goods or services are sold.
If the team is working on a solution for the transportation industry, their radiator will look different, too. It may display:
- Number of drivers who used the app;
- Number of passengers who used the app;
- Route metrics, and similar.
The idea behind using a radiator is to get information that shows us whether our efforts are paying off and whether we are moving in the right direction.
Has the team implemented a new feature, and it caused a sudden drop in the number of users? It might mean that the feature doesn’t work as it is supposed to or that something important was missed. The radiator helps us react in time and make the changes needed to improve the situation.
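A check for such a sudden drop could look roughly like the sketch below. The 7-day window and the 30% threshold are arbitrary assumptions chosen for illustration, and `sudden_drop` is a hypothetical helper, not a part of any tool mentioned here.

```python
from statistics import mean


def sudden_drop(daily_users, window=7, threshold=0.3):
    """Return True if the latest value is more than `threshold`
    (as a fraction) below the mean of the preceding `window` values."""
    if len(daily_users) < window + 1:
        return False  # not enough history to compare against
    baseline = mean(daily_users[-(window + 1):-1])
    if baseline == 0:
        return False  # avoid dividing by zero on an empty baseline
    drop = (baseline - daily_users[-1]) / baseline
    return drop > threshold


# Illustrative data: a stable week followed by a sharp fall
# on the day a feature was released.
history = [1000, 1020, 980, 1010, 990, 1005, 995, 600]
print(sudden_drop(history))
```

Comparing against a trailing average rather than the previous day alone makes the check less sensitive to ordinary day-to-day noise.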
While business metrics are important, the project will not move on without technical metrics. There is a plethora of standard metrics and practices for measuring them. We skip the standard examples of technical software development metrics because information about them is available literally everywhere.
Instead, we will concentrate on some practices that we use to detect errors and inconsistencies and fix them while working on a solution. We start with metrics for testing.
Metrics for testing
We use test metrics to assess product quality. As with radiators, these metrics and the requirements for them differ depending on the project. However, we have some standard approaches and tools to ensure we deliver the expected solution.
Our standard practice is to cover all code with tests. That’s why the code coverage metrics are monitored constantly. We can check them all in our code coverage report, which shows:
- Whether all tests have passed or not;
- The total coverage of tests in percentage;
- How many statements in the program have been executed;
- How many branches of the control structures have been executed;
- How many lines of source code have been tested;
- How many functions defined have been called.
This is more than enough to check whether testing was successful or whether the code needs some improvements.
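The percentages in such a report are simple ratios of covered items to total items. The sketch below shows the arithmetic under stated assumptions: the counts are made up, the 80% minimum is an arbitrary example threshold, and treating an empty category as fully covered is a convention chosen for this example.

```python
def coverage_percent(covered, total):
    """Percentage of covered items; an empty category is treated
    as fully covered (an assumption made for this sketch)."""
    return 100.0 if total == 0 else round(100.0 * covered / total, 2)


def summarize_coverage(counts, minimum=80.0):
    """`counts` maps a category name to a (covered, total) pair.

    Returns the percentage per category and the list of categories
    that fall below `minimum`.
    """
    summary = {name: coverage_percent(c, t) for name, (c, t) in counts.items()}
    below = [name for name, pct in summary.items() if pct < minimum]
    return summary, below


# Illustrative numbers only.
counts = {
    "statements": (180, 200),
    "branches": (45, 60),
    "functions": (38, 40),
    "lines": (175, 195),
}
summary, below = summarize_coverage(counts)
print(summary)
print("Below minimum:", below)
```

A check like this can be used to fail a pipeline when any category drops under the agreed minimum.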
Metrics alone might not be sufficient, or why we use logging tools
As you can see, metrics are just raw data about user behaviour, resource usage, or something else, depending on the project. They show us the results. But from them alone, we cannot find out what caused a particular change in a metric's value.
To follow the trail of events and understand what has happened, we use logging tools. They show what has happened and when. So, metrics show us the results, and logging tools show us the sequence of events that has led to those results. In other words, logs show us what has happened behind the scenes. While there are many logging tools, our favorite ones are the ELK stack and DataDog.
ELK is an acronym for a collection of three projects, Elasticsearch, Logstash, and Kibana:
- Elasticsearch is a full-text search and analytics engine.
- Logstash is a log aggregator. It collects data from multiple sources, converts it, and sends it to different destinations, e.g. Elasticsearch.
- Kibana provides a user interface. It enables users to visualize data and analyze it via graphs and charts.
As you can see, ELK is not a single tool. It is a log management platform that enables you to collect massive data volumes from anywhere across your infrastructure and then search, analyze, and visualize this data in real time.
Here, we have talked about ELK as a logging platform. However, its use cases are much wider than just logging. It can be used for monitoring, web analytics, troubleshooting, security analysis, and more.
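Log platforms are easiest to work with when the application emits structured logs. The sketch below is one possible way to do it with Python's standard `logging` module, writing one JSON object per line; the field names and the `payments` logger name are assumptions for the example, and the in-memory buffer stands in for the file or stream a log shipper would read.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def make_logger(name, stream):
    """Create a logger that writes JSON lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()      # avoid duplicate handlers on re-creation
    logger.propagate = False     # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_logger("payments", buffer)
log.info("payment accepted, method=%s", "card")
log.error("payment failed, method=%s", "wallet")
print(buffer.getvalue())
```

Because every line is valid JSON with a timestamp and a level, the sequence of events can later be filtered and ordered without writing custom parsing patterns.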
DataDog is a monitoring, security, and analytics platform, which we can also call an observability service. If we believe that ELK is not needed for a project and DataDog alone is enough, we use only DataDog.
What if something goes wrong or some words about alerting tools
While metrics and logs are important, we cannot check them constantly. After all, we want to concentrate on our work. On the other hand, it is important to learn immediately when something goes wrong.
That is why we use alerting tools. If something goes wrong, we are alerted immediately, so we can react at once and prevent the accumulation of errors and bugs. While we use a lot of alerting tools, and they change from project to project, we have a set of favorites.
We love using Sentry to detect various, sometimes unique errors in user behaviour. This tool not only detects errors immediately but also provides enough information about each error to fix it as soon as possible. In most cases, Sentry gives us enough data to fix errors within a couple of minutes. More complex issues might take up to a few hours. And in very rare cases, we might need more time.
UptimeRobot is a free tool used to monitor the production and staging environments. It sends us real-time alerts when production or staging is down and thus enables us to fix the problem in the shortest time possible.
We can set up UptimeRobot to send alerts via email, SMS, or Slack.
GitLab events integration
GitLab events integration notifies us about CI/CD job failures, performance issues, and the deployment status. We receive the alerts in our Slack channel, which allows us to take action immediately.
Prometheus is not just a tool. It is a monitoring system that records real-time metrics in a time series database. We set up Prometheus to send alerts when a specific condition holds for a preset period. The alerts are sent to Alertmanager.
We use Grafana to visualize the metrics provided by Prometheus. While the latter provides metrics in the form of numbers, Grafana turns these metrics into graphs and models. These visuals allow us to track project progress and regressions, and to build hypotheses.
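The idea of alerting only when a condition holds for a preset period can be modeled in a few lines. The sketch below is a simplified illustration of that idea, not Prometheus code: the function `alert_states`, the threshold, and the sample values are all assumptions made for the example.

```python
def alert_states(samples, threshold, hold_for):
    """Yield (timestamp, state) for each (timestamp, value) sample.

    The state is 'inactive', 'pending' (the condition is true but has
    not yet held for `hold_for` seconds), or 'firing'.
    """
    active_since = None
    for ts, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = ts  # the condition has just become true
            state = "firing" if ts - active_since >= hold_for else "pending"
        else:
            active_since = None    # the condition broke, so reset the timer
            state = "inactive"
        yield ts, state


# Illustrative error-rate samples, one per minute.
samples = [(0, 0.2), (60, 0.9), (120, 0.95), (180, 0.97), (240, 0.1)]
states = list(alert_states(samples, threshold=0.8, hold_for=120))
for ts, state in states:
    print(ts, state)
```

Requiring the condition to hold for a while filters out short spikes that would otherwise produce noisy alerts.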
We use Alertmanager to handle alerts sent by client applications. When Alertmanager receives an alert from Prometheus, it decides whether to silence it or to forward it to us. Alertmanager deduplicates and groups alerts and sends them to the correct receiver integration, such as email.
Considering the tool's convenience, it is not surprising that we use it in all projects.
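To make deduplication, grouping, and silencing more concrete, here is a simplified model of those three steps. It is only a sketch and not how Alertmanager is implemented: the `route_alerts` function, silencing by alert name, and the sample labels are assumptions for illustration.

```python
from collections import defaultdict


def route_alerts(alerts, group_by, silenced_names=()):
    """Drop silenced and duplicate alerts, then group the rest.

    Each alert is a dict with a 'labels' dict. Alerts with an identical
    label set are treated as duplicates. The group key is built from the
    labels named in `group_by`.
    """
    seen = set()
    groups = defaultdict(list)
    for alert in alerts:
        labels = alert["labels"]
        if labels.get("alertname") in silenced_names:
            continue  # silenced: never forwarded
        fingerprint = tuple(sorted(labels.items()))
        if fingerprint in seen:
            continue  # duplicate of an alert already handled
        seen.add(fingerprint)
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# Illustrative alerts only.
alerts = [
    {"labels": {"alertname": "HighErrorRate", "service": "api", "env": "prod"}},
    {"labels": {"alertname": "HighErrorRate", "service": "api", "env": "prod"}},
    {"labels": {"alertname": "HighLatency", "service": "api", "env": "prod"}},
    {"labels": {"alertname": "DiskAlmostFull", "service": "db", "env": "staging"}},
]
groups = route_alerts(
    alerts, group_by=("service", "env"), silenced_names={"DiskAlmostFull"}
)
for key, members in groups.items():
    print(key, [a["labels"]["alertname"] for a in members])
```

With grouping, the two remaining alerts for the same service and environment arrive as one notification instead of several separate ones.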
No metrics tracking means no progress
What if the team doesn’t use any metrics, monitoring, alerting, or logging? Now that you have read what all these tools do, you can imagine what can happen without them. You might not be aware that production or staging is down, bugs will accumulate, and in the end, you will be frustrated, and your customer will get a product that doesn’t meet their requirements. Fortunately, at Mad Devs, we have all the resources to supply our customers with a quality product that keeps meeting their requirements over the years.