Netdata Blog

Data & ML @ Netdata

Monitoring indoor air quality with Airthings and Netdata. Understanding and measuring common contaminants and pollutants reduces your risk of air quality health concerns.

Monitor KSM performance with Netdata

November 1, 2022 · 4 min read

Data & ML @ Netdata

Monitoring how well KSM (Kernel Same-page Merging) deduplicates memory shared across VMs.

Monitoring & troubleshooting Cassandra with Netdata

October 29, 2022 · 5 min read

Data & ML @ Netdata

How to monitor and troubleshoot Cassandra with Netdata.

How to monitor and fix Database bloats in PostgreSQL?

October 28, 2022 · 7 min read

Technical Product Manager

Database bloat is disk space that was used by a table or index and is available for reuse by the database but has not been reclaimed. Bloat is created when rows in tables and indexes are deleted or updated. Here's how to deal with it!

Cassandra monitoring

October 27, 2022 · 9 min read

Data & ML @ Netdata

What are the important Cassandra metrics to monitor and how to monitor them.

How to find out which application is causing server load

October 26, 2022 · 6 min read

Data & ML @ Netdata

We often hear the term load used to describe the state of a server or a device, but we're here to tell you what it means, precisely, and how to monitor it.

How to monitor the disk usage on your infrastructure

October 25, 2022 · 5 min read

Technical Product Manager

The most important part of disk usage monitoring is checking the utilization of each filesystem and mount point, which can reveal existing or impending issues with the storage space on your infrastructure.

7 types of Redis latency and how to fix it

October 24, 2022 · 5 min read

Data & ML @ Netdata

Redis is designed to be fast. In most cases, it is. However, there are times when Redis may be slow due to network issues, disk latency, or other factors. When this happens, it is important to be able to detect the slowdown and investigate the cause of Redis latency.

How to monitor systemd service liveness

October 21, 2022 · 3 min read

Chris Akritidis

Chief Operations Officer

The life of a sysadmin or SRE is often difficult, but occasionally very simple things can make a huge difference. Basic monitoring of your systemd services is one of those simple things that we sometimes overlook. The simplest question one would want answered is whether the thing that’s supposed to be running is actually running at all. If you use systemd services, you can guarantee an answer to that question within minutes using Netdata.

How to monitor web servers and their performance

October 20, 2022 · 4 min read

Technical Product Manager

How you can use the Pandas Python collector to monitor weather data

October 19, 2022 · 8 min read

Analytics & ML Lead

Netdata just got a Pandas collector.

How to monitor HTTP endpoints

October 17, 2022 · 5 min read

Chris Akritidis

Chief Operations Officer

The HTTP protocol has become the de facto standard application layer protocol of the internet. From publicly available websites and APIs to “inter-process” communications in REST-based microservice architectures or large Service-Oriented Architectures based on SOAP, you find HTTP being used again and again, due to its simplicity and our familiarity with it. How many protocols can you name that have memes for their status codes? Of course, such a popular protocol has endless pages written about how to properly monitor the services that rely on it, with many options specific to every use case.

How to monitor DNS query response time

October 12, 2022 · 5 min read

Data & ML @ Netdata

DNS (Domain Name System) servers translate human-readable web addresses to their actual IP addresses for network access.

Why is data replication important?

October 12, 2022 · 9 min read

Alex Malkov

VP of Engineering

High availability. This is what every monitoring tool needs to ensure that you never compromise on IT infrastructure visibility.

How to monitor host reachability

October 10, 2022 · 8 min read

Chris Akritidis

Chief Operations Officer

Most sysadmins and developers have at some point used a few of the popular Linux networking commands or their Windows equivalents to answer the common questions of host reachability - that is, whether a host or service is reachable and how fast it responds.

Introducing the Netdata Source Plugin for Grafana

October 7, 2022 · 8 min read

Hugo Valente

Technical Product Manager

The open-source community is about to benefit greatly from Netdata's new Grafana data source plugin, which makes use of a powerful data collection engine.

How to filter metrics by label?

October 6, 2022 · 3 min read

Technical Product Manager

It is sometimes easy to get lost in the mountain of metrics and infinite number of dimensions when working with an infrastructure monitoring tool. Being able to filter metrics by label and visualize only what is relevant to the current scope of monitoring and troubleshooting becomes absolutely crucial to the success of SREs, sysadmins, and DevOps professionals.

Missing indexes in PostgreSQL? How to quickly identify it

October 5, 2022 · 2 min read

Technical Product Manager

While working on improving the Netdata PostgreSQL collector, we were monitoring our production PostgreSQL instance and something caught our attention immediately. The rows fetched ratio seemed really, really low for one particular database... there were missing indexes in PostgreSQL!

Redis Monitoring

September 29, 2022 · 11 min read

Data & ML @ Netdata

PostgreSQL Monitoring

September 16, 2022 · 24 min read

Data & ML @ Netdata

Data Collection Strategies for Infrastructure Monitoring – Troubleshooting Specifics

September 6, 2022 · 17 min read

Alex Malkov

VP of Engineering

How Netdata’s Machine Learning works

September 1, 2022 · 1 min read

Analytics & ML Lead

Following on from the recent launch of our Anomaly Advisor feature, and in keeping with our approach to machine learning, here is a detailed Python notebook outlining exactly how the machine learning powering the Anomaly Advisor actually works under the hood.

Anomaly rate in every chart

June 23, 2022 · 3 min read

Data & ML @ Netdata

Metric Correlations on the Agent

June 15, 2022 · 3 min read

Analytics & ML Lead

As of v1.35.0 the Netdata Agent can now run Metric Correlations (MC) itself. This means that, for nodes with MC enabled, the Metric Correlations feature just got a whole lot faster!

Monitoring without Cooperation: Kubernetes

May 20, 2022 · 4 min read

Data & ML @ Netdata

Kubernetes Throttling Doesn’t Have To Suck. Let Us Help!

May 3, 2022 · 8 min read

Costa Tsaousis

Founder & Chief Executive Officer

CPU limits are probably the most misunderstood concept in Kubernetes CPU resource allocation and management.

Troubleshooting Alerts the Right Way: As a Team

April 28, 2022 · 4 min read

Tasos Katsoulas

Software Engineer

CNCF Live: Power up your machine learning – Automated anomaly detection

April 27, 2022 · 2 min read

Analytics & ML Lead

The Netdata Way of Troubleshooting

April 4, 2022 · 3 min read

Netdata Team

Together with you, our fabulous community, Netdata is changing the way the world thinks of high fidelity monitoring - and we are gaining momentum.

Our Approach to Machine Learning

March 25, 2022 · 13 min read