Anomaly detection and explanation with DataMa

 

In a constantly changing digital world, the amount of data you have to analyze keeps growing. Analyzing each stage of your conversion funnel every day, or reviewing traffic acquisition across your campaigns to spot anomalies to correct and opportunities to seize, quickly becomes long and tedious. Because this process takes time, some problems are detected only several days after they appear.

Let’s see how DataMa automates anomaly detection, notifying you of any problems that occurred the previous day and explaining their causes.

Why detect anomalies?

Anomaly detection in the daily monitoring of business KPIs has two main objectives:

1. The first goal is to identify technical problems, in particular in the collection and structure of the data, that cause an “artificial” abnormality in the KPIs. Although they often have no direct short-term impact on the business, these anomalies must be corrected quickly to guarantee the reliability of the data, and of the decisions and systems that depend on it (BI, reporting, but also data science algorithms, etc.).

For instance, suppose a GTM tag has been badly implemented, resulting in a huge drop in your pageviews. Unless you spend your days monitoring all your indicators in Google Analytics, you would not detect the problem until several days later, if not several weeks.

2. The second goal is to identify business problems or business opportunities, which will affect actions on the product or marketing side. When we identify adverse anomalies, we seek to find corrective actions. On the other hand, when the anomaly is favorable, it is often an indicator of opportunities you should not miss.

For example, if the number of sessions on an e-commerce site drops abnormally compared to the seasonal trend, this can be an indicator of aggressive acquisition actions by the competition, which should probably be addressed quickly.

The problem is, the identification and explanation of these anomalies is rarely done in a systematic way. When it is done manually, there are many KPIs to monitor, their volatility is uneven, and human resources are scarce.

 

Automated detection

DataMa offers, in its DataMa Impact module, an anomaly detection option based on one of two methods of your choice:

  • A “moving window average” method: a weighted moving average of the previous points, which takes into account the weight of each day in the calculation of the ratio and the dispersion of the points.
  • A “forecast,” or prediction, method, which takes into account the seasonal components of your KPIs, based on historical data.
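To make the first method more concrete, here is a minimal Python sketch of a moving-window check on a ratio KPI. It is not DataMa's actual implementation: the function name `detect_anomalies`, the use of the denominator as each day's weight, and the normal approximation for the band are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def detect_anomalies(numerators, denominators, window=7, confidence=0.95):
    """Flag points whose ratio leaves a band built from the previous `window` points.

    Assumptions (illustrative only): each day is weighted by its denominator,
    all denominators are positive, and the band is mean +/- z * weighted std.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided normal quantile
    flags = []
    for i in range(len(numerators)):
        if i < window:
            flags.append(False)  # not enough history yet
            continue
        nums = numerators[i - window:i]
        dens = denominators[i - window:i]
        total = sum(dens)
        mean = sum(nums) / total  # weighted mean of the daily ratios
        variance = sum(d * (n / d - mean) ** 2 for n, d in zip(nums, dens)) / total
        ratio = numerators[i] / denominators[i]
        flags.append(abs(ratio - mean) > z * sqrt(variance))
    return flags


# Made-up data: checkouts per 1,000 baskets, with a drop on the ninth day.
checkouts = [80, 82, 81, 79, 83, 80, 82, 81, 67, 80]
baskets = [1000] * 10
flags = detect_anomalies(checkouts, baskets)
print(flags)
```

With this made-up series, only the ninth point is flagged. The following day is not, because the drop itself has entered the window and widened the band.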

You can look at a particular KPI, or choose the “Check all metric relation steps” option, which analyzes all the indicators of the “market equation” you defined upstream. In the list of indicators you can select, those with anomalies are highlighted.

 

First, it is possible to search for all anomalies over a given period and refine the parameters that determine which data should be considered an anomaly or not. The available settings are:

  • The confidence interval, which is initially set to 95%. The larger it is, the wider the range of so-called “normal” values will be.

  • The number of points in the “moving window,” which serves as the basis for estimating that range. The more points there are, the more stable the anomaly detection range will be.
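The effect of the confidence setting can be illustrated with a short Python sketch. It assumes the “normal” range is a symmetric band around the expected value under a normal approximation, which the article does not state; the helper `band_multiplier` is hypothetical.

```python
from statistics import NormalDist


def band_multiplier(confidence):
    """Number of standard deviations on each side of the expected value
    (two-sided band, normal approximation assumed)."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


z90 = band_multiplier(0.90)
z95 = band_multiplier(0.95)
z99 = band_multiplier(0.99)
print(f"90%: {z90:.2f}  95%: {z95:.2f}  99%: {z99:.2f}")
# prints "90%: 1.64  95%: 1.96  99%: 2.58"
```

Under this assumption, moving from 95% to 99% widens the band by roughly a third, so fewer points are reported as anomalies.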

Once these parameters are tuned, it makes sense to set up automatic daily monitoring. To do so, choose the “latest point only” option, which analyzes only the last available value. Consider the following example, where an anomaly was detected at the checkout rate step:

An explanation of anomaly causes

Anomaly detection is only the first step in an analysis. As seen in the graph below, DataMa will offer the most relevant analysis it can find based on the segments/dimensions available in the data source.

In the previous example, we have the following key point (there can be several depending on the case):

“Compared with previous periods, 2021-08-31 Checkout/Basket has decreased by -17.8% (from 8.19% to 6.73%). Looking at it specifically by Channel:

  • There are significant negative changes in performance which explain 100% of the gap. In particular, Social Checkout/Basket has decreased by -100% (from 0.058 to 0), which is significantly higher (4.8 times higher) than the average variation.”
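The percentages in this message are relative changes, which can be checked with a few lines of Python. The helper `relative_change` is only an illustration, and the figures are those quoted above (reading 6.73 as a percentage, like 8.19%).

```python
def relative_change(before, after):
    """Relative variation between two values, as a fraction of the starting value."""
    return (after - before) / before


overall = relative_change(8.19, 6.73)  # Checkout/Basket, all channels
social = relative_change(0.058, 0)     # Checkout/Basket, Social channel

print(f"{overall:.1%}")  # prints "-17.8%"
print(f"{social:.0%}")   # prints "-100%"
```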

Then the following graph explains the potential cause of the anomaly in terms of impact on the KPI.

 

 

Automation of anomaly detection

DataMa also offers automatic notifications by email or on Slack.

To do this, your “workbook” must be connected to a live data source, i.e. a source that is updated at least as often as you want to be notified (daily, weekly, etc.). This source can be a Google Sheet, a BigQuery query, or a Google Analytics account directly connected to the DataMa Prep module.

Once the workbook has been configured and saved, you can export these insights by choosing the sending mode (Slack, email, etc.), the frequency, and, if you wish, the option to receive the message only if there is an anomaly (“Send only when anomaly found”).
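The logic behind the “Send only when anomaly found” option can be sketched as follows. This is a hypothetical illustration of the behavior described above, not DataMa code: the function `build_notification` and the message format are assumptions.

```python
def build_notification(kpi, date, anomaly_found, only_when_anomaly=True):
    """Return the message to send, or None when nothing should be sent."""
    if only_when_anomaly and not anomaly_found:
        return None  # quiet day: skip the notification entirely
    status = "anomaly detected" if anomaly_found else "no anomaly"
    return f"{date} {kpi}: {status}"


quiet = build_notification("Checkout/Basket", "2021-08-30", anomaly_found=False)
alert = build_notification("Checkout/Basket", "2021-08-31", anomaly_found=True)
print(quiet)  # prints "None"
print(alert)  # prints "2021-08-31 Checkout/Basket: anomaly detected"
```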

Note: if you have chosen to send a Slack notification, remember to add the DataMa bot to the channel with the command “/invite @datama_bot”.

And now, every time there is an anomaly, you will be notified.

 

An example of an anomaly notification in Slack

 

👉  Did you like this analysis?

Do the same at home 😊!

  1. Create an account on solutions.datama.fr
  2. Adapt and upload the dataset used in this article