What is covariance?

Covariance, which we also call covariation in this article, is a statistical measure of the joint variability of two variables. You are going to encounter it more often than you think.
For example, suppose you are trying to understand the evolution of your business's turnover when both the volume and the price of your products have changed at the same time. You can attribute part of the increase in turnover to the change in price and part to the change in volume sold, but a small part of the increase remains that is linked to neither one alone, only to the simultaneous change of the two.

If two variables generally move in the same direction, the covariance has a positive sign (example: outside temperature and consumption of ice cream). If they move in the opposite direction, the covariance is negative (example: outdoor temperature and heating consumption). If two variables are independent, their covariance is zero, but the converse is false: there can be a nonlinear dependence which also results in a zero covariance.
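As a quick numerical illustration of the last point, here is a minimal Python sketch with made-up data. The helper `covariance` is written here for illustration (it computes the usual sample covariance); `y_square` is fully determined by `x`, yet its covariance with `x` is zero because the dependence is not linear.

```python
def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Made-up data: x is symmetric around zero.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_linear = [2 * v + 1 for v in x]  # moves in the same direction as x
y_square = [v ** 2 for v in x]     # depends on x, but not linearly

cov_linear = covariance(x, y_linear)  # positive
cov_square = covariance(x, y_square)  # zero despite the dependence

print(cov_linear, cov_square)
```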

You can learn more about the statistical theory on the web, but this article is about covariation within waterfall chart representations.

In our case, the covariation is essentially the part of the variation that you cannot attribute to either of the effects that you are trying to split.

1. Simple example of a covariation

If you decide tomorrow to enlarge the building above by adding a floor or by widening it to the right or to the left, you will have to add windows. If you add a floor, you add two windows that you can directly attribute to this new floor. In the same way, if you widen your building to the right or to the left, you add two windows that you can fully attribute to the widening.

Now if you combine the addition of a floor and the widening of the building, your building ends up with not four windows but nine. You can assign two additional windows to the new floor and two additional windows to the widening. So you can explain quite simply the creation of four new windows, but where did the fifth one come from? This ninth window corresponds to what is called the covariation: it comes from the joint effect of the two enlargements.
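The window count can be checked with a few lines of Python. This sketch assumes the facade is a simple grid with one window per floor and per column, going from 2 x 2 to 3 x 3 (which matches the four and nine windows in the text); the variable names are ours.

```python
def windows(floors, columns):
    """Windows on a facade with one window per floor and per column."""
    return floors * columns


floors_start, columns_start = 2, 2  # assumed starting building: 4 windows
floors_end, columns_end = 3, 3      # one more floor, one more column

total_change = windows(floors_end, columns_end) - windows(floors_start, columns_start)

# Windows explained by each enlargement taken alone.
floor_effect = (floors_end - floors_start) * columns_start
width_effect = (columns_end - columns_start) * floors_start

# The corner window that exists only because both enlargements happened.
joint_effect = (floors_end - floors_start) * (columns_end - columns_start)

print(total_change, floor_effect, width_effect, joint_effect)
```

The three effects add up to the five new windows: two for the floor, two for the widening, and one for the joint effect.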

The order of operations matters in this concept. If you had widened the building first and then added a new floor, the new floor would have explained the three top windows, while the widening carried out first would have explained only two windows. It is the joint variation of two (or more) effects that is at the origin of this concept of covariation.
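This order dependence can be made concrete with a small Python sketch. It again assumes a grid facade going from 2 x 2 to 3 x 3 windows; the function name `sequential_attribution` is hypothetical and simply changes one dimension at a time, crediting each step with the windows it adds.

```python
def sequential_attribution(start, end, order):
    """Change one dimension at a time and record the windows each step adds."""
    current = dict(start)
    effects = {}
    for name in order:
        before = current["floors"] * current["columns"]
        current[name] = end[name]
        after = current["floors"] * current["columns"]
        effects[name] = after - before
    return effects


start = {"floors": 2, "columns": 2}
end = {"floors": 3, "columns": 3}

widen_first = sequential_attribution(start, end, ["columns", "floors"])
floor_first = sequential_attribution(start, end, ["floors", "columns"])

print(widen_first)  # the floor, added second, gets 3 windows
print(floor_first)  # the widening, done second, gets 3 windows
```

Whichever change comes second absorbs the extra window, which is why a waterfall needs an explicit rule for the covariation rather than relying on the order of the steps.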

2. Waterfall analysis

A Waterfall Analysis (WA), or waterfall chart analysis, refers to the waterfall diagram used in finance, which shows the different components (positive or negative) that add up to the result we seek to explain. Typically, such a diagram represents the evolution of a key performance indicator (KPI) over time, e.g. the profits of a company, by showing the factors that impact this KPI.

The representation of the main indicator (KPI), in a dimension chosen by the user, is based on the evolution of Primary Indicators (PIs), also called performance sub-indicators. For example, if your KPI is revenue (turnover), it is driven by the underlying volume sold and selling price, which are indicators that a company may or may not be able to act on.

Example

You want to analyze the variation in your revenue (A) between last year and this year. For commercial reasons, you may want to split this variation between volume (B) and price (A / B). So what you really want is to explain:

Revenue_ThisYear - Revenue_LastYear, that is, A_End - A_Start.

Which can be written as follows:

A_End - A_Start = (B_End - B_Start) * A_Start / B_Start
                + (A_End / B_End - A_Start / B_Start) * B_Start
                + (B_End - B_Start) * (A_End / B_End - A_Start / B_Start)
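To see why this identity holds, it helps to give the price its own symbol. In the short derivation below, P = A / B, ΔB = B_End - B_Start and ΔP = P_End - P_Start are notations introduced here only for brevity; they do not appear elsewhere in the article.

```latex
\begin{aligned}
A_{End} - A_{Start}
  &= B_{End}\,P_{End} - B_{Start}\,P_{Start} \\
  &= (B_{Start} + \Delta B)(P_{Start} + \Delta P) - B_{Start}\,P_{Start} \\
  &= \underbrace{\Delta B \cdot P_{Start}}_{\text{volume effect}}
   + \underbrace{\Delta P \cdot B_{Start}}_{\text{price effect}}
   + \underbrace{\Delta B \cdot \Delta P}_{\text{covariation}}
\end{aligned}
```

The last term is the product of the two changes, so it vanishes as soon as either the volume or the price stays constant.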

The graph below is much clearer:

The graph above illustrates the impact of the covariation on the positive evolution of a KPI between its starting state and its final state, in the case considered (two PIs). Note that, although the evolution of the KPI comes entirely from the evolution of the B and A / B indicators, it cannot be perfectly attributed to either of them. Indeed, there is a shared impact of the different PIs on the KPI, called covariation, which must therefore be reassigned to each PI according to an arbitrary rule.

What you want to explain is the difference in area between the two green squares. Simply divide this difference between the B effect (purple), the A / B effect (yellow), and the covariation (orange).

Even more concretely

Let’s try to understand the covariation more concretely by returning to our revenue example and comparing the results of 2019 and 2020.

In the simplified table above, we show the volume of product sold and the revenue generated in 2019 and in 2020. Thanks to our sales team, we managed to sell more products in 2020, and at a higher price.

Our revenue therefore goes from € 400 to € 750. But what can this € 350 change in revenue be attributed to: the increase in price or the increase in volume sold?

> If we calculate the impact of the increase in volume, we find: (150 - 100) x € 4 = 50 x € 4 = € 200.
Selling 50 more products generated € 200 of additional revenue.

> By carrying out the same calculation with the price, I find: (€ 5 - € 4) x 100 = € 1 x 100 = € 100.
I generated € 100 of additional revenue thanks to a € 1 increase in my selling price.

You begin to see the problem: taking each change independently, I can explain € 300 of the evolution of my revenue, but not the € 350 that I observe in reality. That € 50 difference is the covariation.
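The same arithmetic can be written as a short Python sketch. The function name `decompose` and its parameter names are ours; the figures (100 units at € 4 in 2019, 150 units at € 5 in 2020) are the ones used in the calculations above.

```python
def decompose(volume_start, price_start, volume_end, price_end):
    """Split a revenue change into volume effect, price effect and covariation."""
    volume_effect = (volume_end - volume_start) * price_start
    price_effect = (price_end - price_start) * volume_start
    covariation = (volume_end - volume_start) * (price_end - price_start)
    return volume_effect, price_effect, covariation


volume_effect, price_effect, covariation = decompose(100, 4, 150, 5)
total_change = 150 * 5 - 100 * 4  # revenue 2020 minus revenue 2019

print(volume_effect, price_effect, covariation, total_change)  # 200 100 50 350
```

The three terms sum exactly to the € 350 change, with the € 50 covariation being the product of the 50 extra units and the € 1 price increase.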

The impact on your waterfall analysis

The above calculations are great, but difficult to show and explain to a businessman who just wants to understand why his revenue has gone up or down.

So, to display this in your waterfall analysis with only the effects of volume and price (or whichever other effects you have defined), we simply attribute the covariance to each effect in proportion to its variation. In other words, we distribute the covariation proportionally.
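One possible reading of this proportional rule is sketched below in Python. The exact rule used in the product is not spelled out here, so this is an assumption: each effect receives a share of the covariation proportional to the absolute size of its standalone effect. The function name `distribute_covariation` is hypothetical.

```python
def distribute_covariation(effects, covariation):
    """Add to each effect a share of the covariation proportional to its size.

    Assumption: "proportionally" means proportional to the absolute value of
    each standalone effect.
    """
    total = sum(abs(value) for value in effects.values())
    if total == 0:
        # No standalone effect to scale by: split evenly (arbitrary fallback).
        share = covariation / len(effects)
        return {name: value + share for name, value in effects.items()}
    return {
        name: value + covariation * abs(value) / total
        for name, value in effects.items()
    }


# Figures from the 2019 vs 2020 example: 200 (volume), 100 (price), 50 (covariation).
adjusted = distribute_covariation({"volume": 200, "price": 100}, 50)

print(adjusted)
```

With these figures the volume effect receives two thirds of the € 50 and the price effect one third, so the two bars of the waterfall now add up to the full € 350.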

The percentage of covariance that we display in the Datama Compare charts is the ratio of the covariance to the overall variation that you are trying to explain.

We put a “!” when the covariance becomes high (> 30%), but the analysis remains valid. Just understand that the effects you are trying to analyze are not completely independent, so sometimes it’s not one effect (e.g. price) or the other (e.g. volume) that affects your overall KPI (e.g. revenue), it’s both at the same time. And that’s your covariance!
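As a rough illustration of this percentage and of the 30% warning, here is a minimal Python sketch using the figures from the 2019 vs 2020 example. The names `covariation_share` and `WARNING_THRESHOLD` are ours, and taking absolute values is an assumption made so that the ratio stays positive when the KPI decreases.

```python
WARNING_THRESHOLD = 0.30  # the text flags a covariance above 30%


def covariation_share(covariation, total_change):
    """Ratio of the covariation to the overall variation being explained.

    Assumption: absolute values are used; total_change must not be zero.
    """
    return abs(covariation) / abs(total_change)


share = covariation_share(50, 350)
label = f"{share:.1%}" + (" !" if share > WARNING_THRESHOLD else "")

print(label)  # 14.3%
```

In this example the covariation is about 14% of the change, well under the threshold, so no warning mark would be shown.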