In the last post (part 1, unfortunately in German), I mentioned 3 types of LODs in Tableau. I believe, it is already very clear on the cases where INCLUDE, EXCLUDE and FIXED are applied. However, I noticed one thing on the LOD calculation that raises no error but give us misleading result.
Lets take an example using Superstore dataset. The task is to calculate the average sales per customer per state/province using INCLUDE function. Simple visualization of the overall sales will look like this. I will take Tyrol, Austria for checking purpose.
1. SUM() within INCLUDE
The syntax is as follows { INCLUDE[Customer ID]: SUM([Sales]) } and then averaging in the row. It results 1,037€ as the average sales per customer per state.
2. AVG() within INCLUDE
The syntax is as follows { INCLUDE [Customer ID]: AVG([Sales]) } and then averaging in the row. It gives us different value, 444€.
Which one do you think is right? Both syntaxes raise no error. Let’s investigate further.
In the first case, Tableau sums up total sales for each customer (Customer ID) and divides by 6 customers.
Wait a minute! We may think that each customer could have several orders (order ID) and each order might contain several items (Product ID). Exactly!
In the second case, Tableau calculates the average of each order ID of the items (Product ID) and then sum them up, then divides by 6 customers which gives us 444€.
So, the main difference lies in the order of operations. The first case first sums up generated by each user and then calculates the average of those sums. While, the second case calculates the average at the level of each user, sums up and then calculates the average for all customers (2 levels of average).
To choose between them, we need to consider the specific requirements of our analysis and the level of detail at which we want to perform the aggregation. If we want the average of the total sales for all users, the first case might be more suitable. If we want the average sales for each user, the second case is appropriate.
Cheers,
Nuki