DAX Server Timings in DAX Studio

We can take a measure like the one-customer-order measure, add it to a card visual, and use Performance Analyzer to capture the DAX query, which we can then paste into DAX Studio. However, we may see different results if the measure is used in a matrix, so we will test that too.

The original DAX measure is shown below, along with test measures that use different functions to create the virtual tables used in each method. Starting with the first measure, we will look at the Server Timings output in more detail, using AI to help us analyze it.

CustomersWithOneOrder = 
VAR _CustomerOrders = SUMMARIZE('FactInternetSales', -- Table 
                            'FactInternetSales'[CustomerKey], -- Groupby  
                             "Orders", DISTINCTCOUNT('FactInternetSales'[SalesOrderNumber]) 
                                )
VAR _OneOrder = FILTER(_CustomerOrders, [Orders] = 1)

RETURN
COUNTROWS(_OneOrder)
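The alternative test measures are not reproduced in full here. As one hedged example, a FILTER()-based variant might look like the following sketch (the measure name and exact shape are assumptions, not the tested code):

```dax
CustomersWithOneOrder_Filter =
COUNTROWS (
    FILTER (
        -- one row per customer that appears in the fact table
        VALUES ( 'FactInternetSales'[CustomerKey] ),
        -- CALCULATE transitions to a filter context per customer
        CALCULATE (
            DISTINCTCOUNT ( 'FactInternetSales'[SalesOrderNumber] )
        ) = 1
    )
)
```

The design trade-off is the same in each variant: build a per-customer order count, then keep only the customers with exactly one order.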

The DAX query sent by the card visual (total only), copied from Performance Analyzer, is as follows:

// DAX Query
EVALUATE
    ROW(
    "CustomersWithOneOrder", '_Measures'[CustomersWithOneOrder]
)
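For comparison, the query captured from a matrix visual typically groups by the columns placed on the matrix rows via SUMMARIZECOLUMNS(). A simplified, hypothetical example follows; the grouping column is an assumption, and the real captured query also includes ordering and TOPN wrappers:

```dax
// Hypothetical DAX query for a matrix, grouped by an assumed column
EVALUATE
SUMMARIZECOLUMNS (
    'DimDate'[CalendarYear], -- assumed matrix row grouping column
    "CustomersWithOneOrder", '_Measures'[CustomersWithOneOrder]
)
```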


In DAX Studio, we can paste in the code, clear the cache, enable Server Timings, and select all the options that show us each step of the query plan.

The first part of the server timings shows us that the query took 12 ms in the formula engine (FE) and 5 ms in the storage engine (SE), for a total of 17 ms. In addition, we can see that a total of 2 SE queries were sent to the VertiPaq engine.

This SQL-like language used by the storage engine is called xmSQL, and here are the steps that we can see in DAX Studio.

Line 1, Subclass: Internal, Duration: 2, Rows: 0

SET DC_KIND="GEC32";   -- Selecting distinct count strategy DC_KIND
SELECT
    'FactInternetSales'[CustomerKey],
    'FactInternetSales'[SalesOrderNumber]
FROM 'FactInternetSales';

-----------------------------------
Line 2, Subclass: Internal, Duration: 3, Rows: 27,659   -- builds a data cache projecting the 2 columns it needs

SET DC_KIND="GEC32";
SELECT
    'FactInternetSales'[CustomerKey],
    'FactInternetSales'[SalesOrderNumber]
FROM 'FactInternetSales';

Estimated size: rows = 27,659  bytes = 110,636
---------------------------------
Line 3, Subclass: Internal, Duration: 3, Rows: 0   -- computes the distinct count from the data cache


SET DC_KIND="C32";
SELECT
    'FactInternetSales'[CustomerKey],
    COUNT () 
FROM $DCOUNT_DATACACHE USING ( 'FactInternetSales' ) ;

----------------------------
Line 4, Subclass: Scan, Duration: 4, Rows: 18,484   -- the real storage engine scan, with a DCOUNT aggregation grouped by CustomerKey

SET DC_KIND="AUTO";
SELECT
    'FactInternetSales'[CustomerKey],
    DCOUNT ( 'FactInternetSales'[SalesOrderNumber] )
FROM 'FactInternetSales';


Estimated size: rows = 18,484  bytes = 221,808

------------------------------------------

Line 5, Subclass: Internal, Duration: 1, Rows: 0   -- auxiliary count to help the optimizer (density/cardinality hints)

SET DC_KIND="DENSE";
SELECT
    'FactInternetSales'[CustomerKey],
    COUNT () 
FROM 'FactInternetSales';

------------------------------------------

Line 6, Subclass: Scan, Duration: 1, Rows: 18,487   -- another quick scan of CustomerKey, likely to get the distinct keys

SET DC_KIND="AUTO";
SELECT
    'FactInternetSales'[CustomerKey]
FROM 'FactInternetSales';


Estimated size: rows = 18,487  bytes = 147,896

------------------------------------------

Line 7, Duration 0, contains the execution metrics.

Note that the actual result, 11,619, is not shown in the server timings; the server timings show only the intermediate steps.

So that was all very interesting, but what do we do with it? Well, first, we can use it to compare measures as follows:

Card Visual (Total) DAX query performance

Table function used in measure   FE (ms)   SE (ms)   Total (ms)   SE Queries   Approx. Peak Mem Consumption (Bytes)
SUMMARIZE()                      12        0         13           2            3,407
SELECTCOLUMNS()                  10        7         17           2            2,379
SUMMARIZECOLUMNS()               8         10        18           1            3,376
FILTER()                         6         7         13           1            3,357

In the first test, based on the DAX query for the card visual total, all the measures ran very fast, with the SUMMARIZE() and FILTER() functions performing fastest. Based on this, we might run off and decide to use one of these, but what if we want to use the measure in a different visual? Will the performance be the same?

Matrix Visual DAX Query performance

Table function used in measure   FE (ms)   SE (ms)   Total (ms)   SE Queries   Approx. Peak Mem Consumption (Bytes)
SUMMARIZE()                      79        250       329          5            39,884
SELECTCOLUMNS()                  41        49        90           6            11,968
SUMMARIZECOLUMNS()               19        26        45           3            9,954
FILTER()                         24        73        97           5            10,445

So, as we can see in the DAX query taken from the matrix test, the SUMMARIZE() function performed very poorly, with the measure that used SUMMARIZECOLUMNS() performing best. As you can see, this adds a whole different level of complexity to testing.

A solution for this is to use the ISINSCOPE() function to determine how the measure is being filtered and switch between two different measures, but of course that makes the model more complex and is a topic for another day.
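As a rough sketch of that idea (the measure and column names here are assumptions, not tested code), ISINSCOPE() could switch between the two implementations depending on whether a grouping column is on the matrix rows:

```dax
CustomersWithOneOrder_Switched =
IF (
    -- assumed matrix row column; ISINSCOPE is TRUE when it filters the current cell
    ISINSCOPE ( 'DimDate'[CalendarYear] ),
    [CustomersWithOneOrder_SummarizeColumns], -- hypothetical name for the matrix-friendly variant
    [CustomersWithOneOrder]                   -- the original SUMMARIZE() version for totals
)
```

The cost of this pattern is maintaining two measures that must stay logically identical, which is why it is a topic for another day.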