Today, let’s talk about Best Practices to write Dax Code. Dax is a language that allow us to write code that will produce the desired result in a few different ways, for this reason there are practices we should implement to improve the understanding and performance of our code. When DAX code is written correctly will be fast, reliable, easily maintainable and reusable. Bad Dax on the other hand, can create problems of performance and ambiguity.
We will go to some key aspects of making your code easy to read, update and maintain, how to handle errors and improve performance. Let’s dig in!
Naming convention
The first practice I want to mention is regarding the readability of the code. Understanding the use of columns and measures reference is a key aspect of the way your code is written: Since the syntax in DAX allows for these two to be the same, we need to learn to write your code in a way that other developers can read it and understand what you are referencing.
For the column reference: Always use the table name!
TableName[ColumnName]
For the measure reference: Do not use the table name in this case, that will be the major differentiator
[MeasureName]
Besides being a great way to standardise the way the code is presented; the performance of your code will be better.
The use of variables
Variables are a powerful feature in DAX to improve you code readability by reducing nested calls. It allows for reusability and makes the syntax look a lot simpler.
In addition to that, the use of variables also contributes to an improvement of performance because the code associated with a variable is only computed once, avoiding the any double computation of the same expression. Especially if you are working with complex subexpressions, having variables will lead to the single evaluation of that expression, making the code a lot faster. See the following example:
We want to calculate the taxed amount for the products on a table. We can do that in a couple of ways. The best performing code will be the following:
TaxedSales =
SUMX (
Sales,
Sales[Quantity] * Sales[Unit Price] * ( 1 + Sales[Tax Percentage] )
)
This very short query will return the right results, but let’s say, for the sake of argument, this query is not that easy to understand for a person that is not familiar with the calculations, and we want to split these calculations into steps. The code without the use of variables, trying to split the code in steps to make it easier to understand, we get this:
TaxedSalesExplained =
SUMX (
ADDCOLUMNS (
ADDCOLUMNS (
ADDCOLUMNS ( Sales, "Line Amount", Sales[Quantity] * Sales[Unit Price] ),
"Taxes", [LineAmount] * Sales[Tax Percentage]
),
"Taxed Amount", [Line Amount] + [Taxes]
),
[Taxed Amount]
)
That seems a bit long and convoluted, without mentioning that this query will perform very poorly. We can achieve the results, with the performance of the first query using variables:
TaxedSalesExplained =
SUMX (
Sales,
VAR LineAmount = Sales[Quantity] * Sales[Unit Price]
VAR Taxes = LineAmount * Sales[Tax Percentage]
VAR TaxedAmount = LineAmount + Taxes
RETURN
TaxedAmount
)
Even though the result will be the same the first one will perform poorly and is a lot longer, while the second one is easier to maintain, understand and will be faster.
Errors
The premise here is “Errors are good”. Dax has functions we can use to handle errors. We are going to focus on the two primary Error functions to use when you know and when you don’t know what is causing the error.
In the case where we know what type of error we are going to get, we can use the IFERROR function that will return an specified value if the expression fails. For example, when we have a division by 0, we can provide the number 0 to be returned.
In any other case, we can use the function ERROR, this is the good practice: This function will stop the computation of the expression and raise an error with an error message. This message is a constant string, can’t be dynamic. Doing this, should make easier to understand where the issue in the data is. For instance, see the code below where we try to calculate the commission of an agent:
Commission =
SUMX (
Customer,
VAR Commission =
LOOKUPVALUE ( Agent[Commission], Agent[Code], Customer[AgentCode] )
RETURN
IF (
NOT ISBLANK ( Commission ) || ISBLANK ( Customer[AgentCode] ),
[Sales Amount] * Commission,
ERROR ( "Wrong configuration in Customer[AgentCode]" )
)
)
In this case , we are implementing the error message to stop the computation and raise an error if the expression fails to compute and give us some information of what the problem is.
Bidirectional filters
Using bidirectional filters in your DAX code can create ambiguity in the results you get. Basically, this means that depending on how these filters have been setup you can get an inconsistency on how the data will be filtered from one table to the next since the algorithm can find multiple paths in the order it will filter the tables.
Bidirectional filters are a very powerful feature, when leveraged correctly they can give you the functionality you need. There are a few scenarios where the power of bidirectional cross-filter really shines. But, if used incorrectly, they can lead to an unreliable and inconsistent model with poor performance. The main take away here is to:
- Only use bidirectional filters in the data model when fact tables are reachable only on one side of the relationship.
- Use CROSSFILTER instead of bidirectional filter whenever possible.
Avoid context transition in large iterators
For this point, we will refer as context transition calling a measure reference in an iterator.
Context transition is expensive, it is one of the most common sources of performance issues, even more so when this happens in a large iteration. As a result of this context transition, we materialise an uncompressed temporary data table in memory.
To avoid getting your code from using up all the memory trying to resolve the computation of your expression, the good practice is to use context transition in iterators or tables with a Primary Key. That will be more effective and will return results quicker.
For example, in the following example the Discounted Amount is calculated in both ways, one using context transition, and without using it.
- Using Context Transition in a very large table (assuming that our table has more than 12 million rows) this will cause the query to use up more than 5 GBs of memory and will be very slow to return results:
Discounted Amount =
SUMX ( Sales, [Sales Amount] - Sales[Quantity] * Sales[Unit Discount] )
- The right way to write that calculation:
Discounted Amount =
[Sales Amount]
- SUMX ( Sales, Sales[Quantity] * Sales[Unit Discount] )
The difference in the results for those measures will be huge in terms of performance and in some cases, we can have issues of computational errors and get different results. The second way will be always faster and more reliable.
Conclusions
The importance of implementing practices that will allow your code to be easier to read and maintain will be significant over time. Using practices like name convention for measures and column names, as well as using variables will prove very handy when going back to reports or models you haven’t seen in a while. In terms of performance, understanding the risks and disadvantages of using bidirectional filters and context transition will be key to create good models and have fast and reliable reports. I covered, in my opinion, the most important practices to always have in mind when writing DAX code. I hope you find it useful and you implement this in your work.
Happy Learning!
References: