FROM SQL TO DAX [Part 1]

Over the last few years, the increasing number of choices for Business Intelligence/reporting applications has open more possibilities to choose the implementation tools for data visualisation based on the specific circumstances, such as the data format, business use, end-user knowledge and preference, just to mention a few. With this ever-changing environment is always important to have a grasp of at least the basics, to make the best decision for implementation.

The vast majority of data analysts and BI developers usually come from a SQL background, using it as our main language for data extraction, manipulation and transformation. However, now we have very powerful tools becoming relevant that are based on other languages. Since I’ve been doing the transition from SQL to DAX myself, I thought it would be a good opportunity to share it with you.  I come from an SSRS background, however, the last year I have been moving towards Power BI as the main tool for reporting. During the transition, one of the key pain points for me was to get comfortable with DAX and try to find a way to do what I know how to do in SQL in DAX syntax.

So, I thought it would be useful to put together a little guide with the relevant SQL to DAX conversions. Let’s easy into in by going through the classics first:

SQL

SELECT * 
FROM   Member 

DAX

EVALUATE
 member

Important to mention that the EVALUATE statement that is needed to execute any query. Let’s see how we select specific columns and provide an alias for them:

SQL

SELECT  FirstName + ‘ ’ +LastName AS FullName, 
        DateOfBirth AS DOB
 FROM   Member

DAX

EVALUATE
  (
     SELECTCOLUMNS (
         Member,
         "Full Name", Member[FirstName] & "  " & Member[LastName],
         "DOB", Member[DateOfBirth]
     )
 )

 You can also use ADDCOLUMNS to obtain this result; the only difference is that instead of starting with the <table> specified, SELECTCOLUMNS starts with an empty table before adding columns.

We covered the most basic queries, let’s move onto grouping and aggregates. In the case of grouping, DAX has a native function that allows implementing the same functionality called SUMMARIZE:

SQL

SELECT  InvoiceId,
         SUM(Amount) AS TotalAmount
 FROM    Invoice
 GROUP   BY InvoiceId

DAX

EVALUATE
 SUMMARIZE (
     Invoice,
     Invoice[InvoiceId],
     "TotalAmount", SUM ( Invoice[Amount] )
 )

It’s the turn for a very relevant topic in SQL; key while building reports in complex database models: JOINS! Dax does not support all the JOIN operations available in SQL, however, we can write DAX expressions to obtain the equivalent result in most cases. The magic is to leverage the relationships created in the data model to obtain the same result. By creating the relationships correctly in the data model, we can obtain the JOIN behaviour. For example, having the following model:

If you want to display the Total Sales by Customer by Product Category you can do it with the following expression:

SQL

SELECT  CustomerId,
         Category,
         SUM(Total) AS TotalAmount
 FROM    Invoice
         LEFT JOIN Customer 
                ON Customer.CustomerId = Invoice.CustomerId
         LEFT JOIN Product 
                ON Product.ProductId = Invoice.ProductId
 GROUP   BY CustomerId, Category

DAX

EVALUATE
 SUMMARIZE (
     Invoice,
     Customer[CustomerId],
     Product[Category],
     "TotalAmount", CALCULATE ( SUM ( Invoice[Total] ) )
 )

If there are no aggregates, use ADDCOLUMNS:

SQL

SELECT  Invoice.*,
         Customer.FirstName,
         Product.Category
 FROM    Invoice
         LEFT JOIN Customer 
                ON Customer.CustomerId = Invoice.CustomerId
         LEFT JOIN Product 
                ON Product.ProductId = Invoice.ProductId

DAX

EVALUATE
 ADDCOLUMNS (
     Invoice,
     "Customer Name", RELATED ( Customer[FirstName] ),
     "Product Category", RELATED ( Product[Category] )
 )

There are another 3 functions that are very important when dealing with JOINS: NATURALLEFTOUTERJOIN, NATURALINNERJOIN and CROSSJOIN.

Consider the database model presented earlier, to obtain all the rows in Invoice that have corresponding rows in Product (INNER JOIN), including all the columns of the two tables only once:

SQL

SELECT  *
FROM    Invoice
        INNER JOIN Product 
                ON Product.ProductId = Invoice.ProductId

DAX

EVALUATE
NATURALINNERJOIN ( Invoice, Product ) 

Now, to obtain all the rows in product regardless of having a record in the Invoice table, we can use the following:

SQL

SELECT  *
FROM    Product
        LEFT JOIN Invoice 
               ON Product.ProductId = Invoice.ProductId

DAX

EVALUATE
NATURALLEFTOUTERJOIN ( Product, Invoice ) 

And to replicate the CROSS JOIN, we can use the following syntax:

SQL

SELECT  *
 FROM    Invoice
         CROSS JOIN Product 

DAX

EVALUATE
 CROSSJOIN ( Invoice, Product )

Based on what we have seen here, it is safe to say that the best way to replicate the JOIN behaviour is to create the relationships on the model to take advantage of this functionality and implement better performing and simpler queries in DAX. However, is important to mention that there are other solutions available when there are no relationships established, such as calculated tables. These solutions will be slower, complex, and can make your reports run very slow.

To the last item on the agenda (for now): the WHERE clause! Let’s talk about filtering, in DAX we can use two functions to achieve the same behaviour: FILTER and CALCULATETABLE. Using FILTER the DAX code conversion from SQL will look like this:

SQL

SELECT  * 
FROM    Member
WHERE   Status = ‘Active’

DAX

EVALUATE
FILTER ( Member, Member[Status] = 'Active' ) 

If we had multiple conditions, it would look like this:

SQL

SELECT  * 
FROM    Member
WHERE   Status = ‘Active’
        AND FirstName = 'John'

DAX

EVALUATE
 FILTER ( Member, AND ( Member[Status] = 'Active', Member[FirstName] = 'John' ) )

When using OR in the WHERE clause the DAX code looks exactly the same but put OR instead of AND.

 Let’s discuss the other way to filter data, we can also use CALCULATABLE to achieve the same result like this:

DAX

EVALUATE
 CALCULATETABLE ( Member, Member[Status] = 'Active' )

With the multiple conditions using AND:

EVALUATE
 CALCULATETABLE ( Member, Member[Status] = 'Active', Member[Name] = 'John' )

Now, the OR statements, in this case, will require the explicit FILTER argument. Let’s look at the following example:

DAX

EVALUATE
 CALCULATETABLE (
     Member,
     FILTER ( Member, OR ( Member[Status] = 'Active', Member[Name] = 'John' ) )
 )

The decision of which one to use comes down to preference, it is important to always consider both options; as your queries get bigger and have more conditions, the CALCULATETABLE option can result in an easier to understand syntax than the FILTER option.

The SQL to DAX is a very extensive topic to cover, so we’ll leave it here for now and cover more ground on part 2 of this series. Dax is a very powerful language, and there is a lot to learn when going from a background using SQL. At this point my main takeaway is that to create good performing DAX code, there is a real need to understand how the engine interprets it. As I go along this transition process, I hope to document it and share with you better ways of optimization and good practices!

Happy Learning!

For more information on the topic, visit the SQLBI on the topic From SQL to DAX

Leave a comment