Power BI DAX — Filter Functions
There are two major kinds of purposes when using DAX in Power BI, creating a calculated column, or creating a measure.
1. Calculated Columns
It is used to add new formulated columns that are visible to the table, so that later in the visualization stage, we can use this calculated column for filtering. It is best suitable for using as a flag that describes that specific row. DO NOT use calculated column for aggregation or create new numerical values, which is useless and redundant.
2. Measures
It is calculated numerical value or aggregated value that can be analyzed and displayed in the visualization. Usually used in Value field in Report view.
Measure results do not exist in the table, they can only been viewed in a visualization.
FILTER Functions
There are about 15 types of functions in Power BI DAX, in this first post of Power BI functions series, we are only going to demonstrate some common FILTER functions that analysts use for constructing complex expressions in their daily workflow. The reason that I choose FILTER functions as the first post for this series is that in most cases, you’ll be using other functions with or nested within one of the FILTER functions that we are going through in this post. Therefore, having a fundamental understanding of those most commonly used FILTER functions, will help you understand the use case scenario when you are learning other types of functions down the road.
Common DAX Filter functions
CALCULATE
CALCULATE is probably one of the most commonly used function that analysts used for constructing complex expressions.
The syntax is fairly straight forward.
CALCULATE(Expression [, [Filter1],[Filter2][,..]])
The first parameter is an expression to be evaluated. And the following optional parameters are filters that can be the following:
- A Boolean filter.
- A Table filter.
- or filter modification.
Let’s say we want to create a measure that calculate a batter’s RBIs when batting against right hand pitcher (Sports data analysis scenario), we can create the measure in the following DAX statement. Note that the following statement is an example of using Boolean filter.
CALCULATE(
SUM('at_bat_stats'[RBI]),
'at_bat_stats'[pitcher_hand] = "R"
)
In some cases, you may want to have a matrix or visualization that compare a batter’s batting average with all batters batting average in the leagues. You will have to make the overall average measure automatically ignore all filter context when slicing the data in visualization. In this scenario, we will use CALCULATE function with filter modification. The following statement is an example that no matter which batter we choose in the slicer, it will always display all batters overall batting average.
Overall Batting Average =
CALCULATE(
[battingAvg],
ALL(
at_bat_stats
)
)
In the above expression, the [battingAvg] measure is defined by following expression.
battingAvg =
DIVIDE(
SUM('at_bat_stats'[hit]),
SUM('at_bat_stats'[at_bat]
)
After creating the overall batting average measure, we can compare the batter’s performance against the average of all batter’s in the following matrix.
Now you may wonder what is the ALL function that we used in the CALCUALTE function, and that is exactly what we are going to talk about next.
ALL
ALL returns all rows of the table, or all values in a chosen column, ignoring all filter contexts. ALL function, like CALCULATE, is one of those filter functions in Power BI. ALL often time is used as an filter modification for clearing filter and creating calculation on all the rows in a table, therefore, it is most likely to be used inside of other function such as CALCUALTE.
In the overall batting average example above, we are simply saying calculate the batting average from all at_bat_stats table, no matter what batter we choose to slice or no matter what filter context we have in place.
FILTER
FILTER(TABLE, FilterExpression)
Returns an (subset) of a table, whose remaining rows met the criteria in the filter expression.
FILTER can handle more complex filter expression than CALCULATE, for example, another measures.
FILTER is an iterator function that goes through each row of the table, exam it with the filter expression, and only rows that meet the criteria stay in the returning result, therefore, it is a compute intensive function that only utilized when CALCULATE can not meet the requirements.
EXAMPLE
High Income Household Counts =
CALCULATE(
[TotalHousehold],
FILTER(
'Household Profile Lookup',
'Household Profile'[AnnualIncome] > [Overall Average AnnualIncome]
)
)
The above example shows the counts of high income families, while OVERALL AVERAGE ANNUALINCOME is a measure that gets the average of all households annual income in the entire table.
There are still many filter functions that we did not go through in this post. Those functions can be handy when you are doing all the data transformation in Power BI. However, as a person with data engineer background, I highly recommend that data transformation should be done as upstream as possible before you even consume those data into Power BI.
It is always the best practice to cooperate with data engineers in your team to form your data mart in the ways that are as close to your business analysis requirements as possible. Remember that more calculation in Power BI will decrease the dashboard user experience.
FILTER functions are commonly used for creating measures to be displayed in visualization, almost all other functions will be used either alongside with or nested within in one of the FILTER functions. In the future posts, we will be going over other commonly used types of function, such as iterator and time intelligence.