Like many of my discoveries, this article was inspired by a Qlik Community conversation that followed an old, but still brilliant blog post by Henric Cronström, which describes the “Expression Search”. In one of the comments, a curious guy named Daniel asked a question, which led all of us to this interesting realization.
First, let’s clarify the terminology. In the early days of Set Analysis, we knew two types of searches – the Simple Search and the Advanced Search.
The Simple Search could be textual:
or Numeric:
While Advanced Search used an expression that started with the equals sign:
Today, all three forms of search still exist, but the current Qlik documentation uses different names for them. Now we call them “Text Searches”, “Numeric Searches”, and “Expression Searches”. The latter shouldn’t be confused with a new little-known feature that could also be called “Expression Search”, as described recently by Rob Wunderlich in this blog article. This kind of search contains an expression in place of the field:
With that in mid, to avoid any confusion and to be very explicit, I’ll continue calling the Expression Search by its old-fashioned name, “Advanced Search”.
So, other than obvious differences in syntax, are there any substantial differences between the Simple and Advanced searches?
For example, are these two searches identical?
vs.
[Date] = {“=Date<41640”}I bet most of you, curious folks who read this blog, are nodding your heads in approval – yes, these two filters look practically identical. Some of you might point out that simple searches with Dates require the dates to be fully formatted as dual date fields, while Advanced Searches are more forgiving and allow both the dual and the numeric date formats. Other than that, they look the same, right? Until today, I surely believed so. Until Daniel asked his question, and I went down the rabbit hole of testing and shaking my head in disbelief.
So, here is the issue. Let’s say that we need to select dates that belong to the prior year and show a KPI of “Prior Year Sales”. Surely we want to protect this calculation from any user selections in the calendar fields. So, we need to ignore selections in fields like Year, Month, Quarter, etc. For simplicity, let’s just ignore selections in the field Year. So, the formula for last year sales could look like one of the following (by the way, my testing data set, which came from my book QlikView Your Business, contains data for years 2013 and 2014:)
Sum( {$<Year=, Date = {“=Date<41640”} >} [# Amount])
Sum( {$<Year=, _PYTD_Flag={1} >} [# Amount])
Sum( {$<Year=, Date = {“=_PYTD_Flag=1”} >} [# Amount])
Sum({$<Year, Date = {“=Year(Date) = $(=max({1} year([Date])-1))”} >} [# Amount])
Sum( {$<Year=, Date = {“=Year(Date)=2013”} >} [# Amount])
Conceptually, all of these expressions are supposed to produce identical results that contain Sales for Year 2013, and they do, as long as no selections are made in the field Year:
However, what happens when we select Year 2014? We still want to show Sales for 2013, hence we asked to disregard selections in the field Year. But look, they are not always disregarded:
Only columns 1 and 3 continue showing the right numbers, and all other expressions show zeroes. What is the difference between columns 1 and 3 and the rest of the columns?
So, what’s wrong with Advanced Search? It looks like our instruction to ignore selections in the field Year wouldn’t work in combination with Advanced Search? Have we landed on a bug, or is it doing something that we simply don’t understand?
Well, while we were struggling with this mystery, both Daniel and myself came to the same realization, which was later confirmed by Henric (thank you, Henric!). Let’s try to “peel the onion” together.
The Advanced (Expression) Search involves an expression. (Well, duh, everybody knows that!) It’s not just any expression, but an aggregation that is evaluated at the level of the Field that appears in the Set Modifier. In our examples, it’s the field Date. Let’s examine a simplistic Expression Search like this:
It includes an aggregation, which is performed at the Date level. Wait a minute, you might say. I don’t see any aggregation functions here! What kind of an aggregation is that? Well, just like it happens in chart measures, when no aggregation is specified explicitly, the default aggregation function is the ONLY() function. So, if you really need to see that aggregation function, here it is, only it’s usually implicit:
OK, now we can all agree that this is an aggregation, right? Like any other aggregation, it has its own data scope, correct? By default, the data scope for this aggregation is the “Current Selections” data scope, which is usually identified by Set Analysis {$}. In this data scope, there is no instruction to ignore the field Year, is there? Hence, the Expression within the Advanced Search condition does not know of our desire to ignore selections in the field Year. It is evaluated independently from any other Set Modifiers that are placed outside of this Advanced Search condition.
So, there’s the reason why seemingly identical conditions produce different results. Simple Searches don’t involve any additional aggregations, and these modifiers’ results are being intersected with all other modifiers, just as we expect. But Expressions within the Advanced (Expression) Searches “have a life of their own” – they have their own aggregation scope, and unless explicitly specified, they evaluate their data independently from all other modifiers.
With this in mind, we can now fix our non-working Advanced Searches with the explicit use of the ONLY() function, and with Set Analysis conditions that explicitly require the field Year to be disregarded in these aggregations as well:
Sum( {$<Year=, Date = {“=ONLY({<Year=>} _PYTD_Flag)=1″} >} [# Amount])
Sum({$<Year, Date = {“=ONLY({<Year=>} Year(Date)) = $(=max({1} year([Date])-1))”} >} [# Amount])
Sum( {$<Year=, Date = {“=ONLY({<Year=>} Year(Date))=2013″} >} [# Amount])
Voilà! Problem solved. Now, what about that NEW Expression search that I mentioned earlier? Ah yes, some of these Advanced Search conditions can now be somewhat simplified. The same conditions that requires the Year(Date) to be limited to a search value, can now be written like this:
Sum( {$<Year=, “=Year(Date)” = {2013} >} [# Amount])
Note that these modifiers no longer include Advanced (Expression) Search conditions. In fact these are not search conditions at all – we are simply selecting specified values in … a field? … an expression? Hmm… In fact these are called Calculated Fields, but this is a whole different blog article that Henric published in April 2020, while the rest of us were busy “flattening the curve”…
So, there’s another Set Analysis mystery that we managed to understand and clear up. Special thanks to Daniel, who started the conversation, and to Henric, who confirmed our conclusions.
This topic will definitely find its place in my lecture on advanced Set Analysis and AGGR at the Masters Summit for Qlik, which by the way is coming to the fabulous New Orleans on November 14-16. There is still time to secure your seat and to book your flights. You will learn many advanced development techniques, from Data Modeling to Scripting to Visualizations to Performance Tuning, and even an introduction into mashups and extensions. Even the most seasoned experts leave our events with a lot of new knowledge acquired. And… did I mention it’s in New Orleans?
2 Comments
Hi Oleg, Would the new option of Qlik’s set analysis where you can put the set above and outside of the expression solve this problem with advanced? {} Sum( {$<Date = {“=ONLY(Date)} [# Amount])?
Hi Debbie,
Good question. I tested it, and it doesn’t look like to would. The outer set doesn’t seem to get applied to the search expression.
Cheers,
Oleg