This interesting story originated from a question that we received at the Q&A session on the recent Masters Summit for QlikView in Barcelona. I’d say, it’s another little nugget that we uncovered in our sessions. If you follow QlikView blogs, you probably read about many unexpected findings that were shared on our recent events. Henric Cronström, for example, realized that the rest of the world outside of the R&D department still considers Count(Distinct …) to be a bad practice, and he posted his research article in the QlikView Design Blog. Roland Vecera shared with us that DROP FIELD does not release all the memory that was allocated for a field, and Rob Wunderlich blogged about it here. One of the challenging questions we answered in London was about facilitating “OR-mode selection” between list boxes, and Barry Harmsen blogged about it on QlikFix.com.
This question was asked in Barcelona, and it almost stomped us, at least for a few moments. Imagine that you need to rank Products based on their Sales. “Easy!” – exclaims every QlikView developer who is familiar with the function Rank():
=rank( sum(Sales) )
This expression will do the job, right? Right… Here is the result. You will probably recognize the familiar data elements from the standard Developer Class:
This is where the problem begins… Let’s select 5 products from the list. The Straight Table will only show the selected Products and rank them from one to five:
Well, says the client. I’d like to see the original ranking. Bike Helmet should be number 20 and Bow Tie was originally number 43. I need to preserve the original ranking, despite the selection… “Easy!” – exclaims every QlikView developer familiar with Set Analysis. We can use Set Analysis {1} to rank the Products using the unfiltered data set. We can even get more sophisticated and respect all other user selections, except for the selection in Products:
= rank( sum( {$<Product=>} Sales) )
This way, the rank will be based on a modified set of data, and the original Rank order will be preserved. Let’s modify our expression, and here is what we see as a result:
Hmm, now the chart simply doesn’t respond to the Product selection, and always shows all products. Well, yes, it’s a side effect of ignoring the Product selection in Set Analysis.
Now what? Well, says a bit more experienced QlikView developer, we’ve done those things before. We need to condition the result in such a way that we only show the Rank if the product is associated with the current selections. For example, we will only show the Rank if total Sales for the Product are different than 0. This way, unselected Products will render zero sales, and therefore will be excluded from the chart. Something like this expression will do the job:
= if (sum(Sales) <> 0, rank( sum( {$<Product=>} Sales) ))
Since the first function sum(Sales) doesn’t have any Set Analysis, it will only return a non-zero value for selected products, and then the Rank function will do the rest. Let’s modify our expression and see what happens:
Hmm, this was not what we expected to see… The Ranks are, indeed, preserved, and they are only displayed for the five selected products, but the other products are still displayed. And, to exclude any possibility of a mistake, we verified that “Suppress Zero-Values” and “Suppress Missing” are both set to “Yes”, in the Presentation Properties. “What’s going on here?” – exclaims every QlikView developer who understands how QlikView works. This is not how QlikView should deal with unassociated data! Well, quite frankly, we think it’s a glitch related specifically to the function Rank. For some reason, Rank is causing the chart to show missing values, despite the settings.
At this point in our Q&A session, all four of us agreed that this is probably a bug, and we moved on to the next question. While another question was presented, we kept thinking that there must be a work around… Earlier in our Set Analysis session, we learned how Set Analysis works in combination with AGGR, and how Set Analysis in the inner aggregation function may be different from the Set Analysis in the outer aggregation function. Our conclusion was that in most cases, we need to repeat the same Set Analysis conditions both in the inner and the outer aggregations. And this is where the work around for this problem was found.
=avg(aggr(rank( sum( {$<Product=>} Sales) ), Product))
If we enclose our Rank inside an AGGR, aggregated by the same Chart Dimension, and add the outer aggregation function (could be any function like Avg, Min or Max), we could possibly reach the desired effect:
Voilà! The list is limited to the selected Products, and the original Ranking is presented. As an added benefit, we even got a total value, which is an average Rank of the selected Products. Why does it work? Well, the answer is hiding in the understanding of how Set Analysis and AGGR work together. The Set Analysis in the inner aggregation is using the modified data set, therefore calculating the original Rank accurately. The outer aggregation, in this case with no Set Analysis, is selecting the data associated with the current selection, causing the list to be limited to the selected products only. And the total appeared because the Rank is calculated at the Product level, due to the AGGR, and the results are then averaged out. This was not possible using the original function Rank with simple aggregation.
Admittedly, this is a trick and a work around, to overcome the “undocumented behavior” of the function Rank, but we think it’s a pretty good one. And it requires an advanced understanding of Set Analysis and AGGR, – something that we offer at the Masters Summit for QlikView. Our next event is scheduled in Chicago on April 1-3, 2014. Please check our agenda and decide if Masters Summit for QlikView is a good fit for you. You can even take a quick Self-Assessment Test that could help you make an informed decision.
0 Comments
[…] In addition to the four main courses, we will have our evening sessions that usually include one of the guest presentations and an expert panel. Our “Stump the Geek” panel was very successful in Barcelona, where we encouraged the participants to bring their toughest QlikView questions and stump the panel of experts. We answered several tough questions and we built several neat solutions on the fly (for an example, see my blog article here.) […]
This is a great post on ranking. Thanks Oleg!
Oleg – this post just saved me on my current project. THANKS!
In you ranking solution above say if a business is null and we have sales associates against this null businesses, so avg() will not rank them.
I tried this for my data and sales for null businesses should have been ranked #1,
even if i do suppress when value is null the rank starts from 2.
if i do suppress when value is null,suppress zero values rank starts from from 1.
what if i want to rank to suppress this null value and rank should start from 1. ignore this null businesses in ranking also?
I’d say, first of all you should look for the reason for having null values – many times, null values point at a hidden data issue. If the null values are acceptable, and you simply want to disregard them in your ranking, then you should add another Set Analysis condition that excludes Null values from the ranking.
Thank
I was enjoied to read it
Just a quick note to let everyone know that the bug that stopped the 2nd to last solution from working appears to have been corrected in the most recent version of Qlik Sense 3.0x.