Those of us old enough to have used QlikView will remember a feature that allows showing all dimension values in a chart, even if there were zeroes or no data for those values. In Qlik Sense, unfortunately, there is no such feature. After a bit of tinkering, though, I found a solution for this. And with Spring Training well underway, I thought a blog post referring to baseball would be appropriate.
Below, I’ve compiled all the games that my team, the Chicago Cubs, played in 2018. You’ll notice that the results are split into wins (blue bars) and losses (red bars). And just for the sake of pointing it out, the developers behind Qlik Sense are clearly Cubs fans, based on their selection of bar colors!
Now, let’s say I’d like to see how the Cubs fared against our biggest rival, the St. Louis Cardinals. When I select “STL”, as you will see below, months in which the Cubs did not face the Cardinals are suddenly not shown in the chart.
That would be fine, but my (imaginary) client required very assertively that we should always see all 12 months, no matter how many months have data and how many months don’t.
My first thought was: “Well, we have that setting “Show Zeros”, which I usually disable, while grumpily muttering to myself “Why would I want to see zeros by default in all my charts?” Here is the one business case when it can be handy. I check the setting. Nothing. Still not seeing zeros.
But then, if you think about it, these values are not zeros. They are just not there. Missing Values. Where is QlikView with its full set of settings like “Show all values”?
Then, I decided to tweak the Dimension in a way that would make it ignore user selections. With the help of my two best buddies, Set Analysis and AGGR, surely I can create a Dimension that shows all values, no matter what was selected. Something like this should work:
AGGR( {1} only(MyDimension), MyDimension)
The AGGR function loads the data directly from the data set, and the Set Analysis {1} condition requires all the data in the document, disregarding user selections. I plug the same formula into my chart:
Nothing. The chart ignored all my efforts to bring back all values.
Well, it’s time to work on the Measures. My two measures are two simple counts of the field “Result”, with simple Set Analysis filters that separated Wins from Losses. Time to dress them up in AGGR() and add another aggregation function on top:
sum({Set1}
AGGR( count({Set2} MyCounter), MyDimension)
)
With two nested aggregation functions working together, I can play with the scope of aggregated data in a more advanced way than I could with a simple Count(). If I open the outer aggregation to the whole document, and keep the inner aggregation restricted to the desired values, I can get the winning combination of enabling all data and yet aggregating just what I need:
Voilà! The {1} in the outer sum() allows all data in, and the silent {$} in the inner aggregation ensures that only selected data is aggregated.
Now, when I go back to my sheet and select the Cardinals, the bar chart will display all months along the X-axis, including those for which data does not exist or renders a zero.
Interested in more techniques to optimize your data visualization? My Set Analysis and Advanced Aggregation session at the Masters Summit for Qlik will teach you creative ways to use Set Analysis, AGGR, and Alternate States for advanced analytical solutions. Interested in signing up? The next Masters Summit takes place April 1-5, 2019 in Stockholm, Sweden. There are still a few open seats left. We hope to see you there!
10 Comments
Hi Oleg,
I enjoyed a read, especially this bit”….while grumpily muttering to myself “Why would I want to see zeros by default in all my charts?”….” made laugh as i just thought of myself at this stage!
One quick question – what do you think about performance overhead of such solution (given that you are dealing with big data set – say 500 million of rows, single fact table)? I totaly understand concept, i know that there is no better solution for it at the moment, but still would like to hear your thoughts. Thanks!
cheers
Lech
Hi Lech,
Thank you for your comment! You are making a valid point – for a large dataset, like 500 million of rows, these types of solutions may be terribly slow. Having said that, I wouldn’t necessarily dismiss it without testing it first. The biggest source of extra overhead is using the AGGR() function – it creates a virtual array in memory, which can be prohibitive in large data sets. However, if your dimension is not huge (like 12 months, for example), then the AGGR() only needs to build a virtual table of 12 rows. That may actually fly even in a huge data set.
So, as a rule of thumb, I’d avoid any excessive “pyrotechnics” with a large dataset, but then – “everything needs to be tested”. It might actually not be that bad. I’d be thrilled if you tried something like this on your large dataset and posted your observations here.
Cheers,
Oleg Troyansky
In case Aggr() is too resource heavy, then a similar result can probably be achieved by tacking on to the expression something that is guaranteed to evaluate across all dimensions without changing the result – e.g. +Sum({1} 0). As this is a simple aggregation, it could be faster than Aggr(). Not something that could be authoritatively said without testing, though 🙂
Yeah, I’ve used this approach quite a few times even in QlikView times (because the “Show all values” always sorted those values which would be normally hidden to the end of a chart). But it has some performance cost, so do be careful.
Hi Oleg,
Forgive me if this is silly. What if i use something like the below in the expression and let the dimension as Month
alt(count(Results),0) * sum({1}1)
Let me know your thoughts.
Hi Pradosh,
yes, this trick works too, thank you for sharing!
Buen día Oleg, execelente artículo! Aún no he tenido la oportunidad de probarlo en alguno de mis proyectos! Una duda que me.surfe, ¿la solución requiere de las dos modificaciones que presentas en el artículo ? Es decir sobre la dimensión y sobre la medida?
Gracias nuevamente por compartir tu conocimiento!
Deberías tener un apartado en tu blog que se llame “Formas de hacer cosas que se hacen en Qlikview y que aún no se pueden hacer en Qliksense de forma nativa!
Hi Hugo, thank you for your comments!
Hi Oleg,
The post is REALLY Helpful!! It works PERFECT.
I’ve tried your formula applied in my expression. However, I use the “YearMonth” as my dimension, and It display all years in the data set.
I’m wondering if there’s possible way to display specific period? Like Latest two year or 2016~2017.
Thanks in advance
Hi Kate,
thank you for your comment. I think if you add a Year filter to your Dimension calculation, it should do the trick.
Best,
Oleg