Hi there! Long time no talk… As you can imagine, I’ve been busy – Master Summits for Qlik, my own “QlikView Your Business” master classes, etc etc etc…
Plus, it’s not too often that we see something so interesting and so exciting that warrant a blog article. Maybe I’m not looking hard enough…
Anyway, speaking of exciting things – are you familiar with AUTONUMBER? Yes, yes, that wonderful function that replaces long strings with short numeric values by numbering the strings sequentially? Of course you know about it! Everybody knows the benefit of using AUTONUMBER – smaller numbers are lighter than long strings, which leads to memory savings.
But wait, there is more! Many of you also know that the most important benefit of the AUTONUMBER function is the fact that it generates sequential integers – a sequence of whole numbers that doesn’t have any holes in it. Why is this so important, you may ask? Because this way, Qlik doesn’t need to store the Symbols for this field.
Now, if you don’t know what Symbols are in Qlik jargon, then you should definitely drop everything and check your schedule at the end of October or in the beginning of November. That’s when our next sessions of the Masters Summit for Qlik are going to take place – October 28-30 in Amsterdam, Netherlands (Amsterdam, Carl!), and November 6-8 in Washington, DC. There, you will learn all about Symbols, and why they are so important to know for developing high-performing and scalable Qlik Apps. Not to mention Advanced Data Modeling, Scripting, Visualizations, Set Analysis and AGGR, and an introduction to Extensions. A full bag of tangible knowledge, delivered by the best experts!
However, I digress. We talked about AUTONUMBER generating sequential integers. There is a caveat to that, and not every developer out there knows about it – in order to generate sequential integers for multiple fields, these fields must be qualified with their unique name spaces. So, instead of the simple form:
AutoNumber(KeyField1) as KeyNum1,
AutoNumber(KeyFIeld2) as KeyNum2,
we should be using an extended form:
AutoNumber(KeyField1, 'Key1') as KeyNum1,
AutoNumber(KeyField2, 'Key2') as KeyNum2,
“So far, nothing new or exciting”, – you might be muttering to yourself right now. But wait, there is more! Good things come to those who wait…
The good ol’ AutoNumber function comes with its own challenges though. First, it’s somewhat a pain to add it consistently to all the key fields and not mess it up at least once. Second, it’s known for being quite slow. Applying this function to several concatenated keys will certainly slow your load down. And, since we usually add AutoNumber to our scripts after we test and troubleshoot any possible issues, it always comes as a tedious chore at the end – to go through your whole script and add those darn functions to all the keys…
If only we could tell Qlik to apply AutoNumber to some of our fields at the end of the process, right?
Right, and here is the exciting news for you. Starting in June 2018 (both in QlikView and Qlik Sense), in addition to the AutoNumber function, there is a new AutoNumber command, which allows you to do just that – tell Qlik to apply AutoNumber to some of your fields. Just like this:
AutoNumber KeyField1, KeyField2;
Or, you can use an extended format, like this:
Autonumber KeyField1, KeyField2 USING '
namespace
';
Now, how cool is that?
At first glance, it sounds like we should always use the optional “name space” parameter, just like we do in the AutoNumber function. However, my testing shows the opposite. Using the shorter form of the AutoNumber command generates perfect sequential integer keys, and it happens very fast – virtually in no time. It feels as if Qlik simply dropped the Symbols for the specified fields and used the index instead.
On the other hand, the AutoNumber with the USING setting, took some time to get performed. It feels as if Qlik had to rebuild the index from scratch. Perhaps, it allows developers to place multiple key fields in the same “name space”. I’m still trying to figure out why someone would want to do that, knowing that this option will not allow as much memory savings as the simpler option. If you can find a compelling reason to do so, please tell me about it in comments below.
When I shared the news with my colleagues from the Masters Summit team, Barry Harmsen got so excited that he immediately created a little utility that allows you to automatically AutoNumber all of your key fields. You can find Barry’s article here. Next, Barry discovered that the AutoNumber command allows wildcards, which makes his utility… er, less-than-super-useful… “Oh well, it was still a cool exercise to do!” – said Barry in response.
So yes, you can AutoNumber all of your key fields at once, especially if you use any kind of a consistent naming convention for your keys. For example:
AutoNumber '%*';
– if your keys fields always begin with the %-sign.
Or,
AutoNumber '*_Key';
– if your key fields always end with the word ‘_Key’
Notice that you should use quotes when using wildcards for your fields.
That’s it, boys and girls! These are all the exciting news I have for you today – well, other than reminding you that the next Master Summit sessions are coming soon to Amsterdam and to Washington, DC. And, in addition to our “traditional” track for QlikView and Qlik Sense developers, we are excited to introduce a new “API track” for Web developers that want to extend Qlik Sense functionality with extensions, mashups, apps that use the Qlik Core platform, and more. This class is one of a kind, and it’s delivered in parallel with our traditional Qlik content.
At this time, the early bird pricing has expired, but for the readers of this article, I got an agreement with the Masters to honor the Early Bird Price for two more weeks. Use the promo code AUTONUMBER during registration to get the Early Bird price until October 21st. See you at the Summit?
10 Comments
great post
And it also works in Partial Reload!
(Yes, I’m a partial reload addicted, I use it whenever I can while developing my apps)
Great post. Is there a reverse to AutoNumber? I find AutoNumber really helpful when dealing with large data sets and merging them together for ETL purposes but in order to reach the end goal, I have to drag the original values in a separate column. Given the translation of the Key Values and Number are already in the memory during the load script, I was wondering if there is a command that we can use to “Un-AutoNumber” at the end of the process. Thanks in advance.
Hi Syed,
thank you for your comment! Unfortunately, there is no reverse function to AutoNumber. When you need to keep the original values, I’d recommend to just use the original fields and not reduce them using AutoNumber. The only advantage of using AutoNumber is the ability to drop the original values and save on the Symbols memory. If you need these values, then AutoNumber doesn’t help at all. There is a common perception that tables would link faster with numeric keys. I couldn’t validate that in my testing. I believe it’s a myth. So, I’d simply use the original values in this case.
Cheers,
Oleg Troyansky
If you want to convert a key from alphanumeric to numeric but you don’t want to lose the alphanumeric values, you might use this little piece of code:
rename field MyKey to %MyKey; // the field %MyKey will be the new numeric key
%MyKeyTable:
Load
FieldValue(‘%MyKey’,RecNo()) AS %MyKey,
FieldValue(‘%MyKey’,RecNo()) AS MyKey // this field will retain the distinct alphanumeric values
Autogenerate FieldValueCount(‘%MyKey’);
Autonumber %MyKey;
Hi Code the Coder,
Thank you for your comment and for the suggestion. This idea will certainly work, however let me point out that it would be counter-productive for the App memory footprint. The only reason to use AutoNumber is to save on the memory footprint of the alphanumeric field. If we keep both the original field and the auto-numbered field, we are actually adding extra memory instead of saving it. In this case, it’s better to just use the original field and not apply AutoNumber at all. Qlik does its own indexing behind the scenes anyway, so we can just use the original field “as is”.
Cheers,
Oleg Troyansky
Ok, but what if you believe to the “common perception that tables would link faster with numeric keys”?
Sometimes, and not only for testing purposes, you need to keep the alphanumeric values.
Awsome. Very helpfull!
Hi Oleg,
I’ve read your article several times but I don’t quite get why
in order to generate sequential integers for multiple fields
we should better use the extended form AutoNumber(KeyField1, ‘Key1’) as KeyNum1
instead of the simple form AutoNumber(KeyField1) as KeyNum1?
What are the real advantages of the extended form compared to the simple form?
As far as I know the simple form works perfectly well with multiple fields as well.
Many thanks in advance!
Cheers,
Denis
Hi Denis,
You and have discussed this topic over the phone. For the benefit of other readers, I’ll try to briefly explain here.
When multiple key fields – NOT the same key field in multiple tables, but actually different keys, need to be optimized with the function Autonumber(), they will all share the same list of unique values. This list will be a sequence of integer values, however the list will be shared between multiple fields. Hence, each individual field will not be represented by a list of sequential integer numbers. In order to ensure that, the second parameter allows qualifying each key field differently and then each field gets a separate list of integers.
That being said, if you use the recently added command Autonumber at the end of your script, then each field will get its own list of sequential integer by default. In that case, specifying a single “Name Space” for multiple key fields is acutally counter productive.
Hopefully this clarifies the question for everyone,
Oleg