SAP is about to release a cool new feature that they call Intelligent Lookup and brand as “Empowering Business Users to harmonize data with ease where standard techniques fail”. It is supposed to solve data integration challenges where datasets match “in principle” but lack a common join key that so that a standard database join can merge the two datasets.
Use Case: Guest orders
We tested this new development by taking a specific use case that is known to us from our projects with retail customers: in the case at hand, buyers in a web shop could order either as guest or register for a new user. As you certainly know it from your own online shopping experience, buyers often cannot be bothered to register with the web shop, cannot find the credentials they used for their last order or only register after a couple of good experiences with that same web shop.
What is convenient for the buyer poses quite a data integration challenge for the retailer: how do they map all orders to the same customer id, if the order itself has been posted as guest. If they could map, that would be helpful for reporting and targeted messaging. But how to get there? Ideally – but not always – users would be using the same email address, but what if they don’t? One could revert to compare the name and address of order record and customer master, but again, even slight spelling differences could lead such an approach to failure.
SAP claims that with Intelligent Lookup, essentially any matching problem can be tackled regardless of the problem domain. So, we decided to give Intelligent Lookup a shot at this problem.
Intelligent Lookup has a unique approach to this matching problem in that it is essentially an interactive environment for building a mapping table between the two datasets. If we had such a table in our concrete case, we’d know what order id matches to what user id. Intelligent Lookup is a standard operator in the Data Builder of SAP Data Warehouse Cloud along the same lines as a table, a graphical view, a SQL view or a data flow.
The way Intelligent Lookup finds those mappings is by letting the user specify a series of mapping rules that compare relevant field values between order records and customer records. These rule either succeed (we have a match and use that now) or fail (let’s formulate another rule and use that instead). Intelligent Lookup today supports exact comparison as well as fuzzy matching. Taken together, these two can solve many mind-boggling data integration problems that are otherwise too hard for even SQL experts to handle.
Intelligent Lookup: Step-by-Step
Once we modelled the mapping rules of the Intelligent Lookup, our resulting canvas looked like this:
For a better understanding let’s go through this step by step:
We are trying to look up for 5k sales records for their respective customers.
The first rule looks for an exact match in the customer number:
This rule is successful in 91% of the cases but failed to hit for all those guest orders without a provided customer number, that are the core of our data integration problem.
For those 9% of the records where the matching could not be done with the customer number, a further rule is created.
That compares the email addresses used in the order and the one from the customer master:
With this rule another part of our guest orders can be matched, as the same email address was used.
But there are still 8% of the records where no match is found using the email address.
Our final rule then is a fuzzy rule that fuzzily compares address & names of order records and customer master. This is where things become really interesting:
- Out of those 395 records that flow into it, 289 are solved 100% (because address & names are fully identical)
- 47 records are now up for review (because the system spotted slight differences in spelling, but only with exactly one record). Users can now individually confirm or reject the system proposal.
- Another 8 are now in a category that SAP calls “multiple match”: the system found several possible hits and users can now manually resolve each one of them in the work area or tie in another rule that does it for them. With 8 records only, we are happy to do that manually, but if it had been hundreds, we’d chain in another rule that would have made that decision for us.
- We are finally left with only 51 records for which the system cannot make out a match. They are genuine new customers that ordered through a guest account.
All these matches as well as unmatched records finally get piped into the output node. We could configure what fields to look up from the lookup table, to include or discard failed matches and fill unmatched elements with default values instead of NULL.
This output view of Intelligent Lookup can subsequently be consumed in any other DWC artefact – be it a view, data flow, business layer or even in SAC. It’s truly a first-class citizen in that respect.
Summary
Once we understood the mechanics of Intelligent Lookup, that whole process took us less than 20 minutes. It would have been extremely challenging to integrate them via standard database or ETL tooling and we were amazed how quickly and intuitively things were solved. We are eager to explore more data cases. We foresee that especially in the context of the about-to-be launched Data Marketplace (Link), such harmonization cases will happen a lot, simply because the governance of SAP’s internal data and that external data from Data Marketplace is different, thus again leading to a lack of common join key between the data sets. We’ll keep you posted as we learn more.
Video
We recorded the hole usecase for you, so that you can follow along every step directly in the system: Watch the video