Fuzzy Duplicate Checking in an Inventory Master Dump

Hello Group Members,

 
Quite often we are required to examine master data files from a Data Quality Management viewpoint.
 
In an Inventory Master dump, the Item Description is a free-text field that users key in when registering an inventory item in the system. More often than not, the same item is re-entered with minor changes to the text, because nobody first checked whether it already exists in the masters.
 
So, as an example, an existing item like 'M.S. Flange 34/5' gets incorrectly re-entered as 'MS Flange 34-5'. The second entry is an unnecessary duplicate that just adds to the redundancy in the master data.
 
To unravel such cases, the Duplicate Key Detection in IDEA will not serve the purpose, as we are looking not for exact duplicates in the Item Description but for near duplicates.
 
It is in situations like this that IDEA's Fuzzy Duplicate proves immensely valuable.
 
By applying Fuzzy Duplicate to the Inventory Master list with 'Item Description' as the key field, the user can set the similarity degree to 85% or 90% and capture similar item descriptions.
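
For members who want to see the idea outside IDEA, here is a rough sketch in Python. It is not IDEA's algorithm (the tool's internal similarity measure is not described in this post); it uses the standard library's difflib.SequenceMatcher as a stand-in, and the function names and the 85 threshold are only illustrative assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score between two strings, ignoring case.

    Assumption: difflib's ratio stands in for IDEA's similarity degree.
    """
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_duplicates(descriptions, threshold=85.0):
    """Return (i, j, score) for each pair of descriptions scoring at or above threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(descriptions), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((i, j, round(score, 1)))
    return pairs


# Hypothetical three-row Item Description column
items = ["M.S. Flange 34/5", "MS Flange 34-5", "SKF Bearing"]
matches = fuzzy_duplicates(items, threshold=85.0)
print(matches)  # the two flange entries are reported as one near-duplicate pair
```

Note that this compares every pair of rows, which is fine for a small sample but slow on a large master file. An exact-duplicate check would miss the flange pair entirely, since the two strings differ in punctuation.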
 
What's more interesting is that the Fuzzy check can be applied to multiple fields too. So if the item description is spread across, say, two fields, 'Description' and 'Variant', with sample text like 'SKF Bearing' as the 'Description' and 'Spec No - 78' as the 'Variant', Fuzzy can be used to tag and present near-match text strings like 'SKF Bearring' and 'Spoc Num / 78' as likely duplicates.
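
The multi-field case can be sketched in the same spirit. How IDEA combines the fields is not stated here, so the averaging of per-field scores below is purely an assumption, as are the function names and the threshold of 80. The similarity measure is again Python's difflib, not IDEA's own.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score between two strings, ignoring case."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_similarity(rec_a: dict, rec_b: dict, fields) -> float:
    """Average the per-field similarity scores (an assumed way of combining fields)."""
    scores = [similarity(rec_a[f], rec_b[f]) for f in fields]
    return sum(scores) / len(scores)


original = {"Description": "SKF Bearing", "Variant": "Spec No - 78"}
reentered = {"Description": "SKF Bearring", "Variant": "Spoc Num / 78"}

score = record_similarity(original, reentered, ["Description", "Variant"])
is_near_duplicate = score >= 80.0  # illustrative threshold, tune for your data
print(round(score, 1), is_near_duplicate)
```

With this stand-in measure, the short and heavily altered 'Variant' text scores noticeably lower than the 'Description' text, so the threshold that works for one measure or field will not necessarily carry over to another.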
 
Regards
 
Group Administrator