
Two File Recon on Free Text User Entered Fields

Hello Group Members,

 
Let's say we have two delimited files in IDEA.
 
The first contains the fields Unique Code, Document Reference Number and Amount, with sample records such as:
 
U1, ABC/1819/1234, 45789
 
U1, 007, 35
 
U2, PQR/2018-19|1234A, 567843
 
The second file contains the fields Code, Ref Number and Value, with sample records such as:
 
U1, aBc20182019BILL1234, 45789
 
U2, pqR18191234, 567843
 
U2, TYPX2, 245
 
Notice that the second field (i.e. Document Reference Number) is a free-text, user-entered field. Users tend to follow their own style when entering the Document Reference Number.
 
Since each Unique Code may have multiple Document Reference Numbers, we cannot match on Unique Code alone. We have to include the Document Reference Number in the Join to get the correct results.
 
So can the Group suggest innovative ways of executing the Join on the Unique Code and Document Reference Number, keeping in mind the variability of the Document Reference Number and the need for a close-proximity match?
 
I have already tried @Strip(@upper()) and @JustNumbers() and got many valid matches. But I am still left with many unreconciled items, for which I am looking for innovative matching methods.
 
Thank You
 
Group Admin
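
For readers who want to see why the normalisation approach in the question leaves items unreconciled, here is a minimal Python sketch (not IDEA code). The helpers strip_upper and just_numbers are hypothetical stand-ins, assuming @Strip removes spaces and punctuation, @upper upper-cases the text and @JustNumbers keeps only the digits. The records are the sample rows from the question.

```python
import re

# Sample records from the question: (code, reference, amount)
FILE1 = [("U1", "ABC/1819/1234", 45789), ("U1", "007", 35),
         ("U2", "PQR/2018-19|1234A", 567843)]
FILE2 = [("U1", "aBc20182019BILL1234", 45789), ("U2", "pqR18191234", 567843),
         ("U2", "TYPX2", 245)]


def strip_upper(ref: str) -> str:
    """Assumed analogue of @Strip(@upper(ref)): letters and digits only, upper-cased."""
    return re.sub(r"[^A-Za-z0-9]", "", ref).upper()


def just_numbers(ref: str) -> str:
    """Assumed analogue of @JustNumbers(ref): digits only."""
    return re.sub(r"\D", "", ref)


def exact_join(file1, file2, key):
    """Join on code plus the normalised reference; return the matching pairs."""
    return [(c1, r1, r2)
            for c1, r1, _ in file1
            for c2, r2, _ in file2
            if c1 == c2 and key(r1) == key(r2)]


if __name__ == "__main__":
    for name, key in (("strip_upper", strip_upper), ("just_numbers", just_numbers)):
        print(name, exact_join(FILE1, FILE2, key))
```

On these sample rows neither key produces an exact match, because the year is written as 1819, 2018-19 or 20182019 and extra words such as BILL appear. That is the kind of residue a similarity score has to handle.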

 

osaajah Fri, 05/25/2018 - 02:56

Hi Brian, you can use @similarword or @similarphrase, or a combination of both. But first, you have to use Visual Connector to join both files. Then create a virtual numeric field with the parameters shown in the attached picture. Last, apply a criterion on the virtual field to display only the results with the minimum degree of similarity you want. For example: FUZZY_MATCH >= 70.
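
The join-then-score workflow described above can be sketched outside IDEA as follows. This is only an illustration under stated assumptions: Python's difflib.SequenceMatcher stands in for @similarphrase and @similarword (it is a different algorithm, so scores will not equal IDEA's), and the names normalise, similarity and fuzzy_join are hypothetical. The records are the sample rows from the question, and the threshold of 70 mirrors FUZZY_MATCH >= 70.

```python
import re
from difflib import SequenceMatcher

# Sample records from the question: (code, reference, amount)
FILE1 = [("U1", "ABC/1819/1234", 45789), ("U1", "007", 35),
         ("U2", "PQR/2018-19|1234A", 567843)]
FILE2 = [("U1", "aBc20182019BILL1234", 45789), ("U2", "pqR18191234", 567843),
         ("U2", "TYPX2", 245)]


def normalise(ref: str) -> str:
    """Drop punctuation and spaces, then upper-case, before comparing."""
    return re.sub(r"[^A-Za-z0-9]", "", ref).upper()


def similarity(ref1: str, ref2: str) -> int:
    """Score from 0 to 100 using difflib (a stand-in, not IDEA's algorithm)."""
    ratio = SequenceMatcher(None, normalise(ref1), normalise(ref2)).ratio()
    return round(100 * ratio)


def fuzzy_join(file1, file2, threshold=70):
    """Pair records with the same code, score the references, keep high scores."""
    results = []
    for code1, ref1, _ in file1:
        for code2, ref2, _ in file2:
            if code1 != code2:
                continue  # the join key is still the exact Unique Code
            score = similarity(ref1, ref2)
            if score >= threshold:
                results.append((code1, ref1, ref2, score))
    return results


if __name__ == "__main__":
    for row in fuzzy_join(FILE1, FILE2):
        print(row)
```

With these sample rows, only the U1 pair ABC/1819/1234 and aBc20182019BILL1234 and the U2 pair PQR/2018-19|1234A and pqR18191234 clear the threshold; 007 and TYPX2 stay unmatched. Comparing the Amount fields as an extra check on the surviving pairs would be a natural refinement, though it is not part of the reply above.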
 
 

Steven Luciani Fri, 05/25/2018 - 09:20

Hi osaajah,

I love your solution. I have never had the opportunity to apply @similarphrase or @similarword in the work I undertake with IDEA.

Steve

osaajah Sun, 05/27/2018 - 12:50

Hi Steven & Brian,
Brian's case is similar to one I had a few years ago. Luckily, at that time IDEA launched version 9.1, which came with two new functions: @similarphrase and @similarword. Those functions helped me a lot in solving that case, using the solution I posted above. I hope it also suits Brian's case.
 
Firdaus Sentosa