SimilarPhrase Loop on Array

I need to identify all transactions involving certain companies in a dataset.  Since the names on my company list won't exactly match the names in the transaction dataset, I plan to identify variations using @similarphrase.  I have over 100 companies to search for in the dataset so I am trying to define an array containing the 100 companies and create a loop so that the @similarphrase function is performed on all 100 names.  I would need the output to be a separate column for each company name.  I am fairly new to IDEAScript and am having trouble getting this to work.  Any help would be appreciated.

Brian Element Mon, 01/22/2018 - 13:07

Hi dreynol5, you might want to check out this video on fuzzy matching: https://www.youtube.com/watch?v=Lu3mwVqE-G4&t=395s

You would append your two files together, making sure that the names from your list and the names you want to compare against end up in the same field (same field name and type in both files).  Then you run the fuzzy matching on that field and see which items from your list have matches among the other items.
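To illustrate the idea outside IDEA, here is a minimal Python sketch of "append, then group similar names". It is only an illustration: it uses Python's `difflib.SequenceMatcher` as a stand-in for IDEA's fuzzy matching (which works differently), and the company names, the `source` tag, the 0.7 threshold, and the helper names are all made up for the example.

```python
from difflib import SequenceMatcher

def normalize(name):
    # Lowercase and drop punctuation so "Acme, Inc." and "ACME INC" compare equal.
    return "".join(ch for ch in name.lower() if ch.isalnum() or ch == " ").strip()

def similar(a, b, threshold=0.7):
    # Stand-in similarity test; the threshold is an arbitrary assumption.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

def fuzzy_groups(records, threshold=0.7):
    # Greedy grouping: a record joins the first group whose first member is similar.
    groups = []
    for rec in records:
        for group in groups:
            if similar(rec["name"], group[0]["name"], threshold):
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups

# Hypothetical data: one short company list and one transaction file.
company_list = ["Acme Inc", "Globex Corporation"]
transactions = ["ACME, Inc.", "Globex Corp", "Initech LLC"]

# "Append" the two files into one, with the names in the same field.
appended = ([{"name": n, "source": "LIST"} for n in company_list]
            + [{"name": n, "source": "TRANS"} for n in transactions])

groups = fuzzy_groups(appended)
for number, group in enumerate(groups, start=1):
    print(number, [(rec["name"], rec["source"]) for rec in group])
```

A group that mixes `LIST` and `TRANS` records is a list company with at least one variation in the transaction data.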

Let me know how it goes and if you have any questions.

Brian

dreynol5 Mon, 01/22/2018 - 15:27

That is certainly a much simpler approach.  The only challenge with this approach is that it also matches companies within the transaction data that aren't on my list.  I can manually go through the results and remove the matches I don't need fairly easily.  However, any thoughts on how I can show only the matches where at least one of the matched names came from my list?

Brian Element Mon, 01/22/2018 - 18:28

Hi dreynol5,

What you can do is add a field to both files containing the file name (or something similar) so you know which file each record comes from.  Do the fuzzy matching.  Then do a summary by the Group_Name field and your new field.  Then summarize that result by Group_Name alone; any group with 2 records contains items from both files.  You can then join the fuzzy matching file to this file on Group_Name, keeping only the groups that have 2 records, and you have the transactions that match a company on your list.
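The same summarize-twice-then-join logic, as a minimal Python sketch rather than IDEA steps. The rows, the `SOURCE` field name, and the group labels are hypothetical; only Group_Name comes from the fuzzy matching output described above.

```python
from collections import Counter

# Hypothetical fuzzy matching output: each row carries its group and its source file.
fuzzy_rows = [
    {"GROUP_NAME": "G1", "NAME": "Acme Inc", "SOURCE": "LIST"},
    {"GROUP_NAME": "G1", "NAME": "ACME, Inc.", "SOURCE": "TRANS"},
    {"GROUP_NAME": "G1", "NAME": "Acme Incorporated", "SOURCE": "TRANS"},
    {"GROUP_NAME": "G2", "NAME": "Initech LLC", "SOURCE": "TRANS"},
    {"GROUP_NAME": "G2", "NAME": "Initech, L.L.C.", "SOURCE": "TRANS"},
]

# Step 1: summary by Group_Name and source -> one entry per distinct pair.
pairs = {(row["GROUP_NAME"], row["SOURCE"]) for row in fuzzy_rows}

# Step 2: summary of that result by Group_Name -> number of distinct sources per group.
sources_per_group = Counter(group for group, _source in pairs)

# Groups with 2 entries contain records from both files.
groups_in_both = {group for group, count in sources_per_group.items() if count == 2}

# Step 3: "join" back to the fuzzy output, keeping only those groups.
matched = [row for row in fuzzy_rows if row["GROUP_NAME"] in groups_in_both]

for row in matched:
    print(row["GROUP_NAME"], row["SOURCE"], row["NAME"])
```

Here G2 is dropped because both of its records come from the transaction file, which is the case dreynol5 wanted to exclude.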

Hopefully that makes sense.

Brian