Change order of words in a field alphabetically

7 posts / 0 new
Last post
Daniel Pech
Offline
Joined: 11/06/2014 - 05:03
Change order of words in a field alphabetically

Hello, has anyone, please, been dealt with this issue? I am preparing a macro to be able to compare 2 databases which includes names of persons, and find the matches. There can be several gaps in the names and the individual words can be in different order in the databases.  Therefore I need to create in each database a common field for the comparison – a name field with individual words sorted alphabetically.
Preparing one of the databases for comparison:
Default state:

I managed to separate individual words from the whole name and put it below each other to be able to sort it alphabetically (by “RECNO”, “NAME_KEY”), like this:
Middle step:

Now I have stucked in the final step. I need to revert it back to individual rows and put together the whole name but with words in alphabetical order, like this (can be with or without gaps between words):
Final state:

Then I would do the same steps with the second database and compare the fields “WHOLE_NAME_SORTED” between themselves if there is a match.
I tried to use a function “GetPreviousValue”/”GetNextValue”, but it doesn´t work within the same column (compared to excel)…
Maybe this proceeding is too much complicated, and something easier can be used.
Thank you very much in advance.
Regards,
Dan

Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

Hi Daniel,

I am not sure exactly what you are trying to accomplish.  For the final step why don't you just do an extract of the first database and remove the blanks, wouldn't that give you the final database you are looking for?

Maybe if you could show the second database and what you are trying to accomplish with the two I might have an easier solution for you.

Daniel Pech
Offline
Joined: 11/06/2014 - 05:03

I am sorry for my late reply. I have found a solution. Maybe not optimal, but it works finally. The description is attached in a pdf document.

Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

Hi Daniel, I have an alternative scenario for you to try and see if it works for you as it might be a bit easier.

Import each file and add a new virtual chracter field that removes the spaces and any punction in the name.

II then appended the two files together to get one file.

II then used a function from the Custom function package that you can get through the IDEA Lab.  It is the String to ASCII function.  What it does is adds up all the ASCII characters within the field.  So if all the letters are the same but in different order the value returned would be the same.

The comma one turns all the character to upper case so you don't have to worry about upper and lower case.

Your new file now looks like this:

II then did a duplicate detection to get all the duplicates and got the following:

Which is the same as your finding.  You might want to try the two out together in a real scenario and see if you get the same answers.

Good luck with your project.

 

 

 

Daniel Pech
Offline
Joined: 11/06/2014 - 05:03

Thank you a lot, Brian, for your help. Your solution looks much better. I will test it (first i have to upgrade to idea ver.12). It is always good to have several solutions at disposal... :-) BR, Dan.

Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

If you have a chance let me know how it works out for you.

Daniel Pech
Offline
Joined: 11/06/2014 - 05:03

Hi Brian,
I have tested the proposed solution (via the "String to ASCII" function). Testing on original data gives unfortunately too many false findings. Even if the items in the field "Update_name" have different string length, sometimes the same ASCII number is assigned.
I think I cannot use it. Especially if the letters like "á", "í" (typical e.g. for the Czech, or Hungarian language), or other special characters (e.g. for the German language) can be part of the name (the original data). But thank you anyway for the tip.