Duplicate Journal Entries

tkh
Joined: 06/14/2018 - 00:22
Duplicate Journal Entries

Hi IDEA Folks,
I am new to IDEA and data analytics. I have a question on detection of Duplicate Journal Entries.
I was thinking of using the date of entry (date), account (chr), cost centre (chr) and amount (num) as the fields to determine a duplicate journal entry. Not sure if this is the correct method.
I was thinking of appending a new virtual chr column with the expression
date_entry + account + cost_centre + amount
However, I am getting an error on data type since there is a mix of date, character and number data types. I tried converting it all to characters with
@Dtoc(date_entry, "DD/MMM/YYYY") + account + cost_centre + @Chr(amount)
Although the expression is "valid", unfortunately this doesn't seem to work, with some fields showing a red "Error" and others containing special characters.
Would appreciate any help on whether there is a better way of doing this or whether my expression is correct.
Thanks!
tkh

Brian Element
Joined: 07/11/2012 - 19:57

Hi tkh and welcome to the site.

The problem with your formula is that @Chr doesn't convert a number into its text representation; it returns the ASCII character for a given code, so @Chr(65) will return a capital A.  The function you are looking for is @Str, which takes three parameters: the first being the number, the second being the number of characters to output and the third being the number of decimals.
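So the corrected IDEA expression would look something like @Dtoc(date_entry, "DD/MM/YYYY") + account + cost_centre + @Str(amount, 12, 2) (the width of 12 and 2 decimals are just illustrative choices). Outside IDEA, the same key-building logic can be sketched in Python; the field names follow the question and are purely illustrative:

```python
from datetime import date

def make_key(date_entry, account, cost_centre, amount):
    """Build one character key from mixed-type fields, mirroring
    @Dtoc(...) + account + cost_centre + @Str(amount, 12, 2)."""
    return (
        date_entry.strftime("%Y%m%d")   # date -> text, like @Dtoc
        + account
        + cost_centre
        + f"{amount:012.2f}"            # number -> fixed-width text, like @Str(n, 12, 2)
    )

key = make_key(date(2018, 6, 14), "4000", "CC10", 125.5)
# "201806144000CC10000000125.50"
```

The fixed-width, zero-padded amount matters: without it, 12.5 + "0" and 125.0 + "" could collide or sort inconsistently.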

Since you are doing duplicates you don't actually have to put all the fields together as duplicate key detection allows for you to select up to 8 different fields to look for your duplicates.
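In other words, the tool compares the selected fields directly, so no concatenation is needed. Conceptually it is the same as grouping rows on a multi-field key, which a short Python sketch (with made-up journal rows) can illustrate:

```python
from collections import Counter

# Illustrative journal rows: (date, account, cost centre, amount).
# Selecting several key fields at once is equivalent to counting
# occurrences of the whole tuple -- no concatenated column required.
rows = [
    ("2018-06-14", "4000", "CC10", 125.50),
    ("2018-06-14", "4000", "CC10", 125.50),  # exact repeat of the row above
    ("2018-06-15", "5000", "CC20", 80.00),
]

counts = Counter(rows)
duplicates = [row for row, n in counts.items() if n > 1]
# [("2018-06-14", "4000", "CC10", 125.5)]
```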

There is no hard and fast rule for doing duplicate key detection; it depends on the information you have available and the different ways (there are usually several) that duplicates can appear in your population.  This is where a good understanding of your data really pays off.

Generally I would sit down and map out the different ways duplicates could appear in the database.  I would then run a separate duplicate test for each one.  After each run I would remove the flagged items from the main population so I don't end up with duplicates of duplicates.  Once I have extracted all the different types of duplicates I would append them into a new file for testing.
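The workflow above can be sketched in Python; this is only a conceptual illustration of the run-extract-remove-append loop, not anything IDEA does internally, and the function names are made up:

```python
def find_duplicates(rows, key):
    """Return every row whose key value occurs more than once."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    return [r for grp in groups.values() if len(grp) > 1 for r in grp]

def run_tests(rows, tests):
    """Run each duplicate test in turn, removing each round's hits
    from the population so later tests don't re-flag them, and
    pooling all extractions into one result (the 'new file')."""
    population = list(rows)
    extracted = []
    for key in tests:
        hits = find_duplicates(population, key)
        extracted.extend(hits)                          # append this round's duplicates
        population = [r for r in population if r not in hits]
    return extracted
```

Each entry in `tests` is one way duplicates could appear, e.g. "identical row" first, then a looser key such as "same account only".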

Hope this helps.

Brian