Number Identification/Extraction In Text String

12 posts / 0 new
Last post
zswanson1
Offline
Joined: 07/12/2016 - 11:38
Number Identification/Extraction In Text String

Hi Brian! 
I hope all is well with you!
I was wondering if you could help me.....
I have a transactions file that includes a notation/notes field (formatted as character). This field is essentially used for notation by our employees relating a transaction that has taken place. In some cases, notes may contain a manually entered ID which identifies the employee who completed the transaction. Unfortunately, these particular transactions do not capture the employee who created them by way of system automation. As for what I am looking to do, in short, I would like to be able to tag or link the transacting employee to the transactions that have an ID listed within the notes field. Due to these files containing upwards of 100k records, it would be quite a pain to create some sort of work-around. 
Below I have listed an example of what you might see in the contents of one of these transactions for that notation field (none of this information is confidential or PII. I masked the employee ID with 0’s & 1’s):
29XXXXX0 UPD 2184262-qual date updted-jmUpdated in Prod Con Template 101010  ACRUAL DT UPD2193636
The content we would like to capture or extract is what I have listed in bold (101010). This is the employee ID which is always 6 numbers in length. I have tried a combination of functions to @split and @justnumbers, etc. There are too many variables however as the notes don't follow any specific notation guidlines or standards.
Any insight you can provide would be greatly appreciated! Thanks in advance. 
 

Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

Hi Zach,

I don't think there is a way to do it in the equation editor as what you want is fairly complex.  I put together a custom function for you that should work.  The custome function except the parameters of the field name and the number of numeric characters, in your case that would be 6.  It checks to see if it is only 6 numeric characters so if there is a number before or after the 6 characters it will be ignored.  It takes the first instance of finding the 6 characters.

Try it out and let me know how it works for you.

Thanks

Brian

 

 

zswanson1
Offline
Joined: 07/12/2016 - 11:38

Thanks for the quick reply Brian! I went ahead and tried to apply that custom function and came across a compile/syntax error when trying to save or use it. It looks like the error points to the following lines (whenever < or > is referenced):
Line 16 <Function Body>            If iLen &lt;= p2 Then
Line 29 <Function Body>            If i &lt;&gt; 1 Then         
 
Line 38 <Function Body>            If i + p2 - 1 &lt;&gt; iLen Then
Thoughts? 

Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

Hi Zack,

Did you do a right click and select save as?  If you just open it and copy it it will probably give the error as it changes the < and > to &lt; and &gt;.  If you still have a problem let me know and I will zip it up for you and that should fix the problem.

Brian

zswanson1
Offline
Joined: 07/12/2016 - 11:38

That would be great if its no trouble for you. For some reason I am not given the option to save as. I can only open it. 

Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

Here you go, let me know if you have any problems with it.

Brian

zswanson1
Offline
Joined: 07/12/2016 - 11:38

Thanks Brian! The fuction works perfectly! Truly appriciate the help!

scotchy33
Offline
Joined: 09/05/2012 - 15:51

Regular Expression would work too.
 
@strip(@regexpr(notes, "\s[0-9][0-9][0-9][0-9][0-9][0-9]\s"))

Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

Thanks Scott, I always forget the power of the regexpr as I don't use it enough.

zswanson1
Offline
Joined: 07/12/2016 - 11:38

Thanks Scott, this seemed to work for the majority of situations but missed a few for some reason. I will have to look at it a bit closer to figure out why when I have the opportunity. Thus far it looks like those that have the 6 digit number cited at the end of the string.  I am positive I can utilize this going forward however. Thanks for your input Sir!

Steven Luciani
Offline
Joined: 07/31/2012 - 13:20

Hello,

Love to see @regexpr being used. It's one of my top three favourite functions. Scott's equation probably missed records where the field was fomatted as template101010 acrual or template 101010acrual. The \s is telling IDEA to look for six numbers in a row with a space before and after the six numbers. Assuming your notes field does not contain 6 numbers in a row that does not represent what you are looking for, you can most likely drop the \s.

@strip(@regexpr(notes, "[0-9][0-9][0-9][0-9][0-9][0-9]"))

Pages