Skip to main content

Number Identification/Extraction In Text String

Hi Brian! 
I hope all is well with you!
I was wondering if you could help me.....
I have a transactions file that includes a notation/notes field (formatted as character). This field is essentially used for notation by our employees relating a transaction that has taken place. In some cases, notes may contain a manually entered ID which identifies the employee who completed the transaction. Unfortunately, these particular transactions do not capture the employee who created them by way of system automation. As for what I am looking to do, in short, I would like to be able to tag or link the transacting employee to the transactions that have an ID listed within the notes field. Due to these files containing upwards of 100k records, it would be quite a pain to create some sort of work-around. 
Below I have listed an example of what you might see in the contents of one of these transactions for that notation field (none of this information is confidential or PII. I masked the employee ID with 0’s & 1’s):
29XXXXX0 UPD 2184262-qual date updted-jmUpdated in Prod Con Template 101010  ACRUAL DT UPD2193636
The content we would like to capture or extract is what I have listed in bold (101010). This is the employee ID which is always 6 numbers in length. I have tried a combination of functions to @split and @justnumbers, etc. There are too many variables however as the notes don't follow any specific notation guidlines or standards.
Any insight you can provide would be greatly appreciated! Thanks in advance. 
 

scotchy33 Fri, 02/02/2018 - 11:57

In reply to by Steven Luciani

In the original field there were more than one occurence of six digits so you could try an if statement to catch the remainder where spaces don't exist:
 
@strip(@if(@regexpr(notes, "\s[0-9][0-9][0-9][0-9][0-9][0-9]\s")="", @regexpr(notes, "[0-9][0-9][0-9][0-9][0-9][0-9]"),@regexpr(notes, "\s[0-9][0-9][0-9][0-9][0-9][0-9]\s")))
 
Or you  could use caret or dollar sign

  1. If the caret (^) is at the beginning of the (sub)expression, then the matched string must be at the beginning of the string being searched.
  2. If the dollar sign ($) is at the end of the (sub)expression, then the matched string must be at the end of the string being searched.