Text Field analytics in IDEA

Brian Element

Fri, 09/11/2015 - 07:03


Postings from the India IDEA data Analysis software users group

IDEA has a bevy of features that can be used to glean insights and patterns from text fields.

Some of them are:

(a) Search in IDEA. Search allows the user to find keyword strings (not case sensitive) in a single field or a combination of fields, within the active database or across multiple databases in IDEA. As an example, searching the fields 'Account Description' and 'Narration' in one or more ledgers for keywords such as 'gift', 'donation', 'grant' and 'graft' is an effective anti-bribery and corruption test.

(b) Field Manipulation allows the user to apply @strip through a virtual character field to extract relevant pieces of data from a text field into an independent column for further insight, tracking and monitoring. As an example, extracting the bank wire transfer number from the narration field of a bank account statement can help identify rolling fund transfers between a small group of accounts, which may indicate money laundering.

(c) Field Manipulation also enables the user to apply @soundex through a virtual character field to arrive at a sound code for a name. @soundex gives a representative code (it may not be 100% accurate and is subject to false positives) which can be used to capture duplicate names based on how they sound rather than how they are spelled. IDEA V 9.2 has a custom function titled #FuzzyMatch which now allows the IDEA user to set the precision of the near match on names using the Levenshtein distance string metric.

(d) Direct Extraction with the criterion @isblank() may be applied to text fields to isolate blank entries, such as blank 'Benefactor Names' in a Campaign Grant Ledger.
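
For anyone who wants to see the logic behind (a) to (d) outside IDEA, here is a rough Python sketch. It is not IDEA code and does not claim to reproduce how @soundex, #FuzzyMatch or @isblank() work internally. The sample ledger rows, the field names, the helper function names and the wire reference format ("WT" followed by six or more digits) are all assumptions made up for illustration. The Soundex shown is the common American variant, and the distance function is a plain textbook Levenshtein.

```python
import re

# Hypothetical sample rows; field names follow the examples in the post.
LEDGER = [
    {"Account Description": "Office supplies", "Narration": "Stationery",
     "Benefactor Name": "Robert Shah"},
    {"Account Description": "Sundry expenses", "Narration": "Festival GIFT to vendor",
     "Benefactor Name": ""},
    {"Account Description": "Bank charges", "Narration": "Wire wt00123456 to ACME",
     "Benefactor Name": "   "},
]

KEYWORDS = ("gift", "donation", "grant", "graft")

def keyword_hits(rows, fields, keywords=KEYWORDS):
    """(a) Case-insensitive substring search across one or more text fields."""
    hits = []
    for row in rows:
        text = " ".join(str(row.get(f, "")) for f in fields).lower()
        if any(k in text for k in keywords):
            hits.append(row)
    return hits

# (b) Assumed reference format: "WT" followed by six or more digits.
WIRE_REF = re.compile(r"\bWT\d{6,}\b", re.IGNORECASE)

def wire_reference(narration):
    """Return the first wire reference found in the narration, or ''."""
    match = WIRE_REF.search(narration)
    return match.group(0).upper() if match else ""

# (c) Letter-to-digit table for American Soundex.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit

def soundex(name):
    """Four-character sound code: first letter plus up to three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate letters with the same code
            prev = digit
    return (code + "000")[:4]

def levenshtein(a, b):
    """Minimum number of single-character edits needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def blank_rows(rows, field):
    """(d) Rows where the field is missing, empty or only whitespace."""
    return [r for r in rows if not str(r.get(field) or "").strip()]
```

Calling keyword_hits(LEDGER, ["Account Description", "Narration"]) picks up the row with 'GIFT' in the narration even though the case differs. Note that the keyword search is a simple substring match, so 'grant' would also hit 'granted'; that kind of false positive is the price of a broad first-pass test.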

There is a lot we can do with text-based analytics once we deep-dive into IDEA and integrate it into our daily analytic routine.

Happy Textanalytics.

Kind Regards

Jairam
