Skip to main content

Random Sampling from multiple databases

Hi Brian,
I am new to scripting but hoping you can help assist.  I have 100+ databases each named "Fund XXX".   Each fund contains a column heading "trans" signifying a specific transaction code.  I need to select 4 random samples with the "trans" code "SELL" from each of the funds.  I then need to export these 4 selections to a database saved as "Fund XXX Sell Random Sample".  Nothing needs to be in excel, everything can remain in IDEA. 
I have watched your 10 part beginner's video and looked all over this site for a possible solution, but all seem to have minor hiccups.  I cannot seem to figure out how to open or save multiple databases without going through each script and updating.  I am very new to IDEA so any help is greatly appreciated. 
Thanks,
Jon 

Brian Element Thu, 09/22/2016 - 19:05

Hi Jon and welcome to the site.

Looks like you have an interesting problem but quite doable.  I will see if I can put something together for you to check out and start off with.

What the script will have to do is loop through each file, then extract all the transactions with a trans code of SELL, then select 4 items and finally append them all together.  

Probably the easiest way to do this is to assume that all the files are ones you want to perform the sampling on, so you would move all the files into a separate directory and then select it from the script.

I will see what I can come up with for you.

Brian

jonnyperx Fri, 09/23/2016 - 10:40

Hi Brian, 
That is correct, all the files will be in a separate directory.  As long as all the random samples are pulled from the "Sell" trans code, and there are 4 from each Fund, we should be good to go.
Thanks again for the help

Brian Element Sat, 09/24/2016 - 08:11

Hi Jon,

Here is a demo for you to check out.  It uses loops to read the files in a folder, the folder has to be under the project folder.  It stores the file names in the sFiles() array.  It then performs an extract on each file extracting all the Sell trans codes and stores the new filenames in the sExtractFileNames() array.  It finally performs a random sample on each of the extracts and stores the file names in the sRandomFileNames() array.  The final step is to take each of the random samples and combine them into one file.

Let me know if you have any questions.  I have also attached the file for you.

Brian

'********************************************************************************************************************************************
'* Script:	Multi Random Sample.iss
'* Author:	Brian Element - brian.element@ideascripting.com
'* Date:	Sept 24, 2016
'* Purpose:	A demo script that shows how to read files and perform an extract and random sample and then combine them
'* This script is presented without any warranty or guarantee.  Users are encouraged to validate the effectiveness and reliability on their own.
'********************************************************************************************************************************************

Option Explicit

Dim sFolder As String
Dim sFiles() As String
Dim sExtractFileNames() As String
Dim sRandomFileNames() As String

Sub Main
	Dim i As Integer
	sFolder = getFolder()
	Call getFiles(sFolder)
	Call performExtraction()
	Call performRandomSample()
	call appendRandomSamples
	client.RefreshFileExplorer
	MsgBox "Script Complete"
End Sub

'append the random files together to create one large file
Function appendRandomSamples()
	Dim db As database
	Dim task As task
	Dim i As Integer
	Dim dbName As String
	
	Set db = Client.OpenDatabase(sRandomFileNames(0))
		Set task = db.AppendDatabase
			For i = 1 To UBound(sRandomFileNames)
				task.AddDatabase sRandomFileNames(i)
			Next i
			dbName = client.uniqueFilename("Combined Random Sample")
			task.PerformTask dbName, ""
		Set task = Nothing
	Set db = Nothing
	Client.OpenDatabase (dbName)

End Function

'perform a random sample on each file and select 4 items
Function performRandomSample()
	Dim db As database
	Dim task As task
	Dim i As Integer
	Dim iSeed As Integer
	Dim dbName As String
	
	ReDim sRandomFileNames(ubound(sFiles))
	
	For i = 0 To UBound(sFiles)
		Set db = Client.OpenDatabase(sExtractFileNames(i))
			Set task = db.RandomSample
				task.IncludeAllFields
				dbName = client.UniqueFileName(Left(sFiles(i), Len(sFiles(i)) - 4) & "-Random")
				iSeed = Int((30000 - 10000 + 1) * Rnd + 10000) 'generate a new random seed for each file
				task.PerformTask dbName, "", 4, 1, db.Count, iSeed, False
			Set task = Nothing
			db.close
		Set db = Nothing
		sRandomFileNames(i) = dbName
	Next i
	
End Function

'Extract the Sell transactions for each of the files.
Function performExtraction()
	Dim db As database
	Dim task As task
	Dim i As Integer
	Dim dbName As String
	
	ReDim sExtractFileNames(UBound(sFiles))
	
	For i = 0 To UBound(sFiles)
		Set db = Client.OpenDatabase(sFolder & sFiles(i))
			Set task = db.Extraction
				task.IncludeAllFields
				dbName = client.UniqueFileName(Left(sFiles(i), Len(sFiles(i)) - 4) & "-Extract")
				task.AddExtraction dbName, "", "TRAN_CODE == ""Sell"""
				task.PerformTask 1, db.Count
			Set task = Nothing
			db.close
		Set db = Nothing
		sExtractFileNames(i) = dbName
	Next i

End Function

'obtain the imd files from the selected directory
Function getFiles(sFolder As String)
	Dim objFSO As Object
	Dim objFolder As Object
	Dim objFile As Object
	Dim colFiles As Object
	Dim bFirstTime As Boolean
	
	bFirstTime = True
	ReDim sFiles(0)
	Set objFSO = CreateObject("Scripting.FileSystemObject")
	
	Set objFolder = objFSO.GetFolder(sFolder)
	Set colFiles = objFolder.Files
	
	For Each objFile In colFiles
		If Right(UCase(objFile.Name), 3) = "IMD" Then 'only select IDEA files
			If bFirstTime Then
				bFirstTime = false
				sFiles(0) = objFile.Name
			Else
				ReDim preserve sFiles(UBound(sFiles) + 1)
				sFiles(UBound(sFiles)) = objFile.Name
	    		End If
	    	End If
	Next
	
	Set objFile = Nothing
	Set colFiles = Nothing
	Set objFolder = Nothing
	Set objFso = Nothing

End Function

Function getFolder() 'this one uses the working directory as the highest level directory
	Dim BrowseFolder As String
	Dim oFolder, oFolderItem
	Dim oPath, oShell, strPath
	Dim Current As Variant 'per Windows documentation this is defined as a variant and not a string
	
	
	Set oShell = CreateObject( "Shell.Application" )
	Set oFolder = oShell.Namespace(17) 'the 17 indicates that we are looking at the virtual folder that contains everything on the local computer
	Current = Client.WorkingDirectory()
	Set oFolder = oShell.BrowseForFolder(0, "Please select the folder where the files are located:", 1, Current)
	
	If (Not oFolder is Nothing) Then
		Set oFolderItem = oFolder.Self
		oPath = oFolderItem.Path
		
		If Right(oPath, 1) <> "\" Then
			oPath = oPath & "\"
		End If
	End If
	
	getFolder = oPath
   
End Function

 

jonnyperx Mon, 09/26/2016 - 10:36

Hi Brian,
Thank you so much for all the help thus far. Really appreciate it. 
I have imported all my databases into a single folder as instructed above.  When I run the script, I select the folder with the databases and it gives me an "Error on line 79 - Bad Equation provided" error.  Do you have any tips on how to address such an error?
Thanks,
Jon 

Brian Element Mon, 09/26/2016 - 10:52

In reply to by jonnyperx

Hi Jon,

The script was just an example as I didn't have access to your database.  The problem would be in this line where you are extracting the Sell items, most likely I don't have the correct field name.  So modify this line and it should work.

"TRAN_CODE == ""Sell"""

If you are not sure of the equation perform the extraction on one of the files for all the Sell items and then go look in the history to see the proper syntax for the extraction.

Hope this helps.

Brian

jonnyperx Mon, 09/26/2016 - 11:06

In reply to by Brian Element

Hi Brian,
Thank you for the quick response.  I tweaked the names and it worked beautifully.  You are a king. 
Thanks again for the help.  Really can't thank you enough!
Jon 

The website encountered an unexpected error. Try again later.