Skip to main content

Issue with importing date field from text file

Hi ,
I am trying to import data from a delimted text file which is produced as a result of Sql server bulk copy. After importing, date fields are showing as 'Error'. If I set rawDataType as RawDataType.it_asciiCharacter, then it is working. But data type of the column is shown as Char.
Example date value in the text file: 2/6/2012
A row in the text file will be like this
IN 00059|9082|Purchase ledger|7|2012|7|7/26/2012|7/26/2012|20|tax.pay|1000|0|346.62|C|False
Column delimeter is pipe (|)
Here is the main part of my code. I am creating RDF file using RDFCREATORLib.IdeaRDF after setting rawDataType as RawDataType.it_asciiDate
        If (fieldType.Contains("Date")) Then
                rawDataType = RawDataType.it_asciiDate
        End If 
irdf.AppendField(columnName, "", rawType, 0, length, decimals, False, columnName) 
And finally calling
irdf.CreateFile(RDFFileName)
ideaClient.ImportDelimFile(FileName, dbName, False, "", RDFFileName, True).
 
Please see complete method below and kindly provide your input.
Public Function ConvertTextFileToIDEA(ByVal FileName As String, randomnumber As String, ByVal ColumnInfo As Dictionary(Of String, String),
                                                    ByVal ColumnDelimeter As String) As String
        Dim ideaClient As IdeaClient = New Idea.IdeaClient()
        Dim irdf As New RDFCREATORLib.IdeaRDF()
        Dim workingDirectory As String
         workingDirectory = GetWorkingDirectory()
        Dim textdata As String() = System.IO.File.ReadAllLines(FileName)
        Dim headers As String() = textdata(0).Split(ColumnDelimeter)
        Dim dataTable1 As New DataTable()
        For Each header As String In headers
            dataTable1.Columns.Add(header, GetType(String), Nothing)
        Next
        irdf.FileType = RDFCREATORLib.FileType.FT_COMMA_SEPARATED
        irdf.FieldSeparator = ColumnDelimeter
        irdf.RecordDelimiter = vbCr & vbLf
        Dim fieldType As String
        Dim columnName As String
        For Each column As DataColumn In dataTable1.Columns
            columnName = column.ColumnName.ToString().Trim()
            Dim rawType As RawDataType = Nothing
            Dim length As Integer = 15
            Dim decimals As Integer = 2
            fieldType = ColumnInfo(columnName)
            If (fieldType.Contains("Numeric")) Then
                rawType = RawDataType.it_asciiNumeric
            ElseIf (fieldType.Contains("Char")) Then
                rawType = RawDataType.it_asciiCharacter
            ElseIf (fieldType.Contains("Date")) Then
                rawType = RawDataType.it_asciiDate
            ElseIf (fieldType.Contains("Bool")) Then
                rawType = RawDataType.it_idbBool
            Else
                rawType = RawDataType.it_asciiCharacter
            End If
            irdf.AppendField(columnName, "", rawType, 0, length, decimals, False, columnName)
        Next
        Dim RDFFileName = System.IO.Path.Combine(workingDirectory, "TEST" & ".RDF")
        irdf.CreateFile(RDFFileName)
        Dim dbName As String
        dbName = ideaClient.UniqueFileName(randomnumber)
        ideaClient.ImportDelimFile(FileName, dbName, False, "", RDFFileName, True)
        ideaClient.CloseDatabase(dbName)
        File.Delete(RDFFileName)
        Return dbName
    End Function
 

Brian Element Thu, 08/20/2015 - 07:46

In reply to by mohamed

Hi Mohamed,

Thanks for the txt file, that helped out quite a bit.  I see you are writing your script in some other app then IDEAScript, just wondering if there is a reason you are doing this versus doing a direct IDEAScript? 

My solution (and I copied the code in the next post) is to import the file as a delimited text file using the | as the delimiter.  Doing that the file came in no problem but as you said the date fields brought in as a character field which is a common problem because the date format isn't consistent.

What I would do next is create two virtual fields to change the date character field to a true date field.  This is the equation I used:

@Ctod(@if(@len(@SimpleSplit( J, "", 1, "/" )) = 1, "0" + @SimpleSplit( J, "", 1, "/" ), @SimpleSplit( J, "", 1, "/" )) + @if(@len(@SimpleSplit( J, "/", 1, "/" )) = 1, "0" + @SimpleSplit( J, "/", 1, "/" ), @SimpleSplit( J, "/", 1, "/" )) + @SimpleSplit( J, "", 1, "/", 1 ), "MMDDYYYY")

I used the simpleSplit function to split up the field between day, month and year and checked to see if the day or month was only one character to add a leading 0 to it.  I then changed the revised character field to a date field.

So you can see the code in the next post.  Let me know if this is what you are looking for.

Thanks

Brian

 

Brian Element Thu, 08/20/2015 - 07:47

In reply to by mohamed

Sub Main
	Call TextImport()	'C:\Users\belement\Documents\My IDEA Documents\IDEA Projects\William\Source Files.ILB\Import file_0.txt
	Call AppendField()	'Import file_0.IMD
	Call AppendField1()	'Import file_0.IMD
End Sub


' File - Import Assistant: Delimited Text
Function TextImport
	dbName = "Import file_0.IMD"
	Client.ImportDelimFile "C:\Users\USer\Documents\My IDEA Documents\IDEA Projects\Project\Source Files.ILB\Import file_0.txt", dbName, FALSE, "", "C:\Users\User\Documents\My IDEA Documents\IDEA Projects\Project\Import Definitions.ILB\Import file_0.RDF", TRUE
	Client.OpenDatabase (dbName)
End Function

' Append Field
Function AppendField
	Set db = Client.OpenDatabase("Import file_0.IMD")
	Set task = db.TableManagement
	Set field = db.TableDef.NewField
	field.Name = "J_DATE"
	field.Description = ""
	field.Type = WI_VIRT_DATE
	field.Equation = "@Ctod(@if(@len(@SimpleSplit( J, """", 1, ""/"" )) = 1, ""0"" + @SimpleSplit( J, """", 1, ""/"" ), @SimpleSplit( J, """", 1, ""/"" )) + @If(@Len(@SimpleSplit( J, ""/"", 1, ""/"" )) = 1, ""0"" + @SimpleSplit( J, ""/"", 1, ""/"" ), @SimpleSplit( J, ""/"", 1, ""/"" ))  + @SimpleSplit( J, """", 1, ""/"", 1 ), ""MMDDYYYY"")"
	task.AppendField field
	task.PerformTask
	Set task = Nothing
	Set db = Nothing
	Set field = Nothing
End Function

' Append Field
Function AppendField1
	Set db = Client.OpenDatabase("Import file_0.IMD")
	Set task = db.TableManagement
	Set field = db.TableDef.NewField
	field.Name = "K_DATE"
	field.Description = ""
	field.Type = WI_VIRT_DATE
	field.Equation = "@Ctod(@if(@len(@SimpleSplit( K, """", 1, ""/"" )) = 1, ""0"" + @SimpleSplit( K, """", 1, ""/"" ), @SimpleSplit( K, """", 1, ""/"" )) + @If(@Len(@SimpleSplit( K, ""/"", 1, ""/"" )) = 1, ""0"" + @SimpleSplit( K, ""/"", 1, ""/"" ), @SimpleSplit( K, ""/"", 1, ""/"" ))  + @SimpleSplit( K, """", 1, ""/"", 1 ), ""MMDDYYYY"")"
	task.AppendField field
	task.PerformTask
	Set task = Nothing
	Set db = Nothing
	Set field = Nothing
End Function

 

mohamed Thu, 08/20/2015 - 10:52

Thank you so much Brian.
It is working.
1. Hope you have used date format as "MMDDYYYY" in the formula because  the given data is in that format. Is there anyway to provide a generic format in the formula so that it can work independent of the date format based on different locales?
2. My input file can have numeric columns. Some of the numeric columns will have one or more decimals and others will not have decimal places. And I can't understand if a particular column will have decimal point or not. In this case, if I write code to define RDF file for numeric columns, I am forced to add number of decimal places. Is there any numeric data type available so that decimal places can be added depending on the input file? For example, there will be some ID columns which are numeric, but does not have decimal places. When I import the file, the ID column should not have decimal places.
 
Thanks in advance,
Shafeer

Brian Element Thu, 08/20/2015 - 11:10

In reply to by mohamed

Hi Mohamed,

I am glad it is working for you.

For item 1 it is possible, you will have to obtain the regional settings for the person's computer and use that as their date format (I did a blog on internationalization which this would be the next step: http://ideascripting.com/blog/2015/08/03/how-internationalize-and-localize-your-scripts).  I will see if I can come up with something for you.

For item 2 what is your source?  It sounds like you might have to go to your source first to see the number of decimals there are, for excel that should be fairly easy even though IDEA should bring it in with the correct number.  Can you give me an example of where you are having problems.

Brian

mohamed Fri, 08/21/2015 - 07:52

Hi Brian,
Thanks for your suggestions.
My second issue is with importing delimited text file. Please find the attached sample source file. There are 3 columns.
SL_NO should not have any decimal points
ACTUAL_VALUE will have maximum 4 decimal points
DIFFERENCE column will have 2 decimal points.
While defining RDF structure, I will not know the number of decimal points in advance. I can know that it is numeric field. 
In my generic code, if I add decimal points, it will be applied to all columns which is not desirable.
So I am looking for an efficient way to define RDF file in such a way that decimal points are shown correctly in the imported file.
Thanks,
Shafeer
 

Brian Element Fri, 08/21/2015 - 16:17

In reply to by mohamed

Unfortunately once you create an RDF you can't make any changes to it so there is no way to create one on the fly based on the contents of the file or make any modifications to it, it needs to be defined ahead of time.  One possible solution is to bring in all the numeric fields with the maximum number of decimals and then to make changes to the field once it is in IDEA.  How were you planning on knowing the number of decimals?  Will it be some type of user intervention or will you have the script read through the file looking for the maximum number of decimals based on the source information?