Issue with importing date field from text file
Forums
Hi ,
I am trying to import data from a delimted text file which is produced as a result of Sql server bulk copy. After importing, date fields are showing as 'Error'. If I set rawDataType as RawDataType.it_asciiCharacter, then it is working. But data type of the column is shown as Char.
Example date value in the text file: 2/6/2012
A row in the text file will be like this
IN 00059|9082|Purchase ledger|7|2012|7|7/26/2012|7/26/2012|20|tax.pay|1000|0|346.62|C|False
Column delimeter is pipe (|)
Here is the main part of my code. I am creating RDF file using RDFCREATORLib.IdeaRDF after setting rawDataType as RawDataType.it_asciiDate
If (fieldType.Contains("Date")) Then
rawDataType = RawDataType.it_asciiDate
End If
irdf.AppendField(columnName, "", rawType, 0, length, decimals, False, columnName)
And finally calling
irdf.CreateFile(RDFFileName)
ideaClient.ImportDelimFile(FileName, dbName, False, "", RDFFileName, True).
Please see complete method below and kindly provide your input.
Public Function ConvertTextFileToIDEA(ByVal FileName As String, randomnumber As String, ByVal ColumnInfo As Dictionary(Of String, String),
ByVal ColumnDelimeter As String) As String
Dim ideaClient As IdeaClient = New Idea.IdeaClient()
Dim irdf As New RDFCREATORLib.IdeaRDF()
Dim workingDirectory As String
workingDirectory = GetWorkingDirectory()
Dim textdata As String() = System.IO.File.ReadAllLines(FileName)
Dim headers As String() = textdata(0).Split(ColumnDelimeter)
Dim dataTable1 As New DataTable()
For Each header As String In headers
dataTable1.Columns.Add(header, GetType(String), Nothing)
Next
irdf.FileType = RDFCREATORLib.FileType.FT_COMMA_SEPARATED
irdf.FieldSeparator = ColumnDelimeter
irdf.RecordDelimiter = vbCr & vbLf
Dim fieldType As String
Dim columnName As String
For Each column As DataColumn In dataTable1.Columns
columnName = column.ColumnName.ToString().Trim()
Dim rawType As RawDataType = Nothing
Dim length As Integer = 15
Dim decimals As Integer = 2
fieldType = ColumnInfo(columnName)
If (fieldType.Contains("Numeric")) Then
rawType = RawDataType.it_asciiNumeric
ElseIf (fieldType.Contains("Char")) Then
rawType = RawDataType.it_asciiCharacter
ElseIf (fieldType.Contains("Date")) Then
rawType = RawDataType.it_asciiDate
ElseIf (fieldType.Contains("Bool")) Then
rawType = RawDataType.it_idbBool
Else
rawType = RawDataType.it_asciiCharacter
End If
irdf.AppendField(columnName, "", rawType, 0, length, decimals, False, columnName)
Next
Dim RDFFileName = System.IO.Path.Combine(workingDirectory, "TEST" & ".RDF")
irdf.CreateFile(RDFFileName)
Dim dbName As String
dbName = ideaClient.UniqueFileName(randomnumber)
ideaClient.ImportDelimFile(FileName, dbName, False, "", RDFFileName, True)
ideaClient.CloseDatabase(dbName)
File.Delete(RDFFileName)
Return dbName
End Function
Hi Mohamed,
Hi Mohamed,
Thanks for the txt file, that helped out quite a bit. I see you are writing your script in some other app then IDEAScript, just wondering if there is a reason you are doing this versus doing a direct IDEAScript?
My solution (and I copied the code in the next post) is to import the file as a delimited text file using the | as the delimiter. Doing that the file came in no problem but as you said the date fields brought in as a character field which is a common problem because the date format isn't consistent.
What I would do next is create two virtual fields to change the date character field to a true date field. This is the equation I used:
@Ctod(@if(@len(@SimpleSplit( J, "", 1, "/" )) = 1, "0" + @SimpleSplit( J, "", 1, "/" ), @SimpleSplit( J, "", 1, "/" )) + @if(@len(@SimpleSplit( J, "/", 1, "/" )) = 1, "0" + @SimpleSplit( J, "/", 1, "/" ), @SimpleSplit( J, "/", 1, "/" )) + @SimpleSplit( J, "", 1, "/", 1 ), "MMDDYYYY")
I used the simpleSplit function to split up the field between day, month and year and checked to see if the day or month was only one character to add a leading 0 to it. I then changed the revised character field to a date field.
So you can see the code in the next post. Let me know if this is what you are looking for.
Thanks
Brian
Sub Main
Sub Main
Call TextImport() 'C:\Users\belement\Documents\My IDEA Documents\IDEA Projects\William\Source Files.ILB\Import file_0.txt
Call AppendField() 'Import file_0.IMD
Call AppendField1() 'Import file_0.IMD
End Sub
' File - Import Assistant: Delimited Text
Function TextImport
dbName = "Import file_0.IMD"
Client.ImportDelimFile "C:\Users\USer\Documents\My IDEA Documents\IDEA Projects\Project\Source Files.ILB\Import file_0.txt", dbName, FALSE, "", "C:\Users\User\Documents\My IDEA Documents\IDEA Projects\Project\Import Definitions.ILB\Import file_0.RDF", TRUE
Client.OpenDatabase (dbName)
End Function
' Append Field
Function AppendField
Set db = Client.OpenDatabase("Import file_0.IMD")
Set task = db.TableManagement
Set field = db.TableDef.NewField
field.Name = "J_DATE"
field.Description = ""
field.Type = WI_VIRT_DATE
field.Equation = "@Ctod(@if(@len(@SimpleSplit( J, """", 1, ""/"" )) = 1, ""0"" + @SimpleSplit( J, """", 1, ""/"" ), @SimpleSplit( J, """", 1, ""/"" )) + @If(@Len(@SimpleSplit( J, ""/"", 1, ""/"" )) = 1, ""0"" + @SimpleSplit( J, ""/"", 1, ""/"" ), @SimpleSplit( J, ""/"", 1, ""/"" )) + @SimpleSplit( J, """", 1, ""/"", 1 ), ""MMDDYYYY"")"
task.AppendField field
task.PerformTask
Set task = Nothing
Set db = Nothing
Set field = Nothing
End Function
' Append Field
Function AppendField1
Set db = Client.OpenDatabase("Import file_0.IMD")
Set task = db.TableManagement
Set field = db.TableDef.NewField
field.Name = "K_DATE"
field.Description = ""
field.Type = WI_VIRT_DATE
field.Equation = "@Ctod(@if(@len(@SimpleSplit( K, """", 1, ""/"" )) = 1, ""0"" + @SimpleSplit( K, """", 1, ""/"" ), @SimpleSplit( K, """", 1, ""/"" )) + @If(@Len(@SimpleSplit( K, ""/"", 1, ""/"" )) = 1, ""0"" + @SimpleSplit( K, ""/"", 1, ""/"" ), @SimpleSplit( K, ""/"", 1, ""/"" )) + @SimpleSplit( K, """", 1, ""/"", 1 ), ""MMDDYYYY"")"
task.AppendField field
task.PerformTask
Set task = Nothing
Set db = Nothing
Set field = Nothing
End Function
Thank you so much Brian.
Thank you so much Brian.
It is working.
1. Hope you have used date format as "MMDDYYYY" in the formula because the given data is in that format. Is there anyway to provide a generic format in the formula so that it can work independent of the date format based on different locales?
2. My input file can have numeric columns. Some of the numeric columns will have one or more decimals and others will not have decimal places. And I can't understand if a particular column will have decimal point or not. In this case, if I write code to define RDF file for numeric columns, I am forced to add number of decimal places. Is there any numeric data type available so that decimal places can be added depending on the input file? For example, there will be some ID columns which are numeric, but does not have decimal places. When I import the file, the ID column should not have decimal places.
Thanks in advance,
Shafeer
Hi Mohamed,
Hi Mohamed,
I am glad it is working for you.
For item 1 it is possible, you will have to obtain the regional settings for the person's computer and use that as their date format (I did a blog on internationalization which this would be the next step: http://ideascripting.com/blog/2015/08/03/how-internationalize-and-localize-your-scripts). I will see if I can come up with something for you.
For item 2 what is your source? It sounds like you might have to go to your source first to see the number of decimals there are, for excel that should be fairly easy even though IDEA should bring it in with the correct number. Can you give me an example of where you are having problems.
Brian
Hi Brian,
Hi Brian,
Thanks for your suggestions.
My second issue is with importing delimited text file. Please find the attached sample source file. There are 3 columns.
SL_NO should not have any decimal points
ACTUAL_VALUE will have maximum 4 decimal points
DIFFERENCE column will have 2 decimal points.
While defining RDF structure, I will not know the number of decimal points in advance. I can know that it is numeric field.
In my generic code, if I add decimal points, it will be applied to all columns which is not desirable.
So I am looking for an efficient way to define RDF file in such a way that decimal points are shown correctly in the imported file.
Thanks,
Shafeer
Unfortunately once you create
Unfortunately once you create an RDF you can't make any changes to it so there is no way to create one on the fly based on the contents of the file or make any modifications to it, it needs to be defined ahead of time. One possible solution is to bring in all the numeric fields with the maximum number of decimals and then to make changes to the field once it is in IDEA. How were you planning on knowing the number of decimals? Will it be some type of user intervention or will you have the script read through the file looking for the maximum number of decimals based on the source information?
Attached a sample txt file
Attached a sample txt file used for importing