Sometimes we receive large text files that need to be modified before they can be brought into IDEA. One example I had was an SAP file in which a user had pressed the Enter key in one of the free-form fields, so the SAP text file contained an extra carriage return. That created an additional line and threw off every record after that point. The file was quite large, more than 20 million lines, so I developed this script to break the file down into smaller chunks that I could then load into Notepad++ and edit manually, in this case to remove the extra carriage return.
The script first displays a message stating what it will do.
You will then be asked to select the text file.
The final step is to select the number of rows each new file will contain. This depends on the number of fields in each row: the larger the number of fields, the smaller the number of rows you should select. You should also check for any limitations in your text editor; I believe that Notepad++ has a file size limit of 500 MB.
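If you are unsure what row count to pick, one rough way to estimate it is to sample the start of the file, work out the average line length in bytes, and divide your editor's size limit by that average. The Python sketch below illustrates the idea only; it is not part of the IDEA script, and the function name estimate_rows_per_file and its parameters are made up for this example. It assumes the first lines are reasonably representative of the rest of the file.

```python
from itertools import islice


def estimate_rows_per_file(path, max_bytes, sample_lines=10_000):
    """Estimate how many rows fit in a file of at most max_bytes.

    Hypothetical helper: reads up to sample_lines lines from the start
    of the file and assumes they are typical of the whole file.
    """
    # Binary mode so that line lengths are counted in bytes on disk.
    with open(path, "rb") as src:
        sample = list(islice(src, sample_lines))
    if not sample:
        raise ValueError("file is empty")
    average = sum(len(line) for line in sample) / len(sample)
    # Always return at least one row per file.
    return max(1, int(max_bytes // average))
```

With a limit of 500 MB you would pass something like 500 * 1024 * 1024 as max_bytes, and it is probably wise to pick a row count somewhat below the estimate to leave a margin.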
The script will then create as many new files as are needed to hold all of the rows. The naming convention is to use the original name and add -split1 for the first file, -split2 for the second, and so on. You would then make any changes you need on the individual files and import them separately. You can then append them to recreate your original file.
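For readers who want to see the splitting logic outside IDEA, here is a minimal Python sketch of the same idea. It is not the IDEA script itself. The function name split_text_file is hypothetical, and placing the -splitN suffix before the file extension is my assumption about where the suffix goes. The file is read in binary mode so that line endings and encoding are copied through unchanged.

```python
from pathlib import Path


def split_text_file(source, rows_per_file):
    """Split source into files of at most rows_per_file lines each.

    Hypothetical helper: output files are written next to the source as
    <name>-split1<ext>, <name>-split2<ext>, and so on (suffix placement
    is an assumption). Returns the list of created paths.
    """
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    source = Path(source)
    created = []
    out = None
    try:
        with source.open("rb") as src:
            # Iterating a binary file yields one line at a time, so the
            # whole file never has to fit in memory.
            for index, line in enumerate(src):
                if index % rows_per_file == 0:
                    # Start a new chunk file every rows_per_file lines.
                    if out is not None:
                        out.close()
                    part = index // rows_per_file + 1
                    target = source.with_name(
                        f"{source.stem}-split{part}{source.suffix}"
                    )
                    out = target.open("wb")
                    created.append(target)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return created
```

Calling split_text_file on a five-line file with rows_per_file set to 2 produces three files, the last holding the single leftover line. Because the bytes are copied unchanged, concatenating the pieces in order gives back the original file.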