SSIS reading LF as row terminator when it's set as CRLF

Using SSIS 2012. In my flat file connection manager I have a delimited file whose row delimiter is set to CRLF, but one text column contains an LF. When the package processes the file, it reads that LF as a row terminator, which causes it to fail. Any ideas?

Gentilis asked 24/5, 2017 at 16:58 Comment(1)
I was misunderstanding your question. I thought your flat file contained multiple row delimiters. I edited my answer; take a look. - Allseed

Thank you for all the suggestions. It turned out that the vendor had changed the encoding of the file from ASCII to Unicode. Changing the package to read the correct encoding did the trick.
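
For anyone who wants to check a file's encoding before touching the package, here is a minimal sketch of a byte-order-mark check, written as a standalone console program rather than SSIS code. The class and method names (`EncodingSniffer`, `GuessEncoding`) are made up for illustration, and the BOM-less UTF-16 check is only a heuristic. It is relevant because in UTF-16 LE a CRLF is stored as the bytes 0D 00 0A 00, so a reader that treats the file as single-byte text never sees 0D immediately followed by 0A.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingSniffer
{
    // Hypothetical helper: guesses the encoding from the first bytes of a file.
    public static string GuessEncoding(byte[] head)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return "UTF-8 (BOM)";
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return "UTF-16 LE";
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return "UTF-16 BE";
        // Heuristic only: a NUL in every other byte suggests BOM-less UTF-16 LE.
        if (head.Length >= 4 && head[1] == 0 && head[3] == 0)
            return "UTF-16 LE (no BOM, guessed)";
        return "ANSI/ASCII or BOM-less UTF-8";
    }

    public static void Main(string[] args)
    {
        byte[] head;
        if (args.Length > 0)
        {
            // Read at most the first 4 bytes of the file given on the command line.
            using (FileStream fs = File.OpenRead(args[0]))
            {
                head = new byte[4];
                int n = fs.Read(head, 0, head.Length);
                Array.Resize(ref head, n);
            }
        }
        else
        {
            // Demo input: "1;A" encoded as UTF-16 LE, prefixed with its BOM.
            byte[] bom = Encoding.Unicode.GetPreamble();
            byte[] body = Encoding.Unicode.GetBytes("1;A");
            head = new byte[bom.Length + body.Length];
            bom.CopyTo(head, 0);
            body.CopyTo(head, bom.Length);
        }
        Console.WriteLine(GuessEncoding(head));
    }
}
```

If the check reports UTF-16, the flat file connection manager needs to be set to read the file as Unicode instead of with a code page.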

Gentilis answered 5/6, 2017 at 22:29 Comment(1)
Just accept this answer even if it is yours, so the question is marked as answered. Also, the answers provided are helpful and these people spent their time trying to solve your issue, so it would be nice to upvote their answers. - Nullifidian

Before answering: I don't think the column contains only an LF, because if the row delimiter is CRLF, a lone LF will not be treated as a delimiter. So it is probably a CRLF, but I will give a solution that covers both cases (CRLF or LF).

Solution

You can fix this situation with the following steps:

  1. First, in the Flat File connection manager, add only one column (of type DT_STR and length 4000), so that each row is read as a single column.
  2. In the Data Flow Task, add a Script Component that fixes the file structure and splits each row into columns.

Simple Test

I will consider a flat file with the following content:

ID;name;DOB;Notes;ClassID{CRLF}
1;John;2001-01-01;;1{CRLF}
2;Moh;2002-01-01;Very cool{LF}
Genius;2{CRLF}
3;Ali;2000-01-01;Calm;2{CRLF}
  1. First, I add a flat file connection manager with the following options:
    • Row Delimiter = {CRLF}
    • Header Row Delimiter = {CRLF}


  2. In the Data Flow Task, I add a Flat File Source, two Script Components, and an OLE DB Destination.

  3. In the first Script Component, I mark Column0 as input, add an output column named NewRow (the column the script writes to), and set the output's SynchronousInputID to None.


  4. In the first Script Component, I write a script that stores each line in a member variable and assigns it to an output row once the row is complete and another row is present.

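    Dim strLine As String = String.Empty
    Dim strDelimiter As String = ";"

    ' Clear the buffered line.
    Public Sub EmptyMemoryVariables()
        strLine = String.Empty
    End Sub

    ' Emit the buffered line as one output row.
    Public Sub AssignMemoryVariablesToOutput()
        With Output0Buffer
            .AddRow()
            .NewRow = strLine
        End With
    End Sub

    Public Function AreVariablesEmpty() As Boolean
        Return String.IsNullOrEmpty(strLine)
    End Function

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

        Dim strColumns As String() = Row.Column0.Split(CChar(strDelimiter))

        If strColumns.Length = 5 Then
            ' Complete row: flush any buffered row first, then emit this one.
            If Not AreVariablesEmpty() Then
                AssignMemoryVariablesToOutput()
                EmptyMemoryVariables()
            End If

            strLine = Row.Column0
            AssignMemoryVariablesToOutput()
            EmptyMemoryVariables()
        Else
            ' Fragment: if the buffer already holds a complete row, emit it,
            ' then append this fragment to the buffer.
            If strLine.Split(CChar(strDelimiter)).Length = 5 Then
                AssignMemoryVariablesToOutput()
                EmptyMemoryVariables()
            End If

            strLine &= Row.Column0
        End If

    End Sub

    ' Without this, a row that is still buffered when the input ends is never emitted.
    Public Overrides Sub Input0_ProcessInput(ByVal Buffer As Input0Buffer)
        While Buffer.NextRow()
            Input0_ProcessInputRow(Buffer)
        End While

        If Buffer.EndOfRowset() Then
            If Not AreVariablesEmpty() Then
                AssignMemoryVariablesToOutput()
                EmptyMemoryVariables()
            End If
            Output0Buffer.SetEndOfRowset()
        End If
    End Sub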
    
  5. In the second Script Component, I split each row into the five output columns (ID, Name, DOB, Notes, ClassID).


    Dim strDelimiter As String = ";"

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

        ' Split the repaired row and map each field to its output column.
        Dim strColumns As String() = Row.NewRow.Split(CChar(strDelimiter))

        Row.ID = strColumns(0)
        Row.NAME = strColumns(1)
        Row.DOB = strColumns(2)
        Row.NOTES = strColumns(3)
        Row.CLASSID = strColumns(4)

    End Sub

Important note: the provided code is not optimal. It may need more validation, or it could be made simpler and better, but I am trying to show how you can approach this issue.
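
The buffering idea can also be tried outside SSIS. The following is a rough standalone sketch, not the script component code itself: `RowStitcher` and `Stitch` are hypothetical names, and it assumes the stray line break never falls inside the last field (otherwise a fragment already has the expected field count and is emitted too early). Unlike the script above, it puts a space where the line break was.

```csharp
using System;
using System.Collections.Generic;

public static class RowStitcher
{
    // Joins physical lines until the buffered text has the expected number of fields.
    public static List<string> Stitch(IEnumerable<string> physicalLines, char delimiter, int expectedFields)
    {
        var rows = new List<string>();
        string pending = "";
        foreach (string line in physicalLines)
        {
            // Re-insert a space where the stray line break was.
            pending = pending.Length == 0 ? line : pending + " " + line;
            if (pending.Split(delimiter).Length >= expectedFields)
            {
                rows.Add(pending);
                pending = "";
            }
        }
        // Keep an incomplete tail instead of silently dropping it.
        if (pending.Length > 0)
        {
            rows.Add(pending);
        }
        return rows;
    }

    public static void Main()
    {
        // The sample file from this answer, already split on CRLF and on the stray LF.
        string[] lines =
        {
            "ID;name;DOB;Notes;ClassID",
            "1;John;2001-01-01;;1",
            "2;Moh;2002-01-01;Very cool",
            "Genius;2",
            "3;Ali;2000-01-01;Calm;2"
        };
        foreach (string row in Stitch(lines, ';', 5))
        {
            Console.WriteLine(row);
        }
    }
}
```

With the sample data, the two fragments of row 2 come back as the single row `2;Moh;2002-01-01;Very cool Genius;2`.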

Allseed answered 27/5, 2017 at 14:47 Comment(2)
This won't work because the OP stated that the extra LF is in a column, which means to me that the reader stops reading the row at that point. Removing the CRs from the ends would make no change at all, since it would still read extra false lines for those lines that contain LF characters in the columns. - Musk
Thanks for the remark. I misunderstood the question. I will change my answer. Thanks a lot. - Allseed

I have no SSIS experience, but as an ETL developer I have faced this many times. So my suggestions might not solve the problem, but they will hopefully point you in the right direction.

  • If the problem field has a text qualifier (usually a single or double quote) and SSIS supports text qualifiers, use it.
  • If there is an option to force SSIS to use an end-of-record delimiter other than LF (CRLF in this case), I'd use it (hopefully there is no CRLF in the problem field's text).
  • If the problem field is not the last field, you can read each record as a single LF-delimited field and count the delimiters to identify and filter out the problem records (if there are only a few), then try to stitch them back together.
  • If possible, read the file as a single record (if SSIS has such an option) and replace every LF, provided CR is consistently part of the end-of-record delimiter in the source.
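
The last suggestion can be sketched as a pre-processing step in plain C#, outside SSIS. This is an illustration under the assumption that every real record ends in CRLF and any LF not preceded by CR belongs to a field value; `LoneLfCleaner` and `ReplaceLoneLf` are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LoneLfCleaner
{
    // Matches an LF that is NOT preceded by a CR (negative lookbehind).
    private static readonly Regex LoneLf = new Regex(@"(?<!\r)\n");

    // Replaces stray LFs and leaves every CRLF record terminator untouched.
    public static string ReplaceLoneLf(string text, string replacement)
    {
        return LoneLf.Replace(text, replacement);
    }

    public static void Main()
    {
        string raw = "1;John;;1\r\n2;Moh;Very cool\nGenius;2\r\n";
        string cleaned = ReplaceLoneLf(raw, " ");
        // Make the remaining terminators visible when printing.
        Console.Write(cleaned.Replace("\r\n", "{CRLF}\n"));
    }
}
```

The cleaned text could then be written to a new file and loaded with an ordinary CRLF row delimiter.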
Frieder answered 1/6, 2017 at 16:44 Comment(0)

In your Flat File Connection Manager there is a property, whose name I have forgotten, where you can set the row delimiter ({CR}{LF}, {LF}, {CR}, etc.).

Please try adjusting this property; I think it will work.

Tomas answered 31/5, 2017 at 15:36 Comment(1)
It's not true! I used SSIS a lot in my last job, but I don't have it in my current job, so I couldn't check the property and answered from memory. I don't think it's a good idea to downvote when someone is trying to help and hasn't posted an incorrect answer, but that's just my opinion. I really don't know what that other answer is. Can you tell me, please? I think it would be useful to have that other answer. - Tomas

I had a similar issue to this. I had a CSV file with LF as the terminator. However, the client also had CRLF in two of the columns and this was causing the "delimiter for column is not found" error.

It took me a few days of googling solutions and trial and error, but I got it working.

In the end, I needed two script components.

In the first Script Component, I had a string output column named Output0 with a length of 4000. In the script (see below) I used ReadToEnd to load the data, replaced each CRLF with a space, and then split the text into rows with LF as the terminator.

using System;
using System.IO;
using Microsoft.SqlServer.Dts.Runtime.Wrapper; // IDTSConnectionManager100

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    // Path of the flat file, resolved from the "Collateral" connection manager.
    private string collateralFile;

    public override void AcquireConnections(object Transaction)
    {
        IDTSConnectionManager100 connMgr = this.Connections.Collateral;
        collateralFile = (string)connMgr.AcquireConnection(null);
    }

    public override void CreateNewOutputRows()
    {
        string collatFile;

        // Read the whole file into memory; the using block closes the reader.
        using (StreamReader textReader = new StreamReader(collateralFile))
        {
            collatFile = textReader.ReadToEnd();
        }

        // CRLF only occurs inside column values here, so replace it with a space...
        collatFile = collatFile.Replace("\r\n", " ");

        // ...then split on LF, the real row terminator.
        string[] lines = collatFile.Split(new char[] { '\n' });

        foreach (string nextLine in lines)
        {
            // Skip empty entries, such as the one after the final LF.
            if (!String.IsNullOrEmpty(nextLine))
            {
                Output0Buffer.AddRow();
                Output0Buffer.Output0 = nextLine;
            }
        }
    }
}

I tried splitting it again into columns, but it returned null values, so in the second script component I created my columns and loaded the data into them in the script.

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Split the repaired row on commas and map the fields to output columns.
    String[] columns = Row.Output0.Split(',');

    Row.Description = columns[0];
    Row.LegalDescription = columns[1];
    Row.Address1ParsedLine1 = columns[2];
    // columns[3] is not mapped to an output column here.
    Row.Address1ParsedLine2 = columns[4];
    Row.Address1ParsedCityname = columns[5];
    Row.Address1ParsedStatecode = columns[6];
    Row.Address1ParsedPostalcode = columns[7];
}
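
The row-repair step from the first script can be tried on its own, without SSIS. The sketch below is a simplified standalone version under the same assumption as this answer (LF is the real row terminator and CRLF only appears inside values); `CrlfInsideLfRows` and `SplitRows` are names invented for the example.

```csharp
using System;

public static class CrlfInsideLfRows
{
    // Flattens embedded CRLFs to spaces, then splits the text into rows on LF.
    public static string[] SplitRows(string fileText)
    {
        string flattened = fileText.Replace("\r\n", " ");
        return flattened.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Two rows terminated by LF; the first has a CRLF inside its second field.
        string sample = "a,b one\r\ntwo,c\nd,e,f\n";
        foreach (string row in SplitRows(sample))
        {
            Console.WriteLine(row);
        }
    }
}
```

Note that a plain `Split(',')` on each row will still break on commas inside quoted values, so the column split may need more care if the file uses text qualifiers.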
Bushwhacker answered 30/7, 2020 at 17:56 Comment(0)
