CSV generation with infix delimiters

Mar 22, 2013 at 12:21 PM
Hi,

I'm trying to generate a CSV with infix delimiters (as depicted here), such as the following:
Kent,25,M<LF>
Belinda,26,F<EOF>
I've searched the docs but was unable to find a property that controls this behavior, even though the CHM for the CSV parser describes it as Rule RL (The last record in the CSV data may or may not end with a line break (as per rule RS).).

Does the CSV writer support this behavior? If so, how?

Thank you very much for your work,
Angelo
Coordinator
Mar 22, 2013 at 1:04 PM
Edited Mar 22, 2013 at 1:05 PM
Hi Angelo,

Infix is how KBCsv separates values, so it's not clear to me what the issue is. For example:
using (var sw = new StringWriter())
using (var w = new CsvWriter(sw))
{
    w.WriteRecord("Foo", "Bar");
    w.WriteRecord("Biz", "Baz");

    Console.WriteLine(sw.ToString());
}
Output:
Foo,Bar
Biz,Baz
Can you describe your issue in more detail? All I can think of is that maybe you've got an empty/null value at the end of your values that is resulting in an extra separator at the end of each line?

Thanks,
Kent
Mar 22, 2013 at 2:09 PM
Hi Kent,
Thank you for your fast response,

In my experience, CSV child delimiters can be of three kinds: prefix, infix and postfix, depending on where the row separator is placed.

Let's say <LF> is a row separator:

Prefix example (almost never used in CSV; the newline is placed before each row):
<LF>Kent,25,M
<LF>Belinda,26,F<EOF>
Postfix example (often used in CSV; the newline is placed after each row):
Kent,25,M<LF>
Belinda,26,F<LF><EOF>
Infix example (also often used in CSV; the newline is placed only between rows):
Kent,25,M<LF>
Belinda,26,F<EOF>
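To make the three placements concrete, here is a minimal C# sketch that joins already-formatted rows in each style. It is independent of KBCsv; the class and method names are made up for illustration, and "\n" stands in for <LF>.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of KBCsv: shows where the row separator lands.
public static class RowSeparatorPlacement
{
    // Prefix: the separator is written before every row.
    public static string Prefix(string[] rows)
    {
        return string.Concat(rows.Select(r => "\n" + r));
    }

    // Postfix: the separator is written after every row, including the last.
    public static string Postfix(string[] rows)
    {
        return string.Concat(rows.Select(r => r + "\n"));
    }

    // Infix: the separator is written only between rows.
    public static string Infix(string[] rows)
    {
        return string.Join("\n", rows);
    }
}
```

With the rows "Kent,25,M" and "Belinda,26,F", only Infix produces text that neither starts nor ends with the separator.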
In my situation I need to feed one system that accepts CSV in postfix notation
(according to our tests this is the default behavior of CsvWriter, but correct me if I'm wrong)
and another system that accepts only CSV in infix notation.

So we would like to use KBCsv for both systems, but we're unable to find a switch
to choose between the behaviors, or to think of a smart workaround that lets us keep using KBCsv in both scenarios.

Angelo
Coordinator
Mar 22, 2013 at 2:46 PM
Now I understand - thanks for the clear examples.

KBCsv reads both postfix and infix, but writes only postfix as you've discovered. There is no switch on CsvWriter to choose between the two - it always does postfix.

You could fork and add your own switch, or just strip the trailing <LF>:
using (var sw = new StringWriter())
using (var w = new CsvWriter(sw))
{
    w.WriteRecord("Foo", "Bar");
    w.WriteRecord("Biz", "Baz");

    var csv = sw.ToString();
    Console.Write(csv + "<END>");

    Console.WriteLine();
    Console.WriteLine();

    var stripped = csv.Substring(0, csv.Length - w.NewLine.Length);
    Console.Write(stripped + "<END>");
}
Output:
Foo,Bar
Biz,Baz
<END>

Foo,Bar
Biz,Baz<END>
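Note that the Substring call above assumes the text really ends with the writer's newline; on an empty result it would throw. A slightly more defensive version might look like the sketch below. The CsvText class and StripTrailingNewLine method are hypothetical names, not part of KBCsv.

```csharp
using System;

// Hypothetical helper, not part of KBCsv.
public static class CsvText
{
    // Removes exactly one trailing newline if the text ends with it;
    // otherwise returns the text unchanged.
    public static string StripTrailingNewLine(string csv, string newLine)
    {
        if (string.IsNullOrEmpty(csv) || string.IsNullOrEmpty(newLine))
        {
            return csv;
        }

        // Ordinal comparison keeps the match independent of the current culture.
        return csv.EndsWith(newLine, StringComparison.Ordinal)
            ? csv.Substring(0, csv.Length - newLine.Length)
            : csv;
    }
}
```

In the example above it would be called as CsvText.StripTrailingNewLine(csv, w.NewLine).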
For a more flexible workaround, you could define your own TextWriter that automatically refrains from passing on the last Environment.NewLine to the inner TextWriter:
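using System;
using System.IO;
using System.Text;

// Wraps another TextWriter and holds back a trailing newline until more text
// arrives, so the final newline never reaches the inner writer.
public class InfixTextWriter : TextWriter
{
    private readonly TextWriter inner;
    private bool hasPendingNewLine;

    public InfixTextWriter(TextWriter inner)
    {
        this.inner = inner;
    }

    public override Encoding Encoding
    {
        get { return this.inner.Encoding; }
    }

    // Only this overload is intercepted, so the newline must arrive at the
    // end of a single Write(string) call.
    public override void Write(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            // Nothing to write; keep any held-back newline pending.
            return;
        }

        if (this.hasPendingNewLine)
        {
            // More text follows, so the held-back newline was not the last one.
            this.inner.Write(Environment.NewLine);
        }

        // Ordinal comparison avoids culture-sensitive matching of the newline.
        if (value.EndsWith(Environment.NewLine, StringComparison.Ordinal))
        {
            this.inner.Write(value.Substring(0, value.Length - Environment.NewLine.Length));
            this.hasPendingNewLine = true;
        }
        else
        {
            this.inner.Write(value);
            this.hasPendingNewLine = false;
        }
    }
}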
You could use it like this:
using (var sw = new StringWriter())
using (var iw = new InfixTextWriter(sw))
using (var w = new CsvWriter(iw))
{
    w.WriteRecord("Foo", "Bar");
    w.WriteRecord("Biz", "Baz");

    var csv = sw.ToString();
    Console.Write(csv + "<END>");
}
Output:
Foo,Bar
Biz,Baz<END>
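InfixTextWriter relies on the newline arriving at the end of a single Write(string) call. If that assumption does not hold (for example, the newline is written on its own or character by character), a variant that overrides Write(char) may be more robust, since the other TextWriter.Write overloads fall back to it by default. This is only a sketch: CharInfixTextWriter is a made-up name, and it assumes the newline is "\n" or "\r\n".

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical variant of InfixTextWriter; assumes newLine is "\n" or "\r\n".
public class CharInfixTextWriter : TextWriter
{
    private readonly TextWriter inner;
    private readonly string newLine;
    private int held; // number of newline characters currently withheld

    public CharInfixTextWriter(TextWriter inner, string newLine)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        if (string.IsNullOrEmpty(newLine)) throw new ArgumentException("newLine must not be empty", "newLine");
        this.inner = inner;
        this.newLine = newLine;
    }

    public override Encoding Encoding
    {
        get { return this.inner.Encoding; }
    }

    public override void Write(char value)
    {
        if (this.held == this.newLine.Length)
        {
            // A complete newline was withheld and more text follows: release it.
            this.inner.Write(this.newLine);
            this.held = 0;
        }

        if (value == this.newLine[this.held])
        {
            // Possibly part of a trailing newline: withhold it for now.
            this.held++;
            return;
        }

        if (this.held > 0)
        {
            // The withheld characters were not a newline after all.
            this.inner.Write(this.newLine.Substring(0, this.held));
            this.held = 0;
            if (value == this.newLine[0])
            {
                this.held = 1;
                return;
            }
        }

        this.inner.Write(value);
    }
}
```

It would be constructed with the same newline the CSV writer emits, e.g. new CharInfixTextWriter(sw, Environment.NewLine). A newline that is still withheld when writing stops is dropped on purpose, which is what produces the infix output.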
Hope that helps,
Kent