OutOfMemoryException

Jan 12, 2011 at 5:41 PM

I am attempting to read a CSV file that has a little over 8 million records and is approximately 3 GB in size. I am running 64-bit Windows 7 with 16 GB of memory and plenty of disk space.

I used your code sample as follows:

using (var reader = new CsvReader("myfile.csv"))
{
    reader.ReadHeaderRecord();

    foreach (var record in reader.DataRecords)
    {
        if (reader.RecordNumber % 10000 == 0)
        {
            Console.WriteLine(reader.RecordNumber.ToString("#,###"));
        }    
    }
}

I received the following error on record 47,895 (Note: The file should be well formed as it imports correctly using the SQL Server Import and Export Wizard, which is VERY picky *lol*). I do not see any obvious problems with records in the vicinity either:

System.OutOfMemoryException was unhandled
  Message=Exception of type 'System.OutOfMemoryException' was thrown.
  Source=Kent.Boogaart.KBCsv
  StackTrace:
       at Kent.Boogaart.KBCsv.CsvParser.EnsureValueBufferCapacity(Int32 count) in c:\Repository\kbcsv\trunk\Src\Kent.Boogaart.KBCsv\CsvParser.cs:line 768
       at Kent.Boogaart.KBCsv.CsvParser.AppendToValue(Int32 startIndex, Int32 endIndex) in c:\Repository\kbcsv\trunk\Src\Kent.Boogaart.KBCsv\CsvParser.cs:line 704
       at Kent.Boogaart.KBCsv.CsvParser.CloseValuePartExcludeCurrent() in c:\Repository\kbcsv\trunk\Src\Kent.Boogaart.KBCsv\CsvParser.cs:line 451
       at Kent.Boogaart.KBCsv.CsvParser.ParseRecord() in c:\Repository\kbcsv\trunk\Src\Kent.Boogaart.KBCsv\CsvParser.cs:line 370
       at Kent.Boogaart.KBCsv.DataRecord.FromParser(HeaderRecord headerRecord, CsvParser parser) in c:\Repository\kbcsv\trunk\Src\Kent.Boogaart.KBCsv\DataRecord.cs:line 78
       at Kent.Boogaart.KBCsv.CsvReader.ReadDataRecord() in c:\Repository\kbcsv\trunk\Src\Kent.Boogaart.KBCsv\CsvReader.cs:line 631
       at Kent.Boogaart.KBCsv.CsvReader.<get_DataRecords>d__0.MoveNext() in c:\Repository\kbcsv\trunk\Src\Kent.Boogaart.KBCsv\CsvReader.cs:line 430
       at JumboConsole.Program.importVoterHistory() in D:\My Documents\Visual Studio 2010\Projects\JumboConsole\JumboConsole\Program.cs:line 133
       at JumboConsole.Program.Main(String[] args) in D:\My Documents\Visual Studio 2010\Projects\JumboConsole\JumboConsole\Program.cs:line 24
       at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
       at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
       at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
       at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading.ThreadHelper.ThreadStart()
  InnerException:
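
The trace above ends in EnsureValueBufferCapacity, which by its name suggests that the buffer for a single value was being grown when memory ran out. One possible cause (an assumption, not something confirmed in this thread) is an unbalanced double quote that makes the parser treat everything after it as one value. The following is a minimal sketch for checking that, independent of KBCsv; QuoteCheck and FindOddQuoteLines are hypothetical names introduced only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical diagnostic helper; it is not part of KBCsv.
public static class QuoteCheck
{
    // Returns the 1-based numbers of physical lines that contain an odd
    // number of double-quote characters. An odd count means the line either
    // starts or ends a multi-line quoted value, or has an unbalanced quote.
    public static List<int> FindOddQuoteLines(IEnumerable<string> lines)
    {
        var result = new List<int>();
        var lineNumber = 0;

        foreach (var line in lines)
        {
            ++lineNumber;

            // Escaped quotes ("") contribute two characters, so they do not
            // change whether the count is odd or even.
            var quotes = line.Count(c => c == '"');

            if (quotes % 2 != 0)
            {
                result.Add(lineNumber);
            }
        }

        return result;
    }
}
```

Passing File.ReadLines("myfile.csv") as the argument streams the file one line at a time, so the 3 GB input is never loaded in full. Lines reported by this check are only candidates for inspection, since a quoted value that legitimately spans several lines also produces odd counts on its first and last line.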

Jan 27, 2011 at 11:52 AM

Hello,

I have the same problem, and in my case the number of records is much lower.

Does anyone know how to solve this?

Thank you,

Miguel

Coordinator
Jan 28, 2011 at 7:39 PM
Edited Jan 29, 2011 at 9:19 AM

Guys,

Thanks for reporting this. Unfortunately, I cannot reproduce it, neither on an old, crappy machine nor on my 64-bit Windows 7 box with 8 GB of RAM.

Here's the test code I ran:

using System;
using System.Text;
using Kent.Boogaart.KBCsv;

namespace KBCsvTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // comment out as necessary
            CreateBigCsvFile();
            //ReadBigCsvFile();

            Console.WriteLine();
            Console.WriteLine();
            Console.WriteLine("DONE - PRESS ANY KEY");
            Console.ReadKey();
        }

        private static void CreateBigCsvFile()
        {
            using (var writer = new CsvWriter(@"SampleData.csv"))
            {
                writer.WriteHeaderRecord("Name", "DOB", "Height", "Weight");
                var random = new Random();
                var records = 135000000;

                for (var i = 0; i < records; ++i)
                {
                    var name = GenerateName(random);
                    var dob = new DateTime(random.Next(1000, int.MaxValue));
                    var height = random.Next(100, 210);
                    var weight = random.Next(30, 300);

                    writer.WriteDataRecord(name, dob.ToShortDateString(), height.ToString(), weight.ToString());

                    if ((i % 1000) == 0)
                    {
                        Console.CursorLeft = 0;
                        Console.Write("{0:p}    ", (i + 1) / (double)records);
                    }
                }

                Console.CursorLeft = 0;
                Console.WriteLine("100%     ");
            }
        }

        private static readonly StringBuilder name = new StringBuilder();

        private static string GenerateName(Random random)
        {
            var length = random.Next(3, 20);
            name.Length = 0;
            var chars = "abcdefghijklmnopqrstuvwxyz";

            for (var i = 0; i < length; ++i)
            {
                name.Append(chars[random.Next(0, chars.Length)]);
            }

            return name.ToString();
        }

        private static void ReadBigCsvFile()
        {
            using (var reader = new CsvReader(@"SampleData.csv"))
            {
                reader.ReadHeaderRecord();

                foreach (var record in reader.DataRecords)
                {
                    if (reader.RecordNumber % 10000 == 0)
                    {
                        Console.CursorLeft = 0;
                        Console.Write("{0} records read", reader.RecordNumber);
                    }
                }
            }
        }
    }
}

Can you please confirm whether the above works for you? If so, we may need to look at the actual data you're attempting to parse. Shapper, if there's any way you could send me your CSV file, that would be very helpful.

Thanks,
Kent
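
Since a multi-gigabyte file is awkward to send, a smaller sample around the failing record may be enough to reproduce the problem. The sketch below keeps the header line plus a range of physical lines; SampleExtractor and Slice are hypothetical names, and the code assumes that record numbers map roughly onto line numbers, which only holds if values do not span multiple lines:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for cutting a small repro file; not part of KBCsv.
public static class SampleExtractor
{
    // Yields line 1 (assumed to be the header) followed by every line whose
    // 1-based number lies in [firstLine, lastLine]. Enumeration stops as
    // soon as lastLine has been passed, so the rest of the file is not read.
    public static IEnumerable<string> Slice(IEnumerable<string> lines, int firstLine, int lastLine)
    {
        var lineNumber = 0;

        foreach (var line in lines)
        {
            ++lineNumber;

            if (lineNumber == 1)
            {
                yield return line;
            }
            else if (lineNumber > lastLine)
            {
                yield break;
            }
            else if (lineNumber >= firstLine)
            {
                yield return line;
            }
        }
    }
}
```

For example, File.WriteAllLines("sample.csv", SampleExtractor.Slice(File.ReadLines("myfile.csv"), 47800, 48000)) would write the header and about two hundred lines surrounding record 47,895 to a new file. The file names and line range here are placeholders to adjust as needed.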