LINQ System.OutOfMemoryException
Asked Answered
K

5

5

I have a long-running process in C# that hits a SQL table anywhere from 10 to 200 times. When the process exceeds about 50 hits and queries more than about 100,000 rows from the same table each time, it throws a System.OutOfMemoryException at this line, specifically at the bottom where it converts the IQueryable object to a List:

var cht = from p in _db.TickerData
          where p.Time >= Convert.ToDateTime(start) &&
          p.Time <= Convert.ToDateTime(end)
          orderby p.Time
          select p;

_prices = cht.ToList();    // <-- the System.OutOfMemoryException occurs here

What can I do to prevent this error?

Kamakura answered 20/9, 2015 at 15:33 Comment(3)
I have tried to create an index on the [Time] column in SQL Server Express. The current index is on the primary key [Id] column.Kamakura
So it runs out of memory when you try to load 100,000 rows into memory? How do you think you might fix this?Solve
The question is, what are you doing with _prices? After that we can tell you how to fix the problem.Intumescence
D
6

First off:

specifically at the bottom where it converts the IQuery object to a List

Yes, that's where you'd expect the Out Of Memory condition to occur.

The assignment of cht above doesn't actually hit the database; all it does is declare the shape of the query. This is called deferred execution and LINQ uses it all over the place. It means "we don't actually process anything until your code needs it."

Calling ToList, though, essentially says "the code needs it, all of it, right now." So that's where it sends the query to the database, pulls back all the results at once, uses LINQ magic to turn them into CLR objects, and stuffs them all into a List<T> for you.
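Deferred execution is easy to demonstrate even without a database. In this sketch (plain LINQ to Objects, an in-memory list standing in for the table), an item added after the query is built still shows up in the result, because the filter only runs when `ToList` is called:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    public static List<int> Run()
    {
        var source = new List<int> { 1, 2, 3 };

        // Building the query does not enumerate 'source' yet;
        // it only records the shape of the work to do.
        IEnumerable<int> query = source.Where(n => n > 1);

        // An item added before enumeration is still picked up,
        // proving the filter has not run yet.
        source.Add(4);

        return query.ToList();   // execution happens here
    }

    static void Main() =>
        Console.WriteLine(string.Join(",", Run()));   // prints 2,3,4
}
```

With a database-backed IQueryable the same principle holds, except that the "execution" step is a round trip that sends SQL to the server.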

Having said that, this is just a hunch, but it's possible that your LINQ provider doesn't know what Convert.ToDateTime is. If it doesn't know how to handle that, it won't put it into the WHERE clause in the query it executes, and instead it will load the entire table and filter it client-side, which might be why you crash when the table gets too big, rather than when the result set gets too big.

To verify this, use a profiler for your database to intercept the query, and see if the WHERE clause looks like you'd expect. If it's not translating right, try this instead:

var startTime = Convert.ToDateTime(start);
var endTime = Convert.ToDateTime(end);
var cht = from p in _db.TickerData
          where p.Time >= startTime && p.Time <= endTime
          orderby p.Time
          select p;
_prices = cht.ToList();

If that doesn't help, you're probably just pulling back too much data, and you'll have to work on that the same ways you'd work with processing too much data in any other context.

Dermatitis answered 20/9, 2015 at 15:45 Comment(2)
The problem was occurring due to Convert.ToDateTime(). Thanks @MasonKamakura
That problem, however, was only a subset of the real problem, which was using ToList instead of an IEnumerable to create a class object; that seemed to solve the larger issue. Hope that is an accurate description of how to use IEnumerable. That answer was mentioned by @Ivan below in the comments.Kamakura
N
5

The data you are trying to retrieve is too large for your list. The exception occurs on ToList() because that's exactly where the query executes. What do you want to achieve with such a large list? Possible solutions are:

1) Restrict your search with more criteria. Load not the whole data set, but part of it, and another part only if you really need it.

2) Use another data structure than a list if you want to hold the whole data set in memory; take a look at ConcurrentDictionary.
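A minimal sketch of option 2, keying rows by timestamp so later lookups are O(1). The `Price` type here is a hypothetical stand-in for the OP's `TickerData` row type:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for the OP's TickerData row type.
class Price
{
    public DateTime Time;
    public decimal Value;
}

static class DictDemo
{
    public static ConcurrentDictionary<DateTime, Price> Index(Price[] rows)
    {
        // Key each row by its timestamp; lookups by time are then O(1),
        // and the structure is safe to fill from multiple threads.
        var byTime = new ConcurrentDictionary<DateTime, Price>();
        foreach (var p in rows)
            byTime[p.Time] = p;
        return byTime;
    }

    static void Main()
    {
        var rows = new[]
        {
            new Price { Time = new DateTime(2015, 9, 20, 9, 0, 0), Value = 10m },
            new Price { Time = new DateTime(2015, 9, 20, 9, 1, 0), Value = 11m },
        };
        Console.WriteLine(DictDemo.Index(rows).Count);   // prints 2
    }
}
```

Note this doesn't by itself reduce memory use versus a list; it only changes how the loaded data is organized and accessed.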

Newfangled answered 20/9, 2015 at 16:24 Comment(0)
K
4

Your problem is that the query returns a very large set of data which needs to be stored in your process's memory. Too much data => OutOfMemoryException. That's normal. What's not normal is trying to do such a thing. Instead, you could limit the result set with some extra filtering, or break the large result set into smaller ones, maybe like so:

DateTime startDateTime = Convert.ToDateTime(start);
DateTime endDateTime = Convert.ToDateTime(end);
int fetched = 0;
int totalFetched = 0;

do
{
    // fetch a batch of 1000 records
    var _prices = _db.TickerData.Where(p => p.Time >= startDateTime && p.Time <= endDateTime)
                                .OrderBy(p => p.Time)
                                .Skip(totalFetched)
                                .Take(1000)
                                .ToList();

    // do here whatever you want with your batch of 1000 records
    fetched = _prices.Count;
    totalFetched += fetched;
}
while (fetched > 0);

This way you can process any amount of data, in batches.

EDIT: fixed some issues, as reported by @Lydell in the comments section.

EDIT: I suggest you set up an index at the database level on the Time column, if you haven't already, to speed up these queries.

Kucera answered 20/9, 2015 at 16:51 Comment(4)
I see 3 issues here. First, Take(1000) will always fetch the first 1000 records. You need to Skip(1000) after first iteration and continue fetching. Second, _prices is out of scope in the while statement. Third, IEnumerable<T> doesn't have a Count property. You need to invoke .Count() extension method on it.Lydell
@Lydell Thank you for spotting those problems. Fixed first two issues (post was made in a hurry...). As for third, ToList() returns a List<T> which does have the Count property.Kucera
You are correct. I always work with IEnumerable<T>. I forgot that it returns IList<T>.Lydell
@Lydell public static List<TSource> ToList<TSource>( this IEnumerable<TSource> source ) It really is returning a List<T>, not IList<T>.Kucera
L
2

Because of deferred execution, the query will execute when you call ToList() on it. Since loading all the data at once would consume too much memory, it's good practice to process it in batches.

The code below will let you fetch 1000 records (or whatever works best for you) at a time, and you can process them.

var startTime = Convert.ToDateTime(start);
var endTime = Convert.ToDateTime(end);

IEnumerable<T> prices = new List<T>();  // whatever T is

var currentFetched = 0;
var totalFetched = 0;

do
{
    var cht = _db.TickerData.Where(p => p.Time >= startTime && p.Time <= endTime)
                        .OrderBy(p => p.Time)
                        .Skip(totalFetched)
                        .Take(1000)
                        .ToList();

    currentFetched = cht.Count();
    totalFetched += currentFetched;

    // prices = prices.Concat(cht).ToList();  
    // ^ This might throw an exception later when the list becomes too big 
    // So you can probably process currently fetched data
}
while (currentFetched > 0);
Lydell answered 20/9, 2015 at 18:37 Comment(6)
This implementation retrieves records 1-1000, then 1001-2000, and repeatedly same 1001-2000 because after the first fetch, you will always skip only the first 1000 records.Kucera
Ah, thanks for noticing it. I accidentally deleted the part for calculating total fetched when fixing an earlier typo. Fixed it now.Lydell
I don't see how this answer helps. If the OP wanted to process the query result, he could do that himself by turning it into an IEnumerable instead of using ToList. Also, loading 100K records into memory is not such a big deal nowadays.Ahola
@Ivan I am going to try your suggestion on IEnumerable.Kamakura
@Ivan the real issue I was having was solved by creating an IEnumerable class instead of using ToList... but you did not give me an answer, only a comment on someone else's answer, so I could not mark it correct!Kamakura
@Kamakura Np, to be honest I thought you knew that; ToList is usually used when you need to bind to UI or do multiple things with the query result. Be careful though not to use LINQ sugars like Any, Count, etc., because an IEnumerable does not cache the query result and will re-execute the query. Anyway, glad that your problem has been resolved, that's the most important thing, and the answer you picked is just fine. Take care.Ahola
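The streaming approach discussed in these comments (expose the query as an IEnumerable and process rows one at a time instead of materializing with ToList) can be sketched as follows; the in-memory `Enumerable.Range` source is a stand-in for the database table:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class StreamDemo
{
    // Expose the query as IEnumerable instead of calling ToList:
    // rows are then pulled one at a time, never all held in memory.
    public static IEnumerable<int> GetPrices(IEnumerable<int> source, int min) =>
        source.Where(p => p >= min);   // still deferred

    public static long CountMatches()
    {
        var table = Enumerable.Range(1, 1_000_000);   // stand-in for the DB table
        long count = 0;
        foreach (var p in GetPrices(table, 500_000))
            count++;   // each row is processed and then discarded
        return count;
    }

    static void Main() => Console.WriteLine(CountMatches());   // prints 500001
}
```

The caveat from the comment above applies: with a real EF query, every re-enumeration (including Any/Count on the same IEnumerable) re-executes the SQL, so enumerate it once.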
A
0

What I did with mine: I just let it return an IQueryable object. You can still enumerate it like a list, but it performs a lot better.
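A sketch of this idea, using an in-memory `AsQueryable` source as a stand-in for `_db.TickerData`: returning IQueryable lets later operators compose into the same query, so filters and paging can be applied at the source rather than after everything has been loaded into a list.

```csharp
using System;
using System.Linq;

static class QueryableDemo
{
    // Returning IQueryable means later Where/Take calls compose into
    // the same query instead of filtering an already-loaded list.
    public static IQueryable<int> GetPrices(IQueryable<int> table) =>
        table.OrderBy(p => p);

    public static string Run()
    {
        var table = Enumerable.Range(1, 100).AsQueryable();   // stand-in for _db.TickerData
        var top = GetPrices(table).Where(p => p > 90).Take(5).ToList();
        return string.Join(",", top);
    }

    static void Main() => Console.WriteLine(Run());   // prints 91,92,93,94,95
}
```

With a real EF provider, the composed Where/Take would be translated into the SQL sent to the server.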

Afterbody answered 11/1, 2017 at 17:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.