LINQ vs Loop – A performance test
I just installed Visual Studio 2008 beta 2 to see what the future holds for C#. The addition of LINQ has brought a variety of query keywords to the language. “Anything” can be queried; SQL databases (naturally), XML documents, and regular collections. Custom queryable objects can also be created by implementing IQueryable. Sadly, like every abstraction, these goodies all come at a cost. The question is how much?
I decided to create a simple test to see how much of a performance hit LINQ is. The simple test I deviced finds the numbers in an array that are less than 10. The code is quoted below.
- public void LinqTest()
- {
- const int SIZE = 10000, RUNS = 1000;
- for (int i = 0; i < SIZE; i++)
- ints[i] = i;
- DateTime start = DateTime.Now;
- for (int t = 0; t < RUNS; ++t)
- {
- int[] less = (from int i in ints
- where i < 10
- select i).ToArray();
- }
- TimeSpan spent = DateTime.Now – start;
- Trace.WriteLine(string.Format("LINQ: {0}, avg. {1}",
- DateTime start2 = DateTime.Now;
- for (int t = 0; t < RUNS; ++t)
- {
- foreach (var i in ints)
- if (i < 10)
- l.Add(i);
- int[] less2 = l.ToArray();
- }
- TimeSpan spent2 = DateTime.Now – start2;
- Trace.WriteLine(string.Format("Loop: {0}, avg. {1}",
- }
Initially, I assumed the performance impact would not be too large, since its equivalent is the straightforward imperative loop, which should not be too hard for a compiler to deduce given static typing and a single collection to iterate across. Or?
- LINQ: 00:00:04.1052060, avg. 00:00:00.0041052
- Loop: 00:00:00.0790965, avg. 00:00:00.0000790
As you can see, the performance impact is huge. LINQ performs 50 times worse than the traditional loop! This seems rather wild at first glance, but the explanation is this: The keywords introduced by LINQ are syntactic sugar for method invocations to a set of generic routines for iterating across collections and filtering through lambda expressions. Naturally, this will not perform as good as a traditional imperative loop, and less optimization is possible.
Having seen the performance impact, I am still of the view that LINQ is a great step towards a more declarative world for developers. Instead of saying “take these numbers, iterate over all of them, and insert them into this list if they are less then ten”, which is an informal description of a classical imperative loop, you can now say “from these numbers, give me those that are less than ten”. The difference may be subtle, but the latter is in my opinion far more declarative and easy to read.
This may very well be the next big thing, but it comes at a cost. So far, my advice is to create simple performance tests for the cases where you consider adopting LINQ, to spot possible pitfalls as early as possible.
























Matt Warren said
on August 10 2007 at 04:07
Iteration of arrays is highly optimized by the C# compiler and .Net JIT. So if all you are doing is a tiny bit of work over an array and it is performance critical then iterate the loop directly. LINQ comes in handy in many other situations, as there are a lot of other API’s that expose collections that are not arrays.
Håvard said
on August 10 2007 at 11:04
Actually, foreach iteration across an array is not very optimized, but that’s another discussion. The point is rather that LINQ performs very lousily when it should not be necessary to perform so bad.
Still, it is worth mentioning that this is a test done with a beta release, which probably has an impact on the performance (logging, debugging information, recent changes that hit performance, etc.).
Anders Hejlsberg said
on October 31 2007 at 01:27
I just ran across this and tried it out.
It turns out that the culprit is the explicit mention of the “int” type in your query (“from int i in …”). This translates into a call to the Cast() extension method which ends up boxing and unboxing each integer. If you remove the “int” from the query you’ll see much better performance (about 2 times slower than the for loop). That’s representative of the expected performance.
Explicit typing of a range variable in a query is really only necessary when querying older non-generic collection types (such as ArrayList). We’ll give some thought to ways of optimizing away the Cast() overhead in cases where it isn’t needed.
Håvard said
on October 31 2007 at 03:40
Yes, removing the int indeed speeds up the query. Thank you, Anders, for noticing and clearing this up.
A Cast() optimization would be very welcome.
You say that about 2 times slower is the expected performance, but where does that time go? Is the lambda in the Where() that slow? Or do the additional method calls introduced by substituting the loop for a query bring with them that much penalty? If the select is also a lambda here (which I presume it is), there may be some gain there as well in some cases (you could just skip it and yield return directly in the case above).
Of course, the implementation might lose generality from performance improvement, but in my view a good implementation should not penalize the trivial cases. As a programmer I want a more declarative language, and being able to stop writing loops for the trivial cases and start writing declarative queries instead is a great step in the right direction.
Maybe the introduction of comprehensions could be an alternative to optimizing LINQ for trivial queries? C# is becoming more and more function-oriented, and a form of comprehensions would be a welcome addition in my opinion.
Axel said
on November 18 2007 at 22:49
Use System.Diagnostics.Stopwatch instead of manually constructing the time checks, it is much easier. Note that your input array should have been randomized or shuffled, it appears very predictive to the human eye, also, it does little justice to the sequential execution of the code because of the way branching will occur in one way the first 10 times, and the other way the other 10,000 times, which probably will affect the CPU’s pipeline prediction, yielding even better performance to the for-loop.
Håvard said
on November 25 2007 at 15:10
Axel, thanks for the tip on Stopwatch.
I just tried running the test with a randomized input array, and randomized partitioning point. Neither had any noticeable effect on the test results.
Pavol said
on December 12 2007 at 11:47
I also tried your performance test.(I’m using final release of .NET 3.5 and VS 2008). With a little change of code (making TimeSpan to Stopwatch) I got this results (in miliseconds): a) With explicit CAST to INT and result is Array: LINQ: 11550 FOREACH: 91 That’s terrible score…
b) As Anders said, removing explicit CAST to INT speed up things a lot. After this change: LINQ: 335 FOREACH: 91 So very nice change…
But also a huge impact to a results has ToArray() method. In this test, it’s called 1000 times, but in a real world you may work with large collections – make some LINQ operations on them with their implicit type (mostly IEnumerable) and then turn the result to Array, List or whatever. So with a little change made: RUNS = 1 (yes only loop) and SIZE = 10 000 000 and also delete ToArray() in LINQ and in FOREACH statement as well, you’ll get this piece of code:
var less = (from i in ints where i < 10 select i);
and
var l = new List(); foreach (var i in ints) if (i < 10) l.Add(i);
c) Run it and we’ll get: LINQ: 2 FOREACH: 111 d) if you add ToArray methods to both: LINQ: 346 FOREACH: 111 So only ToArray in LINQ case lasts 344 ms !!!
OK, to make some decisions, we should test more than this, but in a result I’m glad, that I may use LINQ to simplify code in a huge manner (especially with Lambda expressions) a know that performance impact would be minimal.
Exploring LINQ - Noticias externas said
on December 31 2007 at 06:13
[...] http://ox.no/posts/linq-vs-loop-a-performance-test [...]
Eric said
on February 15 2008 at 01:42
Hey great post. I was wondering the same thing. Some of my colleagues and I were pitching the idea around of “how exactly are they doing the query under the hood?”.
I was hoping that at least it was translating it into the same IL as a nested foreach; the idea that it’s slower is terrible. Booo!
john said
on February 22 2008 at 16:08
Pavol – the ToArray appears to take so long as the query is only executed at this point. Deferred query execution in LINQ means the query is only executed when you iterate through the query result (e.g. by calling ToArray). declaring and initialising the query doesn’t execute the query, but this initialisation probably accounts for the shorter time (2ms) you’ve seen.
Robert said
on March 2 2008 at 23:00
Wow, building in release, Loop is 10x faster than Linq without doing the int cast.
Bart said
on October 23 2008 at 23:56
An update using the latest service packs and running the same code as in your blog: the LINQ query is 5 – 6 times slower than the loop on my machine.
However, the LINQ query is probably 5 – 6 times more understandable than the loop for someone reading the code and trying to figure out what it’s doing.
Brian said
on December 9 2008 at 22:10
I ran the same test and didn’t get the same results
Linq .20 Loop .08
Håvard said
on January 7 2009 at 22:56
Brian: Interesting, with 3.5 SP1 I get the following results:
With explicit int: Linq .446 Loop .231
And without the explicit int: Linq .431 Loop .256
This could indicate that Cast() is optimized away in SP1, but I have not verified this.
Venkat said
on February 16 2009 at 05:47
All the times in ms.
I run the tests as Pavol suggested.
LINQ: 1, avg. 24 Loop: 92, avg. 1322
from int i / from i, There is no much difference
LINQ: 247, avg. 3803 Loop: 89, avg. 1286
JS said
on February 25 2009 at 22:08
If you run the compiled verision of the application (.exe or .dll), LINQ actually runs faster
running out of VS is not a valid test and should not be compared.
Brian said
on May 9 2009 at 19:29
This function does nothing, and thus benchmarking a compiled version is dangerous (so is benchmarking a debug version, but less so). A compiler will happily say, “This loop does nothing! Let’s skip it!” I’d be hesitant to trust any benchmarks that are not in use on production code…though they’re a nice tool for developers of compilers to use to check if their optimizations are working.
Eric Kaufman said
on May 15 2009 at 20:38
Just wanted to add in that with .Net 4.0 they’re upgrading the compiler, and adding parallelism for LINQ queries, which I expect will change things out a lot. For those of you who did hop on the LINQ train, I would imagine that a lot of your queries might just start to work a bit quicker once the new compiler comes out.
http://www.betanews.com/article/Everyone-talk-at-once-NET-40-will-include-Parallel-Extensions/1223931673
Ray said
on September 12 2009 at 05:03
My code for Linq:
for (int t = 0; t < RUNS; ++t) { int[] less = ints.Where(i => i < 10).ToArray(); }
My result: LINQ: 00:00:00.0936002, avg. 00:00:00.0000936 Loop: 00:00:00.1092002, avg. 00:00:00.0001092