Speeding up LINQ to SQL

There are a number of ways to improve the performance of LINQ to SQL queries.

1. Cache the Mapping Source

When executing large LINQ to SQL queries (those involving many tables), LINQ to SQL's object relational mapper incurs a significant overhead in building the internal metamodel. The metamodel describes how your entity classes map to the underlying tables and columns, and is built automatically via reflection (assuming you're using attributes rather than an XML file, to map tables and columns). Ordinarily, this overhead is incurred every time you instantiate a DataContext, which is not great for performance. You can avoid this by manually specifying a single static MappingSource object for all typed DataContext instances as follows:

public class MyDataContext : DataContext
{
   static MappingSource _sharedMappingSource = new AttributeMappingSource();

   public MyDataContext (IDbConnection cx) : base (cx, _sharedMappingSource) { }
   public MyDataContext (string cxString) : base (cxString, _sharedMappingSource) { }
}

You might wonder what happens in a multithreaded environment. What if two threads (e.g., from two different client requests) instantiate their own instances of MyDataContext and start executing queries at the same time? Fortunately, everything works correctly because MetaModel instances are thread-safe. I've used this technique in a large CRM application that uses LINQ to SQL almost entirely for data access: the system has over 500 users and has been running without trouble since early 2008. (Without this optimization, some queries were simply too slow for LINQ to SQL to be viable).

Note that although MetaModel (MappingSource) instances are thread-safe, the same is not true for DataContext instances. You cannot share a single DataContext instance between multiple threads, or things will go terribly wrong! (And not that you'd want to: doing so would mean taking on unpleasant concurrency issues that are otherwise handled very well by SQL Server itself, through its transaction isolation level semantics).

2. Use Compiled Queries

Ordinarily, LINQ to SQL must translate LINQ queries to SQL every time a query executes; this involves recursing the expression tree that makes up the query in several stages. It sounds worse than it is: the computation cost is not enormous in the overall scheme of things (certainly not as big as the cost of building an AttributeMappingSource when lots of entities are involved). Nonetheless, you can avoid paying the price on each query execution by precompiling the query using the CompiledQuery class.

Here's an example that you can paste directly into LINQPad:

var cc = CompiledQuery.Compile ((TypedDataContext dc, decimal minPrice) =>    
   from c in Customers
   where c.Purchases.Any (p => p.Price > minPrice)
   select c
);

cc (this, 100).Dump ("Customers who spend more than $100");
cc (this, 1000).Dump ("Customers who spend more than $1000");

3. Use Table-Valued Functions Where Necessary

Finally, there are some queries for which LINQ to SQL producesless-than-ideal SQL. Furthermore, some SQL queries need optimization hints to get the best performance, and optimization hints are impossible with LINQ to SQL alone. A compromise in these situations is to encapsulate the difficult part of the query in a table-valued function, and then map this function into your typed DataContext class. You can then run LINQ queries over this function, and the queries will be translated to SQL and execute on the server. So, in effect, you're mix SQL and LINQ to SQL in the same query.