Commit e2f3efb

Docs for query splitting (#2486)
Part of #2479
1 parent 938ec49 commit e2f3efb

3 files changed: +59 -4 lines changed

entity-framework/core/querying/related-data.md

+47-2
@@ -48,8 +48,53 @@ You may want to include multiple related entities for one of the entities that i
4848

4949
[!code-csharp[Main](../../../samples/core/Querying/RelatedData/Sample.cs#MultipleLeafIncludes)]
5050

51-
> [!CAUTION]
52-
> Since version 3.0.0, each `Include` will cause an additional JOIN to be added to SQL queries produced by relational providers, whereas previous versions generated additional SQL queries. This can significantly change the performance of your queries, for better or worse. In particular, LINQ queries with an exceedingly high number of `Include` operators may need to be broken down into multiple separate LINQ queries in order to avoid the cartesian explosion problem.
51+
### Single and split queries
52+
53+
> [!NOTE]
54+
> This feature is introduced in EF Core 5.0.
55+
56+
In relational databases, all related entities are by default loaded by introducing JOINs:
57+
58+
```sql
59+
SELECT [b].[BlogId], [b].[OwnerId], [b].[Rating], [b].[Url], [p].[PostId], [p].[AuthorId], [p].[BlogId], [p].[Content], [p].[Rating], [p].[Title]
60+
FROM [Blogs] AS [b]
61+
LEFT JOIN [Post] AS [p] ON [b].[BlogId] = [p].[BlogId]
62+
ORDER BY [b].[BlogId], [p].[PostId]
63+
```
64+
65+
If a typical blog has multiple related posts, rows for these posts will duplicate the blog's information, leading to the so-called "cartesian explosion" problem. As more one-to-many relationships are loaded, the amount of duplicated data may grow and adversely affect the performance of your application.
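
The duplication described above can be reproduced without a database. The following is a minimal in-memory sketch, not EF Core output: the `Blog` and `Post` classes and the sample data are hypothetical stand-ins, and a LINQ-to-objects left join plays the role of the SQL `LEFT JOIN`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the Blogs and Post tables (assumption: no database involved).
public class Blog
{
    public int BlogId { get; set; }
    public string Url { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public int BlogId { get; set; }
    public string Title { get; set; }
}

public static class JoinDuplicationSketch
{
    public static void Main()
    {
        var blogs = new List<Blog>
        {
            new Blog { BlogId = 1, Url = "https://example.com/one" },
            new Blog { BlogId = 2, Url = "https://example.com/two" }
        };
        var posts = new List<Post>
        {
            new Post { PostId = 1, BlogId = 1, Title = "First" },
            new Post { PostId = 2, BlogId = 1, Title = "Second" },
            new Post { PostId = 3, BlogId = 1, Title = "Third" }
        };

        // Left join: one row per (blog, post) pair, plus one row for a blog without posts.
        var singleQueryRows =
            (from b in blogs
             join p in posts on b.BlogId equals p.BlogId into blogPosts
             from post in blogPosts.DefaultIfEmpty()
             select new { b.BlogId, b.Url, PostId = post?.PostId, Title = post?.Title })
            .ToList();

        // Every joined row repeats the blog columns.
        Console.WriteLine($"Joined rows: {singleQueryRows.Count}");                            // 4
        Console.WriteLine($"Copies of blog 1: {singleQueryRows.Count(r => r.BlogId == 1)}");   // 3

        // Split shape: blog columns travel once per blog, post columns once per post.
        Console.WriteLine($"Split rows: {blogs.Count} blog rows + {posts.Count} post rows");   // 2 + 3
    }
}
```

With three posts, the blog's columns are transferred three times in the joined shape. Each additional one-to-many navigation loaded in the same query multiplies the row count further.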
66+
67+
EF allows you to specify that a given LINQ query should be *split* into multiple SQL queries. Instead of JOINs, split queries perform an additional SQL query for each included one-to-many navigation:
68+
69+
[!code-csharp[Main](../../../samples/core/Querying/RelatedData/Sample.cs?name=AsSplitQuery&highlight=5)]
70+
71+
This will produce the following SQL:
72+
73+
```sql
74+
SELECT [b].[BlogId], [b].[OwnerId], [b].[Rating], [b].[Url]
75+
FROM [Blogs] AS [b]
76+
ORDER BY [b].[BlogId]
77+
78+
SELECT [p].[PostId], [p].[AuthorId], [p].[BlogId], [p].[Content], [p].[Rating], [p].[Title], [b].[BlogId]
79+
FROM [Blogs] AS [b]
80+
INNER JOIN [Post] AS [p] ON [b].[BlogId] = [p].[BlogId]
81+
ORDER BY [b].[BlogId]
82+
```
83+
84+
While this avoids the performance issues associated with JOINs and cartesian explosion, it also has some drawbacks:
85+
86+
* While most databases guarantee data consistency for single queries, no such guarantees exist for multiple queries. This means that if the database is being updated concurrently as your queries are being executed, resulting data may not be consistent. This may be mitigated by wrapping the queries in a serializable or snapshot transaction, although this may create performance issues of its own. Consult your database's documentation for more details.
87+
* Each query currently implies an additional network roundtrip to your database; this can degrade performance, especially where latency to the database is high (e.g. cloud services). EF Core will improve this in the future by batching the queries into a single roundtrip.
88+
* While some databases allow consuming the results of multiple queries at the same time (SQL Server with MARS, SQLite), most allow only a single query to be active at any given point. This means that all results from earlier queries must be buffered in your application's memory before executing later queries, increasing your memory requirements in a potentially significant way.
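
The transaction mitigation from the first drawback could look roughly like the sketch below. It is an illustration under stated assumptions rather than the sample project's code: the `Blog`, `Post`, and `BloggingContext` types are hypothetical minimal stand-ins, the SQLite file name is made up, and the example assumes the `Microsoft.EntityFrameworkCore.Sqlite` package at a version that provides `AsSplitQuery`.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical minimal model; the real sample's entities have more properties.
public class Blog
{
    public int BlogId { get; set; }
    public string Url { get; set; }
    public List<Post> Posts { get; } = new List<Post>();
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
    public int BlogId { get; set; }
    public Blog Blog { get; set; }
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }

    // Assumption: a local SQLite file is used purely for illustration.
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder.UseSqlite("Data Source=split-query-sketch.db");
}

public static class SplitQueryTransactionSketch
{
    public static void Main()
    {
        using (var context = new BloggingContext())
        {
            context.Database.EnsureCreated();

            // Both SQL queries generated by the split query run inside one
            // serializable transaction, so they should observe the same data.
            using (var transaction = context.Database.BeginTransaction(IsolationLevel.Serializable))
            {
                var blogs = context.Blogs
                    .Include(blog => blog.Posts)
                    .AsSplitQuery()
                    .ToList();

                transaction.Commit();
                Console.WriteLine($"Loaded {blogs.Count} blogs");
            }
        }
    }
}
```

Which isolation levels are available, and what they cost under concurrent writes, depends on the database; a serializable transaction can block or abort competing writers.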
89+
90+
Unfortunately, there isn't one strategy for loading related entities that fits all scenarios. Carefully consider the advantages and disadvantages of single and split queries, and select the one that fits your needs.
91+
92+
> [!NOTE]
93+
> One-to-one related entities are always loaded via JOINs, as this has no performance impact.
94+
>
95+
> At the moment, use of query splitting on SQL Server requires setting `MultipleActiveResultSets=true` in your connection string. This requirement will be removed in a future preview.
96+
>
97+
> Future previews of EF Core 5.0 will allow specifying query splitting as the default for your context.
5398
5499
### Filtered include
55100

samples/core/Querying/Querying.csproj

+2-2
@@ -8,8 +8,8 @@
88
</PropertyGroup>
99

1010
<ItemGroup>
11-
<PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="3.1.3" />
12-
<PackageReference Include="Microsoft.EntityFrameworkCore.Sqlite" Version="3.1.3" />
11+
<PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="5.0.0-preview.6.20312.4" />
12+
<PackageReference Include="Microsoft.EntityFrameworkCore.Sqlite" Version="5.0.0-preview.6.20312.4" />
1313
</ItemGroup>
1414

1515
</Project>

samples/core/Querying/RelatedData/Sample.cs

+10
@@ -86,6 +86,16 @@ public static void Run()
8686
}
8787
#endregion
8888

89+
#region AsSplitQuery
90+
using (var context = new BloggingContext())
91+
{
92+
var blogs = context.Blogs
93+
.Include(blog => blog.Posts)
94+
.AsSplitQuery()
95+
.ToList();
96+
}
97+
#endregion
98+
8999
#region Eager
90100
using (var context = new BloggingContext())
91101
{
