Tag Archives: String.Join in Entity Framework
Using LINQ Group By and String.Join() / Aggregate() in Entity Framework 3.5
“Group By” in LINQ with “String.Join(), Aggregate()” Problem
First of all, if you ever need to use “Group By” in LINQ on Entity Framework (EF), and you want to use “String.Join()” to convert a LIST to a delimited string, you may bump into this error easily:
- “LINQ to Entities does not recognize the method ‘System.String Join(System.String, System.String[])’ method, and this method cannot be translated into a store expression. “
If you use “Aggregate()”, you will get another error message.
- LINQ to Entities does not recognize the method ‘System.String Aggregate[String](System.Collections.Generic.IEnumerable`1[System.String], System.Func`3[System.String,System.String,System.String])’ method, and this method cannot be translated into a store expression.
The most annoying part is, this works fine in LINQ-To-SQL, but not LINQ-To-Entity. After searching on the Internet, one of the solutions is to convert your object query using .ToList() and you can use String.Join(). Here is an example, Users and Roles are Entities in Data Context which are mapped as Many-To-Many (In relational database behind the scene, there are 3 tables).
For such data, we have:
jSmith Administrator
jSmith Users
We want to group them and display it in Grid:
jSmith Administrator, User
It is easy to do it in SQL and TSQL, however LINQ in Entity Framework, it is a little bit tricky.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | var result1 = (from a in users b in roles where (a.RoleCollection.Any(x => x.RoleId = b.RoleId)) group a by a.UserName into userGroup select new { UserName = userGroup.FirstOrDefault().UserName, RoleNames = userGroup.FirstOrDefault().RoleCollection.Select(x => x.RoleName) }); var result2 = (from a in result1.ToList() select new { UserName = a.UserName, RoleNames = String.Join(", ", a.RoleNames.ToArray()) }); |
The above code actually worked fine when it was binded to the Grid, “RoleNames” showed properly by separated with a delimiter, i.e. “Administrator, User”.
Performance Issue
When I looked at the code, at first I didn’t find anything wrong with it. However when I threw 50,000 records on the Users table and ran the above code again, we got serious performance issue. It took so long for the page to load and everytime you flip to another page on the Grid, it took more than 20 seconds. That is totally not usable.
LINQ’s “Group By” does not directly translated into SQL’s “Group By”, they can be very expensive. Please refer to MSDN for details. Let’s take a look at the following by moving the LINQ group by to result2:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | var result1 = (from a in users b in roles where (a.RoleCollection.Any(x => x.RoleId = b.RoleId)) select new { UserName = a.UserName, RoleNames = b.RoleName) }); var result2 = (from a in result1.ToList() group a by a.UserName into userGroup select new { UserName = userGroup.FirstOrDefault().UserName, RoleNames = String.Join(", ", (userGroup.Select(x => x.RoleNames)).ToArray()) }); |
This time, it only took 2 seconds to flip through each page on the Gird, with the same 50,000 records. That’s a 10 times difference.
To further improve on the speed, replace String.Join() with Aggregate():
1 2 3 4 5 6 | // String.Join() RoleNames = String.Join(", ", (userGroup.Select(x => x.RoleNames)).ToArray()) // Aggregate() RoleNames = (userGroup.Select(x => x.RoleNames)).Aggregate((a,b) => (a + ", " + b)) |
This improved the speed by 1 more second on getting 50,000 records.
Interesting Findings
In the 2nd code above, if you change “result1.ToList()” to “result1″ and then comment out “RoleNames” line like the following. The performance is going slower again, it took about 10 seconds to flip to next page on the Grid. Interesting…
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | var result1 = (from a in users b in roles where (a.RoleCollection.Any(x => x.RoleId = b.RoleId)) select new { UserName = a.UserName, RoleNames = b.RoleName) }); var result2 = (from a in result1 group a by a.UserName into userGroup select new { UserName = userGroup.FirstOrDefault().UserName, // RoleNames = String.Join(", ", (userGroup.Select(x => x.RoleNames)).ToArray()) }); |
It seems .ToList() and group by on the same query will make the performance faster, probably it translates to SQL properly.
If you have LINQ Profiler, you can see it behind the scene and then compare the 3 cases on SQL level. Unfortunately I don’t have such tool and I am pretty happy with the result, I ain’t gonna spend more time on this to further optimize it (perhaps it is possible).