C#

Grouping Queries in LINQ

To begin with, this post aims to make you understand Grouping Queries in LINQ. Basically, we need to group data whenever the data elements exhibit a certain common feature. In other words, some of the members of a sequence share a specific value for an attribute while other members may have another value for that attribute.

Specifically, this common feature is known as the Key. Moreover, we can have several attributes forming the key. Hence, all the attributes forming the key collectively called the Composite Key.

Accordingly, the Language-Integrated Query (LINQ) offers a group by clause that we can use to group the data in a sequence. Additionally, we can apply the aggregate operators on the grouped data in order to extract more information on each group.

Furthermore, if the data is spread over many sequences, it is required to first join all of these sequences before grouping. In fact, we can perform both the join operation and the group by operation in the same query. It is important to note that group by query is performed as deferred query execution rather than the immediate execution.

In order to find more information about these two kinds of query execution in LINQ, read this post on Deferred Query Execution and Immediate Query Execution in LINQ.

Examples of Grouping Queries in LINQ

As an illustration of grouping queries in LINQ, let us first create some types that we use later for creating data sources. Hence, we need the following three classes.

class Student
    {
        public int student_id { get; set; }
        public string sname { get; set; }
        public int course_id { get; set; }
        public override string ToString()
        {
            return $"Student ID: {student_id}, Name: {sname}, Course ID: {course_id}";
        }
    }
    class Course
    {
        public int course_id { get; set; }
        public string cname { get; set; }
        public override string ToString()
        {
            return $"Course ID: {course_id}, Course Name: {cname}";
        }

    }
    class Result
    {
        public int student_id { get; set; }
        public double percentage { get; set; }
        public override string ToString()
        {
            return $"Student ID: {student_id}, Percentage: {percentage}";
        }
    }

Further, we create the arrays of the above-mentioned classes in order to create the data sources as shown below.

Course[] arr1 = new Course[]
            {
                new Course{course_id=1, cname="MCA"},
                new Course{course_id=2, cname="MBA"}
            };
            Student[] arr2 = new Student[]
            {
                new Student{student_id=1, sname="A", course_id=1},
                new Student{student_id=2, sname="B", course_id=1},
                new Student{student_id=3, sname="C", course_id=2},
                new Student{student_id=4, sname="D", course_id=2},
                new Student{student_id=5, sname="E", course_id=2},
                new Student{student_id=6, sname="F", course_id=2},
                new Student{student_id=7, sname="G", course_id=2},
                new Student{student_id=8, sname="H", course_id=1},
                new Student{student_id=9, sname="I", course_id=2},
                new Student{student_id=10, sname="J", course_id=2},
                new Student{student_id=11, sname="K", course_id=1},
                new Student{student_id=12, sname="L", course_id=1},
                new Student{student_id=13, sname="M", course_id=1},
                new Student{student_id=14, sname="N", course_id=1},
                new Student{student_id=15, sname="O", course_id=1},
                new Student{student_id=16, sname="P", course_id=2},
                new Student{student_id=17, sname="Q", course_id=2},
                new Student{student_id=18, sname="R", course_id=1},
                new Student{student_id=19, sname="S", course_id=1},
                new Student{student_id=20, sname="T", course_id=1},
                new Student{student_id=21, sname="U", course_id=2},
            };
            Result[] arr3 = new Result[]
            {
                new Result{student_id=1, percentage=66.9},
                new Result{student_id=2, percentage=82.5},
                new Result{student_id=3, percentage=56.3},
                new Result{student_id=4, percentage=70.7},
                new Result{student_id=5, percentage=37.8},
                new Result{student_id=6, percentage=49.4},
                new Result{student_id=7, percentage=46.9},
                new Result{student_id=8, percentage=85.4},
                new Result{student_id=9, percentage=67.8},
                new Result{student_id=10, percentage=57.3},
                new Result{student_id=11, percentage=80.3},
                new Result{student_id=12, percentage=28.6},
                new Result{student_id=13, percentage=88.4},
                new Result{student_id=14, percentage=24.8},
                new Result{student_id=15, percentage=88.7},
                new Result{student_id=16, percentage=58.7},
                new Result{student_id=17, percentage=84.2},
                new Result{student_id=18, percentage=73.7},
                new Result{student_id=19, percentage=59.3},
                new Result{student_id=20, percentage=63.9},
                new Result{student_id=21, percentage=75.9}
            };

Once, data sources have been created, we can use them in our queries. Therefore, first we create a simple group by query on the array of Student objects with the Course_ID as the key of the group. The following code shows the query.

Console.WriteLine("Grouping Student Records on the basis of Course ID...");
            //Grouping Students by their Course ID
            var q1=from s1 in arr2
                   group s1 by s1.course_id into group1
                   select group1; 
            //Executing the Group By Query
            foreach(var ob in q1)
            {
                Console.WriteLine($"Grouping on Course ID: {ob.Key}");
                foreach (Student s in ob)
                    Console.WriteLine(s);
            }

Output

Grouping Student Records on Course_ID as the Key
Grouping Student Records on Course_ID as the Key

After that, we retrieve data from other data sources also. Therefore, we need to perform a join operation before grouping. As an illustration, see the following query. The following query shows grouping on course name as the key rather than course ID.

Console.WriteLine("Grouping Queries by Course Name...");
            //Grouping Students by the Course Name
            var q2 = from s1 in arr2
                     join s2 in arr1 on s1.course_id equals s2.course_id
                     select new { s1, s2 } into t1
                     group t1 by t1.s2.cname into g1
                     select g1;
            //Executing the Group By Query
            foreach (var ob in q2)
            {
                Console.WriteLine($"Grouping on Course ID: {ob.Key}");
                foreach (var s in ob)
                    Console.WriteLine($"Student ID: {s.s1.student_id}, Student Name: {s.s1.sname}");
            }

Output

Combining the join and group by Operations
Combining the join and group by Operations

In general, we can combine all three sequences in this example. In other words, we can perform join operation on all three arrays and then apply the group by the operation as shown in the following query. Hence, we can display the percentage along with other details of students while grouping on the Course Name.

 Console.WriteLine("Grouping Queries by Course Name and display percentage also...");
            //Grouping Students by the Course Name and joining three data sources
            var q3 = from s1 in arr2
                     join s2 in arr1 on s1.course_id equals s2.course_id
                     join s3 in arr3 on s1.student_id equals s3.student_id
                     orderby s3.percentage
                     select new { s1, s2, s3 } into t1
                     group t1 by t1.s2.cname into g1
                     select g1;
            //Executing the Group By Query
            foreach (var ob in q3)
            {
                Console.WriteLine($"Grouping on Course ID: {ob.Key}");
                foreach (var s in ob)
                    Console.WriteLine($"Student ID: {s.s1.student_id}, Student Name: {s.s1.sname}, Percentage: {s.s3.percentage}");
            }

Output

Joining the Three Sequences before Grouping
Joining the Three Sequences before Grouping

Lastly, we make use of aggregate operators on the grouped data. For this purpose, first we perform grouping as we have done earlier and then apply the aggregate operators Count and Average on the grouped data.

In short, first-ever the query performs a join operation on all three data sources and creates a new anonymous type containing the three sequences. Also, note the use of into operator for temporarily storing the result. Further, the query performs grouping on the course name field and creates another anonymous type containing the key as the Course, Count, and the Average.

Likewise, the query makes use of the lambda expression to compute the average of the percentage field. Finally, the foreach loop executes the query and displays the value of these three computed fields.

Console.WriteLine("Grouping Queries by Course Name and display Count and average percentage also...");
            //Grouping Students by the Course Name and joining three data sources
            var q4 = from s1 in arr2
                     join s2 in arr1 on s1.course_id equals s2.course_id
                     join s3 in arr3 on s1.student_id equals s3.student_id
                     orderby s3.percentage
                     select new { s1, s2, s3 } into t1
                     group t1 by t1.s2.cname into g1
                     select new { Course=g1.Key, Count = g1.Count(), Average = g1.Average(g1 => g1.s3.percentage) }; ;
            //Executing the Group By Query
            foreach (var ob in q4)
            {
                Console.WriteLine($"Grouping on the Course: {ob.Course}");
                Console.WriteLine($"Total Students: {ob.Count}");
                Console.WriteLine($"Average Percentage: {ob.Average}");
            }

Output

Using Aggregate Operators in the Group By Query
Using Aggregate Operators in the Group By Query

Summary

Basically, this article on Grouping Queries in LINQ explains the concept of grouping data on the basis of a key. Also, the use of aggregate operators in the group by query is explained with examples.

Whenever the data arrives from different sequences, it can be combined using the join query before grouping.


Related Topics

A Beginner’s Tutorial on WPF in C#

Everything about Tuples in C# and When to Use?

Linear Search and Binary Search in C#

Creating Jagged Arrays in C#

When should We Use Private Constructors?

C# Practice Questions

C# Basic Examples

Private and Static Constructors in C#

Constructors in C#

C# Arrays

C# Examples

How to Create a C# Console Application

Creating Navigation Window Application Using WPF in C#

LINQ To SQL Examples

Understanding the Concept of Nested Classes in C#

How to Setup a Connection with SQL Server Database in Visual Studio

C# Root Class – Object

KeyValuePair and its Applications

IEnumerable and IEnumerator Interfaces

IEqualityComparer Interface

New Features in C# 9

Generic IList Interface and its Implementation in C#

Examples of Connected and Disconnected Approach in ADO.NET

How Data Binding Works in WPF

Language-Integrated Query (LINQ) in C#

Examples of Query Operations using LINQ in C#

Join Operation using LINQ

Deferred Query Execution and Immediate Query Execution in LINQ

Understanding the Quantifiers in LINQ

Leave a Reply

Your email address will not be published. Required fields are marked *