
Introducing Query Expressions

From the document ebin.pub essential c 120 8nbsped 9780138219550 (pages 115-148)

"false","finally","fixed","float","for","

"from*","get*","global*","goto","group*", "in","int","interface","internal","into*"

"let*","lock","long", "nameof*", "namespa "nonnull*","null","object","on*","operato "out","override","params","partial*","pri "protected","public","readonly","ref","re "sbyte","sealed","select*","set*","short"

"stackalloc","static","string","struct","

"throw","true","try","typeof","uint","ulo "unsafe","ushort","using","value*","var*"

"void","volatile","when*","where*","while });

}

// ...

using System;
using System.Collections.Generic;
using System.Linq;

// ...

private static void ShowContextualKeywords() {
    IEnumerable<string> selection =
        from word in CSharp.Keywords
        where !word.Contains('*')
        select word;

    foreach (string keyword in selection) {
        Console.Write(keyword + " ");
    }
}
// ...

OUTPUT 16.1

abstract as base bool break byte case catch char chec continue decimal default delegate do double else enum extern false finally fixed float for foreach goto if interface internal is lock long namespace new null ob override params private protected public readonly ref sealed short sizeof stackalloc static string struct s true try typeof uint ulong unchecked unsafe ushort us volatile while

In this query expression, selection is assigned the collection of C# reserved keywords. The where clause in this example excludes the contextual keywords, which the source array marks with a trailing asterisk, so only the reserved (noncontextual) keywords remain.

Query expressions always begin with a from clause and end with a select clause or a group clause, identified by the from, select, or group contextual keyword, respectively. The identifier word in the from clause is called a range variable; it represents each item in the collection, much as the loop variable in a foreach loop represents each item in a collection.

Developers familiar with SQL will notice that query expressions have a syntax that is similar to that of SQL. This design was deliberate: it was intended that programmers who already know SQL should find it easy to learn LINQ. However, there are some obvious differences. The first difference that most SQL-experienced developers will notice is that the C# query expression shown here has the clauses in the following order: from, then where, then select. The equivalent SQL query puts the SELECT clause first, followed by the FROM clause, and finally the WHERE clause.

One reason for this change in sequence is to enable use of IntelliSense, the feature of the IDE whereby the editor produces helpful user interface elements such as drop-down lists that describe the members of a given object. Because from appears first and identifies the string array Keywords as the data source, the code editor can deduce that the range variable word is of type string. When you are entering the code into the editor and reach the dot following word, the editor will display only the members of string.

If the from clause appeared after the select, as it does in SQL, the editor would not know the data type of word while you were typing the query, so it would not be able to display a list of word’s members. In LISTING 16.1, for example, it wouldn’t be possible to predict that Contains() is a possible member of word.

The C# query expression order also more closely matches the order in which operations are logically performed. When evaluating the query, you begin by identifying the collection (described by the from clause), then filter out the unwanted items (with the where clause), and finally describe the desired result (with the select clause).

Finally, the C# query expression order ensures that the rules governing where range variables are in scope are mostly consistent with the scoping rules for local variables. For example, a range variable must be declared by a clause (typically a from clause) before the variable can be used, much as a local variable must always be declared before it can be used.
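The from, where, select ordering described above lines up with the order of the extension method calls that a query expression corresponds to. The following is a minimal sketch, assuming a small hypothetical Words array in place of CSharp.Keywords; the class and method names are illustrative only and do not come from the listings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryOrderSketch
{
    // Hypothetical sample data standing in for CSharp.Keywords.
    public static readonly string[] Words =
        { "class", "var*", "int", "yield*", "while" };

    // Query expression: the source comes first, then the filter,
    // then the projection.
    public static IEnumerable<string> WithQueryExpression() =>
        from word in Words
        where !word.Contains('*')
        select word;

    // The same pipeline as an extension method call. Because the select
    // clause merely returns the range variable, no Select() call is needed.
    public static IEnumerable<string> WithMethodCalls() =>
        Words.Where(word => !word.Contains('*'));

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", WithQueryExpression()));
        Console.WriteLine(string.Join(" ", WithMethodCalls()));
    }
}
```

Both methods yield the same three words (class, int, while). In each form the data source appears before any use of word, which is what lets an editor infer the type of word as you type.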

Projection

The result of a query expression is a collection of type IEnumerable<T> or IQueryable<T>. The actual type T is inferred from the select or group by clause. In LISTING 16.1, for example, the compiler knows that Keywords is of type string[], which is convertible to IEnumerable<string>, and it deduces that word is therefore of type string. The query ends with select word, which means the result of the query expression must be a collection of strings, so the type of the query expression is IEnumerable<string>.

2. The result of a query expression is, as a practical matter, almost always IEnumerable<T> or a type derived from it. It is legal, though somewhat perverse, to create an implementation of the query methods that return other types; there is no requirement in the language that the result of a query expression be convertible to IEnumerable<T> .

In this case, the “input” and the “output” of the query are both a collection of strings. However, the output type can be quite different from the input type if the expression in the select clause is of an entirely different type.

Consider the query expression in LISTING 16.2 and its corresponding output in OUTPUT 16.2.

LISTING 16.2: Projection Using Query Expressions

using System;

using System.Collections.Generic;

using System.IO;

using System.Linq;

// ...

public static void List1

(string rootDirectory, string searchPattern) {

IEnumerable<string> fileNames = Directory.Get rootDirectory, searchPattern);

IEnumerable<FileInfo> fileInfos = from fileName in fileNames select new FileInfo(fileName);

foreach (FileInfo fileInfo in fileInfos) {

Console.WriteLine(

$@".{ fileInfo.Name } ({

fileInfo.LastWriteTime })");

} } //...

OUTPUT 16.2

Account.cs (11/22/2011 11:56:11 AM)
Bill.cs (8/10/2011 9:33:55 PM)
Contact.cs (8/19/2011 11:40:30 PM)
Customer.cs (11/17/2011 2:02:52 AM)
Employee.cs (8/17/2011 1:33:22 AM)
Person.cs (10/22/2011 10:00:03 PM)

This query expression results in an IEnumerable<FileInfo> rather than the IEnumerable<string> data type returned by Directory.GetFiles(). The select clause of the query expression can potentially project out a data type that differs from that collected by the from clause expression.

In this example, the type FileInfo was chosen because it has the two relevant properties needed for the desired output: the filename and the last write time. There might not be such a convenient type if you needed other information not captured in the FileInfo object. Tuples provide a convenient and concise way to project the exact data you need without having to find or create an explicit type. LISTING 16.3 provides output similar to that in LISTING 16.2, but uses tuple syntax rather than FileInfo.

3. Anonymous types were used for this purpose prior to C# 7.0.

LISTING 16.3: Tuples within Query Expressions

using System;

using System.IO;

using System.Linq;

// ...

public static void List2

(string rootDirectory, string searchPattern) {

var fileNames = Directory.EnumerateFiles(

rootDirectory, searchPattern);


var fileResults = from fileName in fileNames select ( Name: fileName, LastWriteTime: File.GetLastWriteTime(

);

foreach (var fileResult in fileResults) {

Console.WriteLine(

$@"{ fileResult.Name } ({

fileResult.LastWriteTime })");

} } //...

In this example, the query projects out only the filename and its last file write time. A projection such as the one in LISTING 16.3 makes little difference when working with something small, such as FileInfo. However, “horizontal” projection that filters down the amount of data associated with each item in the collection is extremely powerful when the amount of data is significant and retrieving it (perhaps from a different computer over the Internet) is expensive. Rather than retrieving all the data when a query executes, projecting into a tuple lets the query store and retrieve only the required data in the resulting collection.

Imagine, for example, a large database that has tables with 30 or more columns. If there were no tuples, developers would be required either to use objects containing unnecessary information or to define small, specialized classes useful only for storing the specific data required. Instead, tuples enable support for types to be defined by the compiler: types that contain only the data needed for their immediate scenario. Other scenarios can have a different projection of only the properties needed for that scenario.
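To make the idea of a “horizontal” projection concrete, here is a minimal sketch that narrows a wider type down to two values with a tuple. The Customer record and its sample data are hypothetical stand-ins for a table with many columns. Because everything here is in memory, the sketch shows only the shape of the projection; the savings described above apply when the data comes from a remote source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TupleProjectionSketch
{
    // Hypothetical "wide" type; a real table might have 30 or more columns.
    public record Customer(
        int Id, string Name, string City, string Phone, decimal Balance);

    public static readonly Customer[] Customers =
    {
        new Customer(1, "Ada", "London", "555-0100", 120.50m),
        new Customer(2, "Linus", "Helsinki", "555-0101", 0m),
    };

    // Project each customer down to just the two values this scenario needs.
    public static IEnumerable<(string Name, string City)> NamesAndCities() =>
        from customer in Customers
        select (Name: customer.Name, City: customer.City);

    public static void Main()
    {
        foreach (var entry in NamesAndCities())
        {
            Console.WriteLine($"{entry.Name} ({entry.City})");
        }
    }
}
```

A different scenario could project a different tuple, for example (Id, Balance), without defining any additional named type.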

Beginner Topic: Deferred Execution with Query Expressions

Queries written using query expression notation exhibit deferred execution, just as the queries written in Chapter 15 did. Consider again the assignment of a query object to variable selection in LISTING 16.1. The creation of the query and the assignment to the variable do not execute the query; rather, they simply build an object that represents the query. The expression word.Contains('*') is not evaluated when the query object is created. Instead, the query expression saves the selection criteria to be used when iterating over the collection identified by the selection variable.

To demonstrate this point, consider LISTING 16.4 and the corresponding output (OUTPUT 16.3).

LISTING 16.4: Deferred Execution and Query Expressions (Example 1)

using System;

using System.Collections.Generic;

using System.Linq;

// ...

private static void ShowContextualKeywords2() {

IEnumerable<string> selection = from word in where IsKeywo select word;

Console.WriteLine("Query created.");

foreach(string keyword in selection) {
    // No space output here
    Console.Write(keyword);
}
}

// The side effect of console output is included // in the predicate to demonstrate deferred execu // predicates with side effects are a poor practi // production code

private static bool IsKeyword(string word) {

if(word.Contains('*')) {

Console.Write(" ");

return true;

} else {

return false;

} } //...

OUTPUT 16.3

Query created.

add* alias* ascending* async* await* by* descending*

equals* from* get* global* group* into* join* let* na on* orderby* partial* remove* select* set* value* var d*

In LISTING 16.4, no space is output within the foreach loop. The side effect of printing a space when the predicate IsKeyword() is executed happens when the query is iterated over—not when the query is created.

Thus, although selection is a collection (it is of type IEnumerable<T>, after all), at the time of assignment everything following the from clause serves as the selection criteria. Not until we begin to iterate over selection are the criteria applied.

Now consider a second example (see LISTING 16.5 and OUTPUT 16.4).

LISTING 16.5: Deferred Execution and Query Expressions (Example 2)

using System;

using System.Collections.Generic;

using System.Linq;

// ...

private static void CountContextualKeywords() {

int delegateInvocations = 0;

string func(string text) {

delegateInvocations++;

return text;

}

IEnumerable<string> selection = from keyword in CSharp.Keywords where keyword.Contains('*') select func(keyword);

Console.WriteLine(

$"1. delegateInvocations={ delegateInvoca

// Executing count should invoke func once fo // each item selected

Console.WriteLine(

$"2. Contextual keyword count={ selection

Console.WriteLine(

$"3. delegateInvocations={ delegateInvoca

// Executing count should invoke func once fo // each item selected

Console.WriteLine(

$"4. Contextual keyword count={ selection

Console.WriteLine(

$"5. delegateInvocations={ delegateInvoca

// Cache the value so future counts will not // another invocation of the query

List<string> selectionCache = selection.ToLis

Console.WriteLine(

$"6. delegateInvocations={ delegateInvoca

// Retrieve the count from the cached collect Console.WriteLine(

$"7. selectionCache count={ selectionCach Console.WriteLine(

$"8. delegateInvocations={ delegateInvoca

} //...

OUTPUT 16.4

1. delegateInvocations=0

2. Contextual keyword count=28
3. delegateInvocations=28
4. Contextual keyword count=28
5. delegateInvocations=56
6. delegateInvocations=84
7. selectionCache count=28
8. delegateInvocations=84

Rather than defining a separate method, LISTING 16.5 uses a local function, func, that counts the number of times it is called.

Two things in the output are remarkable. First, after selection is assigned, delegateInvocations remains at zero. At the time of assignment to selection, no iteration over Keywords is performed. If Keywords were a property, the property call would run; in other words, the from clause executes at the time of assignment. However, neither the projection, nor the filtering, nor anything after the from clause will execute until the code iterates over the values within selection. It is as though at the time of assignment, selection would more appropriately be called “query.”

Once we call ToList(), however, a term such as selection or Items or something that indicates a container or collection is appropriate because we begin to count the items within the collection. In other words, the variable selection serves the dual purpose of saving the query information and acting like a container from which the data is retrieved.

A second important characteristic of OUTPUT 16.4 is that calling Count() a second time causes func to again be invoked once on each item selected. Given that selection behaves both as a query and as a collection, requesting the count requires that the query be executed again by iterating over the IEnumerable<string> collection that selection refers to and counting the items. The C# compiler does not know whether anyone has modified the strings in the array such that the count would now be different, so the counting has to happen anew every time to ensure that the answer is correct and up-to-date. Similarly, a foreach loop over selection would trigger func to be called again for each item. The same is true of all the other extension methods provided via System.Linq.Enumerable.
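The re-execution behavior described above can be reproduced with a much smaller data set. The sketch below is an illustration under stated assumptions: the three-word array, the Track helper, and the Run method are hypothetical names, and foreach loops are used in place of Count() to drive the iteration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    public static int Invocations;

    // Hypothetical helper with a side effect, for demonstration only.
    private static string Track(string text)
    {
        Invocations++;
        return text;
    }

    public static int[] Run()
    {
        Invocations = 0;
        string[] words = { "from*", "int", "select*" };

        // Creating the query does not run the where or select clauses.
        IEnumerable<string> query =
            from word in words
            where word.Contains('*')
            select Track(word);
        int afterCreation = Invocations;

        // Each full iteration executes the query again.
        foreach (string item in query) { }
        int afterFirstPass = Invocations;

        foreach (string item in query) { }
        int afterSecondPass = Invocations;

        // ToList() executes the query once more and caches the results.
        List<string> cache = query.ToList();
        int cachedCount = cache.Count; // Reads the list; no query execution.
        int afterCaching = Invocations;

        return new[]
        {
            afterCreation, afterFirstPass, afterSecondPass,
            afterCaching, cachedCount
        };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

With two matching words, the counter reads 0 after creation, then 2, 4, and 6 after the two loops and the ToList() call. Reading cache.Count afterward leaves it at 6, because the cached list no longer refers back to the query.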

Advanced Topic: Implementing Deferred Execution

Deferred execution is implemented by using delegates and expression trees.

A delegate provides the ability to create and manipulate a reference to a method that contains an expression that can be invoked later. An expression tree similarly provides the ability to create and manipulate information about an expression that can be examined and manipulated later.

In LISTING 16.5, the predicate expressions of the where clauses and the projection expressions of the select clauses are transformed by the compiler into expression lambdas, and then the lambdas are transformed into delegate creations. The result of the query expression is an object that holds references to these delegates. Only when the query results are iterated over does the query object actually execute the delegates.
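As a rough illustration of the delegate-based mechanism, the following sketch builds the predicate and projection as explicit Func delegates and hands them to Where() and Select() directly. It is a hand-written approximation of what a query over an in-memory collection amounts to, not the compiler's actual output; the Build method and the log parameter are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DelegateSketch
{
    // Builds a query object that holds references to two delegates.
    // Nothing is written to the log until the result is iterated.
    public static IEnumerable<string> Build(string[] words, List<string> log)
    {
        Func<string, bool> predicate = word =>
        {
            log.Add("where " + word); // Records each predicate invocation.
            return word.Contains('*');
        };
        Func<string, string> projection = word => word.TrimEnd('*');

        return words.Where(predicate).Select(projection);
    }

    public static void Main()
    {
        var log = new List<string>();
        IEnumerable<string> query =
            Build(new[] { "from*", "int", "select*" }, log);

        Console.WriteLine($"Before iteration: {log.Count} predicate calls");
        Console.WriteLine(string.Join(" ", query));
        Console.WriteLine($"After iteration: {log.Count} predicate calls");
    }
}
```

The log stays empty until the query is enumerated, at which point the predicate delegate runs once per source item.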

Filtering

In LISTING 16.1, a where clause keeps the reserved keywords and excludes the contextual keywords. This clause filters the collection “vertically”: If you think of the collection as a vertical list of items, the where clause makes that vertical list shorter so that the collection holds fewer items. The filter criteria are expressed with a predicate, a lambda expression that returns a bool, such as word.Contains() (as in LISTING 16.1) or File.GetLastWriteTime(fileName) < DateTime.Now.AddMonths(-1). The latter is shown in LISTING 16.6, whose output appears in OUTPUT 16.5.

LISTING 16.6: Query Expression Filtering Using where

using System;

using System.Collections.Generic;

using System.IO;

using System.Linq;

// ...

public static void FindMonthOldFiles(

string rootDirectory, string searchPattern) {

IEnumerable<FileInfo> files =

from fileName in Directory.EnumerateFiles rootDirectory, searchPattern)

where File.GetLastWriteTime(fileName) <

DateTime.Now.AddMonths(-1) select new FileInfo(fileName);

foreach(FileInfo file in files) {

// As a simplification, the current dire // assumed to be a subdirectory of

// rootDirectory

string relativePath = file.FullName.Subst Directory.GetCurrentDirectory().Lengt Console.WriteLine(

$".{ relativePath } ({ file.LastWrite }

} //...

OUTPUT 16.5

.\TestData\Bill.cs (8/10/2011 9:33:55 PM)
.\TestData\Contact.cs (8/19/2011 11:40:30 PM)
.\TestData\Employee.cs (8/17/2011 1:33:22 AM)
.\TestData\Person.cs (10/22/2011 10:00:03 PM)

Sorting

To order the items using a query expression, you can use the orderby clause, as shown in LISTING 16.7.

LISTING 16.7: Sorting Using a Query Expression with an orderby Clause

using System;

using System.Collections.Generic;

using System.Linq;

using System.IO;

// ...

public static void ListByFileSize1(

string rootDirectory, string searchPattern)

{

IEnumerable<string> fileNames =

from fileName in Directory.EnumerateFiles rootDirectory, searchPattern)

orderby (new FileInfo(fileName)).Length d fileName select fileName;

foreach(string fileName in fileNames) {

Console.WriteLine(fileName);

} } //...

LISTING 16.7 uses the orderby clause to sort the files returned by Directory.EnumerateFiles() first by file size in descending order, and then by filename in ascending order. Multiple sort criteria are separated by commas, such that first the items are ordered by size, and then, if the size is the same, they are ordered by filename. ascending and descending are contextual keywords indicating the sort order direction. Specifying the order as ascending or descending is optional; if the direction is omitted (as it is here on filename), the default is ascending.
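The comma-separated sort criteria correspond to a primary sort followed by a tie-breaking sort. The sketch below, which assumes a hypothetical in-memory array of (Name, Length) tuples instead of real files, shows an orderby clause next to the equivalent OrderByDescending() and ThenBy() calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderBySketch
{
    // Hypothetical file names and sizes; no file system access is needed.
    public static readonly (string Name, long Length)[] Files =
    {
        ("b.txt", 10), ("a.txt", 10), ("c.txt", 300),
    };

    // Size descending first; name ascending (the default) breaks ties.
    public static IEnumerable<string> WithQueryExpression() =>
        from file in Files
        orderby file.Length descending, file.Name
        select file.Name;

    // The same ordering written as extension method calls.
    public static IEnumerable<string> WithMethodCalls() =>
        Files.OrderByDescending(file => file.Length)
             .ThenBy(file => file.Name)
             .Select(file => file.Name);

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", WithQueryExpression()));
        Console.WriteLine(string.Join(" ", WithMethodCalls()));
    }
}
```

Both forms yield c.txt, a.txt, b.txt: the largest file comes first, and the two files of equal size are ordered by name.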

The let Clause

LISTING 16.8 includes a query that is very similar to the query in LISTING 16.7, except that the type argument of IEnumerable<T> is FileInfo. This query has a problem, however: We have to create a FileInfo twice, in both the orderby clause and the select clause.

LISTING 16.8: Projecting a FileInfo Collection and Sorting by File Size

using System;

using System.Collections.Generic;

using System.Linq;

using System.IO;

// ...

public static void ListByFileSize2(

string rootDirectory, string searchPattern) {

IEnumerable<FileInfo> files =

from fileName in Directory.EnumerateFiles rootDirectory, searchPattern)

orderby new FileInfo(fileName).Length, fi select new FileInfo(fileName);

foreach (FileInfo file in files) {

// As a simplification, the current dire // is assumed to be a subdirectory of // rootDirectory

string relativePath = file.FullName.Subst Directory.GetCurrentDirectory().Lengt Console.WriteLine(

$".{ relativePath }({ file.Length })"

} } //...

Unfortunately, although the end result is correct, LISTING 16.8 ends up instantiating a FileInfo object twice for each item in the source collection, which is wasteful and unnecessary. To avoid this kind of unnecessary and potentially expensive overhead, you can use a let clause, as demonstrated in LISTING 16.9.

LISTING 16.9: Ordering the Results in a Query Expression

//...

IEnumerable<FileInfo> files =
    from fileName in Directory.EnumerateFiles(
        rootDirectory, searchPattern)
    let file = new FileInfo(fileName)
    orderby file.Length, fileName
    select file;

//...

The let clause introduces a new range variable that can hold the value of an expression that is used throughout the remainder of the query expression.

You can add as many let clauses as you like; simply insert each as an additional clause to the query after the first from clause but before the final select / group by clause.
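To see the effect of a let clause without touching the file system, the sketch below counts how often a hypothetical Measure helper runs with and without let. The helper, the sample words, and the method names are illustrative assumptions that stand in for the FileInfo construction in the listings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LetClauseSketch
{
    public static int MeasureCalls;

    // Hypothetical stand-in for an expensive computation such as
    // constructing a FileInfo.
    public static int Measure(string word)
    {
        MeasureCalls++;
        return word.Length;
    }

    // Without let: Measure appears in both the orderby and select clauses.
    public static List<string> WithoutLet(string[] words) =>
        (from word in words
         orderby Measure(word), word
         select $"{word}:{Measure(word)}").ToList();

    // With let: Measure runs once per item and the result is reused.
    public static List<string> WithLet(string[] words) =>
        (from word in words
         let length = Measure(word)
         orderby length, word
         select $"{word}:{length}").ToList();

    public static void Main()
    {
        string[] words = { "while", "int", "from" };

        MeasureCalls = 0;
        Console.WriteLine(string.Join(" ", WithoutLet(words)));
        Console.WriteLine($"Calls without let: {MeasureCalls}");

        MeasureCalls = 0;
        Console.WriteLine(string.Join(" ", WithLet(words)));
        Console.WriteLine($"Calls with let: {MeasureCalls}");
    }
}
```

Both queries produce int:3, from:4, while:5, but the version without let calls Measure six times for three words, whereas the version with let calls it three times.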

Grouping

A common data manipulation scenario is the grouping of related items. In SQL, this generally involves aggregating the items to produce a summary or total or some other aggregate value. LINQ, however, is notably more expressive. LINQ expressions allow for individual items to be grouped into a series of subcollections, and those groups can then be associated with items in the collection being queried. For example, LISTING 16.10 and OUTPUT 16.6 demonstrate how to group together the contextual keywords and the regular keywords.

LISTING 16.10: Grouping Together Query Results

using System;

using System.Collections.Generic;

using System.Linq;

// ...

private static void GroupKeywords1() {

IEnumerable<IGrouping<bool, string>> selectio from word in CSharp.Keywords

group word by word.Contains('*');

foreach(IGrouping<bool, string> wordGroup in selection)

{

Console.WriteLine(Environment.NewLine + "

wordGroup.Key ?

"Contextual Keywords" : "Keywords foreach(string keyword in wordGroup)

{

Console.Write(" " + (wordGroup.Key ?

keyword.Replace("*", null) : }

} } //...

OUTPUT 16.6

Keywords:

abstract as base bool break byte case catch char chec

const continue decimal default delegate do double els explicit extern false finally fixed float for foreach implicit in int interface internal is lock long names operator out override object params private protected readonly ref return sbyte sealed short sizeof stackal string struct switch this throw true try typeof uint ushort using virtual unchecked void volatile while

Contextual Keywords:

add alias ascending async await by descending dynami get global group into join let nameof nonnull on orde select set value var where yield

There are several things to note in LISTING 16.10. First, the query result is a sequence of elements of type IGrouping<bool, string>. The first type argument indicates that the “group key” expression following by was of type bool , and the second type argument indicates that the “group element” expression following group was of type string . That is, the query produces a sequence of groups where the Boolean key is the same for each string in the group.

Because a query with a group by clause produces a sequence of collections, the common pattern for iterating over the results is to create nested foreach loops. In LISTING 16.10, the outer loop iterates over the groupings and prints out the type of keyword as a header. The nested foreach loop prints each keyword in the group as an item below the header.
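The nested-loop pattern can be sketched on a small array as follows. This is a minimal illustration with a hypothetical Group method and sample words; it collects each group into a dictionary so the result is easy to inspect, which the original listing does not do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingSketch
{
    // Groups words by whether they carry the contextual-keyword marker,
    // then walks the groups with nested foreach loops.
    public static Dictionary<bool, List<string>> Group(string[] words)
    {
        IEnumerable<IGrouping<bool, string>> groups =
            from word in words
            group word by word.Contains('*');

        var result = new Dictionary<bool, List<string>>();
        foreach (IGrouping<bool, string> wordGroup in groups)
        {
            var items = new List<string>();
            foreach (string word in wordGroup)
            {
                // Strip the marker before storing the word.
                items.Add(word.Replace("*", string.Empty));
            }
            result[wordGroup.Key] = items;
        }
        return result;
    }

    public static void Main()
    {
        string[] words = { "class", "var*", "int", "yield*" };
        foreach (KeyValuePair<bool, List<string>> pair in Group(words))
        {
            string header = pair.Key ? "Contextual Keywords" : "Keywords";
            Console.WriteLine($"{header}: {string.Join(" ", pair.Value)}");
        }
    }
}
```

Each IGrouping exposes the shared key through its Key property and is itself enumerable, which is why the inner loop can iterate over wordGroup directly.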
