Relational Databases ~ Introduction to SQL IV

SQL provides a mechanism for nesting subqueries. A subquery is a select-from-where expression that is nested within another query.1、Nested Subqueries

A common use of subqueries is to perform tests for set membership, make set comparisons, and determine set cardinality, by nesting subqueries in the where clause.

1.1 Set Membership

SQL allows testing tuples for membership in a relation. The in connective tests for set membership, where the set is a collection of values produced by a select clause. The not in connective tests for the absence of set membership.

As an illustration, reconsider the query “Find all the courses taught in the both the Fall 2009 and Spring 2010 semesters.” Earlier, we wrote such a query by intersecting two sets: the set of courses taught in Fall 2009 and the set of courses taught in Spring 2010. We can take the alternative approach of finding all courses that were taught in Fall 2009 and that are also members of the set of courses taught in Spring 2010. It leads us to write our query using the in connective of SQL. We begin by finding all courses taught in Spring 2010, and we write the subqueryWe then need to find those courses that were taught in the Fall 2009 and that appear in the set of courses obtained in the subquery. We do so by nesting the subquery in the where clause of an outer query. The resulting query is
We use the not in construct in a way similar to the in construct. For example, to find all the courses taught in the Fall 2009 semester but not in the Spring 2010 semester, we can write:The in and not in operators can also be used on enumerated sets. The following query selects the names of instructors whose names are neither “Mozart” nor “Einstein”.

1.2 Set Comparison

As an example of the ability of a nested subquery to compare sets, consider the query “Find the names of all instructors whose salary is greater than at least one instructor in the Biology department.” We wrote this query as follows:SQL does, however, offer an alternative style for writing the preceding query. The phrase “greater than at least one” is represented in SQL by > some. This construct allows us to rewrite the query in a form that resembles closely our formulation of the query in English.The subquery:generates the set of all salary values of all instructors in the Biology department. The > some comparison in the where clause of the outer select is true if the salary value of the tuple is greater than at least one member of the set of all salary values for instructors in Biology.

SQL also allows < some, <=some, >=some, =some, and <>some comparisons. As an exercise, verify that = some is identical to in, whereas <> some is not the same as not in. Now we modify our query slightly. Let us find the names of all instructors that have a salary value greater than that of each instructor in the Biology department. The construct > all corresponds to the phrase “greater than all.” Using this construct, we write the query as follows:As it does for some, SQL also allows < all, <= all, >= all, = all, and <> all comparisons. As an exercise, verify that <> all is identical to not in, whereas = all is not the same as in.

As another example of set comparisons, consider the query “Find the departments that have the highest average salary.” We begin by writing a query to find all average salaries, and then nest it as a subquery of a larger query that finds those departments for which the average salary is greater than or equal to all average salaries:

1.3 Test for Empty Relations

SQL includes a feature for testing whether a subquery has any tuples in its result. The exists construct returns the value true if the argument subquery is nonempty. Using the exists construct, we can write the query “Find all courses taught in both the Fall 2009 semester and in the Spring 2010 semester” in still another way:The above query also illustrates a feature of SQL where a correlation name from an outer query ( S in the above query ), can be used in a subquery in the where clause. A subquery that uses a correlation name from an outer query is called a correlated subquery.

In queries that contain subqueries, a scoping rule applies for correlation names. In a subquery, according to the rule, it is legal to use only correlation names defined in the subquery itself or in any query that contains the subquery. If a correlation name is defined both locally in a subquery and globally in a containing query, the local definition applies. This rule is analogous to the usual scoping rules used for variables in programming languages.

We can test for the nonexistence of tuples in a subquery by using the not exists construct. We can use the not exists construct to simulate the set containment ( that is, superset ) operation: We can write “relation A contains relation B” as “not exists (B except A).” (Although it is not part of the current SQL standards, the contains operator was present in some early relational systems.) To illustrate the not exists operator, we can write the query as follows:1.4 Test for the Absence of Duplicate Tuples

SQL includes a boolean function for testing whether a subquery has duplicate tuples in its result. The unique construct returns the value true if the argument subquery contains no duplicate tuples. Using the unique construct, we can write the query “Find all courses that were offered at most once in 2009” as follows:“The unique construct” is not yet widely implemented. An equivalent version of the above query not using the unique construct is:1.5 Subqueries in the From Clause

SQL allows a subquery expression to be used in the from clause. The key concept applied here is that any select-from-where expression returns a relation as a result and, therefore, can be inserted into another select-from-where anywhere that a relation can appear. Consider the query “Find the average instructors’ salaries of those departments where the average salary is greater than $42000.” We wrote this query previously by using the having clause. We can now rewrite this query, without using the having clause, by using a subquery in the from clause, as follows:The subquery generates a relation consisting of the names of all departments and their corresponding average instructors’ salaries. The attributes of the subquery result can be used in the outer query, as can be seen in the above example.

Note that we do not need to use the having clause, since the subquery in the from clause computes the average salary, and the predicate that was in the having clause earlier is now in the where clause of the outer query. We can give the subquery result relation a name, and rename the attributes, using the as clause, as illustrated below.The subquery result relation is named dept_avg, with the attributes dept_name and avg_salary. Nested subqueries in the from clause are supported by most but not all SQL implementations. However, some SQL implementations, notably Oracle, do not support renaming of the result relation in the from clause.

AS another example, suppose we wish to find the maximum across all departments of the total salary at each department. The having clause does not help us in this task, but we can write this query easily by using a subquery in the from clause, as follows:We note that nested subqueries in the from clause cannot use correlation variables from other relations in the from clause. However, SQL:2003 allows a subquery in the from clause that is prefixed by the lateral keyword to access attributes of preceding tables or subqueries in the from clause. For example, if we wish to print the names of each instructor, along with their salary and the average salary in their department, we could write the query as follows:Without the lateral clause, the subquery cannot access the correlation variable I1 from the outer query. Currently, only a few SQL implementation, such as IBM DB2, support the lateral clause.

1.6 The with Clause

The with clause provides a way of defining a temporary relation whose definition is available only to the query in which the with clause occurs. Consider the following query, which finds those departments with the maximum budget.The with clause defines the temporary relation max_budget, which is used in the immediately following query. The with clause, introduced in SQL:1999, is supported by mangy, but not all, database systems.

1.7 Scalar Subqueries

SQL allows subqueries to occur wherever an expression returning a value is permitted, provided the subquery returns only one tuple containing a single attribute; such subqueries are called scalar subqueries. For example, a subquery can be used in the select clause as illustrated in the following example that lists all departments along with the number of instructors in each department:The subquery in the above example is guaranteed to return only a single value since it has a count( * ) aggregate without a group by. The example also illustrates the usage of correlation variables, that is, attributes of relations in the from clause of the outer query, such as department.dept_name in the above example.

Scalar subqueries can occur in select, where, and having clauses. Scalar subqueries may also be defined without aggregates. It is not always possible to figure out at compile time if a subquery can return more than one tuple in its result; if the result has more than one tuple when the subquery is executed, a run-time error occurs.

Note that technically the type of a scalar subquery result is still a relation, even if it contains a single tuple. However, when a scalar subquery is used in an expression where a value is expected, SQL implicity extracts the value from the single attribute of the single tuple in the relation, and returns that value.

2、Modification of the Database

We have restricted our attention until now to the extraction of information from the database. Now, we show how to add, remove, or change information with SQL.

2.1 Deletion

A delete request is expressed in much the same way as a query. We can delete only whole tuples; we cannot delete values on only particular attributes. SQL expresses a deletion by

delete from r where P;

where P represents a predicate and r represents a relation. The delete statement first finds all tuples t in r for which P(t) is true, and then deletes them from r. The where clause can be omitted, in which case all tuples in r are deleted.

Note that a delete command operates on only one relation. If we want to delete tuples from several relations, we must use one delete command for each relation. The predicate in the where clause may be as complex as a select command’s where clause. At the other extreme, the where clause may be empty.

delete from instructor;

The request deletes all tuples from the instructor relation. The instructor relation itself still exists, but it is empty. Here are examples of SQL delete requests:

Delete all tuples in the instructor relation pertaining to instructors in the Finance department.

delete from instructor where dept_name=’Finance’;

Delete all instructors with a salary between $13000 and $15000.

delete from instructor where salary between 13000 and 15000;

Delete all tuples in the instructor relation for those instructors associated with a department located in the Watson building.

delete from instructor where dept_name in (select dept_name from department where building = ‘Watson’);

This delete request first finds all departments located in Watson, and then deletes all instructor tuples pertaining to those departments.

2.2 Insertion

To insert data into a relation, we either specify a tuple to be inserted or write a query whose result is a set of tuples to be inserted. Obviously, the attribute values for inserted tuples must be members of the corresponding attribute’s domain. Similarly, tuples inserted must have the correct number of attributes.

The simplest insert statement is a request to insert one tuple. Suppose that we wish to insert the fact that there is a course CS-437 in the Computer Science department with title “Database Systems”, and 4 credit hours. We write:

发表评论 取消回复

发表评论取消回复