C112: Other query clauses – Kimmer's Course Notes

In the previous notes on single-table SELECT FROM WHERE queries, we saw that the SQL statement describes the table that will provide the results for the given information need. We can get a lot of mileage out of writing SELECT FROM WHERE queries and accepting the results table generated by default, but it’s also useful to be able to customize those results sometimes. SQL provides additional clauses that can be added after SELECT FROM WHERE to customize the results table.

Sorting result set rows

We can use other clauses in a query to control the order or number of rows in the result set. The ORDER BY clause allows us to control the order (big surprise there!) of rows we get back. We can specify a column or columns to sort the results by. ORDER BY will come after the WHERE clause in SELECT FROM WHERE query. If we leave out the WHERE clause, it comes after the FROM clause. If we want to sort sailors by name, we can use

SELECT *
FROM Sailors
WHERE age < 40
ORDER BY sname

By default, this will alphabetize the results (i.e. sort from least to greatest, ascending order). To sort in descending order, we can add the DESC keyword:

SELECT *
FROM Sailors
WHERE age < 40
ORDER BY sname DESC

There is a corresponding ASC keyword we can use (ORDER BY sname ASC), but since it’s the default, it’s not required. We can sort by multiple columns by separating by commas in the ORDER BY clause. To sort by age from oldest to youngest (descending order) but then organize all sailors with the same as in alphabetical order by name, we can write

SELECT *
FROM Sailors
WHERE age < 40
ORDER BY age DESC, sname

Controlling the number of rows

The LIMIT clause allows you to describe the maximum number of rows in your result. It’s a maximum number, because generally before running a query, you may not be sure how many rows it would actually return. It also allows you to optionally specify how many of the first rows to not include in the result. The LIMIT clause comes after ORDER BY, or if there’s no ORDER BY after WHERE, or if there’s no ORDER BY and no WHERE, after FROM!

Here are all 10 of our sailors sorted by name in alphabetical order.

SELECT * 
FROM Sailors
ORDER BY sname

sid	sname	rating	age
32	Andy	8	25.5
85	Art	3	25.5
95	Bob	3	63.5
29	Brutus	1	33.0
22	Dustin	7	45.0
64	Horatio	7	35.0
74	Horatio	9	35.0
31	Lubber	8	55.5
58	Rusty	10	35.0
71	Zorba	10	16.0
sid	sname	rating	age

To keep the first 5 sailors, we would add LIMIT 5 as the last clause in the query (it even goes after ORDER BY)

SELECT * 
FROM Sailors
ORDER BY sname
LIMIT 5

sid	sname	rating	age
32	Andy	8	25.5
85	Art	3	25.5
95	Bob	3	63.5
29	Brutus	1	33.0
22	Dustin	7	45.0
sid	sname	rating	age

If we had used a value greater than 10 in that query, like LIMIT 15, then we would get 10 rows back–the full table–since the database wouldn’t have any way to come up with a version of that table with 15 rows without making stuff up!

LIMIT clause with offsets

We can add one more value to the LIMIT clause to tell the database to begin returning rows after a specified number called the offset. If we wanted to start returning rows after Brutus, we would need to know Brutus is row number 4 and then specify that value as the offset.

SELECT * 
FROM Sailors
ORDER BY sname
LIMIT 4, 5

sid	sname	rating	age
22	Dustin	7	45.0
64	Horatio	7	35.0
74	Horatio	9	35.0
31	Lubber	8	55.5
58	Rusty	10	35.0
sid	sname	rating	age

So when we specify x as the offset we begin getting results with row x + 1. The second value is still the maximum number of rows we will get back. So in the example above we get 5 rows starting from row 5 (i.e. we get rows 5, 6, 7, 8, and 9). If the table didn’t have at least9 rows in it, we would have gotten fewer than 5 rows back.

Getting Rid of Duplicates

This one is not really its own clause, but it sort of fits the theme. Imagine we want to know the set of ratings that the sailors have. This typically means we’re not interested in knowing two have a rating of 3, two a rating of 7, and so on. We’re interested in just knowing that 1,3,7,8,9 are the ratings sailors have in our club. We can use the DISTINCT keyword right after SELECT to remove duplicates.

SELECT DISTINCT rating
FROM Sailors

DISTINCT works with any number of columns. It looks at the entire row of the result and only removes values for which all columns are unique. If we have SELECT DISTINCT sname, sid we would see Horatio appear twice because one Horatio has sid 64 and one has 74. If we have SELECT DISTINCT sname, age we only see Horatio appear once because both Horatios have the same name and the same age.

Renaming Columns in the Result

The final tidbit is also not a separate clause, but it fits the theme here at the end. Since a SELECT FROM WHERE query always returns a table, we may like some control over the names of the columns in that table. Sid may be a great name for a Sailor’s id internally, but we may need to present a better label to external consumers. We may use an alias in the SELECT clause to change the name of a column in the result. All we have to do is provide the new name after the AS keyword right after the column we wish to rename. The renaming is only in the result returned from the query. It has no effect on the original tables in the database.

SELECT sid AS SailorIdentificationNumber
FROM Sailors