The SQL topics to be covered in this article are done using Microsoft SQL. The databases used are the popular Adventure Works and Northwind databases.
In SQL, ORDER BY is a way to order the result set of a query by a column. This can be done in ascending or descending order for columns with numerical variables, and alphabetically for columns with string or text values. In the diagram below with the respective query, the result set isn’t ordered by any column, hence the product name column isn’t ordered alphabetically.
To order the result of the query alphabetically or numerically use the ORDER BY function. This comes after the FROM statement. In the image below, the query is ordered by the standard cost column. This column happens to be in a numerical format, so it is automatically ordered in ascending order. To order in descending order, we use “DESC”. The result of that would look like this:
In the above image, the column “Standard Cost” has the rows of numerical values ordered from largest to smallest. Also, “DESC” comes after the column being used to order the query.
The GROUP BY function is the easiest way to return aggregated and non-aggregated data together from your database. It is a function that comes before the ORDER BY statement when both are used in a query. The GROUP BY clause is also used when working with aggregate functions such as SUM, AVERAGE, COUNT, MIN, MAX.
For example, let’s say your line manager wants you to find out the total quantity of orders per respective order ID. How do you go about this?
First, select the order ID and quantity columns from the order details table in the database like so.
The result of this query will look like this:
Take note of the duplicate Order IDs such as 10248. They have order quantities that add up to 27. Make use of the SUM function to sum the quantities for each order ID. This is an aggregate function that is used to add up records. In this case, I want to add the total order quantities per order ID. I already have my order ID as a unique identifier for each order and my quantity column.
When I run the query above, I get this error:
Column ‘dbo.Order Details.OrderID’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
This error message is returned because the quantity column has been aggregated by the SUM function while the order ID column remains unaggregated. It can remain unaggregated but, to resolve the error message the next step would be to group your resulting dataset by the column that has no aggregate function, which would be the order ID column.
A query summing the quantity without the order ID column would look like this:
And this is what the result of that query would look like:
Similarly, the result of a query checking for only the order IDs would look like this:
These are just the first eleven rows out of 2,155 rows of unique order IDs. So, imagine how difficult it would be for the database to return an aggregated result of order quantities with an unaggregated result of order IDs, side by side.
This is where the importance of the GROUP BY clause comes in.
With the GROUP BY clause, the result of this query would be:
And these are just the first eleven of 830 rows of records. The Order ID 10248 now shows a corresponding row sum of 27, as we noticed earlier.
GROUP BY is a very simple but powerful way to work with data. It is also good to remember that the GROUP BY clause is applied when there’s a need to sort a specific set of rows into a set of summary rows by the values of one or more columns. In summary:
|ORDER BYs||GROUP BYs|
|They are used to order/arrange the result set of a query by a column(s)||They are used to group/categorize a result set of a query by a column(s)|
|They cannot be used to handle aggregated & non-aggregated data in a query||They are used to organize aggregated & non-aggregated data in a query|
|They never come before Group Bys in SQL syntax or SQL-speak when used together||They always come before Order Bys when used together|