A Common Table Expression (CTE) is a name given to a temporary result set of a query that is required in the context of a more significant query. These temporary result sets are part of the query that can be referenced within the SELECT, UPDATE, INSERT, or DELETE statements. Common table expressions are not stored anywhere; they only exist for the duration until the more significant query is executed. The Common Table Expression enables us to write a complex query in a more readable and simplified format.
WITH cte_name (column_names, … ) AS ( CTE Query definition ) SELECT …. ( MAIN QUERY )
Common Table Expression is defined using the WITH clause; there can be more than one CTE defined in one SQL statement, separated by commas. Postgres doesn’t allow you to have more than one WITH statement in one query.
WITH cte_name (column_names, … ) AS ( CTE Query definition ), cte_name2 (column_names, … ) AS ( CTE Query definition ) SELECT …. ( MAIN QUERY )
Common table expressions can be divided into two categories: non-recursive and recursive.
Non-Recursive Common Table Expression
Non-recursive CTEs can be used when we want to group data into a single name and return it as one result set. We can then use this result set in our queries. They have several advantages over regular queries, as they are used to simplify complex joins and subqueries.
WITH CTE_NAME (number) AS ( -- Regular query SELECT 1 ) SELECT * FROM CTE_NAME;
Recursive Common Table Expressions
A recursive common table expression (CTE) is a CTE that references itself or recursively calls itself. By doing so, CTEs repeatedly execute and return subsets of data until the complete result set of data is returned. Recursive common table expressions have the anchor member and the recursive member within the simple query.
WITH RECURSIVE CTE_NAME (number) AS ( -- Anchor Member SELECT 1 AS number UNION ALL -- Recursive Member SELECT number + 1 FROM CTE_NAME WHERE number <= 5 ) SELECT number FROM CTE_NAME;
Common Table Expression in Postgres
The common table expression in Postgres uses the WITH clause, which allows assigning names to the sub-query and using the result set in the main query. PostgreSQL provides the WITH and AS statements that support the design of auxiliary queries, also known as CTEs (Common Table Expressions). This WITH query is evaluated only once per execution of the main query, even if it refers to it multiple times in the main query.
CTE Syntax in Postgres
WITH CTE_NAME (COL_LIST) AS ( CTE_QUERY ) STATEMENT;
Analysing the above syntax:
- The WITH clause is used to define a temporary relationship using a temporary query.
- CTE_NAME stores the result set from the temporary query
- In the above query, COL_LIST is a list of columns in the result set.
- STATEMENT is our main query, which uses the result set stored in CTE_NAME
In the above CTE syntax, as we have seen, the column list is defined beside the CTE name. Postgres also allows defining this column list within a sub-query.
WITH CTE_NAME AS ( CTE_QUERY AS COL_LIST ) STATEMENT;
Postgres CTE examples
For understanding CTEs in Postgres in a better way. Let’s consider an example e-commerce shop that has a table called orders that stores sales data for each product.
For creating Orders Table
CREATE TABLE orders ( id SERIAL PRIMARY KEY, order_number INT NOT NULL, product_name VARCHAR(10) NOT NULL, order_date DATE DEFAULT CURRENT_DATE, total_price INT NOT NULL );
Now, using non-recursive CTEs, we will find the total price and count of each product that is sold using this order table.
WITH CTE_NAME (product_name, total_count, total_price) AS ( SELECT product_name, COUNT(product_name), SUM(total_price) FROM orders GROUP BY product_name ) SELECT * FROM CTE_NAME;
Understanding the above code
The above PostgreSQL code has a CTE under the name CTE_NAME, which selects three columns product_name, total_count, and total_price from the orders table, subquery used GROUP BY statement over product_name to find these columns and then used in the main query to get the following output.
Recursive Common Table Expression in Postgres
In recursive CTEs, the subquery within the CTE expression calls its own result set. They can be used for querying hierarchical data, such as tracking lineage or finding organisational structure in the data sets. Postgres uses the RECURSIVE keyword in order to define recursive CTEs.
Recursive CTE Syntax in Postgres
WITH RECURSIVE CTE_NAME AS( CTE_query_definition -- anchor member UNION [ ALL ] CTE_query CTE_NAME -- recursive member ) SELECT * FROM CTE_NAME;
Analyzing the above syntax:
- Postgres uses the RECURSIVE keyword in order to define recursive CTEs.
- An anchor member works like a base condition for the termination of these recursive calls.
- One or more subquery definitions in CTEs are joined with the non-recursive term through the UNION or UNION ALL operator.
- The recursive member calls its own result set until it gets the complete result set.
Example of a Factorial program Using Recursive CTEs in Postgres
WITH RECURSIVE CTE_NAME (line_number, factorial) AS ( VALUES (0, 1) -- base subquery UNION ALL SELECT line_number + 1, (line_number + 1) * factorial FROM CTE_NAME -- recursive subquery ) SELECT * FROM CTE_NAME LIMIT 5;
The following code uses recursive CTE to get factorial series with anchor values (0, 1) and the UNION ALL statement to join both anchor and recursive subquery together. In the main query, we have a LIMIT statement to define the number of rows to display.
Postgres Recursive CTE Example
Recursive CTEs are also good when we have hierarchical data in our table. To understand this, let us consider an example of an e-commerce store where we have different categories of products.
This flowchart shows the e-commerce store under which we have clothing and accessories sections, and clothing is the parent of men's and women's clothing. In the same way, grooming and footwear are children of accessories.
Creating an e-commerce table
CREATE TABLE ecommerce ( id SERIAL PRIMARY KEY, name VARCHAR(10) NOT NULL, level INT NOT NULL, parent_id INT NOT NULL );
Inserting values in the above table
INSERT INTO ecommerce (id, name, level, parent_id) VALUES (0, 'E Commerce store', 1, NULL), (1, 'Clothing', 2, 0), (2, 'Accessories', 2, 0), (3, 'Men Clothing', 3, 1), (4, 'Women Clothing', 3, 1), (5, 'Grooming', 3, 2), (6, 'Footwear', 3, 2);
Now let’s look at the query to fetch all children for the given id. For this, we will define a subquery with an anchor member with the root node id and union it with the recursive member in order to fetch all children's root nodes.
WITH RECURSIVE CTE_NAME AS ( SELECT id, name, level, parent_id FROM ecommerce e1 WHERE e1.id = 1 -- root node id ( base query ) UNION SELECT e2.id, e2.name, e2.level, e2.parent_id FROM ecommerce e2 JOIN CTE_NAME ON e2.parent_id = CTE_NAME.id -- recursive subquery ( to get all children of the root node ) ) SELECT * FROM CTE_NAME;
Basic Guidelines For Writing a Good CTE Query
- CTEs must be followed by a single Data Manipulation Language (DML) statements such as SELECT, INSERT, UPDATE, or DELETE that references the CTEs column
- Multiple CTEs can be defined in one query using a comma
- A CTE can refer to itself or other CTE in the same query
- Only one WITH Clause can be defined in a single query
- Recursive CTE might result in a loop, to avoid this loop LIMIT statement can be used
CTEs can be used instead of views and join which allows us to simplify for readability and maintainability of code. CTEs are a powerful tool, that enables solving some problems in a very simple and elegant way. Recursive CTEs are used in places where tables have hierarchical data or we need to track lineage within data.