Common Table Expressions (CTEs)
Non-recursive CTEs
What
First the "what": What is a common table expression (CTE)? A matter-of-fact answer is given in [1]:
A CTE is a named subquery that appears at the top of a query in a
WITHclause, which can contain multiple CTEs separated by commas. Along with making queries more understandable, this feature also allows each CTE to refer to any other CTE defined above it in the sameWITHclause.
Why
Then the "why": Most Leetcode queries require you to go beyond what is possible using the tables as they are made available in the provided SQL Schema, especially in relation to aggregate and window functions. Some queries require the use of a derived table, namely a subquery or a CTE.
As noted in [6]:
Arguably the simplest way to create a virtual table that allows you to run queries on window functions or aggregate functions is a subquery. All that's required here is to write the query that you need within parentheses and then to write a second query that uses it.
Why, then, would we want to use something other than a subquery for such purposes?
How
Now the "how": Consider the following Leetcode problem: LC 1098. Unpopular Books . Here is a solution that uses a subquery:
-- answer to LC 1098 that uses a subquery
SELECT
  B.book_id,
  B.name
FROM
  Books B
WHERE
  B.available_from < DATE_SUB('2019-06-23', INTERVAL 1 MONTH)
  AND NOT EXISTS (
    SELECT
      1
    FROM
      Orders O
    WHERE
      O.book_id = B.book_id
      AND O.dispatch_date
        BETWEEN DATE_SUB('2019-06-23', INTERVAL 1 YEAR) AND '2019-06-23'
    GROUP BY
      O.book_id
    HAVING
      SUM(O.quantity) >= 10
  );
And now a solution that uses CTEs:
-- answer to LC 1098 that uses CTEs
WITH eligible_books AS (
  SELECT
    *
  FROM
    Books B
  WHERE B.available_from < DATE_SUB('2019-06-23', INTERVAL 1 MONTH)
), eligible_orders AS (
  SELECT
    *
  FROM
    Orders O
  WHERE O.dispatch_date > DATE_SUB('2019-06-23', INTERVAL 1 YEAR)
)
SELECT
  EB.book_id, EB.name
FROM
  eligible_books EB
  LEFT JOIN eligible_orders EO ON EB.book_id = EO.book_id
GROUP BY
  EB.book_id, EB.name
HAVING
  SUM(EO.quantity) IS NULL OR SUM(EO.quantity) < 10;
Which answer is clearer? Probably the answer using CTEs. Why? Probably because CTEs give you the chance to use named subqueries in a way that makes your code not only easier to read but also easier to structure.
Indeed, as noted in [6], CTEs were intended to overcome some of the limits of subqueries, especially as it relates to enabling the use of recursion within SQL (something that will be addressed momentarily). Consider the following made up query:
WITH head_count_tab (job, HeadCount) AS (
  SELECT
    job,
    COUNT(empno)
  FROM
    emp
  GROUP BY
    job
)
SELECT
  MAX(HeadCount) AS HighestJobHeadCount
FROM
  head_count_tab;
Although this query solves a simple problem, it illustrates the essential features of a CTE. We introduce the derived table using the WITH clause, specifying the column headings in the parentheses (optional in general), and use parentheses around the derived table's query itself. If we want to add more derived tables, we can add more as long as we separate each one with a comma and provide its name before its query (the reverse of how aliasing usually works in SQL).
Because the inner queries are presented before the outer query, in many circumstances they may also be considered more readable; they make it easier to study each logical element of the query separately in order to understand the logical flow. Of course, as with all things in coding, this will vary according to circumstances, and sometimes the subquery will be more readable.
In some cases, it may even make sense to mix subqueries and CTEs; for example, consider the following answer to LC 2084. Drop Type 1 Orders for Customers With Type 0 Orders:
WITH Type0Orders AS (
  SELECT * FROM Orders O1 WHERE O1.order_type = 0
), ValidType1Orders AS (
  SELECT * FROM Orders O2 WHERE O2.customer_id NOT IN (SELECT customer_id FROM Type0Orders) AND O2.order_type = 1
)
SELECT * FROM Type0Orders
UNION ALL
SELECT * FROM ValidType1Orders;
This should give some sense of the versatility that CTEs have to offer. Ultimately, as noted in [6]:
There's not a lot of difference between a subquery and CTE in terms of usability. Both allow for nesting or writing more complicated queries that refer to other derived tables. However, once you start nesting many subqueries, readability is lessened because the meaning of different variables is hidden in successive query layers. In contrast, because a CTE arranges each element vertically, it is easier to understand the meaning of each element.
To bring closure to the point made in the last sentence above, consider the following two problems and their answers that utilize CTEs and what those answers might look like had they simply used subqueries:
Recursive CTEs
The discussion above clearly outlines how CTEs can be powerful agents of clarity and structure, but the discussion fails to highlight the key reason why CTEs were introduced into SQL in the first place: recursion. Enabling recursion within SQL is the primary reason for the existence of CTEs. So how do recursive CTEs work and why/when would we want to use them? To answer this question effectively, it may help to first look at some documentation (we will use Postgres' documentation since it is fairly easy to understand), then some examples, and then finally some Leetcode problems where this construct proves to be quite valuable (the PostgreSQL Exercises site also has helpful material on recursive queries).
Documentation
Reading what the official documentation has to say about WITH queries is a good investment of time, especially the portion about recursive query evaluation (as Postgres points out, the WITH RECURSIVE process is iteration not recursion, but RECURSIVE is the terminology chosen by the SQL standards committee):
The general form of a recursive
WITHquery is always a non-recursive term, thenUNION(orUNION ALL), then a recursive term, where only the recursive term can contain a reference to the query's own output. Such a query is executed as follows:
- Evaluate the non-recursive term. For
UNION(but notUNION ALL), discard duplicate rows. Include all remaining rows in the result of the recursive query, and also place them in a temporary working table.- So long as the working table is not empty, repeat these steps:
- Evaluate the recursive term, substituting the current contents of the working table for the recursive self-reference. For UNION (but not UNION ALL), discard duplicate rows and rows that duplicate any previous result row. Include all remaining rows in the result of the recursive query, and also place them in a temporary intermediate table.
- Replace the contents of the working table with the contents of the intermediate table, then empty the intermediate table.
In more simple syntactical terms, here's what the process outlined above looks like:
WITH RECURSIVE cte_name AS(
  CTE_auxiliary_query_1   -- non-recursive term
  UNION [ALL]
  CTE_auxiliary_query_2   -- recursive term
) CTE_primary_query;
Detailed verbal description of each component above
- WITH RECURSIVE cte_name AS (...): The- RECURSIVEkeyword in- WITH RECURSIVEtells our database that the CTE we are building is not like the normal CTE(s) that can be built using only the- WITHkeyword — our CTE will be built in an iterative fashion instead.- cte_nameis the label or name we are assigning to the CTE built in the iterative process (this name is typically used or referred to in- CTE_auxiliary_query_2and- CTE_primary_query). The- ASkeyword simply denotes how- cte_nameis to be built throughout the iterative process (i.e., the- ...part inside the parentheses).
- CTE_auxiliary_query_1: Evaluate the non-recursive term. When- UNIONis used, duplicate rows will be discarded; when- UNION ALLis used, duplicate rows will be kept. Include all remaining rows in the result of the recursive query (i.e.,- cte_namebeing the result of the recursive query), and also place them in a temporary working table. This working table is what will be used or referred to in- CTE_auxiliary_query_2(i.e., the so-called "recursive term") to really kick off the iterative process.
- CTE_auxiliary_query_2: Evaluate the recursive term, substituting the current contents of the working table for the recursive self-reference; that is, the recursive self-reference will always be- cte_name, but the working table depends on where we are in the iterative process.- Beginning: At the beginning of the iterative process, the result set of CTE_auxiliary_query_1, if any, constitutes the working table and we can refer to this working table ascte_nameinCTE_auxiliary_query_2. Note that the final result of the recursive query, which we ultimately refer to ascte_nameinCTE_primary_queryonce the iterative process has finished, is different from thecte_namereferred to inCTE_auxiliary_query_2during each iteration. (This will become clearer momentarily by means of some examples.)
- Not beginning: For every iteration, except at the beginning, the cte_namereferred to inCTE_auxiliary_query_2refers to the output of the priorCTE_auxiliary_query_2query; that is, for every iteration except the first, the temporary working table is the query output ofCTE_auxiliary_query_2for the previous iteration. Once the recursive termCTE_auxiliary_query_2finishes executing, the following rough sequence of operations takes place:- If the recursive term CTE_auxiliary_query_2has an empty query output, then the query terminates and control is passed toCTE_primary_query.
- If the recursive term CTE_auxiliary_query_2has a non-empty query output, then the following sequence of operations takes place:- The result set from CTE_auxiliary_query_2is appended to the total result set for theWITH RECURSIVEquery (i.e., thecte_namewe refer to inCTE_primary_queryonce the iterative process has finished). Duplicate rows will be kept when usingUNION ALLbut discarded when using onlyUNION.
- The result set from CTE_auxiliary_query_2is placed in a temporary intermediate table.
- The contents of the working table just used by CTE_auxiliary_query_2are replaced by the contents of the intermediate table referred to above. The intermediate table is then emptied so we can repeat this whole process again (i.e., the "Not beginning" process).
 
- The result set from 
 
- If the recursive term 
 
- Beginning: At the beginning of the iterative process, the result set of 
- CTE_primary_query: This is where- WITH RECURSIVEreally pays off, of course. The table we built in an iterative fashion,- cte_name, is now available in its entirety for us to query from however we like. This could be a simple- SELECT * FROM cte_nameor it could be a much more complicated query involving joins and/or whatever else we might want to throw in there.
Examples
Let's now consider some examples (executed using MySQL) to make clear all of the abstract details outlined above:
Generating integers 1 through 10 (with in-depth explanation)
Try executing the following query:
WITH RECURSIVE nums_consecutive AS (
  SELECT 1 AS num
  UNION ALL
  SELECT num + 1 FROM nums_consecutive WHERE num < 10
) SELECT * FROM nums_consecutive;
This query yields the following result set:
+------+
| num  |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
+------+
How does this query work exactly? Here's a detailed breakdown using the previously discussed abstract process, encapsulated in the syntax
WITH RECURSIVE cte_name AS(
  CTE_auxiliary_query_1   -- non-recursive term
  UNION [ALL]
  CTE_auxiliary_query_2   -- recursive term
) CTE_primary_query;
but this time we will have actual data to work with instead of just vague concepts:
- 
WITH RECURSIVE cte_name AS (...): We specify viaWITH RECURSIVEthat we will not just be building a CTE but one that we will build iteratively. Note that we refer tonums_consecutiveinCTE_auxiliary_query_2(i.e., the recursive term):SELECT num + 1 FROM nums_consecutive WHERE num < 10And we also refer to nums_consecutiveinCTE_primary_queryonce the iterative process has finished and the final CTE has been built:SELECT * FROM nums_consecutive;
- 
CTE_auxiliary_query_1: This is the non-recursive term and its result set gives us our first working table:SELECT 1 AS numgives us +-----+
 | num |
 +-----+
 | 1 |
 +-----+as our first working table. 
- 
CTE_auxiliary_query_2: Since the query output from the non-recursive term was non-empty, the iterative process begins and the working table seen above can now be referred to asnums_consecutiveinCTE_auxiliary_query_2:SELECT num + 1 FROM nums_consecutive WHERE num < 10Since nums_consecutiveabove really refers to the working table+-----+
 | num |
 +-----+
 | 1 |
 +-----+it's not hard to see why the result set of CTE_auxiliary_query_2(i.e.,SELECT num + 1 FROM nums_consecutive WHERE num < 10) for this iteration is simply+-----+
 | num |
 +-----+
 | 2 |
 +-----+Pay special attention to the fact that the result set here is not +-----+
 | num |
 +-----+
 | 1 |
 | 2 |
 +-----+but simply +-----+
 | num |
 +-----+
 | 2 |
 +-----+That is, the total result set being iteratively built right now may be +-----+
 | num |
 +-----+
 | 1 |
 | 2 |
 +-----+but this is not the current working table. The current working table is +-----+
 | num |
 +-----+
 | 1 |
 +-----+and the temporary intermediate table is +-----+
 | num |
 +-----+
 | 2 |
 +-----+By means of UNION ALL, the total result set being iteratively built is+-----+
 | num |
 +-----+
 | 1 |
 | 2 |
 +-----+but we have a way to go before the iterative process terminates. As indicated in the descriptions directly before this example began, we now replace the contents of the working table +-----+
 | num |
 +-----+
 | 1 |
 +-----+with the contents of the intermediate table +-----+
 | num |
 +-----+
 | 2 |
 +-----+and then we empty the intermediate table so it can be +-----+
 | num |
 +-----+
 | |
 +-----+for the pending iteration. For the next iteration, with +-----+
 | num |
 +-----+
 | 2 |
 +-----+as the working table, we see that our CTE_auxiliary_query_2,SELECT num + 1 FROM nums_consecutive WHERE num < 10, will give us+-----+
 | num |
 +-----+
 | 3 |
 +-----+as the new temporary intermediate table, which will then replace the working table +-----+
 | num |
 +-----+
 | 2 |
 +-----+and so on and so forth. This process will continue for some time until our working table is +-----+
 | num |
 +-----+
 | 9 |
 +-----+at which point our CTE_auxiliary_query_2,SELECT num + 1 FROM nums_consecutive WHERE num < 10, will give us+-----+
 | num |
 +-----+
 | 10 |
 +-----+as the new temporary intermediate table. This table replaces +-----+
 | num |
 +-----+
 | 9 |
 +-----+as the working table and when our recursive term is executed again, that is, when SELECT num + 1 FROM nums_consecutive WHERE num < 10runs against the working table+-----+
 | num |
 +-----+
 | 10 |
 +-----+we actually come up empty since the WHERE num < 10condition is not satisfied. Now both the working table and temporary intermediate table are empty and there's nothing left for the recursive term to run a query against; hence, the query terminates and control is passed toCTE_primary_query(i.e.,SELECT * FROM nums_consecutive;in this example).
- 
CTE_primary_query: The table we built in an iterative fashion,nums_consecutive, is now available in its entirety for us to run queries against as we please. Since we want everything from this small table in this example, we simply executeSELECT * FROM nums_consecutive;and this gives us the result set of having all the working tablesUNIONedALLtogether:+------+
 | num |
 +------+
 | 1 |
 | 2 |
 | 3 |
 | 4 |
 | 5 |
 | 6 |
 | 7 |
 | 8 |
 | 9 |
 | 10 |
 +------+For this example, note how the working table has just a single row in each step, and it takes on the values from 1through10in successive steps. In the 10th step, there is no output because of theWHEREclause, and so the query terminates.
Generating sequential numbers fitting a certain pattern
Consider the query
WITH RECURSIVE seq_nums AS (
  SELECT
    5 AS num,
    0 AS iteration
  UNION ALL
  SELECT
    (2 * num + num),
    iteration + 1
  FROM
    seq_nums
  WHERE
    iteration < 5
) SELECT * FROM seq_nums;
with
+------+-----------+
| num  | iteration |
+------+-----------+
|    5 |         0 |
|   15 |         1 |
|   45 |         2 |
|  135 |         3 |
|  405 |         4 |
| 1215 |         5 |
+------+-----------+
as its result set. Note how, just as in the previous example, the working table has just a single row in each step, and it takes on the values 5, 15, 45, 135, and 405 in successive steps. Each iteration value simply denotes which application of CTE_auxiliary_query_2 has been used to generate the next row (the first row indicates the initial application of the non-recursive term CTE_auxiliary_query_1 as well as the first application of CTE_auxiliary_query_2 to generate the next row; the final row indicates the 5th application of CTE_auxiliary_query_2 which has an empty query output, thus terminating the query).
Finding the sum of the first 100 consecutive positive integers
A classic question from the school days: Find the sum of the first 100 consecutive positive integers. But this time use SQL implementing a WITH RECURSIVE solution! We can use the work we did in the first example to generate numbers 1 through 100, inclusive, and then our CTE_primary_query can involve an aggregate function instead of just selecting everything from the CTE we just generated:
WITH RECURSIVE nums_consecutive AS (
  SELECT 1 AS num
  UNION ALL
  SELECT num + 1 FROM nums_consecutive WHERE num < 100
) SELECT SUM(num) AS total_sum FROM nums_consecutive;
Result set:
+-----------+
| total_sum |
+-----------+
|      5050 |
+-----------+
Generating the first 15 Fibonacci numbers
The following is a somewhat more complicated usage of WITH RECURSIVE:
WITH RECURSIVE fib_table (FibBuilder, FibNum, FibIndex) AS (
  SELECT 1, 0, 0
  UNION ALL
  SELECT FibNum, FibNum + FibBuilder, FibIndex + 1 FROM fib_table WHERE FibIndex < 15
)
SELECT FibBuilder, FibNum, FibIndex FROM fib_table;
The result set:
+------------+--------+----------+
| FibBuilder | FibNum | FibIndex |
+------------+--------+----------+
|          1 |      0 |        0 |
|          0 |      1 |        1 |
|          1 |      1 |        2 |
|          1 |      2 |        3 |
|          2 |      3 |        4 |
|          3 |      5 |        5 |
|          5 |      8 |        6 |
|          8 |     13 |        7 |
|         13 |     21 |        8 |
|         21 |     34 |        9 |
|         34 |     55 |       10 |
|         55 |     89 |       11 |
|         89 |    144 |       12 |
|        144 |    233 |       13 |
|        233 |    377 |       14 |
|        377 |    610 |       15 |
+------------+--------+----------+
LeetCode problems
The following LeetCode problems provide instances where the WITH RECURSIVE construct can be utilized. This list is not exhaustive. Right click a widget title to visit the actual problem and try your hand or simply click the widget title to see a possible solution.
LC 571. Find Median Given Frequency of Numbers
WITH RECURSIVE nums_full_list AS (
  SELECT 
    N.Number, 1 AS num_count, N.Frequency 
  FROM 
    Numbers N
  UNION ALL
  SELECT 
    NFL.Number, NFL.num_count + 1, NFL.Frequency 
  FROM 
    nums_full_list NFL 
  WHERE NFL.num_count < NFL.Frequency
), ordered_nums AS (
  SELECT
    ROW_NUMBER() OVER (ORDER BY NFL.Number) AS num_order,
    NFL.Number
  FROM 
    nums_full_list NFL
), eligible_medians AS (
  SELECT
    O.Number
  FROM
    ordered_nums O
  WHERE
    O.num_order BETWEEN 
      (SELECT FLOOR(AVG(num_order)) FROM ordered_nums) 
      AND (SELECT CEIL(AVG(num_order)) FROM ordered_nums)
)
SELECT
  AVG(E.Number) AS median
FROM
  eligible_medians E;
LC 579. Find Cumulative Salary of an Employee
WITH RECURSIVE months_range AS (
  SELECT 1 AS month_num
  UNION ALL
  SELECT month_num + 1 FROM months_range WHERE month_num < 12
), emps_and_months AS (
  SELECT
    *
  FROM
    (SELECT DISTINCT Id FROM Employee) X, months_range
), cum_sal_info AS (
  SELECT
    EM.Id,
    EM.month_num AS Month,
    E.Salary,
    SUM(E.Salary) OVER (
      PARTITION BY EM.Id
      ORDER BY
        EM.month_num ROWS BETWEEN 2 PRECEDING
        AND 0 FOLLOWING
    ) AS cum_sal
  FROM
    emps_and_months EM
    LEFT JOIN Employee E
      ON EM.Id = E.Id AND EM.month_num = E.Month
)
SELECT
  C.Id,
  C.Month,
  C.cum_sal AS Salary
FROM
  cum_sal_info C
  INNER JOIN Employee E1
  ON E1.Id = C.Id
    AND E1.Month = C.Month
    AND C.Month < (
      SELECT
        MAX(E2.Month)
      FROM
        Employee E2
      WHERE
        E2.Id = E1.Id
    )
ORDER BY
  C.Id,
  C.Month DESC;
LC 1270. All People Report to the Given Manager
WITH RECURSIVE boss_chain AS (
  SELECT
    E1.employee_id
  FROM
    Employees E1
  WHERE
    E1.employee_id != 1 AND E1.manager_id = 1
  UNION ALL
  SELECT
    E.employee_id
  FROM
    boss_chain B
    INNER JOIN Employees E
    ON B.employee_id = E.manager_id
)
SELECT DISTINCT
  B.employee_id
FROM
  boss_chain B;
LC 1336. Number of Transactions per Visit
WITH RECURSIVE trans_counts AS (
  SELECT
    V.user_id,
    V.visit_date,
    COUNT(T.transaction_date) AS trans_count
  FROM
    Visits V
    LEFT JOIN Transactions T
      ON V.visit_date = T.transaction_date 
        AND V.user_id = T.user_id
  GROUP BY
    V.user_id, V.visit_date
), trans_count_vals AS (
  SELECT MAX(TC.trans_count) as trans_id FROM trans_counts TC
  UNION ALL
  SELECT trans_id - 1 FROM trans_count_vals WHERE trans_id > 0
)
SELECT
  TCV.trans_id AS transactions_count,
  COUNT(TC.trans_count) AS visits_count
FROM
  trans_count_vals TCV
  LEFT JOIN trans_counts TC ON TC.trans_count = TCV.trans_id
GROUP BY
  TC.trans_count, TCV.trans_id
ORDER BY
  transactions_count;
LC 1384. Total Sales Amount by Year
WITH RECURSIVE min_max_dates AS (
  SELECT
    MIN(S.period_start) min_period_date, 
    MAX(S.period_end) max_period_date
  FROM Sales S
), period_dates AS (
  SELECT 
    min_period_date AS period_date 
  FROM 
    min_max_dates
  UNION ALL
  SELECT 
    DATE_ADD(period_date, INTERVAL 1 day) 
  FROM 
    period_dates
  WHERE 
    period_date <= (SELECT max_period_date FROM min_max_dates)
)
SELECT
  S.product_id,
  P.product_name,
  LEFT(PD.period_date, 4) AS report_year,
  COUNT(PD.period_date) * S.average_daily_sales AS total_amount
FROM
  Sales S
  INNER JOIN Product P 
    ON P.product_id = S.product_id
  INNER JOIN period_dates PD 
    ON PD.period_date BETWEEN S.period_start AND S.period_end
GROUP BY
  S.product_id, P.product_name, report_year, S.average_daily_sales
ORDER BY
  S.product_id, report_year;
LC 1613. Find the Missing IDs
WITH RECURSIVE possible_id_values AS (
  SELECT MAX(C.customer_id) AS id_val FROM Customers C
  UNION ALL
  SELECT id_val - 1 FROM possible_id_values WHERE id_val > 1
)
SELECT
  P.id_val AS ids
FROM
  possible_id_values P
WHERE
  P.id_val NOT IN (SELECT C2.customer_id FROM Customers C2)
ORDER BY
  P.id_val;
LC 1635. Hopper Company Queries I
WITH RECURSIVE months AS (
  SELECT 1 AS month, '2020-01-31' AS month_end
  UNION ALL
  SELECT month + 1, LAST_DAY(DATE_ADD(month_end, INTERVAL 1 MONTH)) FROM months WHERE month < 12
), active_drivers_by_month AS (
  SELECT
    M.month,
    COUNT(D.driver_id) AS num_drivers
  FROM
    months M
    LEFT JOIN Drivers D ON D.join_date <= M.month_end
  GROUP BY
    M.month
)
SELECT
  M.month,
  AD.num_drivers AS active_drivers,
  COUNT(R.requested_at) AS accepted_rides
FROM
  (SELECT ride_id, requested_at FROM Rides WHERE YEAR(requested_at) = 2020) R
  INNER JOIN AcceptedRides AR ON AR.ride_id = R.ride_id
  RIGHT JOIN months M ON M.month = MONTH(R.requested_at)
  INNER JOIN active_drivers_by_month AD ON M.month = AD.month
GROUP BY
  M.month
ORDER BY
  M.month;
LC 1645. Hopper Company Queries II
WITH RECURSIVE months AS (
  SELECT 1 AS month, '2020-01-31' AS month_end
  UNION ALL
  SELECT month + 1, LAST_DAY(DATE_ADD(month_end, INTERVAL 1 MONTH)) FROM months WHERE month < 12
), active_drivers_by_month AS (
  SELECT
    M.month,
    COUNT(D.driver_id) AS num_drivers
  FROM
    months M
    LEFT JOIN Drivers D ON D.join_date <= M.month_end
  GROUP BY
    M.month
), month_stats AS (
  SELECT
    M.month,
    COUNT(DISTINCT AR.driver_id) AS accepting_drivers,
    AD.num_drivers AS active_drivers,
    COUNT(R.requested_at) AS accepted_rides
  FROM
    (SELECT ride_id, requested_at FROM Rides WHERE YEAR(requested_at) = 2020) R
    INNER JOIN AcceptedRides AR ON AR.ride_id = R.ride_id
    RIGHT JOIN months M ON M.month = MONTH(R.requested_at)
    INNER JOIN active_drivers_by_month AD ON M.month = AD.month
  GROUP BY
    M.month
  ORDER BY
    M.month
)
SELECT
  MS.month,
  (CASE
    WHEN MS.accepted_rides = 0 THEN 0
    ELSE ROUND(100 * MS.accepting_drivers / MS.active_drivers, 2)
  END) AS working_percentage
FROM
  month_stats MS
ORDER BY
  MS.month;
LC 1651. Hopper Company Queries III
WITH RECURSIVE months AS (
  SELECT 1 AS month, '2020-01-31' AS month_end
  UNION ALL
  SELECT month + 1, LAST_DAY(DATE_ADD(month_end, INTERVAL 1 MONTH)) FROM months WHERE month < 12
), active_drivers_by_month AS (
  SELECT
    M.month,
    COUNT(D.driver_id) AS num_drivers
  FROM
    months M
    LEFT JOIN Drivers D ON D.join_date <= M.month_end
  GROUP BY
    M.month
), month_stats AS (
  SELECT
    M.month,
    SUM(IFNULL(AR.ride_distance,0)) AS ride_distance,
    SUM(IFNULL(AR.ride_duration,0)) AS ride_duration
  FROM
    (SELECT ride_id, requested_at FROM Rides WHERE YEAR(requested_at) = 2020) R
    INNER JOIN AcceptedRides AR ON AR.ride_id = R.ride_id
    RIGHT JOIN months M ON M.month = MONTH(R.requested_at)
    INNER JOIN active_drivers_by_month AD ON M.month = AD.month
  GROUP BY
    M.month
)
SELECT
  MS.month,
  ROUND(AVG(MS.ride_distance) OVER(ORDER BY MS.month ROWS BETWEEN 0 PRECEDING AND 2 FOLLOWING),2) 
    AS average_ride_distance,
  ROUND(AVG(MS.ride_duration) OVER(ORDER BY MS.month ROWS BETWEEN 0 PRECEDING AND 2 FOLLOWING),2) 
    AS average_ride_duration
FROM
  month_stats MS
ORDER BY
  MS.month
LIMIT 10;
LC 1767. Find the Subtasks That Did Not Execute
WITH RECURSIVE subtask_listing AS (
  SELECT T.task_id, T.subtasks_count AS subtask_id FROM Tasks T
  UNION ALL
  SELECT SL.task_id, SL.subtask_id - 1
  FROM subtask_listing SL
  WHERE SL.subtask_id > 1
)
SELECT
  *
FROM
  subtask_listing SL
WHERE
  NOT EXISTS (
    SELECT
      1
    FROM
      Executed E
    WHERE
      SL.task_id = E.task_id
      AND SL.subtask_id = E.subtask_id
  );
LC 1843. Suspicious Bank Accounts
WITH RECURSIVE activity_months AS (
  SELECT
    T.account_id,
    DATE_FORMAT(MAX(T.day), '%Y-%m-01') AS activity_month,
    DATE_FORMAT(MIN(T.day), '%Y-%m-01') AS min_date
  FROM
    Transactions T GROUP BY T.account_id
  UNION ALL
  SELECT
    AD.account_id, 
    DATE_SUB(AD.activity_month, INTERVAL 1 MONTH), min_date
  FROM activity_months AD WHERE activity_month > min_date
) SELECT * FROM activity_months ORDER BY account_id, activity_month;