
Templates for Data Structures and Algorithms


Exercise symbol legend
  • This is a problem from LeetCode's interview crash course.
  • This is a problem stemming from work done through interviewing.io.
  • A right-aligned ★ (one or more) indicates my own personal designation as to the problem's relevance, importance, priority in review, etc.

Backtracking

Remarks

TBD

def fn(curr, OTHER_ARGUMENTS...):
    if (BASE_CASE):
        # modify the answer
        return

    ans = 0
    for (ITERATE_OVER_INPUT):
        # modify the current state
        ans += fn(curr, OTHER_ARGUMENTS...)
        # undo the modification of the current state

    return ans
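As a concrete instantiation of the counting form of the template above, the sketch below counts permutations of a list (the function names are illustrative, not from any library):

```python
from typing import List

def count_permutations(nums: List[int]) -> int:
    # Counting form of the template: each completed permutation contributes 1
    # to the running answer instead of being appended to a results list.
    def fn(curr: List[int]) -> int:
        if len(curr) == len(nums):  # BASE_CASE: a full permutation was built
            return 1

        ans = 0
        for num in nums:            # ITERATE_OVER_INPUT
            if num not in curr:
                curr.append(num)    # modify the current state
                ans += fn(curr)
                curr.pop()          # undo the modification
        return ans

    return fn([])

print(count_permutations([1, 2, 3]))  # 6, i.e., 3!
```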
Examples
LC 46. Permutations (✓) ★★

Given an array nums of distinct integers, return all the possible permutations. You can return the answer in any order.


class Solution:
    def permute(self, nums: List[int]) -> List[List[int]]:
        def backtrack(curr):
            if len(curr) == len(nums):
                permutations.append(curr[:])  # note that we append a _copy_ of curr
                return

            for num in nums:
                if num not in curr:
                    curr.append(num)
                    backtrack(curr)
                    curr.pop()

        permutations = []
        backtrack([])

        return permutations

What actually is a permutation of nums? It's an ordering of all the elements of nums in which no element is repeated. A backtracking strategy to generate all permutations sounds promising, so what would the base case be? It would be when the current permutation being generated, say curr, has the same length as the input array: len(curr) == len(nums) (this assumes we've done our due diligence and prevented duplicates from being added to curr). Hitting this base case means we have completed the process of generating a permutation and cannot go any further; specifically, if we view the process of generating permutations as a tree, then completing a permutation means reaching a leaf node. The DFS-like trace further below makes this tree structure concrete for the input nums = [1, 2, 3].

Building all permutations for this problem, where the input is an array of numbers, means we need all elements to make an appearance at the first index, all other elements to make an appearance at the second index, and so on. Hence, we should loop over all elements of nums for each call to our backtrack function, where we should always check to see if a number is already in curr before adding it to curr. Each call to backtrack is like visiting a node in the tree of candidates being generated. The leaves are the base cases/answers to the problem.

For the solution given above, if we simply add print(curr) after the line curr.append(num), then we can very clearly see how each call to backtrack is like visiting a node in the tree (it's like performing a DFS on an imaginary tree):

[1]
[1, 2]
[1, 2, 3] # complete permutation (leaf node)
[1, 3]
[1, 3, 2] # complete permutation (leaf node)
[2]
[2, 1]
[2, 1, 3] # complete permutation (leaf node)
[2, 3]
[2, 3, 1] # complete permutation (leaf node)
[3]
[3, 1]
[3, 1, 2] # complete permutation (leaf node)
[3, 2]
[3, 2, 1] # complete permutation (leaf node)

The entire list of complete permutations is then returned as the answer:

[[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]

The time complexity above is roughly O(n^2 * n!): each call to backtrack iterates through all of nums, membership checks against curr cost O(n) when curr is an array, and the call tree itself is factorial in size (the first level makes n calls, each of those makes n - 1 calls, and so forth). If we wanted to make a micro-optimization, we could introduce a hash set to make the membership checks on curr O(1) instead of O(n), but this change pales in comparison to the factorial cost of calling backtrack so many times:

class Solution:
    def permute(self, nums: List[int]) -> List[List[int]]:
        def backtrack(curr):
            if len(curr) == len(nums):
                permutations.append(curr[:])
                return

            for num in nums:
                if num not in lookup:
                    curr.append(num)
                    lookup.add(num)
                    backtrack(curr)
                    curr.pop()
                    lookup.remove(num)

        lookup = set()
        permutations = []
        backtrack([])

        return permutations
LC 78. Subsets (✓) ★★

Given an integer array nums of unique elements, return all possible subsets (the power set). The solution set must not contain duplicate subsets. Return the solution in any order.


class Solution:
    def subsets(self, nums: List[int]) -> List[List[int]]:
        def backtrack(curr, num_idx):
            subs.append(curr[:])

            if num_idx == len(nums):  # note that this base case is implied
                return                # since `backtrack` is only called in the
                                      # for loop when `num_idx < len(nums)`

            for i in range(num_idx, len(nums)):
                curr.append(nums[i])
                backtrack(curr, i + 1)
                curr.pop()

        subs = []
        backtrack([], 0)

        return subs

This problem is quite similar to LC 46. Permutations but with some very notable differences, namely container length and element order:

  • Length: A subset can have any length from 0 through n (where n is the size of the input array of distinct integers), inclusive, but a permutation has a fixed length of n.
  • Order: The containers [1, 2, 3] and [3, 2, 1] are considered to be different permutations but the same subset.

If we have a problem where containers like those above are considered to be duplicates and we do not want to consider duplicates (e.g., such as this problem concerned with finding subsets), then a common "trick" is to add a rule where each call of the backtrack function allows us to only consider elements that come after the previously processed element:

This is a very common method of avoiding duplicates in backtracking problems — have an integer argument that represents a starting point for iteration at each function call.

For example, in this problem, we start with the root being the empty container, []:

[]

With nums = [1, 2, 3], we can clearly consider each element as the beginning of its own subset:

        [ ]
      /  |  \
   [1]  [2]  [3]

Now what? Remember that calling backtrack is like moving to another node; hence, to respect the strategy remarked on above, when we move to another node, that node should only involve elements that come after the one we have just processed. This means the tree of possibilities above should end up looking like the following:

               [ ]                  # level subsets: []
            /   |   \
         [1]   [2]   [3]            # level subsets: [1], [2], [3]
        /   \    |
     [2]   [3]  [3]                 # level subsets: [1,2], [1,3], [2,3]
      |
     [3]                            # level subsets: [1,2,3]

The actual order of subset generation from our solution code is not hard to anticipate in light of the strategy we've been discussing, where, again, we're basically doing a DFS on an imaginary tree, and once we hit the implied base case (i.e., when num_idx == len(nums)) we move back up the tree from child to parent:

[
  [],
  [1],
  [1,2],
  [1,2,3],
  [1,3],
  [2],
  [2,3],
  [3]
]

The order conjectured above is confirmed by the return value of our solution when nums = [1, 2, 3]:

[[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3], [3]]

The process becomes even clearer if we add the print statement print(subs) after subs.append(curr[:]) as well as the print statement print('COMPLETED') immediately inside the if num_idx == len(nums): block. Making these modifications and running the solution code again on the input nums = [1,2,3] results in the following being printed to standard output:

[[]]
[[], [1]]
[[], [1], [1, 2]]
[[], [1], [1, 2], [1, 2, 3]]
COMPLETED
[[], [1], [1, 2], [1, 2, 3], [1, 3]]
COMPLETED
[[], [1], [1, 2], [1, 2, 3], [1, 3], [2]]
[[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3]]
COMPLETED
[[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3], [3]]
COMPLETED

Note that the text 'COMPLETED' is only ever printed after an element/subset has been added to subs that ends with 3, which corresponds to the leaves of the tree shown earlier.

To summarize, when generating permutations, we had a length requirement, where we needed to use all of the elements in the input; hence, we only considered leaf nodes as part of the actual returned answer. With subsets, however, there is no length requirement; thus, every node should be in the returned answer, including the root node, which is why the very first line of the backtrack function is to add a copy of curr to the returned answer subs.
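One quick sanity check on the observation that every node is part of the answer: a set of n distinct elements has exactly 2^n subsets, so the backtracking tree must contain exactly 2^n nodes (including the root). A minimal sketch of the same algorithm as a standalone function rather than a Solution class:

```python
from typing import List

def subsets(nums: List[int]) -> List[List[int]]:
    # Standalone version of the backtracking solution above.
    def backtrack(curr: List[int], num_idx: int) -> None:
        subs.append(curr[:])  # every node is part of the answer
        for i in range(num_idx, len(nums)):
            curr.append(nums[i])
            backtrack(curr, i + 1)
            curr.pop()

    subs: List[List[int]] = []
    backtrack([], 0)
    return subs

# A set of n distinct elements has exactly 2**n subsets, with no duplicates.
for n in range(6):
    result = subsets(list(range(n)))
    assert len(result) == 2 ** n
    assert len(set(map(tuple, result))) == 2 ** n
```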

LC 77. Combinations (✓) ★★

Given two integers n and k, return all possible combinations of k numbers out of 1 ... n. You may return the answer in any order.


class Solution:
    def combine(self, n: int, k: int) -> List[List[int]]:
        def backtrack(curr, start_val):
            if len(curr) == k:
                combinations.append(curr[:])
                return

            for i in range(start_val, n + 1):
                curr.append(i)
                backtrack(curr, i + 1)
                curr.pop()

        combinations = []
        backtrack([], 1)

        return combinations

It can be very helpful to sketch out a tree of possibilities if we suspect backtracking may be an effective strategy for coming up with a solution (just like sketching out things for general tree problems, linked list problems, etc.).

The root of our tree would be [], the empty list. Unlike in LC 78. Subsets, there is an explicit length requirement in terms of the elements that can be added to the container we must ultimately return; hence, not all nodes in the tree of possibilities will represent entities that should be added to our answer array.

What is clear is that each entry added to the list we must return must have a length of k, and it must contain numbers from the list [1, ..., n], inclusive, where no number in the k-length list is duplicated. This means we should consider combining some of the strategy points we used in both LC 46. Permutations and LC 78. Subsets, namely using a length requirement as part of the base case as well as preventing duplicates from being considered, respectively.

The thinking goes something like the following: start with the empty list, [], as the root of the tree:

[]

Each k-length list can have a number from [1, ..., n], inclusive; thus, the root should have n children, namely nodes with values from 1 through n (the example below uses the values of n = 4, k = 2 from the first example on LeetCode):

           [ ]
      /   /   \   \
   [1]  [2]  [3]  [4]

Now what? A complete solution will be generated only when the path has length k, but each path right now has a length of 1 (but k = 2 for this example). To keep generating paths (i.e., possible solutions), we need to avoid considering duplicates, which means for each node value i, we only subsequently consider node values [i + 1, ..., n], which results in our overall tree looking like the following:

                     [ ]
            /      /      \      \
         [1]     [2]      [3]   [4]
        / | \    /  \      |
     [2] [3] [4] [3] [4]  [4]

Since the input specifies k = 2, the tree above suffices to report the following as the combinations list:

[
[1,2],
[1,3],
[1,4],
[2,3],
[2,4],
[3,4]
]
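Since Python's itertools.combinations also emits k-length combinations in this same lexicographic order, it can serve as an oracle for cross-checking the backtracking approach (a standalone sketch of the same algorithm):

```python
from itertools import combinations as itertools_combinations
from typing import List

def combine(n: int, k: int) -> List[List[int]]:
    # Standalone version of the backtracking solution above.
    def backtrack(curr: List[int], start_val: int) -> None:
        if len(curr) == k:
            result.append(curr[:])
            return
        for i in range(start_val, n + 1):
            curr.append(i)
            backtrack(curr, i + 1)
            curr.pop()

    result: List[List[int]] = []
    backtrack([], 1)
    return result

# Both generators emit combinations in lexicographic order, so the
# outputs should match element for element.
for n, k in [(4, 2), (5, 3), (6, 1)]:
    expected = [list(c) for c in itertools_combinations(range(1, n + 1), k)]
    assert combine(n, k) == expected

print(combine(4, 2))  # [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
```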
LC 797. All Paths From Source to Target (✓)

Given a directed acyclic graph (DAG) of n nodes labeled from 0 to n - 1, find all possible paths from node 0 to node n - 1, and return them in any order.

The graph is given as follows: graph[i] is a list of all nodes you can visit from node i (i.e., there is a directed edge from node i to node graph[i][j]).


class Solution:
    def allPathsSourceTarget(self, graph: List[List[int]]) -> List[List[int]]:
        def backtrack(curr_path, node):
            if node == len(graph) - 1:
                paths.append(curr_path[:])
                return

            for neighbor in graph[node]:
                curr_path.append(neighbor)
                backtrack(curr_path, neighbor)
                curr_path.pop()

        paths = []
        backtrack([0], 0)

        return paths

This is a great backtracking problem because backtracking is not necessarily the obvious first strategy of attack. Even if we do think of backtracking, how exactly should we proceed? This becomes much clearer (again!) once we start to sketch out the tree of possibilities. What would the root of our tree be? It has to be node 0 since our goal is to find all paths from node 0 to node n - 1. This also suggests something about our base case: we should terminate path generation whenever node n - 1 is reached. We don't have to worry about cycles or anything like that since we're told the graph is a DAG. Additionally, the graph is already provided as an adjacency list which makes traversing each node's neighbors quite easy.

So what's the strategy? Let's start with our root node (which needs to be part of every solution):

0

Now what? Each neighbor of node 0 needs to be considered (we're trying to get to node n - 1 from node 0 in whatever way is possible, which means exploring all possible paths). Let's use the input of the second example on LeetCode for a concrete illustration: graph = [[4,3,1],[3,2,4],[3],[4],[]]; that is, node 0 has edges to 4, 3, and 1; node 1 has edges to 3, 2, and 4; node 2 to 3; node 3 to 4; and node 4 has no outgoing edges.

As stated above, from 0 we need to consider each of its neighbors:

        0
      / | \
     1  3  4

The tree above makes it clear [0,4] will be one possible path, but what are the other possible paths? It looks like leaf nodes will be the complete solutions that need to be added to our answer that we ultimately return. For each node that is not 4 (i.e., n - 1 == 4 in this case), we need to consider each possible neighbor (this will not be endless because we're told the graph is a DAG). Considering the neighbors for each node means our tree of possibilities ultimately ends up looking as follows:

              0
          /   |   \
         1    3    4
       / | \  |
      2  3  4 4
      |  |
      3  4
      |
      4

Hence, the set of possible paths is as expected:

[[0,4],[0,3,4],[0,1,3,4],[0,1,2,3,4],[0,1,4]]
LC 17. Letter Combinations of a Phone Number (✓) ★★

Given a string containing digits from 2-9 inclusive, return all possible letter combinations that the number could represent. Return the answer in any order.

A mapping of digit to letters (just like on the telephone buttons) is given below. Note that 1 does not map to any letters.


class Solution:
    def letterCombinations(self, digits: str) -> List[str]:
        if not digits:
            return []

        keypad = {
            '2': 'abc',
            '3': 'def',
            '4': 'ghi',
            '5': 'jkl',
            '6': 'mno',
            '7': 'pqrs',
            '8': 'tuv',
            '9': 'wxyz'
        }

        def backtrack(curr, start_idx):
            if len(curr) == len(digits):
                combinations.append("".join(curr))
                return

            for i in range(start_idx, len(digits)):
                digit = digits[i]
                for letter in keypad[digit]:
                    curr.append(letter)
                    backtrack(curr, i + 1)
                    curr.pop()

        combinations = []
        backtrack([], 0)

        return combinations

As with most other backtracking problems, it helps if we start by sketching out the tree of possibilities, where we can imagine the root being an empty string:

''

Now let's consider the input digits = "23", the input for the first example on LeetCode. The desired output is ["ad","ae","af","bd","be","bf","cd","ce","cf"]. How can this be achieved? The first digit is 2, which means 'a', 'b', and 'c' are valid starting letters for combinations:

          ''
        /  |  \
     'a'  'b'  'c'

We do not want to add duplicates of these letters for subsequent possible combinations but instead consider the potential letters arising from the next digit (i.e., we want to prevent processing duplicates by only ever processing digits after the current digit). The next digit in this case is 3 which corresponds to possible letters 'd', 'e', and 'f'. Our tree now looks like the following:

                         ''
           /             |             \
         'a'            'b'            'c'
       /  |  \        /  |  \        /  |  \
    'd'  'e' 'f'   'd'  'e' 'f'   'd'  'e' 'f'

The tree of possibilities above makes it clear the letter combinations we should return are as follows, as expected:

[
'ad',
'ae',
'af',
'bd',
'be',
'bf',
'cd',
'ce',
'cf',
]
LC 39. Combination Sum (✓)

Given an array of distinct integers candidates and a target integer target, return a list of all unique combinations of candidates where the chosen numbers sum to target. You may return the combinations in any order.

The same number may be chosen from candidates an unlimited number of times. Two combinations are unique if the frequency of at least one of the chosen numbers is different.

It is guaranteed that the number of unique combinations that sum up to target is less than 150 combinations for the given input.


class Solution:
    def combinationSum(self, candidates: List[int], target: int) -> List[List[int]]:
        def backtrack(path, path_sum, next_path_node_idx):
            if path_sum == target:
                paths.append(path[:])
                return

            for i in range(next_path_node_idx, len(candidates)):
                candidate = candidates[i]
                if path_sum + candidate <= target:
                    path.append(candidate)
                    backtrack(path, path_sum + candidate, i)
                    path.pop()

        paths = []
        backtrack([], 0, 0)

        return paths

This problem is similar to LC 77. Combinations, but now we are tasked with generating all combinations of candidates such that each candidate number may be used as many times as we want, but the sum of the numbers for each combination/mixture of candidates has to equal the given target.

Start by drawing a tree!

[]

Each number in candidates may be part of a path sum we want to return, where the sum of the elements in the path equates to target. Let's use the input provided in the first example on LeetCode: candidates = [2,3,6,7], target = 7:

        [ ]
    /   |   |   \
   2    3   6    7

The problem description indicates any number may be reused however many times we desire, but we still don't want to generate duplicates such as [2,2,3], [2,3,2], etc. How can we avoid doing this? We can use the same strategy that many other backtracking problems use (i.e., ensure we only begin processing elements that are either the current element itself or elements that come after the current element):

                    [ ]
         /       /       \      \
        2       3         6      7
    / / \ \   / | \      / \     |
   2 3  6 7  3  6  7    6   7    7
  ..................................

When do we stop the process of generating paths? Since all possible values in candidates are positive, this necessarily means a path is no longer valid if its path sum exceeds the given target value.

Note that the following lines in the solution above are important:

# ...
for i in range(next_path_node_idx, len(candidates)):
    candidate = candidates[i]
    if path_sum + candidate <= target:
        path.append(candidate)
        # ...

It might be tempting to modify path_sum with the candidate value before entering the if block, but this would be a mistake. Why? Because the list of candidates is not necessarily ordered; hence, if we made the update path_sum += candidate and the new path_sum exceeded target, then we would no longer consider that path, but this would also prevent us from exploring other branches using candidates of a possibly lesser value where path_sum + candidate did not exceed target.
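The pitfall can be demonstrated concretely. Below is a standalone sketch of the solution alongside an illustrative buggy variant that mutates path_sum inside the loop; with the unsorted input [5, 3, 2] and target 5, the buggy version misses the combination [3, 2] because the 5 processed earlier in the loop has already inflated path_sum:

```python
from typing import List

def combination_sum(candidates: List[int], target: int) -> List[List[int]]:
    # Correct version: the updated sum is only threaded through the
    # recursive call, never written back into path_sum.
    def backtrack(path, path_sum, start):
        if path_sum == target:
            paths.append(path[:])
            return
        for i in range(start, len(candidates)):
            c = candidates[i]
            if path_sum + c <= target:
                path.append(c)
                backtrack(path, path_sum + c, i)
                path.pop()

    paths: List[List[int]] = []
    backtrack([], 0, 0)
    return paths

def combination_sum_buggy(candidates: List[int], target: int) -> List[List[int]]:
    # Buggy version: path_sum is mutated before the check, so the inflated
    # value leaks into later iterations of the same loop.
    def backtrack(path, path_sum, start):
        if path_sum == target:
            paths.append(path[:])
            return
        for i in range(start, len(candidates)):
            path_sum += candidates[i]  # BUG: persists across iterations
            if path_sum <= target:
                path.append(candidates[i])
                backtrack(path, path_sum, i)
                path.pop()

    paths: List[List[int]] = []
    backtrack([], 0, 0)
    return paths

# Unsorted input: the large candidate 5 comes first.
print(combination_sum([5, 3, 2], 5))        # [[5], [3, 2]]
print(combination_sum_buggy([5, 3, 2], 5))  # [[5]] -- misses [3, 2]
```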

LC 79. Word Search (✓) ★★★

Given an m x n board and a word, find if the word exists in the grid.

The word can be constructed from letters of sequentially adjacent cells, where "adjacent" cells are horizontally or vertically neighboring. The same letter cell may not be used more than once.


class Solution:
    def exist(self, board: List[List[str]], word: str) -> bool:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n

        def backtrack(row, col, word_idx, seen):
            if word_idx == len(word):
                return True

            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                if valid(next_row, next_col) and (next_row, next_col) not in seen:
                    next_char = board[next_row][next_col]
                    if next_char == word[word_idx]:
                        seen.add((next_row, next_col))
                        if backtrack(next_row, next_col, word_idx + 1, seen):
                            return True
                        seen.remove((next_row, next_col))

            return False

        m = len(board)
        n = len(board[0])
        dirs = [(1,0),(-1,0),(0,1),(0,-1)]

        for row in range(m):
            for col in range(n):
                if board[row][col] == word[0] and backtrack(row, col, 1, {(row, col)}):
                    return True

        return False

This problem has a very DFS feel to it, but what makes it a backtracking problem rather than a strict DFS problem is that we may visit a square multiple times across different paths explored by the same initial function call. For example, suppose we're looking for the word CAMERA in the following grid:

CAM
ARE
MPS

We may start exploring as follows (each | represents the path taken):

C  A  M
|
A  R  E
|
M  P  S

The M-letter above does not have E as a neighbor so we remove the M from consideration and move back up to A, which also does not have an M-neighbor except the one we just visited, which we know leads to nothing. So we remove A as well and we're back at C, and we note C does have another A-neighbor it can visit. And we end up seeing the following as a valid path:

C -> A -> M
          |
A <- R <- E

M    P    S

This path visits the A under C again in order to come up with a valid path. Thus, while we may not be allowed to use a square more than once for the answer, there are possibly multiple ways to use a square to form different candidates, as illustrated above.

We should still use a set seen like we would normally do with DFS solutions so as to avoid using the same letter in the same path. Unlike in DFS solutions, however, we will remove from the set seen when backtracking and it's been determined that a path solution is no longer possible — we should only ever traverse an edge in the graph if we know that the resultant path could lead to an answer. Thus, we can also pass an index variable word_idx that indicates we are currently looking for word[word_idx], and then only move to (next_row, next_col) if word[word_idx] is the correct letter.

Since the answer could start from any square whose letter matches word[0], we need to systematically start our backtracking approach from any square with word[0] as its letter. If we exhaust all such squares and never find word, then we should return false. Of course, if we do find word in the midst of one of our backtracking approaches, then we return true immediately.

LC 52. N-Queens II (✓) ★★★

The n-queens puzzle is the problem of placing n queens on an n x n chessboard such that no two queens attack each other.

Given an integer n, return the number of distinct solutions to the n-queens puzzle.


class Solution:
    def totalNQueens(self, n: int) -> int:
        def backtrack(row, cols, diagonals, anti_diagonals):
            # base case: n queens have been placed
            if row == n:
                return 1

            solution = 0
            for col in range(n):
                diagonal = row - col
                anti_diagonal = row + col

                # queen is not placeable, move to considering next column
                if (col in cols or diagonal in diagonals or anti_diagonal in anti_diagonals):
                    continue

                # add the queen to the board
                cols.add(col)
                diagonals.add(diagonal)
                anti_diagonals.add(anti_diagonal)

                # move to next row with the updated board state
                solution += backtrack(row + 1, cols, diagonals, anti_diagonals)

                # remove the queen from the board since we have
                # explored all valid paths using the above backtrack function call
                cols.remove(col)
                diagonals.remove(diagonal)
                anti_diagonals.remove(anti_diagonal)

            return solution

        return backtrack(0, set(), set(), set())

This problem is difficult, and it's often used to explain backtracking at various learning levels (e.g., university and college classes). The solution to this problem is informative, especially with regard to how we might want to go about modeling backtracking solutions to other problems.

Whenever we make a call to backtrack, we're doing so by describing the state of whatever is important for solving the problem at hand. For some problems, that state involves a string and the number of left and right parentheses used so far (e.g., LC 22. Generate Parentheses); for others, the state involves the row and column of a grid cell under consideration, along with the cells already seen and the index of the character in a word for which we are searching (e.g., LC 79. Word Search). And maybe, as in this problem, the state needs to involve a chessboard!

What state should we consider tracking? We need to somehow keep track of rows and columns in some way. The trickier part about this problem involves the diagonals. Even though we have a board in this problem, this is not like a graph problem where we need to conduct a DFS or BFS — fundamentally, our job at each step is to place a queen on the board so that it is not under attack. This is the backtracking part where we narrow down the range of possibilities we consider — the brute force approach would involve generating all board possibilities with queens and then narrow down which solutions actually work, but that would involve placing queens on the board in positions that are being attacked, which means the rest of the generation of possibilities is useless. We want to place queens in positions where they are not being attacked and where whatever subsequent positions we consider can actually lead to a solution. Only once we find a solution do we start to undo queen placement and try other queen placements, but the key is that every placement must be one that could potentially lead to success.

With everything in mind above, how should we proceed? First, it's helpful to just think about how queens move on a chessboard: unlimited moves vertically, horizontally, diagonally (top left to bottom right), and anti-diagonally (top right to bottom left). If we are given n queens on an n x n chessboard, then it's clear that, at the end of the board configuration, each row must have a queen and each column must have a queen. The trickiest part is managing the diagonals effectively.

But we can start by working with the insight above about rows and columns. Each time backtrack is called should result in the placement of a queen on a new row; that is, we'll pass row as an argument to backtrack, and when we have finally reached row == n (i.e., rows 0 through n - 1 have been filled with queens), then we will know we have found a valid solution. Finding a workable position on any given row means processing all column positions on that row until we have found a position that is not under attack.

What about columns? As noted above, for each row being processed, we will iterate through all row positions (i.e., across all columns) until we find a column position that is not under attack. The row we are processing is definitely not under attack due to how we're processing rows, but how do we know whether or not each column is under attack? If another queen is on any column we're considering for a current row, then that column position is invalid because that other queen's line of attack runs through the current position we're considering. We need to consider the next column position. To keep track of which columns are already occupied by queens (i.e., which columns would be considered to be under attack when we're trying to add a new queen), whenever we add a queen to the board, we should add the column position of the queen to a set cols that we can use when we're trying to add subsequent queens.

Cool! We consider rows sequentially so it's guaranteed each row we're considering is not under attack. We have a set cols that shows which columns are occupied. It seems like we need a similar "strategy" for diagonals and anti-diagonals that we currently have for columns; that is, whenever we add a queen to the board, we should also note which diagonal and anti-diagonal just became occupied. But how do we effectively apply a singular label or marker to a diagonal or anti-diagonal that spans multiple positions on the board?

This seems really tricky to do at first until we make note of a brilliant observation concerning diagonal and anti-diagonal movement (R and C below represent row and column position, respectively):

  • diagonal movement (top left to bottom right): If we start at any cell (R, C) and move one position up, then we will be at cell (R - 1, C - 1). If we move one position down, then we will be at (R + 1, C + 1). In general, if we let dr and dc represent the change in row or column value, respectively, then we will find ourselves always moving from cell (R, C) to (R + dr, C + dc) on a given diagonal. Importantly, note that dr == dc since a change in any direction effects each value in the same way (e.g., moving 4 spaces up means dr == dc == -4 since the row and column values both decrease by 4).

    But how does any of this help in service of creating a unique label for a diagonal? The insight lies in how the row and column values relate to each other. Let (R, C) be the coordinates of a cell on a diagonal, and let's label the cell as having value R - C. Then what happens whenever we move from cell (R, C) to cell (R + dr, C + dc)? Since dr == dc, we have (R + dr) - (C + dc) = R + dr - C - dr = R - C; that is, the label for any cell on a diagonal is the same:

    Hence, we should label visited diagonals by adding row - col values to a diagonals set.

  • anti-diagonal movement (top right to bottom left): We can make a similar argument to the one above about effectively labeling anti-diagonals. Let (R, C) represent any given cell on an anti-diagonal. Then moving up a cell would take us to cell (R - 1, C + 1) while moving down a cell would take us to cell (R + 1, C - 1); that is, the row and column values of a cell change at the same rate in an inversely proportional manner. We essentially have dr == dc again, but there needs to be a sign difference in how we represent a movement from (R, C) to another cell: (R + dr, C - dc). Then (R + dr) + (C - dc) = R + dr + C - dr = R + C:

    Hence, we should label visited anti-diagonals by adding row + col values to an anti-diagonals set.
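These two labeling rules are easy to check mechanically: walking along any diagonal leaves row - col unchanged, and walking along any anti-diagonal leaves row + col unchanged. A small illustrative check (the board size and starting cells are arbitrary):

```python
# Board size for the illustration (any n works).
n = 8

# Walk down-right along the diagonal starting at (0, 3):
# (0, 3), (1, 4), (2, 5), (3, 6), (4, 7)
diagonal = [(r, 3 + r) for r in range(n - 3)]
assert {r - c for r, c in diagonal} == {-3}  # row - col is constant

# Walk down-left along the anti-diagonal starting at (0, 5):
# (0, 5), (1, 4), (2, 3), (3, 2), (4, 1), (5, 0)
anti_diagonal = [(r, 5 - r) for r in range(6)]
assert {r + c for r, c in anti_diagonal} == {5}  # row + col is constant

print("diagonal label:", -3, "| anti-diagonal label:", 5)
```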

LC 22. Generate Parentheses (✓) ★★★

Given n pairs of parentheses, write a function to generate all combinations of well-formed parentheses.


class Solution:
    def generateParenthesis(self, n: int) -> List[str]:
        def backtrack(curr_str, left_count, right_count):
            if len(curr_str) == 2 * n:
                valid_parens.append("".join(curr_str))
                return

            if left_count < n:
                curr_str.append('(')
                backtrack(curr_str, left_count + 1, right_count)
                curr_str.pop()

            if left_count > right_count:
                curr_str.append(')')
                backtrack(curr_str, left_count, right_count + 1)
                curr_str.pop()

        valid_parens = []
        backtrack([], 0, 0)

        return valid_parens

The editorial for this solution on LeetCode is quite good. The key insights (with the first being somewhat minor but the second one being critical):

  • We should append the string we're building whenever its length is 2n because the string we're building, when taking a backtracking approach anyway, must have the potential to be valid and a complete solution at every point. Hence, once/if the candidate string reaches a length of 2n, then we know the string satisfies the constraints of the problem and is a complete solution.
  • As a starting point, we can add as many left-parentheses as we want without fear of producing an invalid string (so long as the number of left parentheses we add doesn't exceed n). We then need to start adding right parentheses. But how can we add right parentheses in general? We should only ever consider adding a right parenthesis when the total number of left parentheses exceeds the number of right parentheses. That way when we add a right parenthesis there's a chance the subsequent string could be a well-formed parenthetical string of length 2n, as desired.
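As a sanity check on this approach, the number of well-formed strings for n pairs is the nth Catalan number, C(2n, n) / (n + 1), giving counts 1, 2, 5, 14, 42, ... A standalone sketch of the same algorithm, verified against that formula:

```python
from math import comb
from typing import List

def generate_parenthesis(n: int) -> List[str]:
    # Standalone version of the backtracking solution above.
    def backtrack(curr: List[str], left: int, right: int) -> None:
        if len(curr) == 2 * n:
            valid.append("".join(curr))
            return
        if left < n:      # a '(' can always be tried while left < n
            curr.append('(')
            backtrack(curr, left + 1, right)
            curr.pop()
        if left > right:  # a ')' is only valid once it has a partner
            curr.append(')')
            backtrack(curr, left, right + 1)
            curr.pop()

    valid: List[str] = []
    backtrack([], 0, 0)
    return valid

# The number of well-formed strings follows the Catalan numbers.
for n in range(1, 6):
    assert len(generate_parenthesis(n)) == comb(2 * n, n) // (n + 1)

print(generate_parenthesis(3))
# ['((()))', '(()())', '(())()', '()(())', '()()()']
```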

The LeetCode editorial shows how sketching out a tree of possibilities is very useful for this problem; its illustration walks through the beginning of the case where n = 2.

We can start to see how the logic of the solution above makes sense. For the sake of completeness and concreteness, consider the input of the first example on LeetCode, n = 3, and the corresponding desired output:

["((()))","(()())","(())()","()(())","()()()"]

We can make our own diagram that shows how each solution is built (the x indicates that a potential solution path is no longer pursued since further work on that path cannot possibly lead to a correct answer):

""
├── "("
│   ├── "(("
│   │   ├── "((("
│   │   │   ├── x
│   │   │   └── "((()"
│   │   │       ├── x
│   │   │       └── "((())"
│   │   │           ├── x
│   │   │           └── "((()))"
│   │   └── "(()"
│   │       ├── "(()("
│   │       │   ├── x
│   │       │   └── "(()()"
│   │       │       ├── x
│   │       │       └── "(()())"
│   │       └── "(())"
│   │           ├── "(())("
│   │           │   ├── x
│   │           │   └── "(())()"
│   │           └── x
│   └── "()"
│       ├── "()("
│       │   ├── "()(("
│       │   │   ├── x
│       │   │   └── "()(()"
│       │   │       ├── x
│       │   │       └── "()(())"
│       │   └── "()()"
│       │       ├── "()()("
│       │       │   ├── x
│       │       │   └── "()()()"
│       │       └── x
│       └── x
└── x

As a note of reference, a centered rendering of the tree above can be generated using the binarytree package:

from binarytree import build2

my_tree = build2([
    '"',
    "(", "x",
    "((", "()", None, None,
    "(((", "(()", "()(", "x",
    "x", "((()", "(()(", "(())", "()((", "()()", None, None,
    None, None, "x", "((())", "x", "(()()", "(())(", "x", "x", "()(()", "()()(", "x",
    None, None, "x", "((()))", None, None, "x", "(()())", "x", "(())()", None, None, None, None, "x", "()(())", "x", "()()()",
])

root = my_tree.levelorder[0]
print(root)
LC 967. Numbers With Same Consecutive Differences (✓) ★★

Return all non-negative integers of length n such that the absolute difference between every two consecutive digits is k.

Note that every number in the answer must not have leading zeros. For example, 01 has one leading zero and is invalid.

You may return the answer in any order.


class Solution:
    def numsSameConsecDiff(self, n: int, k: int) -> List[int]:
        def digits_to_int(digits):
            ans = digits[0]
            for i in range(1, len(digits)):
                ans = (ans * 10) + digits[i]
            return ans

        def backtrack(digit_arr):
            if len(digit_arr) == n:
                res.append(digits_to_int(digit_arr))
                return

            for num in range(10):
                if abs(digit_arr[-1] - num) == k:
                    digit_arr.append(num)
                    backtrack(digit_arr)
                    digit_arr.pop()

        res = []
        for start_digit in range(1, 10):
            backtrack([start_digit])

        return res

The impossibility of leading zeros can complicate things if we're not careful. We still need to consider the digit 0 as part of potential constraint-satisfying integers. An easy way to handle this is to completely prevent 0 from being a possible leading digit at the outset. Execute the backtrack function for digit arrays that begin with each number 1 through 9, inclusive, and append complete solutions to an overall results array, res, that we ultimately return. It's easiest to manage the digits if we build each integer using a digits array and then return the actual integer once the digits array represents a constraint-satisfying integer.
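As a quick sanity check of the seeding idea, the same logic in free-function form (names are mine) run on n = 2, k = 1:

```python
from typing import List

def nums_same_consec_diff(n: int, k: int) -> List[int]:
    res = []

    def backtrack(digit_arr):
        if len(digit_arr) == n:
            # collapse the digit array into an actual integer
            num = 0
            for d in digit_arr:
                num = num * 10 + d
            res.append(num)
            return
        for d in range(10):                # 0 is allowed in non-leading positions
            if abs(digit_arr[-1] - d) == k:
                digit_arr.append(d)
                backtrack(digit_arr)
                digit_arr.pop()

    for start_digit in range(1, 10):       # seed with 1-9 only: no leading zeros
        backtrack([start_digit])
    return res

print(nums_same_consec_diff(2, 1))
# → [10, 12, 21, 23, 32, 34, 43, 45, 54, 56, 65, 67, 76, 78, 87, 89, 98]
```

Notice that 10 appears (0 is a valid second digit) but 01 never can, since 0 is never a seed.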

LC 216. Combination Sum III (✓)

Find all valid combinations of k numbers that sum up to n such that the following conditions are true:

  • Only numbers 1 through 9 are used.
  • Each number is used at most once.

Return a list of all possible valid combinations. The list must not contain the same combination twice, and the combinations may be returned in any order.


class Solution:
    def combinationSum3(self, k: int, n: int) -> List[List[int]]:
        def backtrack(nums_arr, curr_sum, next_num):
            if len(nums_arr) == k and curr_sum == n:
                res.append(nums_arr[:])
                return

            for num in range(next_num, 10):
                if curr_sum + num <= n:
                    nums_arr.append(num)
                    backtrack(nums_arr, curr_sum + num, num + 1)
                    nums_arr.pop()

        res = []
        backtrack([], 0, 1)

        return res

The valid combinations can only involve positive integers, which means the sum we're building can only get bigger. Hence, we should only consider combinations of integers for which the sum does not exceed n. Also, the numbers used as part of the combination need to be unique, which means that once we add a certain number, we should only ever consider larger integer values for subsequent positions.
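For concreteness, here is the same approach in free-function form (names are mine) run on k = 3, n = 9:

```python
from typing import List

def combination_sum_3(k: int, n: int) -> List[List[int]]:
    res = []

    def backtrack(nums_arr, curr_sum, next_num):
        if len(nums_arr) == k and curr_sum == n:
            res.append(nums_arr[:])
            return
        for num in range(next_num, 10):   # only larger numbers: uniqueness for free
            if curr_sum + num <= n:       # sums only grow, so prune overshoots
                nums_arr.append(num)
                backtrack(nums_arr, curr_sum + num, num + 1)
                nums_arr.pop()

    backtrack([], 0, 1)
    return res

print(combination_sum_3(3, 9))
# → [[1, 2, 6], [1, 3, 5], [2, 3, 4]]
```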

Array

Remarks

TBD

def binary_search(arr, target):
    left = 0              # starting index of search range (inclusive)
    right = len(arr) - 1  # ending index of search range (inclusive)
    result = -1           # result of -1 indicates target has not been found yet

    while left <= right:                  # continue search while range is valid
        mid = left + (right - left) // 2  # prevent potential overflow
        if arr[mid] < target:             # target is in right half
            left = mid + 1                # (move `left` to narrow search range to right half)
        elif arr[mid] > target:           # target is in left half
            right = mid - 1               # (move `right` to narrow search range to left half)
        else:                             # target found (i.e., arr[mid] == target)
            result = mid                  # store index where target is found (early return, if desired)
            right = mid - 1               # narrow search range to left half to find first occurrence
            # left = mid + 1              # narrow search range to right half to find last occurrence

    if result != -1:
        return result  # target was found; return its index
    else:
        return left    # target was not found; return insertion point to maintain sorted order

# NOTES:
# Only one of `right = mid - 1` and `left = mid + 1` should be active in the final else clause:
# - `right = mid - 1` active (`left = mid + 1` commented out) searches for the first occurrence of target
# - `left = mid + 1` active (`right = mid - 1` commented out) searches for the last occurrence of target
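For reference, Python's standard library already implements both variants: `bisect_left` returns the first-occurrence/insertion index and `bisect_right` returns the index just past the last occurrence.

```python
from bisect import bisect_left, bisect_right

arr = [1, 2, 2, 2, 3]

# first occurrence of 2 (same effect as the `right = mid - 1` variant above)
assert bisect_left(arr, 2) == 1

# last occurrence of 2 is one before bisect_right's insertion point
assert bisect_right(arr, 2) - 1 == 3

# for a missing target, bisect_left gives the insertion point (like returning `left`)
assert bisect_left(arr, 4) == 5
```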
Examples
LC 704. Binary Search

Given an array of integers nums which is sorted in ascending order, and an integer target, write a function to search target in nums. If target exists, then return its index. Otherwise, return -1.


class Solution:
    def search(self, nums: List[int], target: int) -> int:
        left = 0
        right = len(nums) - 1

        while left <= right:
            mid = left + (right - left) // 2
            if target < nums[mid]:    # target in left half, move right boundary
                right = mid - 1
            elif target > nums[mid]:  # target in right half, move left boundary
                left = mid + 1
            else:
                return mid            # target at current mid position, return

        return -1

The code above is the template for a basic binary search. We're guaranteed that the numbers in nums are unique, which means target, if it exists, will be found and the first index at which it occurs (and only index) will be returned.

LC 74. Search a 2D Matrix

Write an efficient algorithm that searches for a value in an m x n matrix. This matrix has the following properties:

  • Integers in each row are sorted from left to right.
  • The first integer of each row is greater than the last integer of the previous row.

class Solution:
    def searchMatrix(self, matrix: List[List[int]], target: int) -> bool:
        m = len(matrix)
        n = len(matrix[0])

        left = 0
        right = m * n - 1

        while left <= right:
            mid = left + (right - left) // 2
            val = matrix[mid // n][mid % n]
            if target < val:
                right = mid - 1
            elif target > val:
                left = mid + 1
            else:
                return True

        return False

This problem is very similar to the following classic binary search problem: LC 704. Binary Search. The main difference is that in this problem we basically need to view the 2D array matrix as virtually flattened into a 1D array with index values idx bounded by 0 <= idx <= m * n - 1. The key part of the solution above is, of course, the following line: val = matrix[mid // n][mid % n]. This lets us seamlessly find a cell value in the virtually flattened matrix by converting a given index value to the appropriate row and column in the original input matrix. We could make this more explicit by abstracting away the logic in the line above into its own function:

def index_to_cell_val(idx):
    row = idx // n
    col = idx % n
    return matrix[row][col]

The idea is that we can binary search on the virtually flattened 1D array.
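A quick check of the index arithmetic on a small matrix (values chosen for illustration):

```python
matrix = [
    [1,  3,  5,  7],
    [10, 11, 16, 20],
    [23, 30, 34, 60],
]
n = len(matrix[0])  # 4 columns

# virtual index 6 should map to row 1, col 2
assert (6 // n, 6 % n) == (1, 2)
assert matrix[6 // n][6 % n] == 16

# the virtually flattened array is just the rows concatenated
flat = [matrix[i // n][i % n] for i in range(len(matrix) * n)]
assert flat == [1, 3, 5, 7, 10, 11, 16, 20, 23, 30, 34, 60]
```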

LC 35. Search Insert Position (✓)

Given a sorted array of distinct integers and a target value, return the index if the target is found. If not, return the index where it would be if it were inserted in order.


class Solution:
    def searchInsert(self, nums: List[int], target: int) -> int:
        left = 0
        right = len(nums) - 1

        while left <= right:
            mid = left + (right - left) // 2
            if target < nums[mid]:
                right = mid - 1
            elif target > nums[mid]:
                left = mid + 1
            else:
                return mid

        return left

Distinctness of the input integers in nums ensures returning mid will be the index of target if it's found; otherwise, we return left, which will be the index where we should insert target in order to keep a sorted array.

LC 633. Sum of Square Numbers

Given a non-negative integer c, decide whether there're two integers a and b such that a^2 + b^2 = c.


class Solution:
    def judgeSquareSum(self, c: int) -> bool:
        def binary_search(left, right, target):
            while left <= right:
                b = left + (right - left) // 2
                if target < b * b:
                    right = b - 1
                elif target > b * b:
                    left = b + 1
                else:
                    return True
            return False

        a = 0
        while a * a <= c:
            b_squared = c - a * a
            if binary_search(0, b_squared, b_squared):
                return True
            a += 1

        return False

Binary search can crop up in all sorts of unexpected places. This is one of them. The idea is that we iteratively search the space [0, c - a^2] for a value of b such that b^2 == c - a^2, so that a^2 + b^2 == c, as desired. The funkier part of the binary search solution is what the usual mid denotes in the binary search itself, namely the b-value we're looking for such that b^2 == target, where target == c - a^2. Hence, when we make adjustments to the left or right endpoints, we're actually comparing the target value against b * b, where b takes the role of the normal mid value. If the target is ever neither less than b * b nor greater than b * b, then we've found an integer b-value that satisfies the equation (and we've done so using binary search!).
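As an aside (not part of the solution above), the inner binary search can be swapped for `math.isqrt`, which computes the integer square root directly:

```python
import math

def judge_square_sum(c: int) -> bool:
    a = 0
    while a * a <= c:
        b = math.isqrt(c - a * a)   # largest integer b with b*b <= c - a*a
        if a * a + b * b == c:
            return True
        a += 1
    return False

assert judge_square_sum(5) is True    # 1^2 + 2^2
assert judge_square_sum(4) is True    # 0^2 + 2^2
assert judge_square_sum(3) is False
```

The binary search version is essentially a hand-rolled `isqrt` plus an exactness check.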

LC 2300. Successful Pairs of Spells and Potions

You are given two positive integer arrays spells and potions, of length n and m, respectively, where spells[i] represents the strength of the ith spell and potions[j] represents the strength of the jth potion.

You are also given an integer success. A spell and potion pair is considered successful if the product of their strengths is at least success.

Return an integer array pairs of length n where pairs[i] is the number of potions that will form a successful pair with the ith spell.


class Solution:
    def successfulPairs(self, spells: List[int], potions: List[int], success: int) -> List[int]:
        def successful_potions(spell_strength):
            # need spell * potion >= success (i.e., potion >= threshold)
            threshold = success / spell_strength

            left = 0
            right = len(potions)

            while left < right:
                mid = left + (right - left) // 2

                if threshold <= potions[mid]:
                    right = mid
                else:
                    left = mid + 1

            return len(potions) - left

        potions.sort()
        return [successful_potions(spell) for spell in spells]

For any potion to actually be successful, it must be greater than or equal to success / spell for any given spell. Since we process potions for each spell in spells, this suggests we should pre-process potions by sorting. Then we can conduct a binary search (where the target is equal to success / spell for a given spell) to determine where the insertion point would need to be in order for a new potion to be successful. We want everything to the right of that point. Hence, if len(potions) is the length of the potions array and we determine that the leftmost insertion point should occur at i, then we want everything in [i, len(potions) - 1]; that is, (len(potions) - 1) - i + 1 == len(potions) - i elements, where the + 1 reflects inclusivity of i. An example will make this clear.

Let's say we sort potions and have potions = [1, 2, 3, 4, 5], and success = 7. We have a spell with a strength of 3. To form a successful pair, we need a potion with a strength of at least 7 / 3 = 2.3333. If we do a binary search for this value on potions, we will find an insertion index of 2. Every potion on this index and to the right can form a successful pair. There are 3 indices in total (the potions with strength 3, 4, 5). In general, if there are m potions, the final index is m - 1. If the insertion index is i, then the range [i, m - 1] has a size of (m - 1) - i + 1 = m - i.
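The worked example above maps directly onto `bisect_left`:

```python
from bisect import bisect_left

potions = [1, 2, 3, 4, 5]   # already sorted
success = 7
spell = 3

threshold = success / spell          # 2.333...; any potion >= this works
i = bisect_left(potions, threshold)  # leftmost insertion point
assert i == 2
assert len(potions) - i == 3         # potions 3, 4, 5 succeed
```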

LC 2389. Longest Subsequence With Limited Sum (✓)

You are given an integer array nums of length n, and an integer array queries of length m.

Return an array answer of length m where answer[i] is the maximum size of a subsequence that you can take from nums such that the sum of its elements is less than or equal to queries[i].

A subsequence is an array that can be derived from another array by deleting some or no elements without changing the order of the remaining elements.


class Solution:
    def answerQueries(self, nums: List[int], queries: List[int]) -> List[int]:
        def binary_search(query):
            left = 0
            right = len(nums)

            while left < right:
                mid = left + (right - left) // 2
                if query < nums[mid]:
                    right = mid
                else:
                    left = mid + 1

            return left

        nums.sort()
        for i in range(1, len(nums)):
            nums[i] += nums[i - 1]

        return [binary_search(query) for query in queries]

This problem invites us to dust off our knowledge of prefix sums — because that's really what we need to effectively answer this problem. We need to sort the input nums, create a prefix sum (either mutate the input directly, as above, or create a new array), and then conduct a binary search on the prefix sum where each time we try to find what would need to be the rightmost insertion point if we were to add query to the prefix sum array.

Why does this work? Suppose we had the prefix sum [0, 1, 3, 5, 5, 7, 9], and the query value we were given was 5. Where would we need to insert 5 in the prefix sum above to maintain sorted order so that 5 was as far right as possible? It would need to be at index i == 5 (right after the other two 5 values). This means the original numbers in nums responsible for the [0, 1, 3, 5, 5] part of the prefix sum can all be kept so that their sum is less than or equal to the query value 5, giving a subsequence of size 5. That is why the solution above works.
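The hand-worked example above can be checked with `bisect_right`:

```python
from bisect import bisect_right

prefix = [0, 1, 3, 5, 5, 7, 9]
query = 5

# rightmost insertion point for 5: just after the existing 5s
assert bisect_right(prefix, query) == 5
```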

Solution space

Remarks

As noted in a LeetCode editorial on binary search on solution spaces, we need a few conditions to be met in order to effectively conduct our search:

  1. Possibility/condition/check/feasible function can execute in roughly O(n) time — we can quickly, in O(n) or better, verify if the task is possible for a given threshold value, threshold; that is, we define a function, possible(threshold), that returns a boolean that indicates if the given task is possible or impossible when given the specific threshold value.
  2. Max/min characteristic when task is possible given the specific threshold value — if the task is possible for a number threshold and we are looking for
  • a maximum, then it is also possible for all numbers less than threshold.
  • a minimum, then it is also possible for all numbers greater than threshold.
  3. Max/min characteristic when task is impossible given the specific threshold value — if the task is impossible for a number threshold and we are looking for
  • a maximum, then it is also impossible for all numbers greater than threshold.
  • a minimum, then it is also impossible for all numbers less than threshold.

The above depictions can be somewhat difficult to envision at first so it can be helpful to draw out a very simple outline as if we're on a number line from 0 to infinity, left to right, as demonstrated below.

Looking for a maximum threshold:

Example use case (illegal parking): Maximize time spent parked illegally without getting a ticket. Under various conditions (e.g., parking enforcers, location, etc.), we can imagine this being possible for a certain amount of time before it becomes impossible. We'd like to maximize the POSSIBLE amount of time we do not have to worry about getting a ticket before it becomes IMPOSSIBLE to avoid getting a ticket:

Problem is asking for a maximum
 -----------------------
| Possible | Impossible |
 -----------------------
0          ^          ...inf
   (threshold binary searched for)

As can be seen above, given a threshold amount of time, our task of going undetected when parked illegally is

  • possible for all numbers less than or equal to threshold
  • impossible for all numbers greater than threshold

Looking for a minimum threshold:

Example use case (mandatory online trainings): Minimize time spent on a mandatory online training page before clicking to continue without arousing suspicion. Many online training requirements are modules that are "click-through" in nature, where an employee must complete the module but should not "click to continue" until a sufficient amount of time has elapsed to indicate the employee has possibly consumed all of the information on the page. The goal is to minimize the amount of time spent on any given page. We can imagine this being impossible for a certain amount of time before it becomes possible. We'd like to minimize the POSSIBLE amount of time we are required to be on any given training page, where it is IMPOSSIBLE to avoid doing so until a certain amount of time has elapsed:

Problem is asking for a minimum
 -----------------------
| Impossible | Possible |
 -----------------------
0            ^        ...inf
   (threshold binary searched for)

As can be seen above, given a threshold amount of time, our task of having to remain on a given training page before being allowed to continue making progress through the training is

  • impossible for all numbers less than threshold
  • possible for all numbers greater than or equal to threshold

TAKEAWAY:

  • Minimum: When searching to minimize a value on a solution space, our goal is to find a value, threshold, where the condition we're testing for is possible (i.e., possible(threshold) returns true) and threshold is minimized within the region of possibilities. Specifically, if we let l and r represent the smallest and largest possible solutions in the solution space, respectively, then we're essentially searching for the threshold value, say x, between l and r such that possible(x) returns true but any smaller value of x, say x - ε, results in possible(x - ε) returning false. We can use our previous illustration to capture this:

    l             x         r
     -----------------------
    | Impossible | Possible |
     -----------------------
  • Maximum: When searching to maximize a value on a solution space, our goal is to find a value, threshold, where the condition we're testing for is possible (i.e., possible(threshold) returns true) and threshold is maximized within the region of possibilities. Specifically, if we let l and r represent the smallest and largest possible solutions in the solution space, respectively, then we're essentially searching for the threshold value, say x, between l and r such that possible(x) returns true but any larger value of x, say x + ε, results in possible(x + ε) returning false. We can use our previous illustration to capture this:

    l        x              r
     -----------------------
    | Possible | Impossible |
     -----------------------

The template below makes everything discussed above feasible:

def binary_search_sol_space(arr):
    def possible(threshold):
        # this function is implemented depending on the problem
        return BOOLEAN

    left = MINIMUM_POSSIBLE_ANSWER   # minimum possible value in solution space (inclusive)
    right = MAXIMUM_POSSIBLE_ANSWER  # maximum possible value in solution space (inclusive)
    result = -1                      # desired result (-1 to indicate no valid value found yet)

    while left <= right:             # continue search while range is valid
        mid = left + (right - left) // 2
        if possible(mid):
            result = mid             # mid satisfies condition; update result
            right = mid - 1          # adjust right to find smaller valid value (minimization)
        else:
            left = mid + 1           # mid doesn't satisfy condition; search right half
        # IMPORTANT: swap `right = mid - 1` and `left = mid + 1`
        # if looking to maximize valid value (i.e., instead of minimize)

    return result                    # return best value found satisfying condition

Above, left, right, result stand for l, r, x, respectively, in regards to the notation we used previously to visualize the solution space on which we are binary searching. A few things worth noting about the template above:

  • Line 13: This is where result is updated. Note how result is only updated once possible(mid) is true for some mid value; that is, if what we're looking to minimize or maximize is actually not possible, then result will never be updated, and a value of -1 will be returned to indicate no valid value was found.
  • Lines 14 and 16: Whether or not these lines should be swapped depends on whether the problem at hand is a minimization (no swap) or maximization (swap) problem. Specifically, the template above, in its default state, is set up for minimization problems. Why? Because once a valid mid value is found, we narrow the search space to the left with right = mid - 1, which corresponds to trying to find a smaller valid value (minimization). On the other hand, if we're trying to maximize the valid values we find, then we need to narrow the search space to the right with left = mid + 1, which corresponds to trying to find a larger valid value (maximization).

Thankfully, the template above is quite similar to the template for binary searching on arrays, which means less effort needs to be devoted to memorization, and more time can be spent on understanding.

Examples
LC 875. Koko Eating Bananas (✓)

Koko loves to eat bananas. There are n piles of bananas, the ith pile has piles[i] bananas. The guards have gone and will come back in h hours.

Koko can decide her bananas-per-hour eating speed of k. Each hour, she chooses some pile of bananas and eats k bananas from that pile. If the pile has less than k bananas, she eats all of them instead and will not eat any more bananas during this hour.

Koko likes to eat slowly but still wants to finish eating all the bananas before the guards return.

Return the minimum integer k such that she can eat all the bananas within h hours.


class Solution:
    def minEatingSpeed(self, piles: List[int], h: int) -> int:
        def possible(speed):
            hours_spent = 0
            for banana in piles:
                hours_spent += -(banana // -speed)
                if hours_spent > h:
                    return False
            return True

        left = 1
        right = max(piles)

        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1

        return left

The idea is for Koko to go as slowly as possible while still being able to eat all bananas within h hours. Our goal is to find the speed at which Koko can eat all bananas but as soon as we decrease the speed it becomes impossible (that will give us the minimized speed for which eating all bananas is possible).

Hence, we can binary search on the solution space, where the solution space comprises speeds in the range [min_possible_speed, max_possible_speed], inclusive. What would make sense as a minimum possible speed? The speed needs to be a positive integer and it clearly can't be 0; hence, the minimum possible speed is 1, so we set left = 1. What about the maximum possible speed? Each pile can be consumed within a single hour if the speed is the size of the pile with the greatest number of bananas; hence, the maximum possible speed we should account for is max(piles), so we set right = max(piles).

All that is left to do now is to greedily search for the leftmost value in the solution space that satisfies the possible constraint.
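One small idiom worth flagging from the solution above: `-(banana // -speed)` is ceiling division done purely with integers, equivalent to `math.ceil(banana / speed)` but immune to floating-point precision issues for very large values:

```python
import math

# -(a // -b) == ceil(a / b): floor division of the negation rounds toward -inf,
# so negating again rounds the original quotient up
for a, b in [(7, 3), (6, 3), (1, 4), (10, 10)]:
    assert -(a // -b) == math.ceil(a / b) == (a + b - 1) // b

# e.g., a pile of 7 bananas at speed 3 takes ceil(7/3) = 3 hours
assert -(7 // -3) == 3
```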

LC 1631. Path With Minimum Effort (✓) ★★

You are a hiker preparing for an upcoming hike. You are given heights, a 2D array of size rows x columns, where heights[row][col] represents the height of cell (row, col). You are situated in the top-left cell, (0, 0), and you hope to travel to the bottom-right cell, (rows-1, columns-1) (i.e., 0-indexed). You can move up, down, left, or right, and you wish to find a route that requires the minimum effort.

A route's effort is the maximum absolute difference in heights between two consecutive cells of the route.

Return the minimum effort required to travel from the top-left cell to the bottom-right cell.


class Solution:
    def minimumEffortPath(self, heights: List[List[int]]) -> int:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n

        def possible(route_max_abs_diff):
            seen = {(0, 0)}
            def dfs(row, col):
                if row == m - 1 and col == n - 1:
                    return True

                for dr, dc in dirs:
                    next_row, next_col = row + dr, col + dc
                    if valid(next_row, next_col) and (next_row, next_col) not in seen:
                        cell_val = heights[row][col]
                        adjacent_cell_val = heights[next_row][next_col]
                        if abs(adjacent_cell_val - cell_val) <= route_max_abs_diff:
                            seen.add((next_row, next_col))
                            if dfs(next_row, next_col):
                                return True

                return False

            return dfs(0, 0)

        m = len(heights)
        n = len(heights[0])
        dirs = [(1, 0), (-1, 0), (0, 1), (0, -1)]

        left = 0
        right = max([element for row in heights for element in row]) - 1

        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1

        return left

This is a great problem for a variety of reasons. It can be somewhat tricky at first though. The idea is that if we can find a path from the top left to the bottom right using a certain amount of effort, then we can certainly find a path that works for any greater amount of effort (i.e., effort + 1, effort + 2, etc.). But can we find a valid path using less effort? That's the real question.

If we let our solution space be the amount of effort required for a valid path, then we can binary search from the minimum possible amount of effort to the maximum possible amount of effort, inclusive. The goal is to find the effort amount for a valid path where any effort less than that would not result in a valid path.

What are the min/max effort bounds? The minimum possible effort is 0 because all of the numbers in the path could be the same. What about the maximum possible effort? That would be the maximum value in the matrix minus the minimum possible cell value (which the constraints guarantee is at least 1):

max([element for row in heights for element in row]) - 1

We could also observe the constraint 1 <= heights[i][j] <= 10^6, which means either of the following options would be valid for the problem at hand:

# option 1
left = 0
right = max([element for row in heights for element in row]) - 1

# option 2
left = 0
right = 10 ** 6 - 1

The first option is obviously more costly in some respects, but the second option could result in a maximum boundary that is much larger than what we need. Since binary search is so fast, the second option is really not an issue though.

Finally, it should be noted that a stack-based DFS is also quite effective here to avoid the space overhead required by the call stack to deal with recursion:

class Solution:
    def minimumEffortPath(self, heights: List[List[int]]) -> int:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n

        def possible(route_max_abs_diff):
            seen = {(0, 0)}
            def dfs(start_row, start_col):
                stack = [(start_row, start_col)]
                while stack:
                    row, col = stack.pop()
                    if row == m - 1 and col == n - 1:
                        return True

                    for dr, dc in dirs:
                        next_row, next_col = row + dr, col + dc
                        if valid(next_row, next_col) and (next_row, next_col) not in seen:
                            cell_val = heights[row][col]
                            next_cell_val = heights[next_row][next_col]
                            if abs(next_cell_val - cell_val) <= route_max_abs_diff:
                                seen.add((next_row, next_col))
                                stack.append((next_row, next_col))

                return False

            return dfs(0, 0)

        m = len(heights)
        n = len(heights[0])
        dirs = [(1, 0), (-1, 0), (0, 1), (0, -1)]

        left = 0
        right = max([element for row in heights for element in row]) - 1

        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1

        return left
LC 1870. Minimum Speed to Arrive on Time (✓)

You are given a floating-point number hour, representing the amount of time you have to reach the office. To commute to the office, you must take n trains in sequential order. You are also given an integer array dist of length n, where dist[i] describes the distance (in kilometers) of the ith train ride.

Each train can only depart at an integer hour, so you may need to wait in between each train ride.

  • For example, if the 1st train ride takes 1.5 hours, you must wait for an additional 0.5 hours before you can depart on the 2nd train ride at the 2 hour mark.

Return the minimum positive integer speed (in kilometers per hour) that all the trains must travel at for you to reach the office on time, or -1 if it is impossible to be on time.

Tests are generated such that the answer will not exceed 10^7 and hour will have at most two digits after the decimal point.


class Solution:
    def minSpeedOnTime(self, dist: List[int], hour: float) -> int:
        if len(dist) > -(hour // -1):  # more trains than ceil(hour) hours: impossible
            return -1

        def possible(speed):
            hours_spent = 0
            for i in range(len(dist) - 1):
                distance = dist[i]
                hours_spent += -(distance // -speed)
                if hours_spent > hour:
                    return False

            hours_spent += dist[-1] / speed  # last ride is not rounded up
            return hours_spent <= hour

        left = 1
        right = 10 ** 7

        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1

        return left

This one is a bit wonky to reason about at first, in part because of how the elapsed time is evaluated. It's similar to LC 875. Koko Eating Bananas in that we always take the ceiling of time evaluations, but here the time spent on the last train is not rounded up (we rounded up the time spent on every pile in the Koko eating bananas problem).

When will it be impossible to reach the office on time? It's not when len(dist) > hour, as LeetCode's second example with input dist = [1,3,2], hour = 2.7 shows. However, it will be impossible if len(dist) > ceil(hour). The third example, with input dist = [1,3,2], hour = 1.9, illustrates this: the earliest the third train can depart is at the 2 hour mark.

The idea is to conduct a binary search on the range of speeds [min_speed_possible, max_speed_possible] for which we'll be able to make the trip on time (i.e., where the total number of hours spent is less than or equal to hour), where we want to minimize the speed required to make the trip on time. If we can make the trip on time for a given speed, then we can definitely make the trip on time if we increase the speed to speed + 1. We want to determine when making it on time is possible for speed but impossible for any valid speed value less than this. Binary search it is!

What's the minimum possible speed? We're told the speed reported must be a positive integer so we set left = 1. What about the maximum possible speed? We're told that the answer will not exceed 10 ** 7; hence, we set right = 10 ** 7.

LC 1283. Find the Smallest Divisor Given a Threshold (✓)

Given an array of integers nums and an integer threshold, we will choose a positive integer divisor, divide all the array by it, and sum the division's result. Find the smallest divisor such that the result mentioned above is less than or equal to threshold.

Each result of the division is rounded to the nearest integer greater than or equal to that element. (For example: 7/3 = 3 and 10/2 = 5).

It is guaranteed that there will be an answer.


class Solution:
    def smallestDivisor(self, nums: List[int], threshold: int) -> int:
        def possible(divisor):
            running_sum = 0
            for num in nums:
                running_sum += -(num // -divisor)
                if running_sum > threshold:
                    return False
            return True

        left = 1
        right = max(nums)

        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1

        return left

This problem is quite similar to LC 875. Koko Eating Bananas. Our solution space is a range of possible divisors, and our goal is to minimize the divisor so that the running sum obtained by dividing each number in nums by divisor (and performing the subsequent rounding up to the nearest integer) never exceeds threshold.

If we can achieve this task for divisor, then we can definitely achieve the same task by increasing the value of divisor (e.g., divisor + 1). Our goal, then, is to find a divisor such that the task is possible but as soon as we decrease the value of divisor the task becomes impossible.

What would the minimum divisor be for our solution space? We're told it must be a positive integer; hence, we set left = 1. What about the maximum divisor? Since each division result gets rounded up to the nearest integer, the smallest the running sum could be would occur if we chose the divisor to be the maximum value in nums. Then no division would result in a value greater than 1, and since we're told nums.length <= threshold <= 10^6, we let right = max(nums).
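The expression -(num // -divisor) used in the solution above is integer ceiling division without any floating-point arithmetic: negating the numerator turns Python's floor division into a ceiling. A quick sketch (the helper name ceil_div is illustrative, not part of the solution):

```python
def ceil_div(num, divisor):
    # -(a // -b) == ceil(a / b) for positive integers, because Python's
    # floor division rounds toward negative infinity
    return -(num // -divisor)

print(ceil_div(7, 3))   # matches the problem's example: 7/3 -> 3
print(ceil_div(10, 2))  # 10/2 -> 5
print(ceil_div(1, 4))   # 1/4 -> 1
```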

LC 410. Split Array Largest Sum (✓) ★★★

Given an array nums which consists of non-negative integers and an integer m, you can split the array into m non-empty continuous subarrays.

Write an algorithm to minimize the largest sum among these m subarrays.


class Solution:
    def splitArray(self, nums: List[int], k: int) -> int:
        def possible(max_sum):
            num_subarrays = 0
            subarray_sum = 0
            idx = 0

            while idx < len(nums):
                if nums[idx] > max_sum:
                    return False

                subarray_sum += nums[idx]
                if subarray_sum > max_sum:
                    subarray_sum = nums[idx]
                    num_subarrays += 1
                    if num_subarrays > k:
                        return False

                idx += 1

            return (num_subarrays + 1) <= k

        left = 0
        right = sum(nums)

        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1

        return left

This problem is excellent and somewhat similar to LC 1231. Divide Chocolate in nature. The tip-off that the problem may involve binary searching on a solution space is the fact that if a maximum subarray sum of subarray_sum works as a solution to the problem, then any larger value of subarray_sum will certainly work as well. Our goal is to find a subarray sum that, when decreased at all, results in not being able to fulfill the requirements of the problem. All of this implies we should be able to conduct a binary search on the solution space of minimum subarray sum values.

What would the smallest possible subarray sum value be? Since values of 0 are allowed in nums, we should set left = 0. What about the largest possible subarray sum value? No subarray can have a larger sum than the entire array; hence, we set right = sum(nums).

The harder part for this problem is designing the possible function effectively. The intuition is that we construct the k subarray sums in a greedy fashion, where we keep adding to one of the k subarrays until the given max_sum has been exceeded, at which point we move on to constructing the next subarray sum. If the number of subarrays we must use to accomplish this ever exceeds k, then we issue an early return of false. If, however, we process all values and the total number of subarrays used is less than or equal to k, then we return true because k will always be less than or equal to nums.length per the constraint 1 <= k <= min(50, nums.length); that is, if somehow we've processed all values in nums and have never exceeded max_sum for a subarray sum and the number of subarrays used is much smaller than k, then we can simply distribute the values in the subarrays to fill up the remaining empty subarrays until the total number of subarrays equals k (the sum of the subarrays from which values are borrowed can only decrease due to the constraint 0 <= nums[i] <= 10^6).
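The greedy feasibility check can be exercised on its own. Below is a standalone sketch of it (the free function signature is illustrative; the solution above embeds the same logic as a closure):

```python
def possible(nums, k, max_sum):
    # greedily pack values into the current subarray until adding one more
    # would exceed max_sum, then start a new subarray with that value
    num_subarrays = 0   # completed subarrays so far
    subarray_sum = 0
    for num in nums:
        if num > max_sum:        # a single element already exceeds the cap
            return False
        subarray_sum += num
        if subarray_sum > max_sum:
            subarray_sum = num   # current value starts the next subarray
            num_subarrays += 1
            if num_subarrays > k:
                return False
    return (num_subarrays + 1) <= k

# nums = [7,2,5,10,8], k = 2: the optimal split is [7,2,5] and [10,8],
# whose largest sum is 18, so 18 is feasible but 17 is not
print(possible([7, 2, 5, 10, 8], 2, 18))  # True
print(possible([7, 2, 5, 10, 8], 2, 17))  # False
```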

LC 1482. Minimum Number of Days to Make m Bouquets ★★

You are given an integer array bloomDay, an integer m, and an integer k.

We need to make m bouquets. To make a bouquet, you need to use k adjacent flowers from the garden.

The garden consists of n flowers, where the ith flower will bloom on day bloomDay[i] and can then be used in exactly one bouquet.

Return the minimum number of days you need to wait to be able to make m bouquets from the garden. If it is impossible to make m bouquets return -1.


class Solution:
    def minDays(self, bloomDay: List[int], m: int, k: int) -> int:
        def possible(days):
            bouquets_formed = flower_count = 0

            for flower in bloomDay:
                if flower <= days:
                    flower_count += 1
                else:
                    flower_count = 0

                if flower_count == k:
                    bouquets_formed += 1
                    flower_count = 0
                    if bouquets_formed == m:
                        return True

            return False

        # not enough flowers for required number of bouquets
        if len(bloomDay) < (m * k):
            return -1

        left = min(bloomDay)
        right = max(bloomDay)
        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1

        return left

As usual, the primary difficulty in this problem is identifying it as having a nice binary search solution. The idea is to binary search on the solution space where the solution space is identified as being binary searchable as follows: if I can form m bouquets in d days, then I can definitely form m bouquets in > d days. We want to find the minimum number for d such that trying to form m bouquets in any fewer days is impossible. We can binary search for that number of days, as shown above.
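The feasibility check is the greedy heart of the solution: count maximal runs of adjacent bloomed flowers, harvesting a bouquet every time a run reaches length k. A standalone sketch (the free-function form and argument order are illustrative):

```python
def possible(bloom_day, m, k, days):
    # count runs of adjacent flowers that have bloomed by day `days`;
    # every run of length k yields one bouquet and resets the run
    bouquets = flowers = 0
    for bloom in bloom_day:
        if bloom <= days:
            flowers += 1
            if flowers == k:
                bouquets += 1
                flowers = 0
                if bouquets == m:
                    return True
        else:
            flowers = 0  # run broken by an unbloomed flower
    return False

# bloomDay = [1,10,3,10,2], m = 3, k = 1: three single-flower bouquets
# are first possible on day 3 (flowers blooming on days 1, 2, and 3)
print(possible([1, 10, 3, 10, 2], 3, 1, 3))  # True
print(possible([1, 10, 3, 10, 2], 3, 1, 2))  # False
```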

LC 1231. Divide Chocolate (✓) ★★★

You have one chocolate bar that consists of some chunks. Each chunk has its own sweetness given by the array sweetness.

You want to share the chocolate with your K friends so you start cutting the chocolate bar into K+1 pieces using K cuts, each piece consists of some consecutive chunks.

Being generous, you will eat the piece with the minimum total sweetness and give the other pieces to your friends.

Find the maximum total sweetness of the piece you can get by cutting the chocolate bar optimally.


class Solution:
    def maximizeSweetness(self, sweetness: List[int], k: int) -> int:
        def possible(min_sweetness):
            pieces = 0
            piece_sweetness = 0
            for chunk_sweetness in sweetness:
                piece_sweetness += chunk_sweetness
                if piece_sweetness >= min_sweetness:
                    pieces += 1
                    piece_sweetness = 0
                    if pieces == k + 1:
                        return True
            return False

        left = 1
        right = sum(sweetness) + 1

        while left < right:
            mid = left + (right - left) // 2
            if not possible(mid):
                right = mid
            else:
                left = mid + 1

        return left - 1

This is such a fantastic problem, but it is quite difficult. As usual (for binary search problems on solution spaces anyway), constructing the possible function is where much of the difficulty lies. Our goal is to ensure we can actually come up with k + 1 pieces of chocolate to distribute where each piece meets or exceeds the required min_sweetness threshold. If we can do that, then we're in business. But, of course, as the problem indicates, we want to maximize the minimum sweetness of our own piece of chocolate (since all other pieces distributed must have the same or more sweetness compared to our own).

Hence, we need to binary search on a range of possible minimum sweetness values. What would the smallest possible sweetness be? We're told from the constraint that every chunk of the chocolate has a sweetness of at least 1; hence, we set left = 1. What about the largest possible sweetness? Note that k == 0 is possible, which means there's a possibility where we need to share the chocolate bar with no one — in such a case, we would want to consume all of the sweetness, sum(sweetness). But since we're binary searching a solution space for a maximum value, we need to set right = sum(sweetness) + 1 as opposed to right = sum(sweetness). Why? Because we might miss the maximum value in an off-by-one error otherwise; for example, consider the input sweetness = [5,5], k = 0. The while loop terminates once left == right and right == sum(sweetness), but we return left - 1, which is equal to 10 - 1 == 9 instead of the obviously correct answer of 10.
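The maximization pattern described above (exclusive right endpoint, search for the first infeasible value, return left - 1) can be exercised as a standalone sketch, including the sweetness = [5,5], k = 0 case that motivates the + 1 on the right endpoint:

```python
def maximize_min_sweetness(sweetness, k):
    # binary search for the largest min_sweetness for which we can still
    # cut k + 1 pieces, each with total sweetness >= min_sweetness
    def possible(min_sweetness):
        pieces = piece = 0
        for chunk in sweetness:
            piece += chunk
            if piece >= min_sweetness:
                pieces += 1
                piece = 0
                if pieces == k + 1:
                    return True
        return False

    left, right = 1, sum(sweetness) + 1  # right endpoint is exclusive
    while left < right:
        mid = left + (right - left) // 2
        if not possible(mid):
            right = mid
        else:
            left = mid + 1
    return left - 1  # last value for which possible() held

print(maximize_min_sweetness([5, 5], 0))  # 10 (we eat the whole bar)
```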

LC 1552. Magnetic Force Between Two Balls ★★

In universe Earth C-137, Rick discovered a special form of magnetic force between two balls if they are put in his new invented basket. Rick has n empty baskets; the ith basket is at position[i]. Morty has m balls and needs to distribute the balls into the baskets such that the minimum magnetic force between any two balls is maximum.

Rick stated that magnetic force between two different balls at positions x and y is |x - y|.

Given the integer array position and the integer m, return the required force.


class Solution:
    def maxDistance(self, position: List[int], m: int) -> int:
        def possible(min_mag_force):
            balls_placed = 1
            prev_ball_pos = position[0]

            for idx in range(1, len(position)):
                curr_ball_pos = position[idx]
                if curr_ball_pos - prev_ball_pos >= min_mag_force:
                    balls_placed += 1
                    prev_ball_pos = curr_ball_pos

                    if balls_placed == m:
                        return True

            return False

        position.sort()

        left = 1
        right = (position[-1] - position[0]) + 1
        while left < right:
            mid = left + (right - left) // 2
            if not possible(mid):
                right = mid
            else:
                left = mid + 1

        return left - 1

The problem description is one of the hardest things about this problem. But after fighting to understand it, our thinking will gradually start to resemble what's included in the hints and point us to binary search on the solution space as a good strategy:

Hint 1: If you can place balls such that the answer is x,
then you can do it for y where y < x.

Hint 2: Similarly, if you cannot place balls such that
the answer is x, then you cannot do it for y where y > x.

Hint 3: Binary search on the answer and greedily see if it is possible.

This problem is quite similar to LC 1231. Divide Chocolate in many ways, but instead of trying to maximize the sweetness of our least sweet piece of chocolate amongst k friends, we are trying to maximize the magnetic force between the least magnetically attracted pair of balls. The idea is to binary search on possible answer values for the minimum magnetic force required and to maximize that value as much as possible.

Hence, our first task is to build a possible function to determine whether or not the task at hand is possible for some given magnetic force, and we greedily try to determine the possibility; that is, we place a ball whenever the magnetic force provided has been met or exceeded (this ensures the magnetic force provided is, indeed, minimum). All balls placed must have at least min_mag_force magnetic force between them. Our goal is to maximize that value.

The smallest possible magnetic force would be when we have n positions and m == n balls, where the positions are all spaced a single unit apart. That would give us a magnetic force of 1, meaning we should set left = 1. The largest possible magnetic force would be the last position value minus the first position value (assuming the position array to be sorted at that point). Since the right endpoint needs to be included and we're always returning left - 1, we should set right = (position[-1] - position[0]) + 1.
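Putting the greedy check and the maximization search together gives the following standalone sketch (the function name is illustrative; it mirrors the Solution class above):

```python
def max_min_force(position, m):
    # maximize the minimum pairwise distance when placing m balls
    position = sorted(position)

    def possible(min_force):
        # greedily place each ball at the first basket that is at least
        # min_force away from the previously placed ball
        balls_placed = 1
        prev = position[0]
        for pos in position[1:]:
            if pos - prev >= min_force:
                balls_placed += 1
                prev = pos
                if balls_placed == m:
                    return True
        return False

    left, right = 1, (position[-1] - position[0]) + 1  # exclusive right end
    while left < right:
        mid = left + (right - left) // 2
        if not possible(mid):
            right = mid
        else:
            left = mid + 1
    return left - 1

# baskets at 1,2,3,4,7 with 3 balls: place at 1, 4, 7 for a min force of 3
print(max_min_force([1, 2, 3, 4, 7], 3))  # 3
```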

Dynamic programming

Remarks

TBD

Memoization
def fn(arr):
    # 1. define a function that will compute/contain
    # the answer to the problem for any given state
    def dp(STATE):
        # 3. use base cases to make the recurrence relation useful
        if BASE_CASE:
            return 0

        if STATE in memo:
            return memo[STATE]

        # 2. define a recurrence relation to transition between states
        ans = RECURRENCE_RELATION(STATE)
        memo[STATE] = ans
        return ans

    memo = {}
    return dp(STATE_FOR_WHOLE_INPUT)
Tabulation
def fn(arr):
    # 1. initialize a table (array, list, etc.)
    # to store solutions of subproblems.
    dp_table = INITIALIZE_TABLE()

    # 2. fill the base cases into the table.
    dp_table = FILL_BASE_CASES(dp_table)

    # 3. iterate over the table in a specific order
    # to fill in the solutions of larger subproblems.
    for STATE in ORDER_OF_STATES:
        dp_table[STATE] = CALCULATE_STATE_FROM_PREVIOUS_STATES(dp_table, STATE)

    # 4. the answer to the whole problem is now in the table,
    # typically at the last entry or a specific position.
    return dp_table[FINAL_STATE_OR_POSITION]

# example usage
arr = [INPUT_DATA]
result = fn(arr)
Examples
LC 746. Min Cost Climbing Stairs

You are given an integer array cost where cost[i] is the cost of ith step on a staircase. Once you pay the cost, you can either climb one or two steps.

You can either start from the step with index 0, or the step with index 1.

Return the minimum cost to reach the top of the floor.


class Solution:
    def minCostClimbingStairs(self, cost: List[int]) -> int:
        def dp(step):
            if step in memo:
                return memo[step]

            step_cost = min(dp(step - 1) + cost[step - 1], dp(step - 2) + cost[step - 2])
            memo[step] = step_cost

            return step_cost

        memo = dict()
        memo[0] = memo[1] = 0
        return dp(len(cost))

The solution above may be the simplest, but there are multiple ways of going about this problem. What follows is mostly meant for potential use in the future.

This problem highlights how we can interpret the same problem in two fundamentally different ways and still end up with a correct answer. Specifically, we may interpret the line

You can either start from the step with index 0, or the step with index 1.

from the problem stem in two notable ways:

  1. We start after step 0, and we have to reach step n (i.e., we consider the top of the floor to be the step beyond the last step, n - 1), where the cost of each step is taken into account once we have departed from that step. In this sense, step 0 and step 1 both cost 0 because we're told we can start from the step with index 0 or the step with index 1 — it costs nothing to get to that step, and the cost of that step is only considered once we've left it.
  2. We start before step 0, and we have to reach the last step, step n - 1, where the cost of each step is taken into account once landed on. In this sense, choosing to go to step 0 at the beginning means it costs cost[0] to do so; similarly, choosing to go to step 1 instead means it costs cost[1] to do so. The goal, then, is to minimize the cost it takes to get to either step n - 2 or n - 1 because once we get to either of those steps and that step's cost is taken into account, we can reach the top of the floor, as desired.

Which interpretation we choose to go with ultimately does not matter in terms of the correctness of our result, but the differences will manifest themselves in our implementation(s).

Generally speaking, a DP solution may be implemented top-down with memoization or bottom-up with tabulation. Furthermore, the recurrence used to characterize how subproblems are broken down into smaller and smaller subproblems may be either a backward recurrence (i.e., the usual approach where index values decrease) or a forward recurrence (i.e., more of a chronological approach, where index values increase).
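For contrast with the top-down solution above, here is a sketch of the same problem solved bottom-up with tabulation, using the first interpretation (the top of the floor is step n, and steps 0 and 1 cost nothing to reach):

```python
def min_cost_climbing_stairs(cost):
    # bottom-up tabulation: dp[i] is the minimum cost to reach step i,
    # where "step n" (one past the last index) is the top of the floor
    n = len(cost)
    dp = [0] * (n + 1)  # base cases dp[0] = dp[1] = 0 come for free
    for step in range(2, n + 1):
        # arrive from one step below or two steps below, paying that
        # step's cost upon departure
        dp[step] = min(dp[step - 1] + cost[step - 1],
                       dp[step - 2] + cost[step - 2])
    return dp[n]

print(min_cost_climbing_stairs([10, 15, 20]))  # 15 (start at step 1, jump to top)
```

The forward iteration order works because dp[step] only depends on the two smaller states dp[step - 1] and dp[step - 2], which are already filled in.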

Graphs

BFS

Remarks

Assume the nodes are numbered from 0 to n - 1 and the graph is given as an adjacency list. Depending on the problem, you may need to convert the input into an equivalent adjacency list before using the templates.

Basic motivation, concepts, and considerations for BFS

The following observations are largely derived from this video. The underlying graph referred to for the rest of this note is assumed to be represented as an adjacency list of index arrays.

The basic idea behind BFS is that you have a starting vertex and you explore vertices in order of how many links they are away from it: first explore the starting vertex, then all vertices one link away (i.e., incident edges involving the start vertex and its immediate neighbors), then all vertices two links away, and so forth, until you explore all vertices reachable from the start vertex. That's it. We can go ahead and come up with some basic pseudocode to model how we might do this.

Basic implementation of BFS
ResetGraph()
    for v in V
        v.discovered = False

BFS(startVertex)
    ResetGraph()
    startVertex.discovered = True
    Q = new Queue()
    Q.enqueue(startVertex)
    while(not Q.isEmpty())
        u = Q.dequeue()
        for each v such that u -> v
            if(not v.discovered)
                v.discovered = True
                Q.enqueue(v)

Running the algorithm above implies a specific "breadth first search tree". The root is the start node, and if one node discovers another node (i.e., u -> v), then it's that node's parent in the tree (i.e., u is considered to be v's parent in the tree). We can imagine that, for the same graph, we could get a different breadth first search tree based on whatever node we picked as the start node (it can get even wilder if we pick several start nodes, something that happens when executing a multi-source BFS).

A more detailed implementation of BFS would be as follows, where we not only create the breadth first search tree but also store the number of links needed to reach each vertex.

Full implementation of BFS
ResetGraph()
    for v in V
        v.discovered = False
        v.dist = +inf
        v.pi = nil

BFS(startVertex)
    ResetGraph()
    startVertex.discovered = True
    startVertex.dist = 0
    Q = new Queue()
    Q.enqueue(startVertex)
    while(not Q.isEmpty())
        u = Q.dequeue()
        for each v such that u -> v
            if(not v.discovered)
                v.discovered = True
                v.dist = u.dist + 1
                v.pi = u
                Q.enqueue(v)

Of course, for the scheme above to work, assume there's space to store extra information with each vertex, v, namely the distance needed to reach the vertex as well as the predecessor parent for that vertex, v.dist and v.pi, respectively. This information can be associated with the vertex itself so that we end up representing a vertex basically as a list, where each slot represents, say, the vertex index, the distance to that vertex, and then the predecessor for that vertex: [v_index, v_dist, v_pred]. These items then get enqueued, dequeued, and updated accordingly. But it's more common to leave the vertex indices to themselves and to instead initialize dist and pred arrays of length n, where n represents the total number of vertices in the graph (remember we're assuming the underlying graph is an adjacency list of index arrays, where nodes are labeled according to their index).

Certainly an observation worth making is that vertices with a smaller distance value must be dequeued and processed before any vertex with a greater distance value is dequeued and processed. Predecessor values are recorded in part to aid in the process of potentially reconstructing the shortest path from the start vertex to one of its reachable vertices.
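In Python, with the graph as an adjacency list of index arrays, the dist and pred bookkeeping might look like the following sketch (using dist == +inf itself as the "undiscovered" marker, so no separate discovered array is needed):

```python
from collections import deque

def bfs(graph, start):
    # graph: adjacency list where graph[i] lists the neighbors of node i
    n = len(graph)
    dist = [float('inf')] * n   # dist[v]: number of links from start to v
    pred = [-1] * n             # pred[v]: v's parent in the BFS tree
    dist[start] = 0
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if dist[neighbor] == float('inf'):  # undiscovered
                dist[neighbor] = dist[node] + 1
                pred[neighbor] = node
                queue.append(neighbor)
    return dist, pred

# 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3: node 3 is two links from node 0
dist, pred = bfs([[1, 2], [3], [3], []], 0)
print(dist)  # [0, 1, 1, 2]
print(pred)  # [-1, 0, 0, 1]
```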

Reconstructing the shortest path

This can be done recursively as follows:

Reconstruct shortest path from start vertex to target vertex (recursive)
FindPath(s, t)
    if (s == t)
        return new List().add(t)
    if (t.pi == nil) # no path exists
        return nil
    return FindPath(s, t.pi).add(t)

Or iteratively (perhaps more intuitively), where the implementation below assumes we've maintained a pred array where pred[x] houses the predecessor for node x if it exists or is -1 otherwise:

Reconstruct shortest path from start vertex to target vertex (iterative)
ReconstructPath(s, t, pred)
    path = new Stack()
    path.push(t)
    while path.peek() != s:
        if pred[path.peek()] == -1:
            return None # no path exists from s to t
        path.push(pred[path.peek()])
    reverse path # obtain original path direction from s to t
    return path

In CLRS (i.e., [23]), the PrintPath procedure is provided to simply print the path from the vertex s to the vertex v (i.e., as opposed to really reconstructing the path):

PrintPath procedure from CLRS to reconstruct shortest path
PrintPath(G, s, v)
    if v == s:
        print s
    else if v.pi == nil:
        print "no path from" s "to" v "exists"
    else PrintPath(G, s, v.pi)
        print v

We can use Python to actually implement the PrintPath procedure but in a way where we actually obtain the path:

Recursively reconstruct shortest path (Python)
def shortest_path(s, t, path, pred):
    if s == t:
        path.append(s)
        return path
    elif pred[t] == -1:
        return [] # shortest path from s to t does not exist
    else:
        path.append(t)
        return shortest_path(s, pred[t], path, pred)

In most cases, however, we would probably be more inclined to reconstruct the path in an iterative fashion, where the code is more readable and we avoid some of the overhead associated with recursion:

Iteratively reconstruct shortest path (Python)
def shortest_path(s, t, pred):
    path = [t]
    while path[-1] != s:
        parent = pred[path[-1]]
        if parent == -1:
            return [] # shortest path from s to t does not exist
        path.append(parent)
    path.reverse()
    return path

If we look at CLRS, we will note that there's only a minor change from the "full implementation" algorithm above and how it appears in CLRS: instead of marking vertices as discovered or undiscovered, they use three colors:

  • White: Undiscovered
  • Gray: Discovered but not yet finished (i.e., when it's in the queue waiting to be dequeued)
  • Black: Finished (i.e., dequeued and all unvisited child nodes enqueued)

That's it:

Full implementation of BFS (CLRS style)
ResetGraph()
    for v in V
        v.color = White
        v.dist = +inf
        v.pi = nil

BFS(startVertex)
    ResetGraph()
    startVertex.color = Gray
    startVertex.dist = 0
    Q = new Queue()
    Q.enqueue(startVertex)
    while(not Q.isEmpty())
        u = Q.dequeue()
        for each v such that u -> v
            if(v.color == White)
                v.color = Gray
                v.dist = u.dist + 1
                v.pi = u
                Q.enqueue(v)
        u.color = Black
from collections import deque

def fn(graph):
    queue = deque([START_NODE])
    seen = {START_NODE}
    ans = 0

    while queue:
        node = queue.popleft()
        # do some logic
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)

    return ans
Examples
LC 1091. Shortest Path in Binary Matrix (✓)

Given an n x n binary matrix grid, return the length of the shortest clear path in the matrix. If there is no clear path, return -1.

A clear path in a binary matrix is a path from the top-left cell (i.e., (0, 0)) to the bottom-right cell (i.e., (n - 1, n - 1)) such that:

  • All the visited cells of the path are 0.
  • All the adjacent cells of the path are 8-directionally connected (i.e., they are different and they share an edge or a corner).

The length of a clear path is the number of visited cells of this path.


class Solution:
    def shortestPathBinaryMatrix(self, grid: List[List[int]]) -> int:
        def valid(row, col):
            return 0 <= row < n and 0 <= col < n and grid[row][col] == 0

        dirs = [(1,0),(1,1),(1,-1),(-1,0),(-1,1),(-1,-1),(0,-1),(0,1)]
        n = len(grid)
        seen = {(0,0)}

        if grid[0][0] != 0 or grid[n-1][n-1] != 0:
            return -1

        queue = deque([(0,0,1)])
        while queue:
            row, col, path_length = queue.popleft()
            if row == n - 1 and col == n - 1:
                return path_length

            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                next_node = (next_row, next_col)
                if valid(*next_node) and next_node not in seen:
                    queue.append((*next_node, path_length + 1))
                    seen.add(next_node)

        return -1

It's fairly conventional in BFS solutions for graphs to encode with each node additional information like the current level for that node or some other kind of stateful data. We do not need to encode anything other than each node's position in the seen set because whenever we encounter a node it will be in the fewest steps possible (i.e., the trademark of BFS solutions ... finding shortest paths).

LC 863. All Nodes Distance K in Binary Tree (✓)

We are given a binary tree (with root node root), a target node, and an integer value K.

Return a list of the values of all nodes that have a distance K from the target node. The answer can be returned in any order.


class Solution:
    def distanceK(self, root: TreeNode, target: TreeNode, k: int) -> List[int]:
        def build_parent_lookup(node, parent = None):
            if not node:
                return

            parent_lookup[node] = parent
            build_parent_lookup(node.left, node)
            build_parent_lookup(node.right, node)

        parent_lookup = dict()
        build_parent_lookup(root)
        seen = {target}

        queue = deque([target])
        for _ in range(k):
            num_nodes_this_level = len(queue)
            for _ in range(num_nodes_this_level):
                node = queue.popleft()
                for neighbor in [node.left, node.right, parent_lookup[node]]:
                    if neighbor and neighbor not in seen:
                        seen.add(neighbor)
                        queue.append(neighbor)

        return [ node.val for node in queue ]

The key in the solution above is to recognize that a BFS traversal will give us exactly what we want if we have some way to reference each node's parent node. The build_parent_lookup function, which uses DFS to build a hashmap lookup for each node's parent node, gives us this. Much of the rest of the problem then becomes a standard BFS traversal.

LC 542. 01 Matrix (✓)

Given a matrix that consists of 0s and 1s, find the distance to the nearest 0 for each cell.

The distance between two adjacent cells is 1.


class Solution:
    def updateMatrix(self, mat: List[List[int]]) -> List[List[int]]:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n and mat[row][col] == 1

        m = len(mat)
        n = len(mat[0])
        res = [[0] * n for _ in range(m)]
        dirs = [(-1,0),(1,0),(0,1),(0,-1)]
        seen = set()
        queue = deque()

        for i in range(m):
            for j in range(n):
                if mat[i][j] == 0:
                    seen.add((i, j))
                    queue.append((i, j, 0))

        while queue:
            row, col, dist = queue.popleft()
            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                next_node = (next_row, next_col)
                if valid(*next_node) and next_node not in seen:
                    res[next_row][next_col] = dist + 1
                    seen.add(next_node)
                    queue.append((*next_node, dist + 1))

        return res

The solution above takes advantage of a so-called multi-source BFS (i.e., a BFS traversal with multiple starting sources). The solution also takes advantage of the fact that cells with a value of 0 do not need to be updated; hence, our BFS can start from all nodes with a value of 0 and explore outwards, updating non-zero nodes (i.e., just nodes with value 1 for this problem) with the distance so far from a node with value 0. Our intuition tells us to start from the nodes with value 1, but it's much easier to start from the nodes with value 0.

Additionally, in the valid function above, the condition and mat[row][col] == 1 is not necessary since all nodes with value 0 are added to the seen set before exploring outwards, which means all subsequent nodes we'll explore that are both valid and not in seen will have a non-zero value (i.e., 1 for this problem). This conditional of the valid function is only kept for the sake of clarity, but it's worth noting that it's not necessary here.

LC 1293. Shortest Path in a Grid with Obstacles Elimination (✓) ★★

Given an m * n grid, where each cell is either 0 (empty) or 1 (obstacle). In one step, you can move up, down, left, or right from and to an empty cell.

Return the minimum number of steps to walk from the upper left corner (0, 0) to the lower right corner (m-1, n-1) given that you can eliminate at most k obstacles. If it is not possible to find such walk return -1.


class Solution:
    def shortestPath(self, grid: List[List[int]], k: int) -> int:
        # ensures the next node to be visited is in bounds
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n

        m = len(grid)
        n = len(grid[0])
        dirs = [(1,0),(-1,0),(0,1),(0,-1)]
        seen = {(0,0,k)}
        queue = deque([(0,0,k,0)])

        while queue:
            row, col, rem, steps = queue.popleft()

            # only valid nodes exist in the queue
            if row == m - 1 and col == n - 1:
                return steps

            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                next_node = (next_row, next_col)

                if valid(*next_node):
                    next_val = grid[next_row][next_col]

                    # if the next value is not an obstacle, then proceed with visits as normal
                    if next_val == 0:
                        if (*next_node, rem) not in seen:
                            seen.add((*next_node, rem))
                            queue.append((*next_node, rem, steps + 1))
                    # the next value is an obstacle: can we still remove obstacles? if so, proceed with visits
                    else:
                        if rem > 0 and (*next_node, rem - 1) not in seen:
                            seen.add((*next_node, rem - 1))
                            queue.append((*next_node, rem - 1, steps + 1))

        return -1

This is an excellent problem for thinking through how a node's state should be recorded in the seen set; that is, the majority of BFS and DFS traversals on matrix graphs simply record a node's position (i.e., row and column) because the node's position fully describes the state we do not want to visit again. But for some problems, like this one, it's helpful to record more information than just a node's position. Specifically, the state we do not want to visit more than once is a node's position in addition to the number of remaining obstacles we can remove.

Thinking about using the seen set to record states we do not want to visit multiple times is much more accurate and reflective of our actual goal — only perform computation when absolutely necessary.

LC 1129. Shortest Path with Alternating Colors (✓) ★★

Consider a directed graph, with nodes labelled 0, 1, ..., n-1. In this graph, each edge is either red or blue, and there could be self-edges or parallel edges.

Each [i, j] in red_edges denotes a red directed edge from node i to node j. Similarly, each [i, j] in blue_edges denotes a blue directed edge from node i to node j.

Return an array answer of length n, where each answer[X] is the length of the shortest path from node 0 to node X such that the edge colors alternate along the path (or -1 if such a path doesn't exist).


class Solution:
    def shortestAlternatingPaths(self, n: int, redEdges: List[List[int]], blueEdges: List[List[int]]) -> List[int]:
        def build_graph(edge_arr):
            graph = defaultdict(list)
            for node, neighbor in edge_arr:
                graph[node].append(neighbor)
            return graph

        RED_GRAPH = build_graph(redEdges)
        BLUE_GRAPH = build_graph(blueEdges)
        RED = 1
        BLUE = 0

        ans = [float('inf')] * n
        seen = {(0,RED), (0,BLUE)}
        queue = deque([(0,RED,0),(0,BLUE,0)])

        while queue:
            node, color, steps = queue.popleft()
            ans[node] = min(ans[node], steps)
            alt_color = 1 - color
            graph = RED_GRAPH if alt_color == 1 else BLUE_GRAPH
            for neighbor in graph[node]:
                if (neighbor, alt_color) not in seen:
                    seen.add((neighbor, alt_color))
                    queue.append((neighbor, alt_color, steps + 1))

        return [ val if val != float('inf') else -1 for val in ans ]

This is such a great problem in so many ways. The idea is to execute a "semi-multi-source" BFS, where we start with node 0 as if it's red as well as if it's blue. Then we expand outwards.

We also take advantage of a nice numerical trick: 1 - 1 == 0, and 1 - 0 == 1. This allows us to effectively (and efficiently) alternate between colors.

LC 1926. Nearest Exit from Entrance in Maze (✓)

You are given an m x n matrix maze (0-indexed) with empty cells (represented as '.') and walls (represented as '+'). You are also given the entrance of the maze, where entrance = [entrancerow, entrancecol] denotes the row and column of the cell you are initially standing at.

In one step, you can move one cell up, down, left, or right. You cannot step into a cell with a wall, and you cannot step outside the maze. Your goal is to find the nearest exit from the entrance. An exit is defined as an empty cell that is at the border of the maze. The entrance does not count as an exit.

Return the number of steps in the shortest path from the entrance to the nearest exit, or -1 if no such path exists.


class Solution:
    def nearestExit(self, maze: List[List[str]], entrance: List[int]) -> int:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n and maze[row][col] == '.'

        m = len(maze)
        n = len(maze[0])
        dirs = [(1,0),(-1,0),(0,1),(0,-1)]
        start_node = tuple(entrance)
        seen = {start_node}
        queue = deque([(*start_node, 0)])

        exit_rows = {0, m - 1}
        exit_cols = {0, n - 1}

        while queue:
            row, col, moves = queue.popleft()
            if row in exit_rows or col in exit_cols:
                if row != start_node[0] or col != start_node[1]:
                    return moves

            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                next_node = (next_row, next_col)
                if valid(*next_node) and next_node not in seen:
                    seen.add(next_node)
                    queue.append((*next_node, moves + 1))

        return -1

There are several variables to initialize before the traversal proper begins, and that is okay.

LC 909. Snakes and Ladders (✓)

On an N x N board, the numbers from 1 to N*N are written boustrophedonically starting from the bottom left of the board, and alternating direction each row. For example, for a 6 x 6 board, the numbers are written as follows:

You start on square 1 of the board (which is always in the last row and first column). Each move, starting from square x, consists of the following:

  • You choose a destination square S with number x+1, x+2, x+3, x+4, x+5, or x+6, provided this number is <= N*N.
  • (This choice simulates the result of a standard 6-sided die roll: ie., there are always at most 6 destinations, regardless of the size of the board.)
  • If S has a snake or ladder, you move to the destination of that snake or ladder. Otherwise, you move to S.

A board square on row r and column c has a "snake or ladder" if board[r][c] != -1. The destination of that snake or ladder is board[r][c].

Note that you only take a snake or ladder at most once per move: if the destination to a snake or ladder is the start of another snake or ladder, you do not continue moving. (For example, if the board is [[4,-1],[-1,3]], and on the first move your destination square is 2, then you finish your first move at 3, because you do not continue moving to 4.)

Return the least number of moves required to reach square N*N. If it is not possible, return -1.


class Solution:
    def snakesAndLadders(self, board: List[List[int]]) -> int:
        def valid(row, col):
            return 0 <= row < n and 0 <= col < n

        def label_to_cell(num):
            row = (num - 1) // n
            col = (num - 1) % n
            if row % 2 == 1:
                col = (n - 1) - col
            return [(n - 1) - row, col]

        n = len(board)
        seen = {1}
        queue = deque([(1,0)])
        while queue:
            label, moves = queue.popleft()

            if label == n ** 2:
                return moves

            for next_label in range(label + 1, min(label + 6, n ** 2) + 1):
                row, col = label_to_cell(next_label)
                if valid(row, col) and next_label not in seen:
                    seen.add(next_label)
                    if board[row][col] != -1:
                        queue.append((board[row][col], moves + 1))
                    else:
                        queue.append((next_label, moves + 1))

        return -1

This is a rough one, mostly because of the convoluted problem description. The trickiest part is coming up with a performant way of converting from label to cell:

def label_to_cell(num):
    row = (num - 1) // n
    col = (num - 1) % n
    if row % 2 == 1:
        col = (n - 1) - col
    return [(n - 1) - row, col]

Coming up with the function above takes some patience and experimentation at first. Another challenge is the mental shift away from recording node positions in the seen set, as we usually do; in this problem, we only care about which node labels we have previously seen. As we explore neighbors, we add each label to the seen set, but what we add to the queue depends on whether or not we've encountered a snake or ladder.
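One way to build confidence in the conversion is to check a few labels by hand on a 6 x 6 board. Here is a small sketch (with n passed explicitly rather than closed over, and a tuple return, which are my own tweaks):

```python
def label_to_cell(num, n):
    # same conversion as above, with the board size passed in explicitly
    row = (num - 1) // n
    col = (num - 1) % n
    if row % 2 == 1:          # odd rows (counting from the bottom) run right to left
        col = (n - 1) - col
    return ((n - 1) - row, col)

# 6 x 6 board: label 1 is the bottom-left cell, labels snake upward
assert label_to_cell(1, 6) == (5, 0)    # bottom-left corner
assert label_to_cell(6, 6) == (5, 5)    # end of the bottom row
assert label_to_cell(7, 6) == (4, 5)    # second row starts on the right
assert label_to_cell(36, 6) == (0, 0)   # last label ends at the top-left corner
```

The assertions mirror the boustrophedon layout described in the problem statement.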

LC 1376. Time Needed to Inform All Employees (✓)

A company has n employees with a unique ID for each employee from 0 to n - 1. The head of the company is the one with headID.

Each employee has one direct manager given in the manager array where manager[i] is the direct manager of the ith employee, manager[headID] = -1. Also, it is guaranteed that the subordination relationships have a tree structure.

The head of the company wants to inform all the company employees of an urgent piece of news. He will inform his direct subordinates, and they will inform their subordinates, and so on until all employees know about the urgent news.

The ith employee needs informTime[i] minutes to inform all of his direct subordinates (i.e., After informTime[i] minutes, all his direct subordinates can start spreading the news).

Return the number of minutes needed to inform all the employees about the urgent news.


class Solution:
    def numOfMinutes(self, n: int, headID: int, manager: List[int], informTime: List[int]) -> int:
        # treat the graph as directed (from managers to subordinates)
        def build_adj_list(manager_arr):
            graph = defaultdict(list)
            for emp in range(n):
                graph[manager_arr[emp]].append(emp)
            del graph[-1]
            return graph

        graph = build_adj_list(manager)
        minutes_needed = informTime[headID]
        seen = {headID}
        queue = deque([(headID, 0)]) # unique base case

        while queue:
            manager, time_so_far = queue.popleft()
            for subordinate in graph[manager]:
                minutes_needed = max(minutes_needed, time_so_far + informTime[manager])
                seen.add(subordinate)
                queue.append((subordinate, time_so_far + informTime[manager]))

        return minutes_needed

This seems like a mostly natural BFS problem, but one of the tricks involves effectively handling how queue is initialized. It's tempting to do queue = deque([(headID, informTime[headID])]), but this would be wrong when subordinates exist because we almost certainly end up overcounting (this is because time_so_far + informTime[manager] is the time so far for each subordinate node of a manager).

Sometimes BFS problems can be tricky because of how we handle queue initialization. This is definitely one of those problems.
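To make the initialization choice concrete, here is a standalone restatement of the same BFS (function and variable names are mine) run on two small inputs: a lone head with no reports, and a head who informs five direct reports in one minute.

```python
from collections import defaultdict, deque

def num_of_minutes(n, head_id, manager, inform_time):
    # standalone restatement of the BFS above
    graph = defaultdict(list)
    for emp in range(n):
        if manager[emp] != -1:
            graph[manager[emp]].append(emp)

    minutes_needed = inform_time[head_id]
    queue = deque([(head_id, 0)])  # seed with 0, not inform_time[head_id]
    while queue:
        boss, time_so_far = queue.popleft()
        for subordinate in graph[boss]:
            minutes_needed = max(minutes_needed, time_so_far + inform_time[boss])
            queue.append((subordinate, time_so_far + inform_time[boss]))
    return minutes_needed

assert num_of_minutes(1, 0, [-1], [0]) == 0
# head (id 2) informs five direct reports, each in 1 minute total
assert num_of_minutes(6, 2, [2, 2, -1, 2, 2, 2], [0, 0, 1, 0, 0, 0]) == 1
```

Seeding the queue with (head_id, inform_time[head_id]) instead would make the second case report 2 minutes, since each subordinate's arrival time already includes the boss's inform time.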

LC 994. Rotting Oranges (✓) ★★

You are given an m x n grid where each cell can have one of three values:

  • 0 representing an empty cell,
  • 1 representing a fresh orange, or
  • 2 representing a rotten orange.

Every minute, any fresh orange that is 4-directionally adjacent to a rotten orange becomes rotten.

Return the minimum number of minutes that must elapse until no cell has a fresh orange. If this is impossible, return -1.


class Solution:
    def orangesRotting(self, grid: List[List[int]]) -> int:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n and grid[row][col] == 1

        def find_fresh_and_rotten(mat):
            fresh = 0
            rotten = set()
            for i in range(m):
                for j in range(n):
                    if mat[i][j] == 1:
                        fresh += 1
                    elif mat[i][j] == 2:
                        rotten.add((i,j))
            return fresh, rotten

        m = len(grid)
        n = len(grid[0])
        dirs = [(1,0),(-1,0),(0,1),(0,-1)]
        fresh, seen = find_fresh_and_rotten(grid)

        if fresh == 0:
            return 0

        queue = deque([(*rotten, 0) for rotten in seen])
        while queue:
            row, col, mins = queue.popleft()
            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                if valid(next_row, next_col) and (next_row, next_col) not in seen:
                    seen.add((next_row, next_col))
                    fresh -= 1
                    if fresh == 0:
                        return mins + 1
                    queue.append((next_row, next_col, mins + 1))

        return -1

The solution editorial makes it clear there are several ways of going about solving this problem. The approach above is arguably more straightforward than the solutions provided in the editorial though.

The key idea is to pre-process the grid in order to find the total number of fresh oranges as well as where the rotten oranges are located (if no fresh oranges are found, then immediately return 0) — we then execute a multi-source BFS from each rotten orange, and we keep track of how many fresh oranges remain as each cell is processed (if the counter ever reaches 0, then we immediately return the number of minutes required at that point).
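The multi-source pattern generalizes beyond this problem: seed the queue with every source before the loop begins, and each cell's recorded distance automatically comes out as the distance to its nearest source. A minimal sketch on an open grid (function name and example are mine):

```python
from collections import deque

def nearest_source_dist(m, n, sources):
    # multi-source BFS on an open m x n grid: all sources start at distance 0
    dist = {src: 0 for src in sources}
    queue = deque(sources)
    while queue:
        row, col = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (row + dr, col + dc)
            if 0 <= nxt[0] < m and 0 <= nxt[1] < n and nxt not in dist:
                dist[nxt] = dist[(row, col)] + 1
                queue.append(nxt)
    return dist

dist = nearest_source_dist(3, 3, [(0, 0), (2, 2)])
assert dist[(1, 1)] == 2   # two steps from either corner
assert dist[(2, 0)] == 2   # ties are fine: both sources are two steps away
```

In the rotting-oranges solution, the rotten cells play the role of sources and the fresh counter replaces an explicit distance map.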

DFS (recursive)

Remarks

Assume the nodes are numbered from 0 to n - 1 and the graph is given as an adjacency list. Depending on the problem, you may need to convert the input into an equivalent adjacency list before using the templates.

Basic motivation, concepts, and considerations for DFS

The following observations are largely derived from this video. The underlying graph referred to for the rest of this note is assumed to be represented as an adjacency list of index arrays.

If you're looking at a graph from above, then breadth first search is pretty intuitive. It expands from your search vertex like a wave in a pond. But lots of times you're not looking at a graph from above — you're looking at it from within (e.g., you're exploring a maze, an area in a video game, etc.).

The simple implementation for DFS is somewhat similar to BFS:

Simple implementation of DFS
ResetGraph(G)
    for v in V
        v.discovered = False

DFSVertex(u)
    u.discovered = True
    for each v such that u -> v
        if (not v.discovered)
            DFSVertex(v)

DFS(G, startVertex)
    ResetGraph(G)
    DFSVertex(startVertex)

We start from a single vertex and then go to adjacent vertices. If the next vertex has several adjacent vertices, it doesn't matter — we just choose one of them and keep exploring, ignoring other vertices we could have explored along the way. We do this by making recursive calls to DFSVertex for each new vertex.

Like in BFS, if you think about which vertex discovers another, that implies a "depth first search tree", and we can similarly store parent values in a $\pi$ variable:

Simple implementation of DFS (with parent values)
ResetGraph(G)
    for v in V
        v.pi = nil
        v.discovered = False

DFSVertex(u)
    u.discovered = True
    for each v such that u -> v
        if (not v.discovered)
            v.pi = u
            DFSVertex(v)

DFS(G, startVertex)
    ResetGraph(G)
    DFSVertex(startVertex)

Similar to BFS, in DFS we can reconstruct paths from the root of the tree — but the paths are less interesting here because they might have more than the minimum number of links. So instead of counting links, we track when each vertex was discovered (i.e., reached) and finished (i.e., fully explored) with time stamps. Of course, we don't need actual time stamps — just a relative order will do:

Full implementation of DFS (with parent values and time stamps)
ResetGraph(G)
    for v in V
        v.pi = nil
        v.discovered = -1
        v.finished = -1
    time = 1

DFSVertex(u)
    u.discovered = time++
    for each v such that u -> v
        if (v.discovered < 0)
            v.pi = u
            DFSVertex(v)
    u.finished = time++

DFS(G, startVertex)
    ResetGraph(G)
    DFSVertex(startVertex)
CLRS pseudocode for DFS

It's probably worth reproducing the pseudocode from [23] to serve as a comparison point. The use of colors is similar to what was used for BFS:

  • White: Undiscovered
  • Gray: Discovered but not yet finished (i.e., it is actively being explored)
  • Black: Finished (i.e., it is done being explored)

With the above in mind, here is the DFS algorithm as it appears in CLRS:

CLRS implementation of DFS
DFS(G)
    for each vertex u in G.V
        u.color = White
        u.pi = nil
    time = 0
    for each vertex u in G.V
        if u.color == White
            DFSVisit(G, u)

DFSVisit(G, u)
    time = time + 1                   # white vertex has just been discovered
    u.d = time
    u.color = Gray
    for each vertex v in G.Adj[u]     # explore each edge (u, v)
        if v.color == White
            v.pi = u
            DFSVisit(G, v)
    time = time + 1
    u.f = time
    u.color = Black                   # blacken u; it is finished

Why would we want to track time stamps when executing a DFS? It turns out that DFS time stamps will be useful for other tasks like topological sorting. Additionally, for some algorithms, you want to run DFS from one vertex while for others you want to run it on the whole graph.

To do that, after initializing the graph once (i.e., with ResetGraph), run DFS on each previously undiscovered vertex; then, instead of getting a DFS tree rooted at one vertex (i.e., startVertex) that only includes vertices reachable from that vertex, we get a forest that will always contain all the graph's vertices.

In terms of our earlier pseudocode for the full implementation, this means changing the block

DFS(G, startVertex)
    ResetGraph(G)
    DFSVertex(startVertex)

to the following:

DFS(G)
    ResetGraph(G)
    for u in V
        if (u.discovered < 0)
            DFSVertex(u)

We then get the following as our updated full implementation:

Full implementation of DFS for entire graph
ResetGraph(G)
    for v in V
        v.pi = nil
        v.discovered = -1
        v.finished = -1
    time = 1

DFSVertex(u)
    u.discovered = time++
    for each v such that u -> v
        if (v.discovered < 0)
            v.pi = u
            DFSVertex(v)
    u.finished = time++

DFS(G)
    ResetGraph(G)
    for u in V
        if (u.discovered < 0)
            DFSVertex(u)

It's a useful exercise to consider an example graph and work out how everything above is represented and calculated.

Worked example

Let's consider an example of the algorithm above in action on the following graph:

We'll execute top-level searches on vertices in alphabetical order. Doing so yields the following completely searched graph, as illustrated in the linked video at the top of this note:

As can be seen above, A, E, and I are the roots of their own depth first search trees in this depth first search forest.

To reproduce the final figure above using code, we can first represent the graph by mapping vertices A through I to 0 through 8, inclusive. It also helps to define a lookup table for ease of reference:

graph = [
    [3],
    [0, 6],
    [1, 3, 6],
    [1, 5, 6],
    [7],
    [2],
    [],
    [3, 4, 5],
    [4],
]

lookup = {
    -1: ' ',
    0: 'A',
    1: 'B',
    2: 'C',
    3: 'D',
    4: 'E',
    5: 'F',
    6: 'G',
    7: 'H',
    8: 'I',
}

Now we can write the dfs function to execute a DFS on the entire graph — the inner visit function is where the DFSVertex method is implemented (a reformatting of the output is included at the bottom of the code):

def dfs(graph):
    n = len(graph)
    discovered = [-1] * n
    finished = [-1] * n
    pred = [-1] * n
    time = 0

    def visit(node):
        nonlocal time
        time += 1
        discovered[node] = time
        for nbr in graph[node]:
            if discovered[nbr] < 0:
                pred[nbr] = node
                visit(nbr)
        time += 1
        finished[node] = time

    for node in range(n):
        if discovered[node] < 0:
            visit(node)

    return discovered, finished, [lookup[parent] for parent in pred]

print(dfs(graph))

"""

( A B C D E F G H I
[ 1, 3, 8, 2, 13, 7, 4, 14, 17], # discovered times
[12, 6, 9, 11, 16, 10, 5, 15, 18], # finished times
[ , D, F, A, , D, B, E, ] # pi values
)

"""

The next two aspects of DFS we will consider, namely parenthesis notation and edge classification, are remarked on in greater detail in [23]. The relevant snippets from CLRS are reproduced below (it would likely be helpful to look at these snippets before looking at these topics in the context of the example we've been working through).

Parenthesis theorem (CLRS)

Depth-first search yields valuable information about the structure of a graph. Perhaps the most basic property of depth-first search is that the predecessor subgraph $G_\pi$ does indeed form a forest of trees, since the structure of the depth-first trees exactly mirrors the structure of recursive calls of DFS-VISIT. That is, $u = v.\pi$ if and only if $\text{DFS-Visit}(G, v)$ was called during a search of $u$'s adjacency list. Additionally, vertex $v$ is a descendant of vertex $u$ in the depth-first forest if and only if $v$ is discovered during the time in which $u$ is gray.

Another important property of depth-first search is that discovery and finish times have parenthesis structure. If the DFS-VISIT procedure were to print a left parenthesis "$(u$" when it discovers vertex $u$ and to print a right parenthesis "$u)$" when it finishes $u$, then the printed expression would be well formed in the sense that the parentheses are properly nested.

The following theorem provides another way to characterize the parenthesis structure.

In any depth-first search of a (directed or undirected) graph $G = (V, E)$, for any two vertices $u$ and $v$, exactly one of the following three conditions holds:

  • the intervals $[u.d, u.f]$ and $[v.d, v.f]$ are entirely disjoint, and neither $u$ nor $v$ is a descendant of the other in the depth-first forest,
  • the interval $[u.d, u.f]$ is contained entirely within the interval $[v.d, v.f]$, and $u$ is a descendant of $v$ in a depth-first tree, or
  • the interval $[v.d, v.f]$ is contained entirely within the interval $[u.d, u.f]$, and $v$ is a descendant of $u$ in a depth-first tree.
Edge classification (CLRS)

You can obtain important information about a graph by classifying its edges during a depth-first search. For example, Section 20.4 will show that a directed graph is acyclic if and only if a depth-first search yields no "back" edges (Lemma 20.11).

The depth-first forest $G_\pi$ produced by a depth-first search on graph $G$ can contain four types of edges:

  1. Tree edges are edges in the depth-first forest $G_\pi$. Edge $(u, v)$ is a tree edge if $v$ was first discovered by exploring edge $(u, v)$.
  2. Back edges are those edges $(u, v)$ connecting a vertex $u$ to an ancestor $v$ in a depth-first tree. We consider self-loops, which may occur in directed graphs, to be back edges.
  3. Forward edges are those nontree edges $(u, v)$ connecting a vertex $u$ to a proper descendant $v$ in a depth-first tree.
  4. Cross edges are all other edges. They can go between vertices in the same depth-first tree, as long as one vertex is not an ancestor of the other, or they can go between vertices in different depth-first trees.

The DFS algorithm has enough information to classify some edges as it encounters them. The key idea is that when an edge $(u, v)$ is first explored, the color of vertex $v$ says something about the edge:

  1. WHITE indicates a tree edge,
  2. GRAY indicates a back edge, and
  3. BLACK indicates a forward or cross edge.

The first case is immediate from the specification of the algorithm. For the second case, observe that the gray vertices always form a linear chain of descendants corresponding to the stack of active DFS-VISIT invocations. The number of gray vertices is 1 more than the depth in the depth-first forest of the vertex most recently discovered. Depth-first search always explores from the deepest gray vertex, so that an edge that reaches another gray vertex has reached an ancestor. The third case handles the remaining possibility.

The entire depth first search from the worked example above can be summarized nicely using parentheses:

An opening parenthesis stands for when the depth first search call is made on a vertex, and the closing parenthesis stands for when that call exits:

The parentheses are properly nested because the inner recursive calls must complete before the code that called them. A child will always be nested in its parent, and a vertex is only the child of at most one vertex, the one that discovered it.

If we just count the parentheses from the beginning, they will match the discovery and finish times:
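This correspondence can be checked programmatically: replaying the discovered and finished arrays computed earlier as a sequence of timed events reproduces the parenthesization (a quick sketch; the helper name is mine, the arrays are the ones from the worked example):

```python
def parenthesize(labels, discovered, finished):
    # event at time t is either "(X" (discovery of X) or "X)" (finish of X)
    events = {}
    for v, label in enumerate(labels):
        events[discovered[v]] = f'({label}'
        events[finished[v]] = f'{label})'
    return ' '.join(events[t] for t in sorted(events))

labels = 'ABCDEFGHI'
discovered = [1, 3, 8, 2, 13, 7, 4, 14, 17]
finished = [12, 6, 9, 11, 16, 10, 5, 15, 18]
print(parenthesize(labels, discovered, finished))
# → (A (D (B (G G) B) (F (C C) F) D) A) (E (H H) E) (I I)
```

Each of the three depth first search trees (rooted at A, E, and I) shows up as a top-level parenthesized group.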

Recall the fully explored graph for reference and comparison:

In CLRS they do two more things. First, they color vertices in the following manner:

  • White: Undiscovered
  • Gray: Discovered (but unfinished)
  • Black: Finished

Second, they classify edges:

  • Tree edges (parent to a child): go to an undiscovered vertex
  • Back edges (to an ancestor): to a discovered but unfinished vertex (creates a cycle). During DFS, every back edge completes a cycle. Removing back edges from a graph would remove all cycles.
  • Forward edges (to a non-child descendant): to a finished vertex discovered after the current vertex. A forward edge is basically an edge that goes to an indirect descendant, not a direct child. For the graph we've been considering, the edge from D to G is a forward edge. How can we tell? Because G is finished but its discovery time is after that of the current node, D. The vertex G was discovered and explored during the lifetime of D; it's a descendant of D but not a direct descendant. Node B is the direct descendant of D; node G is the direct descendant of B; and node G is an indirect descendant of D.
  • Cross edges (everything else): to a vertex finished before the current vertex's discovery. It's essentially any edge that's not captured in the edge classifications above. It can go from one branch of a tree to another or even from one tree to another. And there isn't any ancestor or descendant relation between the vertices it links. You can tell because it leads to a vertex that finished before the current vertex was discovered. Its parentheses don't overlap.

If we color our example graph so that tree edges are red, back edges are black, forward edges are blue, and cross edges are green, then we end up with the following:

For undirected graphs, we end up seeing each edge twice, once from each vertex. If we classify the edge the first time we see it, then there won't be any forward or cross edges, only tree and back edges.
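One immediate payoff of the classification: a directed graph has a cycle exactly when a DFS finds a back edge. A minimal sketch of that check (names are mine; the three-state coloring stands in for the discovered/finished fields):

```python
def has_cycle(graph):
    # 0 = white (undiscovered), 1 = gray (in progress), 2 = black (finished)
    color = [0] * len(graph)

    def visit(node):
        color[node] = 1
        for nbr in graph[node]:
            if color[nbr] == 1:              # gray neighbor: back edge, hence a cycle
                return True
            if color[nbr] == 0 and visit(nbr):
                return True
        color[node] = 2
        return False

    return any(color[v] == 0 and visit(v) for v in range(len(graph)))

assert has_cycle([[1], [2], [0]])          # 0 -> 1 -> 2 -> 0
assert not has_cycle([[1, 2], [2], []])    # a DAG: no back edges
```

This is the standard cycle test for directed graphs; for undirected graphs the parent edge must be excluded first, since each edge is seen from both endpoints.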

Unlike breadth first search, if a graph is more connected, then its depth first search trees tend to be taller and more vine-like. If vertices have lots of outgoing edges, then you can keep finding new vertices to explore, and the tree depth can get large. This brings us to a possible implementation hiccup for DFS — for each recursive call, we push variables onto our program's call stack. Different languages have different limits to how deep the call stack can be. A graph with 20,000 vertices might try to make 20,000 nested recursive DFS calls, which can give you a program stack overflow error. To avoid this, we can use our own stack instead of the implicit call stack with recursion.
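In Python specifically, the default limit is about 1000 frames (see sys.getrecursionlimit()), so even a modest path graph breaks the recursive template. Raising the limit is a common stopgap (a sketch; the graph here is my own example), though an explicit stack is the robust fix:

```python
import sys

def count_reachable(graph, start):
    # recursive DFS; needs roughly len(graph) stack frames on a path graph
    seen = {start}
    def dfs(node):
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                dfs(nbr)
    dfs(start)
    return len(seen)

# path graph 0 -> 1 -> ... -> 4999: deeper than the default recursion limit
path = [[i + 1] for i in range(4999)] + [[]]
sys.setrecursionlimit(10_000)  # stopgap; an explicit stack avoids this entirely
assert count_reachable(path, 0) == 5000
```

Without the setrecursionlimit call, the same run raises RecursionError.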

We can do this in a few ways, but here is one possible manner:

Non-recursive DFS
 1  ResetGraph(G)
 2      for v in V
 3          v.pi = nil
 4          v.discovered = -1
 5          v.finished = -1
 6      time = 1
 7
 8  ExploreVertex(u, S)
 9      if (u.discovered < 0)
10          u.discovered = time++
11          S.push(u)
12          for each v such that u -> v
13              S.push(u -> v)
14      else if (u.finished < 0)
15          u.finished = time++
16      # else: ignore, top level search of finished vertex
17
18  ExploreEdge(u -> v, S)
19      if (v.discovered < 0)
20          (u -> v).label = "treeEdge"
21          v.pi = u
22          ExploreVertex(v, S)
23      else if (v.finished < 0)
24          (u -> v).label = "backEdge"
25      else if (v.discovered > u.discovered)
26          (u -> v).label = "forwardEdge"
27      else
28          (u -> v).label = "crossEdge"
29
30  DFS(G)
31      ResetGraph(G)
32      S = new Stack()
33      for u in V
34          S.push(u)
35      while (not S.isEmpty())
36          x = S.pop()
37          if (x.isVertex())
38              ExploreVertex(x, S)
39          else
40              ExploreEdge(x, S)

Let's go through how the pseudocode above is meant to function:

  • Line 32: Here, we make a stack where we can push edges and vertices.
  • Lines 33-36: Push all vertices on to the stack and pop while the stack isn't empty.
  • Lines 9-13: If we pop an undiscovered vertex, then discover it! Push it again to finish it later, and push its outgoing edges. When we pop it again later, after it's discovered (line 14), then finish it (line 15). And if we pop an already finished vertex, then just ignore it (line 16). That's the equivalent of looping through all vertices but only running depth first search on the undiscovered ones.
  • Lines 19-28: When we pop an edge from the stack, if it leads to an undiscovered vertex, then it's a tree edge and label it as so (line 20) and explore that vertex (line 22); otherwise, just label the edge and we're done with it (lines 23-28).

In the two lines where we push either all vertices (line 34) or all edges from a vertex (line 13), if we push them in the opposite order that you would normally loop through them in the recursive version of the algorithm, then the version above will give us the same results, same times, edge classifications, etc. It will all be the same as the recursive version. The code above doesn't look quite as clean, but it's a nice parallel to see that while breadth first search explicitly uses a first in first out queue of vertices, depth first search can explicitly use a stack of vertices and edges instead of just implicitly using the program call stack.

If we implement all of the pseudocode above for the iterative version, then we will end up with something like the following (the blocks of code have been highlighted where ordering has intentionally been reversed to ensure the same results as the recursive version):

def dfs_iter(graph):
    n = len(graph)
    edge_classifications = dict()
    discovered = [-1] * n
    finished = [-1] * n
    pred = [-1] * n
    time = 0

    def explore_vertex(node):
        nonlocal time
        if discovered[node] < 0:
            time += 1
            discovered[node] = time
            stack.append(node)
            for i in range(len(graph[node]) - 1, -1, -1):  # push edges in reverse order
                nbr = graph[node][i]
                stack.append((node, nbr))
        elif finished[node] < 0:
            time += 1
            finished[node] = time

    def explore_edge(edge):
        node, nbr = edge
        if discovered[nbr] < 0:
            edge_classifications[edge] = 'treeEdge'
            pred[nbr] = node
            explore_vertex(nbr)
        elif finished[nbr] < 0:
            edge_classifications[edge] = 'backEdge'
        elif discovered[nbr] > discovered[node]:
            edge_classifications[edge] = 'forwardEdge'
        else:
            edge_classifications[edge] = 'crossEdge'

    stack = []
    for node in range(n - 1, -1, -1):  # push vertices in reverse order
        stack.append(node)

    while stack:
        x = stack.pop()
        if not isinstance(x, tuple):
            explore_vertex(x)
        else:
            explore_edge(x)

    return discovered, finished, pred, { (lookup[edge[0]], lookup[edge[1]]): edge_classifications[edge] for edge in edge_classifications }

The outcome, formatted manually for the sake of clarity, matches exactly what was produced previously using the recursive approach (the edge classification also matches what we would expect):

""" 

( A B C D E F G H I
[ 1, 3, 8, 2, 13, 7, 4, 14, 17], # discovered
[12, 6, 9, 11, 16, 10, 5, 15, 18], # finished
[-1, 3, 5, 0, -1, 3, 1, 4, -1], # predecessors
{ # edge classifications
('A', 'D'): 'treeEdge',
('D', 'B'): 'treeEdge',
('B', 'A'): 'backEdge',
('B', 'G'): 'treeEdge',
('D', 'F'): 'treeEdge',
('F', 'C'): 'treeEdge',
('C', 'B'): 'crossEdge',
('C', 'D'): 'backEdge',
('C', 'G'): 'crossEdge',
('D', 'G'): 'forwardEdge',
('E', 'H'): 'treeEdge',
('H', 'D'): 'crossEdge',
('H', 'E'): 'backEdge',
('H', 'F'): 'crossEdge',
('I', 'E'): 'crossEdge'
}
)

"""

Why is the order reversal in the iterative version important in order to ensure the same results as in the recursive version? Because stacks are LIFO data structures — if we push elements onto the stack in the same order as we would process them recursively, then they will be popped off in reverse order. Consider the worked example we've been dealing with throughout this entire note. If we pushed the vertices A through I onto the stack in that order, then the first vertex popped off would be I, not A. But that's not the desired result! Hence, we push vertices I, H, ... , B, A onto the stack in that order so they are popped in the order A, B, ... , H, I, matching the order in which they are processed in the recursive version. Similarly, the order in which neighboring vertices are processed needs to be reversed as well; that is, we need to push neighbors onto the stack in reverse order — this ensures that when they are popped off the stack, they are processed in the original order.

Consider a generalized example to fully clarify things:

  • Adjacency list: Let's say u has neighbors [v1, v2, v3].
  • Recursive DFS: Processes v1, then v2, then v3.
  • Iterative DFS (without reversal):
    • Push v1, v2, v3 onto the stack.
    • Pop and process v3, v2, v1 (reverse order).
  • Iterative DFS (with reversal):
    • Push v3, v2, v1 onto the stack.
    • Pop and process v1, v2, v3 (original order).
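The effect is easy to verify by recording visitation order both ways (a small sketch with my own names; the graph is the star u = 0 with neighbors [1, 2, 3], so the two orders coincide exactly):

```python
def dfs_recursive_order(graph, start):
    order, seen = [], {start}
    def dfs(node):
        order.append(node)
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                dfs(nbr)
    dfs(start)
    return order

def dfs_iterative_order(graph, start):
    order, seen = [], {start}
    stack = [start]
    while stack:
        node = stack.pop()
        order.append(node)
        for nbr in reversed(graph[node]):  # reverse so neighbors pop in original order
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return order

graph = [[1, 2, 3], [], [], []]  # u = 0 with neighbors v1, v2, v3
assert dfs_recursive_order(graph, 0) == dfs_iterative_order(graph, 0) == [0, 1, 2, 3]
```

Dropping the reversed() call makes the iterative version return [0, 3, 2, 1] on this graph.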

def fn(graph):
    def dfs(node):
        ans = 0
        # do some logic
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                ans += dfs(neighbor)

        return ans

    seen = {START_NODE}
    return dfs(START_NODE)
Examples
LC 547. Number of Provinces (✓)

There are n cities. Some of them are connected, while some are not. If city a is connected directly with city b, and city b is connected directly with city c, then city a is connected indirectly with city c.

A province is a group of directly or indirectly connected cities and no other cities outside of the group.

You are given an n x n matrix isConnected where isConnected[i][j] = 1 if the ith city and the jth city are directly connected, and isConnected[i][j] = 0 otherwise.

Return the total number of provinces.


class Solution:
    def findCircleNum(self, isConnected: List[List[int]]) -> int:
        def build_adj_list(adj_mat):
            graph = defaultdict(list)
            n = len(adj_mat)
            for node in range(n):
                for neighbor in range(node + 1, n):
                    if isConnected[node][neighbor] == 1:
                        graph[node].append(neighbor)
                        graph[neighbor].append(node)
            return graph

        def dfs(node):
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    dfs(neighbor)

        graph = build_adj_list(isConnected)
        seen = set()
        provinces = 0

        for city in range(len(isConnected)):
            if city not in seen:
                provinces += 1
                seen.add(city)
                dfs(city)

        return provinces

Cities are nodes, connected cities are provinces (i.e., connected components). The idea here is to explore all provinces by starting with each city and seeing how many cities we can explore from that city — every time we have to start a search again from a new city, we increment the number of overall provinces encountered thus far.
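For instance, with isConnected = [[1,1,0],[1,1,0],[0,0,1]], cities 0 and 1 form one province and city 2 forms another. A condensed check of the same idea (a standalone sketch with my own names; it DFSes over the matrix directly instead of building an adjacency list first):

```python
def count_provinces(is_connected):
    n = len(is_connected)
    seen = set()

    def dfs(city):
        for nbr in range(n):
            if is_connected[city][nbr] == 1 and nbr not in seen:
                seen.add(nbr)
                dfs(nbr)

    provinces = 0
    for city in range(n):
        if city not in seen:
            provinces += 1
            seen.add(city)
            dfs(city)
    return provinces

assert count_provinces([[1, 1, 0], [1, 1, 0], [0, 0, 1]]) == 2
assert count_provinces([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) == 3
```

Scanning a matrix row inside dfs costs O(n) per visit, so both variants run in O(n^2) overall here; the adjacency-list build in the solution above pays that cost once up front.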

LC 200. Number of Islands (✓)

Given an m x n 2D binary grid grid which represents a map of '1's (land) and '0's (water), return the number of islands.

An island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. You may assume all four edges of the grid are all surrounded by water.


class Solution:
    def numIslands(self, grid: List[List[str]]) -> int:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n and grid[row][col] == '1'

        def dfs(row, col):
            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                neighbor = (next_row, next_col)
                if neighbor not in seen and valid(*neighbor):
                    seen.add(neighbor)
                    dfs(*neighbor)

        m = len(grid)
        n = len(grid[0])
        dirs = [(-1,0),(1,0),(0,-1),(0,1)]
        seen = set()
        islands = 0

        for i in range(m):
            for j in range(n):
                node = (i, j)
                if node not in seen and grid[i][j] == '1':
                    islands += 1
                    seen.add(node)
                    dfs(*node)

        return islands

Each "island" is a connected component — our job is to count the total number of connected components. The traversal is a fairly standard DFS traversal on a grid-like graph.

LC 1466. Reorder Routes to Make All Paths Lead to the City Zero (✓)

There are n cities numbered from 0 to n-1 and n-1 roads such that there is only one way to travel between two different cities (this network forms a tree). Last year, the ministry of transport decided to orient the roads in one direction because they are too narrow.

Roads are represented by connections where connections[i] = [a, b] represents a road from city a to b.

This year, there will be a big event in the capital (city 0), and many people want to travel to this city.

Your task consists of reorienting some roads such that each city can visit the city 0. Return the minimum number of edges changed.

It's guaranteed that each city can reach the city 0 after reorder.


class Solution:
    def minReorder(self, n: int, connections: List[List[int]]) -> int:
        roads = set()
        def build_adj_list(edge_arr):
            graph = defaultdict(list)
            for node, neighbor in edge_arr:
                roads.add((node, neighbor))
                graph[node].append(neighbor)
                graph[neighbor].append(node)
            return graph

        def dfs(node):
            ans = 0
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    if (node, neighbor) in roads:
                        ans += 1
                    ans += dfs(neighbor)
            return ans

        graph = build_adj_list(connections)
        seen = {0}
        return dfs(0)

This is a tough one to come up with if you haven't seen it before. The solution approach above is quite clever. The idea is to build an undirected graph in the form of an adjacency list and then to conduct a DFS from node 0, which means every edge we encounter is necessarily leading away from 0; hence, if that edge appeared in the original road configuration, roads, then we know that road's direction must be changed so that it faces toward node 0 instead of away.
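Tracing a sample input makes the trick concrete. With connections = [[0,1],[1,3],[2,3],[4,0],[4,5]], the DFS from 0 crosses the original edges (0,1), (1,3), and (4,5) in the away-from-zero direction, so three roads must be flipped. A standalone restatement of the approach (names are mine):

```python
from collections import defaultdict

def min_reorder(n, connections):
    roads = set()
    graph = defaultdict(list)
    for a, b in connections:
        roads.add((a, b))        # remember each road's original direction
        graph[a].append(b)
        graph[b].append(a)

    seen = {0}
    def dfs(node):
        flips = 0
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                if (node, nbr) in roads:  # edge points away from 0: flip it
                    flips += 1
                flips += dfs(nbr)
        return flips

    return dfs(0)

assert min_reorder(6, [[0, 1], [1, 3], [2, 3], [4, 0], [4, 5]]) == 3
```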

LC 841. Keys and Rooms (✓)

There are N rooms and you start in room 0. Each room has a distinct number in 0, 1, 2, ..., N-1, and each room may have some keys to access the next room.

Formally, each room i has a list of keys rooms[i], and each key rooms[i][j] is an integer in [0, 1, ..., N-1] where N = rooms.length. A key rooms[i][j] = v opens the room with number v.

Initially, all the rooms start locked (except for room 0).

You can walk back and forth between rooms freely.

Return true if and only if you can enter every room.


class Solution:
    def canVisitAllRooms(self, rooms: List[List[int]]) -> bool:
        def dfs(node):
            for neighbor in rooms[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    dfs(neighbor)

        seen = {0}
        dfs(0)
        return len(seen) == len(rooms)

It's quite nice that the given input, rooms, is already in the form of an adjacency list (as an index array). The key insight is to realize that we can use our seen set to determine whether or not all rooms have been visited after conducting a DFS from node 0 (i.e., room 0); that is, if seen is the same length as rooms after the DFS from node 0, then we can say it's possible to visit all rooms (otherwise it's not).
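Two quick checks of that insight (a standalone sketch of the same solution, with my own function name): in the first input the keys chain through every room, while in the second, room 2's only key sits inside room 2 itself, so it is unreachable.

```python
def can_visit_all_rooms(rooms):
    seen = {0}
    def dfs(room):
        for key in rooms[room]:
            if key not in seen:
                seen.add(key)
                dfs(key)
    dfs(0)
    return len(seen) == len(rooms)

assert can_visit_all_rooms([[1], [2], [3], []])                # keys chain through every room
assert not can_visit_all_rooms([[1, 3], [3, 0, 1], [2], [0]])  # room 2 is unreachable
```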

LC 1971. Find if Path Exists in Graph (✓)

There is a bi-directional graph with n vertices, where each vertex is labeled from 0 to n - 1 (inclusive). The edges in the graph are represented as a 2D integer array edges, where each edges[i] = [ui, vi] denotes a bi-directional edge between vertex ui and vertex vi. Every vertex pair is connected by at most one edge, and no vertex has an edge to itself.

You want to determine if there is a valid path that exists from vertex start to vertex end.

Given edges and the integers n, start, and end, return true if there is a valid path from start to end, or false otherwise.


class Solution:
    def validPath(self, n: int, edges: List[List[int]], source: int, destination: int) -> bool:
        def build_adj_list(edge_arr):
            graph = defaultdict(list)
            for node, neighbor in edge_arr:
                graph[node].append(neighbor)
                graph[neighbor].append(node)
            return graph

        def dfs(node):
            if node == destination:
                return True

            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    if dfs(neighbor):
                        return True
            return False

        graph = build_adj_list(edges)
        seen = {source}
        return dfs(source)

The solution above is a direct application of a DFS traversal. The hardest part is arguably coming up with an effective way of writing the dfs function. It's better not to rely on a nonlocal variable unless we really need to. The idea is that we should stop searching if we encounter a node whose value is equal to the destination. If that is not the case, then we try to explore further. If our DFS comes up empty, then we return False, and that will propagate back up the recursion chain.
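Two quick checks of the early-return structure (a standalone restatement with my own names): the True return short-circuits the search as soon as destination is found, and the final False only surfaces once every reachable node has been tried.

```python
from collections import defaultdict

def valid_path(n, edges, source, destination):
    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)
        graph[b].append(a)

    seen = {source}
    def dfs(node):
        if node == destination:
            return True
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                if dfs(nbr):
                    return True
        return False

    return dfs(source)

assert valid_path(3, [[0, 1], [1, 2], [2, 0]], 0, 2)                       # triangle: 0 reaches 2
assert not valid_path(6, [[0, 1], [0, 2], [3, 5], [5, 4], [4, 3]], 0, 5)   # separate components
```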

LC 323. Number of Connected Components in an Undirected Graph (✓)

You have a graph of n nodes. You are given an integer n and an array edges where edges[i] = [ai, bi] indicates that there is an edge between ai and bi in the graph.

Return the number of connected components in the graph.


class Solution:
    def countComponents(self, n: int, edges: List[List[int]]) -> int:
        def build_adj_list(edge_arr):
            graph = defaultdict(list)
            for node, neighbor in edge_arr:
                graph[node].append(neighbor)
                graph[neighbor].append(node)
            return graph

        def dfs(node):
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    dfs(neighbor)

        graph = build_adj_list(edges)
        seen = set()
        cc = 0

        for node in range(n):
            if node not in seen:
                cc += 1
                seen.add(node)
                dfs(node)