Templates for Data Structures and Algorithms
Contents
Symbol  Designation

✓  This is a problem from LeetCode's interview crash course.
✠  This is a problem stemming from work done through Interviewing IO.
★  A right-aligned ★ (one or more) indicates my own personal designation as to the problem's relevance, importance, priority in review, etc.
 Backtracking
 Binary search
 Dynamic programming
 Graphs
 Greedy algorithms
 Heaps
 Linked lists
 Matrices
 Sliding window
 Stacks and queues
 Trees
 Manually determine order of nodes visited ("tick trick")
 Preorder traversal
 Postorder traversal
 Inorder traversal
 Level-order traversal
 Level-order (BFS)
 Induction (solve subtrees recursively, aggregate results at root)
 Traverse-and-accumulate (visit nodes and accumulate information in nonlocal variables)
 Combining templates: induction and traverse-and-accumulate
 Two pointers
 Miscellaneous
Backtracking
Remarks
TBD
```
def fn(curr, OTHER_ARGUMENTS...):
    if (BASE_CASE):
        # modify the answer
        return

    ans = 0
    for (ITERATE_OVER_INPUT):
        # modify the current state
        ans += fn(curr, OTHER_ARGUMENTS...)
        # undo the modification of the current state

    return ans
```
Examples
LC 46. Permutations (✓) ★★
Given an array `nums` of distinct integers, return all the possible permutations. You can return the answer in any order.
```python
class Solution:
    def permute(self, nums: List[int]) -> List[List[int]]:
        def backtrack(curr):
            if len(curr) == len(nums):
                permutations.append(curr[:])  # note that we append a _copy_ of curr
                return

            for num in nums:
                if num not in curr:
                    curr.append(num)
                    backtrack(curr)
                    curr.pop()

        permutations = []
        backtrack([])
        return permutations
```
What actually is a permutation of `nums`? It's essentially all possible orderings of the elements of `nums` such that no element is duplicated. A backtracking strategy to generate all permutations sounds promising — what would the base case be? It would be when the current permutation being generated, say `curr`, has the same length as the input array `nums`: `curr.length == nums.length` (of course, this assumes we've done our due diligence and have prevented duplicates from being added to `curr`). The base case of `curr.length == nums.length` means we have completed the process of generating a permutation and we cannot go any further; specifically, if we look at the process of generating permutations as a tree, then completing the generation of a permutation means we have reached a leaf node, as illustrated in the following image for the input `nums = [1, 2, 3]`:
Building all permutations for this problem, where the input is an array of numbers, means we need all elements to make an appearance at the first index, all other elements to make an appearance at the second index, and so on. Hence, we should loop over all elements of `nums` for each call to our `backtrack` function, where we should always check to see if a number is already in `curr` before adding it to `curr`. Each call to `backtrack` is like visiting a node in the tree of candidates being generated. The leaves are the base cases/answers to the problem.
For the solution given above, if we simply add `print(curr)` after the line `curr.append(num)`, then we can very clearly see how each call to `backtrack` is like visiting a node in the tree (it's like performing a DFS on an imaginary tree):
```
[1]
[1, 2]
[1, 2, 3]    # complete permutation (leaf node)
[1, 3]
[1, 3, 2]    # complete permutation (leaf node)
[2]
[2, 1]
[2, 1, 3]    # complete permutation (leaf node)
[2, 3]
[2, 3, 1]    # complete permutation (leaf node)
[3]
[3, 1]
[3, 1, 2]    # complete permutation (leaf node)
[3, 2]
[3, 2, 1]    # complete permutation (leaf node)
```
The entire list of complete permutations is then returned as the answer:

```
[[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
```
The time complexity above effectively amounts to roughly $O(n^2\cdot n!)$: we iterate through all of `nums` for each call to `backtrack`, membership checking on `curr` is a linear cost when `curr` is an array, and we're guaranteed to make $n$ calls to `backtrack` at the top level, each of which results in $n - 1$ further calls, and so forth. If we wanted to make a micro-optimization, then we could introduce a hash set to make the membership checks on `curr` $O(1)$ instead of $O(n)$, but this change pales in comparison to the factorial cost of calling `backtrack` so many times:
```python
class Solution:
    def permute(self, nums: List[int]) -> List[List[int]]:
        def backtrack(curr):
            if len(curr) == len(nums):
                permutations.append(curr[:])
                return

            for num in nums:
                if num not in lookup:
                    curr.append(num)
                    lookup.add(num)
                    backtrack(curr)
                    curr.pop()
                    lookup.remove(num)

        lookup = set()
        permutations = []
        backtrack([])
        return permutations
```
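To make the factorial growth concrete, here is a quick instrumented sketch (the helper names `count_backtrack_calls` and `expected_calls` are my own, not part of the solution above) that counts how many times `backtrack` is invoked. The recursion tree for `n` distinct elements has $\sum_{k=0}^{n} n!/(n-k)!$ nodes, one per partial permutation of each length `k`:

```python
from math import factorial

def count_backtrack_calls(n):
    # Bare-bones permutation backtracking over [0, 1, ..., n - 1] that
    # counts every call to backtrack (i.e., every node in the recursion tree).
    nums = list(range(n))
    lookup = set()
    calls = 0

    def backtrack(curr):
        nonlocal calls
        calls += 1
        if len(curr) == n:
            return
        for num in nums:
            if num not in lookup:
                curr.append(num)
                lookup.add(num)
                backtrack(curr)
                curr.pop()
                lookup.remove(num)

    backtrack([])
    return calls

def expected_calls(n):
    # The number of partial permutations of length k is n! / (n - k)!;
    # summing over k = 0..n counts every node in the recursion tree.
    return sum(factorial(n) // factorial(n - k) for k in range(n + 1))

print(count_backtrack_calls(3))  # 16
```

For `n = 3` the tree has `1 + 3 + 6 + 6 = 16` nodes, matching the trace printed earlier: the root, three partial permutations of length one, six of length two, and six leaves.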
LC 78. Subsets (✓) ★★
Given an integer array `nums` of unique elements, return all possible subsets (the power set). The solution set must not contain duplicate subsets. Return the solution in any order.
```python
class Solution:
    def subsets(self, nums: List[int]) -> List[List[int]]:
        def backtrack(curr, num_idx):
            subs.append(curr[:])
            if num_idx == len(nums):  # note that this base case is implied
                return                # since `backtrack` is only called in the
                                      # for loop when `num_idx < len(nums)`

            for i in range(num_idx, len(nums)):
                curr.append(nums[i])
                backtrack(curr, i + 1)
                curr.pop()

        subs = []
        backtrack([], 0)
        return subs
```
This problem is quite similar to LC 46. Permutations but with some very notable differences, namely container length and element order:

- Length: A subset can have any length from `0` through `n` (where `n` is the size of the input array of distinct integers), inclusive, but a permutation has a fixed length of `n`.
- Order: The containers `[1, 2, 3]` and `[3, 2, 1]` are considered to be different permutations but the same subset.
If we have a problem where containers like those above are considered to be duplicates and we do not want to consider duplicates (e.g., such as this problem concerned with finding subsets), then a common "trick" is to add a rule where each call of the `backtrack` function allows us to only consider elements that come after the previously processed element.

This is a very common method of avoiding duplicates in backtracking problems: have an integer argument that represents a starting point for iteration at each function call.
For example, in this problem, we start with the root being the empty container, `[]`:

```
[]
```

With `nums = [1, 2, 3]`, we can clearly consider each element as the beginning of its own subset:

```
       [ ]
     /  |  \
   [1] [2] [3]
```
Now what? Remember that calling `backtrack` is like moving to another node; hence, to respect the strategy remarked on above, when we move to another node, that node should only involve elements that come after the one we have just processed. This means the tree of possibilities above should end up looking like the following:

```
          [ ]                # level subsets: []
        /  |  \
     [1]  [2]  [3]           # level subsets: [1], [2], [3]
     / \    \
   [2] [3]  [3]              # level subsets: [1,2], [1,3], [2,3]
    |
   [3]                       # level subsets: [1,2,3]
```
The actual order of subset generation from our solution code is not hard to anticipate in light of the strategy we've been discussing, where, again, we're basically doing a DFS on an imaginary tree, and once we hit the last indexed element (i.e., when `num_idx == len(nums)`) we move back up the tree from child to parent:
```
[
    [],
    [1],
    [1,2],
    [1,2,3],
    [1,3],
    [2],
    [2,3],
    [3]
]
```
The order conjectured above is confirmed by the return value of our solution when `nums = [1, 2, 3]`:

```
[[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3], [3]]
```
The process becomes even clearer if we add the print statement `print(subs)` after `subs.append(curr[:])` as well as the print statement `print('COMPLETED')` after `if num_idx == len(nums):`. Making these modifications and running the solution code again on the input `nums = [1,2,3]` results in the following being printed to standard output:
```
[[]]
[[], [1]]
[[], [1], [1, 2]]
[[], [1], [1, 2], [1, 2, 3]]
COMPLETED
[[], [1], [1, 2], [1, 2, 3], [1, 3]]
COMPLETED
[[], [1], [1, 2], [1, 2, 3], [1, 3], [2]]
[[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3]]
COMPLETED
[[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3], [3]]
COMPLETED
[[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3], [3]]
```
Note that the text `'COMPLETED'` is only ever printed after an element/subset has been added to `subs` that ends with `3`, which corresponds to the leaves of the tree shown earlier.

To summarize: when generating permutations, we had a length requirement, where we needed to use all of the elements in the input; hence, we only considered leaf nodes as part of the actual returned answer. With subsets, however, there is no length requirement; thus, every node should be in the returned answer, including the root node, which is why the very first line of the `backtrack` function is to add a copy of `curr` to the returned answer `subs`.
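As a quick sanity check of the claims above (every node in the tree is added, and there are $2^n$ subsets in total), here is a standalone sketch of the same solution (the bare function name `subsets` is mine):

```python
def subsets(nums):
    # Every call appends a copy of curr immediately: every node in the
    # tree of possibilities is a subset, not just the leaves.
    def backtrack(curr, start):
        subs.append(curr[:])
        for i in range(start, len(nums)):
            curr.append(nums[i])
            backtrack(curr, i + 1)
            curr.pop()

    subs = []
    backtrack([], 0)
    return subs

print(subsets([1, 2, 3]))
# [[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3], [3]]
```

The output order matches the DFS order traced above, and the count for an input of size `n` is `2 ** n`.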
LC 77. Combinations (✓) ★★
Given two integers `n` and `k`, return all possible combinations of `k` numbers out of `1 ... n`. You may return the answer in any order.
```python
class Solution:
    def combine(self, n: int, k: int) -> List[List[int]]:
        def backtrack(curr, start_val):
            if len(curr) == k:
                combinations.append(curr[:])
                return

            for i in range(start_val, n + 1):
                curr.append(i)
                backtrack(curr, i + 1)
                curr.pop()

        combinations = []
        backtrack([], 1)
        return combinations
```
It can be very helpful to sketch out a tree of possibilities if we suspect backtracking may be an effective strategy for coming up with a solution (just like sketching out things for general tree problems, linked list problems, etc.).
The root of our tree would be `[]`, the empty list. Unlike in LC 78. Subsets, there is an explicit length requirement in terms of the elements that can be added to the container we must ultimately return; hence, not all nodes in the tree of possibilities will represent entities that should be added to our answer array.

What is clear is that each entry added to the list we must return must have a length of `k`, and it must contain numbers from the list `[1, ..., n]`, inclusive, where no number in the `k`-length list is duplicated. This means we should consider combining some of the strategy points we used in both LC 46. Permutations and LC 78. Subsets, namely using a length requirement as part of the base case as well as preventing duplicates from being considered, respectively.
The thinking goes something like the following: start with the empty list, `[]`, as the root of the tree:

```
[]
```

Each `k`-length list can have a number from `[1, ..., n]`, inclusive; thus, the root should have `n` children, namely nodes with values from `1` through `n` (the example below uses the values of `n = 4, k = 2` from the first example on LeetCode):

```
         [ ]
    /   /   \   \
  [1]  [2]  [3]  [4]
```
Now what? A complete solution will be generated only when the path has length `k`, but each path right now has a length of `1` (and `k = 2` for this example). To keep generating paths (i.e., possible solutions), we need to avoid considering duplicates, which means for each node value `i`, we only subsequently consider node values `[i + 1, ..., n]`, which results in our overall tree looking like the following:

```
              [ ]
       /     /    \     \
     [1]   [2]    [3]   [4]
    / | \   / \     |
  [2][3][4][3][4]  [4]
```
Since the input specifies `k = 2`, the tree above suffices to report the following as the combinations list:

```
[
    [1,2],
    [1,3],
    [1,4],
    [2,3],
    [2,4],
    [3,4]
]
```
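A standalone version of the solution can be cross-checked against Python's `itertools.combinations`, which emits tuples in the same lexicographic order as the start-pointer backtracking (the bare function name `combine` is mine):

```python
from itertools import combinations as it_combinations

def combine(n, k):
    # Same template as the solution above: a fixed length requirement of k,
    # plus a start value so each call only considers numbers after the
    # last one used (preventing duplicate combinations).
    def backtrack(curr, start_val):
        if len(curr) == k:
            result.append(tuple(curr))
            return
        for i in range(start_val, n + 1):
            curr.append(i)
            backtrack(curr, i + 1)
            curr.pop()

    result = []
    backtrack([], 1)
    return result

print(combine(4, 2))  # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```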
LC 797. All Paths From Source to Target (✓) ★
Given a directed acyclic graph (DAG) of `n` nodes labeled from `0` to `n - 1`, find all possible paths from node `0` to node `n - 1`, and return them in any order.

The graph is given as follows: `graph[i]` is a list of all nodes you can visit from node `i` (i.e., there is a directed edge from node `i` to node `graph[i][j]`).
```python
class Solution:
    def allPathsSourceTarget(self, graph: List[List[int]]) -> List[List[int]]:
        def backtrack(curr_path, node):
            if node == len(graph) - 1:
                paths.append(curr_path[:])
                return

            for neighbor in graph[node]:
                curr_path.append(neighbor)
                backtrack(curr_path, neighbor)
                curr_path.pop()

        paths = []
        backtrack([0], 0)
        return paths
```
This is a great backtracking problem because backtracking is not necessarily the obvious first strategy of attack. Even if we do think of backtracking, how exactly should we proceed? This becomes much clearer (again!) once we start to sketch out the tree of possibilities. What would the root of our tree be? It has to be node `0` since our goal is to find all paths from node `0` to node `n - 1`. This also suggests something about our base case: we should terminate path generation whenever node `n - 1` is reached. We don't have to worry about cycles or anything like that since we're told the graph is a DAG. Additionally, the graph is already provided as an adjacency list, which makes traversing each node's neighbors quite easy.
So what's the strategy? Let's start with our root node (which needs to be part of every solution):

```
0
```

Now what? Each neighbor of node `0` needs to be considered (we're trying to get to node `n - 1` from node `0` in whatever way is possible, which means exploring all possible paths). Let's use the input of the second example on LeetCode for a concrete illustration: `graph = [[4,3,1],[3,2,4],[3],[4],[]]`. This graph looks like the following:
As stated above, from `0` we need to consider each of its neighbors:

```
      0
    / | \
   1  3  4
```

The tree above makes it clear `[0,4]` will be one possible path, but what are the other possible paths? It looks like leaf nodes will be the complete solutions that need to be added to the answer we ultimately return. For each node that is not `4` (i.e., `n - 1 == 4` in this case), we need to consider each possible neighbor (this will not be endless because we're told the graph is a DAG). Considering the neighbors for each node means our tree of possibilities ultimately ends up looking as follows:
```
          0
       /  |  \
      1   3   4
    / | \  \
   2  3  4  4
   |  |
   3  4
   |
   4
```
Hence, the set of possible paths is as expected:

```
[[0,4],[0,3,4],[0,1,3,4],[0,1,2,3,4],[0,1,4]]
```
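Running a standalone version of the solution on the example graph above confirms both the path set and the DFS discovery order (the snake_case name is mine):

```python
def all_paths_source_target(graph):
    # Backtracking DFS starting from node 0; a path is complete once it
    # reaches node n - 1, at which point a copy is recorded.
    def backtrack(curr_path, node):
        if node == len(graph) - 1:
            paths.append(curr_path[:])
            return
        for neighbor in graph[node]:
            curr_path.append(neighbor)
            backtrack(curr_path, neighbor)
            curr_path.pop()

    paths = []
    backtrack([0], 0)
    return paths

print(all_paths_source_target([[4, 3, 1], [3, 2, 4], [3], [4], []]))
# [[0, 4], [0, 3, 4], [0, 1, 3, 4], [0, 1, 2, 3, 4], [0, 1, 4]]
```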
LC 17. Letter Combinations of a Phone Number (✓) ★★
Given a string containing digits from `2-9` inclusive, return all possible letter combinations that the number could represent. Return the answer in any order.

A mapping of digit to letters (just like on the telephone buttons) is given below. Note that 1 does not map to any letters.
```python
class Solution:
    def letterCombinations(self, digits: str) -> List[str]:
        if not digits:
            return []

        keypad = {
            '2': 'abc',
            '3': 'def',
            '4': 'ghi',
            '5': 'jkl',
            '6': 'mno',
            '7': 'pqrs',
            '8': 'tuv',
            '9': 'wxyz'
        }

        def backtrack(curr, start_idx):
            if len(curr) == len(digits):
                combinations.append("".join(curr))
                return

            for i in range(start_idx, len(digits)):
                digit = digits[i]
                for letter in keypad[digit]:
                    curr.append(letter)
                    backtrack(curr, i + 1)
                    curr.pop()

        combinations = []
        backtrack([], 0)
        return combinations
```
As with most other backtracking problems, it helps if we start by sketching out the tree of possibilities, where we can imagine the root being an empty string:

```
''
```
Now let's consider the input `digits = "23"`, the input for the first example on LeetCode. The desired output is `["ad","ae","af","bd","be","bf","cd","ce","cf"]`. How can this be achieved? The first digit is `2`, which means `'a'`, `'b'`, and `'c'` are valid starting letters for combinations:

```
       ' '
     /  |  \
   'a' 'b' 'c'
```
We do not want to add duplicates of these letters for subsequent possible combinations but instead consider the potential letters arising from the next digit (i.e., we want to prevent processing duplicates by only ever processing digits after the current digit). The next digit in this case is `3`, which corresponds to possible letters `'d'`, `'e'`, and `'f'`. Our tree now looks like the following:

```
                  ' '
          /        |        \
        'a'       'b'       'c'
       / | \     / | \     / | \
     'd' 'e' 'f' 'd' 'e' 'f' 'd' 'e' 'f'
```
The tree of possibilities above makes it clear the letter combinations we should return are as follows, as expected:

```
[
    'ad',
    'ae',
    'af',
    'bd',
    'be',
    'bf',
    'cd',
    'ce',
    'cf',
]
```
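Since each digit's letter group is independent, the same output can be produced with a Cartesian product, which is a handy way to verify the backtracking solution (this `itertools.product` version is an alternative sketch, not the original solution):

```python
from itertools import product

KEYPAD = {'2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
          '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'}

def letter_combinations(digits):
    if not digits:
        return []
    # product iterates the last group fastest, which matches the DFS
    # order of the backtracking solution above.
    return ["".join(combo) for combo in product(*(KEYPAD[d] for d in digits))]

print(letter_combinations("23"))
# ['ad', 'ae', 'af', 'bd', 'be', 'bf', 'cd', 'ce', 'cf']
```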
LC 39. Combination Sum (✓) ★
Given an array of distinct integers `candidates` and a target integer `target`, return a list of all unique combinations of `candidates` where the chosen numbers sum to `target`. You may return the combinations in any order.

The same number may be chosen from `candidates` an unlimited number of times. Two combinations are unique if the frequency of at least one of the chosen numbers is different.

It is guaranteed that the number of unique combinations that sum up to `target` is less than `150` combinations for the given input.
```python
class Solution:
    def combinationSum(self, candidates: List[int], target: int) -> List[List[int]]:
        def backtrack(path, path_sum, next_path_node_idx):
            if path_sum == target:
                paths.append(path[:])
                return

            for i in range(next_path_node_idx, len(candidates)):
                candidate = candidates[i]
                if path_sum + candidate <= target:
                    path.append(candidate)
                    backtrack(path, path_sum + candidate, i)
                    path.pop()

        paths = []
        backtrack([], 0, 0)
        return paths
```
This problem is similar to LC 77. Combinations, but now we are tasked with generating all combinations of `candidates` such that each candidate number may be used however many times we want, but the sum of the numbers for each combination/mixture of candidates has to equal the given `target`.
Start by drawing a tree!

```
[]
```
Each number in `candidates` may be part of a path sum we want to return, where the sum of the elements in the path equates to `target`. Let's use the input provided in the first example on LeetCode: `candidates = [2,3,6,7], target = 7`:

```
       [ ]
    /  /  \  \
   2  3    6  7
```
The problem description indicates any number may be reused however many times we desire, but we still don't want to generate duplicates such as `[2,2,3]`, `[2,3,2]`, etc. How can we avoid doing this? We can use the same strategy that many other backtracking problems use (i.e., ensure we only begin processing elements that are either the current element itself or elements that come after the current element):

```
                 [ ]
          /     /     \     \
         2     3       6     7
      / | | \  | \ \   | \    \
     2  3 6  7 3  6 7  6  7    7
     ...................................
```
When do we stop the process of generating paths? Since all possible values in `candidates` are positive, this necessarily means a path is no longer valid if its path sum exceeds the given `target` value.
Note that the following lines in the solution above are important:

```python
# ...
for i in range(next_path_node_idx, len(candidates)):
    candidate = candidates[i]
    if path_sum + candidate <= target:
        path.append(candidate)
# ...
```
It might be tempting to modify `path_sum` with the `candidate` value before entering the `if` block, but this would be a mistake. Why? Because the list of `candidates` is not necessarily ordered; hence, if we made the update `path_sum += candidate` and the new `path_sum` exceeded `target`, then we would no longer consider that path, but this would also prevent us from exploring other branches using candidates of a possibly lesser value where `path_sum + candidate` did not exceed `target`.
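If we are willing to sort `candidates` first (a variation of my own, not used in the solution above), the guard can become an early `break`: once one candidate overshoots `target`, every later (larger) candidate will too, so the rest of the loop can be skipped entirely:

```python
def combination_sum(candidates, target):
    # Sketch of a sorted variation: sorting lets us break out of the loop
    # entirely once path_sum + candidate overshoots target, because every
    # remaining candidate is at least as large.
    candidates = sorted(candidates)

    def backtrack(path, path_sum, start):
        if path_sum == target:
            paths.append(path[:])
            return
        for i in range(start, len(candidates)):
            candidate = candidates[i]
            if path_sum + candidate > target:
                break  # all later candidates overshoot as well
            path.append(candidate)
            backtrack(path, path_sum + candidate, i)  # i, not i + 1: reuse allowed
            path.pop()

    paths = []
    backtrack([], 0, 0)
    return paths

print(combination_sum([2, 3, 6, 7], 7))  # [[2, 2, 3], [7]]
```

With sorting, the ordering concern discussed above disappears, because "the new `path_sum` exceeds `target`" now really does imply that no remaining candidate in the loop can work.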
LC 79. Word Search (✓) ★★★
Given an `m x n` grid `board` and a string `word`, find if the word exists in the grid.

The word can be constructed from letters of sequentially adjacent cells, where "adjacent" cells are horizontally or vertically neighboring. The same letter cell may not be used more than once.
```python
class Solution:
    def exist(self, board: List[List[str]], word: str) -> bool:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n

        def backtrack(row, col, word_idx, seen):
            if word_idx == len(word):
                return True

            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                if valid(next_row, next_col) and (next_row, next_col) not in seen:
                    next_char = board[next_row][next_col]
                    if next_char == word[word_idx]:
                        seen.add((next_row, next_col))
                        if backtrack(next_row, next_col, word_idx + 1, seen):
                            return True
                        seen.remove((next_row, next_col))

            return False

        m = len(board)
        n = len(board[0])
        dirs = [(-1, 0), (1, 0), (0, -1), (0, 1)]
        for row in range(m):
            for col in range(n):
                if board[row][col] == word[0] and backtrack(row, col, 1, {(row, col)}):
                    return True

        return False
```
This problem has a very DFS feel to it, but what makes this a backtracking problem and not a strict DFS problem is that we may visit a square multiple times on the same initial function call. For example, suppose we're looking for the word `CAMERA` in the following grid:

```
CAM
ARE
MPS
```

We may start exploring as follows (each `|` marks an edge on the path taken):

```
C A M
|
A R E
|
M P S
```
The `M` letter above does not have `E` as a neighbor, so we remove the `M` from consideration and move back up to `A`, which also does not have an `M` neighbor except the one we just visited, which we know leads to nothing. So we remove `A` as well and we're back at `C`, and we note `C` does have another `A` neighbor it can visit. And we end up seeing the following as a valid path:

```
C > A > M
        |
A < R < E

M   P   S
```

This path visits the `A` under `C` again in order to come up with a valid path. Thus, while we may not be allowed to use a square more than once for the answer, there are possibly multiple ways to use a square to form different candidates, as illustrated above.
We should still use a set `seen` like we would normally do with DFS solutions so as to avoid using the same letter in the same path. Unlike in DFS solutions, however, we will remove from the set `seen` when backtracking and it's been determined that a path solution is no longer possible — we should only ever traverse an edge in the graph if we know that the resultant path could lead to an answer. Thus, we can also pass an index variable `word_idx` that indicates we are currently looking for `word[word_idx]`, and then only move to `(next_row, next_col)` if `word[word_idx]` is the correct letter.
Since the answer could start from any square whose letter matches `word[0]`, we need to systematically start our backtracking approach from any square with `word[0]` as its letter. If we exhaust all such squares and never find `word`, then we should return `False`. Of course, if we do find `word` in the midst of one of our backtracking approaches, then we return `True` immediately.
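A standalone sketch of the solution (function shape mine), run against the CAMERA grid discussed above, shows the revisiting behavior in action:

```python
def exist(board, word):
    # Backtracking over grid cells; `seen` guards only the current path,
    # so a square freed by backtracking can be reused by a later path.
    m, n = len(board), len(board[0])
    dirs = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    def backtrack(row, col, word_idx, seen):
        if word_idx == len(word):
            return True
        for dr, dc in dirs:
            nr, nc = row + dr, col + dc
            if (0 <= nr < m and 0 <= nc < n and (nr, nc) not in seen
                    and board[nr][nc] == word[word_idx]):
                seen.add((nr, nc))
                if backtrack(nr, nc, word_idx + 1, seen):
                    return True
                seen.remove((nr, nc))
        return False

    return any(
        board[row][col] == word[0] and backtrack(row, col, 1, {(row, col)})
        for row in range(m) for col in range(n)
    )

grid = [["C", "A", "M"], ["A", "R", "E"], ["M", "P", "S"]]
print(exist(grid, "CAMERA"))  # True
```

The successful path is exactly the one sketched earlier: `C(0,0) > A(0,1) > M(0,2) > E(1,2) > R(1,1) > A(1,0)`, reusing the lower-left `A` after the first dead end released it.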
LC 52. N-Queens II (✓) ★★★
The n-queens puzzle is the problem of placing `n` queens on an `n x n` chessboard such that no two queens attack each other.

Given an integer `n`, return the number of distinct solutions to the n-queens puzzle.
```python
class Solution:
    def totalNQueens(self, n: int) -> int:
        def backtrack(row, cols, diagonals, anti_diagonals):
            # base case: n queens have been placed
            if row == n:
                return 1

            solution = 0
            for col in range(n):
                diagonal = row - col
                anti_diagonal = row + col
                # queen is not placeable; move to considering next column
                if (col in cols or diagonal in diagonals or anti_diagonal in anti_diagonals):
                    continue

                # add the queen to the board
                cols.add(col)
                diagonals.add(diagonal)
                anti_diagonals.add(anti_diagonal)

                # move to next row with the updated board state
                solution += backtrack(row + 1, cols, diagonals, anti_diagonals)

                # remove the queen from the board since we have
                # explored all valid paths using the above backtrack function call
                cols.remove(col)
                diagonals.remove(diagonal)
                anti_diagonals.remove(anti_diagonal)

            return solution

        return backtrack(0, set(), set(), set())
```
This problem is difficult, and it's often used to explain backtracking at various learning levels (e.g., university and college classes). The solution to this problem is informative, especially in regards to how we might want to go about modeling backtracking solutions to other problems.

Whenever we make a call to `backtrack`, we're doing so by describing the state of whatever it is that is important for consideration in solving the problem at hand. For some problems, maybe that state involves a string and the number of used left and right parentheses (e.g., LC 22. Generate Parentheses); maybe the state involves the row and column of a grid under consideration along with the cells already seen and the index of a word that represents a character for which we are searching (e.g., LC 79. Word Search). And maybe, as in this problem, the state needs to involve a chessboard!

What state should we consider tracking? We need to somehow keep track of rows and columns in some way. The trickier part about this problem involves the diagonals. Even though we have a board in this problem, this is not like a graph problem where we need to conduct a DFS or BFS — fundamentally, our job at each step is to place a queen on the board so that it is not under attack. This is the backtracking part, where we narrow down the range of possibilities we consider. The brute-force approach would involve generating all board possibilities with queens and then narrowing down which solutions actually work, but that would involve placing queens on the board in positions that are being attacked, which means the rest of the generation of possibilities is useless. We want to place queens in positions where they are not being attacked and where whatever subsequent positions we consider can actually lead to a solution. Only once we find a solution do we start to undo queen placement and try other queen placements, but the key is that every placement must be one that could potentially lead to success.
With everything in mind above, how should we proceed? First, it's helpful to just think about how queens move on a chessboard: unlimited moves vertically, horizontally, diagonally (top left to bottom right), and anti-diagonally (top right to bottom left). If we are given `n` queens on an `n x n` chessboard, then it's clear that, at the end of the board configuration, each row must have a queen and each column must have a queen. The trickiest part is managing the diagonals effectively.
But we can start by working with the insight above about rows and columns. Each call to `backtrack` should result in the placement of a queen on a new row; that is, we'll pass `row` as an argument to `backtrack`, and when we have finally reached `row == n` (i.e., rows `0` through `n - 1` have been filled with queens), then we will know we have found a valid solution. Finding a workable position on any given row means processing all column positions on that row until we have found a position that is not under attack.

What about columns? As noted above, for each row being processed, we will iterate through all column positions until we find one that is not under attack. The row we are processing is definitely not under attack due to how we're processing rows, but how do we know whether or not each column is under attack? If another queen is on any column we're considering for a current row, then that column position is invalid, because that other queen's line of attack runs through the position we're considering; we need to move on to the next column position. To keep track of which columns are already occupied by queens (i.e., which columns would be considered to be under attack when we're trying to add a new queen), whenever we add a queen to the board, we should add the column position of the queen to a set `cols` that we can use when we're trying to add subsequent queens.
Cool! We consider rows sequentially, so it's guaranteed each row we're considering is not under attack. We have a set `cols` that shows which columns are occupied. It seems like we need a similar "strategy" for diagonals and anti-diagonals that we currently have for columns; that is, whenever we add a queen to the board, we should also note which diagonal and anti-diagonal just became occupied. But how do we effectively apply a singular label or marker to a diagonal or anti-diagonal that spans multiple positions on the board?
This seems really tricky to do at first until we make note of a brilliant observation concerning diagonal and anti-diagonal movement (`R` and `C` below represent row and column position, respectively):

- Diagonal movement (top left to bottom right): If we start at any cell `(R, C)` and move one position up the diagonal, then we will be at cell `(R - 1, C - 1)`; if we move one position down, then we will be at `(R + 1, C + 1)`. In general, if we let `dr` and `dc` represent the change in row or column value, respectively, then we will find ourselves always moving from cell `(R, C)` to `(R + dr, C + dc)` on a given diagonal. Importantly, note that `dr == dc`, since a change in any direction affects each value in the same way (e.g., moving 4 spaces up means `dr == dc == -4`, since the row and column values both decrease by 4). But how does any of this help in service of creating a unique label for a diagonal? The insight lies in how the row and column values relate to each other. Let `(R, C)` be the coordinates of a cell on a diagonal, and label the cell as having value `R - C`. Then what happens whenever we move from cell `(R, C)` to cell `(R + dr, C + dc)`? Since `dr == dc`, we have `(R + dr) - (C + dc) = R + dr - C - dr = R - C`; that is, the label for any cell on a given diagonal is the same. Hence, we should label visited diagonals by adding `row - col` values to a `diagonals` set.
- Anti-diagonal movement (top right to bottom left): We can make a similar argument to the one above about effectively labeling anti-diagonals. Let `(R, C)` represent any given cell on an anti-diagonal. Then moving up a cell would take us to cell `(R - 1, C + 1)`, while moving down a cell would take us to cell `(R + 1, C - 1)`; that is, the row and column values of a cell change at the same rate in an inversely proportional manner. We essentially have `dr == dc` again, but there needs to be a sign difference in how we represent a movement from `(R, C)` to another cell: `(R + dr, C - dc)`. Then `(R + dr) + (C - dc) = R + dr + C - dr = R + C`. Hence, we should label visited anti-diagonals by adding `row + col` values to an `anti_diagonals` set.
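A standalone version of the solution (renamed `total_n_queens`) exercises exactly these `row - col` and `row + col` labels; the known solution counts for small boards make a good sanity check:

```python
def total_n_queens(n):
    # cols, diagonals (labeled row - col), and anti_diagonals (labeled
    # row + col) record the lines of attack of every queen placed so far.
    def backtrack(row, cols, diagonals, anti_diagonals):
        if row == n:
            return 1
        count = 0
        for col in range(n):
            if (col in cols or (row - col) in diagonals
                    or (row + col) in anti_diagonals):
                continue  # square is under attack
            cols.add(col)
            diagonals.add(row - col)
            anti_diagonals.add(row + col)
            count += backtrack(row + 1, cols, diagonals, anti_diagonals)
            cols.remove(col)
            diagonals.remove(row - col)
            anti_diagonals.remove(row + col)
        return count

    return backtrack(0, set(), set(), set())

print([total_n_queens(n) for n in range(1, 7)])  # [1, 0, 0, 2, 10, 4]
```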
LC 22. Generate Parentheses (✓) ★★★
Given `n` pairs of parentheses, write a function to generate all combinations of well-formed parentheses.
```python
class Solution:
    def generateParenthesis(self, n: int) -> List[str]:
        def backtrack(curr_str, left_count, right_count):
            if len(curr_str) == 2 * n:
                valid_parens.append("".join(curr_str))
                return

            if left_count < n:
                curr_str.append('(')
                backtrack(curr_str, left_count + 1, right_count)
                curr_str.pop()

            if left_count > right_count:
                curr_str.append(')')
                backtrack(curr_str, left_count, right_count + 1)
                curr_str.pop()

        valid_parens = []
        backtrack([], 0, 0)
        return valid_parens
```
The editorial for this solution on LeetCode is quite good. The key insights (with the first being somewhat minor but the second one being critical):

- We should append the string we're building whenever its length is `2n` because the string we're building, when taking a backtracking approach anyway, must have the potential to be valid and a complete solution at every point. Hence, once/if the candidate string reaches a length of `2n`, then we know the string satisfies the constraints of the problem and is a complete solution.
- As a starting point, we can add as many left parentheses as we want without fear of producing an invalid string (so long as the number of left parentheses we add doesn't exceed `n`). We then need to start adding right parentheses. But how can we add right parentheses in general? We should only ever consider adding a right parenthesis when the total number of left parentheses exceeds the number of right parentheses. That way, when we add a right parenthesis, there's a chance the subsequent string could be a well-formed parenthetical string of length `2n`, as desired.
The LeetCode editorial linked above shows how sketching out a tree of possibilities is very useful for this problem, where the following illustration is for the beginning of the case where `n = 2`:

We can start to see how the logic of the solution above makes sense. For the sake of completeness and concreteness, consider the input of the first example on LeetCode, `n = 3`, and the corresponding desired output:

```
["((()))","(()())","(())()","()(())","()()()"]
```

We can make our own diagram that shows how each solution is built (the `x` indicates that a potential solution path is no longer pursued since further work on that path cannot possibly lead to a correct answer):
```
_____________________________________________________"
/ \
_________________________________________________(________________________________________________ x
/ \
_______________________((_______________________ _______________________()
/ \ / \
(((___ _________________(()________________ _________________()(________________ x
/ \ / \ / \
x ((()____ (()(____ _________(()) ()((____ _________()()
/ \ / \ / \ / \ / \
x ((())__ x (()()__ (())(__ x x ()(()__ ()()(__ x
/ \ / \ / \ / \ / \
x ((())) x (()()) x (())() x ()(()) x ()()()
```
As a note of reference, the tree above was generated using the binarytree package:

from binarytree import build2

my_tree = build2([
    '"',
    "(", "x",
    "((", "()", None, None,
    "(((", "(()", "()(", "x",
    "x", "((()", "(()(", "(())", "()((", "()()", None, None,
    None, None, "x", "((())", "x", "(()()", "(())(", "x", "x", "()(()", "()()(", "x",
    None, None, "x", "((()))", None, None, "x", "(()())", "x", "(())()", None, None, None, None, "x", "()(())", "x", "()()()"
])
root = my_tree.levelorder[0]
print(root)
LC 967. Numbers With Same Consecutive Differences (✓) ★★
Return all non-negative integers of length n such that the absolute difference between every two consecutive digits is k.

Note that every number in the answer must not have leading zeros. For example, 01 has one leading zero and is invalid.

You may return the answer in any order.
class Solution:
    def numsSameConsecDiff(self, n: int, k: int) -> List[int]:
        def digits_to_int(digits):
            ans = digits[0]
            for i in range(1, len(digits)):
                ans = (ans * 10) + digits[i]
            return ans

        def backtrack(digit_arr):
            if len(digit_arr) == n:
                res.append(digits_to_int(digit_arr))
                return
            for num in range(10):
                if abs(digit_arr[-1] - num) == k:
                    digit_arr.append(num)
                    backtrack(digit_arr)
                    digit_arr.pop()

        res = []
        for start_digit in range(1, 10):
            backtrack([start_digit])
        return res
The impossibility of leading zeros can complicate things if we're not careful. We still need to consider the digit 0 as part of potential constraint-satisfying integers. An easy way to handle this is to prevent 0 from being a possible leading digit at the outset: execute the backtrack function for digit arrays that begin with each digit 1 through 9, inclusive, and append complete solutions to an overall results array, res, that we ultimately return. It's easiest to manage the digits if we build each integer using a digits array and then convert that array to the actual integer once it represents a constraint-satisfying integer.
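The approach above can be exercised as a standalone sketch (the snake_case function name is illustrative); note how seeding the recursion with digits 1 through 9 rules out leading zeros while still allowing 0 to appear in later positions:

```python
def nums_same_consec_diff(n, k):
    # Backtracking sketch: seed with digits 1-9 (no leading zeros), then extend
    # with any digit whose absolute difference from the last digit equals k.
    res = []

    def backtrack(digits):
        if len(digits) == n:
            num = 0
            for d in digits:            # fold the digit array into an integer
                num = num * 10 + d
            res.append(num)
            return
        for d in range(10):
            if abs(digits[-1] - d) == k:
                digits.append(d)
                backtrack(digits)
                digits.pop()

    for start in range(1, 10):          # 0 is never a valid leading digit
        backtrack([start])
    return res

print(nums_same_consec_diff(3, 7))
# [181, 292, 707, 818, 929]
```

Observe that 707 is produced even though it contains a 0; only a leading 0 is excluded.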
LC 216. Combination Sum III (✓) ★
Find all valid combinations of k numbers that sum up to n such that the following conditions are true:

- Only numbers 1 through 9 are used.
- Each number is used at most once.

Return a list of all possible valid combinations. The list must not contain the same combination twice, and the combinations may be returned in any order.
class Solution:
    def combinationSum3(self, k: int, n: int) -> List[List[int]]:
        def backtrack(nums_arr, curr_sum, next_num):
            if len(nums_arr) == k and curr_sum == n:
                res.append(nums_arr[:])
                return
            for num in range(next_num, 10):
                if curr_sum + num <= n:
                    nums_arr.append(num)
                    backtrack(nums_arr, curr_sum + num, num + 1)
                    nums_arr.pop()

        res = []
        backtrack([], 0, 1)
        return res
The valid combinations can only involve positive integers, which means the sum we're building can only get bigger. Hence, we should only consider combinations of integers whose running sum does not exceed n. Also, the numbers used in a combination need to be unique, which means whenever we start with a certain number we should only ever consider strictly larger subsequent values.
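A standalone sketch of the same idea (illustrative name; this variant returns as soon as the combination reaches length k, which is a minor structural difference from the solution above):

```python
def combination_sum3(k, n):
    # Backtracking sketch: build strictly ascending combinations from 1-9,
    # pruning any branch whose running sum would exceed n.
    res = []

    def backtrack(combo, curr_sum, next_num):
        if len(combo) == k:
            if curr_sum == n:           # complete combination with the target sum
                res.append(combo[:])
            return
        for num in range(next_num, 10):
            if curr_sum + num <= n:     # the sum only grows, so prune overshoots
                combo.append(num)
                backtrack(combo, curr_sum + num, num + 1)
                combo.pop()

    backtrack([], 0, 1)
    return res

print(combination_sum3(3, 9))
# [[1, 2, 6], [1, 3, 5], [2, 3, 4]]
```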
Binary search
The two templates immediately below are pending templates. The first is for binary search on an array:

def binary_search(arr, target):
    left = 0                              # starting index of search range (inclusive)
    right = len(arr) - 1                  # ending index of search range (inclusive)
    result = -1                           # -1 indicates target has not been found yet
    while left <= right:                  # continue search while range is valid
        mid = left + (right - left) // 2  # prevent potential overflow
        if arr[mid] < target:             # target is in right half
            left = mid + 1                # (move `left` to narrow search range to right half)
        elif arr[mid] > target:           # target is in left half
            right = mid - 1               # (move `right` to narrow search range to left half)
        else:                             # target found (i.e., arr[mid] == target)
            result = mid                  # store index where target is found (early return, if desired)
            right = mid - 1               # uncomment to find first occurrence by narrowing search range to left half
            # left = mid + 1              # uncomment to find last occurrence by narrowing search range to right half
    if result != -1:
        return result                     # target was found; return its index
    else:
        return left                       # target was not found; return insertion point to maintain sorted order

# NOTES:
# Only one of `right = mid - 1` and `left = mid + 1` should be uncommented in the else clause:
# - `right = mid - 1` uncommented and `left = mid + 1` commented out: search for first occurrence of target
# - `left = mid + 1` uncommented and `right = mid - 1` commented out: search for last occurrence of target
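To make the first/last-occurrence toggle concrete, here are both variants of the template written out as separate functions (a small demo, with names chosen for illustration):

```python
def binary_search_first(arr, target):
    # Template with `right = mid - 1` active in the else clause: after a match,
    # keep narrowing the left half, so the FIRST occurrence wins.
    left, right, result = 0, len(arr) - 1, -1
    while left <= right:
        mid = left + (right - left) // 2
        if arr[mid] < target:
            left = mid + 1
        elif arr[mid] > target:
            right = mid - 1
        else:
            result = mid
            right = mid - 1      # keep searching the left half
    return result if result != -1 else left

def binary_search_last(arr, target):
    # Same template with `left = mid + 1` active instead: the LAST occurrence wins.
    left, right, result = 0, len(arr) - 1, -1
    while left <= right:
        mid = left + (right - left) // 2
        if arr[mid] < target:
            left = mid + 1
        elif arr[mid] > target:
            right = mid - 1
        else:
            result = mid
            left = mid + 1       # keep searching the right half
    return result if result != -1 else left

arr = [4, 6, 8, 8, 8, 10]
print(binary_search_first(arr, 8))   # 2 (leftmost 8)
print(binary_search_last(arr, 8))    # 4 (rightmost 8)
print(binary_search_first(arr, 7))   # 2 (insertion point; 7 is absent)
```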
And for binary search on a solution space:
def binary_search_sol_space(arr):
    def possible(threshold):
        # this function is implemented depending on the problem
        return BOOLEAN

    left = MINIMUM_POSSIBLE_ANSWER   # minimum possible value in solution space (inclusive)
    right = MAXIMUM_POSSIBLE_ANSWER  # maximum possible value in solution space (inclusive)
    result = -1                      # -1 indicates no valid value found yet
    while left <= right:             # continue search while range is valid
        mid = left + (right - left) // 2
        if possible(mid):
            result = mid             # mid satisfies condition; update result
            right = mid - 1          # adjust right to find smaller valid value (minimization)
        else:
            left = mid + 1           # mid doesn't satisfy condition; search right half

    # IMPORTANT: swap `right = mid - 1` and `left = mid + 1`
    # if looking to maximize the valid value (i.e., instead of minimize)
    return result                    # return best value found satisfying condition
Find first target index (if it exists)
Remarks
This template returns the first index where target is encountered during the search. If duplicates are present, the index returned is effectively arbitrary (i.e., the target matched by the search could be neither the leftmost occurrence nor the rightmost occurrence but somewhere in between). If no match is found, then the return value, left, points to the insertion point where target would need to be inserted in order to maintain the sorted property of arr.
def binary_search(arr, target):
    left = 0
    right = len(arr) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if target < arr[mid]:
            right = mid - 1
        elif target > arr[mid]:
            left = mid + 1
        else:
            return mid
    return left
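A quick demo of this template's behavior on an array with duplicates (the array values here are chosen purely for illustration):

```python
def binary_search(arr, target):
    # The template above: returns whichever matching index the search lands on,
    # or the insertion point if target is absent.
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if target < arr[mid]:
            right = mid - 1
        elif target > arr[mid]:
            left = mid + 1
        else:
            return mid
    return left

arr = [4, 6, 8, 8, 8, 10]
print(binary_search(arr, 8))   # 2 (an occurrence, not guaranteed leftmost or rightmost in general)
print(binary_search(arr, 7))   # 2 (insertion point that keeps arr sorted)
print(binary_search(arr, 99))  # 6 (would be appended at the end)
```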
Examples
LC 704. Binary Search
Given an array of integers nums which is sorted in ascending order, and an integer target, write a function to search for target in nums. If target exists, then return its index. Otherwise, return -1.
class Solution:
    def search(self, nums: List[int], target: int) -> int:
        left = 0
        right = len(nums) - 1
        while left <= right:
            mid = left + (right - left) // 2
            if target < nums[mid]:    # target in left half, move right boundary
                right = mid - 1
            elif target > nums[mid]:  # target in right half, move left boundary
                left = mid + 1
            else:
                return mid            # target at current mid position, return
        return -1
The code above is the template for a basic binary search. We're guaranteed that the numbers in nums are unique, which means target, if it exists, will be found, and the first (and only) index at which it occurs will be returned.
LC 74. Search a 2D Matrix
Write an efficient algorithm that searches for a value in an m x n matrix. This matrix has the following properties:

- Integers in each row are sorted from left to right.
- The first integer of each row is greater than the last integer of the previous row.
class Solution:
    def searchMatrix(self, matrix: List[List[int]], target: int) -> bool:
        m = len(matrix)
        n = len(matrix[0])
        left = 0
        right = m * n - 1
        while left <= right:
            mid = left + (right - left) // 2
            val = matrix[mid // n][mid % n]
            if target < val:
                right = mid - 1
            elif target > val:
                left = mid + 1
            else:
                return True
        return False
This problem is very similar to the classic binary search problem LC 704. Binary Search. The main difference is that here we need to view the 2D array matrix as virtually flattened into a 1D array with index values idx bounded by 0 <= idx <= m * n - 1. The key part of the solution above is, of course, the line val = matrix[mid // n][mid % n]. This lets us seamlessly find a cell value in the virtually flattened matrix by converting a given index value to the appropriate row and column in the original input matrix. We could make this more explicit by abstracting the logic in that line into its own function:
def index_to_cell_val(idx):
    row = idx // n
    col = idx % n
    return matrix[row][col]
The idea is that we can binary search on the virtually flattened 1D array.
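A quick sanity check of the index mapping (the matrix here is constructed so that each cell stores its own flat index, making the round trip easy to verify):

```python
# For an m x n matrix viewed as a flattened 1D array, flat index idx maps to
# row idx // n and column idx % n.
m, n = 3, 4
matrix = [[r * n + c for c in range(n)] for r in range(m)]  # cell (r, c) holds its flat index

def index_to_cell_val(idx):
    return matrix[idx // n][idx % n]

print([index_to_cell_val(i) for i in range(m * n)])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```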
LC 35. Search Insert Position (✓)
Given a sorted array of distinct integers and a target value, return the index if the target is found. If not, return the index where it would be if it were inserted in order.
class Solution:
    def searchInsert(self, nums: List[int], target: int) -> int:
        left = 0
        right = len(nums) - 1
        while left <= right:
            mid = left + (right - left) // 2
            if target < nums[mid]:
                right = mid - 1
            elif target > nums[mid]:
                left = mid + 1
            else:
                return mid
        return left
Distinctness of the input integers in nums ensures that returning mid gives the index of target if it's found; otherwise, we return left, which is the index where we should insert target in order to keep the array sorted.
LC 633. Sum of Square Numbers ★
Given a non-negative integer c, decide whether there exist two integers a and b such that a^2 + b^2 = c.
class Solution:
    def judgeSquareSum(self, c: int) -> bool:
        def binary_search(left, right, target):
            while left <= right:
                b = left + (right - left) // 2
                if target < b * b:
                    right = b - 1
                elif target > b * b:
                    left = b + 1
                else:
                    return True
            return False

        a = 0
        while a * a <= c:
            b_squared = c - a * a
            if binary_search(0, b_squared, b_squared):
                return True
            a += 1
        return False
Binary search can crop up in all sorts of unexpected places, and this is one of them. The idea is that we iteratively search the space [0, c - a^2] for a value of b such that b^2 == c - a^2, so that a^2 + b^2 == c, as desired. The funkier part of the binary search solution is what the usual mid denotes in the binary search itself, namely the b value we're looking for such that b^2 == target, where target == c - a^2. Hence, when we adjust the left or right endpoints, we're actually comparing the target value against b * b, where b takes the role of the normal mid value. If the target is ever neither less than b * b nor greater than b * b, then we've found an integer b that satisfies the equation (and we've done so using binary search!).
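The same idea as a standalone sketch (snake_case names are illustrative); the inner helper binary searches for an integer square root of the residue c - a^2:

```python
def judge_square_sum(c):
    # For each a with a^2 <= c, binary search [0, c - a^2] for an integer b
    # with b * b == c - a^2.
    def has_int_sqrt(target):
        left, right = 0, target
        while left <= right:
            b = left + (right - left) // 2   # b plays the role of the usual mid
            if b * b < target:
                left = b + 1
            elif b * b > target:
                right = b - 1
            else:
                return True
        return False

    a = 0
    while a * a <= c:
        if has_int_sqrt(c - a * a):
            return True
        a += 1
    return False

print(judge_square_sum(5))   # True  (1^2 + 2^2 == 5)
print(judge_square_sum(3))   # False
```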
Find leftmost target index (or insertion point)
Remarks
This template ensures we find the leftmost occurrence (i.e., minimum index value) of target (if it exists). If target does not exist in the input array, arr, then this template returns the index at which target should be inserted to maintain the sorted property of arr.

How does this work? What happens if it's ever the case that arr[mid] == target in the template function below? It's the right half that gets collapsed, by means of right = mid, thus pushing the search space as far left as possible.

The left value returned by the template function is also the number of elements in arr that are less than target. This should make sense upon some reflection: if the function returns both the leftmost occurrence of the target value and the insertion point of target that keeps the sorted property of arr, then all values to the left of the returned index must be less than target. The fact that arrays are 0-indexed helps here; for example, if our template function returns 3, then the three elements at indices 0, 1, and 2 are all less than target.
def binary_search_leftmost(arr, target):
    left = 0
    right = len(arr)
    while left < right:
        mid = left + (right - left) // 2
        if target <= arr[mid]:
            right = mid
        else:
            left = mid + 1
    return left
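A short demo of the template's two readings of the return value (leftmost occurrence or insertion point, and count of elements less than target); the sample array is illustrative:

```python
def binary_search_leftmost(arr, target):
    # Leftmost-insertion-point template: `right = mid` on ties pushes the
    # search as far left as possible.
    left, right = 0, len(arr)
    while left < right:
        mid = left + (right - left) // 2
        if target <= arr[mid]:
            right = mid
        else:
            left = mid + 1
    return left

arr = [4, 6, 8, 8, 8, 10]
print(binary_search_leftmost(arr, 8))   # 2 -> leftmost 8; also: 2 elements are < 8
print(binary_search_leftmost(arr, 9))   # 5 -> insertion point for the absent 9
```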
Examples
LC 2300. Successful Pairs of Spells and Potions
You are given two positive integer arrays spells and potions, of length n and m, respectively, where spells[i] represents the strength of the ith spell and potions[j] represents the strength of the jth potion.

You are also given an integer success. A spell and potion pair is considered successful if the product of their strengths is at least success.

Return an integer array pairs of length n where pairs[i] is the number of potions that will form a successful pair with the ith spell.
class Solution:
    def successfulPairs(self, spells: List[int], potions: List[int], success: int) -> List[int]:
        def successful_potions(spell_strength):
            # need spell * potion >= success (i.e., potion >= threshold)
            threshold = success / spell_strength
            left = 0
            right = len(potions)
            while left < right:
                mid = left + (right - left) // 2
                if threshold <= potions[mid]:
                    right = mid
                else:
                    left = mid + 1
            return len(potions) - left

        potions.sort()
        return [successful_potions(spell) for spell in spells]
For any given spell, a potion is successful only if it is greater than or equal to success / spell. Since we process potions for each spell in spells, this suggests we should preprocess potions by sorting. Then we can conduct a binary search (where the target equals success / spell for a given spell) to determine where the insertion point would need to be for a new potion to be successful; we want everything to the right of that point. Hence, if len(potions) is the length of the potions array and we determine that the leftmost insertion point should occur at i, then we want everything in [i, len(potions) - 1]; that is, (len(potions) - 1) - i + 1 == len(potions) - i elements, where the + 1 reflects inclusivity of i. An example will make this clear.
Let's say we sort potions and have potions = [1, 2, 3, 4, 5] and success = 7. We have a spell with a strength of 3. To form a successful pair, we need a potion with a strength of at least 7 / 3 = 2.3333. If we do a binary search for this value on potions, we will find an insertion index of 2. Every potion at this index and to the right can form a successful pair. There are 3 such indices in total (the potions with strengths 3, 4, and 5). In general, if there are m potions, the final index is m - 1. If the insertion index is i, then the range [i, m - 1] has a size of (m - 1) - i + 1 = m - i.
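The full routine, written as a standalone sketch for quick experimentation (illustrative name; run here on the example from the problem statement):

```python
def successful_pairs(spells, potions, success):
    # Sort potions once, then for each spell binary search the leftmost potion
    # with potion >= success / spell; everything to its right also works.
    potions = sorted(potions)
    m = len(potions)
    res = []
    for spell in spells:
        threshold = success / spell
        left, right = 0, m
        while left < right:
            mid = left + (right - left) // 2
            if threshold <= potions[mid]:
                right = mid
            else:
                left = mid + 1
        res.append(m - left)    # potions in [left, m - 1] form successful pairs
    return res

print(successful_pairs([3], [1, 2, 3, 4, 5], 7))        # [3]
print(successful_pairs([5, 1, 3], [1, 2, 3, 4, 5], 7))  # [4, 0, 3]
```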
Find rightmost target index
Remarks
This template ensures we find the rightmost occurrence (i.e., maximum index value) of target (if it exists). If target does not exist in the input array, arr, then this template returns one less than the index at which target should be inserted to maintain the sorted property of arr (i.e., left, the returned value plus 1, is the proper insertion point, as opposed to the returned value left - 1 itself).

How does this work? What happens if it's ever the case that arr[mid] == target in the template function below? It's the left half that gets collapsed, by means of left = mid + 1, thus pushing the search space as far right as possible.
Consider the following sorted array: [4, 6, 8, 8, 8, 10, 12, 13]. If we applied the template function to this array with a target value of 8, then the index returned would be 4, the index of the rightmost 8 in the input array. This simple example illustrates two noteworthy observations:

- Insertion point: Add 1 to the return value of the template function to find the appropriate insertion point; this assumes the desired insertion point keeps the input array sorted and places the inserted element as far right as possible. In the context of the simple example, this means the insertion point for another 8 would be 4 + 1 = 5. What if the element to be added is not present? If we wanted to add 9, then our template function would return 4. Again, adding 1 to this result gives the desired insertion point: 4 + 1 = 5. The correct insertion point will always be the returned value plus 1.
. 
- Number of elements greater than target: If x is the return value of our template function, then computing len(arr) - x - 1 gives the total number of elements in the input array that are greater than the target. In the context of the simple example, how many elements are greater than 8? There are three such elements, namely 10, 12, and 13; hence, the answer we want is 3. How can we reliably find this value for all sorted input arrays and target values? If our template function returns the rightmost index of the target element if it exists (or the last index before where it would be inserted if it doesn't), then all elements in the input array to the right of this index are greater than the target value. How many is that? Well, if you're tasked with reading pages 23-27, inclusive, then how many pages do you read? It's not 27 - 23 = 4; you read pages 23, 24, 25, 26, and 27, or simply 27 - 23 + 1 = 5 pages. The same reasoning, albeit slightly nuanced, applies here: if x is the index returned by our function, then the elements greater than the target run from index x + 1 through the last index, len(arr) - 1, inclusive, for a total of (len(arr) - 1) - (x + 1) + 1 = len(arr) - x - 1 elements.
def binary_search_rightmost(arr, target):
    left = 0
    right = len(arr)
    while left < right:
        mid = left + (right - left) // 2
        if target < arr[mid]:
            right = mid
        else:
            left = mid + 1
    return left - 1
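Both observations above can be checked directly on the example array (a small demo of the template):

```python
def binary_search_rightmost(arr, target):
    # Rightmost-occurrence template: `left = mid + 1` on ties pushes the search
    # as far right as possible; left - 1 is the last index with arr[i] <= target.
    left, right = 0, len(arr)
    while left < right:
        mid = left + (right - left) // 2
        if target < arr[mid]:
            right = mid
        else:
            left = mid + 1
    return left - 1

arr = [4, 6, 8, 8, 8, 10, 12, 13]
x = binary_search_rightmost(arr, 8)
print(x)                  # 4 -> rightmost 8
print(x + 1)              # 5 -> insertion point for another 8
print(len(arr) - x - 1)   # 3 -> elements greater than 8 (10, 12, 13)
```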
Examples
LC 2389. Longest Subsequence With Limited Sum (✓) ★
You are given an integer array nums of length n, and an integer array queries of length m.

Return an array answer of length m where answer[i] is the maximum size of a subsequence that you can take from nums such that the sum of its elements is less than or equal to queries[i].

A subsequence is an array that can be derived from another array by deleting some or no elements without changing the order of the remaining elements.
class Solution:
    def answerQueries(self, nums: List[int], queries: List[int]) -> List[int]:
        def binary_search(query):
            left = 0
            right = len(nums)
            while left < right:
                mid = left + (right - left) // 2
                if query < nums[mid]:
                    right = mid
                else:
                    left = mid + 1
            return left

        nums.sort()
        for i in range(1, len(nums)):
            nums[i] += nums[i - 1]
        return [binary_search(query) for query in queries]
This problem invites us to dust off our knowledge of prefix sums, because that's really what we need to answer this problem effectively. We need to sort the input nums, create a prefix sum (either mutating the input directly, as above, or creating a new array), and then conduct a binary search on the prefix sum, where each time we find what would be the rightmost insertion point if we were to add query to the prefix sum array.

Why does this work? Suppose we had the prefix sum [0, 1, 3, 5, 5, 7, 9], and the query value we were given was 5. Where would we need to insert 5 in the prefix sum above to maintain sorted order with 5 as far right as possible? It would need to be at index i == 5 (right after the other two 5 values). This means the original numbers in nums responsible for the [0, 1, 3, 5, 5] part of the prefix sum can all be taken (and the rest removed) so that the subsequence sum is less than or equal to the query value 5. That is why the solution above works.
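A standalone sketch of the sort-then-prefix-sum-then-search pipeline (illustrative name), run on the first example from the problem statement:

```python
def answer_queries(nums, queries):
    # Sort, build a prefix sum in place, then binary search each query's
    # rightmost insertion point: the count of prefix sums <= query is the
    # longest subsequence length that fits within the budget.
    nums = sorted(nums)
    for i in range(1, len(nums)):
        nums[i] += nums[i - 1]          # nums is now a prefix sum array

    def rightmost(query):
        left, right = 0, len(nums)
        while left < right:
            mid = left + (right - left) // 2
            if query < nums[mid]:
                right = mid
            else:
                left = mid + 1
        return left                      # number of prefix sums <= query

    return [rightmost(q) for q in queries]

print(answer_queries([4, 5, 2, 1], [3, 10, 21]))  # [2, 3, 4]
```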
Greedy (looking for minimum)
Temporary remarks for greedybased binary searches on solution spaces
As noted in the editorial on binary search on solution spaces, we need a few conditions to be met in order to effectively conduct our search:

- Possibility/condition/check/feasibility function can execute in $O(n)$ time: we can quickly, in $O(n)$ or better, verify whether the task is possible for a given threshold value; that is, we define a function, possible(threshold), that returns a boolean indicating whether the task is possible or impossible for the specific threshold value.
- Max/min characteristic when the task is possible for the specific threshold value: if the task is possible for a number threshold and we are looking for
  - a maximum, then it is also possible for all numbers less than threshold;
  - a minimum, then it is also possible for all numbers greater than threshold.
- Max/min characteristic when the task is impossible for the specific threshold value: if the task is impossible for a number threshold and we are looking for
  - a maximum, then it is also impossible for all numbers greater than threshold;
  - a minimum, then it is also impossible for all numbers less than threshold.
The above depictions can be somewhat difficult to envision at first, so it can be helpful to draw out a very simple outline, as if we're on a number line from 0 to infinity, left to right, as demonstrated below.

Looking for a maximum threshold:

Example use case (illegal parking): Maximize time spent parked illegally without getting a ticket. Under various conditions (e.g., parking enforcers, location, etc.), we can imagine this being possible for a certain amount of time before it becomes impossible. We'd like to maximize the POSSIBLE amount of time we do not have to worry about getting a ticket before it becomes IMPOSSIBLE to avoid getting a ticket:

+---------------------------+---------------------------+
|          Possible         |         Impossible        |
+---------------------------+---------------------------+
0                           ^                     ...inf
              (threshold binary searched for)

As can be seen above, given a threshold amount of time, our task of going undetected when parked illegally is
- possible for all numbers less than threshold
- impossible for all numbers greater than threshold
Looking for a minimum threshold:
Example use case (mandatory online trainings): Minimize time spent on a mandatory online training page before clicking to continue without arousing suspicion. Many online training requirements are modules that are "click-through" in nature, where an employee must complete the module but should not click to continue until a sufficient amount of time has elapsed to indicate the employee has plausibly consumed all of the information on the page. The goal is to minimize the amount of time spent on any given page. We can imagine this being impossible for a certain amount of time before it becomes possible. We'd like to minimize the POSSIBLE amount of time we are required to be on any given training page, where it is IMPOSSIBLE to continue until a certain amount of time has elapsed:

+---------------------------+---------------------------+
|         Impossible        |          Possible         |
+---------------------------+---------------------------+
0                           ^                     ...inf
              (threshold binary searched for)

As can be seen above, given a threshold amount of time, our task of having to remain on a given training page before being allowed to continue making progress through the training is
- impossible for all numbers less than threshold
- possible for all numbers greater than threshold
TAKEAWAY: For minimums, we're basically trying to find the leftmost insertion point for threshold within the possible segment of the solution space. Our solution space should be closed: [left, right]. That is, the left bound should be as small as possible as well as inclusive, and the right bound should be as large as possible as well as inclusive. (This differs slightly from the normal way we find a leftmost insertion point when binary searching an array that may contain duplicates; the idea is that a solution/threshold satisfying the problem requirement should exist within the original left/right bounds of the problem.)
Here's a template:
def binary_search_on_solution_space_MINIMUM(arr):
    def possible(threshold):
        # this function is implemented depending on the problem
        return BOOLEAN

    left = MINIMUM_POSSIBLE_ANSWER
    right = MAXIMUM_POSSIBLE_ANSWER  # solution space being binary searched is [left, right];
                                     # we are trying to find the leftmost insertion point for
                                     # `threshold` in the possible segment
    while left < right:
        mid = left + (right - left) // 2
        if possible(mid):
            right = mid  # squeeze `left` as far left as possible in the possible segment
        else:
            left = mid + 1
    return left
Finding a maximum for a threshold is a bit different. It's similar to binary searching an array of values and trying to find the position right before whatever the rightmost insertion point would need to be for the inserted value to maintain order. It's probably easier to visualize with the simple diagram above for problems where we're asked to find a maximum:

+---------------------------+---------------------------+
|          Possible         |         Impossible        |
+---------------------------+---------------------------+

We basically want to find the leftmost insertion point in the impossible segment; the maximum value is then the position immediately to its left (usually one integer smaller), i.e., the rightmost value in the possible segment (the maximum possible value).

This requires a small adjustment to the template, most notably the solution space being [left, right), where the right bound is extended by one to ensure we actually capture the maximum value. And instead of returning left, which would give us the leftmost insertion position of the impossible segment, we return left - 1, which gives us the last value in the possible segment before the impossible segment (i.e., the maximal value in the possible segment):
def binary_search_on_solution_space_MAXIMUM(arr):
    def possible(threshold):
        # this function is implemented depending on the problem
        return BOOLEAN

    left = MINIMUM_POSSIBLE_ANSWER
    right = MAXIMUM_POSSIBLE_ANSWER + 1  # solution space is [left, right) as opposed to [left, right];
                                         # we are trying to find the leftmost insertion point for
                                         # `threshold` in the impossible segment so we can report the
                                         # position immediately to its left, the last position in the
                                         # possible segment (i.e., the maximum possible value)
    while left < right:
        mid = left + (right - left) // 2
        if not possible(mid):
            right = mid  # squeeze `left` as far left as possible in the impossible segment
        else:
            left = mid + 1
    return left - 1
DIVIDING CHOCOLATE: This is a good problem for seeing how things work with regard to finding a maximum. Here's one potential solution:
class Solution:
    def maximizeSweetness(self, sweetness: List[int], k: int) -> int:
        def possible(threshold):
            total_pieces = 0
            piece_sweetness = 0
            for chunk in sweetness:
                piece_sweetness += chunk
                if piece_sweetness >= threshold:
                    piece_sweetness = 0
                    total_pieces += 1
                    if total_pieces == k + 1:
                        return True
            return False

        left = min(sweetness)
        right = sum(sweetness) + 1
        while left < right:
            mid = left + (right - left) // 2
            if not possible(mid):
                right = mid
            else:
                left = mid + 1
        return left - 1
Consider the following simple test case:
sweetness = [5,5]
k = 0
Since we have k = 0, we are not going to share the chocolate with anyone else, so we should just maximize the sweetness for our own enjoyment. The value returned should be 10, the entire sum of the values in sweetness. But what happens if we change the line right = sum(sweetness) + 1 to right = sum(sweetness)? Then the while loop terminates when left == right, which now happens when left and right both equal 10, but the value returned, left - 1, will be 9, an off-by-one error.

This simple example illustrates the importance of binary searching the solution space represented by the half-open interval [left, right) when looking for a maximum; otherwise, we might end up with an off-by-one error.
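To see this concretely, here is a standalone version of the solution (snake_case name is illustrative) run on the test case just discussed, plus the classic example from the problem itself:

```python
def maximize_sweetness(sweetness, k):
    # Binary search the half-open solution space [left, right) for a maximum;
    # on exit, left - 1 is the largest threshold for which cutting is possible.
    def possible(threshold):
        total_pieces = 0
        piece = 0
        for chunk in sweetness:
            piece += chunk
            if piece >= threshold:      # close off a piece once it's sweet enough
                piece = 0
                total_pieces += 1
                if total_pieces == k + 1:
                    return True
        return False

    left = min(sweetness)
    right = sum(sweetness) + 1          # half-open upper bound avoids the off-by-one
    while left < right:
        mid = left + (right - left) // 2
        if not possible(mid):
            right = mid
        else:
            left = mid + 1
    return left - 1

print(maximize_sweetness([5, 5], 0))                        # 10 (keep the whole bar)
print(maximize_sweetness([1, 2, 3, 4, 5, 6, 7, 8, 9], 5))   # 6
```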
Remarks
TBD
def binary_search_on_solution_space_MINIMUM(arr):
    def possible(threshold):
        # this function is implemented depending on the problem
        return BOOLEAN

    left = MINIMUM_POSSIBLE_ANSWER
    right = MAXIMUM_POSSIBLE_ANSWER  # solution space being binary searched is [left, right];
                                     # we are trying to find the leftmost insertion point for
                                     # `threshold` in the possible segment
    while left < right:
        mid = left + (right - left) // 2
        if possible(mid):
            right = mid  # squeeze `left` as far left as possible in the possible segment
        else:
            left = mid + 1
    return left
Examples
LC 875. Koko Eating Bananas (✓) ★
Koko loves to eat bananas. There are n piles of bananas; the ith pile has piles[i] bananas. The guards have gone and will come back in h hours.

Koko can decide her bananas-per-hour eating speed of k. Each hour, she chooses some pile of bananas and eats k bananas from that pile. If the pile has fewer than k bananas, she eats all of them instead and will not eat any more bananas during this hour.

Koko likes to eat slowly but still wants to finish eating all the bananas before the guards return.

Return the minimum integer k such that she can eat all the bananas within h hours.
class Solution:
    def minEatingSpeed(self, piles: List[int], h: int) -> int:
        def possible(speed):
            hours_spent = 0
            for banana in piles:
                hours_spent += -(-banana // speed)  # ceiling division: ceil(banana / speed)
                if hours_spent > h:
                    return False
            return True

        left = 1
        right = max(piles)
        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1
        return left
The idea is for Koko to go as slowly as possible while still being able to eat all bananas within h hours. Our goal is to find the speed at which Koko can eat all bananas but where any decrease in speed makes it impossible (that gives us the minimum speed for which eating all bananas is possible).

Hence, we can binary search on the solution space, where the solution space is comprised of speeds in the range [min_possible_speed, max_possible_speed], inclusive. What would make sense as a minimum possible speed? The speed needs to be an integer and it clearly can't be 0; hence, the minimum possible speed is 1, so we set left = 1. What about the maximum possible speed? Each pile can be consumed within a single hour if the speed is the size of the pile with the greatest number of bananas; hence, the maximum possible speed we should account for is max(piles), so we set right = max(piles).

All that is left to do now is greedily search for the leftmost value in the solution space that satisfies the possible constraint.
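A standalone sketch (illustrative name), with the ceiling division called out explicitly since it is easy to get wrong; run here on the two classic examples for this problem:

```python
import math

def min_eating_speed(piles, h):
    # Leftmost-possible search over speeds [1, max(piles)]; note the ceiling
    # division: a pile of 7 at speed 4 still takes 2 full hours.
    def possible(speed):
        return sum(math.ceil(p / speed) for p in piles) <= h

    left, right = 1, max(piles)
    while left < right:
        mid = left + (right - left) // 2
        if possible(mid):
            right = mid
        else:
            left = mid + 1
    return left

print(min_eating_speed([3, 6, 7, 11], 8))        # 4
print(min_eating_speed([30, 11, 23, 4, 20], 5))  # 30
```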
LC 1631. Path With Minimum Effort (✓) ★★
You are a hiker preparing for an upcoming hike. You are given heights, a 2D array of size rows x columns, where heights[row][col] represents the height of cell (row, col). You are situated in the top-left cell, (0, 0), and you hope to travel to the bottom-right cell, (rows - 1, columns - 1) (i.e., 0-indexed). You can move up, down, left, or right, and you wish to find a route that requires the minimum effort.

A route's effort is the maximum absolute difference in heights between two consecutive cells of the route.

Return the minimum effort required to travel from the top-left cell to the bottom-right cell.
class Solution:
    def minimumEffortPath(self, heights: List[List[int]]) -> int:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n

        def possible(route_max_abs_diff):
            seen = {(0, 0)}
            def dfs(row, col):
                if row == m - 1 and col == n - 1:
                    return True
                for dr, dc in dirs:
                    next_row, next_col = row + dr, col + dc
                    if valid(next_row, next_col) and (next_row, next_col) not in seen:
                        cell_val = heights[row][col]
                        adjacent_cell_val = heights[next_row][next_col]
                        if abs(adjacent_cell_val - cell_val) <= route_max_abs_diff:
                            seen.add((next_row, next_col))
                            if dfs(next_row, next_col):
                                return True
                return False
            return dfs(0, 0)

        m = len(heights)
        n = len(heights[0])
        dirs = [(1, 0), (-1, 0), (0, 1), (0, -1)]
        left = 0
        right = max([element for row in heights for element in row]) - 1
        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1
        return left
This is a great problem for a variety of reasons. It can be somewhat tricky at first though. The idea is that if we can find a path from the top left to the bottom right using a certain amount of effort
, then we can certainly find a path that works for any greater amount of effort (i.e., effort + 1
, effort + 2
, etc.). But can we find a valid path using less effort? That's the real question.
If we let our solution space be the amount of effort required for a valid path, then we can binary search from the minimum possible amount of effort to the maximum possible amount of effort, inclusive. The goal is to find the effort amount for a valid path where any effort less than that would not result in a valid path.
What are the min/max effort bounds? The minimum possible effort is 0
because all of the numbers in the path could be the same. What about the maximum possible effort? That would be the maximum value in the matrix minus the smallest possible cell value (the constraint guarantees every height is at least 1):
max([element for row in heights for element in row]) - 1
We could also observe the constraint 1 <= heights[i][j] <= 10^6
, which means either of the following options would be valid for the problem at hand:
# option 1
left = 0
right = max([element for row in heights for element in row]) - 1

# option 2
left = 0
right = 10 ** 6 - 1
The first option is obviously more costly in some respects, but the second option could result in a maximum boundary that is much larger than what we need. Since binary search is so fast, the second option is really not an issue though.
Finally, it should be noted that a stack-based DFS is also quite effective here to avoid the space overhead required by the call stack to deal with recursion:
class Solution:
    def minimumEffortPath(self, heights: List[List[int]]) -> int:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n

        def possible(route_max_abs_diff):
            seen = {(0, 0)}
            def dfs(start_row, start_col):
                stack = [(start_row, start_col)]
                while stack:
                    row, col = stack.pop()
                    if row == m - 1 and col == n - 1:
                        return True
                    for dr, dc in dirs:
                        next_row, next_col = row + dr, col + dc
                        if valid(next_row, next_col) and (next_row, next_col) not in seen:
                            cell_val = heights[row][col]
                            next_cell_val = heights[next_row][next_col]
                            if abs(next_cell_val - cell_val) <= route_max_abs_diff:
                                seen.add((next_row, next_col))
                                stack.append((next_row, next_col))
                return False
            return dfs(0, 0)

        m = len(heights)
        n = len(heights[0])
        dirs = [(1, 0), (-1, 0), (0, 1), (0, -1)]
        left = 0
        right = max([element for row in heights for element in row]) - 1
        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1
        return left
LC 1870. Minimum Speed to Arrive on Time (✓) ★
You are given a floating-point number hour
, representing the amount of time you have to reach the office. To commute to the office, you must take n
trains in sequential order. You are also given an integer array dist
of length n
, where dist[i]
describes the distance (in kilometers) of the i
th train ride.
Each train can only depart at an integer hour, so you may need to wait in between each train ride.
- For example, if the 1st train ride takes 1.5 hours, you must wait for an additional 0.5 hours before you can depart on the 2nd train ride at the 2 hour mark.
Return the minimum positive integer speed (in kilometers per hour) that all the trains must travel at for you to reach the office on time, or -1
if it is impossible to be on time.
Tests are generated such that the answer will not exceed 10^7
and hour
will have at most two digits after the decimal point.
from math import ceil

class Solution:
    def minSpeedOnTime(self, dist: List[int], hour: float) -> int:
        if len(dist) > ceil(hour):
            return -1

        def possible(speed):
            hours_spent = 0
            for i in range(len(dist) - 1):
                distance = dist[i]
                hours_spent += ceil(distance / speed)  # each intermediate ride is rounded up to the next integer hour
                if hours_spent > hour:
                    return False
            hours_spent += dist[-1] / speed  # the last ride is not rounded up
            return hours_spent <= hour

        left = 1
        right = 10 ** 7
        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1
        return left
This one is a bit wonky to reason about at first, in part because of how the elapsed time is evaluated. It's similar to LC 875. Koko Eating Bananas in that we always take the ceiling of time evaluations, but here the time spent on the last train is not rounded up (we rounded up the time spent on all piles for the Koko eating bananas problem).
When will it be impossible to reach the office on time? It's not when len(dist) > hour
, as LeetCode's second example with input dist = [1,3,2], hour = 2.7
shows. However, it will be impossible if len(dist) > ceil(hour)
. The third example, with input dist = [1,3,2], hour = 1.9
, illustrates this, where the earliest the third train can depart is at the 2 hour mark.
The idea is to conduct a binary search on the range of speeds [min_speed_possible, max_speed_possible]
for which we'll be able to make the trip on time (i.e., where the total number of hours spent is less than or equal to hour
), where we want to minimize the speed required to make the trip on time. If we can make the trip on time for a given speed
, then we can definitely make the trip on time if we increase the speed to speed + 1
. We want to determine when making it on time is possible for speed
but impossible for any valid speed value less than this. Binary search it is!
What's the minimum possible speed? We're told the speed reported must be a positive integer so we set left = 1
. What about the maximum possible speed? We're told that the answer will not exceed 10 ** 7
; hence, we set right = 10 ** 7
.
LC 1283. Find the Smallest Divisor Given a Threshold (✓)
Given an array of integers nums
and an integer threshold
, we will choose a positive integer divisor
, divide all the array by it, and sum the division's result. Find the smallest divisor
such that the result mentioned above is less than or equal to threshold
.
Each result of the division is rounded to the nearest integer greater than or equal to that element. (For example: 7/3 = 3
and 10/2 = 5
).
It is guaranteed that there will be an answer.
from math import ceil

class Solution:
    def smallestDivisor(self, nums: List[int], threshold: int) -> int:
        def possible(divisor):
            running_sum = 0
            for num in nums:
                running_sum += ceil(num / divisor)  # each division result is rounded up
                if running_sum > threshold:
                    return False
            return True

        left = 1
        right = max(nums)
        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1
        return left
This problem is quite similar to LC 875. Koko Eating Bananas. Our solution space is a range of possible divisors, and our goal is to minimize the divisor so that the running sum obtained by dividing each number in nums
by divisor
(and performing the subsequent rounding up to the nearest integer) never exceeds threshold
.
If we can achieve this task for divisor
, then we can definitely achieve the same task by increasing the value of divisor
(e.g., divisor + 1
). Our goal, then, is to find a divisor
such that the task is possible but as soon as we decrease the value of divisor the task becomes impossible.
What would the minimum divisor be for our solution space? We're told it must be a positive integer; hence, we set left = 1
. What about the maximum divisor? Since each division result gets rounded up to the nearest integer, the smallest the running sum could be would occur if we chose the divisor to be the maximum value in nums
. Then no division would result in a value greater than 1
, and since we're told nums.length <= threshold <= 10^6
, we let right = max(nums)
.
LC 410. Split Array Largest Sum (✓) ★★★
Given an array nums
which consists of nonnegative integers and an integer m
, you can split the array into m
nonempty continuous subarrays.
Write an algorithm to minimize the largest sum among these m
subarrays.
class Solution:
    def splitArray(self, nums: List[int], k: int) -> int:
        def possible(max_sum):
            num_subarrays = 0
            subarray_sum = 0
            idx = 0
            while idx < len(nums):
                if nums[idx] > max_sum:
                    return False
                subarray_sum += nums[idx]
                if subarray_sum > max_sum:
                    subarray_sum = nums[idx]
                    num_subarrays += 1
                    if num_subarrays > k:
                        return False
                idx += 1
            return (num_subarrays + 1) <= k

        left = 0
        right = sum(nums)
        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1
        return left
This problem is excellent and somewhat similar to LC 1231. Divide Chocolate in nature. The tip-off that the problem may involve binary searching on a solution space is given by the fact that if a subarray sum subarray_sum
works as a solution to the problem then increasing the value of subarray_sum
will certainly work as well. Our goal is to find a subarray sum that, when decreased at all, results in not being able to fulfill the requirements of the problem. All of this implies we should be able to conduct a binary search on the solution space of minimum subarray sum values.
What would the smallest possible subarray sum value be? Since values of 0
are allowed in nums
, we should set left = 0
. What about the largest possible subarray sum value? No subarray can have a larger sum than the entire array; hence, we set right = sum(nums)
.
The harder part for this problem is designing the possible
function effectively. The intuition is that we construct the k
subarray sums in a greedy fashion, where we keep adding to one of the k
subarrays until the given max_sum
has been exceeded, at which point we move on to constructing the next subarray sum. If the number of subarrays we must use to accomplish this ever exceeds k
, then we issue an early return of false
. If, however, we process all values and the total number of subarrays used is less than or equal to k
, then we return true
because k
will always be less than or equal to nums.length
per the constraint 1 <= k <= min(50, nums.length)
; that is, if somehow we've processed all values in nums
and have never exceeded max_sum
for a subarray sum and the number of subarrays used is much smaller than k
, then we can simply distribute the values in the subarrays to fill up the remaining empty subarrays until the total number of subarrays equals k
(the sum of the subarrays from which values are borrowed can only decrease due to the constraint 0 <= nums[i] <= 10^6
).
LC 1482. Minimum Number of Days to Make m Bouquets ★★
Given an integer array bloomDay
, an integer m
and an integer k
.
We need to make m
bouquets. To make a bouquet, you need to use k
adjacent flowers from the garden.
The garden consists of n
flowers, the i
th flower will bloom in the bloomDay[i]
and then can be used in exactly one bouquet.
Return the minimum number of days you need to wait to be able to make m
bouquets from the garden. If it is impossible to make m
bouquets return -1
.
class Solution:
    def minDays(self, bloomDay: List[int], m: int, k: int) -> int:
        def possible(days):
            bouquets_formed = flower_count = 0
            for flower in bloomDay:
                if flower <= days:
                    flower_count += 1
                else:
                    flower_count = 0
                if flower_count == k:
                    bouquets_formed += 1
                    flower_count = 0
                    if bouquets_formed == m:
                        return True
            return False

        # not enough flowers for required number of bouquets
        if len(bloomDay) < (m * k):
            return -1

        left = min(bloomDay)
        right = max(bloomDay)
        while left < right:
            mid = left + (right - left) // 2
            if possible(mid):
                right = mid
            else:
                left = mid + 1
        return left
As usual, the primary difficulty in this problem is identifying it as having a nice binary search solution. The idea is to binary search on the solution space where the solution space is identified as being binary searchable as follows: if I can form m
bouquets in d
days, then I can definitely form m
bouquets in > d
days. We want to find the minimum number for d
such that trying to form m
bouquets in any fewer days is impossible. We can binary search for that number of days, as shown above.
Greedy (looking for maximum)
Remarks
TBD
def binary_search_on_solution_space_MAXIMUM(arr):
    def possible(threshold):
        # this function is implemented depending on the problem
        return BOOLEAN

    # solution space being binary searched is [left, right) as opposed to [left, right];
    # we are trying to find the leftmost insertion point for `threshold`
    # in the impossible segment so we can report the position immediately to its left,
    # the last position in the possible segment before the impossible segment
    # (i.e., the maximum possible value)
    left = MINIMUM_POSSIBLE_ANSWER
    right = MAXIMUM_POSSIBLE_ANSWER + 1

    while left < right:
        mid = left + (right - left) // 2
        if not possible(mid):
            right = mid  # squeeze `right` as far left as possible in the impossible segment
        else:
            left = mid + 1
    return left - 1
Examples
LC 1231. Divide Chocolate (✓) ★★★
You have one chocolate bar that consists of some chunks. Each chunk has its own sweetness given by the array sweetness
.
You want to share the chocolate with your K
friends so you start cutting the chocolate bar into K+1
pieces using K
cuts, each piece consists of some consecutive chunks.
Being generous, you will eat the piece with the minimum total sweetness and give the other pieces to your friends.
Find the maximum total sweetness of the piece you can get by cutting the chocolate bar optimally.
class Solution:
    def maximizeSweetness(self, sweetness: List[int], k: int) -> int:
        def possible(min_sweetness):
            pieces = 0
            piece_sweetness = 0
            for chunk_sweetness in sweetness:
                piece_sweetness += chunk_sweetness
                if piece_sweetness >= min_sweetness:
                    pieces += 1
                    piece_sweetness = 0
                    if pieces == k + 1:
                        return True
            return False

        left = 1
        right = sum(sweetness) + 1
        while left < right:
            mid = left + (right - left) // 2
            if not possible(mid):
                right = mid
            else:
                left = mid + 1
        return left - 1
This is such a fantastic problem, but it is quite difficult. As usual (for binary search problems on solution spaces anyway), constructing the possible
function is where much of the difficulty lies. Our goal is to ensure we can actually come up with k + 1
pieces of chocolate to distribute where each piece meets or exceeds the required min_sweetness
threshold. If we can do that, then we're in business. But, of course, as the problem indicates, we want to maximize the minimum sweetness of our own piece of chocolate (since all other pieces distributed must have the same or more sweetness compared to our own).
Hence, we need to binary search on a range of possible minimum sweetness values. What would the smallest possible sweetness be? We're told from the constraint that every chunk of the chocolate has a sweetness of at least 1
; hence, we set left = 1
. What about the largest possible sweetness? Note that k == 0
is possible, which means there's a possibility where we need to share the chocolate bar with no one — in such a case, we would want to consume all of the sweetness, sum(sweetness)
. But since we're binary searching a solution space for a maximum value, we need to set right = sum(sweetness) + 1
as opposed to right = sum(sweetness)
. Why? Because we might miss the maximum value in an offbyone error otherwise; for example, consider the input sweetness = [5,5], k = 0
. The while loop terminates once left == right
and right == sum(sweetness)
, but we return left - 1
, which is equal to 10 - 1 == 9
instead of the obviously correct answer of 10
.
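To see the off-by-one concretely, here is a standalone sketch of the same search where the upper bound is passed in as a parameter (the function name and the `right` parameter are purely illustrative, not part of the original solution):

```python
def maximize_sweetness(sweetness, k, right):
    # greedy feasibility check: can we cut k + 1 pieces,
    # each with total sweetness >= min_sweetness?
    def possible(min_sweetness):
        pieces = 0
        piece_sweetness = 0
        for chunk in sweetness:
            piece_sweetness += chunk
            if piece_sweetness >= min_sweetness:
                pieces += 1
                piece_sweetness = 0
                if pieces == k + 1:
                    return True
        return False

    left = 1
    while left < right:
        mid = left + (right - left) // 2
        if not possible(mid):
            right = mid
        else:
            left = mid + 1
    return left - 1

print(maximize_sweetness([5, 5], 0, sum([5, 5])))      # prints 9 (wrong: right endpoint excluded)
print(maximize_sweetness([5, 5], 0, sum([5, 5]) + 1))  # prints 10 (correct: + 1 padding)
```

With `right = sum(sweetness)` the loop terminates at `left == 10` and we incorrectly report `9`; padding the bound by `+ 1` recovers the correct answer of `10`.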
LC 1552. Magnetic Force Between Two Balls ★★
In universe Earth C137, Rick discovered a special form of magnetic force between two balls if they are put in his new invented basket. Rick has n empty baskets, the i
th basket is at position[i]
, Morty has m
balls and needs to distribute the balls into the baskets such that the minimum magnetic force between any two balls is maximum.
Rick stated that magnetic force between two different balls at positions x
and y
is |x - y|
.
Given the integer array position
and the integer m
. Return the required force.
class Solution:
    def maxDistance(self, position: List[int], m: int) -> int:
        def possible(min_mag_force):
            balls_placed = 1
            prev_ball_pos = position[0]
            for idx in range(1, len(position)):
                curr_ball_pos = position[idx]
                if curr_ball_pos - prev_ball_pos >= min_mag_force:
                    balls_placed += 1
                    prev_ball_pos = curr_ball_pos
                    if balls_placed == m:
                        return True
            return False

        position.sort()
        left = 1
        right = (position[-1] - position[0]) + 1
        while left < right:
            mid = left + (right - left) // 2
            if not possible(mid):
                right = mid
            else:
                left = mid + 1
        return left - 1
The problem description is one of the hardest things about this problem. But after fighting to understand it, our thinking will gradually start to resemble what's included in the hints and point us to binary search on the solution space as a good strategy:
Hint 1: If you can place balls such that the answer is x,
then you can do it for y where y < x.
Hint 2: Similarly, if you cannot place balls such that
the answer is x then you can do it for y where y > x.
Hint 3: Binary search on the answer and greedily see if it is possible.
This problem is quite similar to LC 1231. Divide Chocolate in many ways, but instead of trying to maximize the sweetness of our least sweet piece of chocolate amongst k
friends, we are trying to maximize the magnetic force between the least magnetically attracted pair of balls. The idea is to binary search on possible answer values for the minimum magnetic force required and to maximize that value as much as possible.
Hence, our first task is to build a possible
function to determine whether or not the task at hand is possible for some given magnetic force, and we greedily try to determine the possibility; that is, we place a ball whenever the magnetic force provided has been met or exceeded (this ensures the magnetic force provided is, indeed, minimum). All balls placed must have at least min_mag_force
magnetic force between them. Our goal is to maximize that value.
The smallest possible magnetic force would be when we have n
positions and m == n
balls, where the positions are all spaced a single unit apart. That would give us a magnetic force of 1
, meaning we should set left = 1
. The largest possible magnetic force would be the last position value minus the first position value (assuming the position
array to be sorted at that point). Since the right endpoint needs to be included and we're always returning left - 1
, we should set right = (position[-1] - position[0]) + 1
.
Dynamic programming
Memoization (topdown)
Remarks
TBD
def fn(arr):
    # 1. define a function that will compute/contain
    # the answer to the problem for any given state
    def dp(STATE):
        # 3. use base cases to make the recurrence relation useful
        if BASE_CASE:
            return 0
        if STATE in memo:
            return memo[STATE]
        # 2. define a recurrence relation to transition between states
        ans = RECURRENCE_RELATION(STATE)
        memo[STATE] = ans
        return ans

    memo = {}
    return dp(STATE_FOR_WHOLE_INPUT)
Examples
LC 746. Min Cost Climbing Stairs
You are given an integer array cost
where cost[i]
is the cost of the i-th step on a staircase. Once you pay the cost, you can either climb one or two steps.
You can either start from the step with index 0
, or the step with index 1
.
Return the minimum cost to reach the top of the floor.
class Solution:
    def minCostClimbingStairs(self, cost: List[int]) -> int:
        def dp(step):
            if step in memo:
                return memo[step]
            step_cost = min(dp(step - 1) + cost[step - 1], dp(step - 2) + cost[step - 2])
            memo[step] = step_cost
            return step_cost

        memo = {}
        memo[0] = memo[1] = 0
        return dp(len(cost))
Tabulation (bottomup)
Remarks
TBD
def fn(arr):
    # 1. initialize a table (array, list, etc.)
    # to store solutions of subproblems.
    dp_table = INITIALIZE_TABLE()

    # 2. fill the base cases into the table.
    dp_table = FILL_BASE_CASES(dp_table)

    # 3. iterate over the table in a specific order
    # to fill in the solutions of larger subproblems.
    for STATE in ORDER_OF_STATES:
        dp_table[STATE] = CALCULATE_STATE_FROM_PREVIOUS_STATES(dp_table, STATE)

    # 4. the answer to the whole problem is now in the table,
    # typically at the last entry or a specific position.
    return dp_table[FINAL_STATE_OR_POSITION]

# example usage
arr = [INPUT_DATA]
result = fn(arr)
Examples
TBD
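Until a fuller set of examples lands here, one quick illustration: the top-down LC 746 solution from the previous section can be recast with this bottom-up template. This is a minimal sketch (the standalone function form, rather than LeetCode's `class Solution` wrapper, is just for brevity):

```python
def min_cost_climbing_stairs(cost):
    n = len(cost)
    # dp[i] is the minimum cost to reach step i;
    # steps 0 and 1 are free starting points (base cases)
    dp = [0] * (n + 1)
    # fill larger subproblems from the two preceding states
    for step in range(2, n + 1):
        dp[step] = min(dp[step - 1] + cost[step - 1],
                       dp[step - 2] + cost[step - 2])
    # the answer (cost to reach the top, i.e., "step n") is the last entry
    return dp[n]

print(min_cost_climbing_stairs([10, 15, 20]))  # prints 15
```

Note how the recurrence is identical to the memoized version; only the order of evaluation changes (smallest states first instead of recursing down from the largest).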
Graphs
BFS
Remarks
Assume the nodes are numbered from 0
to n  1
and the graph is given as an adjacency list. Depending on the problem, you may need to convert the input into an equivalent adjacency list before using the templates.
Basic motivation, concepts, and considerations for BFS
The following observations are largely derived from this video. The underlying graph referred to for the rest of this note is assumed to be represented as an adjacency list of index arrays.
The basic idea behind BFS is that you have a starting vertex and you explore vertices in order of how many links they are away from it: first explore the starting vertex, then all vertices one link away (i.e., incident edges involving the start vertex and its immediate neighbors), then all vertices two links away, and so forth, until you explore all vertices reachable from the start vertex. That's it. We can go ahead and come up with some basic pseudocode to model how we might do this.
ResetGraph()
    for v in V
        v.discovered = False

BFS(startVertex)
    ResetGraph()
    startVertex.discovered = True
    Q = new Queue()
    Q.enqueue(startVertex)
    while (not Q.isEmpty())
        u = Q.dequeue()
        for each v such that u -> v
            if (not v.discovered)
                v.discovered = True
                Q.enqueue(v)
Running the algorithm above implies a specific "breadth first search tree". The root is the start node, and if one node discovers another node (i.e., u -> v
), then it's that node's parent in the tree (i.e., u
is considered to be v
's parent in the tree). We can imagine that, for the same graph, we could get a different breadth first search tree based on whatever node we picked as the start node (it can get even wilder if we pick several start nodes, something that happens when executing a multi-source BFS).
A more detailed implementation of BFS would be as follows, where we not only create the breadth first search tree but also store the number of links needed to reach each vertex.
ResetGraph()
    for v in V
        v.discovered = False
        v.dist = +inf
        v.pi = nil

BFS(startVertex)
    ResetGraph()
    startVertex.discovered = True
    startVertex.dist = 0
    Q = new Queue()
    Q.enqueue(startVertex)
    while (not Q.isEmpty())
        u = Q.dequeue()
        for each v such that u -> v
            if (not v.discovered)
                v.discovered = True
                v.dist = u.dist + 1
                v.pi = u
                Q.enqueue(v)
Of course, for the scheme above to work, assume there's space to store extra information with each vertex, v
, namely the distance needed to reach the vertex as well as the predecessor parent for that vertex, v.dist
and v.pi
, respectively. This information can be associated with the vertex itself so that we end up representing a vertex basically as a list, where each slot represents, say, the vertex index, the distance to that vertex, and then the predecessor for that vertex: [v_index, v_dist, v_pred]
. These items then get enqueued, dequeued, and updated accordingly. But it's more common to leave the vertex indices to themselves and to instead initialize dist
and pred
arrays of length n
, where n
represents the total number of vertices in the graph (remember we're assuming the underlying graph is an adjacency list of index arrays, where nodes are labeled according to their index).
Certainly an observation worth making is that vertices with a smaller distance value must be dequeued and processed before any vertex with a greater distance value is dequeued and processed. Predecessor values are recorded in part to aid in the process of potentially reconstructing the shortest path from the start vertex to one of its reachable vertices.
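A minimal Python sketch of this array-based bookkeeping might look as follows (the function name and sample graph are illustrative; unreachable vertices simply keep dist == inf and pred == -1):

```python
from collections import deque

def bfs_dist_pred(graph, start):
    # graph: adjacency list of index arrays; nodes labeled 0..n-1
    n = len(graph)
    dist = [float('inf')] * n  # dist[v]: number of links from start to v
    pred = [-1] * n            # pred[v]: v's parent in the BFS tree
    dist[start] = 0
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if dist[neighbor] == float('inf'):  # undiscovered
                dist[neighbor] = dist[node] + 1
                pred[neighbor] = node
                queue.append(neighbor)
    return dist, pred

graph = [[1, 2], [0, 3], [0, 3], [1, 2]]  # a 4-cycle: 0-1-3-2-0
dist, pred = bfs_dist_pred(graph, 0)
print(dist)  # prints [0, 1, 1, 2]
print(pred)  # prints [-1, 0, 0, 1]
```

Using dist itself as the "discovered" marker (a vertex is undiscovered exactly when its distance is still infinite) saves a separate seen structure.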
Reconstructing the shortest path
This can be done recursively as follows:
FindPath(s, t)
    if (s == t)
        return new List().add(t)
    if (t.pi == nil)  # no path exists
        return nil
    return FindPath(s, t.pi).add(t)
Or iteratively (perhaps more intuitively), where the implementation below assumes we've maintained a pred
array where pred[x]
houses the predecessor for node x
if it exists or is -1
otherwise:
ReconstructPath(s, t, pred)
    path = new Stack()
    path.push(t)
    while path.peek() != s:
        if pred[path.peek()] == -1:
            return None  # no path exists from s to t
        path.push(pred[path.peek()])
    reverse path  # obtain original path direction from s to t
    return path
In CLRS (i.e., [20]), the PrintPath
procedure is provided to simply print the path from the vertex s
to the vertex v
(i.e., as opposed to really reconstructing the path):
PrintPath(G, s, v)
    if v == s:
        print s
    else if v.pi == nil:
        print "no path from" s "to" v "exists"
    else:
        PrintPath(G, s, v.pi)
        print v
We can use Python to actually implement the PrintPath
procedure but in a way where we actually obtain the path:
def shortest_path(s, t, path, pred):
    if s == t:
        path.append(s)
        return path
    elif pred[t] == -1:
        return []  # shortest path from s to t does not exist
    else:
        # recurse toward s first so nodes are appended in s -> t order
        shortest_path(s, pred[t], path, pred)
        path.append(t)
        return path
In most cases, however, we would probably be more inclined to reconstruct the path in an iterative fashion, where the code is more readable and we avoid some of the overhead associated with recursion:
def shortest_path(s, t, pred):
    path = [t]
    while path[-1] != s:
        parent = pred[path[-1]]
        if parent == -1:
            return []  # shortest path from s to t does not exist
        path.append(parent)
    path.reverse()
    return path
If we look at CLRS, we will note that there's only a minor change from the "full implementation" algorithm above and how it appears in CLRS: instead of marking vertices as discovered or undiscovered, they use three colors:
 White: Undiscovered
 Gray: Discovered but not yet finished (i.e., when it's in the queue waiting to be dequeued)
 Black: Finished (i.e., dequeued and all unvisited child nodes enqueued)
That's it:
ResetGraph()
    for v in V
        v.color = White
        v.dist = +inf
        v.pi = nil

BFS(startVertex)
    ResetGraph()
    startVertex.color = Gray
    startVertex.dist = 0
    Q = new Queue()
    Q.enqueue(startVertex)
    while (not Q.isEmpty())
        u = Q.dequeue()
        for each v such that u -> v
            if (v.color == White)
                v.color = Gray
                v.dist = u.dist + 1
                v.pi = u
                Q.enqueue(v)
        u.color = Black
from collections import deque

def fn(graph):
    queue = deque([START_NODE])
    seen = {START_NODE}
    ans = 0
    while queue:
        node = queue.popleft()
        # do some logic
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return ans
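As one concrete instance of the template, here is a small sketch where the "do some logic" step simply counts the nodes reachable from a start node (the function name and sample adjacency list are made up for illustration):

```python
from collections import deque

def count_reachable(graph, start):
    # counts nodes reachable from `start`, including `start` itself
    queue = deque([start])
    seen = {start}
    ans = 0
    while queue:
        node = queue.popleft()
        ans += 1  # the "logic": count the dequeued node
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return ans

graph = [[1, 2], [0], [0], [4], [3]]  # two components: {0, 1, 2} and {3, 4}
print(count_reachable(graph, 0))  # prints 3
```

Marking a node as seen at enqueue time (not dequeue time) is the important detail: it guarantees each node enters the queue at most once.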
Examples
LC 1091. Shortest Path in Binary Matrix (✓)
Given an n x n
binary matrix grid
, return the length of the shortest clear path in the matrix. If there is no clear path, return -1
.
A clear path in a binary matrix is a path from the top-left cell (i.e., (0, 0)
) to the bottom-right cell (i.e., (n - 1, n - 1)
) such that:
- All the visited cells of the path are 0
.
- All the adjacent cells of the path are 8-directionally connected (i.e., they are different and they share an edge or a corner).
The length of a clear path is the number of visited cells of this path.
class Solution:
    def shortestPathBinaryMatrix(self, grid: List[List[int]]) -> int:
        def valid(row, col):
            return 0 <= row < n and 0 <= col < n and grid[row][col] == 0

        dirs = [(1, 0), (1, 1), (1, -1), (-1, 0), (-1, 1), (-1, -1), (0, 1), (0, -1)]
        n = len(grid)
        seen = {(0, 0)}
        if grid[0][0] != 0 or grid[n - 1][n - 1] != 0:
            return -1

        queue = deque([(0, 0, 1)])
        while queue:
            row, col, path_length = queue.popleft()
            if row == n - 1 and col == n - 1:
                return path_length
            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                next_node = (next_row, next_col)
                if valid(*next_node) and next_node not in seen:
                    queue.append((*next_node, path_length + 1))
                    seen.add(next_node)
        return -1
It's fairly conventional in BFS solutions for graphs to encode with each node additional information like the current level for that node or some other kind of stateful data. We do not need to encode anything other than each node's position in the seen
set because whenever we encounter a node it will be in the fewest steps possible (i.e., the trademark of BFS solutions ... finding shortest paths).
LC 863. All Nodes Distance K in Binary Tree (✓)
We are given a binary tree (with root node root
), a target
node, and an integer value K
.
Return a list of the values of all nodes that have a distance K
from the target
node. The answer can be returned in any order.
class Solution:
    def distanceK(self, root: TreeNode, target: TreeNode, k: int) -> List[int]:
        def build_parent_lookup(node, parent=None):
            if not node:
                return
            parent_lookup[node] = parent
            build_parent_lookup(node.left, node)
            build_parent_lookup(node.right, node)

        parent_lookup = dict()
        build_parent_lookup(root)

        seen = {target}
        queue = deque([target])
        for _ in range(k):
            num_nodes_this_level = len(queue)
            for _ in range(num_nodes_this_level):
                node = queue.popleft()
                for neighbor in [node.left, node.right, parent_lookup[node]]:
                    if neighbor and neighbor not in seen:
                        seen.add(neighbor)
                        queue.append(neighbor)
        return [node.val for node in queue]
The key in the solution above is to recognize that a BFS traversal will give us exactly what we want if we have some way to reference each node's parent node. The build_parent_lookup
function, which uses DFS to build a hashmap lookup for each node's parent node, gives us this. Much of the rest of the problem then becomes a standard BFS traversal.
LC 542. 01 Matrix (✓)
Given a matrix that consists of 0
and 1
, find the distance of the nearest 0
for each cell.
The distance between two adjacent cells is 1
.
class Solution:
    def updateMatrix(self, mat: List[List[int]]) -> List[List[int]]:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n and mat[row][col] == 1

        m = len(mat)
        n = len(mat[0])
        res = [[0] * n for _ in range(m)]
        dirs = [(1, 0), (-1, 0), (0, 1), (0, -1)]
        seen = set()
        queue = deque()

        for i in range(m):
            for j in range(n):
                if mat[i][j] == 0:
                    seen.add((i, j))
                    queue.append((i, j, 0))

        while queue:
            row, col, dist = queue.popleft()
            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                next_node = (next_row, next_col)
                if valid(*next_node) and next_node not in seen:
                    res[next_row][next_col] = dist + 1
                    seen.add(next_node)
                    queue.append((*next_node, dist + 1))

        return res
The solution above takes advantage of a so-called multi-source BFS (i.e., a BFS traversal with multiple starting sources). The solution also takes advantage of the fact that cells with a value of 0
do not need to be updated; hence, our BFS can start from all nodes with a value of 0
and explore outwards, updating nonzero nodes (i.e., just nodes with value 1
for this problem) with the distance so far from a node with value 0
. Our intuition tells us to start from the nodes with value 1
, but it's much easier to start from the nodes with value 0
.
Additionally, in the valid
function above, the condition and mat[row][col] == 1
is not necessary since all nodes with value 0
are added to the seen
set before exploring outwards, which means all subsequent nodes we'll explore that are both valid and not in seen
will have a nonzero value (i.e., 1
for this problem). This conditional of the valid
function is only kept for the sake of clarity, but it's worth noting that it's not necessary here.
LC 1293. Shortest Path in a Grid with Obstacles Elimination (✓) ★★
Given a m * n
grid, where each cell is either 0
(empty) or 1
(obstacle). In one step, you can move up, down, left or right from and to an empty cell.
Return the minimum number of steps to walk from the upper left corner (0, 0)
to the lower right corner (m - 1, n - 1)
given that you can eliminate at most k
obstacles. If it is not possible to find such walk return -1
.
class Solution:
    def shortestPath(self, grid: List[List[int]], k: int) -> int:
        # ensures the next node to be visited is in bounds
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n

        m = len(grid)
        n = len(grid[0])
        dirs = [(1, 0), (-1, 0), (0, 1), (0, -1)]
        seen = {(0, 0, k)}
        queue = deque([(0, 0, k, 0)])
        while queue:
            row, col, rem, steps = queue.popleft()
            # only valid nodes exist in the queue
            if row == m - 1 and col == n - 1:
                return steps
            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                next_node = (next_row, next_col)
                if valid(*next_node):
                    next_val = grid[next_row][next_col]
                    # if the next value is not an obstacle, then proceed with visits as normal
                    if next_val == 0:
                        if (*next_node, rem) not in seen:
                            seen.add((*next_node, rem))
                            queue.append((*next_node, rem, steps + 1))
                    # the next value is an obstacle: can we still remove obstacles? if so, proceed with visits
                    else:
                        if rem > 0 and (*next_node, rem - 1) not in seen:
                            seen.add((*next_node, rem - 1))
                            queue.append((*next_node, rem - 1, steps + 1))
        return -1
This is an excellent problem for thinking through how a node's state should be recorded in the seen
set; that is, the majority of BFS and DFS traversals on matrix graphs simply record a node's position (i.e., row and column) because the node's position fully describes the state we do not want to visit again. But for some problems, like this one, it's helpful to record more information than just a node's position. Specifically, the state we do not want to visit more than once is a node's position in addition to the number of remaining obstacles we can remove.
Thinking about using the seen
set to record states we do not want to visit multiple times is much more accurate and reflective of our actual goal — only perform computation when absolutely necessary.
LC 1129. Shortest Path with Alternating Colors (✓) ★★
Consider a directed graph with nodes labelled 0, 1, ..., n - 1. In this graph, each edge is either red or blue, and there could be self-edges or parallel edges.
Each [i, j] in red_edges denotes a red directed edge from node i to node j. Similarly, each [i, j] in blue_edges denotes a blue directed edge from node i to node j.
Return an array answer of length n, where each answer[X] is the length of the shortest path from node 0 to node X such that the edge colors alternate along the path (or -1 if such a path doesn't exist).
class Solution:
    def shortestAlternatingPaths(self, n: int, redEdges: List[List[int]], blueEdges: List[List[int]]) -> List[int]:
        def build_graph(edge_arr):
            graph = defaultdict(list)
            for node, neighbor in edge_arr:
                graph[node].append(neighbor)
            return graph

        RED_GRAPH = build_graph(redEdges)
        BLUE_GRAPH = build_graph(blueEdges)
        RED = 1
        BLUE = 0

        ans = [float('inf')] * n
        seen = {(0,RED), (0,BLUE)}
        queue = deque([(0,RED,0),(0,BLUE,0)])
        while queue:
            node, color, steps = queue.popleft()
            ans[node] = min(ans[node], steps)
            alt_color = 1 - color
            graph = RED_GRAPH if alt_color == RED else BLUE_GRAPH
            for neighbor in graph[node]:
                if (neighbor, alt_color) not in seen:
                    seen.add((neighbor, alt_color))
                    queue.append((neighbor, alt_color, steps + 1))

        return [ val if val != float('inf') else -1 for val in ans ]
This is such a great problem in so many ways. The idea is to execute a "semi-multi-source" BFS, where we start with node 0 as if it's red as well as if it's blue. Then we expand outwards.
We also take advantage of a nice numerical trick: 1 - 1 == 0, and 1 - 0 == 1. This allows us to effectively (and efficiently) alternate between colors.
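To see the trick in isolation, here is a tiny sketch (the `alternate` helper name is my own, not part of the solution above):

```python
RED, BLUE = 1, 0

# 1 - color flips 1 -> 0 and 0 -> 1, alternating colors without branching
def alternate(color):
    return 1 - color

print(alternate(RED))   # 0 (BLUE)
print(alternate(BLUE))  # 1 (RED)
```

Because the operation is its own inverse, applying it twice returns the original color, which is exactly what we want as edge colors alternate along a path.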
LC 1926. Nearest Exit from Entrance in Maze (✓)
You are given an m x n matrix maze (0-indexed) with empty cells (represented as '.') and walls (represented as '+'). You are also given the entrance of the maze, where entrance = [entrance_row, entrance_col] denotes the row and column of the cell you are initially standing at.
In one step, you can move one cell up, down, left, or right. You cannot step into a cell with a wall, and you cannot step outside the maze. Your goal is to find the nearest exit from the entrance. An exit is defined as an empty cell that is at the border of the maze. The entrance does not count as an exit.
Return the number of steps in the shortest path from the entrance to the nearest exit, or -1 if no such path exists.
class Solution:
    def nearestExit(self, maze: List[List[str]], entrance: List[int]) -> int:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n and maze[row][col] == '.'

        m = len(maze)
        n = len(maze[0])
        dirs = [(-1,0),(1,0),(0,-1),(0,1)]
        start_node = tuple(entrance)
        seen = {start_node}
        queue = deque([(*start_node, 0)])
        exit_rows = {0, m - 1}
        exit_cols = {0, n - 1}
        while queue:
            row, col, moves = queue.popleft()
            if row in exit_rows or col in exit_cols:
                if row != start_node[0] or col != start_node[1]:
                    return moves
            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                next_node = (next_row, next_col)
                if valid(*next_node) and next_node not in seen:
                    seen.add(next_node)
                    queue.append((*next_node, moves + 1))

        return -1
There are several variables to initialize before the traversal proper, and that is okay.
LC 909. Snakes and Ladders (✓)
On an N x N board, the numbers from 1 to N*N are written boustrophedonically starting from the bottom left of the board, and alternating direction each row. For example, for a 6 x 6 board, the numbers are written as follows:
You start on square 1 of the board (which is always in the last row and first column). Each move, starting from square x, consists of the following:
- You choose a destination square S with number x+1, x+2, x+3, x+4, x+5, or x+6, provided this number is <= N*N. (This choice simulates the result of a standard 6-sided die roll: i.e., there are always at most 6 destinations, regardless of the size of the board.)
- If S has a snake or ladder, you move to the destination of that snake or ladder. Otherwise, you move to S.
A board square on row r and column c has a "snake or ladder" if board[r][c] != -1. The destination of that snake or ladder is board[r][c].
Note that you only take a snake or ladder at most once per move: if the destination of a snake or ladder is the start of another snake or ladder, you do not continue moving. (For example, if the board is [[-1,4],[-1,3]], and on the first move your destination square is 2, then you finish your first move at 3, because you do not continue moving to 4.)
Return the least number of moves required to reach square N*N. If it is not possible, return -1.
class Solution:
    def snakesAndLadders(self, board: List[List[int]]) -> int:
        def valid(row, col):
            return 0 <= row < n and 0 <= col < n

        def label_to_cell(num):
            row = (num - 1) // n
            col = (num - 1) % n
            if row % 2 == 1:
                col = (n - 1) - col
            return [(n - 1) - row, col]

        n = len(board)
        seen = {1}
        queue = deque([(1,0)])
        while queue:
            label, moves = queue.popleft()
            if label == n ** 2:
                return moves
            for next_label in range(label + 1, min(label + 6, n ** 2) + 1):
                row, col = label_to_cell(next_label)
                if valid(row, col) and next_label not in seen:
                    seen.add(next_label)
                    if board[row][col] != -1:
                        queue.append((board[row][col], moves + 1))
                    else:
                        queue.append((next_label, moves + 1))

        return -1
This is a rough one, mostly because of the convoluted problem description. The trickiest part is coming up with a performant way of converting from label to cell:
def label_to_cell(num):
    row = (num - 1) // n
    col = (num - 1) % n
    if row % 2 == 1:
        col = (n - 1) - col
    return [(n - 1) - row, col]
Coming up with the function above takes some patience and experimentation at first. Another challenge is the mental shift away from generally recording in the seen set the position of the node we're processing; in this case, we only care about what node labels we've previously seen. As we explore neighbors, we add each label to the seen set, but the items we add to the queue will depend on whether or not we've encountered a snake or ladder.
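As a sanity check, the conversion can be exercised on a 6 x 6 board: square 1 sits at the bottom-left corner, square 7 sits one row up on the right edge (that row runs right to left), and square 36 lands at the top-left. A small sketch (the expected coordinates below reflect my own reading of the boustrophedon layout):

```python
n = 6

def label_to_cell(num):
    # rows are counted bottom-up from the label, and every other row
    # (odd rows, counting from the bottom) runs right-to-left
    row = (num - 1) // n
    col = (num - 1) % n
    if row % 2 == 1:
        col = (n - 1) - col
    return [(n - 1) - row, col]

print(label_to_cell(1))    # bottom-left corner of the board
print(label_to_cell(7))    # one row up, right edge (reversed row)
print(label_to_cell(36))   # top-left corner for an even-sized board
```

Tracing a few labels by hand like this is usually the fastest way to convince yourself the row-reversal condition is correct.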
LC 1376. Time Needed to Inform All Employees (✓) ★
A company has n employees with a unique ID for each employee from 0 to n - 1. The head of the company is the one with headID.
Each employee has one direct manager given in the manager array, where manager[i] is the direct manager of the ith employee, and manager[headID] = -1. It is guaranteed that the subordination relationships have a tree structure.
The head of the company wants to inform all the company employees of an urgent piece of news. He will inform his direct subordinates, and they will inform their subordinates, and so on until all employees know about the urgent news.
The ith employee needs informTime[i] minutes to inform all of his direct subordinates (i.e., after informTime[i] minutes, all his direct subordinates can start spreading the news).
Return the number of minutes needed to inform all the employees about the urgent news.
class Solution:
    def numOfMinutes(self, n: int, headID: int, manager: List[int], informTime: List[int]) -> int:
        # treat the graph as directed (from managers to subordinates)
        def build_adj_list(manager_arr):
            graph = defaultdict(list)
            for emp in range(n):
                graph[manager_arr[emp]].append(emp)
            del graph[-1]
            return graph

        graph = build_adj_list(manager)
        minutes_needed = informTime[headID]
        seen = {headID}
        queue = deque([(headID, 0)]) # unique base case
        while queue:
            manager, time_so_far = queue.popleft()
            for subordinate in graph[manager]:
                minutes_needed = max(minutes_needed, time_so_far + informTime[manager])
                seen.add(subordinate)
                queue.append((subordinate, time_so_far + informTime[manager]))

        return minutes_needed
This seems like a mostly natural BFS problem, but one of the tricks involves effectively handling how the queue is initialized. It's tempting to do queue = deque([(headID, informTime[headID])]), but this would be wrong whenever subordinates exist because we would almost certainly end up overcounting (time_so_far + informTime[manager] is already the time so far for each subordinate node of a manager).
Sometimes BFS problems can be tricky because of how we handle queue initialization. This is definitely one of those problems.
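The overcounting is easy to see on a tiny example. Below is a stripped-down sketch of the same BFS on a three-employee company of my own invention (head 0 takes 1 minute; employees 1 and 2 have no reports). Seeding the queue with (headID, 0) gives the correct answer of 1; seeding with (headID, informTime[headID]) would count the head's minute twice and give 2:

```python
from collections import defaultdict, deque

# hypothetical tiny company: 0 is the head, 1 and 2 report directly to 0
n, headID = 3, 0
manager = [-1, 0, 0]
informTime = [1, 0, 0]

graph = defaultdict(list)
for emp in range(n):
    if manager[emp] != -1:
        graph[manager[emp]].append(emp)

minutes_needed = informTime[headID]
queue = deque([(headID, 0)])  # correct: the head starts at time 0
while queue:
    mgr, time_so_far = queue.popleft()
    for sub in graph[mgr]:
        # informTime[mgr] is added exactly once here, per subordinate
        minutes_needed = max(minutes_needed, time_so_far + informTime[mgr])
        queue.append((sub, time_so_far + informTime[mgr]))

print(minutes_needed)  # 1 (seeding with informTime[headID] would yield 2)
```

No seen set is needed in this sketch because the manager array encodes a tree, so no node can be reached twice.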
LC 994. Rotting Oranges (✓) ★★
You are given an m x n grid where each cell can have one of three values:
- 0 representing an empty cell,
- 1 representing a fresh orange, or
- 2 representing a rotten orange.
Every minute, any fresh orange that is 4-directionally adjacent to a rotten orange becomes rotten.
Return the minimum number of minutes that must elapse until no cell has a fresh orange. If this is impossible, return -1.
class Solution:
    def orangesRotting(self, grid: List[List[int]]) -> int:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n and grid[row][col] == 1

        def find_fresh_and_rotten(mat):
            fresh = 0
            rotten = set()
            for i in range(m):
                for j in range(n):
                    if mat[i][j] == 1:
                        fresh += 1
                    elif mat[i][j] == 2:
                        rotten.add((i,j))
            return fresh, rotten

        m = len(grid)
        n = len(grid[0])
        dirs = [(-1,0),(1,0),(0,-1),(0,1)]
        fresh, seen = find_fresh_and_rotten(grid)

        if fresh == 0:
            return 0

        queue = deque([(*rotten, 0) for rotten in seen])
        while queue:
            row, col, mins = queue.popleft()
            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                if valid(next_row, next_col) and (next_row, next_col) not in seen:
                    seen.add((next_row, next_col))
                    fresh -= 1
                    if fresh == 0:
                        return mins + 1
                    queue.append((next_row, next_col, mins + 1))

        return -1
The solution editorial makes it clear there are several ways of going about solving this problem. The approach above is arguably more straightforward than the solutions provided in the editorial, though.
The key idea is to preprocess the grid in order to find the total number of fresh oranges as well as where the rotten oranges are located (if no fresh oranges are found, then immediately return 0). We then execute a multi-source BFS from each rotten orange, keeping track of how many fresh oranges remain as each cell is processed (if the counter ever reaches 0, then we immediately return the number of minutes required at that point).
DFS (recursive)
Remarks
Assume the nodes are numbered from 0 to n - 1 and the graph is given as an adjacency list. Depending on the problem, you may need to convert the input into an equivalent adjacency list before using the templates.
Basic motivation, concepts, and considerations for DFS
The following observations are largely derived from this video. The underlying graph referred to for the rest of this note is assumed to be represented as an adjacency list of index arrays.
If you're looking at a graph from above, then breadth first search is pretty intuitive. It expands from your search vertex like a wave in a pond. But lots of times you're not looking at a graph from above — you're looking at it from within (e.g., you're exploring a maze, an area in a video game, etc.).
The simple implementation for DFS is somewhat similar to BFS:
ResetGraph(G)
    for v in V
        v.discovered = False

DFSVertex(u)
    u.discovered = True
    for each v such that u -> v
        if (not v.discovered)
            DFSVertex(v)

DFS(G, startVertex)
    ResetGraph(G)
    DFSVertex(startVertex)
We start from a single vertex and then go to adjacent vertices. If the next vertex has several adjacent vertices, it doesn't matter — we just choose one of them and keep exploring, ignoring other vertices we could have explored along the way. We do this by making recursive calls to DFSVertex for each new vertex.
Like in BFS, if you think about which vertex discovers another, that implies a "depth first search tree", and we can similarly store parent values in a $\pi$ variable:
ResetGraph(G)
    for v in V
        v.pi = nil
        v.discovered = False

DFSVertex(u)
    u.discovered = True
    for each v such that u -> v
        if (not v.discovered)
            v.pi = u
            DFSVertex(v)

DFS(G, startVertex)
    ResetGraph(G)
    DFSVertex(startVertex)
Similar to BFS, in DFS we can reconstruct paths from the root of the tree — but the paths are less interesting here because they might have more than the minimum number of links. So instead of counting links, we track when each vertex was discovered (i.e., reached) and finished (i.e., fully explored) with time stamps. Of course, we don't need actual time stamps — just a relative order will do:
ResetGraph(G)
    for v in V
        v.pi = nil
        v.discovered = -1
        v.finished = -1
    time = 1

DFSVertex(u)
    u.discovered = time++
    for each v such that u -> v
        if (v.discovered < 0)
            v.pi = u
            DFSVertex(v)
    u.finished = time++

DFS(G, startVertex)
    ResetGraph(G)
    DFSVertex(startVertex)
CLRS pseudocode for DFS
It's probably worth reproducing the pseudocode from [20] to serve as a comparison point. The use of colors is similar to what was used for BFS:
- White: Undiscovered
- Gray: Discovered but not yet finished (i.e., it is actively being explored)
- Black: Finished (i.e., it is done being explored)
With the above in mind, here is the DFS algorithm as it appears in CLRS:
DFS(G)
    for each vertex u in G.V
        u.color = White
        u.pi = nil
    time = 0
    for each vertex u in G.V
        if u.color == White
            DFSVisit(G, u)

DFSVisit(G, u)
    time = time + 1                   # white vertex has just been discovered
    u.d = time
    u.color = Gray
    for each vertex v in G.Adj[u]     # explore each edge (u, v)
        if v.color == White
            v.pi = u
            DFSVisit(G, v)
    time = time + 1
    u.f = time
    u.color = Black                   # blacken u; it is finished
Why would we want to track time stamps when executing a DFS? It turns out that DFS time stamps will be useful for other tasks like topological sorting. Additionally, for some algorithms you want to run DFS from one vertex, while for others you want to run it on the whole graph.
To do that, after initializing the graph once (i.e., with ResetGraph), run DFS on each previously undiscovered vertex; then, instead of getting a DFS tree rooted at one vertex (i.e., startVertex) that only includes vertices reachable from that vertex, we get a forest that will always contain all the graph's vertices.
In terms of our earlier pseudocode for the full implementation, this means changing the block
DFS(G, startVertex)
    ResetGraph(G)
    DFSVertex(startVertex)
to the following:
DFS(G)
    ResetGraph(G)
    for u in V
        if (u.discovered < 0)
            DFSVertex(u)
We then get the following as our updated full implementation:
ResetGraph(G)
    for v in V
        v.pi = nil
        v.discovered = -1
        v.finished = -1
    time = 1

DFSVertex(u)
    u.discovered = time++
    for each v such that u -> v
        if (v.discovered < 0)
            v.pi = u
            DFSVertex(v)
    u.finished = time++

DFS(G)
    ResetGraph(G)
    for u in V
        if (u.discovered < 0)
            DFSVertex(u)
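One concrete payoff of tracking finish times, mentioned above: on a directed acyclic graph, listing vertices in decreasing order of finish time yields a topological order. A minimal sketch (the toy graph and the `topological_order` name are my own illustration, not part of the templates):

```python
def topological_order(graph):
    # graph: adjacency list over vertices 0..n-1, assumed to be acyclic
    n = len(graph)
    discovered = [False] * n
    finish_order = []  # vertices in increasing order of finish time

    def visit(node):
        discovered[node] = True
        for nbr in graph[node]:
            if not discovered[nbr]:
                visit(nbr)
        finish_order.append(node)  # appended exactly when the vertex finishes

    for node in range(n):
        if not discovered[node]:
            visit(node)

    # decreasing finish time == reverse of the append order
    return finish_order[::-1]

# toy DAG: 0 -> 1 -> 3 and 0 -> 2 -> 3
print(topological_order([[1, 2], [3], [3], []]))
```

Every edge in the toy DAG goes from an earlier vertex to a later one in the returned order, which is exactly the topological-sort property.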
It's a useful exercise to consider an example graph and work out how everything above is represented and calculated.
Worked example
Let's consider an example of the algorithm above in action on the following graph:
We'll execute top-level searches on vertices in alphabetical order. Doing so yields the following completely searched graph, as illustrated in the linked video at the top of this note:
As can be seen above, A, E, and I are the roots of their own depth first search trees in this depth first search forest.
To reproduce the final figure above using code, we can first represent the graph by mapping vertices A through I to 0 through 8, inclusive. It also helps to define a lookup table for ease of reference:
graph = [
    [3],
    [0, 6],
    [1, 3, 6],
    [1, 5, 6],
    [7],
    [2],
    [],
    [3, 4, 5],
    [4],
]

lookup = {
    -1: ' ',
    0: 'A',
    1: 'B',
    2: 'C',
    3: 'D',
    4: 'E',
    5: 'F',
    6: 'G',
    7: 'H',
    8: 'I',
}
Now we can write the dfs function to execute a DFS on the entire graph; the inner visit function is where the DFSVertex method is implemented (a reformatting of the output is included at the bottom of the code):
def dfs(graph):
    n = len(graph)
    discovered = [-1] * n
    finished = [-1] * n
    pred = [-1] * n
    time = 0

    def visit(node):
        nonlocal time
        time += 1
        discovered[node] = time
        for nbr in graph[node]:
            if discovered[nbr] < 0:
                pred[nbr] = node
                visit(nbr)
        time += 1
        finished[node] = time

    for node in range(n):
        if discovered[node] < 0:
            visit(node)

    return discovered, finished, [ lookup[parent] for parent in pred ]

print(dfs(graph))
"""
(   A   B   C   D   E   F   G   H   I
  [ 1,  3,  8,  2, 13,  7,  4, 14, 17],  # discovered times
  [12,  6,  9, 11, 16, 10,  5, 15, 18],  # finished times
  [  ,  D,  F,  A,   ,  D,  B,  E,   ]   # pi values
)
"""
The next two aspects of DFS we will consider, namely parenthesis notation and edge classification, are remarked on in greater detail in [20]. The relevant snippets from CLRS are reproduced below (it would likely be helpful to look at these snippets before looking at these topics in the context of the example we've been working through).
Parenthesis theorem (CLRS)
Depth-first search yields valuable information about the structure of a graph. Perhaps the most basic property of depth-first search is that the predecessor subgraph $G_\pi$ does indeed form a forest of trees, since the structure of the depth-first trees exactly mirrors the structure of recursive calls of DFSVISIT. That is, $u = v.\pi$ if and only if $\text{DFSVisit}(G, v)$ was called during a search of $u$'s adjacency list. Additionally, vertex $v$ is a descendant of vertex $u$ in the depth-first forest if and only if $v$ is discovered during the time in which $u$ is gray.
Another important property of depth-first search is that discovery and finish times have parenthesis structure. If the DFSVISIT procedure were to print a left parenthesis "$(u$" when it discovers vertex $u$ and to print a right parenthesis "$u)$" when it finishes $u$, then the printed expression would be well formed in the sense that the parentheses are properly nested.
The following theorem provides another way to characterize the parenthesis structure.
In any depth-first search of a (directed or undirected) graph $G = (V, E)$, for any two vertices $u$ and $v$, exactly one of the following three conditions holds:
- the intervals $[u.d, u.f]$ and $[v.d, v.f]$ are entirely disjoint, and neither $u$ nor $v$ is a descendant of the other in the depth-first forest,
- the interval $[u.d, u.f]$ is contained entirely within the interval $[v.d, v.f]$, and $u$ is a descendant of $v$ in a depth-first tree, or
- the interval $[v.d, v.f]$ is contained entirely within the interval $[u.d, u.f]$, and $v$ is a descendant of $u$ in a depth-first tree.
Edge classification (CLRS)
You can obtain important information about a graph by classifying its edges during a depth-first search. For example, Section 20.4 will show that a directed graph is acyclic if and only if a depth-first search yields no "back" edges (Lemma 20.11).
The depth-first forest $G_\pi$ produced by a depth-first search on graph $G$ can contain four types of edges:
- Tree edges are edges in the depth-first forest $G_\pi$. Edge $(u, v)$ is a tree edge if $v$ was first discovered by exploring edge $(u, v)$.
- Back edges are those edges $(u, v)$ connecting a vertex $u$ to an ancestor $v$ in a depth-first tree. We consider self-loops, which may occur in directed graphs, to be back edges.
- Forward edges are those nontree edges $(u, v)$ connecting a vertex $u$ to a proper descendant $v$ in a depth-first tree.
- Cross edges are all other edges. They can go between vertices in the same depth-first tree, as long as one vertex is not an ancestor of the other, or they can go between vertices in different depth-first trees.
The DFS algorithm has enough information to classify some edges as it encounters them. The key idea is that when an edge $(u, v)$ is first explored, the color of vertex $v$ says something about the edge:
- WHITE indicates a tree edge,
- GRAY indicates a back edge, and
- BLACK indicates a forward or cross edge.
The first case is immediate from the specification of the algorithm. For the second case, observe that the gray vertices always form a linear chain of descendants corresponding to the stack of active DFSVISIT invocations. The number of gray vertices is 1 more than the depth in the depth-first forest of the vertex most recently discovered. Depth-first search always explores from the deepest gray vertex, so that an edge that reaches another gray vertex has reached an ancestor. The third case handles the remaining possibility.
The entire depth first search from the worked example above can be summarized nicely using parentheses:
An opening parenthesis stands for when the depth first search call is made on a vertex, and the closing parenthesis stands for when that call exits:
The parentheses are properly nested because the inner recursive calls must complete before the code that called them. A child will always be nested in its parent, and a vertex is only the child of at most one vertex: the one that discovered it.
If we just count the parentheses from the beginning, they will match the discovery and finish times:
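The parenthesis sequence can also be generated mechanically from the discovery and finish times computed for the worked example (a small sketch reusing those arrays; the single-string output format is my own):

```python
labels = 'ABCDEFGHI'
discovered = [1, 3, 8, 2, 13, 7, 4, 14, 17]
finished   = [12, 6, 9, 11, 16, 10, 5, 15, 18]

# each time stamp 1..2n corresponds to exactly one event:
# "(X" at X's discovery time, "X)" at X's finish time
events = {}
for v, label in enumerate(labels):
    events[discovered[v]] = '(' + label
    events[finished[v]] = label + ')'

print(' '.join(events[t] for t in range(1, 2 * len(labels) + 1)))
# (A (D (B (G G) B) (F (C C) F) D) A) (E (H H) E) (I I)
```

The three top-level parenthesized groups correspond exactly to the three depth first search trees rooted at A, E, and I.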
Recall the fully explored graph for reference and comparison:
In CLRS they do two more things. First, they color vertices in the following manner:
- White: Undiscovered
- Gray: Discovered (but unfinished)
- Black: Finished
Second, they classify edges:
- Tree edges (parent to a child): go to an undiscovered vertex.
- Back edges (to an ancestor): go to a discovered but unfinished vertex (creating a cycle). During DFS, every back edge completes a cycle. Removing back edges from a graph would remove all cycles.
- Forward edges (to a non-child descendant): go to a finished vertex discovered after the current vertex. A forward edge is basically an edge that goes to an indirect descendant, not a direct child. For the graph we've been considering, the edge from D to G is a forward edge. How can we tell? Because G is finished but its discovery time is after that of the current node, D. The vertex G was discovered and explored during the lifetime of D: it's a descendant of D, but not a direct descendant. Node B is the direct descendant of D; node G is the direct descendant of B; and node G is an indirect descendant of D.
- Cross edges (everything else): go to a vertex finished before the current vertex's discovery. It's essentially any edge that's not captured in the edge classifications above. It can go from one branch of a tree to another or even from one tree to another, and there isn't any ancestor or descendant relation between the vertices it links. You can tell because it leads to a vertex that finished before the current vertex was discovered: its parentheses don't overlap.
If we color our example graph so that tree edges are red, back edges are black, forward edges are blue, and cross edges are green, then we end up with the following:
For undirected graphs, we end up seeing each edge twice, once from each vertex. If we classify the edge the first time we see it, then there won't be any forward or cross edges, only tree and back edges.
Unlike breadth first search, if a graph is more connected, then its depth first search trees tend to be taller and more vine-like. If vertices have lots of outgoing edges, then you can keep finding new vertices to explore, and the tree depth can get large. This brings us to a possible implementation hiccup for DFS: for each recursive call, we push variables onto our program's call stack. Different languages have different limits to how deep the call stack can be. A graph with 20,000 vertices might try to make 20,000 nested recursive DFS calls, which can give you a program stack overflow error. To avoid this, we can use our own stack instead of the implicit call stack with recursion.
We can do this in a few ways, but here is one possible manner:
 1  ResetGraph(G)
 2      for v in V
 3          v.pi = nil
 4          v.discovered = -1
 5          v.finished = -1
 6      time = 1
 7
 8  ExploreVertex(u, S)
 9      if (u.discovered < 0)
10          u.discovered = time++
11          S.push(u)
12          for each v such that u -> v
13              S.push(u -> v)
14      else if (u.finished < 0)
15          u.finished = time++
16      # else: ignore, top level search of finished vertex
17
18  ExploreEdge(u -> v, S)
19      if (v.discovered < 0)
20          (u -> v).label = "treeEdge"
21          v.pi = u
22          ExploreVertex(v, S)
23      else if (v.finished < 0)
24          (u -> v).label = "backEdge"
25      else if (v.discovered > u.discovered)
26          (u -> v).label = "forwardEdge"
27      else
28          (u -> v).label = "crossEdge"
29
30  DFS(G)
31      ResetGraph(G)
32      S = new Stack()
33      for u in V
34          S.push(u)
35      while (not S.isEmpty())
36          x = S.pop()
37          if (x.isVertex())
38              ExploreVertex(x, S)
39          else
40              ExploreEdge(x, S)
Let's go through how the pseudocode above is meant to function:
- Line 32: Here, we make a stack where we can push edges and vertices.
- Lines 33-36: Push all vertices onto the stack and pop while the stack isn't empty.
- Lines 9-13: If we pop an undiscovered vertex, then discover it! Push it again to finish it later, and push its outgoing edges. When we pop it again later, after it's discovered (line 14), then finish it (line 15). And if we pop an already finished vertex, then just ignore it (line 16). That's the equivalent of looping through all vertices but only running depth first search on the undiscovered ones.
- Lines 19-28: When we pop an edge from the stack, if it leads to an undiscovered vertex, then it's a tree edge, so label it as such (line 20) and explore that vertex (line 22); otherwise, just label the edge and we're done with it (lines 23-28).
In the two lines where we push either all vertices (line 34) or all edges from a vertex (line 13), if we push them in the opposite order that we would normally loop through them in the recursive version of the algorithm, then the version above will give us the same results: same times, same edge classifications, etc. It will all be the same as the recursive version. The code above doesn't look quite as clean, but it's a nice parallel to see that while breadth first search explicitly uses a first in first out queue of vertices, depth first search can explicitly use a stack of vertices and edges instead of just implicitly using the program call stack.
If we implement all of the pseudocode above for the iterative version, then we will end up with something like the following (in two places, the iteration order is intentionally reversed to ensure the same results as the recursive version):
def dfs_iter(graph):
    n = len(graph)
    edge_classifications = dict()
    discovered = [-1] * n
    finished = [-1] * n
    pred = [-1] * n
    time = 0

    def explore_vertex(node):
        nonlocal time
        if discovered[node] < 0:
            time += 1
            discovered[node] = time
            stack.append(node)
            # push neighbors in reverse order so they pop in the original order
            for i in range(len(graph[node]) - 1, -1, -1):
                nbr = graph[node][i]
                stack.append((node, nbr))
        elif finished[node] < 0:
            time += 1
            finished[node] = time

    def explore_edge(edge):
        node, nbr = edge
        if discovered[nbr] < 0:
            edge_classifications[edge] = 'treeEdge'
            pred[nbr] = node
            explore_vertex(nbr)
        elif finished[nbr] < 0:
            edge_classifications[edge] = 'backEdge'
        elif discovered[nbr] > discovered[node]:
            edge_classifications[edge] = 'forwardEdge'
        else:
            edge_classifications[edge] = 'crossEdge'

    stack = []
    # push vertices in reverse order so they pop in the original order
    for node in range(n - 1, -1, -1):
        stack.append(node)
    while stack:
        x = stack.pop()
        if not isinstance(x, tuple):
            explore_vertex(x)
        else:
            explore_edge(x)

    return discovered, finished, pred, { (lookup[edge[0]], lookup[edge[1]]): edge_classifications[edge] for edge in edge_classifications }
The outcome, formatted manually for the sake of clarity, matches exactly what was produced previously using the recursive approach (the edge classification also matches what we would expect):
"""
(   A   B   C   D   E   F   G   H   I
  [ 1,  3,  8,  2, 13,  7,  4, 14, 17],  # discovered
  [12,  6,  9, 11, 16, 10,  5, 15, 18],  # finished
  [-1,  3,  5,  0, -1,  3,  1,  4, -1],  # predecessors
  { # edge classifications
      ('A', 'D'): 'treeEdge',
      ('D', 'B'): 'treeEdge',
      ('B', 'A'): 'backEdge',
      ('B', 'G'): 'treeEdge',
      ('D', 'F'): 'treeEdge',
      ('F', 'C'): 'treeEdge',
      ('C', 'B'): 'crossEdge',
      ('C', 'D'): 'backEdge',
      ('C', 'G'): 'crossEdge',
      ('D', 'G'): 'forwardEdge',
      ('E', 'H'): 'treeEdge',
      ('H', 'D'): 'crossEdge',
      ('H', 'E'): 'backEdge',
      ('H', 'F'): 'crossEdge',
      ('I', 'E'): 'crossEdge'
  }
)
"""
Why is the order reversal in the iterative version important in order to ensure the same results as in the recursive version? Because stacks are LIFO data structures: if we push elements onto the stack in the same order as we would process them recursively, then they will be popped off in reverse order. Consider the worked example we've been dealing with throughout this entire note. If we pushed the vertices A through I onto the stack in that order, then the first vertex popped off would be I, not A. But that's not the desired result! Hence, we push vertices I, H, ..., B, A onto the stack in that order so they are popped in the order A, B, ..., H, I, matching the order they are processed in the recursive version. Similarly, the order in which neighboring vertices are processed needs to be reversed as well; that is, we need to push neighbors onto the stack in reverse order, which ensures that when they are popped off the stack, they are processed in the original order.
Consider a generalized example to fully clarify things:
- Adjacency list: Let's say u has neighbors [v1, v2, v3].
- Recursive DFS: Processes v1, then v2, then v3.
- Iterative DFS (without reversal): Push v1, v2, v3 onto the stack; pop and process v3, v2, v1 (reverse order).
- Iterative DFS (with reversal): Push v3, v2, v1 onto the stack; pop and process v1, v2, v3 (original order).
def fn(graph):
    def dfs(node):
        ans = 0
        # do some logic
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                ans += dfs(neighbor)
        return ans

    seen = {START_NODE}
    return dfs(START_NODE)
Examples
LC 547. Number of Provinces (✓)
There are n cities. Some of them are connected, while some are not. If city a is connected directly with city b, and city b is connected directly with city c, then city a is connected indirectly with city c.
A province is a group of directly or indirectly connected cities and no other cities outside of the group.
You are given an n x n matrix isConnected where isConnected[i][j] = 1 if the ith city and the jth city are directly connected, and isConnected[i][j] = 0 otherwise.
Return the total number of provinces.
class Solution:
    def findCircleNum(self, isConnected: List[List[int]]) -> int:
        def build_adj_list(adj_mat):
            graph = defaultdict(list)
            n = len(adj_mat)
            for node in range(n):
                for neighbor in range(node + 1, n):
                    if adj_mat[node][neighbor] == 1:
                        graph[node].append(neighbor)
                        graph[neighbor].append(node)
            return graph

        def dfs(node):
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    dfs(neighbor)

        graph = build_adj_list(isConnected)
        seen = set()
        provinces = 0
        for city in range(len(isConnected)):
            if city not in seen:
                provinces += 1
                seen.add(city)
                dfs(city)

        return provinces
Cities are nodes, and provinces are groups of directly or indirectly connected cities (i.e., connected components). The idea here is to explore all provinces by starting with each city and seeing how many cities we can explore from that city; every time we have to start a search again from a new city, we increment the number of overall provinces encountered thus far.
LC 200. Number of Islands (✓)
Given an m x n 2D binary grid grid which represents a map of '1's (land) and '0's (water), return the number of islands.
An island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. You may assume all four edges of the grid are all surrounded by water.
class Solution:
    def numIslands(self, grid: List[List[str]]) -> int:
        def valid(row, col):
            return 0 <= row < m and 0 <= col < n and grid[row][col] == '1'

        def dfs(row, col):
            for dr, dc in dirs:
                next_row, next_col = row + dr, col + dc
                neighbor = (next_row, next_col)
                if neighbor not in seen and valid(*neighbor):
                    seen.add(neighbor)
                    dfs(*neighbor)

        m = len(grid)
        n = len(grid[0])
        dirs = [(-1,0),(1,0),(0,-1),(0,1)]
        seen = set()
        islands = 0
        for i in range(m):
            for j in range(n):
                node = (i, j)
                if node not in seen and grid[i][j] == '1':
                    islands += 1
                    seen.add(node)
                    dfs(*node)

        return islands
Each "island" is a connected component; our job is to count the total number of connected components. The traversal is a fairly standard DFS traversal on a grid-like graph.
LC 1466. Reorder Routes to Make All Paths Lead to the City Zero (✓)
There are n cities numbered from 0 to n - 1 and n - 1 roads such that there is only one way to travel between two different cities (this network forms a tree). Last year, the ministry of transport decided to orient the roads in one direction because they are too narrow.
Roads are represented by connections where connections[i] = [a, b] represents a road from city a to city b.
This year, there will be a big event in the capital (city 0), and many people want to travel to this city.
Your task consists of reorienting some roads such that each city can visit city 0. Return the minimum number of edges changed.
It's guaranteed that each city can reach city 0 after reorder.
class Solution:
    def minReorder(self, n: int, connections: List[List[int]]) -> int:
        roads =