# Posts tagged *graph theory*

Doing the problem Minimum spanning tree for each edge gave me a refresher on minimum spanning trees and forced me to learn a new technique, *binary lifting*.

Here's the problem.

Connected undirected weighted graph without self-loops and multiple edges is given. Graph contains $N$ vertices and $M$ edges.

For each edge $(u, v)$ find the minimal possible weight of the spanning tree that contains the edge $(u, v)$.

The weight of the spanning tree is the sum of weights of all edges included in spanning tree.

It's a pretty simple problem statement, and at a high level not even that hard to solve. I figured out rather quickly the general approach. First, let $w: V \times V \rightarrow \mathbb{R}$ be the weight of an edge.

- Find any minimum spanning tree, $T = (V,E)$.
- Given that minimum spanning tree, root the tree.
- Now, given any edge $(u,v)$, if $(u,v) \in E$, return the weight of the spanning tree. If not, add that edge. We must necessarily have a cycle, for otherwise, the tree would not be a tree.
- Find the cycle by finding the least common ancestor of $u$ and $v$.
- Remove the edge with highest weight along the cycle that is not $(u,v)$. The weight of the minimum spanning tree with $(u,v)$ is obtained by adding $w(u,v)$ and subtracting the weight of edge that we removed.

Unfortunately, implementing this solution is rather tricky. I already knew how to find the minimum spanning tree. Finding the least common ancestor efficiently was not so easy, however.

## Minimum Spanning Tree

Consider an undirected weighted graph $G = (V,E,w)$. A *minimum spanning tree* is a subgraph $MST = (V, E^\prime, w)$, where we choose $E^\prime \subset E$ such that $\sum_{(u,v) \in E^\prime,~u < v} w(u,v)$ is minimized.

There are two standard algorithms to solve this problem, Prim's algorithm and Kruskal's algorithm. Apparently, Prim's is faster on dense graphs, whereas Kruskal's is faster on sparse graphs. Also, to get the advantages of Prim's algorithm, you'll need to have some type of fancy priority queue implemented. I always use Prim's because that's what the USACO taught me.

Prim's algorithm is actually pretty easy to memorize, too, because it's so similar to Dijkstra's algorithm. In fact, I use the exact same implementation of the binary heap. Let's go through the steps.

**Initialization:**Set the parent of each vertex to be a sentinal, say $-1$ if the vertices are labeled from $0$ to $N - 1$. Initialize $E^\prime = \emptyset$. These are the edges of the minimum spanning tree consisting of just vertex 0. Set vertex $0$ as visited and push all its neigbors onto the priority queue with the value being $w(0,v)$ for $(0,v) \in E$. Set each parent of such $v$ to be $0$.**Maintenance:**At the start, we have subgraph $G_k$, which is the minimum spanning tree of some vertices $v_0,v_1,...,v_{k-1}$. Now, we're going to add vertices one-by-one if the priority queue is not empty.- Pop a vertex from $v_k$ from the priority queue. This is the vertex that is closest to the current minimum spanning tree. Mark $v_k$ as visited.
- Let $P(v_k)$ be the parent of $v_k$. Add the edge $\left(P(v_k), v_k\right)$ to our minimum spanning tree.
For each neighbor $u$ of $v_k$, we have several cases.

- If $u$ has never been added to the priority queue, push $u$ with value $w(v_k, u)$. Set $u$'s parent to be $v_k$.
- If $u$ is in the priority queue already, and $w(v_k, u)$ is smaller than the current distance of $u$ from the minimum spanning tree, decrease the value of key $u$ in the priority queue to $w(v_k, u)$. Set $u$'s parent to be $v_k$.
- If $u$ was already visited of $w(v_k,u)$ is bigger the current distance of $u$ from the minimum spanning tree, do nothing.

We need prove that this actually creates a minimum spanning tree $G_{k + 1}$. Let $G_{k + 1}^\prime$ be a minimum spanning tree of vertices $v_0, v_1, v_2,\ldots,v_k$. Let $W(G)$ denote sum of the edge weights of graph $G.$

Suppose for a contradiction that $G_{k + 1}$ is not a minimum spanning tree. $G_{k + 1}$ contains some edge $(u, v_k)$, and $G^\prime_{k+1}$ contains an edge $(u^\prime, v_k)$. Now, we have that $W(G_{k + 1}) = W(G_k) + w(u,v_k)$ and $W(G^\prime_{k + 1}) = W(G_k^\prime) + w(u^\prime,v_k).$ Clearly, $u, u^\prime \in \{v_0,v_1,\ldots,v_{k - 1}\}$. Since $G_k$ is a minimum spanning tree of $v_0, v_1,\ldots,v_{k-1}$ and $w(u,v_k) \leq w(v,v_k)$ for any $v \in \{v_0,v_1,\ldots,v_{k - 1}\}$ by our induction hypothesis and construction, we cannot have that $W(G_{k+1}) > W(G_{k+1}^\prime)$. Thus, $G_{k+1}$ is the minimum spanning tree of vertices $v_0,v_1,\ldots,v_k$, and this greedy approach works.

**Termination:**There's nothing to be done here. Just return $G_{N}$.

Here's an example of this algorithm in action. Vertices and edges in the minimum spanning tree are colored red. Possible candidates for the next vertex are colored in blue. The most recently added vertex is bolded.

Here's the code. It takes and adjacency list and returns the subset of the adjacency list that is a minimum spanning tree, which is not necessarily unique.

```
vector<unordered_map<int, int>> findMinimumSpanningTree(const vector<unordered_map<int, int>> &adjacencyList) {
int N = adjacencyList.size();
vector<unordered_map<int, int>> minimumSpanningTree(N);
vector<int> visited(N, false);
vector<int> parent(N, -1);
phillypham::priority_queue pq(N); // keep track of closest vertex
pq.push(0, 0);
while (!pq.empty()) {
// find closest vertex to current tree
int currentVertex = pq.top();
int minDistance = pq.at(currentVertex);
pq.pop();
visited[currentVertex] = true;
// add edge to tree if it has a parent
if (parent[currentVertex] != -1) {
minimumSpanningTree[parent[currentVertex]][currentVertex] = minDistance;
minimumSpanningTree[currentVertex][parent[currentVertex]] = minDistance;
}
// relax neighbors step
for (pair<int, int> nextVertex : adjacencyList[currentVertex]) { // loop through edges
if (!visited[nextVertex.first]) { // ignore vertices that were already visited
if (!pq.count(nextVertex.first)) { // add vertex to priority queue for first time
parent[nextVertex.first] = currentVertex;
pq.push(nextVertex.first, nextVertex.second); // distance is edge weight
} else if (pq.at(nextVertex.first) > nextVertex.second) {
parent[nextVertex.first] = currentVertex;
pq.decrease_key(nextVertex.first, nextVertex.second);
}
}
}
}
return minimumSpanningTree;
}
```

## Least Common Ancestor

The trickier part is to find the cycle and the max edge along the cycle whenever we add an edge. Suppose we're trying to add edge $(u,v)$ not in $E^\prime$. We can imagine the cycle starting and stopping at the least common ancestor of $u$ and $v$. The naive way to find this is to use two pointers, one starting at $u$ and the other at $v$. Whichever pointer points to a vertex of greater depth, replace that pointer with one to its parent until both pointers point to the same vertex.

This is too slow. To fix this, we use a form of path compression that some call binary lifting. Imagine that we have vertices $0,1,\ldots,N-1$. We denote the $2^j$th ancestor of vertex $i$ by $P_{i,j}$. Storing all $P_{i,j}$ takes $O(N\log N)$ storage since the height of the tree is at most $N$. Then, if we were trying to find the $k$th ancestor, we could use the binary representation and rewrite $k = 2^{k_1} + 2^{k_2} + \cdots + 2^{k_m}$, where $k_1 < k_2 < \cdots < k_m$. In this way, from $v$, we could first find the $2^{k_m}$th ancestor $a_1$. Then, we'd find the $2^{k_{m-1}}$th ancestor of $a_1$. Call it $a_2$. Continuing in this manner, $a_m$ would be the $k$th ancestor of $v$. $m \sim O(\log N)$, and each lookup is takes constant time, so computing the $k$th ancestor is a $O(\log N)$ operation.

We can also compute all $P_{i,j}$ in $O(N\log N)$ time. We initially know $P_{i,0}$ since that's just the parent of vertex $i$. Then for each $j = 1,2,\ldots,l$, where $l = \lfloor \log_2(H) \rfloor$, where $H$ is the height of the tree, we have that \begin{equation} P_{i,j+1} = P_{P_{i,j},j} \end{equation} if the depth of $i$ is at least $2^{j + 1}$ since $P_{i,j}$ is the $2^j$th ancestor of vertex $i$. The $2^j$th ancestor of vertex $P_{i,j}$ will then be the $2^j + 2^j = 2^{j+1}$th ancestor of vertex $i$.

Thus, if we want to find the least common ancestor of $u$ and $v$, we use the following recursive algorithm in which we have three cases.

**Base case:**If $u = v$, the least common ancestor is $u$. If $P_{u,0} = P_{v,0}$, the least common ancestor is $P_{u,0}$.**Case 1**is that the depths of $u$ and $v$ are different. Assume without loss of generality that the depth of $u$ is greater than the depth of $v$. Replace $u$ with $P_{u,j}$, where $j$ is chosen to be as large as possible such that the depth of $P_{u,j}$ is at least the depth of $v$.**Case 2**is that the depths of $u$ and $v$ are the same. Replace $u$ with $P_{u,j}$ and $v$ with $P_{v,j}$, where $j$ is the largest integer such that $P_{u,j}$ and $P_{v,j}$ are distinct.

I'm sure that you're thinking that this is all very interesting, but how about that edge of max weight in the cycle? This paradigm can be used to find this, too.

Suppose we have some function that acts on the edges $g : E^\prime \rightarrow \mathbb{R}$. We have some combiner function $C$ with with we can extend $g$ to $\tilde{g} : V \times V \rightarrow \mathbb{R}$ in the following manner. Consider any two vertices $u$ and $v$. If $(u,v) \in E^\prime$, then $\tilde{g}(u,v) = g(u,v)$. If there is not an edge between $u$ and $v$, there is an intermediate vertex $w$. Then, we define \begin{equation} \tilde{g}(u,v) = C\left(\tilde{g}(u, w), \tilde{g}(w, v)\right). \end{equation}

When we add an edge $(u,v) \not\in E^\prime$, we want add $w(u,v)$ and remove an edge of weight $\tilde{g}(u,v)$. Such a function can be computed in parallel with find the least common ancestor. Let $w$ be the least common ancestor. As we walk up the tree from $u$ to $w$, we'll compute $\tilde{g}(u,w)$. Similarly, we'll compute $\tilde{g}(w,v)$ as we walk up the tree from $v$ to $w$. This will give us $\tilde{g}(u,v)$ after applying $C$.

Let's walk through the steps with code now. Recall our minimum spanning tree is an adjacency list of type `vector<unordered_map<int, int>>`

.

First, we'll root the tree and compute the parent and depth of each vertex.

`vector<pair<int, int>> rootTree(int root, const vector<unordered_map<int, int>> &adjacencyList) { int N = adjacencyList.size(); vector<pair<int, int>> parentDepth(N); stack<int, vector<int>> s; s.push(root); parentDepth[root] = make_pair(-1, 0); while (!s.empty()) { int currentVertex = s.top(); s.pop(); for (pair<int, int> nextVertex : adjacencyList[currentVertex]) { if (nextVertex.first != parentDepth[currentVertex].first) { parentDepth[nextVertex.first] = make_pair(currentVertex, parentDepth[currentVertex].second + 1); s.push(nextVertex.first); } } } return parentDepth; }`

Next, we compute $P_{i,j}$ and $\tilde{g}(i,P_{i,j})$. $P_{i,j}$ corresponds to

`ancestor[i][j]`

, and $\tilde{g}(i,P_{i,j})$ corresponds to`maxEdge[i][j]`

.`void memoizeAncestorAndMaxEdge(vector<vector<int>> &ancestor, vector<vector<int>> &maxEdge, vector<unordered_map<int, int>> &minimumSpanningTree, const vector<pair<int, int>> &parentDepth) { int N = minimumSpanningTree.size(); int maxLogDepth = 0; for (int i = 0; i < N; ++i) { int logDepth = parentDepth[i].second == 0 ? 0 : floor(log2(parentDepth[i].second) + 1); if (maxLogDepth < logDepth) maxLogDepth = logDepth; ancestor.push_back(vector<int>(logDepth)); maxEdge.push_back(vector<int>(logDepth)); if (logDepth > 0) { ancestor.back().front() = parentDepth[i].first; maxEdge.back().front() = minimumSpanningTree[parentDepth[i].first][i]; } } for (int j = 1; j < maxLogDepth; ++j) { for (int i = 0; i < N; ++i) { int logDepth = parentDepth[i].second == 0 ? 0 : floor(log2(parentDepth[i].second) + 1); // go up 2^(j-1) + 2^(j-1) = 2^j levels if (j < logDepth) { ancestor[i][j] = ancestor[ancestor[i][j - 1]][j - 1]; maxEdge[i][j] = max(maxEdge[i][j-1], maxEdge[ancestor[i][j - 1]][j - 1]); } } } }`

Now, we can recursively compute $\tilde{g}(u,v)$ whenever we want to add $(u,v) \not\in E^\prime$. Note that one of the base cases is handle in the

`else`

part of the`if`

statement.`int findMaxEdge(int u, int v, const vector<pair<int, int>> &parentDepth, const vector<vector<int>> &ancestor, const vector<vector<int>> &maxEdge) { if (u == v) return 0; if (parentDepth[u].second < parentDepth[v].second) swap(u, v); // now depth(u) >= depth(v) if (parentDepth[u].second != parentDepth[v].second) { int depthDelta = parentDepth[u].second - parentDepth[v].second; int logDelta = floor(log2(depthDelta)); return max(maxEdge[u][logDelta], findMaxEdge(ancestor[u][logDelta], v, parentDepth, ancestor, maxEdge)); } else { // depth(u) == depth(v) int L = 0; int U = parentDepth[u].second == 0 ? 0 : floor(log2(parentDepth[u].second) + 1); int mid = L + (U - L)/2; while (L < U) { // find smallest index such that ancestor[u][j] == ancestor[v][j] if (ancestor[u][mid] == ancestor[v][mid]) { U = mid; } else { // ancestor[u][mid] != ancestor[v][mid] L = mid + 1; } mid = L + (U - L)/2; } if (mid == 0) return max(maxEdge[u][mid], maxEdge[v][mid]); --mid; // recursively run on the shallowest distinct ancestors int maxEdgeWeight = max(maxEdge[u][mid], maxEdge[v][mid]); return max(maxEdgeWeight, findMaxEdge(ancestor[u][mid], ancestor[v][mid], parentDepth, ancestor, maxEdge)); } }`

## Code

Here's the main loop, the code that glues all these functions together.

```
long long computeTreeWeightHelper(int parent, int vertex,
const vector<unordered_map<int, int>> &adjacencyList) {
long long weight = 0LL;
for (pair<int, int> nextVertex : adjacencyList[vertex]) {
if (nextVertex.first != parent) {
weight += nextVertex.second + computeTreeWeightHelper(vertex, nextVertex.first, adjacencyList);
}
}
return weight;
}
long long computeTreeWeight(const vector<unordered_map<int, int>> &adjacencyList) {
return computeTreeWeightHelper(-1, 0, adjacencyList);
}
int main(int argc, char *argv[]) {
ios::sync_with_stdio(false); cin.tie(NULL);
int N, M; cin >> N >> M; // vertices and edges
vector<tuple<int, int, int>> edgeList; edgeList.reserve(M);
vector<unordered_map<int, int>> adjacencyList(N);
for (int m = 0; m < M; ++m) {
int u, v, w; cin >> u >> v >> w;
--u; --v;
if (adjacencyList[u].count(v)) {
adjacencyList[u][v] = min(w, adjacencyList[u][v]);
} else {
adjacencyList[u][v] = w;
}
if (adjacencyList[v].count(u)) {
adjacencyList[v][u] = min(w, adjacencyList[v][u]);
} else {
adjacencyList[v][u] = w;
}
edgeList.emplace_back(u, v, w);
}
vector<unordered_map<int, int>> minimumSpanningTree = findMinimumSpanningTree(adjacencyList);
// do lots of pre-processing
long long mstWeight = computeTreeWeight(minimumSpanningTree);
vector<pair<int, int>> parentDepth = rootTree(0, minimumSpanningTree);
// for special pair of vertices find least common ancestor and heaviest edge between vertices
vector<vector<int>> ancestor; ancestor.reserve(N); // ancestor[u][j] is 2^j ancestor of u
vector<vector<int>> maxEdge; maxEdge.reserve(N); // maxEdge[u][j] is heaviest edge between u and ancestor[u][j]
memoizeAncestorAndMaxEdge(ancestor, maxEdge, minimumSpanningTree, parentDepth);
// now iterate through given edges and include each and remove heaviest edge in cycle
for (tuple<int, int, int> edge : edgeList) {
if (minimumSpanningTree[get<0>(edge)].count(get<1>(edge))) {
if (minimumSpanningTree[get<0>(edge)][get<1>(edge)] == get<2>(edge)) {
cout << mstWeight << '\n';
} else {
cout << mstWeight + get<2>(edge) - minimumSpanningTree[get<0>(edge)][get<1>(edge)] << '\n';
}
} else {
// now for each edge take diffs with the minimum spanning tree
int maxEdgeWeight = findMaxEdge(get<0>(edge), get<1>(edge),
parentDepth, ancestor, maxEdge);
cout << mstWeight + get<2>(edge) - maxEdgeWeight << '\n';
}
}
cout << flush;
return 0;
}
```

For what it's worth, here's my complete submission, 18541620.

A common problem in computer science is to find the maximum flow between two vertices in a *flow network* $G = (V,E,c)$, where $V$ is a set of vertices, $E$ is a set of edges, and $c : V \times V \rightarrow \mathbb{R}$ is the capacity function. Each edge $(u,v) \in E$ has capacity $c(u,v).$ We'll have $c(u,v) = 0$ if $(u,v) \not\in E$.

To concretely visualize this problem, think of a network of pipes. We want to find how much water can flow from point a source $s \in V$ to a sink $t \in V$. Other analogous situations are how much cargo could we ship on a railroad network or how much data can flow between two servers. More surprisingly, the algorithm solves problems like finding the minimum cut and bipartite matching.

Formally, we we'll define a *flow* as a function $f : V \times V \rightarrow \mathbb{R}$ that satisfies the following properties:

**Capacity constraint:**For all $u, v \in V$, we must have that $0 \leq f(u,v) \leq c(u,v).$ Note that this implies that $f(u,v) = 0$ if $(u,v) \not\in E$.**Flow conservation:**For all $u \in V - \{s,t\}$, we have that \begin{equation} \sum_{v \in V}f(v,u) = \sum_{v \in V}f(u, v), \label{eqn:flow_conservation_1} \end{equation} so the flow entering the vertex is equal to the flow leaving the vertex if the vertex is not a source or sink.

We can define total flow as the flow leaving the sink, which is \begin{equation} \left|f\right| = \sum_{v \in V}f(s, v) - \sum_{v \in V}f(v, s). \label{eqn:total_flow_1} \end{equation} The maximum flow is a flow, $f^*$ such that $\left|f^*\right| \geq \left|f\right|$, where $f$ is any other flow.

## The Algorithm

I'm actually not sure what this algorithm is called. Some sources seem to call it the Ford Fulkerson algorithm. Others call it the Edmonds Karp algorithm. In any case, the version that I describe is different than the versions that Wikipedia describes. It can be found in Sedgewick's Algorithms or the USACO training pages.

Before introducing the algorithm, I introduce several concepts. First is the concept of a *residual network*, $G_f = (V,E,c,f)$, so it is a flow network associated with some specific flow $f$. Along with the residual network we have two related concepts, which is *residual capacity*,
\begin{equation}
c_f(u,v) = c(u,v) - f(u,v),
\label{eqn:residual_capacity}
\end{equation}
and an *augmenting path*, which is path from $s$ to $t$ such that each edge along the path has nonzero residual capacity. The capacity of that path is the minimum residual capacity along the path.

A helpful way to reason about these paths is to add reverse edges and maintain a loop invariant. That is, for all $(u,v) \in E$, if $(v,u) \not\in E$, then we'll create it with capacity $0$. At every step, we'll ensure that $f(u,v) + f(v,u) = 0$. This breaks the rules a bit and gives us negative flow. If you find it discomforting, you can instead add $c(u,v) + c(v,u)$ to both sides and think of maintaining the invariant that $f(u,v) + f(v,u) = c(u,v) + c(v,u).$

Both ways of thinking about it are helpful. Some proofs require that flow be nonnegative. However, maintaining $f(u,v) + f(v,u) = 0$ is easier to code. It also handles the case that both $(u,v) \in E$ and $(v,u) \in E$ more elegantly. Otherwise, we need to intialize each edge with flow of the capacity of the other edge increase the capacity by the other's capacity. Some bookkeeping at termination would also have to be done. Both flows would be equivalent. From Equation \ref{eqn:residual_capacity}, even if flow is negative, the residual capacity will remain the same since we're just adding or subtracting $c(u,v)$ from both terms. Thus, in my code, I will assume that $f(u,v) + f(v,u) = 0,$ but in proofs, I will assume that $f(u,v) \geq 0$ for all $u,v \in V.$

The concept of these reverse edges confused me a lot at first, but during the course of the algorithm, we often want to remove flow along an edge and reassign that flow to another edge. These reverse edges and the loop invariant are a form of bookkeeping that allows us to increase flow in a monotonic way. It will become clearer after I detail the steps of the algorithm and walk through an example.

### Steps

**Initialization:**This step is pretty simple. Define $f$ to simple be $f(u,v) = 0$ for all $u,v \in V.$ Clearly, this meets the definition of a flow. Also, if $(u,v) \in E$, but $(v,u) \not\in E,$ then add $(v,u)$ to $E.$ We will keep $c(v,u) = 0$, and set $f(v,u) = 0$. Thus, $f(u,v) + f(v,u) = 0.$In practice, it is better to create those reverse edges on demand only if we actually push flow along $(u,v) \in E$. We may also want to allocate memory since we'll be calling Dijkstra's algorithm, repeatedly, in the next step. If there are $N$ edges, we'd allocate three arrays of size $N$: (1) keeps track of the path, (2) keeps track of vertices that have been visited, and (3) a priority queue that keeps track of which path has max capacity.

**Maintenance:**This is definitely the most complicated step. We want to find augmenting paths to increase flow, while maintaining our loop invariant that $f(u,v) + f(v,u) = 0$ at all times.Recall Dijkstra's algorithm to find the shortest path. We can also use it find the augmenting path of most capacity on the residual network. To do this, we replace the min priority queue with a max priority queue, so at every step we're taking the unvisited vertex that has the max flow to it. In the relaxation step, we update flows instead of updating distances. We terminate when the sink $t$ is on top of the priority queue. There can't be a better flow to $t$ because flow is monotonically decreasing along a path, so some vertex would have to have more flow than $t.$

After find the augmenting path, which is a subgraph that happens to be a tree. Update flows along all the edges. Suppose we have found path $P$ with capacity $C_P$. Initialize a new flow $f^\prime = f.$ The new flows for all edges along the path is $$f^\prime(u,v) = f(u,v) + C_P.$$ By construction of our path, we will have that $f^\prime(u,v) \leq c(u,v)$ for all $u,v \in V$, so it is a proper flow except that some flows might be negative. As we discussed earlier, that is okay. Now to maintain our invariant, we update $$f^\prime(v,u) = f(v,u) - C_P.$$ Since $f(u,v) + f(v,u) = 0 \Rightarrow f^\prime(u,v) + f^\prime(v,u) = 0.$ This gives us a mechanism to take back flow and reassign it as we shall see later. Since $f(v,u) \leq c(v,u)$, $f^\prime(v,u) \leq c(v,u)$ since $C_P > 0.$

In this way, the total value of flow is always increasing, for $\left|f^\prime\right| = \left|f\right| + C_P.$ Note that we calculate total flow without the edges that we added. Now, set $f = f^\prime.$

**Termination:**We terminate when no more augmenting paths can be found. Remove the edges that were added. In case that only $(u,v) \in E$, and $(v,u) \not\in E,$ we must have that $f(u,v) \geq 0$ since $f(u,v) = -f(v,u) \geq 0$ since $c(v,u) = 0$. Thus, removing the reverse edge make this a proper flow. If $(v,u) \in E$ just set $f(v,u) = 0.$ Return $f$. I usually return $f$ as a list of edges.

```
struct FlowEdge {
int from, to, capacity, flow;
FlowEdge(int from, int to, int capacity) : from(from), to(to), capacity(capacity), flow(0) {}
};
// accepts a list of of weighted edges, where adjacencyList[u].at(v) is capacity of edge, returns a list of of edges (u,v,capacity,flow)
vector<unordered_map<int, FlowEdge>> maxFlow(int source, int sink, const vector<unordered_map<int, int>> &adjacencyList) {
int N = adjacencyList.size();
vector<unordered_map<int, FlowEdge>> flowEdges(N); // construct flow edges
for (int i = 0; i < N; ++i) {
for (pair<int, int> edge : adjacencyList[i]) {
flowEdges[i].emplace(edge.first, FlowEdge(i, edge.first, edge.second));
}
}
int totalFlow = 0;
// allocate memory for algorithm
phillypham::priority_queue flow(N);
vector<bool> visited(N, false);
vector<int> parent(N, -1);
do {
flow.clear(); flow.push(source, INT_MAX);
fill(visited.begin(), visited.end(), false);
fill(parent.begin(), parent.end(), -1);
// start algo by declaring that paths exist at first
while (!flow.empty() && flow.top() != sink) { // continue while paths exist and we haven't reached sink
// djikstra-like algorithm, instead of finding closest vertex, find vertex with max flow
int maxVertex = flow.top();
int maxVertexFlow = flow.at(maxVertex);
flow.pop();
// relax step: update flows of neighboring vertices
visited[maxVertex] = true;
for (pair<int, FlowEdge> entry : flowEdges[maxVertex]) {
int potentialFlow = min(maxVertexFlow, entry.second.capacity - entry.second.flow);
// if there's nonzero flow and we haven't been there before
if (potentialFlow > 0 && !visited[entry.first] && !flow.count(entry.first)) {
flow.push(entry.first, potentialFlow);
parent[entry.first] = maxVertex;
} else if (flow.count(entry.first) && flow.at(entry.first) < potentialFlow) {
flow.increase_key(entry.first, potentialFlow);
parent[entry.first] = maxVertex;
}
}
}
if (!flow.empty()) { // path was found, augment graph to create residual graph
totalFlow += flow.at(sink);
// use this path, so update flows
for (int currentVertex = sink; currentVertex != source; currentVertex = parent[currentVertex]) {
int parentVertex = parent[currentVertex];
flowEdges[parentVertex].at(currentVertex).flow += flow.at(sink); // send flow along this path
if (!flowEdges[currentVertex].count(parentVertex)) { // construct reverse edge
/* Give it capacity 0 to mark it as a reverse edge, add forwardEdge.capacity to get actual.
* So, capacity is same as parent, and on creation reverseEdge.flow == reverseEdge.capacity.
* In this way, we start with the invariant that reverseEdge.flow + fowardEdge.flow == forwardEdge.capacity
*/
flowEdges[currentVertex].emplace(parentVertex, FlowEdge(currentVertex, parentVertex, 0));
}
// maintain invariant that reverseEdge.flow + fowardEdge.flow == forwardEdge.capacity
flowEdges[currentVertex].at(parentVertex).flow -= flow.at(sink); // flow defaults to 0
}
}
} while (!flow.empty()); // if there is path, flow should have sink in it
// remove reverse edges and return edges with how much capacity was used in flow member
for (int i = 0; i < N; ++i) {
unordered_map<int, FlowEdge>::iterator flowEdgeIt = flowEdges[i].begin();
while (flowEdgeIt != flowEdges[i].end()) {
if ((flowEdgeIt -> second).capacity == 0) {
flowEdgeIt = flowEdges[i].erase(flowEdgeIt);
} else {
++flowEdgeIt;
}
}
}
return flowEdges;
}
```

An implementation of my binary heap can be found below.

## Example

To see more clearly how this algorithm works consider this flow network.

Let our source be $s = 0$ and our sink be $t = 7$. Here's our first augmenting path colored in blue. The reverse edges that we added are colored red.

And here's the next augmenting path.

Now, look carefully at the violet edge in the next augmenting path. Here we push a flow of 5 along a reverse edge. Notice how the flow into vertex 5 is still 10 (5 from 1 and 5 from 2), which maintains a flow of 10 out to vertex 7, our sink, so total flow does not decrease. Originally, we had that $f(1,5) = 10$, which we changed to $f(1,5) = 5$. To maintain flow conservation, we now set $f(1,4) = 5$.

For our final augmenting path we push a flow of 3 onto the violet path again, so while we originally had that $f(1,5) = 10$ at one point, now $f(1,5) = 2$.

Thus, we end up with a total flow of 28.

## Proof of Correctness

Intuitively, it make sense that when there's no further augmenting path, we have achieved maximum flow. The proof of this is not so easy. To prove this, we need to introduce more terminology. All this work is not in vain, however. Through the proof, we actually solve the minimum cut problem at the same time. Informally, the minimum cut can be thought of as the cheapest way to separate the source and sink. That is, we try to select the set of edges with minimum total weight such that their removal causes the source and sink to be disconnected.

Formally, we define a *cut* of a graph $G = (V,E)$ as a partition of $V = S \cup T$ such that $S \cap T = \emptyset$, the source is in $S$, and the sink is in $T$. The edges $\{(u,v) \in E : u \in S,v \in T\}$ are the *cut-set*. It's easy to see that removal of these edges would kill all flow from the source to the sink.

Denote the *flow* between two vertices by $f(u,v)$. Clearly if $(u,v) \not\in E$, $f(u,v) = 0$. Define *net flow* across a cut $(S,T)$ to be
\begin{equation}
f(S,T) = \sum_{u \in S}\sum_{v \in T}f(u,v) - \sum_{v\in T}\sum_{u \in S}f(v,u).
\label{eqn:net_flow}
\end{equation}
Essentially, this is the flow out of $S$ minus the flow into $S$.

Let $s$ be our source. The total value of flow in Equation \ref{eqn:total_flow_1} can also be written \begin{equation} \left|f\right| = f\left(\{s\}, V - \{s\}\right) = \sum_{v \in V}f(s,v) - \sum_{v \in V}f(v, s), \label{eqn:total_flow} \end{equation} which can be thought of as the net flow of just the source. With these definitions, we prove our first interesting result.

### For any cut $(S,T)$ and flow $f$, we have that $f(S,T) = \left|f\right|$.

First note that by Equation \ref{eqn:flow_conservation_1}, for any vertex that is not the source or sink, say $u$, we have that \begin{equation} 0 = f\left(\{u\}, V\right) = \sum_{v \in V}f(u, v) - \sum_{v \in V}f(v, u) \label{eqn:flow_conservation} \end{equation} by definition of flow since the flow going out of $u$ must be the same as the flow going into $u$.

Now by Equations \ref{eqn:total_flow} and \ref{eqn:flow_conservation}, we have that \begin{align} \left|f\right| &= \sum_{v \in V}f(s,v) - \sum_{v \in V}f(v, s) + \sum_{u \in S - \{s\}}0\nonumber\\ &= \sum_{v \in V}f(s,v) - \sum_{v \in V}f(v, s) + \sum_{u \in S - \{s\}}\left(\sum_{v \in V}f(u, v) - \sum_{v \in V}f(v, u)\right)\nonumber\\ &= \sum_{v \in V}f(s,v) + \sum_{u \in S - \{s\}}\sum_{v \in V}f(u, v) - \sum_{v \in V}f(v, s) - \sum_{v \in V}\sum_{u \in S - \{s\}}f(v, u)\nonumber\\ &= \sum_{u \in S}\sum_{v \in V}f(u, v) - \sum_{v \in V}\sum_{u \in S}f(v, u)\nonumber\\ &= \sum_{u \in S}\left(\sum_{v \in T}f(u, v) + \sum_{v \in S}f(u, v)\right) - \sum_{v \in T}\sum_{u \in S}f(v, u) - \sum_{v \in S}\sum_{u \in S}f(v, u)\nonumber\\ &= \sum_{u \in S}\sum_{v \in T}f(u, v) - \sum_{v \in T}\sum_{u \in S}f(v, u) = f(S,T), \label{eqn:cut_flow} \end{align} where the second-to-last line follows from the fact that $V = S \cup T$. The last line follows from the defintion of net flow in Equation \ref{eqn:net_flow}. $\square$

Now, let $E(S,T)$ be the cut-set of the cut $(S,T)$. We define the *capacity* of the $(S,T)$ by
\begin{equation}
c(S,T) = \sum_{(u,v) \in E(S,T)} c(u,v).
\label{eqn:capacity}
\end{equation}

If $(u,v) \not\in E,$ then we'll define $c(u,v) = 0.$ Recall that $f(u,v) \leq c(u,v)$ for all $u,v \in V.$ An application of these basic facts and the previous result give us our next result.

### The total value of any flow $f$ is less than the capacity of any cut.

Let $(S,T)$ be any cut. Note that $f(u,v) \geq 0$ for any $u,v \in V.$ Then, we have \begin{align*} \left|f\right| &= f(S,T) = \sum_{u \in S}\sum_{v \in T}f(u,v) - \sum_{v\in T}\sum_{u \in S}f(v,u) \\ &\leq \sum_{u \in S}\sum_{v \in T}f(u,v) \\ &\leq \sum_{u \in S}\sum_{v \in T}c(u,v) = c(S,T).~\square \end{align*}

Finally, we have our main result.

### Max-flow min-cut theorem

If $f$ is a flow in a flow network $G = (V,E,c)$ with source $s$ and sink $t$ then the following conditions are equivalent.

- $f$ is a maximum flow in $G$.
- The residual network $G_f$ contains no augmenting paths.
- $\left|f\right| = c(S,T)$ for some cut $(S,T)$ of $G$.

- $1 \Rightarrow 2$: This is obvious. If there was an augmenting path we could increase the flow by pushing flow through that path.
$2 \Rightarrow 3$: This is the only hard part of the proof. Fortunately, the proof is constructive and tells us how compute the minimum cut.

Define $S = \{v \in V : \text{there exists a path of positive capacity in $G_f$}\}.$ Trivially, $s \in S$, and $t \not\in S$, so if $T = V - S$, we have a cut $(S,T)$.

Now, consider any $u \in S$ and $v \in T$. We need to show that $f(u,v) = c(u,v)$. If $(u,v) \not\in E$, this is trivial. On the other hand, suppose that $(u,v) \in E$, but $f(u,v) \lneq c(u,v)$ for a contradiction. Then, $c_f(u,v) > 0$, which implies that $v \in S$, which is a contradiction since $S \cap T = \emptyset$.

Note that in the termination step of the algorithm, only one the edges $(u,v)$ or $(v,u)$ has positive flow if one of them has nonzero flow. We set the one with negative flow to 0. Since we have show that $f(u,v) = c(u,v),$ it's $f(v,u) = 0$. Thus, by Equation \ref{eqn:cut_flow}, we have that \begin{align} \left|f\right| = f(S,T) &= \sum_{u \in S}\sum_{v \in T}f(u,v) - \sum_{v\in T}\sum_{u \in S}f(v,u) \nonumber\\ &= \sum_{u \in S}\sum_{v \in T}c(u,v) - \sum_{v\in T}\sum_{u \in S}0 = c(S,T). \end{align}

$3 \Rightarrow 1$: Recall that the capacity of any cut $c(S,T) \geq \left|f\right|$ for any flow $f.$

By hypothesis, we can choose some cut $(S,T)$ such that $\left|f\right| = c(S,T)$. c(S,T) is an upper bound for $\left|f\right|$, so there cannot exist a flow $f^\prime$ such that $\left|f\right| = c(S,T) \lneq \left|f^\prime\right|$. Thus, $f$ is a maximum flow in $G$.

Thus, we have shown that no augumenting paths in the residual network implies that we have a maximum flow. $\square$

This proof is constructive, so we can compute the cut $(S,T)$ by doing a breadth-first search starting from the sink $t$ on our residual network. It's clear that we end up with $S = \{0, 2, 3, 6\}$. Note that the residual network is not unique. If we consider the same flow network and choose our augmenting paths differently, we end up with another residual network with no augmenting paths.

Vertices in $S$ are colored in blue, so we actually end up with the same minimum cut. In general, the minimum cut is not unique, either. In this case to compute the minimum cut, we had to follow reverse edges, which have positive residual capacity when they have negative flow. Our cut-set is $\left\{(0,1), (2,5), (6,7)\right\}$.

## Binary Heap

Here is the binary heap used in the algorithm above.

```
namespace phillypham {
class priority_queue {
private:
int keysSize;
vector<int> keys;
vector<long long> values;
unordered_map<int, int> keyToIdx;
int parent(int idx) {
return (idx + 1)/2 - 1;
}
int left(int idx) {
return 2*(idx+1) - 1;
}
int right(int idx) {
return 2*(idx+1);
}
void heap_swap(int i, int j) {
if (i != j) {
keyToIdx[keys[j]] = i;
keyToIdx[keys[i]] = j;
swap(values[j], values[i]);
swap(keys[j], keys[i]);
}
}
void min_heapify(int idx) { // smaller value bubbles down
int lIdx = left(idx);
int rIdx = right(idx);
int largestIdx = idx;
if (lIdx < keysSize && values[lIdx] > values[largestIdx]) {
largestIdx = lIdx;
}
if (rIdx < keysSize && values[rIdx] > values[largestIdx]) {
largestIdx = rIdx;
}
if (largestIdx != idx) {
heap_swap(largestIdx, idx);
min_heapify(largestIdx);
}
}
void max_heapify(int idx) { // larger value bubbles up
while (idx > 0 && values[parent(idx)] < values[idx]) {
heap_swap(parent(idx), idx);
idx = parent(idx);
}
}
public:
priority_queue(int N) {
keysSize = 0;
keys.clear(); keys.reserve(N);
values.clear(); values.reserve(N);
keyToIdx.clear(); keyToIdx.reserve(N);
}
void push(int key, long long value) {
// if (keyToIdx.count(key)) throw logic_error("key " + ::to_string(key) + " already exists");
int idx = keysSize; ++keysSize;
if (keysSize > keys.size()) {
keys.push_back(key);
values.push_back(value);
} else {
keys[idx] = key;
values[idx] = value;
}
keyToIdx[key] = idx;
max_heapify(idx); // make this value bubble up
}
void increase_key(int key, long long value) {
// if (!keyToIdx.count(key)) throw logic_error("key " + ::to_string(key) + " does not exist");
// if (values[keyToIdx[key]] > value) throw logic_error("value " + ::to_string(value) + " is not an increase");
values[keyToIdx[key]] = value;
max_heapify(keyToIdx[key]);
}
void decrease_key(int key, long long value) {
// if (!keyToIdx.count(key)) throw logic_error("key " + ::to_string(key) + " does not exist");
// if (values[keyToIdx[key]] < value) throw logic_error("value " + ::to_string(value) + " is not a decrease");
values[keyToIdx[key]] = value;
min_heapify(keyToIdx[key]);
}
void pop() {
if (keysSize > 0) {
heap_swap(0, --keysSize);
keyToIdx.erase(keys[keysSize]);
if (keysSize > 0) max_heapify(0);
} else {
throw logic_error("priority queue is empty");
}
}
int top() {
if (keysSize > 0) {
return keys.front();
} else {
throw logic_error("priority queue is empty");
}
}
int size() {
return keysSize;
}
bool empty() {
return keysSize == 0;
}
void clear() {
keysSize = 0;
keyToIdx.clear();
}
int at(int key) {
return values[keyToIdx.at(key)];
}
int count(int key) {
return keyToIdx.count(key);
}
string to_string() {
ostringstream out;
copy(keys.begin(), keys.begin() + keysSize, ostream_iterator<int>(out, " "));
out << '\n';
copy(values.begin(), values.begin() + keysSize, ostream_iterator<int>(out, " "));
return out.str();
}
};
}
```

Think of two sets that can only be matched together. For example, for those that believe in traditional marriage you can think of $n$ females and $n$ males. Suppose that we know much happiness each marriage would bring, and assume marriage only brings postive happiness. How should we match couples to maximize happiness?

In another scenario, suppose we have $n$ items and $n$ bidders, and we know how much each bidder is willing to pay for each item. Also, assume each bidder can only bid on at most $1$ item. To whom should we sell each item to maximize our profit?

We can model this scenario with a complete bipartite graph $G = (X,Y,E)$, where $X$ and $Y$ are two disjoint sets of vertices such that every vertex in $X$ is connected to every in $Y$, but within $X$ and within $Y$ no vertices are connected. That is, $(x,y) \in E \Leftrightarrow x \in X \wedge y \in Y$. Each edge has a weight $w_{xy}.$ In our two examples, $w_{xy}$ is the happiness from the marriage of female $x$ and male $y$ or how much item $x$ is worth to bidder $y$.

We want to match or assign each element of $X$ to a distinct element of $Y$. Thus, we have the so-called Assignment Problem. Here, I'll detail how the Hungarian method solves this.

First, we need to introduce some definitions for the algorithm to make sense.

## Graph Theory

### Matching

The technical definition of a matching is a set $M$ of non-adjacent edges, that is, no two edges in the set share a common vertex. However, I always prefered to think of matching as pairs of vertices connected by an edge such that each only vertex appears in at most 1 pair. Here's an example, where I have colored the matching in red:

In this case, $M = \{(x_1,y_1),(x_3,y_4),(x_4,y_2)\}.$

### Perfect Matching

A matching is *maximal* if adding any other edge makes it not a matching. Another way to think about this is in terms of sets. The maximal matching cannot be a proper subset of any other matching.

Now, matching is *perfect* if it contains all vertices. For instance,

is a perfect matching, where $M = \{(x_1,y_3),(x_2,y_1),(x_3,y_4),(x_4,y_2)\}.$

### Alternating Path

Now, one way of representing matchings is to break them up into paths. An *alternating path* is composed of edges, where every other edge is part of a matching. For example, this is an alternating path, where I have colored edges in the path red and blue, and red edges are part of the matching.

A path is just an ordered set of edges, where two adjacent edges share a common vertex. Our path is $x_1 \rightarrow y_2 \rightarrow x_4 \rightarrow y_1 \rightarrow x_2,$ and the corresponding matching is $M = \left\{(x_4,y_2), (x_2,y_1)\right\}.$

### Augmenting Path

An *augmenting path* is a particular kind of alternating path, in which the first and last vertex in the path is unmatched. Thus, by flipping the edges in the matching, we will get a matching that contains an additional vertex. Formally, consider the augmenting path $P$. Clearly, it must contain an odd number of edges. If $M$ is the corresponding matching, then it contains the even numbered edges in the path $P$. We can make a new larger matching $P-M$ that contains the odd numbered edges. Here's an example of this process.

### Labeling

Let us suppose the graph looks like this, where I only draw the edges from one vertex in $X$ at a time for clarity.

We assign each vertex a nonnegative integer label. Formally, this *labeling* is a function, $L: X \cup Y \rightarrow \mathbb{Z}.$ Let $w(x,y)$ be the weight of the edge from vertex $x$ and $y.$ An an edge $(x,y)$ is considered *feasible* is $L(x) + L(y) \geq w(x,y).$ A labeling is feasible if $L(x) + L(y) \geq w(x,y)$ for all $x$ and $y$.

Moreover, we define an *equality subgraph*, $G_L = (X,Y,E_L)$, associated with a labeling as follows: $(x,y) \in E_L \Leftrightarrow L(x) + L(y) = w(x,y).$

For example, given a specific labeling, here's the equality subgraph, where I've colored the feasible edges in black and the edges in the equality subgraph in green.

To me, it's not at all obvious why a labeling would be helpful, but this theorem connects everything.

If a perfect matching exists in an equality subgraph of a feasible labeling $L$, then it is a maximum weighted matching.

Here's a proof. Let $M$ be the perfect matching in the equality subgraph of labeling $L$. Let $M^\prime$ be an alternative perfect matching. Then, we have that the value of the matching $M^\prime$ is \begin{align*} \sum_{(x,y) \in M^\prime} w(x,y) &\leq \sum_{(x,y) \in M^\prime} L(x) + L(y)~~~\text{since $L$ is feasible}\\ &= \sum_{(x,y) \in M} L(x) + L(y) \\ &= \sum_{(x,y) \in M} w(x,y), \end{align*} which is the value of the matching $M.$

Thus, our problem reduces to find a perfect matching in the equality subgraph of a feasible labeling.

## Hungarian Method

Now, we know enough terminology to describe the algorithm. The main idea is to iteratively find augmenting paths in the equality subgraph until all vertices in $X$ are matched. If such a path cannot be created, we need to modify the labeling so that the labeling contains additional edges but remains feasible.

### Algorithm

- Initialize with a feasible labeling.
- Initialize two sets to keep track of vertices in alternating path. $S \subseteq X$ and $T \subseteq Y.$ Initialize with $S = T = \emptyset.$
- Pick an unmatched vertex $x \in X.$ Put it in a set called $S.$
There are two options here:

- If there is a vertex $y \in Y - T \cap N(S),$ where $N(S)$ denotes the neighbors of the vertices in $S$ in the equality subgraph, then put $y$ in $T.$ This means that we add $y$ to our path. For the $x \in S$ such that $L(x) + L(y) = w(x,y)$, set $x$ as the previous vertex in the current alternating path.
- Otherwise, fix the labeling. Let $\Delta_y = \inf\{L(x) + L(y) - w(x, y) : x \in S\}.$ Let $\Delta = \inf\{\Delta_y : y \in Y - T\}.$ For each $x \in S,$ $L^\prime(x) = L(x) - \Delta.$ For each $y \in T,$ $L^\prime(y) = L(y) + \Delta.$ In this way, all edges remain feasible. Consider various $e = (x,y)$:

- If $x \not\in S$ and $y \not\in T$, $e$ is still feasible since we didn't touch the labels of $x$ and $y$. If this edge is part of the equality subgraph, and these vertices are matched, they still are.
- If $x \in S$ and $y \in T$, we decreased $L(x)$ by $\Delta$ and increased $L(y)$ by $\Delta$, so $L(x) + L(y) = w(x,y)$ still.
- If $x \in S$ and $y \not\in T$, $e$ is still feasible since we have decreased $L(x)$ by the minimum of all such $\Delta_y.$
If $x \not\in S$ and $y \in T$, we only increased $L(y)$, so the edge is more feasible in a sense.

Now, if $\Delta_y = \Delta$ for $y \in Y - T,$ then $y$ becomes and element of $N(S),$ and we can return to step 4.

- Now, we have just put $y \in T$. We have two cases here:
- $y$ is umatched. Add $y$ to end end of the alternating path. Our alternating path starts with some umatched vertex in $x \in S$ and ends with unmatched $y.$ We have an augmenting path, so create a new bigger matching by inverting our path. Count the number of matched vertices. If it is $n$, we are done. Otherwise, go back to step 2 and repeat.
- $y$ is already matched, say, to $x^\prime$. We add two vertices to our path, so our path looks like $$x \longrightarrow \cdots \longrightarrow y \longrightarrow x^\prime,$$ where $x \in S$ is an umatched vertex. Go to step 4 to find more vertices to add to our path.

### Implementation

The algorithm is simple enough, but there are few complications in implementing it, particularly, finding an efficient way to calculate $\Delta$. If you read carefully, you'll note that if $|Y| \geq |X|,$ we will still find the optimal match since we will use the edges of higher value first. Let $|X| = n$ and $|Y| = m.$ Let us define various data structures:

`vector<bool> S(n, false)`

: keeps track of the $X$ vertices in our current alternating path`vector<int> xLabel(n, -1)`

: labeling of vertices in $X$`vector<int> yLabel(m, 0)`

: labeling of vertices in $Y$`vector<int> slack(m, INT_MAX)`

:`slack[y]`

$= \Delta_y$`vector<int> prevX(n, -1)`

:`prevX[x]`

is the vertex in $Y$ that comes before $x$ in our alternating path`vector<int> prevY(m, -1)`

:`prevY[y]`

is the vertex in $X$ that comes before $y$ in our alternating path`vector<int> xToY(n, -1)`

: $x$ is matched to`xToY[x]`

in $Y$`vector<int> yToX(m, -1)`

: $y$ is matched to`yToX[y]`

in $X$`stack<int, vector<int>> sStack`

keeps track of the vertices that need to be added to $S$. A queue would work just as well.`queue<int> tQueue`

keeps track of the vertices that are eligible to be added to $T$ (neighbors of $S$ in equality subgraph that aren't in $T$ already). A stack would work just as well.

Our input will be `const vector<vector<int>> &weights`

, which a $n \times m$ matrix of edge weights. We will return `xToY`

when all vertices in $X$ are matched.

Let's go over the implementation step-by-step:

We initialize all the data structures. To create an initial feasible labeling, we set $L(y) = 0$ for all $y \in Y$ and set $L(x) = \max_{y \in Y}w(x,y).$ This is an $O(NM)$ operation. We set the number of matches to $0$ and also push an umatched vertex into

`sStack`

. Here's the code:`vector<int> findMaximumAssignment(const vector<vector<int>> &weights) { int N = weights.size(); if (N == 0) return vector<int>(); int M = weights.back().size(); if (N > M) throw logic_error("|X| > |Y|, no match is possible"); vector<bool> S(N, false); // set to keep track of vertices in X on the left in alternating path vector<int> xLabel(N, -1); vector<int> yLabel(M, 0); vector<int> slack(M, INT_MAX); // keep track of how far Y is from being matched with a vertex in S for (int i = 0; i < N; ++i) { // initialize label with max edge to Y for (int j = 0; j < M; ++j) xLabel[i] = max(xLabel[i], weights[i][j]); } // array for memorizing alternating path vector<int> prevX(N, -1); // prevX[i] is vertex on the right (in Y) that comes before i in X vector<int> prevY(M, -1); // prevY[j] is vertex on the left (in X) that comes before j in Y if slack[j] == 0; otherwise, closest vertex in S // maps to keep track of assignment vector<int> xToY(N, -1); // xToY[i] is vertex on the right (in Y) matched to i in X vector<int> yToX(M, -1); // yToX[j] is vertex on the left (in X) matched to j in Y stack<int, vector<int>> sStack; // vertices to add to S queue<int> tQueue; // neighbors of S to add to T int matches = 0; sStack.push(0); // initialize with unmatched vertex while (matches < N) { ... } return xToY; }`

Right now $S$ and $T$ are empty. The first order of business is to add something to $S$. When we add to $S$, we initialize

`slack`

, find the closest vertex in $S$ to each $y \in Y$, and add neighbors in the equality subgraph to`tQueue`

. By closest vertex, I mean the vertex that would require changing the labeling the least. Here's the code.`vector<int> findMaximumAssignment(const vector<vector<int>> &weights) { ... while (matches < N) { while (!sStack.empty()) { // add unmatched vertices to S int x = sStack.top(); sStack.pop(); S[x] = true; for (int j = 0; j < M; ++j) { if (xLabel[x] + yLabel[j] - weights[x][j] < slack[j]) { // check for neighboring vertices and initialize slack slack[j] = xLabel[x] + yLabel[j] - weights[x][j]; // slack >= 0, all feasible initially, and we decrease by min prevY[j] = x; // tree looks like ... --> x -?-> j depending if slack[j] == 0 or not if (slack[j] == 0) tQueue.push(j); // edge is in equality subgraph, it is a neighbor of S } } } ... } return xToY; }`

Now, we have two options depending if

`tQueue`

is empty or not. If`tQueue`

is empty, we fix the labeling. We'll restart the while loop with an empty`sStack`

and at least 1 vertex in`tQueue`

after this.`vector<int> findMaximumAssignment(const vector<vector<int>> &weights) { ... while (matches < N) { ... if (tQueue.empty()) { // no neighboring vertices, fix labeling // loop invariant is that |S| > |T|, since we add to S whenever we add pop from tQueue int delta = INT_MAX; for (int j = 0; j < M; ++j) { if (slack[j] > 0) delta = min(delta, slack[j]); // only try to add edges that are feasible and not in T } for (int i = 0; i < N; ++i) { if (S[i]) xLabel[i] -= delta; // decrease label of vertices in S } for (int j = 0; j < M; ++j) { if (slack[j] == 0) { // it's in T yLabel[j] += delta; } else if (slack[j] > 0 && prevY[j] != -1) { // check that it's feasible and connected to S slack[j] -= delta; // decrease the distance from S since labels in S were decreased if (slack[j] == 0) tQueue.push(j); } } } else { ... } } return xToY; }`

Now, we have ensured

`tQueue`

won't be empty. Again, we have two options here depending if the vertex at the head of the queue is matched or not. Let us first deal with chase where it is already matched, so we extend our alternating path with $\cdots \rightarrow y \rightarrow x^\prime.$ Then, we restart our while loop with $x^\prime$ in`sStack`

since $x^\prime$ is part of the path now, too.`vector<int> findMaximumAssignment(const vector<vector<int>> &weights) { ... while (matches < N) { ... if (tQueue.empty()) { // no neighboring vertices, fix labeling ... } else { // either augment path or vertex is already matched so add to S int y = tQueue.front(); tQueue.pop(); int x = yToX[y]; if (x == -1) { ... } else { // vertex was already matched, new path is [something umatched in S] --> ... --> prevY[y] --> y --> x prevX[x] = y; // update alternating path with edge between x and y, recall prevY[y] is already set sStack.push(x); // add this already matched vertex to S } } } return xToY; }`

On the other hand, if the head of the queue $y$ is unmatched we have found an augmenting path to invert to create a new matching. After establishing this new matching, discard our alternating path, clear $S$ and $T$, update

`slcak`

, and count the total matches. If the everything in $X$ is matched, we're done. Otherwise, put something in`sStack`

, so we can begin building our new alternating path.`vector<int> findMaximumAssignment(const vector<vector<int>> &weights) { ... while (matches < N) { ... if (tQueue.empty()) { // no neighboring vertices, fix labeling ... } else { // either augment path or vertex is already matched so add to S int y = tQueue.front(); tQueue.pop(); int x = yToX[y]; if (x == -1) { int currentY = y; while (currentY > -1) { // new path is [something unmatched in S] --> ... --> y int currentX = prevY[currentY]; // go to left side xToY[currentX] = currentY; yToX[currentY] = currentX; currentY = prevX[currentX]; // go back to right side } for (int i = 0; i < N; ++i) prevX[i] = -1, S[i] = false; // reset path and remove everything from tree for (int j = 0; j < M; ++j) prevY[j] = -1, slack[j] = INT_MAX; // reset path and slack while (!tQueue.empty()) tQueue.pop(); // empty queue // check for a perfect match matches = 0; for (int i = 0; i < N; ++i) { if (xToY[i] != -1) { // if matched ++matches; } else if (sStack.empty()) { sStack.push(i); // put an unmatched left side node back in S to start } } } else { // vertex was already matched, new path is [something umatched in S] --> ... --> prevY[y] --> y --> x ... } } } return xToY; }`

All in all, the algorithm is $O(n^2m)$ since every time we build an augmenting path we match a vertex in $X$, of which there are $n$. We enter the main loop as often as $n$ times since our path can be upto length $2n$. There are several steps in the loop where we iterate over $Y$, which has size $m$, such as computing the slack or fixing labels. Here's the whole function together:

```
/* hungarian method for maximum weighted matching of a bipartite graph
* Consider a weighted bipartite graph G = (X,Y). X is vertices on left side, Y is vertices on the right side
* We must have |X| <= |Y|. Match each vertex on the left to the a distinct vertex on the right with maximum total weight of edges
*/
vector<int> findMaximumAssignment(const vector<vector<int>> &weights) {
int N = weights.size();
if (N == 0) return vector<int>();
int M = weights.back().size();
if (N > M) throw logic_error("|X| > |Y|, no match is possible");
vector<bool> S(N, false); // set to keep track of vertices in X on the left in alternating path
vector<int> xLabel(N, -1);
vector<int> yLabel(M, 0);
vector<int> slack(M, INT_MAX); // keep track of how far Y is from being matched with a vertex in S
for (int i = 0; i < N; ++i) { // initialize label with max edge to Y
for (int j = 0; j < M; ++j) xLabel[i] = max(xLabel[i], weights[i][j]);
}
// array for memorizing alternating path
vector<int> prevX(N, -1); // prevX[i] is vertex on the right (in Y) that comes before i in X
vector<int> prevY(M, -1); // prevY[j] is vertex on the left (in X) that comes before j in Y if slack[j] == 0; otherwise, closest vertex in S
// maps to keep track of assignment
vector<int> xToY(N, -1); // xToY[i] is vertex on the right (in Y) matched to i in X
vector<int> yToX(M, -1); // yToX[j] is vertex on the left (in X) matched to j in Y
stack<int, vector<int>> sStack; // vertices to add to S
queue<int> tQueue; // neighbors of S to add to T
int matches = 0;
sStack.push(0); // initialize with unmatched vertex
while (matches < N) {
while (!sStack.empty()) { // add unmatched vertices to S
int x = sStack.top(); sStack.pop();
S[x] = true;
for (int j = 0; j < M; ++j) {
if (xLabel[x] + yLabel[j] - weights[x][j] < slack[j]) { // check for neighboring vertices and initialize slack
slack[j] = xLabel[x] + yLabel[j] - weights[x][j]; // slack >= 0, all feasible initially, and we decrease by min
prevY[j] = x; // tree looks like ... --> x -?-> j depending if slack[j] == 0 or not
if (slack[j] == 0) tQueue.push(j); // edge is in equality subgraph, it is a neighbor of S
}
}
}
if (tQueue.empty()) { // no neighboring vertices, fix labeling
// loop invariant is that |S| > |T|, since we add to S whenever we add pop from tQueue
int delta = INT_MAX;
for (int j = 0; j < M; ++j) {
if (slack[j] > 0) delta = min(delta, slack[j]); // only try to add edges that are feasible and not in T
}
for (int i = 0; i < N; ++i) {
if (S[i]) xLabel[i] -= delta; // decrease label of vertices in S
}
for (int j = 0; j < M; ++j) {
if (slack[j] == 0) { // it's in T
yLabel[j] += delta;
} else if (slack[j] > 0 && prevY[j] != -1) { // check that it's feasible and connected to S
slack[j] -= delta; // decrease the distance from S since labels in S were decreased
if (slack[j] == 0) tQueue.push(j);
}
}
} else { // either augment path or vertex is already matched so add to S
int y = tQueue.front(); tQueue.pop();
int x = yToX[y];
if (x == -1) {
int currentY = y;
while (currentY > -1) { // new path is [something unmatched in S] --> ... --> y
int currentX = prevY[currentY]; // go to left side
xToY[currentX] = currentY; yToX[currentY] = currentX;
currentY = prevX[currentX]; // go back to right side
}
for (int i = 0; i < N; ++i) prevX[i] = -1, S[i] = false; // reset path and remove everything from tree
for (int j = 0; j < M; ++j) prevY[j] = -1, slack[j] = INT_MAX; // reset path and slack
while (!tQueue.empty()) tQueue.pop(); // empty queue
// check for a perfect match
matches = 0;
for (int i = 0; i < N; ++i) {
if (xToY[i] != -1) { // if matched
++matches;
} else if (sStack.empty()) {
sStack.push(i); // put an unmatched left side node back in S to start
}
}
} else { // vertex was already matched, new path is [something umatched in S] --> ... --> prevY[y] --> y --> x
prevX[x] = y; // update alternating path with edge between x and y, recall prevY[y] is already set
sStack.push(x); // add this already matched vertex to S
}
}
}
return xToY;
}
```

### Sample Run

Using our graph, we will intialize with all $L(x) = 10$ for all $x \in X$ and $L(y) = 0$ for all $y \in Y$. In the first run, we greedily match $x_1$ to $y_1$. I'll omit feasible edges since all edges are always feasbile. I'll denote edges in the equality subgraph in green, matches in orange, edges in the alternating path and matching in red, and edges in the alternating path not in matching in blue.

On the second run, we add umatched $x_2$ to $S$. Only $y_1$ is a neighbor of $S$, so we add $y_1$ to $T$. $y_1$ is already matched to $x_1$, so we add $x_1$ to $S$, too.

Now, $S$ has no neighbors in the equality subgraph not in $T$, so we need to fix the labeling. We find that $(x_1,y_4)$ is the edge that is closest to being in the equality subgraph. We fix our labeling, add $y_4$ to $T$, then invert our augmenting path to get a new matching.

The other vertices will be greedily matched, so we end up with the total weight being 39.

## Case Study

My motivation for learning this algorithm was Costly Labels in Round 2 of the 2016 Facebook Hacker Cup. Given a tree (a graph with such that there is exactly one path that doesn't cross itself between any two vertices), we want to color the vertices to minimize the cost. Coloring vertex $i$ color $k$ costs us $C_{i,k}$, which is given. Also, if any of the vertices neighbors are of the same color, then we have to pay a penalty factor $P$.

The number of colors and vertices is rather small, so we can solve this using some clever brute force with dynamic programming. To my credit, I actually was able to discover the general dynamic programming regime, myself.

We take advantage of the tree structure. Suppose for each vertex, we compute $K(i, k_p, k_i)$, where $i$ is the current vertex, $k_p$ is the color of the parent, and $k_i$ is the color of $i$. $K(i, k_p, k_i)$ will be the minimum cost of this vertex plus all its children. Our solution will be $\min_k K(0, 0, k)$ if we root our tree at vertex $0$.

Now, $K(i, k_p, k_i) = \min(\text{cost with penalty},\text{cost without penalty})$. The cost with penalty is to greedily color all the child vertices than add $P$. The cost without penalty is where the Hungarian method comes in. We need to assign each child a color different than the parent in way that costs least. Thus, we view $X$ as children of the current vertex and $Y$ as colors. We can modify the Hungarian method to do this by setting edge weights to be $M - w(x,y)$, where $M$ is a large number. We give edges going to the parent color weight $0$, so they won't be used.

All in all, we have a solution that is just barely fast enough as it takes around 30 seconds to run on my computer.

```
#include <algorithm>
#include <ctime>
#include <exception>
#include <iostream>
#include <queue>
#include <set>
#include <stack>
#include <vector>
using namespace std;
/* hungarian method for maximum weighted matching of a bipartite graph
* Consider a weighted bipartite graph G = (X,Y). X is vertices on left side, Y is vertices on the right side
* We must have |X| <= |Y|. Match each vertex on the left to the a distinct vertex on the right with maximum total weight of edges
*/
vector<int> findMaximumAssignment(const vector<vector<int>> &weights) {
...
return xToY;
}
struct Tree {
int root;
vector<int> parent;
vector<set<int>> children;
explicit Tree(int root, int N) : root(root), parent(N, -1), children(N) { }
};
// root the tree, undefined behavior if graph is not a tree
Tree rootTree(int root, const vector<set<int>> &adjacencyList) {
int N = adjacencyList.size();
Tree tree(root, N);
tree.parent[root] = -1;
stack<int> s; s.push(root);
while (!s.empty()) {
int currentVertex = s.top(); s.pop();
for (int nextVertex : adjacencyList[currentVertex]) {
if (nextVertex != tree.parent[currentVertex]) { // don't recurse into parent
if (tree.parent[nextVertex] != -1) throw logic_error("a cycle was found, this graph is not a tree");
tree.children[currentVertex].insert(nextVertex);
tree.parent[nextVertex] = currentVertex;
s.push(nextVertex);
}
}
}
return tree;
}
int computeMinimumCostHelper(const Tree &tree, const vector<vector<int>> &C, int P,
int vertex, int parentColor, int vertexColor,
vector<vector<vector<int>>> &memo) {
int N = C.size(); // number of vertices
int K = C.back().size(); // number of colors
if (memo[vertex][parentColor][vertexColor] != -1) return memo[vertex][parentColor][vertexColor];
int parent = tree.parent[vertex];
int minCost = C[vertex][vertexColor] + P; // first calculate cost if we're willing to take penalty
vector<vector<int>> childCostByColor; childCostByColor.reserve(tree.children[vertex].size()); // prepare for computation without penalty
for (int child : tree.children[vertex]) {
int minChildCost = INT_MAX;
childCostByColor.emplace_back(); childCostByColor.back().reserve(K);
for (int childColor = 0; childColor < K; ++childColor) {
int childCost = computeMinimumCostHelper(tree, C, P, child, vertexColor, childColor, memo);
// weight 0 effectively ensures that the parent color will not be used for the children, invert edges to find max assignment
childCostByColor.back().push_back(parent != -1 && parentColor == childColor ? 0 : INT_MAX - childCost);
minChildCost = min(minChildCost, childCost); // if we're taking penalty just take min cost
}
minCost += minChildCost;
}
// only count parent if it exists, check that we don't have too many childen
if (childCostByColor.size() < K || (parent == -1 && childCostByColor.size() == K)) {
int noPenaltyCost = C[vertex][vertexColor];
vector<int> optimalAssignment = findMaximumAssignment(childCostByColor); // assign children to distinct colors
for (int i = 0; i < optimalAssignment.size(); ++i) noPenaltyCost += INT_MAX - childCostByColor[i][optimalAssignment[i]];
minCost = min(minCost, noPenaltyCost);
}
if (parent == -1) {
for (int k = 0; k < K; ++k) memo[vertex][k][vertexColor] = minCost; // doesn't matter what parent color is if no parent
} else {
memo[vertex][parentColor][vertexColor] = minCost;
}
return minCost;
}
int computeMinimumCost(const Tree &tree, const vector<vector<int>> &C, int P) {
int N = C.size(); // number of vertices
int K = C.back().size(); // number of colors
// memo[vertex index][parent color][vertex color] = cost of coloring current vertex and children (excludes parent cost)
vector<vector<vector<int>>> memo(N, vector<vector<int>>(K, vector<int>(K, -1)));
int minimumCost = INT_MAX;
for (int k = 0; k < K; ++k) { // vary color of root since root has no parent
minimumCost = min(minimumCost,
computeMinimumCostHelper(tree, C, P, tree.root, 0, k, memo));
}
return minimumCost;
}
int main(int argc, char *argv[]) {
clock_t startTime = clock();
ios::sync_with_stdio(false); cin.tie(NULL);
int T;
cin >> T;
for (int t = 1; t <= T; ++t) {
int N, K, P; cin >> N >> K >> P; // total vertices, max label, and penalty fee
// penalty occurs when a node has at least one pair of neighbors with same nodes
vector<vector<int>> C; C.reserve(N); // C[i][j] cost of coloring vertex i, j
for (int i = 0; i < N; ++i) {
C.emplace_back(); C.back().reserve(K);
for (int j = 0; j < K; ++j) {
int c; cin >> c; C.back().push_back(c);
}
}
vector<set<int>> adjacencyList(N);
for (int i = 0; i < N - 1; ++i) {
int a, b; cin >> a >> b;
--a; --b; // convert to 0-based indexing
adjacencyList[a].insert(b);
adjacencyList[b].insert(a);
}
Tree tree = rootTree(0, adjacencyList);
cout << "Case #" << t << ": "
<< computeMinimumCost(tree, C, P)
<< '\n';
}
// finish
cout << flush;
double duration = (clock() - startTime) / (double) CLOCKS_PER_SEC;
cerr << "Time taken (seconds): " << duration << endl;
return 0;
}
```

Recently, I saw an interesting problem about graphs. For any integer $p \geq 2$, we call a graph a $p$-graph if it has the property that *every* group of $p$ vertices has *exactly* $1$ common neighbor, that is, each one of these $p$ vertices has an edge to some other common vertex $x$. We consider finite undirected graphs with no loops.

The $2$-graphs are actually called Friendship graphs, and they must be triangles with $1$ point in common.

So, let us first look at $3$-graphs. Consider a $3$-graph $G$. Suppose we have vertices $a$, $b$, and $c$. By assumption, they have a common neighbor $x$, so it looks something like this.

Now $a$, $b$, and $x$ must have a common neighbor. Suppose that it's not $c$. Then, we have something like this.

Now, the subgraph containing $a$, $d$, and $x$ is complete. If they have a common neighbor, then $a$, $d$, and $x$ plus that common neighbor is complete, so we have a complete graph with $4$ vertices. Call their common neighbor $v$. If $v \neq b$, we have this situation. But now, $a$, $b$, and $v$ have two common neighbors, $x$ and $d$, so in fact, $v = b$, so the complete graph is formed from $a$, $b$, $d$, and $x$.By symmetry, we'll have that $a$ and $c$ are part of a complete graph, too, which gives us this.

But now, $b$, $c$, and $d$ have $2$ common neighbors, $a$ and $x$, which is a contradiction. Thus, $d = c$ in the first place, so we actually have this.

Since this graph is isomorphic to the $a$, $b$, $d$, $v$, and $x$ subgraph earlier, we have that $a$, $c$, and $x$ have common neighbor $b$, so the only possible $3$-graph is the complete graph with $4$ vertices.

Now, the the rest of cases will follow by induction. For our induction hypothesis, assume that for $p = 3,\ldots, n - 1$, the only possible $p$-graph is the complete graph with $p + 1$ vertices. We have just proved the base case $p=3$.

Now, consider a $n$-graph for some $n > 3$. Call this graph $G$. Consider $n$ vertices, and let us call them $v_1, v_2, \ldots, v_n$. They have common neighbor $x$. Consider the subgraph consisting of only friends of $x$. Note that this graph excludes $x$. We'll call it $G_x$. Now, $G_x$ is a $n-1$-graph. To see this, note that any set of $n-1$ vertices of $G_x$ plus $x$ itself will have exactly $1$ common neighbor. Call it $y$. Since $y$ is neighbor of $x$, we must have that $y \in G_x$. Thus, $G_x$ is a $n-1$-graph, and by our induction hypothesis, $G_x$ is a complete graph with $n$ vertices. Since every vertex in $G_x$ is connected to $x$, then $G_x \cup \{x\}$ is a complete graph with $n+1$ verties.

Now, we show that $G = G_x \cup \{x\}$. Suppose otherwise for a contradiction, so there is a $y \in G$ such that $y \not\in G_x \cup \{x\}$. $y$ and $x$ have a common neighbor, so $y$ is connected with some $v_i$. But $v_i \in G_x \cup \{x\}$, which is a complete $n+1$ graph, so $v_i$ is the common neighbor of $x$ and $v_1,\ldots,v_{i-1},v_{i+1},\ldots, v_n$. We can consider $G_{v_i}$, the subgraph consisting of only friends of $v_i$. For the same reason that $G_x$ is a complete graph with $n$ vertices, $G_{v_i}$ is also a complete graph with $n$-vertices. But we have that $x,y,v_1,\ldots,v_{i-1},v_{i+1},\ldots,v_n \in G_{v_i}$, which is $n+1$ vertices, so we have a contradiction. Thus, we have that $G = G_x \cup \{x\}$. So, $G$ is a complete graph with $n+1$ vertices. This proves our induction step.

All in all, we have that for $p \geq 3$, the only possible $p$ graph is the complete graph with $p+1$ vertices.

I was recently doing a problem that involved depth-first search to find components of a graph. For me, the most natural way to write this is recursion. It's pretty simple:

```
void floodFill(int vertex, int currentComponent, const vector<set<int>> &edgeList,
vector<int> &component, vector<vector<int>> &componentSets) {
component[vertex] = currentComponent;
componentSets.back().push_back(vertex);
for (int i : edgeList[vertex]) {
if (component[i] == -1) {
floodFill(i, currentComponent, edgeList, component, componentSets);
}
}
}
vector<vector<int>> findComponents(const vector<set<int>> &edgeList) {
int N = edgeList.size();
vector<int> component(N, -1);
vector<vector<int>> componentSets;
int currentComponent = 0;
for (int i = 0; i < N; ++i) {
if (component[i] == -1) {
componentSets.emplace_back();
floodFill(i, currentComponent++, edgeList, component, componentSets);
}
}
return componentSets;
}
```

`component`

and `componentSets`

are shared state variables, so we pass them by reference. It seems simple, enough, right? Unfortunately, the stack size is very limited on OS X. Trying to increase the stack size, I get

```
$ ulimit -s 65536
-bash: ulimit: stack size: cannot modify limit: Operation not permitted
```

The most that I can increase the stack size to with `ulimit`

is a mere 8192 KB, which only works on a graph with less than 15,000 nodes.

Now the easiest solution is to increase the stack size with `gcc`

like this

```
$ g++ -Wl,-stack_size,0x4000000 -stdlib=libc++ -std=c++0x D.cpp
```

We specify the stack size in bytes, so 0x4000000 is $4 \times 16^6 = 67108864$, which is 64 MB.

I want to cover a better solution that works everywhere, though. We can use an explicit stack for recursion:

```
vector<vector<int>> findComponents(const vector<set<int>> &edgeList) {
int N = edgeList.size();
vector<int> component(N, -1);
vector<vector<int>> componentSets;
int currentComponent = 0;
for (int i = 0; i < N; ++i) {
if (component[i] == -1) {
componentSets.emplace_back();
stack<int> s;
s.push(i);
component[i] = currentComponent;
componentSets.back().push_back(i);
while (!s.empty()) {
int vertex = s.top(); s.pop();
for (int j : edgeList[vertex]) {
if (component[j] == -1) {
component[j] = currentComponent;
componentSets.back().push_back(j);
s.push(j);
}
}
}
++currentComponent;
}
}
return componentSets;
}
```

This solution is actually more memory efficient since the implicit stack in the recursion solution contains the function state, which includes 5 arguments and other local variables. According to the online judge, the explicit stack-based solution is only using 2/3 of the memory as the recursive solution.

However, the astute reader may note that we aren't quite doing the same thing. For example, in the explicit stack verion, we assign the component before we push the vertex on the stack, whereas in the recursive version, we assign the component after we retrieve the vertex from the stack. This is necessary to avoid adding a vertex to a stack twice. Consider the undirected graph with the edges $\{(0,1), (0,2), (1,2)\}$.

- First, we push $0$ onto the stack.
- We pop $0$ and check its edges, and push $1$ and $2$ onto the stack.
- Stacks are last in, first out (LIFO), so we pop $2$ off the stack. Now, we check the edges of $2$. That includes an edge to $1$. If we added $1$ to the stack without assigning it to a component, yet, we would push $1$ onto the stack again.

We're also visiting visiting the nodes in a different order. In the recursive solution, we would visit an unassigned vertex as soon as we see it, so we would have $0 \rightarrow 1 \rightarrow 2$, but in this explict stack version, we get $0 \rightarrow 2 \rightarrow 1$.

An explicit stack version that better simulates what the recursive solution does is

```
vector<vector<int>> findComponents(const vector<set<int>> &edgeList) {
int N = edgeList.size();
vector<int> component(N, -1);
vector<vector<int>> componentSets;
int currentComponent = 0;
int iterations = 0;
for (int i = 0; i < N; ++i) {
if (component[i] == -1) {
componentSets.emplace_back();
stack<tuple<int, set<int>::iterator>> s;
s.emplace(i, edgeList[i].begin());
while (!s.empty()) {
// important to assign by reference
tuple<int, set<int>::iterator> &state = s.top();
int vertex = get<0>(state);
set<int>::iterator &edgeIt = get<1>(state);
if (component[vertex] == -1) { // first time visiting
component[vertex] = currentComponent;
componentSets.back().push_back(vertex);
}
// if there is nothing else, pop
bool shouldPop = true;
while (edgeIt != edgeList[vertex].end()) {
int j = *edgeIt; ++edgeIt;
if (component[j] == -1) {
s.emplace(j, edgeList[j].begin());
shouldPop = false; // something new is on stack, do not pop it
break; // immediately put it on the stack, break, and visit next vertex
}
}
if (shouldPop) s.pop();
}
++currentComponent;
}
}
return componentSets;
}
```

Now, we assign the component after getting the vertex off the stack, and when we see an unassigned vertex, we immediately put it on top of the stack instead of adding all unassigned vertices first. This uses slightly more memory because now the stack has to keep track of which edges were visited, so the state includes `set<int>::iterator`

. Despite this explicit solution being more similar to recursion, I'd be hesitant to recommend this code because of how convoluted it is. You need to be very careful with references. Also, in more complicated cases, your state tuple will contain more than two variables, requiring you to keep track of more things and making it more likely that you will make a mistake.

Recently, I ran into a problem that forced me to recall a lot of old knowledge and use it in new ways, President and Roads. The problem is this:

We have $V$ vertices and $E$ edges. $2 \leq V \leq 10^5$, and $1 \leq E \leq 10^5$. Index these vertices $0, 1, \ldots, V - 1$. Edges are directed and weighted, and thus, can be written as $e_i = (a_i,b_i,l_i)$, where an edge goes from vertex $a_i$ to vertex $b_i$ with length $l_i$, where $1 \leq l_i \leq 10^6$. We're given a source $s$ and a target $t$. For each edge $e_i$, we ask:

If we take the shortest path from $s$ to $t$, will we always travel along $e_i$? If not, can we decrease $l_i$ to a positive integer so that we will always travel along $e_i$? If so, how much do we need to decrease $l_i$?

## Shortest Path

Obviously, we need to find the shortest path. My first thought was to use Floyd-Warshall, since I quickly realized that just the distances from the $s$ wouldn't be enough. Computing the shortest path for all pairs is overkill, however. Let $d(a,b)$ be the distance between two vertices. An edge $e_i = (a_i,b_i,l_i)$ will belong to a shortest path if $$d(s,a_i) + l_i + d(b_i,t)$$ equals to the length of the shortest path. Thus, we only need the distances from the $s$ and to $t$. We can find the distance from $s$ and to $t$ by running Dijkstra's algorithm twice. To find the distances to $t$, we'll need to reverse the edges and pretend $t$ is our source. There's still the problem that in the most simple implementation of Djiktra's, you need to find the vertex of minimum distance each time by linear search, which leads to an $O(V^2)$ algorithm. With a little bit of work, we can bring it down to $O\left((E + V)\log V\right)$ with a priority queue. Unfortunately, the built-in C++ and Java implementations do not support the decrease key operation. That leaves us with two options:

- The easiest is to reinsert a vertex into the priority queue, and keep track of already visited vertices. If we see the vertex a second time, throw it away.
- Implement our own priority queue.

Of course, I chose to implement my own priority queue based on a binary heap with an underlying hash map to support the decrease key operation. If I was really ambitious, I'd implement a Fibonnaci heap, to achieve $O\left(E + V\log V\right)$ running time. See my C++ implementation below.

I use it in Djikstra's algorithm here:

```
vector<pair<long long, unordered_set<int>>> computeShortestPath(int s, const vector<unordered_map<int, pair<int, int>>> &edgeList) {
int N = edgeList.size();
vector<pair<long long, unordered_set<int>>> distance(N, make_pair(LLONG_MAX/3, unordered_set<int>()));
phillypham::priority_queue pq(N);
distance[s].first = 0;
pq.push(s, 0);
while (!pq.empty()) {
int currentVertex = pq.top(); pq.pop();
long long currentDistance = distance[currentVertex].first;
// relax step
for (pair<int, pair<int, int>> entry : edgeList[currentVertex]) {
int nextVertex = entry.first; int length = entry.second.first;
long long newDistance = currentDistance + length;
if (distance[nextVertex].first > newDistance) {
distance[nextVertex].first = newDistance;
distance[nextVertex].second.clear();
distance[nextVertex].second.insert(currentVertex);
if (pq.count(nextVertex)) {
pq.decrease_key(nextVertex, newDistance);
} else {
pq.push(nextVertex, newDistance);
}
} else if (distance[nextVertex].first == newDistance) {
distance[nextVertex].second.insert(currentVertex);
}
}
}
return distance;
}
```

`computeShortestPath`

returns a vector of pairs, where each pair consists of the distance from $s$ and the possible previous vertices in a shortest path. Thus, I've basically recreated the graph only including the edges that are involved in some shortest path.

## Counting Paths

The next task is to figure out which edges are necessary. There could be multiple shortest paths, so if the shortest path has length $L$, given edge $e_i = (a_i, b_i, l_i)$, the condition $$d(s,a_i) + l_i + d(b_i,t) = L$$ is necessary but not sufficient. First, we need to make sure $e_i$ is distinct as there are no restrictions on duplicate edges. Secondly, every path must pass through $a_i$ and $b_i$. And finally, we need to make sure that $\left|P(b_i)\right| = 1$, where $P(b_i)$ is the set of vertices that come directly before vertex $b_i$ on a shortest path because you could imagine something like this scenario:

The shortest path is of length 4, and we have two options $s \rightarrow a \rightarrow c \rightarrow b \rightarrow t$ and $s \rightarrow a \rightarrow b \rightarrow t$. Every shortest path goes through $a$ and $b$, but the actual edge between $a$ and $b$ is not necessary to achieve the shortest path. Since we already computed the parents of $b_i$ when running `computeShortestPath`

, it's simple to check that $b_i$ only has one parent.

Now, the only thing left to do is count the number of shortest paths that go through each vertex. For any vertex, $v$, let $w$ be the number of ways to reach $v$ from $s$ and $x$ be the number ways to reach $t$ from $v$. Then, the number of paths that pass through $v$ is $wx$. Now one could use something like depth-first search to compute $w_j$ for every vertex $j$. We'd start at $s$, take every path, and increment $w_j$ every time that we encouter $j$. This is too slow, though, as we'll visit every node multiple times. If there are $10^9$ paths to $t$, we'll take all $10^9$.

We can solve this problem by using a priority queue again using a similar idea to Dijkstra's. The idea is that for a paricular vertex $v$, $$w_v = \sum_{u \in P(v)} w_u,$$ where you may recall that $P(v)$ consists of all the parents of $v$. So, we just need to calculate all the number of paths from $s$ to the parents first. Since every edge has positive length, all the parents will have shorter distance from $s$, thus, we can use a priority queue to get the vertices in increasing order of distance from $s$. Every time we visit a node, we'll update all its children, and we'll never have to visit that node again, so we can compute the number of paths in $O\left((E + V)\log V\right)$ time. See the code:

```
const int MOD_A = 1000000207;
const int MOD_B = 1000000007;
const int MOD_C = 999999733;
tuple<int, int, int> addPathTuples(tuple<int, int, int> a, tuple<int, int, int> b) {
return make_tuple((get<0>(a) + get<0>(b)) % MOD_A, (get<1>(a) + get<1>(b)) % MOD_B, (get<2>(a) + get<2>(b)) % MOD_C);
}
tuple<int, int, int> multiplyPathTuples(tuple<int, int, int> a, tuple<int, int, int> b) {
long long a0 = get<0>(a), a1 = get<1>(a), a2 = get<2>(a);
return make_tuple((a0*get<0>(b)) % MOD_A, (a1*get<1>(b)) % MOD_B, (a2*get<2>(b)) % MOD_C);
}
vector<tuple<int, int, int>> countPaths(int s,
const vector<pair<long long, unordered_set<int>>> &distances,
const vector<pair<long long, unordered_set<int>>> &children) {
// assume only edges that make shortest paths are included
int N = children.size();
vector<tuple<int, int, int>> pathCounts(N, make_tuple(0, 0, 0)); // store as tuple, basically modular hash function
pathCounts[s] = make_tuple(1, 1, 1);
phillypham::priority_queue pq(N);
pq.push(s, 0);
while (!pq.empty()) {
int currentVertex = pq.top(); pq.pop();
for (int nextVertex : children[currentVertex].second) {
pathCounts[nextVertex] = addPathTuples(pathCounts[currentVertex], pathCounts[nextVertex]);
if (!pq.count(nextVertex)) {
pq.push(nextVertex, distances[nextVertex].first);
}
}
}
return pathCounts;
}
```

$x$, the number of paths to $t$ from $v$ can be computed in much the same way by reversing edges and starting at $t$.

Now, you might notice that instead of returning the number of paths, I return a vector of `tuple<int, int, int>`

. This is because the number of paths is huge, exceeding what can fit in `unsigned long long`

, which is $2^{64} - 1 = 18446744073709551615$. We don't actually care about the number of paths, though, just that the number of paths is equal to the total number of paths. Thus, we can hash the number of paths. I chose to hash the paths by modding it by three large prime numbers close to $10^9$. See the complete solution below.

## Hashing and Chinese Remainder Theorem

At one level, doing this seems somewhat wrong. If there is a collision, I'll get the wrong answer, so I decided to do some analysis on the probability of a collision. It occured to me that hashing is actually just an equivalence relation, where two keys will belong to the same bucket if they are equivalent, so buckets are equivalence classes.

Now, I choose 3 large primes $p_1 = 1000000207$, $p_2 = 1000000007$, and $p_3 = 999999733$, each around $10^9$. We'll run into an error if we get two numbers $x$ and $y$ such that \begin{align*} x &\equiv y \bmod p_1 \\ x &\equiv y \bmod p_2 \\ x &\equiv y \bmod p_3 \end{align*} since $x$ and $y$ will have the same hash. Fix $y$. Let $N = p_1p_2p_3$, $y \equiv a_1 \bmod p_1$, $y \equiv a_2 \bmod p_2$, and $y \equiv a_3 \bmod p_3$. Then, by the Chinese Remainder Theorem, for $x$ to have the same hash, we must have $$ x \equiv a_1\left(\frac{N}{p_1}\right)^{-1}_{p_1}\frac{N}{p_1} + a_2\left(\frac{N}{p_2}\right)^{-1}_{p_2}\frac{N}{p_2} + a_3\left(\frac{N}{p_3}\right)^{-1}_{p_3}\frac{N}{p_3} \bmod N,$$ where $$\left(\frac{N}{p_i}\right)^{-1}_{p_i}$$ is the modular inverse with respect to $p_i$, which we can find by the extended Euclidean algorithm as mentioned here.

We'll have that \begin{align*} \left(\frac{N}{p_1}\right)^{-1}_{p_1} &\equiv 812837721 \bmod p_1 \\ \left(\frac{N}{p_2}\right)^{-1}_{p_2} &\equiv 57354015 \bmod p_2 \\ \left(\frac{N}{p_3}\right)^{-1}_{p_3} &\equiv 129808398 \bmod p_3, \end{align*} so if $a_1 = 10$, $a_2 = 20$, and $a_3 = 30$, then the smallest such $x$ that would be equivalent to $y$ is $x = 169708790167673569857984349$. In general, \begin{align*} x &= 169708790167673569857984349 + Nk \\ &= 169708790167673569857984349 + 999999946999944310999613117k, \end{align*} for some $k \in \mathbb{Z}$, which leaves us with only an approximately $10^{-27}$ chance of being wrong. Thus, for all intents and purposes, our algorithm will never be wrong.

## Binary Heap Priority Queue

```
namespace phillypham {
class priority_queue {
private:
int keysSize;
vector<int> keys;
vector<long long> values;
unordered_map<int, int> keyToIdx;
int parent(int idx) {
return (idx + 1)/2 - 1;
}
int left(int idx) {
return 2*(idx+1) - 1;
}
int right(int idx) {
return 2*(idx+1);
}
void heap_swap(int i, int j) {
if (i != j) {
keyToIdx[keys[j]] = i;
keyToIdx[keys[i]] = j;
swap(values[j], values[i]);
swap(keys[j], keys[i]);
}
}
void max_heapify(int idx) {
int lIdx = left(idx);
int rIdx = right(idx);
int smallestIdx = idx;
if (lIdx < keysSize && values[lIdx] < values[smallestIdx]) {
smallestIdx = lIdx;
}
if (rIdx < keysSize && values[rIdx] < values[smallestIdx]) {
smallestIdx = rIdx;
}
if (smallestIdx != idx) {
heap_swap(smallestIdx, idx);
max_heapify(smallestIdx);
}
}
void min_heapify(int idx) {
while (idx > 0 && values[parent(idx)] > values[idx]) {
heap_swap(parent(idx), idx);
idx = parent(idx);
}
}
public:
priority_queue(int N) {
keysSize = 0;
keys.clear(); keys.reserve(N);
values.clear(); values.reserve(N);
keyToIdx.clear(); keyToIdx.reserve(N);
}
void push(int key, long long value) {
// if (keyToIdx.count(key)) throw logic_error("key " + ::to_string(key) + " already exists");
int idx = keysSize; ++keysSize;
if (keysSize > keys.size()) {
keys.push_back(key);
values.push_back(value);
} else {
keys[idx] = key;
values[idx] = value;
}
keyToIdx[key] = idx;
min_heapify(idx);
}
void increase_key(int key, long long value) {
// if (!keyToIdx.count(key)) throw logic_error("key " + ::to_string(key) + " does not exist");
// if (values[keyToIdx[key]] > value) throw logic_error("value " + ::to_string(value) + " is not an increase");
values[keyToIdx[key]] = value;
max_heapify(keyToIdx[key]);
}
void decrease_key(int key, long long value) {
// if (!keyToIdx.count(key)) throw logic_error("key " + ::to_string(key) + " does not exist");
// if (values[keyToIdx[key]] < value) throw logic_error("value " + ::to_string(value) + " is not a decrease");
values[keyToIdx[key]] = value;
min_heapify(keyToIdx[key]);
}
void pop() {
if (keysSize > 0) {
heap_swap(0, --keysSize);
keyToIdx.erase(keys[keysSize]);
if (keysSize > 0) max_heapify(0);
} else {
throw logic_error("priority queue is empty");
}
}
int top() {
if (keysSize > 0) {
return keys.front();
} else {
throw logic_error("priority queue is empty");
}
}
int size() {
return keysSize;
}
bool empty() {
return keysSize == 0;
}
int at(int key) {
return values[keyToIdx.at(key)];
}
int count(int key) {
return keyToIdx.count(key);
}
string to_string() {
ostringstream out;
copy(keys.begin(), keys.begin() + keysSize, ostream_iterator<int>(out, " "));
out << '\n';
copy(values.begin(), values.begin() + keysSize, ostream_iterator<int>(out, " "));
return out.str();
}
};
}
```

## Solution

```
int main(int argc, char *argv[]) {
ios::sync_with_stdio(false); cin.tie(NULL);
int N, M, s, t; // number of nodes, edges, source, and target
cin >> N >> M >> s >> t;
--s; --t; // 0 indexing
vector<tuple<int, int, int>> edges;
vector<unordered_map<int, pair<int, int>>> edgeList(N);
vector<unordered_map<int, pair<int, int>>> reverseEdgeList(N);
for (int m = 0; m < M; ++m) { // read in edges
int a, b, l;
cin >> a >> b >> l;
--a; --b;
edges.emplace_back(a, b, l);
if (!edgeList[a].count(b) || edgeList[a][b].first > l) {
edgeList[a][b] = make_pair(l, 1);
reverseEdgeList[b][a] = make_pair(l, 1);
} else if (edgeList[a][b].first == l) {
++edgeList[a][b].second;
++reverseEdgeList[b][a].second;
}
}
vector<pair<long long, unordered_set<int>>> distanceFromSource = computeShortestPath(s, edgeList);
vector<pair<long long, unordered_set<int>>> distanceFromTarget = computeShortestPath(t, reverseEdgeList);
vector<tuple<int, int, int>> pathCounts = countPaths(s, distanceFromSource, distanceFromTarget);
vector<tuple<int, int, int>> backPathCounts = countPaths(t, distanceFromTarget, distanceFromSource);
for (int i = 0; i < N; ++i) pathCounts[i] = multiplyPathTuples(pathCounts[i], backPathCounts[i]);
long long shortestDistance = distanceFromSource[t].first;
tuple<int, int, int> shortestPaths = pathCounts[s];
for (tuple<int, int, int> edge : edges) {
long long pathDistance = distanceFromSource[get<0>(edge)].first + get<2>(edge) + distanceFromTarget[get<1>(edge)].first;
if (pathDistance == shortestDistance && // path is shortest
pathCounts[get<0>(edge)] == shortestPaths && // every path goes through from node
pathCounts[get<1>(edge)] == shortestPaths && // every path goes through to node
distanceFromSource[get<1>(edge)].second.size() == 1 && // only paths come from the from node
edgeList[get<0>(edge)][get<1>(edge)].second == 1) {
cout << "YES" << '\n';
} else if (pathDistance - get<2>(edge) + 1 < shortestDistance) {
cout << "CAN" << ' ' << (pathDistance - shortestDistance) + 1 << '\n';
} else {
// which will happen if the city is unconnected since the edge won't be big enough to overcome the big distance
cout << "NO" << '\n';
}
}
cout << flush;
return 0;
}
```