Kosaraju's Strongly Connected Components Algorithm

Kosaraju's Algorithm (Noisy audio but has clear explanation)
🔗
Walkthrough of Kosaraju's Algorithm
🔗
Click here to suggest a better video or view suggestions..

Kosaraju’s Algorithm is a graph traversal algorithm used to find all strongly connected components (SCCs) present in a directed graph. The purpose of this algorithm is to efficiently identify the subsets/groups of vertices in a directed graph where each vertex in a subset is reachable from every other vertex within the same subset.

What are Strongly Connected Components?

Strongly connected components (SCCs) is a concept in graph theory used to describe subgraphs of a directed graph where every vertex in a subgraph is reachable from every other vertex within the same subgraph. In simpler terms, an SCC is a portion of the graph where you can travel between any two nodes in both directions without leaving the component. These components are important because they help to identify sections of a graph that are tightly interconnected, which can be useful for understanding the structure and behavior of networks, such as social networks or web pages.

Below is an illustration of a graph with each of its strongly connected components labeled:

A
A
B
B
C
C
SCC-1
SCC-1
E
E
H
H
G
G
F
F
SCC-4
SCC-4
D
D
SCC-2
SCC-2
J
J
I
I
SCC-3
SCC-3
Text is not SVG - cannot display

Refer to this article to learn more about strongly connected components.

Detailed Steps of Kosaraju’s SCC Algorithm

Below are all the steps involved in Kosaraju’s algorithm:

Step 1: Perform first DFS and fill Post-Order Stack

  1. Initialise a stack which stores nodes in post-order. Meaning, a node is added to the stack only after all the nodes connected to it are processed and added to the stack.
  2. Perform a Depth-First Search (DFS) on the original graph.
  3. Start from any unvisited vertex.
  4. For each vertex:
    • Mark it as visited.
    • Recursively visit its unvisited neighbors.
    • After processing all neighbors, push the vertex onto the stack.
  5. The stack will contain vertices in post-order (reverse finish time).

Step 2: Graph Transposition (Creation of graph with reversed edges)

  1. Create a new graph with the same vertices as the original.
  2. For each edge (u, v) in the original graph, add edge (v, u) to the new graph.

Step 3: Perform Second DFS on the Transposed Graph

  1. Initialize all vertices as unvisited.
  2. While the stack from Step 1 is not empty:
    • Pop a vertex from the stack.
    • If the vertex is unvisited, start a new DFS from this vertex.
    • Mark all vertices visited during this DFS as part of a new Strongly Connected Component.

Implementation of Kosaraju’s Algorithm


# Function to perform DFS and fill the post-order stack
def dfs1(graph, vertex, visited, stack):
  # Mark the current vertex as visited
  visited[vertex] = True

  # Recursively visit all unvisited neighbors
  for neighbor in graph[vertex]:
    if not visited[neighbor]:
      dfs1(graph, neighbor, visited, stack)

  # Push the current vertex onto the stack
  stack.append(vertex)

# Function to perform DFS on the transposed graph
def dfs2(graph, vertex, visited, scc):
  # Mark the current vertex as visited
  visited[vertex] = True

  # Add the current vertex to the current SCC
  scc.append(vertex)

  # Recursively visit all unvisited neighbors
  for neighbor in graph[vertex]:
    if not visited[neighbor]:
      dfs2(graph, neighbor, visited, scc)

# Function to find the strongly connected components
def find_strongly_connected_components(graph):
  # Step 1: Perform first DFS and fill the post-order stack
  visited = {vertex: False for vertex in graph}
  stack = []
  for vertex in graph:
    if not visited[vertex]:
      dfs1(graph, vertex, visited, stack)

  # Step 2: Create the transposed graph
  transposed_graph = {vertex: [] for vertex in graph}
  for vertex in graph:
    for neighbor in graph[vertex]:
      transposed_graph[neighbor].append(vertex)

  # Step 3: Perform second DFS on the transposed graph
  visited = {vertex: False for vertex in graph}
  strongly_connected_components = []
  while stack:
    vertex = stack.pop()
    if not visited[vertex]:
      # We came across a node in unvisited SCC
      current_scc = []
      dfs2(transposed_graph, vertex, visited, current_scc)
      strongly_connected_components.append(current_scc)

  return strongly_connected_components

graph = {
  "A": ["D", "B"],
  "B": ["C"],
  "C": ["A"],
  "D": ["I"],
  "E": ["D", "F"],
  "F": ["G"],
  "G": ["H"],
  "H": ["E"],
  "I": ["J"],
  "J": ["I"]
}

sccs = find_strongly_connected_components(graph)
for index, scc in enumerate(sccs):
  print(f"SCC-{index+1}: {scc}")

Time and Space Complexity

Time Complexity:

The time complexity of Kosaraju’s algorithm is O(V + E), where V is the number of vertices and E is the number of edges in the graph.

  • Step-1 of the algorithm involves DFS traversal which takes O(V + E) time.
  • Step-2 of the algorithm involved transposing the graph (reverse the direction of all edges) which takes O(V + E) time.
  • Step-3 and Step-4 involves another DFS on the transposed graph which again takes O(V + E) time.

The total time complexity of Kosaraju’s algorithm is the sum of the time complexities of these three steps, which is O(V + E).

Space Complexity:

Kosaraju’s algorithm requires O(V) space to store all the nodes in the graph in post-order. Additionally, the algorithm requires O(V+E) space for storing the transposed graph. So in all, the algorithm requires O(V+E) space where V is the number of vertices and E is the number of edges in the graph.

Intuition Behind the Algorithm

To understand how Kosaraju’s algorithm works, it’s important to understand the order in which Depth First Search (DFS) visits the nodes in a graph. You can refer to this article to understand more about Depth First Search if you are new to this concept.

Let’s consider the example graph below to understand how strongly connected components can be grouped using DFS.

A
A
B
B
C
C
SCC-1
SCC-1
E
E
H
H
G
G
F
F
SCC-4
SCC-4
D
D
SCC-2
SCC-2
J
J
I
I
SCC-3
SCC-3
Text is not SVG - cannot display

If you start DFS from any node in SSC-1, you will be able to reach all the nodes present in SCC-1, SCC-2 and SCC-3. You can’t reach SCC-4 because there is no path from any node present in SCC-1 to a node present in SCC-4.

But, if you start DFS from any node in SCC-3, only the nodes present in that component will be visited in the DFS iteration as there is no edge flowing out of SCC-3. So all the nodes visited in this DFS iteration can be considered as belonging to one strongly connected component. Below are the remaining components after this DFS iteration:

A
A
B
B
C
C
SCC-1
SCC-1
E
E
H
H
G
G
F
F
SCC-4
SCC-4
D
D
SCC-2
SCC-2
J
J
I
I
SCC-3
SCC-3
Text is not SVG - cannot display

Strongly Connected Component identified in this DFS iteration: SCC-3 with nodes {I, J}

Now, if you start DFS from any node in SCC-1 or SCC-4, you will also reach nodes in SCC-2. So these nodes are not a good place to start the next DFS. But, if you start DFS from any node in SCC-2, you will only visit nodes in that strongly connected component. Even though there is path to SCC-3, we would consider those nodes as they are already identified as a separated strongly connected component. Below are this remaining components after this DFS iteration:

A
A
B
B
C
C
SCC-1
SCC-1
E
E
H
H
G
G
F
F
SCC-4
SCC-4
D
D
SCC-2
SCC-2
J
J
I
I
SCC-3
SCC-3
Text is not SVG - cannot display

Strongly Connected Component identified in this DFS iteration: SCC-2 with nodes {D}

Now we can perform DFS from any node in SCC-1 and add all nodes visited to another strongly connected component. Below are this remaining components after this DFS iteration:

A
A
B
B
C
C
SCC-1
SCC-1
E
E
H
H
G
G
F
F
SCC-4
SCC-4
D
D
SCC-2
SCC-2
J
J
I
I
SCC-3
SCC-3
Text is not SVG - cannot display

Strongly Connected Component identified in this DFS iteration: SCC-1 with nodes {A,B,C}

Finally we can perform DFS from any node present in SCC-4 and add all nodes visited to another strongly connected component. After this DFS iteration, all the nodes in the graph are categorised into strongly connected components.

A
A
B
B
C
C
SCC-1
SCC-1
E
E
H
H
G
G
F
F
SCC-4
SCC-4
D
D
SCC-2
SCC-2
J
J
I
I
SCC-3
SCC-3
Text is not SVG - cannot display

Strongly Connected Component identified in this DFS iteration: SCC-4 with nodes {E,F,G,H}

The pattern is that if you perform DFS from any node in a strongly connected component which does not have any outward edge to any other active component in the graph, only the nodes in that component will be covered in the DFS iteration. If we keep doing this, we get all nodes in a strongly connected component with each DFS iteration.

Kosaraju’s algorithm identifies all strongly connected components using the same principle but in a slightly different way. It first performs DFS from a random node in the graph and stores all nodes in a stack in post-order, meaning a node is added to the stack only after all nodes reachable from it have been visited. Because of this, the nodes from which there are paths to other components will be at the top of the stack. Due to reversal of all edges in the graph in the next step, the nodes present at the top of stack will be the ones from which there won’t be any outward paths to other unvisited components in the reversed graph.

Because of this, in each DFS iteration performed in Kosaraju’s algorithm on the transposed graph, all nodes belonging to a specific strongly connected component will be visited/identified.