# Topological Sorting

All about Topological Sort by WilliamFiset
ðŸ”—
Topological Sort DFS Algorithm Wallkthrough
ðŸ”—
Walkthrough of Kahn's Algorithm
ðŸ”—

Topological sorting is a concept in computer science, which is used in scenarios where a set of tasks must be performed in a specific order such that all the required conditions are satisfied. This concept is widely applied in task scheduling, dependency management, and graph theory.

## Example of Topological Sorting

Let’s consider a sample example using academic subjects and their prerequisites/dependencies to demonstrate topological sorting.

Assume that below are the list of subjects a student has to complete:

1. Introduction to Programming (IP)
2. Data Structures (DS)
3. Algorithms (ALG)
4. Database Systems (DB)
5. Web Development (WD)
6. Machine Learning (ML)

Below are the dependencies/requirements before enrolling for each subject:

• “Data Structures” requires completion of “Introduction to Programming”
• “Algorithms” requires completion of “Data Structures”
• “Database Systems” requires “Introduction to Programming”
• “Web Development” requires “Data Structures” and “Database Systems”
• “Machine Learning” requires “Algorithms” and “Web Development”

We can represent the dependencies as a directed acyclic graph (DAG):

``````IP ---> DS ---> ALG ---> ML
|       |              ^
|       |             /
v       v            /
DB ----> WD ----------
``````

Now, let’s find a valid topological ordering for these subjects:

1. Start with Introduction to Programming (`IP`) as it has no prerequisites.
2. We can then take either Data Structures (`DS`) or Database Systems (`DB`), as both only require `IP`.
3. Let’s choose `DS` next.
4. Now we can take `ALG` (which requires `DS`) or `DB`.
5. Let’s choose `DB`.
6. At this point, we can take either `ALG` or `WD`.
7. We’ll choose `ALG`.
8. Now we can only take `WD` as `ML` needs `WD`.
9. Let’s finish with `WD`, then `ML`.

So, a valid topological ordering would be:

``````IP -> DS -> DB -> ALG -> WD -> ML
``````

This ordering ensures that each subject is taken only after its prerequisites have been completed.

It’s important to note that this is not the only valid topological ordering. For example, another valid ordering could be:

``````IP -> DB -> DS -> WD -> ALG -> ML
``````

In this alternative ordering, we took `DB` before `DS`, which is also valid since both only depend on `IP`. We also took `WD` before `ALG`, which is fine because `WD` doesn’t depend on `ALG`.

The key point is that in any valid topological ordering:

1. `IP` will always come first
2. `ML` will always come last
3. `DS` will always come before `ALG`
4. `DB` will always come before `WD` and `ML`
5. `ALG` and `WD` will always come before `ML`

Depth-First Search (DFS) recursively explores a graph, fully visiting all connected nodes before backtracking. This property is useful for topological sorting, which orders nodes in a directed graph such that for every edge A -> B, A comes before B.

In the DFS-based topological sort, we maintain a list and add each node to this list after exploring all its neighbors. This ensures that a node is added only after all nodes depending on it are added. However, this process actually creates a reverse topological order.

For example, in a graph with A -> B -> D, DFS exploration would add nodes to the list in the order [D, B, A]. To get the correct topological order, we must reverse this list, resulting in [A, B, D]. This final reversed list ensures that each node appears before any nodes that depend on it, satisfying the definition of a topological sort.

Below is the implementation of topological sort using DFS:

``````
def dfs(graph, node, visited, sorted_nodes):
"""
Perform a Depth-First Search (DFS) starting from a given node.

Args:
graph: A dictionary representing the adjacency list of the graph.
node: The starting node for the DFS.
visited: A set to keep track of visited nodes.
sorted_nodes: A list to store the topologically sorted nodes.
"""
# Mark the current node as visited

# Iterate through the neighbors of the current node
for neighbor in graph[node]:
# If the neighbor has not been visited, recursively call DFS
if neighbor not in visited:
dfs(graph, neighbor, visited, sorted_nodes)

# Add the current node to the sorted list
sorted_nodes.append(node)

def topological_sort(graph):
"""
Perform a topological sort on a directed graph.

Args:
graph (dict): A dictionary representing the adjacency list of the graph.

Returns:
list: A list of nodes in topologically sorted order.
"""
# Initialize an empty list to store the sorted nodes
sorted_nodes = []

# Initialize a set to keep track of visited nodes
visited = set()

# Iterate through all the nodes in the graph
for node in graph:
# If the node has not been visited, perform DFS
if node not in visited:
dfs(graph, node, visited, sorted_nodes)

# Reverse the list to get the correct topological order
sorted_nodes.reverse()

return sorted_nodes

# Let's test the function with courses dependencies discussed before
courses_graph = {
"IP" : ["DS", "DB"],
"DS" : ["ALG", "WD"],
"DB" : ["WD"],
"ALG" : ["ML"],
"WD" : ["ML"],
"ML" : []
}

topological_order = topological_sort(courses_graph)
print("Below is one valid order in which courses can be taken:")
print(topological_order)
``````

## Topological Sorting Using Khan’s Algorithm

Kahn’s algorithm for topological sorting works by repeatedly finding and removing nodes that have no dependencies. We start by identifying all nodes with no prerequisites (no incoming edges to the node). These are our “ready to process” nodes. We add them to our result list and remove them from the graph. As we remove each node, we also remove its outgoing dependencies. If this causes any other nodes to lose all their dependencies, we add those to our “ready to process” list. We repeat this process until all nodes are completed.

Below is the implementation of topological sort using Kahn’s algorithm:

``````
def topological_sort(graph):
"""
Perform a topological sort on a directed graph using Kahn's algorithm.

Args:
graph (dict): A dictionary representing the adjacency list of the graph.

Returns:
list: A list of nodes in topologically sorted order
"""
# Create a list to store nodes with 0 in-degree
zero_indegree_nodes = []

# Create a dictionary to store the in-degree of each node
in_degree = {}

# Initialize in-degree of all nodes to 0
for node in graph:
in_degree[node] = 0

# Calculate the in-degree of each node
for node in graph:
for neighbor in graph[node]:
in_degree[neighbor] += 1

# Add nodes with 0 in-degree to the zero_indegree_nodes list
for node in in_degree:
if in_degree[node] == 0:
zero_indegree_nodes.append(node)

# Perform topological sort
topological_order = []
while zero_indegree_nodes:
# Remove a node from the end of the list
current_node = zero_indegree_nodes.pop()
topological_order.append(current_node)

# Decrement the in-degree of its neighbors
for neighbor in graph[current_node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
zero_indegree_nodes.append(neighbor)

# Check if there is a cycle in the graph
if len(topological_order) != len(graph):
print("Error: Cycle detected in the graph")
return []

# Let's test the function with courses dependencies discussed before
courses_graph = {
"IP": ["DS", "DB"],
"DS": ["ALG", "WD"],
"DB": ["WD"],
"ALG": ["ML"],
"WD": ["ML"],
"ML": []
}

topological_order = topological_sort(courses_graph)
print("Below is one valid order in which courses can be taken:")
print(topological_order)
``````

## Applications of Topological Sorting

Topological sorting has numerous real-world applications:

• Task Scheduling and Dependency Management: Ensuring tasks are executed in the correct order based on dependencies.
• Compiler Design and Build Systems: Determining the order of compilation for source code files.
• Ordering of Courses in an Academic Curriculum: Ensuring prerequisites are completed before advanced courses.
• Dependency Analysis in Software Engineering: Managing package dependencies in software projects.
• Sequencing Problems in Various Domains: Solving problems where a sequence must respect certain constraints.