Directed Acyclic Graphs (DAGs)

A Directed Acyclic Graph (also known as DAG) consists of nodes (also called vertices) connected by edges, where each edge has a direction associated with it. As the name indicates, a Directed Acyclic Graph has two key properties: “directed” property and “acyclic” property.

The “directed” property means that the relationships between nodes have a specific orientation. For example, if node A is connected to node B, the edge points from A to B, but not necessarily from B to A. This directional nature allows DAGs to represent dependencies, hierarchies, or sequences of events.

The “acyclic” property is equally important. It means that there are no cycles or loops within the graph. In other words, if you start at any node and follow any sequence of directed edges, you can never return to the same node. The absence of cycles ensures that the graph has a clear structure without circular dependencies.

Below is an example Directed Acyclic Graph:

Notice that all the edges in the graph have a specific direction indicated by the arrow at the end of each edge. Also, following any sequence of edges in the above graph will not create a cycle. In contrast, below is an example graph which has directed edges but there is a cycle via nodes A->B->C, because of which the graph can’t be called as a directed acyclic graph.

If you are new to graphs, refer to this article to learn more about them.

Practical Applications of DAGs

Directed Acyclic Graphs (DAGs) have numerous practical applications and are widely used in many areas because they clearly show how different things are related and depend on each other.

One intuitive example is in project management, where tasks and their dependencies can be represented as a DAG. Each node in the DAG represents a task, and directed edges show which tasks must be completed before others can begin. For example, the below diagram represents some of the main steps involved in building a house using a Directed Acyclic Graph:

Foundation
Foundation
Pillars
Pillars
Walls
Walls
Base Flooring
Base Flooring
Windows
Windows
Plumbing & Wiring
Plumbing & Wiring
Interiors
Interiors
Text is not SVG - cannot display

Another compelling application of DAGs can be found in academic course planning. In this context, each node represents a course, and the edges indicate prerequisites. For instance, in a computer science curriculum, a course on “Algorithms” can be taken up only after a basic understanding of “Data Structures”. In a directed graph, this relation is represented using a directed edge from “Data Structures” node to “Algorithms” node. This structure helps students and advisors plan a logical sequence of courses, ensuring that students have the necessary foundational knowledge before advancing to more complex subjects. Below is an example graph which shows all the courses along with prerequisites/dependency information as a Directed Acyclic Graph:

Introduction to Programming
Introduction to Prog...
Data Structures
Data Structures
Algorithms
Algorithms
DataBases
DataBases
Backend Development
Backend Development
HTML, CSS, JS
HTML, CSS, JS
React, Vue etc
React, Vue etc
Frontend Developer
Frontend Developer
Full Stack Developer
Full Stack Developer
Text is not SVG - cannot display

The acyclic nature of DAGs prevents situations where two courses are prerequisites for each other, which would create an impossible study plan.

Below are some other applications where Directed Acyclic Graphs are helpful:

  • Project Management: DAGs represent tasks and dependencies in project schedules, helping visualize critical paths and optimize timelines for efficient project completion.
  • Data Processing Pipelines: DAGs model workflows in data engineering and analytics, ensuring data is processed in the correct order without circular dependencies for efficient and accurate data processing.
  • Version Control Systems: Git uses DAGs to represent commit history, facilitating branching, merging, and collaborative development through parent-child commit relationships.
  • Dependency Resolution: Package managers use DAGs to resolve software dependencies, determining the correct order for installing or updating software to prevent conflicts and satisfy all dependencies.
  • Bayesian Networks: DAGs model probabilistic relationships between variables, used in machine learning for prediction and inference tasks by representing conditional dependencies.
  • Task Scheduling in Computing: Operating systems use DAGs to schedule tasks and manage resources, optimizing CPU usage and parallel processing for efficient resource allocation.

Properties of Directed Acyclic Graphs

Below are the fundamental properties of Directed Acyclic Graphs:

  • Graph structure: A directed acyclic graph (DAG) consists of vertices connected by directed edges, where each edge has a direction from one vertex to another.
  • Directed: In a DAG, all edges have a direction, meaning they point from one vertex to another, indicating a one-way relationship.
  • Acyclic: A DAG has no cycles, meaning there is no path that starts and ends at the same vertex following the direction of the edges.
  • No self-loops: A DAG does not have any edges that start and end at the same vertex.

Below are some of the Derived/Inferred Properties of Directed Acyclic Graphs:

  • Topological ordering: DAGs can always be arranged in a topological order, where for every directed edge from vertex A to vertex B, A comes before B in the ordering. We will learn more about this in the next section.
  • At least one source and one sink: In a DAG, there is at least one vertex with no incoming edges (source) and at least one vertex with no outgoing edges (sink).
  • Transitive closure: The transitive closure of a DAG is another DAG where an edge exists between vertices A and B if there is a path from A to B either directly or indirectly in the original graph.
  • Tree-like structure: While not all DAGs are not actually trees, they can often be visualized as something similar to trees, where the hierarchical (parent-child) relationships are preserved without cycles.

Topological Ordering of Directed Acyclic Graph

Topological ordering of a directed acyclic graph (DAG) is like creating a list of tasks where each task must come before the tasks that depend on it. In other words, it’s a way to arrange the nodes (or vertices) of the graph in a sequence such that for every directed edge from node A to node B, node A appears before node B in the ordering. This kind of ordering is only possible if the graph has no cycles, meaning there are no loops or paths that start and end at the same node, hence this is applicable to only Directed Acyclic Graphs.

Let’s consider an example to help understand topological ordering intuitively. Imagine you have a set of tasks to complete, and some tasks must be done before others. For instance, consider the following tasks with dependencies:

  • Task 1: No prerequisites/dependencies
  • Task 2: No prerequisites/dependencies
  • Task 3: Depends on Task 1
  • Task 4: Depends on Task 1 and Task 2
  • Task 5: Depends on Task 3

You can represent these tasks and their dependencies as a directed acyclic graph (DAG) shown below:

Task 1
Task 1
Task 2
Task 2
Task 3
Task 3
Task 4
Task 4
Task 5
Task 5
Text is not SVG - cannot display

In this graph:

  • Task 1 has directed edges to Task 3 and Task 4.
  • Task 2 has a directed edge to Task 4.
  • Task 3 has a directed edge to Task 5.

Below are some of the valid topological ordering for this graph:

  • 1 -> 2 -> 3 -> 4 -> 5
  • 2 -> 1 -> 3 -> 4 -> 5
  • 2 -> 1 -> 4 -> 3 -> 5
  • 1 -> 2 -> 4 -> 3 -> 5

The reasoning behind the valid topological ordering is as follows

  • Task 1 and Task 2 can be done first because they have no prerequisites. They can also be done in any order.
  • Task 3 can be done only after Task 1 is completed.
  • Task 4 can be done only after both Task 1 and Task 2 are completed.
  • Task 5 can be done only after Task 3 is completed.

As long as the above ordering requirements are met, any order is a valid topological order. For instance, the ordering 3 -> 1 -> 2 -> 4 -> 5 is not a valid topological order as Task 3 cannot be performed without performing Task 1.

Click here to learn more about topological sorting and different algorithms to perform topological sort on a Directed Acyclic Graph.

Representations of DAGs

Directed Acyclic Graphs can be represented similar to normal graphs using Edge list, Adjacency List or Adjacency Matrix representations. Let’s go through each representation using the tasks example discussed before:

Task 1
Task 1
Task 2
Task 2
Task 3
Task 3
Task 4
Task 4
Task 5
Task 5
Text is not SVG - cannot display

Edge List

An edge list is a list of all directed edges in the graph. Each edge is represented as a pair (u, v), where u is the starting vertex and v is the ending vertex.

Below is how the edge list representation looks like:

[(1, 3), (1, 4), (2, 4), (3, 5)]

Click here to learn more about Edge List representation of a graph.

Adjacency List

An adjacency list represents the graph as a list of lists. Each vertex has a list of all vertices it has edges to.

Below is how the Adjacency list representation looks like:

1: [3, 4]
2: [4]
3: [5]
4: []
5: []

Click here to learn more about Adjacency List representation of a graph.

Adjacency Matrix

An adjacency matrix is a 2D array where the cell at row i and column j is 1 if there is a directed edge from vertex i to vertex j, and 0 otherwise.

Below is how the Adjacency matrix representation looks like, assuming the tasks are numbered from 1 to 5:

   1  2  3  4  5
1 [0, 0, 1, 1, 0]
2 [0, 0, 0, 1, 0]
3 [0, 0, 0, 0, 1]
4 [0, 0, 0, 0, 0]
5 [0, 0, 0, 0, 0]

Click here to learn more about Adjacency Matrix representation of a graph.

Common Operations on DAGs

Below are some of the common operations performed on Directed Acyclic Graphs based on the use case:

  • Topological Sorting: Topological sorting arranges the vertices of a DAG in a linear order such that for every directed edge u->v, vertex u comes before vertex v. This is useful for scheduling tasks, ordering events, and resolving symbol dependencies in compilers.
  • Cycle Detection: Cycle detection in a DAG is used to verify whether the graph is acyclic (a valid DAG). This is important because the presence of a cycle would invalidate the DAG properties and make operations like topological sorting impossible. Cycle detection can be performed using Depth-First Search (DFS) to check for back edges.
  • Transitive Reduction: Transitive reduction of a DAG involves removing as many edges as possible while preserving the reachability of vertices. This results in the smallest equivalent graph that maintains the same reachability. It is useful in minimizing the complexity of the graph without altering its fundamental properties.
  • Finding Shortest Paths: In a DAG, finding the shortest path from a source to all other vertices can be efficiently done using topological sorting followed by edge relaxation. This helps in determining the quickest way to complete tasks.
  • Finding Longest Paths: Finding the longest path in a DAG is crucial for applications like the Critical Path Method (CPM) in project scheduling, which identifies the maximum time taken to complete a sequence of dependent tasks.