What is a graph
Formally: A finite graph G(V, E) is a pair (V, E), where V is a finite set and E is a binary relation on V.
Recall: A relation R between two sets X and Y is a subset of X × Y. For each pair of distinct vertices in V, that pair is either in E or not in E.
The elements of the set V are called vertices (or nodes) and those of set E are called edges.
Undirected graph: The edges are unordered pairs of elements of V (i.e. the binary relation is symmetric).
Ex: undirected G(V,E); V = {a,b,c}, E = {{a,b}, {b,c}}
Directed graph (digraph): The edges are ordered pairs of elements of V (i.e. the binary relation is not necessarily symmetric).
Ex: digraph G(V,E); V = {a,b,c}, E = {(a,b), (b,c)}
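The two examples above can be written down directly as Python sets. This is only a minimal sketch: the names `E_undirected`, `E_directed`, `as_relation` and the helper `is_symmetric` are illustrative and do not come from the original notes.

```python
# Vertex set shared by both examples.
V = {"a", "b", "c"}

# Undirected edges are unordered pairs, so frozenset is a natural fit.
E_undirected = {frozenset({"a", "b"}), frozenset({"b", "c"})}

# Directed edges are ordered pairs, so tuples are used.
E_directed = {("a", "b"), ("b", "c")}


def is_symmetric(relation):
    """Return True if (y, x) is in the relation whenever (x, y) is."""
    return all((y, x) in relation for (x, y) in relation)


# Viewing each undirected edge as two ordered pairs gives a symmetric relation.
as_relation = {(x, y) for e in E_undirected for x in e for y in e if x != y}
```

Here `is_symmetric(as_relation)` holds, while `is_symmetric(E_directed)` does not, because (b, a) and (c, b) are missing from the digraph.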
Intro. to Graph Theory
Why graphs?
Many problems can be stated in terms of a graph.
The properties of graphs are well studied:
Many algorithms exist to solve problems posed as graphs.
Many problems are already known to be intractable.
By reducing an instance of a problem to a standard graph problem, we may be able to use well-known graph algorithms to provide an optimal solution.
Graphs are excellent structures for storing, searching, and retrieving large amounts of data.
Graph-theoretic techniques play an important role in increasing the storage/search efficiency of computational techniques.
Graphs in bioinformatics
Sequences
DNA, proteins, etc.
Chemical compounds
Metabolic pathways
Graphs in bioinformatics
Phylogenetic trees
Basic definitions
[Figure: an undirected graph and a directed graph G=(V,E), with a loop, an isolated vertex, multiple edges, and adjacent vertices labelled]
incidence: an edge (directed or undirected) is incident to a vertex that is one of its end points
degree of a vertex: number of edges incident to it
Nodes of a digraph can also be said to have an indegree and an outdegree
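As a small illustration of degree, indegree and outdegree, the sketch below counts them from an edge list. It assumes a graph without loops, given as pairs of end points; for the digraph, each pair is read as (start, end). The function names are illustrative.

```python
from collections import Counter


def degree(edges):
    """Degree of each vertex of an undirected graph without loops."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def in_out_degree(arcs):
    """Indegree and outdegree of each vertex of a digraph.

    Each arc is a (start, end) pair: it adds one to the outdegree of its
    start vertex and one to the indegree of its end vertex.
    """
    indeg, outdeg = Counter(), Counter()
    for start, end in arcs:
        outdeg[start] += 1
        indeg[end] += 1
    return indeg, outdeg
```

A `Counter` returns 0 for vertices it has never seen, so a vertex with no incoming edges simply reports indegree 0.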
Travel in graphs
[Figure: example graph on vertices a through e used in the examples below]
path: no vertex can be repeated (example path: a-b-c-d-e)
trail: no edge can be repeated (example trail: a-b-c-d-e-b-d)
walk: no restriction (example walk: a-b-d-a-b-c)
closed: if starting vertex is also ending vertex
length: number of edges in the path, trail, or walk
circuit: a closed trail (ex: a-b-c-d-b-e-d-a)
cycle: a closed path (ex: a-b-c-d-a)
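The definitions above can be checked mechanically on a vertex sequence. The following sketch uses a hypothetical helper `classify`; it assumes undirected edges and does not check that the edges actually exist in a particular graph. For a closed sequence, the repeated start/end vertex is not counted as a repeated vertex.

```python
def classify(seq):
    """Return (kind, closed, length) for a sequence of vertices.

    kind is 'path' (no repeated vertex), 'trail' (no repeated edge),
    or 'walk' (no restriction).
    """
    # Consecutive vertices form the edges; frozenset ignores direction.
    edges = [frozenset(pair) for pair in zip(seq, seq[1:])]
    closed = len(seq) > 1 and seq[0] == seq[-1]
    # Ignore the final vertex of a closed sequence when looking for repeats.
    inner = seq[:-1] if closed else seq
    edges_distinct = len(set(edges)) == len(edges)
    vertices_distinct = len(set(inner)) == len(inner)
    if vertices_distinct and edges_distinct:
        kind = "path"
    elif edges_distinct:
        kind = "trail"
    else:
        kind = "walk"
    return kind, closed, len(edges)
```

A closed trail is a circuit and a closed path is a cycle, so `classify("a-b-c-d-a".split("-"))` reports a closed path of length 4.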
Types of graphs
simple graph: an undirected graph with no loops or multiple edges between the same two vertices
multi-graph: any graph that is not simple
connected graph: all vertex pairs are joined by a path
disconnected graph: at least one vertex pair is not joined by a path
complete graph: all vertex pairs are adjacent
Kn: the completely connected graph with n vertices
[Figure: a simple graph; the complete graph K5; a disconnected graph with two components]
Types of graphs
acyclic graph (forest): a graph with no cycles
tree: a connected, acyclic graph
rooted tree: a tree with a root or distinguished vertex
leaves: the terminal nodes of a rooted tree
directed acyclic graph (DAG): a digraph with no cycles
weighted graph: any graph with weights associated with the edges (edge-weighted) and/or the vertices (vertex-weighted)
Digraph definitions
The following definitions are for digraphs only.
Every edge has a head (starting point) and a tail (ending point). Note that many texts use the opposite naming.
Walks, trails, and paths can only use edges in the appropriate direction.
In a DAG, every path connects a predecessor/ancestor (the vertex at the head of the path) to its successors/descendants (nodes at the tail of any path).
parent: direct ancestor (one hop)
child: direct descendant (one hop)
A descendant vertex is reachable from any of its ancestor vertices.
Computer representation
undirected graphs: usually represented as digraphs with two directed edges per actual undirected edge
adjacency matrix: a |V| x |V| array where each cell i,j contains the weight of the edge between vi and vj (or 0 for no edge)
adjacency list: a |V| array where each cell i contains a list of all vertices adjacent to vi
incidence matrix: a |V| by |E| array where each cell i,j contains a weight (or a defined constant HEAD for unweighted graphs) if vertex i is the head of edge j, or a constant TAIL if vertex i is the tail of edge j
[Figure: a weighted digraph on vertices a, b, c, d, shown with its adjacency matrix, adjacency list, and incidence matrix]
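The three representations can be built from a single list of weighted edges. The sketch below uses a hypothetical digraph, not the one from the figure. It follows the convention of these notes, where the head is the starting vertex of an edge; using 0 for empty incidence cells and the string "t" as the TAIL constant are assumptions made for the example.

```python
vertices = ["a", "b", "c", "d"]
# Hypothetical weighted edges as (start, end, weight).
arcs = [("a", "b", 6), ("b", "c", 2), ("c", "d", 8), ("a", "d", 10)]
index = {v: i for i, v in enumerate(vertices)}
n = len(vertices)

# Adjacency matrix: cell [i][j] holds the weight of edge vi -> vj, 0 for no edge.
adj_matrix = [[0] * n for _ in range(n)]
for u, v, w in arcs:
    adj_matrix[index[u]][index[v]] = w

# Adjacency list: for each vertex, the (neighbour, weight) pairs it points to.
adj_list = {v: [] for v in vertices}
for u, v, w in arcs:
    adj_list[u].append((v, w))

# Incidence matrix: one column per edge. The starting vertex (head, in the
# convention of these notes) stores the weight; the ending vertex stores TAIL.
TAIL = "t"
inc_matrix = [[0] * len(arcs) for _ in range(n)]
for j, (u, v, w) in enumerate(arcs):
    inc_matrix[index[u]][j] = w
    inc_matrix[index[v]][j] = TAIL
```

The adjacency matrix gives constant-time edge lookup at the cost of |V| x |V| space, while the adjacency list stores only the edges that exist, which is usually the better choice for sparse graphs.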
Computer representation
Linked list of nodes: Node is a defined data object with labels which include a list of pointers to its children and/or parents
Subgraphs
G'(V', E') is a subgraph of G(V, E) if V' ⊆ V and E' ⊆ E.
induced subgraph: a subgraph that contains every edge in E whose end points are both in the selected V'
[Figure: a graph G(V,E) and a subgraph]
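A short sketch of both definitions, assuming undirected edges stored as `frozenset` pairs. The function names `is_subgraph` and `induced_subgraph` are illustrative.

```python
def is_subgraph(V_sub, E_sub, V, E):
    """True if (V_sub, E_sub) is a subgraph of (V, E)."""
    V_sub, E_sub = set(V_sub), set(E_sub)
    # Every edge of the subgraph must also join vertices of the subgraph.
    ends_ok = all(set(e) <= V_sub for e in E_sub)
    return V_sub <= set(V) and E_sub <= set(E) and ends_ok


def induced_subgraph(E, selected):
    """Edges of the subgraph induced by the vertex subset `selected`."""
    selected = set(selected)
    return {e for e in E if set(e) <= selected}
```

For example, with edges a-b, b-c, c-a and c-d, the subgraph induced by {a, b, c} keeps the triangle and drops c-d.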
Complement of a graph
The complement of a graph G(V,E) is a graph with the same vertex set, but with two vertices adjacent if and only if they were not adjacent in G(V,E).
[Figure: a graph G on vertices a through e and its complement]
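For a simple undirected graph the complement can be computed by taking all possible vertex pairs and removing the existing edges. This is a minimal sketch with an illustrative function name, assuming edges are stored as `frozenset` pairs.

```python
from itertools import combinations


def complement(vertices, edges):
    """Complement of a simple undirected graph on the given vertices."""
    all_pairs = {frozenset(pair) for pair in combinations(vertices, 2)}
    # Keep exactly the pairs that are not edges of the original graph.
    return all_pairs - set(edges)
```

Taking the complement twice returns the original edge set, which is a quick sanity check on any implementation.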
What is the path of total minimum weight from the source to any other vertex?
Greedy strategy works for simple problems (no cycles, no negative weights).
Longest path is a similar problem (complement weights).
We will see this again soon for fragment assembly!
Dijkstra's Algorithm
D(x) = distance from s to x (initially all ∞, except D(s) = 0)
1. Select the closest vertex to s that has not yet been selected, according to the current estimate (call it c).
2. Recompute the estimate for every other vertex, x, as the MINIMUM of:
   1. The current distance D(x), or
   2. The distance from s to c, plus the distance from c to x: D(c) + W(c, x)
Repeat steps 1 and 2 until every vertex has been selected.
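The steps above can be sketched in Python as follows. This is one possible implementation under stated assumptions: the graph is a dictionary mapping every vertex to a `{neighbour: weight}` dictionary, all weights are non-negative, and a binary heap (`heapq`) is used to find the closest vertex, which is an implementation choice the notes do not specify. The example graph is hypothetical.

```python
import heapq
import math


def dijkstra(graph, s):
    """Shortest distances D(x) from s to every vertex x of `graph`."""
    D = {v: math.inf for v in graph}
    D[s] = 0
    selected = set()
    heap = [(0, s)]
    while heap:
        d, c = heapq.heappop(heap)      # closest vertex by current estimate
        if c in selected:
            continue                    # outdated heap entry, skip it
        selected.add(c)
        for x, w in graph[c].items():
            if d + w < D[x]:            # D(x) = min(D(x), D(c) + W(c, x))
                D[x] = d + w
                heapq.heappush(heap, (D[x], x))
    return D


# Hypothetical example graph.
example = {
    "s": {"a": 4, "b": 1},
    "a": {"c": 1},
    "b": {"a": 2, "c": 5},
    "c": {},
}
```

In this example the direct edge s-a costs 4, but going through b costs 1 + 2 = 3, so the estimate for a is lowered when b is selected.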
[Figure: a graph on vertices 1, 2, 3, 4]
Maximal cliques: {1,2,3}, {1,3,4}
Vertex cover: {1,3}
Clique cover: { {1,2,3}, {1,3,4} }
Clique partition: { {1,2,3}, {4} }
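The sets listed above can be verified with a few small checks. The edge set below is an assumption: it is inferred from the two maximal cliques, since the figure itself is not available. The sketch also assumes the usual definitions, namely that a clique cover is a collection of cliques that together contain every vertex, and a clique partition additionally requires the cliques to be disjoint. All function names are illustrative.

```python
from itertools import combinations

vertices = {1, 2, 3, 4}
# Assumed edges: the union of the edges of cliques {1,2,3} and {1,3,4}.
edges = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (1, 4), (3, 4)]}


def is_clique(vs):
    """Every pair of vertices in vs is adjacent."""
    return all(frozenset(p) in edges for p in combinations(vs, 2))


def is_maximal_clique(vs):
    """A clique that cannot be extended by any other vertex."""
    vs = set(vs)
    return is_clique(vs) and not any(is_clique(vs | {v}) for v in vertices - vs)


def is_vertex_cover(vs):
    """Every edge has at least one end point in vs."""
    return all(e & set(vs) for e in edges)


def is_clique_cover(cliques):
    """Cliques that together contain every vertex."""
    return all(is_clique(c) for c in cliques) and set().union(*cliques) == vertices


def is_clique_partition(cliques):
    """A clique cover whose cliques are pairwise disjoint."""
    return is_clique_cover(cliques) and sum(len(c) for c in cliques) == len(vertices)
```

Under these assumptions, {1,3} touches all five edges, and {1,2,3} cannot be extended because vertices 2 and 4 are not adjacent.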
[Figure: the complete bipartite graph K4,4]
[Figure: a table with columns A through F and rows 1 through 3, and a graph on vertices a through f captioned "Colors?"]
Graph traversal
There are many strategies for solving graph problems. For many problems, the efficiency and accuracy of the solution boil down to how you search the graph.
We will consider a travel problem as an example: given the graph below, find a path from vertex a to vertex d. Shorter paths (in terms of edge weight sums) are desirable.
[Figure: a weighted graph on vertices a through f]
A greedy approach
greedy traversal: Starting with the root node, take the edge with smallest weight. Mark the edge so that you never attempt to use it again. If you get to the end, great! If you get to a dead end, back up one decision and try the next best edge.
Advantages: Fast!
Drawbacks: Answer is usually non-optimal
For some problems, greedy approaches are optimal; for others, the answer is usually close to the best; for yet other problems, the greedy strategy is a poor choice.
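A minimal sketch of the greedy traversal described above. The graph and its weights are hypothetical (the figure is not available), the graph is treated as undirected, and `greedy_path` is an illustrative name.

```python
def greedy_path(graph, start, goal):
    """Follow the cheapest unused edge first; back up at a dead end.

    `graph` maps each vertex to a {neighbour: weight} dictionary.
    Returns (path, cost), or None if the goal cannot be reached.
    """
    used = set()

    def visit(v, path, cost):
        if v == goal:
            return path, cost
        # Try the edges out of v from smallest to largest weight.
        for w, weight in sorted(graph[v].items(), key=lambda kv: kv[1]):
            edge = frozenset((v, w))
            if edge in used:
                continue
            used.add(edge)              # never attempt this edge again
            found = visit(w, path + [w], cost + weight)
            if found:
                return found
        return None                     # dead end: caller tries its next edge

    return visit(start, [start], 0)


# Hypothetical weighted graph on vertices a through f.
example = {
    "a": {"b": 3, "c": 2},
    "b": {"a": 3, "d": 5, "e": 4},
    "c": {"a": 2, "f": 6},
    "d": {"b": 5},
    "e": {"b": 4, "f": 7},
    "f": {"c": 6, "e": 7},
}
```

On this example, `greedy_path(example, "a", "d")` starts with the cheap edge a-c and ends up with the path a, c, f, e, b, d at cost 24, although a, b, d costs only 8. This illustrates the drawback named above.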
breadth-first traversal:
Place all adjacent unused edges in a queue (FIFO).
Take an edge from the queue, mark it as used, and follow it to the new current node.
Traversal order: a, b, c, d, e, f
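The queue-based traversal described above can be sketched as follows. The adjacency lists are an assumption: they were chosen so that the result matches the traversal order quoted above, because the figure itself is not available. The name `bfs_order` is illustrative.

```python
from collections import deque


def bfs_order(graph, start):
    """Order in which vertices are first reached by a FIFO edge queue.

    `graph` maps each vertex to a list of adjacent vertices.
    """
    order = [start]
    seen = {start}
    queue = deque((start, w) for w in graph[start])
    while queue:
        v, w = queue.popleft()          # take the oldest edge (FIFO)
        if w in seen:
            continue                    # edge leads to a vertex already reached
        seen.add(w)
        order.append(w)                 # follow the edge to the new current node
        queue.extend((w, x) for x in graph[w] if x not in seen)
    return order


# Assumed adjacency lists for the example graph.
example = {
    "a": ["b", "c"],
    "b": ["a", "d", "e"],
    "c": ["a", "f"],
    "d": ["b"],
    "e": ["b", "f"],
    "f": ["c", "e"],
}
```

With these lists, `bfs_order(example, "a")` visits both neighbours of a before moving on to d, e, and f.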
DFS (G, v) {
    v.state = visited
    Process vertex v
    Foreach edge (v,w) {
        if w.state = unseen {
            DFS (G, w)
            process edge (v,w)
        }
    }
}
Traversal order: a, b, d, e, f, c
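A runnable Python version of the DFS pseudocode might look like the sketch below. The adjacency lists are an assumption, chosen so that the result matches the traversal order a, b, d, e, f, c; processing a vertex is reduced to recording it in a list, and `dfs_order` is an illustrative name.

```python
def dfs_order(graph, start):
    """Order in which a recursive depth-first search visits the vertices.

    `graph` maps each vertex to a list of adjacent vertices.
    """
    state = {v: "unseen" for v in graph}
    order = []

    def dfs(v):
        state[v] = "visited"
        order.append(v)                 # process vertex v
        for w in graph[v]:              # for each edge (v, w)
            if state[w] == "unseen":
                dfs(w)

    dfs(start)
    return order


# Assumed adjacency lists for the example graph.
example = {
    "a": ["b", "c"],
    "b": ["a", "d", "e"],
    "c": ["a", "f"],
    "d": ["b"],
    "e": ["b", "f"],
    "f": ["c", "e"],
}
```

Here the search goes as deep as possible along a, b, d before backing up, and reaches c only through f at the very end.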
Path    Current    Best
ACF     7          11
ACFE    15         11    < prune
AB      3          11
ABD     8          8
ABE     7          8
ABEF    14         8
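The table appears to trace a search that extends partial paths and abandons one as soon as its cost can no longer beat the best complete path found so far. A minimal sketch of that idea follows; it assumes non-negative weights, uses hypothetical edge weights that are not read from the figure, and the name `best_path` is illustrative.

```python
import math


def best_path(graph, start, goal):
    """Cheapest simple path from start to goal, pruning hopeless branches.

    `graph` maps each vertex to a {neighbour: weight} dictionary.
    """
    best = {"cost": math.inf, "path": None}

    def extend(path, cost):
        if cost >= best["cost"]:
            return                      # prune: cannot improve on the best
        v = path[-1]
        if v == goal:
            best["cost"], best["path"] = cost, path
            return
        for w, weight in graph[v].items():
            if w not in path:           # keep the path simple
                extend(path + [w], cost + weight)

    extend([start], 0)
    return best["path"], best["cost"]


# Hypothetical weighted graph on vertices a through f.
example = {
    "a": {"b": 3, "c": 2},
    "b": {"a": 3, "d": 5, "e": 4},
    "c": {"a": 2, "f": 6},
    "d": {"b": 5},
    "e": {"b": 4, "f": 7},
    "f": {"c": 6, "e": 7},
}
```

With these weights the search finds a, b, d at cost 8 first, then prunes a, b, e, f (cost 14) and a, c, f (cost 8) without exploring them further.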
32
5
3 8
2
1
Intro. to Graph Theory
6
7
9
10
33