
Network/Graph Theory

What is a Network?

Network = graph
Informally, a graph is a set of nodes joined by a set of lines or arrows.

[Figure: two example graphs on nodes 1 to 6]

Graph-based representations

Representing a problem as a graph can provide a different point of view
Representing a problem as a graph can make a problem much simpler
More accurately, it can provide the appropriate tools for solving the problem

What is network theory?

Network theory provides a set of techniques for analysing graphs
Complex systems network theory provides techniques for analysing structure in a system of interacting agents, represented as a network
Applying network theory to a system means using a graph-theoretic representation

What makes a problem graph-like?

There are two components to a graph: nodes and edges
In graph-like problems, these components have natural correspondences to problem elements
Entities are nodes and interactions between entities are edges
Most complex systems are graph-like

[Figures: Friendship Network; Scientific collaboration network; Business ties in US biotech industry; Genetic interaction network; Protein-Protein Interaction Networks; Transportation Networks; Internet; Ecological Networks]

Graph Theory - History

Leonhard Euler's paper on the Seven Bridges of Königsberg, published in 1736.

Cycles in Polyhedra: Thomas P. Kirkman, William R. Hamilton
Hamiltonian cycles in Platonic graphs

Trees in Electric Circuits: Gustav Kirchhoff

Enumeration of Chemical Isomers: Arthur Cayley, James J. Sylvester, George Pólya

Four Colors of Maps: Francis Guthrie, Augustus De Morgan


Definition: Graph

G is an ordered triple G:=(V, E, f)
V is a set of nodes, points, or vertices.
E is a set, whose elements are known as edges or lines.
f is a function that maps each element of E to an unordered pair of vertices in V.

Definitions

Vertex
Basic element
Drawn as a node or a dot.
Vertex set of G is usually denoted by V(G), or V.

Edge
A set of two elements
Drawn as a line connecting two vertices, called end vertices, or endpoints.
The edge set of G is usually denoted by E(G), or E.

Example

V:={1,2,3,4,5,6}
E:={{1,2},{1,5},{2,3},{2,5},{3,4},{4,5},{4,6}}

Simple Graphs

Simple graphs are graphs without multiple edges or self-loops.
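
A minimal sketch of how the example sets V and E might be written down in Python, together with a check that the graph is simple. The names `V`, `E` and `is_simple` are illustrative choices, not part of the slides; a self-loop is assumed to be represented as a one-element set.

```python
# Example graph from the slide: vertices and unordered edges.
V = {1, 2, 3, 4, 5, 6}
E = [{1, 2}, {1, 5}, {2, 3}, {2, 5}, {3, 4}, {4, 5}, {4, 6}]


def is_simple(vertices, edges):
    """Return True if there are no self-loops and no multiple edges."""
    seen = set()
    for edge in edges:
        edge = frozenset(edge)
        # A self-loop collapses to a one-element set, so its size is not 2.
        if len(edge) != 2 or not edge <= vertices:
            return False
        # The same unordered pair appearing twice is a multiple edge.
        if edge in seen:
            return False
        seen.add(edge)
    return True


print(is_simple(V, E))
```

Adding a second copy of {1, 2}, or a loop {3}, to the edge list makes the check fail.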

Directed Graph (digraph)

Edges have directions
An edge is an ordered pair of nodes

[Figure: digraph on nodes 1 to 6, with labels: loop, multiple arc, arc, node]

Weighted graphs

A weighted graph is a graph for which each edge has an associated weight, usually given by a weight function w: E → R.

[Figure: weighted graph on nodes 1 to 6 with edge weights]

Structures and structural metrics

Graph structures are used to isolate interesting or important sections of a graph
Structural metrics provide a measurement of a structural property of a graph
Global metrics refer to a whole graph
Local metrics refer to a single node in a graph

Graph structures

Identify interesting sections of a graph
Interesting because they form a significant domain-specific structure, or because they significantly contribute to graph properties
A subset of the nodes and edges in a graph that possess certain characteristics, or relate to each other in particular ways

Connectivity

A graph is connected if
you can get from any node to any other by following a sequence of edges OR
any two nodes are connected by a path.
A directed graph is strongly connected if there is a directed path from any node to any other node.

Component

Every disconnected graph can be split up into a number of connected components.
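
One common way to test connectivity and to split an undirected graph into its connected components is a graph traversal. The sketch below uses a depth-first search; the function names are hypothetical, and it reuses the example edge set from the earlier slide.

```python
def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Collect everything reachable from `start`.
        component = {start}
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in component:
                    component.add(w)
                    stack.append(w)
        seen |= component
        components.append(component)
    return components


def is_connected(nodes, edges):
    """A graph is connected if it has exactly one component."""
    return len(connected_components(nodes, edges)) == 1


NODES = [1, 2, 3, 4, 5, 6]
EDGES = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]
print(is_connected(NODES, EDGES))
# Dropping the edge {4, 6} isolates node 6.
print(connected_components(NODES, EDGES[:-1]))
```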

Degree

Number of edges incident on a node
[Figure: example graph] The degree of 5 is 3

Degree (Directed Graphs)

In-degree: Number of edges entering
Out-degree: Number of edges leaving
Degree = indeg + outdeg
[Figure: example digraph] outdeg(1)=2, indeg(1)=0; outdeg(2)=2, indeg(2)=2; outdeg(3)=1, indeg(3)=4
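
As a small illustration, node degrees can be counted directly from an edge list. The sketch assumes the degree figure shows the example graph with the edge set E given earlier; `degrees` is a hypothetical helper name. It also checks the standard fact that the degrees of an undirected graph sum to twice the number of edges.

```python
from collections import Counter

# Edge set E of the example graph (assumed to be the graph in the figure).
EDGES = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]


def degrees(edges):
    """Count, for each node, the number of edges incident on it."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


deg = degrees(EDGES)
print(deg[5])                                   # degree of node 5
print(sum(deg.values()) == 2 * len(EDGES))      # degree sum equals 2|E|
```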

Degree: Simple Facts

If G is a graph with m edges, then $\sum_{v \in V} \deg(v) = 2m = 2|E|$
If G is a digraph then $\sum_{v \in V} \mathrm{indeg}(v) = \sum_{v \in V} \mathrm{outdeg}(v) = |E|$
Number of odd-degree nodes is even

Walks

A walk of length k in a graph is a succession of k (not necessarily different) edges of the form uv, vw, wx, ..., yz.
This walk is denoted by uvwx...yz, and is referred to as a walk between u and z.
A walk is closed if u=z.

Path

A path is a walk in which all the edges and all the nodes are different.

Cycle

A cycle is a closed path in which all the edges are different.

Walks and Paths

1,2,5,2,3,4: walk of length 5
1,2,5,2,3,2,1: closed walk (CW) of length 6
1,2,3,4,6: path of length 4
1,2,5,1: 3-cycle
2,3,4,5,2: 4-cycle
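
The definitions of walk, path and cycle can be checked mechanically on the example node sequences. This is a sketch under the assumption that the sequences refer to the example graph with edge set E given earlier; all function names are illustrative.

```python
# Edge set E of the example graph, stored as unordered pairs.
EDGES = {frozenset(e) for e in
         [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]}


def steps(seq):
    """Edges traversed by a node sequence, in order."""
    return [frozenset(pair) for pair in zip(seq, seq[1:])]


def is_walk(seq):
    """Every consecutive pair of nodes must be joined by an edge."""
    return len(seq) >= 2 and all(e in EDGES for e in steps(seq))


def is_closed(seq):
    return seq[0] == seq[-1]


def is_path(seq):
    """A walk with no repeated node (hence no repeated edge)."""
    return is_walk(seq) and len(set(seq)) == len(seq)


def is_cycle(seq):
    """A closed walk whose edges are distinct and whose nodes are
    distinct apart from the shared first and last node."""
    inner = seq[:-1]
    edges = steps(seq)
    return (is_walk(seq) and is_closed(seq)
            and len(set(inner)) == len(inner)
            and len(set(edges)) == len(edges))


for seq in ([1, 2, 5, 2, 3, 4], [1, 2, 5, 2, 3, 2, 1],
            [1, 2, 3, 4, 6], [1, 2, 5, 1], [2, 3, 4, 5, 2]):
    print(seq, len(seq) - 1, is_walk(seq), is_path(seq), is_cycle(seq))
```

The length of a walk is the number of edges traversed, which is one less than the number of nodes listed.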

Special Types of Graphs

Empty Graph / Edgeless graph
No edge

Null graph
No nodes
Obviously no edge

Trees

Connected Acyclic Graph
Any two nodes have exactly one path between them

Special Trees

Paths
Stars

Regular

Connected Graph
All nodes have the same degree

Special Regular Graphs: Cycles

C3 C4 C5

Bipartite graph

V can be partitioned into 2 sets V1 and V2 such that (u,v) ∈ E implies either u ∈ V1 and v ∈ V2 OR v ∈ V1 and u ∈ V2.
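
A standard way to test the bipartite condition is to try to 2-colour the graph with a breadth-first search: if some edge ends up with both endpoints on the same side, no valid partition exists. The following is a sketch with a hypothetical function name, not a method taken from the slides.

```python
from collections import deque


def bipartition(nodes, edges):
    """Return (V1, V2) if the graph is bipartite, otherwise None."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    side = {}
    for start in nodes:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in side:
                    # Put each neighbour on the opposite side.
                    side[w] = 1 - side[v]
                    queue.append(w)
                elif side[w] == side[v]:
                    # An edge inside one side: not bipartite.
                    return None
    v1 = {v for v in nodes if side[v] == 0}
    v2 = {v for v in nodes if side[v] == 1}
    return v1, v2


c4 = [(1, 2), (2, 3), (3, 4), (4, 1)]
c3 = [(1, 2), (2, 3), (3, 1)]
print(bipartition([1, 2, 3, 4], c4))
print(bipartition([1, 2, 3], c3))
```

Even cycles such as C4 are bipartite, while odd cycles such as C3 and C5 are not.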

Complete Graph

Every pair of vertices is adjacent
Has n(n-1)/2 edges

Complete Bipartite Graph

Bipartite variation of the complete graph
Every node of one set is connected to every node of the other set
Stars

Planar Graphs

Can be drawn on a plane such that no two edges intersect
K4 is the largest complete graph that is planar

Subgraph

Vertex and edge sets are subsets of those of G
A supergraph of a graph G is a graph that contains G as a subgraph.

Subgraphs: Cliques

A clique is a maximal complete subgraph.
[Figure: example graph on nodes A to I]

Special Subgraphs: Spanning subgraph

Subgraph H has the same vertex set as G.
Possibly not all the edges
H spans G.

Spanning tree

Let G be a connected graph. Then a spanning tree in G is a subgraph of G that includes every node and is also a tree.

Isomorphism

Bijection, i.e., a one-to-one and onto mapping: f : V(G) -> V(H)
u and v from G are adjacent if and only if f(u) and f(v) are adjacent in H.
If an isomorphism can be constructed between two graphs, then we say those graphs are isomorphic.

Isomorphism Problem

Determining whether two graphs are isomorphic
Although these graphs look very different, they are isomorphic; one isomorphism between them is
f(a)=1 f(b)=6 f(c)=8 f(d)=3
f(g)=5 f(h)=2 f(i)=4 f(j)=7

Representation (Matrix)

Incidence Matrix
V x E
[vertex, edges] contains the edge's data
Adjacency Matrix
V x V
Boolean values (adjacent or not)
Or Edge Weights

Matrices

Incidence matrix (rows: vertices; columns: edges)

     1,2  1,5  2,3  2,5  3,4  4,5  4,6
1     1    1    0    0    0    0    0
2     1    0    1    1    0    0    0
3     0    0    1    0    1    0    0
4     0    0    0    0    1    1    1
5     0    1    0    1    0    1    0
6     0    0    0    0    0    0    1

Adjacency matrix

     1  2  3  4  5  6
1    0  1  0  0  1  0
2    1  0  1  0  1  0
3    0  1  0  1  0  0
4    0  0  1  0  1  1
5    1  1  0  1  0  0
6    0  0  0  1  0  0

Representation (List)

Edge List
pairs (ordered if directed) of vertices
Optionally weight and other data
Adjacency List (node list)
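
The edge-list and matrix representations carry the same information, so one can be derived from the other. As a minimal sketch (function names are illustrative), the edge list of the example graph can be converted into its adjacency and incidence matrices using plain nested lists:

```python
NODES = [1, 2, 3, 4, 5, 6]
EDGES = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]


def adjacency_matrix(nodes, edges):
    """|V| x |V| matrix with 1 where two vertices are adjacent."""
    index = {v: i for i, v in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1   # undirected: symmetric
    return matrix


def incidence_matrix(nodes, edges):
    """|V| x |E| matrix with 1 where a vertex is an endpoint of an edge."""
    index = {v: i for i, v in enumerate(nodes)}
    matrix = [[0] * len(edges) for _ in nodes]
    for j, (u, v) in enumerate(edges):
        matrix[index[u]][j] = 1
        matrix[index[v]][j] = 1
    return matrix


A = adjacency_matrix(NODES, EDGES)
B = incidence_matrix(NODES, EDGES)
for row in A:
    print(row)
```

Each column of the incidence matrix contains exactly two 1s (the endpoints of that edge), and each row sum in either matrix is the degree of that vertex.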

Implementation of a Graph

Adjacency-list representation
an array of |V| lists, one for each vertex in V.
For each u ∈ V, ADJ[u] points to all its adjacent vertices.

Edge and Node Lists

Edge List    Node List
1 2          1 2 2
1 2          2 3 5
2 3          3 3
2 5          4 3 5
3 3          5 3 4
4 3
4 5
5 3
5 4
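
The node list above can be derived from the edge list by grouping arcs by their source node. A minimal sketch follows; it assumes each node-list row reads "node, then the nodes its arcs point to", and the helper names are hypothetical.

```python
from collections import Counter

# Directed edge list from the slide (includes a double arc 1->2 and a loop 3->3).
ARCS = [(1, 2), (1, 2), (2, 3), (2, 5), (3, 3),
        (4, 3), (4, 5), (5, 3), (5, 4)]


def node_list(arcs):
    """Group arc targets by source node, keeping repeated arcs."""
    adj = {}
    for source, target in arcs:
        adj.setdefault(source, []).append(target)
    return adj


def out_degrees(arcs):
    return Counter(source for source, _ in arcs)


def in_degrees(arcs):
    return Counter(target for _, target in arcs)


print(node_list(ARCS))
print(out_degrees(ARCS)[1], in_degrees(ARCS)[3])
```

The in-degrees and out-degrees each sum to the number of arcs, since every arc leaves exactly one node and enters exactly one node.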

Edge Lists for Weighted Graphs

Edge List
1 2 1.2
2 4 0.2
4 5 0.3
4 1 0.5
5 4 0.5
6 3 1.5

Topological Distance

A shortest path is a path with the fewest edges connecting two nodes.
The number of edges in the shortest path connecting p and q is the topological distance between these two nodes, d_{p,q}.
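
In an unweighted graph, topological distances from one node to all others can be found with a breadth-first search. The sketch below applies this to the example graph with the edge set E given earlier; the function names are illustrative, and it assumes the graph is connected so that every distance is defined.

```python
from collections import deque

EDGES = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]


def adjacency(edges):
    """Build an undirected adjacency mapping from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def distances_from(adj, source):
    """Number of edges on a shortest path from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


adj = adjacency(EDGES)
order = sorted(adj)
# Row i holds the distances from node i to every node, in node order.
D = [[distances_from(adj, i)[j] for j in order] for i in order]
for row in D:
    print(row)
```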

Distance Matrix

|V| x |V| matrix D = (d_ij) such that d_ij is the topological distance between i and j.

     1  2  3  4  5  6
1    0  1  2  2  1  3
2    1  0  1  2  1  3
3    2  1  0  1  2  2
4    2  2  1  0  1  1
5    1  1  2  1  0  2
6    3  3  2  1  2  0

Random Graphs
Erdős and Rényi (1959)

N nodes
A pair of nodes has probability p of being connected.
Average degree, k ≈ pN
What interesting things can be said for different values of p or k? (that are true as N → ∞)

[Figure: example random graphs with N = 12 for p = 0.0 (k = 0), p = 0.09 (k = 1) and p = 1.0 (every pair connected)]
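
The random-graph construction can be sketched in a few lines: visit every pair of nodes once and connect it with probability p. This is an illustrative sketch using Python's standard `random` module; the function names and the `seed` parameter are assumptions for reproducibility, not part of the original model description.

```python
import random
from itertools import combinations


def random_graph(n, p, seed=None):
    """Connect each of the n(n-1)/2 node pairs independently with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in combinations(adj, 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def average_degree(adj):
    return sum(len(neighbours) for neighbours in adj.values()) / len(adj)


def largest_component_size(adj):
    """Size of the largest connected component, found by depth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best


for p in (0.0, 0.045, 0.09, 1.0):
    g = random_graph(12, p, seed=1)
    print(p, average_degree(g), largest_component_size(g))
```

The exact expected degree is p(N - 1), which is close to pN for large N. The values printed for intermediate p depend on the seed and will generally differ from those in the slide figures.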

Let's look at:
Size of the largest connected cluster
Diameter (maximum path length between nodes) of the largest cluster
Average path length between nodes (if a path exists)

                                      p = 0.0   p = 0.045   p = 0.09   p = 1.0
                                      k = 0     k = 0.5     k = 1
Size of largest component             1         5           11         12
Diameter of largest component         0         4           7          1
Average path length between nodes     0.0       2.0         4.2        1.0

If k < 1:
small, isolated clusters
small diameters
short path lengths

At k = 1:
a giant component appears
diameter peaks
path lengths are high

For k > 1:
almost all nodes connected
diameter shrinks
path lengths shorten

[Figure: Percentage of nodes in largest component and diameter of largest component (not to scale) plotted against k; phase transition at k = 1.0]

What does this mean?

If connections between people can be modeled as a random graph, then
Because the average person easily knows more than one person (k >> 1),
We live in a small world where within a few links, we are connected to anyone in the world.
Erdős and Rényi showed that average path length between connected nodes is [equation not preserved]

[Figure labels: David Mumford, Kentaro Toyama, Peter Belhumeur, Fan Chung]

BIG IF!!! The small-world conclusion holds only if connections between people can be modeled as a random graph.

The Alpha Model
Watts (1999)

The people you know aren't randomly chosen.
People tend to get to know those who are two links away (Rapoport *, 1957).
The real world exhibits a lot of clustering.

[Figure: The Personal Map, by MSR Redmond's Social Computing Group]

* Same Anatol Rapoport, known for TIT FOR TAT!

α model: Add edges to nodes, as in random graphs, but make links more likely when two nodes have a common friend.

For a range of α values:
The world is small (average path length is short), and
Groups tend to form (high clustering coefficient).

[Figure: Probability of linkage as a function of number of mutual friends (α is 0 in upper left, 1 in diagonal, and ∞ in bottom right curves.)]
[Figure: Clustering coefficient (C) and average path length (L) plotted against α; vertical axis: Clustering coefficient / Normalized path length]


The Beta Model
Watts and Strogatz (1998)

β = 0: People know their neighbors. Clustered, but not a small world
β = 0.125: People know their neighbors, and a few distant people. Clustered and small world
β = 1: People know others at random. Not clustered, but small world

First five random links reduce the average path length of the network by half, regardless of N!
Both α and β models reproduce short-path results of random graphs, but also allow for clustering.
Small-world phenomena occur at threshold between order and chaos.

[Figure: Clustering coefficient (C) and average path length (L) plotted against β; vertical axis: Clustering coefficient / Normalized path length]
[Figure labels: Jonathan Donner, Kentaro Toyama, Nobuyuki Hanaki]
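
A rough sketch of a β-style rewiring model, under stated assumptions: nodes start on a ring in which each node is linked to its k nearest neighbours (k even, n larger than k), and each lattice edge is then rewired to a random new endpoint with probability β. The function names, the ring starting point and the exact rewiring rule are illustrative choices, not necessarily the procedure used for the figures.

```python
import random


def beta_model(n, k, beta, seed=None):
    """Ring lattice with k neighbours per node; each edge rewired with probability beta."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for j in range(1, k // 2 + 1):
            w = (v + j) % n
            adj[v].add(w)
            adj[w].add(v)
    for j in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + j) % n
            if rng.random() < beta and w in adj[v]:
                # New endpoint: not v itself and not already a neighbour of v.
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj


def clustering_coefficient(adj):
    """Average over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for v, neighbours in adj.items():
        d = len(neighbours)
        if d < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a in neighbours for b in neighbours
                    if a < b and b in adj[a])
        total += 2 * links / (d * (d - 1))
    return total / len(adj)


for beta in (0.0, 0.125, 1.0):
    g = beta_model(100, 4, beta, seed=0)
    print(beta, round(clustering_coefficient(g), 3))
```

With β = 0 the ring is untouched and clustering is high; as β approaches 1 the links become essentially random and clustering typically drops.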

Power Laws
Albert and Barabási (1999)

What's the degree (number of edges) distribution over a graph, for real-world graphs?
Random-graph model results in Poisson distribution.
But, many real-world networks exhibit a power-law distribution.

[Figure: Degree distribution of a random graph, N = 10,000, p = 0.0015, k = 15. (Curve is a Poisson curve, for comparison.)]
[Figure: Typical shape of a power-law distribution.]

Power-law distributions are straight lines in log-log space.
How should random graphs be generated to create a power-law distribution of node degrees?
Hint: Pareto's* Law: Wealth distribution follows a power law.

Power laws in real networks:
(a) WWW hyperlinks
(b) co-starring in movies
(c) co-authorship of physicists
(d) co-authorship of neuroscientists

* Same Vilfredo Pareto, who defined Pareto optimality in game theory.

"The rich get richer!"
Power-law distribution of node degree arises if
Number of nodes grows;
Edges are added in proportion to the number of edges a node already has.
Additional variable fitness coefficient allows for some nodes to grow faster than others.

[Figure: "Map of the Internet" poster]
[Figure labels: Anandan, Kentaro Toyama, Jennifer Chayes]
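
The "rich get richer" rule can be sketched as a growth process: nodes arrive one at a time, and each new node attaches to existing nodes chosen with probability proportional to their current degree. The code below is a simplified illustration, not necessarily the authors' exact procedure; the function name, the parameter `m` (edges added per new node) and the small complete starting graph are assumptions, and the fitness coefficient is not modelled.

```python
import random
from collections import Counter


def preferential_attachment(n, m, seed=None):
    """Grow a graph to n nodes; each new node links to m distinct existing
    nodes picked with probability proportional to their degree."""
    rng = random.Random(seed)
    # Start from a complete graph on m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]
    # Each node appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` is a degree-proportional draw over nodes.
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for target in targets:
            edges.append((new, target))
            stubs.extend((new, target))
    return edges


edges = preferential_attachment(2000, 2, seed=0)
degree = Counter(v for edge in edges for v in edge)
histogram = Counter(degree.values())
# Few nodes have very high degree, many have the minimum degree m.
print(max(degree.values()), histogram[2])
```

Plotting `histogram` on log-log axes is one way to look for the roughly straight line mentioned above.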


Searchable Networks
Kleinberg (2000)

Just because a short path exists, doesn't mean you can easily find it.
You don't know all of the people whom your friends know.
Under what conditions is a network searchable?

Variation of Watts's model:
Lattice is d-dimensional (d=2).
One random link per node.
A parameter controls the probability of a random link: greater for closer nodes.

For d=2, dip in time-to-search when the parameter equals 2
For low values, random graph; no geographic correlation in links
For high values, not a small world; no short paths to be found.

[Figure: panels a), b), c). Searchability dips when the parameter equals 2, in simulation]

Watts, Dodds, Newman (2002) show that for d = 2 or 3, real networks are quite searchable.
Killworth and Bernard (1978) found that people tended to search their networks by d = 2: geography and profession.

[Figure: The Watts-Dodds-Newman model closely fitting a real-world experiment]
[Figure labels: Ramin Zabih, Kentaro Toyama]

References

Aldous & Wilson, Graphs and Applications: An Introductory Approach, Springer, 2000.

Wasserman & Faust, Social Network Analysis, Cambridge University Press, 2008.
