Border graph of 48 contiguous United States
Protein-protein interaction network
http://www.plosone.org/article/info:doi/10.1371/journal.pone.0004803
10 million Facebook friends
The evolution of FCC lobbying coalitions
The Evolution of FCC Lobbying Coalitions by Pierre de Vries in JoSS Visualization Symposium 2010
Framingham heart study
Figure 1. Largest Connected Subcomponent of the Social Network in the Framingham Heart Study in the Year 2000.
Each circle (node) represents one person in the data set. There are 2200 persons in this subcomponent of the social
network. Circles with red borders denote women, and circles with blue borders denote men. The size of each circle
is proportional to the person's body-mass index. The interior color of the circles indicates the person's obesity status:
yellow denotes an obese person (body-mass index ≥ 30) and green denotes a nonobese person. The colors of the
ties between the nodes indicate the relationship between them: purple denotes a friendship or marital tie and orange
denotes a familial tie.
The Spread of Obesity in a Large Social Network over 32 Years by Christakis and Fowler in New England Journal of Medicine, 2007
The Internet as mapped by the Opte Project
http://en.wikipedia.org/wiki/Internet
Graph applications
Graph terminology
Anatomy of a graph (figure labels): vertex; edge; path of length 4; cycle of length 5; vertex of degree 3; connected components.
Some graph-processing problems
problem          description
Euler cycle      Is there a cycle that uses each edge exactly once?
Hamilton cycle   Is there a cycle that uses each vertex exactly once?
planarity        Can the graph be drawn in the plane with no crossing edges?
Vertex representation.
This lecture: use integers between 0 and V − 1.
Applications: convert between names and integers with symbol table.
Symbol table (figure): A = 0, B = 1, C = 2, D = 3, E = 4, F = 5, G = 6.
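The conversion between names and integers might be sketched as follows. This is only an illustration: the class name `VertexNames` and its methods are hypothetical, and it uses `java.util` collections in place of any course-specific symbol table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the lecture): assigns each new vertex name
// the next unused integer, so names map to 0 .. V-1 in order of first appearance.
class VertexNames {
    private final Map<String, Integer> indexOf = new HashMap<>();  // name -> integer
    private final List<String> nameOf = new ArrayList<>();         // integer -> name

    // Return the integer for a name, assigning a new one if the name is unseen.
    int index(String name) {
        Integer i = indexOf.get(name);
        if (i == null) {
            i = nameOf.size();
            indexOf.put(name, i);
            nameOf.add(name);
        }
        return i;
    }

    // Return the name stored for a vertex index.
    String name(int v) { return nameOf.get(v); }

    // Number of distinct names seen so far (this is V).
    int size() { return nameOf.size(); }
}
```

Calling `index` on the names A through G in alphabetical order reproduces the mapping in the figure; graph algorithms then work on the integers only.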
Anomalies (figure labels): self-loop; parallel edges.
Graph API
Example graph, as a list of edges (figure): 0-1, 0-2, 0-5, 0-6, 3-4, 3-5, 4-5, 4-6, 7-8, 9-10, 9-11, 9-12, 11-12.
Adjacency matrix of the example graph (two entries for each edge):
      0  1  2  3  4  5  6  7  8  9 10 11 12
 0    0  1  1  0  0  1  1  0  0  0  0  0  0
 1    1  0  0  0  0  0  0  0  0  0  0  0  0
 2    1  0  0  0  0  0  0  0  0  0  0  0  0
 3    0  0  0  0  1  1  0  0  0  0  0  0  0
 4    0  0  0  1  0  1  1  0  0  0  0  0  0
 5    1  0  0  1  1  0  0  0  0  0  0  0  0
 6    1  0  0  0  1  0  0  0  0  0  0  0  0
 7    0  0  0  0  0  0  0  0  1  0  0  0  0
 8    0  0  0  0  0  0  0  1  0  0  0  0  0
 9    0  0  0  0  0  0  0  0  0  0  1  1  1
10    0  0  0  0  0  0  0  0  0  1  0  0  0
11    0  0  0  0  0  0  0  0  0  1  0  0  1
12    0  0  0  0  0  0  0  0  0  1  0  1  0
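Adjacency lists of the example graph (adj[] is an array of Bag objects; each edge has two representations, one in each endpoint's list):
adj[0]:  6 2 1 5
adj[1]:  0
adj[2]:  0
adj[3]:  5 4
adj[4]:  5 6 3
adj[5]:  3 4 0
adj[6]:  0 4
adj[7]:  8
adj[8]:  7
adj[9]:  11 10 12
adj[10]: 9
adj[11]: 9 12
adj[12]: 11 9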
Graph representations
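representation      space    add edge    edge between v and w?    iterate over vertices adjacent to v?
list of edges       E        1           E                        E
adjacency matrix    V²       1*          1                        V
* disallows parallel edges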
Adjacency-list graph representation: Java implementation
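public class Graph
{
    private final int V;               // number of vertices
    private Bag<Integer>[] adj;        // adjacency lists, using the Bag data type
    public Graph(int V)
    {
        this.V = V;
        adj = (Bag<Integer>[]) new Bag[V];   // create empty graph with V vertices
        for (int v = 0; v < V; v++)
            adj[v] = new Bag<Integer>();
    }
    public int V()                           // number of vertices
    { return V; }
    public void addEdge(int v, int w)        // add edge v-w: one entry in each endpoint's list
    { adj[v].add(w); adj[w].add(v); }
    public Iterable<Integer> adj(int v)      // vertices adjacent to v
    { return adj[v]; }
}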
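For readers without the Bag data type at hand, the same representation might be sketched with `java.util.ArrayList`. The class name `ListGraph` and the `degree` helper are illustrative assumptions, not part of the lecture's API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: same shape as the adjacency-list Graph, with ArrayList in place of Bag.
class ListGraph {
    private final int V;                                        // number of vertices
    private final List<List<Integer>> adj = new ArrayList<>();  // adj.get(v) = neighbors of v

    ListGraph(int V) {
        this.V = V;
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
    }

    // Add undirected edge v-w: one entry in each endpoint's list.
    void addEdge(int v, int w) { adj.get(v).add(w); adj.get(w).add(v); }

    Iterable<Integer> adj(int v) { return adj.get(v); }

    int V() { return V; }

    // Degree of v = number of entries in its adjacency list.
    static int degree(ListGraph G, int v) {
        int d = 0;
        for (int w : G.adj(v)) d++;
        return d;
    }
}
```

Iterating over `adj(v)` takes time proportional to the degree of v, which is the main reason to prefer this representation for sparse graphs.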
4.1 UNDIRECTED GRAPHS
introduction
graph API
depth-first search
breadth-first search
Maze graph.
Vertex = intersection.
Edge = passage.
Trémaux maze exploration
Algorithm.
Unroll a ball of string behind you.
Mark each visited intersection and each visited passage.
Retrace steps when no unvisited options.
First use? Theseus entered Labyrinth to kill the monstrous Minotaur;
Ariadne instructed Theseus to use a ball of string to find his way back out.
Maze exploration: easy
Maze exploration: medium
Maze exploration: challenge for the bored
Depth-first search
Mark v as visited.
Recursively visit all unmarked vertices w adjacent to v.
Typical applications.
Find all vertices connected to a given source vertex.
Find a path between two vertices.
To visit a vertex v:
Mark vertex v as visited.
Recursively visit all unmarked vertices adjacent to v.
tinyG.txt
13      (V)
13      (E)
0 5
4 3
0 1
9 12
6 4
5 4
0 2
11 12
9 10
0 6
7 8
9 11
5 3
graph G
Depth-first search demo
State after DFS from vertex 0 on graph G:
v    marked[]  edgeTo[]
0    T         -
1    T         0
2    T         0
3    T         5
4    T         6
5    T         4
6    T         0
7    F         -
8    F         -
9    F         -
10   F         -
11   F         -
12   F         -
Design pattern for graph processing
Depth-first search: data structures
Data structures.
Boolean array marked[] to mark visited vertices.
Integer array edgeTo[] to keep track of paths.
(edgeTo[w] == v) means that edge v-w was taken to visit w for the first time.
Function-call stack for recursion.
Depth-first search: Java implementation
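The implementation itself is not reproduced in this text. A minimal sketch consistent with the data structures above (marked[], edgeTo[], and recursion) could look like the following; the class name `DfsPathsSketch` and the `build` helper are assumptions, and plain `java.util` lists stand in for the Graph type.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch of DFS path finding; class and method names are illustrative.
class DfsPathsSketch {
    private final boolean[] marked;  // marked[v] = true if v is connected to s
    private final int[] edgeTo;      // edgeTo[w] = v means edge v-w first reached w
    private final int s;             // source vertex

    DfsPathsSketch(List<List<Integer>> adj, int s) {
        this.marked = new boolean[adj.size()];
        this.edgeTo = new int[adj.size()];
        this.s = s;
        dfs(adj, s);
    }

    // To visit v: mark it, then recursively visit all unmarked neighbors.
    private void dfs(List<List<Integer>> adj, int v) {
        marked[v] = true;
        for (int w : adj.get(v)) {
            if (!marked[w]) {
                edgeTo[w] = v;
                dfs(adj, w);
            }
        }
    }

    boolean hasPathTo(int v) { return marked[v]; }

    // Path from s to v (s first), or null if v is not connected to s.
    List<Integer> pathTo(int v) {
        if (!hasPathTo(v)) return null;
        Deque<Integer> stack = new ArrayDeque<>();
        for (int x = v; x != s; x = edgeTo[x]) stack.push(x);
        stack.push(s);
        return new ArrayList<>(stack);
    }

    // Helper for the sketch: build adjacency lists from an array of edges.
    static List<List<Integer>> build(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); adj.get(e[1]).add(e[0]); }
        return adj;
    }
}
```

The particular path DFS finds depends on the order of the adjacency lists, so two correct implementations may report different paths for the same graph.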
Depth-first search: properties
public Iterable<Integer> pathTo(int v)
{
    if (!hasPathTo(v)) return null;
    Stack<Integer> path = new Stack<Integer>();
    for (int x = v; x != s; x = edgeTo[x])   // follow edgeTo[] back toward the source s
        path.push(x);
    path.push(s);
    return path;
}

Trace of pathTo(5) with s = 0:
v    edgeTo[]
0    -
1    2
2    0
3    2
4    3
5    3

x    path
5    5
3    3 5
2    2 3 5
0    0 2 3 5
http://xkcd.com/761/
tinyCG.txt
6       (V)
8       (E)
0 5
2 4
2 3
1 2
0 1
3 4
3 5
0 2
graph G (figures: standard drawing; adjacency lists adj[])
Breadth-first search demo
v    edgeTo[]  distTo[]
0    -         0
1    0         1
2    0         1
3    2         2
4    2         2
5    0         1
done
Breadth-first search
Breadth-first maze exploration
Breadth-first search: Java implementation
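The implementation itself is not reproduced in this text. One way to sketch BFS with a queue, filling the edgeTo[] and distTo[] arrays from the demo, is shown below; the class name `BfsSketch`, the `-1` marker for unreached vertices, and the `build` helper are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Sketch of breadth-first search; names are illustrative.
class BfsSketch {
    // Returns distTo[], where distTo[v] = fewest edges from s to v (-1 if unreachable).
    // The caller supplies edgeTo[] (length V); edgeTo[w] is set to the vertex before w
    // on one shortest path from s.
    static int[] bfs(List<List<Integer>> adj, int s, int[] edgeTo) {
        int[] distTo = new int[adj.size()];
        Arrays.fill(distTo, -1);               // -1 doubles as "not yet marked"
        Queue<Integer> queue = new ArrayDeque<>();
        distTo[s] = 0;
        queue.add(s);
        while (!queue.isEmpty()) {
            int v = queue.remove();            // vertices leave in order of distance from s
            for (int w : adj.get(v)) {
                if (distTo[w] == -1) {
                    distTo[w] = distTo[v] + 1;
                    edgeTo[w] = v;
                    queue.add(w);
                }
            }
        }
        return distTo;
    }

    // Helper for the sketch: build adjacency lists from an array of edges.
    static List<List<Integer>> build(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); adj.get(e[1]).add(e[0]); }
        return adj;
    }
}
```

On the tinyCG.txt graph with source 0, distTo[] does not depend on adjacency-list order, but edgeTo[] can: a vertex with several shortest paths may record a different predecessor than the demo table shows.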
Breadth-first search properties
Breadth-first search application: Kevin Bacon numbers
Kevin Bacon graph
Include one vertex for each performer and one for each movie.
Connect a movie to all performers that appear in that movie.
Compute shortest path from s = Kevin Bacon.
Figure (excerpt of the performer-movie graph). Vertices shown include Patrick Allen, Dial M for Murder, Grace Kelly, Caligola, Murder on the Orient Express, Donald Sutherland, Cold Mountain, Kathleen Quinlan, and Joe Versus the Volcano.
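The three steps above might be sketched as follows. The class name `BaconSketch`, the input layout (each row is a movie followed by its performers), and the placeholder names in the usage note are assumptions made for illustration, not real cast data.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Sketch: build the performer-movie graph and run BFS from a source performer.
class BaconSketch {
    // movies: each row is {movie, performer1, performer2, ...}, supplied by the caller.
    // Returns the number of edges from source to every reachable vertex.
    static Map<String, Integer> distances(String[][] movies, String source) {
        // One vertex per movie and per performer; connect a movie to its performers.
        Map<String, List<String>> adj = new HashMap<>();
        for (String[] row : movies) {
            for (int i = 1; i < row.length; i++) {
                adj.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[i]);
                adj.computeIfAbsent(row[i], k -> new ArrayList<>()).add(row[0]);
            }
        }
        // BFS from the source computes shortest paths.
        Map<String, Integer> dist = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        dist.put(source, 0);
        queue.add(source);
        while (!queue.isEmpty()) {
            String v = queue.remove();
            for (String w : adj.getOrDefault(v, Collections.emptyList())) {
                if (!dist.containsKey(w)) {
                    dist.put(w, dist.get(v) + 1);
                    queue.add(w);
                }
            }
        }
        return dist;
    }
}
```

Because every path alternates between performers and movies, a performer's number is the returned distance divided by 2. With placeholder rows `{"M1", "P0", "P1"}` and `{"M2", "P1", "P2"}` and source `"P0"`, performer `"P2"` is at distance 4, so its number is 2.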
Breadth-first search application: Erdős numbers
Connected components
Figure: a graph with 63 connected components.
To visit a vertex v:
Mark vertex v as visited.
Recursively visit all unmarked vertices adjacent to v.
Initial state for graph G: marked[v] = F and id[v] unset for every vertex v = 0, ..., 12.
Connected components demo
v    marked[]  id[]
0    T         0
1    T         0
2    T         0
3    T         0
4    T         0
5    T         0
6    T         0
7    T         1
8    T         1
9    T         2
10   T         2
11   T         2
12   T         2
done
Finding connected components with DFS
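public class CC
{
    private boolean[] marked;
    private int[] id;          // id[v] = id of component containing v
    private int count;         // number of components

    public CC(Graph G)
    {
        marked = new boolean[G.V()];
        id = new int[G.V()];
        for (int v = 0; v < G.V(); v++)
        {
            if (!marked[v])
            {                  // run DFS from one vertex in each component
                dfs(G, v);
                count++;
            }
        }
    }

    public int count()     { return count; }
    public int id(int v)   { return id[v]; }

    private void dfs(Graph G, int v)
    {
        marked[v] = true;
        id[v] = count;         // all vertices discovered in the same DFS get the same id
        for (int w : G.adj(v))
            if (!marked[w])
                dfs(G, w);
    }
}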
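The recursive version relies on the function-call stack, which can overflow on very large components. A variation, sketched below, replaces the recursion with an explicit stack; this is a swapped-in technique rather than the lecture's code, and the class name `ComponentsSketch` and the `build` helper are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Sketch: label connected components with an explicit stack instead of recursion.
class ComponentsSketch {
    // Returns id[], where id[v] = index of the component containing v.
    // Components are numbered 0, 1, 2, ... in order of their smallest vertex.
    static int[] componentIds(List<List<Integer>> adj) {
        int V = adj.size();
        int[] id = new int[V];
        Arrays.fill(id, -1);                    // -1 means "not yet marked"
        int count = 0;
        Deque<Integer> stack = new ArrayDeque<>();
        for (int s = 0; s < V; s++) {
            if (id[s] != -1) continue;          // already labeled by an earlier search
            id[s] = count;
            stack.push(s);
            while (!stack.isEmpty()) {
                int v = stack.pop();
                for (int w : adj.get(v)) {
                    if (id[w] == -1) {
                        id[w] = count;
                        stack.push(w);
                    }
                }
            }
            count++;                            // one search per component
        }
        return id;
    }

    // Helper for the sketch: build adjacency lists from an array of edges.
    static List<List<Integer>> build(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); adj.get(e[1]).add(e[0]); }
        return adj;
    }
}
```

The visiting order differs from recursive DFS, but the component labels are the same, since each search still reaches exactly the vertices connected to its start vertex.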
Connected components application: study spread of STDs
Peter Bearman, James Moody, and Katherine Stovel. Chains of affection: The structure of adolescent
romantic and sexual networks. American Journal of Sociology, 110(1): 44–99, 2004.
Connected components application: particle detection
Graph-processing challenge 1
Problem. Is a graph bipartite?
Graph edges: 0-1, 0-2, 0-5, 0-6, 1-3, 2-3, 2-4, 4-5, 4-6
How difficult?
Hire an expert.
Intractable.
No one knows.
Impossible.
Answer: simple DFS-based solution (see textbook).
One side of the bipartition: { 0, 3, 4 }
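One possible shape for a DFS-based bipartiteness check is two-coloring: give each newly visited vertex the opposite color of the vertex it was reached from, and report failure if an edge joins two vertices of the same color. The sketch below is an illustration under that assumption, not the textbook's code; the class name `BipartiteSketch` and the `build` helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: DFS two-coloring to test whether a graph is bipartite.
class BipartiteSketch {
    // Returns color[] with values 0 or 1 if the graph is bipartite, else null.
    static int[] twoColor(List<List<Integer>> adj) {
        int[] color = new int[adj.size()];
        Arrays.fill(color, -1);                       // -1 means "not yet visited"
        for (int s = 0; s < adj.size(); s++) {
            if (color[s] == -1) {
                color[s] = 0;
                if (!dfs(adj, s, color)) return null;
            }
        }
        return color;
    }

    // Color the unvisited neighbors of v with the opposite color; false on a conflict.
    private static boolean dfs(List<List<Integer>> adj, int v, int[] color) {
        for (int w : adj.get(v)) {
            if (color[w] == -1) {
                color[w] = 1 - color[v];
                if (!dfs(adj, w, color)) return false;
            } else if (color[w] == color[v]) {
                return false;                         // edge inside one color class
            }
        }
        return true;
    }

    // Helper for the sketch: build adjacency lists from an array of edges.
    static List<List<Integer>> build(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); adj.get(e[1]).add(e[0]); }
        return adj;
    }
}
```

On the nine-edge graph of this challenge, vertices 0, 3, and 4 receive one color and vertices 1, 2, 5, and 6 the other, matching the set { 0, 3, 4 }.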
Bipartiteness application: is dating graph bipartite?
Graph-processing challenge 2
Graph (partial edge list): 0-1, 0-2, 0-5, 0-6, 1-3
How difficult?
No one knows.
Impossible.
(see textbook)
Bridges of Königsberg
Euler cycle. Is there a (general) cycle that uses each edge exactly once?
Answer. A connected graph is Eulerian iff all vertices have even degree.
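The condition in the answer can be checked directly: count degrees, then confirm connectivity with a graph search. The sketch below is an illustration; the class name `EulerCheckSketch` is hypothetical, and it assumes V ≥ 1 and treats a graph with isolated vertices as not connected.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch: test "connected and every vertex has even degree".
class EulerCheckSketch {
    static boolean hasEulerCycle(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); adj.get(e[1]).add(e[0]); }

        // Condition 1: every vertex has even degree.
        for (int v = 0; v < V; v++)
            if (adj.get(v).size() % 2 != 0) return false;

        // Condition 2: the graph is connected (every vertex reachable from vertex 0).
        boolean[] marked = new boolean[V];
        Deque<Integer> stack = new ArrayDeque<>();
        marked[0] = true;
        stack.push(0);
        int reached = 1;
        while (!stack.isEmpty()) {
            int v = stack.pop();
            for (int w : adj.get(v)) {
                if (!marked[w]) {
                    marked[w] = true;
                    reached++;
                    stack.push(w);
                }
            }
        }
        return reached == V;
    }
}
```

This only answers whether an Euler cycle exists; constructing one takes additional work.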
Graph-processing challenge 3
Problem. Find a (general) cycle that uses every edge exactly once.
Graph (partial edge list): 0-1, 0-2, 0-5, 0-6, 1-2
How difficult?
Intractable.
No one knows.
Impossible.
Euler cycle: 0-1-2-3-4-2-0-6-4-5-0
(classic graph-processing problem)
Graph-processing challenge 4
Graph (partial edge list): 0-1, 0-2, 0-5, 0-6, 1-2
How difficult?
Intractable.
No one knows.
Impossible.
Hamilton cycle: 0-5-3-4-6-2-1-0
(classical NP-complete problem)
Graph-processing challenge 5
First graph (partial edge list): 0-1, 0-2, 0-5, 0-6, 3-4
Second graph: 0-4, 0-5, 0-6, 1-4, 1-5, 2-4, 3-4, 5-6
How difficult?
Hire an expert.
Intractable.
No one knows.
Impossible.
Graph isomorphism is a longstanding open problem.
Graph-processing challenge 6
Graph (partial edge list): 0-1, 0-2, 0-5, 0-6, 3-4
How difficult?
Hire an expert.
Intractable.
No one knows.
Linear-time DFS-based planarity algorithm discovered by Tarjan in the 1970s.
Graph traversal summary
BFS and DFS enable efficient solutions to many (but not all) graph problems.
problem      time
cycle        E + V
planarity    E + V