Amogh Dhamdhere
Committee:
Dr. Constantine Dovrolis (advisor)
Dr. Mostafa Ammar
Dr. Nick Feamster
Dr. Ellen Zegura
Dr. Walter Willinger (AT&T Labs-Research)
The Internet Ecosystem
6/7/2019 3
Previous Work
Static graph properties
No focus on how the graph evolves
“Descriptive” modeling
Match graph properties, e.g. degree distribution
Homogeneity
Nodes and links all the same
Game theoretic, computational
Restrictive assumptions
In contrast (right-hand column of the slide):
Dynamics of the evolving graph
Birth/death
Rewiring
“Bottom-up”
Model the actions of individual networks
Heterogeneity
Networks with different incentives
Semantics of interdomain links
Our Approach
Motivation
Approach
Focus on Autonomous Systems (ASes)
As opposed to networks without AS numbers
Start from BGP routes from RouteViews and RIPE
monitors during 1997-2007
Focus on primary links
Internet growth
Rewiring more important than
growth
Most new links are due to internal rewiring and not birth (75%)
Most dead links are due to internal rewiring and not death
(almost 90%)
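A minimal sketch of how new links between two consecutive snapshots might be split into birth and rewiring. The slide does not spell out the definition, so this assumes a new link counts as a birth link when at least one endpoint AS was absent from the earlier snapshot, and as rewiring otherwise. The function name and the toy AS numbers are hypothetical.

```python
def classify_new_links(old_links, new_links):
    """Split links seen only in the newer snapshot into (birth, rewiring).

    Each link is a frozenset of two AS numbers.
    """
    old_ases = {asn for link in old_links for asn in link}
    appeared = new_links - old_links
    # Assumption: a link is a "birth" link if it touches an AS not seen before.
    birth = {link for link in appeared
             if any(asn not in old_ases for asn in link)}
    rewiring = appeared - birth
    return birth, rewiring


# Toy snapshots: AS 4 is new, while the link 1-3 joins two existing ASes.
old = {frozenset({1, 2}), frozenset({2, 3})}
new = {frozenset({1, 2}), frozenset({1, 3}), frozenset({3, 4})}
birth, rewiring = classify_new_links(old, new)
```

The same split applies to dead links by swapping the two snapshots.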
Classification of ASes based on
business function
Four AS types:
Enterprise customers (EC)
Small Transit Providers
(STP)
Large Transit Providers
(LTP)
Content, Access and Hosting
Providers (CAHP)
Based on customer and peer
degrees
Classification based on
decision-trees
80-85% accurate
Evolution of AS types
Conclusions
Where is the Internet heading?
Initial exponential growth up to mid-2001, followed
by linear growth phase
Average path length practically constant
Rewiring more important than growth
Need to classify ASes according to business type
ECs contribute most of the overall growth
Increasing multihoming degree for STPs, LTPs and
CAHPs
Densification at the core
CAHPs are most active in terms of rewiring, while ECs
are least active
Outline
Modeling the Internet Ecosystem
From measurements: Significant rewiring activity
Especially by transit providers
Networks rewire connectivity to optimize a certain
objective function
Distributed
Localized spatially and temporally
Rewiring by changing the set of providers and peers
What are the global, long-term effects of these
distributed optimizations?
Topology and traffic flow
Economics
Performance (path lengths)
The Feedback Loop
[Figure: the feedback loop between provider selection and peer selection]
Our Approach
What would happen if..?
The traffic matrix consists of mostly P2P traffic?
P2P traffic benefits STPs, can make LTPs
unprofitable
Provider and Peer Selection are
Related
[Figure: examples of restrictive peering, peering by necessity, and the Level3-Cogent peering dispute]
Economics, Routing and Traffic
Matrix
Realistic transit, peering and operational costs
Transit prices based on data from Norton
Economies of scale
Traffic matrix
Heavy-tailed content popularity and consumption by sinks
Predominantly client-server: Traffic from CPs to ECs
Predominantly peer-to-peer: Traffic between ECs
Algorithm for network actions
Properties of the steady-state
Is steady-state unique?
No, can depend on playing sequence
Different steady-states have qualitatively similar properties
Canonical Model
Model Validation
Results – Canonical Model
STPs to peer
Peering by “traffic
ratios” makes sense
Conclusions
Considers effects of
Economics
Geography
Heterogeneity in network types
The Edge of the Internet
Monetary and Path Length Cost
For a set of ISPs C, what are the monetary and path
length costs of routing egress flows?
Find the minimum cost mapping G* of flows to ISPs
(Bin Packing)
Flows = items
ISPs = bins
NP-hard!
Use First Fit Decreasing (FFD) heuristic
Mapping G* very close to optimal
Monetary and path length costs of C are calculated
using the mapping G*
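The First Fit Decreasing step can be sketched as follows. This is only an illustration under stated assumptions: each ISP is modeled as a bin with a known capacity, ISPs are listed in order of preference (for example cheapest first), and the flow rates, capacities and names are hypothetical. The slides do not give the exact cost model.

```python
def first_fit_decreasing(flow_rates, isp_capacities):
    """Map flows (items) to ISPs (bins) with the FFD heuristic.

    flow_rates: dict of flow id -> traffic rate.
    isp_capacities: list of (isp, capacity) pairs in order of preference.
    """
    remaining = dict(isp_capacities)
    order = [isp for isp, _ in isp_capacities]
    mapping = {}
    # Place the largest flows first.
    for flow, rate in sorted(flow_rates.items(),
                             key=lambda item: item[1], reverse=True):
        for isp in order:
            # First fit: take the first ISP that still has room.
            if remaining[isp] >= rate:
                remaining[isp] -= rate
                mapping[flow] = isp
                break
        else:
            raise ValueError(f"flow {flow!r} does not fit in any ISP")
    return mapping


# Hypothetical example: four flows, two ISPs of capacity 10 each.
flows = {"f1": 6, "f2": 5, "f3": 4, "f4": 3}
mapping = first_fit_decreasing(flows, [("A", 10), ("B", 10)])
```

Sorting by decreasing rate places the large flows while every ISP still has room, which is why FFD tends to land close to the optimal packing.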
Path Diversity
Selection C gives K paths to each destination d
K-shared link to d: link shared by all K paths to d
If a K-shared link fails, destination d is unreachable
Minimize the number of K-shared links
Path diversity metric: The number of K-shared links to destination d,
averaged over all destinations
Gives the best resiliency to single-link failures
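A small sketch of the path diversity metric as defined above: for each destination, count the links common to all K paths, then average over destinations. Paths are represented as lists of node names and links as ordered node pairs. The function names and the toy topology are assumptions for illustration.

```python
def k_shared_links(paths):
    """Return the links that appear in every one of the K paths."""
    link_sets = [set(zip(path, path[1:])) for path in paths]
    return set.intersection(*link_sets)


def path_diversity_metric(paths_by_destination):
    """Average number of K-shared links over all destinations."""
    counts = [len(k_shared_links(paths))
              for paths in paths_by_destination.values()]
    return sum(counts) / len(counts)


# Hypothetical example with K = 2 upstream ISPs (A and B) and source S.
paths_by_destination = {
    "d1": [["S", "A", "X", "d1"], ["S", "B", "X", "d1"]],  # X-d1 is shared
    "d2": [["S", "A", "d2"], ["S", "B", "d2"]],            # nothing shared
}
metric = path_diversity_metric(paths_by_destination)
```

A lower value means fewer single links whose failure cuts off a destination.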
Summary
Algorithms for ISP selection
Choosing best set of upstream ISPs
Objectives are minimum monetary cost, short AS paths and
high path diversity
The debate
Recent evolution trend: Large amounts of video and
peer-to-peer traffic
Content providers (CP) generate the content
Provide content and services “over the top” of the basic
connectivity provided by ISPs
Profitable (think Google)
Access Providers (AP) deliver content to users
Recent trend: Not profitable
Commoditization of basic Internet access
Want a share of the pie
Tension between APs and CPs: “Network neutrality”
A Technical View
Previous work
Mostly non-technical
Highly emotional debates in the press
Legislation/policy aspects: Do we need network neutrality
legislation?
But what about the underlying problem: Non-
profitability of Access Providers?
Our approach: A quantitative look at AP profitability
Investigate reasons for non-profitability
Evaluate strategies for remaining profitable
Modeling AP Profitability
AP Profitability
Major Findings
Future Directions
Other things I’ve been up to
Router buffer sizing
“Buffer Sizing For Congested Internet Links” [Infocom ‘05]
“Open Issues in Router Buffer Sizing” [CCR ‘06]
Network troubleshooting
“NetDiagnoser: Troubleshooting network disruptions using
end-to-end probes and routing data” [CoNext ‘07]
Network monitoring
“Route monitoring from passive data plane measurements”
[In progress]
Measurement
“Poisson vs. Periodic Path Probing” [IMC ‘05]
Thank You !
Issue-1: remove backup/transient links
Each snapshot of the Internet topology captures 3
months
40 snapshots – 10 years
Perform “majority filtering” to remove backup and
transient links from topology
For each snapshot, collect several “topology samples”
interspersed over a period of 3 weeks
Consider an AS-path only if it appears in the majority of the
topology samples
Otherwise, the AS-path includes links that were active for
less than 11 days (probably backup or transient links)
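The majority filtering step can be sketched as below. This is a simplified illustration, not the exact procedure used for the measurements: each topology sample is assumed to be a set of AS paths (tuples of AS numbers), and a path is kept only if it appears in more than half of the samples. The function name and the toy data are hypothetical.

```python
from collections import Counter


def majority_filter(samples):
    """Return the AS links from paths seen in a majority of the samples.

    samples: list of sets, each holding AS paths as tuples of AS numbers.
    """
    counts = Counter(path for sample in samples for path in sample)
    kept_paths = {path for path, seen in counts.items()
                  if seen > len(samples) / 2}
    links = set()
    for path in kept_paths:
        for a, b in zip(path, path[1:]):
            # Store each link once, regardless of direction.
            links.add((min(a, b), max(a, b)))
    return links


# Three samples: the paths (1, 4) and (5, 6) each appear only once.
samples = [{(1, 2, 3), (1, 4)}, {(1, 2, 3)}, {(1, 2, 3), (5, 6)}]
links = majority_filter(samples)
```

Links seen only on minority paths, which are likely backup or transient, never enter the topology.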
Issue-2: variable set of BGP monitors
Some observed link births may be links revealed due to increased
monitor set
Similarly for observed link deaths
We calculated error bounds for link births and deaths
Relative error < 10% for CP links
See paper for details
Issue-3: visibility of ASes, Customer-Provider (CP) and
Peering (PP) links
Number of ASes and CP links is robust to number of monitors
But we cannot reliably estimate the number of PP links
Global Internet trends
Transit (CP) vs Peering (PP) relations
Customer activity by region
Attractiveness (repulsiveness) of transit providers
Evolution of Internet Peering
Which AS pairs like to peer?
Problem Definition
Two phases
Phase I – ISP Selection:
Select K upstream ISPs
K depends on monetary and performance constraints
“Static” operation
Change only when there are major changes in the traffic destinations or
ISP pricing
Phase II – Egress Path Selection
Allocate egress traffic to selected ISPs
Avoid long term congestion and minimize cost
“Semi-static” operation, performed every few hours or days
Evaluation – Path Diversity
AS-level paths and traffic
rates are input to simulator
9 ISPs, 250 destinations
Given K, find the selection
C* with the minimum path
diversity cost
For each selection C, find
u(C) = total traffic lost due
to the failure of each link in
topology
Calculate Δu(C) = u(C) – u(C*) for each selection C
Single link failures: C* is the optimal selection
Egress Path Selection
After Phase-I, S has K upstream ISPs
Problem: How to map outgoing traffic to the ISPs
M flows: K^M mappings of flows to ISPs
Some mappings may cause congestion for flows!
Flows can be congested at access links or further upstream
Objective: Find the loss-free mapping with the
minimum cost
Challenges:
Upstream topology and capacities are unknown
Iterative routing approaches required
Propose an iterative routing approach based on simulated annealing
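A generic simulated annealing loop over flow-to-ISP mappings might look like the sketch below. It is not the algorithm from the thesis. It assumes a caller-supplied cost function that combines monetary cost with a penalty for observed loss, and the toy cost function assumes known capacities purely so the example can run, whereas the slide states that upstream capacities are unknown. The parameter defaults and all names are hypothetical.

```python
import math
import random


def anneal_mapping(flows, isps, cost, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Search for a low-cost mapping of flows to ISPs by simulated annealing."""
    rng = random.Random(seed)
    current = {flow: isps[0] for flow in flows}
    current_cost = cost(current)
    best, best_cost = dict(current), current_cost
    temperature = t0
    for _ in range(steps):
        # Neighbor move: reassign one randomly chosen flow.
        candidate = dict(current)
        candidate[rng.choice(flows)] = rng.choice(isps)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with decaying probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
        temperature *= cooling
    return best, best_cost


# Toy cost: price per unit of traffic plus a large penalty for overload.
rates = {"f1": 5, "f2": 4, "f3": 3}
capacity = {"A": 6, "B": 10}
price = {"A": 1.0, "B": 2.0}


def toy_cost(mapping):
    load = {isp: 0 for isp in capacity}
    for flow, isp in mapping.items():
        load[isp] += rates[flow]
    monetary = sum(price[isp] * load[isp] for isp in capacity)
    overload = sum(max(0, load[isp] - capacity[isp]) for isp in capacity)
    return monetary + 100 * overload


best, best_cost = anneal_mapping(list(rates), list(capacity), toy_cost)
```

In a live setting the cost function would presumably be evaluated by measuring loss after each re-mapping, which is why an iterative search is needed.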
Evaluation – Path Diversity
2- and 3-link failures: C* is close to the optimal selection
Provider and Peer Selection
Baseline model
AP and CP connect to the TP as customers
N users of AP, charged a flat rate R ($/month)
Transit pricing: 95th percentile of traffic volume,
concave transit pricing functions
95th / mean = 2:1 for normal traffic, 4:1 for video [1]
More video means higher transit payment by AP
AP users: Heavy tailed distribution of content
downloaded per month
High variability in AP costs
[1] Norton ’06: Internet Video: The Next Wave of Massive Disruption to the U.S. Peering Ecosystem
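To make the pricing assumption concrete, here is a rough sketch of 95th-percentile billing with a concave price function. The nearest-rank percentile, the power-law form of the price, and the unit price and exponent are all assumptions for illustration. The slide states only that pricing is concave and based on the 95th percentile.

```python
def percentile_95(samples):
    """95th percentile of traffic samples using the nearest-rank method."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def transit_payment(p95_volume, unit_price=10.0, exponent=0.75):
    """Hypothetical concave pricing: payment grows sublinearly with volume."""
    return unit_price * p95_volume ** exponent


# Same mean traffic, different 95th/mean ratios (2:1 normal, 4:1 video).
mean_volume = 100.0
normal_payment = transit_payment(2 * mean_volume)
video_payment = transit_payment(4 * mean_volume)

p95 = percentile_95(list(range(1, 101)))
```

With the same mean volume, the video-heavy profile pays more because its 95th percentile is twice as high, although concavity keeps the payment below double.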
AP Strategies
Charging strategies
AP charges “heavy hitters”
according to volume
downloaded
AP caps heavy hitters
AP charges CP (non-network
neutral)
Charging strategies are
disruptive
AP cannot control customer
departure probability
AP Strategies
Connection Strategies
AP caches content from CPs
AP peers with CPs
Non-disruptive
Caching can reduce transit costs of AP
But depends on the amount of content cacheable
Selective peering with CPs can improve profitability
Peering cost depends on CP
Cost/benefit analysis for each CP
CP with large network: low cost of peering
Cost-benefit analysis for
peering
Peering cost depends on CP
(easy/medium/hard)
r = saving/cost (both
estimated)
Peer if r > R
AP controls the factor R
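The peering rule above reduces to a one-line comparison. In this sketch the relative costs assigned to the easy/medium/hard classes are made-up numbers, and the parameter `threshold` plays the role of the factor R on the slide.

```python
# Hypothetical relative peering costs for the three CP classes.
PEERING_COST = {"easy": 1.0, "medium": 2.0, "hard": 4.0}


def should_peer(estimated_saving, cp_class, threshold):
    """Peer with a CP when r = saving / cost exceeds the AP's threshold."""
    r = estimated_saving / PEERING_COST[cp_class]
    return r > threshold


peer_easy = should_peer(3.0, "easy", threshold=2.0)   # r = 3.0
peer_hard = should_peer(3.0, "hard", threshold=2.0)   # r = 0.75
```

Raising the threshold makes the AP more selective, since fewer CPs clear the bar.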
AP Strategies
Charging heavy hitters
download amount D,
threshold T, flat rate R
c(D) = D*R/T
AP’s profit is sensitive to
customer departure probability
Non-neutral charging
Customer departure probability
“How discriminatory is my
AP?”
AP’s profit is sensitive to
customer departure probability
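The heavy-hitter charge c(D) = D*R/T can be written as a small function. The slide gives only the formula, so this sketch assumes that users at or below the threshold T keep paying the flat rate R. The function name and the example numbers are hypothetical.

```python
def monthly_charge(downloaded, threshold, flat_rate):
    """Charge for one user with download amount D, threshold T, flat rate R."""
    # Assumption: users at or below the threshold pay the flat rate.
    if downloaded <= threshold:
        return flat_rate
    # Heavy hitters pay in proportion to volume: c(D) = D * R / T.
    return downloaded * flat_rate / threshold


light_user = monthly_charge(50, threshold=100, flat_rate=30)
heavy_user = monthly_charge(200, threshold=100, flat_rate=30)
```

Under this assumption the charge is continuous at D = T, where both branches give R.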
Why Study Internet Evolution?