Infinity
Moshe Yanai
IBM Fellow
2010 IEEE Reynold B. Johnson Award
More information will be generated
[Figure: timeline of information generated, vertical axis in gigabytes. Labeled values: 2000: 15B; 2001: 21B; 2002: 33B. Milestones on the time axis: 40,000 BCE (cave paintings, bone tools); 3500 (writing); 0 C.E.; 105 (paper); 1450 (printing); 1870 (electricity, telephone); 1947 (transistor); 1950 (computing); late 1960s (Internet); 1993 (The Web); 1999; 2003.]
Source: UC Berkeley, School of Information Management and Systems. Copyright © 2010 - Moshe Yanai
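The three labeled values on the chart (15B, 21B and 33B gigabytes for 2000 to 2002) imply a year-over-year growth factor that can be checked with simple arithmetic. The sketch below assumes "B" means billions of gigabytes and that each value belongs to the year printed next to it; the names `generated` and `yearly_growth` are illustrative only.

```python
# Values as read from the chart; assumed to be billions of gigabytes per year.
generated = {2000: 15, 2001: 21, 2002: 33}


def yearly_growth(series):
    """Return {year: factor}, where factor = value(year) / value(previous year)."""
    years = sorted(series)
    return {y: series[y] / series[prev] for prev, y in zip(years, years[1:])}


growth = yearly_growth(generated)
for year, factor in growth.items():
    print(f"{year}: x{factor:.2f} over the previous year")
```

Under these assumptions the growth factor itself increases from one year to the next, which is the point the slide title makes.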
Data Storage Growth
• 2006: 161 Exabytes
2010 Audi A5 Cabriolet
American Airlines DC4, 1938
Boeing 787 New Dreamliner
Titanic - 1912
Newest Carnival Cruise
On the other hand…
[Diagram labels: Interfaces, Controllers, Cache, JBOD, JBOD, Tier 3]
• Controllers
• Interconnects
PERFORMANCE, RELIABILITY, SCALABILITY = $$$
With this current architecture, scalability is achieved by using more powerful (and more expensive) components.
Laundry Power? Lamp Power? TV Power?
[Diagram labels: Switching, Switching; Data Module (x7)]
Storage Architecture Revolutions
1970             1990                    2010
Mainframe        Cluster architecture    Scalable grid architecture
Monolithic       Tightly coupled         Node independent
Gates design     Custom HW design        Commodity H/W building blocks
Very expensive   Expensive components    Off-the-shelf low cost components
[Diagram labels: Switching; Node, Node; JBOD, JBOD]
THANK YOU
myanai@us.ibm.com
• Backup Slides
Data Module 3
Data Module 3
[ hardware upgrade ]
XIV Distribution Algorithm on System Changes
• Data distribution only changes when the system changes
– Equilibrium is kept when new hardware is added
– Equilibrium is kept when old hardware is removed
– Equilibrium is kept after a hardware failure
The fact that distribution is full and automatic makes sure all spindles join the effort of data re-distribution after configuration change.
[ hardware failure ]
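The slides do not specify how the distribution is computed, so the following is only a minimal sketch of the property described above, not the XIV algorithm itself. It assumes data is split into fixed-size partitions, and it keeps the partition-to-module map even (within one partition) while moving only the partitions needed to restore balance after a module is added, removed, or fails. The function names `rebalance` and `moved`, the module names, and the partition count of 1200 are all hypothetical.

```python
import random
from collections import defaultdict


def rebalance(assignment, modules, seed=0):
    """Return a new partition -> module map that is even across `modules`,
    moving only the partitions needed to restore balance."""
    rng = random.Random(seed)
    modules = sorted(modules)
    held = defaultdict(list)  # partitions that stay on each live module
    pool = []                 # partitions that have to move
    for part, mod in sorted(assignment.items()):
        if mod in modules:
            held[mod].append(part)
        else:
            pool.append(part)  # unplaced, or its module was removed or failed
    base, extra = divmod(len(assignment), len(modules))
    # Give the "+1" slots to the fullest modules so fewer partitions move.
    ranked = sorted(modules, key=lambda m: -len(held[m]))
    target = {m: base + (1 if i < extra else 0) for i, m in enumerate(ranked)}
    # Overloaded modules each give up a random share of their surplus.
    for m in modules:
        surplus = len(held[m]) - target[m]
        if surplus > 0:
            rng.shuffle(held[m])
            pool.extend(held[m][:surplus])
            held[m] = held[m][surplus:]
    # Underloaded modules are filled from the shuffled pool.
    rng.shuffle(pool)
    for m in modules:
        while len(held[m]) < target[m]:
            held[m].append(pool.pop())
    return {part: m for m in modules for part in held[m]}


def moved(old, new):
    """Count partitions whose module differs between two maps."""
    return sum(1 for part in old if old[part] != new[part])


six_modules = ["m1", "m2", "m3", "m4", "m5", "m6"]
unplaced = {part: None for part in range(1200)}
six = rebalance(unplaced, six_modules)            # initial even layout
seven = rebalance(six, six_modules + ["m7"])      # hardware added
after_failure = rebalance(seven, [m for m in six_modules + ["m7"] if m != "m3"])

print("moved on upgrade:", moved(six, seven))
print("moved on failure:", moved(seven, after_failure))
```

In this sketch every surviving module gives up or receives a share of the moved partitions, which mirrors the claim that all spindles join the re-distribution effort; when nothing about the module set changes, calling `rebalance` again moves nothing.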