Computer Architecture
Firefox, MS Excel
Windows 7 Visual C++ x86 Machine Primitives Von Neumann Machine Logic Gates & Memory Transistors & Devices Quantum Physics
Technology
Technology
Technology advances at astounding rate
19th century: attempts to build mechanical computers Early 20th century: mechanical counting systems (cash registers, etc.) Mid 20th century: vacuum tubes as switches Since: transistors, integrated circuits
Semiconductor History
Date Event 1947 1st transistor 1958 1st IC 1971 1st microprocessor 1974 Intel 4004 1978 1989 1995 2006 Intel 8086 Intel 80486 Intel Pentium Pro Intel Montecito Comments Bell Labs Jack Kilby (MSEE 50) @TI Winner of 2000 Nobel prize Intel (calculator market) 2300 transistors 29K transistors 1M transistors 5.5M transistors 1.7B transistors
201x IBM
50B transistors
Computer Architecture
Instruction Set Architecture (IBM 360)
the attributes of a [computing] system as seen by the programmer. I.e. the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation. -- Amdahl, Blaaw, & Brooks, 1964
752 In Context
Prior courses
352 gates up to multiplexors and adders 354 high-level language down to machine language interface or instruction set architecture (ISA) 552 implement logic that provides ISA interface CS 537 provides OS background (co-req. OK) Modern processors that exploit ILP Modern memory systems that exploit MLP ECE 757 covers parallel and multiprocessing ECE 755 covers VLSI design
Computer Architecture
Exercise in engineering tradeoff analysis
Find the fastest/cheapest/power-efficient/etc. solution Optimization problem with 100s of variables
Technology Push
What do these two intervals have in common?
1776-1999 (224 years) 2000-2001 (2 years)
Answer: Equal progress in processor speed! The power of exponential growth! Driven by Moores Law
Devices per chip doubles every 18-24 months
Computer architects turn additional resources into Speed Power savings Functionality
Performance Growth
Unmatched by any other industry ! [John Crawford, Intel] Doubling every 18 months (1982-1996): 800x
Cars travel at 44,000 mph and get 16,000 mpg Air travel: LA to NY in 22 seconds (MACH 800) Wheat yield: 80,000 bushels per acre
Technology Push
Technology advances at varying rates
E.g. DRAM capacity increases at 60%/year But DRAM speed only improves 10%/year Creates gap with processor frequency!
Inflection points
Crossover causes rapid change E.g. enough devices for multicore processor (2001)
Application Pull
Corollary to Moores Law: Cost halves every two years
In a decade you can buy a computer for less than its sales tax today. Jim Gray
Application Pull
What about the future? Must dream up applications that are not costeffective today
Virtual reality, telepresence Web agents, social networking Wireless, location-aware Proactive (beyond interactive) w/ sensors Recognition/Mining/Synthesis (RMS) ???
Trends
Moores Law for device integration Chip power consumption Single-thread performance trend
[source: Intel]
Dynamic Power
P k C V A dyn i i if
2
i units
Combinational logic evaluates new inputs Flip-flop, latch captures new value (clock edge) C: capacitance of circuit
wire length, number and size of transistors
Multicore Mania
First, servers
IBM Power4, 2001
Then desktops
AMD Athlon X2, 2005
Then laptops
Intel Core Duo, 2006
Your cellphone
Baseband/DSP/application/graphics
Sep 18, 2007 Mikko Lipasti-University of Wisconsin
Why Multicore
Core
Core Core
Core
Core
Core
Core
Single Core Core area Core power Chip power Core performance Chip performance
Sep 18, 2007
A W W+O P P
Amdahls Law
n
# CPUs
f
1
1-f
Time
f fraction that can run in parallel 1-f fraction that must run serially
Speedup
Sep 18, 2007
1 f (1 f ) n
1 lim n f 1 f 1 f n
# CPUs
1-f
Amdahls Law
Ignores (power) cost of n cores
Time
32
16 8 4 2 1
1 2 4 8 16 32 64 128
# of cores/chip
Fixed power budget forces slow cores Serial code quickly dominates
Sep 18, 2007 Mikko Lipasti-University of Wisconsin
Iron Law
Time Processor Performance = --------------Program
=
Instructions Program
(code size)
Cycles X Instruction
(CPI)
Time Cycle
(cycle time)
Iron Law
Instructions/Program
Instructions executed, not static code size Determined by algorithm, compiler, ISA
Cycles/Instruction
Determined by ISA and CPU organization Overlap among instructions reduces this term
Time/cycle
Determined by technology, organization, clever circuit design
Our Goal
Minimize time, which is the product, NOT isolated terms Common error to miss terms while devising optimizations
E.g. ISA change to decrease instruction count BUT leads to CPU organization which makes clock slower
Textbooks
Required course textbook:
John Paul Shen and Mikko H. Lipasti, Modern Processor Design: Fundamentals of Superscalar Processors, First edition, McGraw-Hill.
Recommended textbook:
Mark Hill, Norm Jouppi, and Guri Sohi. Readings in Computer Architecture. Morgan Kauffman, 1999
Expected Background
ECE/CS 552 or equivalent
Design simple uniprocessor Simple instruction sets Organization Datapath design Hardwired/microprogrammed control Simple pipelining Basic caches
Course Context
Assume canonical RISC ISA
Register-register ALU ops Load from memory (cache) Store to memory Branches, jumps, calls, returns
Lecture
Attendance required Some lectures will be delivered on line Overscheduled in first half; will cancel many lectures in 2nd half
Homework
Homework assigned but not graded Learning tool to help prepare for midterm
Exams
Midterm 1: Fri 3/9 in class Midterm 2: Fri 5/18 12:25PM (final exam time slot) Keep up with reading list!
Midterm 2
Project
30%
30%
Communication channels
E-mail to instructor, class e-mail list Web page Office hours
35
Tentative Schedule
Week 1 Week 2 Week 3 Week 4 Week 5 Week 6 Week 7 Week 8 Week 9 Week 10 Week 11 Week 12 Week 13 Week 14 Week 15 Finals Week Introduction, Review of Pipelining Superscalar Organization, Instruction Flow Instruction Flow contd, Memory review Register Data Flow Memory Data Flow Advanced Register Data Flow Case Studies, Midterm 1 in-class on 3/9 Case Studies, Advanced Memory Hierarchy Lecture canceled (most likely) Multiple threads, Niagara case study Instruction Set Design, Advanced topics Lecture canceled, project work Lecture canceled, project work Lecture canceled, project work Project talks, Course Evaluation, Final reports Midterm 2 Friday 5/18 12:25pm
Wrapping Up
Wed lecture on technology challenges
Sets the stage for the whole course
Be prepared for discussion/pop quiz No lecture Fri 2/3 (out of town) Final thought: Talking about music is like dancing about architecture. (Thelonius Monk)
37