Performance Issues
Hyperthreading
Timeline
Cluster, Symmetric Multi Processing (SMP), Hyperthreading
Now, Parallel Computing is available on a Single Processor
Parallel Computing - Goals
Parallel computing is when a program
uses concurrency to either:
Concurrency Illusion
Parts of Processor:
Front-end: fetching/decoding/reordering
Execution core: actual execution
Single-threaded SMP
What is SMP?
“Symmetric Multi-Processors”
Tolerably mislabeled as “Shared-Memory Processors”
Processors all connected to a (large) memory
UMA: Uniform Memory Access, makes it easy to program
Symmetric: all memory is equally close to all processors
Cache coherence via “snoopy caches”
Two threads execute at once, so threads spend less time waiting
Twice as much speed and twice as much waste
Super-threading
[Time-Slice Multithreading]
Principle: the processor can execute
more than one thread at a time
Simultaneous Multi Threading (SMT)
Principle: the processor can execute more
than one thread at a time, even within a
single clock cycle!!
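The difference between running threads one after another, time-slice multithreading and SMT can be sketched with a toy issue-slot model in Python. This is only an illustration under stated assumptions: the 4-slot width, the `run` function, the policy names and the two sample threads are all hypothetical and do not describe a real Intel pipeline. In "smt" mode thread 0 is simply given first pick of the slots.

```python
from collections import deque

WIDTH = 4  # issue slots per clock cycle (hypothetical value)


def run(threads, policy, width=WIDTH):
    """Return (cycles, issued) for a toy issue-slot model.

    threads: list of lists; each entry is the number of instructions a
             thread has ready in one step (0 models a stall cycle).
    policy:  "single"    - threads run one after another
             "timeslice" - each cycle belongs to exactly one thread
             "smt"       - all threads share the slots of every cycle
    """
    queues = [deque(t) for t in threads]
    cycles = issued = 0
    while any(queues):
        live = [i for i, q in enumerate(queues) if q]
        if policy == "single":
            owners = {live[0]}
        elif policy == "timeslice":
            owners = {live[cycles % len(live)]}
        elif policy == "smt":
            owners = set(live)
        else:
            raise ValueError(policy)
        free = width
        for i in live:
            q = queues[i]
            if q[0] == 0:
                # A stall elapses whether or not the thread owns the cycle,
                # except in "single" mode, where waiting threads have not started.
                if i in owners or policy != "single":
                    q.popleft()
            elif i in owners and free:
                n = min(q[0], free)  # issue as much as the free slots allow
                q[0] -= n
                free -= n
                issued += n
                if q[0] == 0:
                    q.popleft()
        cycles += 1
    return cycles, issued


a = [2, 0, 0, 2]  # thread A: two stall cycles in the middle
b = [1, 3, 0, 1]  # thread B: one stall cycle
for policy in ("single", "timeslice", "smt"):
    cycles, issued = run([a, b], policy)
    print(f"{policy:9s} cycles={cycles} issued={issued} "
          f"slot utilization={issued / (cycles * WIDTH):.2f}")
```

In this toy run the same 9 instructions take 8 cycles back to back, 6 cycles with time slicing and 4 cycles with SMT, because only SMT lets both threads fill slots within the same cycle.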
Evolution of Hyper-Threading
Two ways of faster computing
Increase Clock Speed
Better utilization of resources
Hyper Threading
Hardware Requirements
Because the additional threads all run on the same CPU elements
(FPU, ALU), the only additions needed are to the initial
scheduling process.
Intel Xeon – Case Study
Capable of executing at most two threads in parallel on two logical
processors.
Intel Xeon – Resources Division
Replicated
• Register renaming logic
• Instruction Pointer
• ITLB
• Return stack predictor
• Various other architectural registers
Partitioned
• Load/Store buffers
• Various queues, like the scheduling queues, uop queue, etc.
Shared
• Micro-architectural registers
• Execution Units
Replicated Resources
Some resources have to be replicated, like the
Instruction Pointer
1 Instruction Pointer for each Logical Processor
Xeon: 2 Instruction Pointers
Partitioned Resources
Queues are partitioned resources
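A partitioned queue can be pictured as one fixed-size structure in which each logical processor may fill only its own share, so one thread cannot crowd out the other. The Python sketch below assumes an even split between two logical processors; the class name, the 8-entry size and the method names are hypothetical and are not taken from Intel documentation.

```python
from collections import deque


class PartitionedQueue:
    """Fixed-size queue split evenly between logical processors (toy model)."""

    def __init__(self, total_entries=8, logical_processors=2):
        # Each logical processor may hold at most this many entries.
        self.limit = total_entries // logical_processors
        self.parts = [deque() for _ in range(logical_processors)]

    def push(self, lp, uop):
        """Enqueue uop for logical processor lp; return False if its share is full."""
        if len(self.parts[lp]) >= self.limit:
            return False
        self.parts[lp].append(uop)
        return True

    def pop(self, lp):
        """Dequeue the oldest uop of logical processor lp, or None if it has none."""
        return self.parts[lp].popleft() if self.parts[lp] else None


q = PartitionedQueue()
accepted = [q.push(0, f"uop{i}") for i in range(5)]
print(accepted)           # the fifth push from logical processor 0 is refused
print(q.push(1, "uopX"))  # logical processor 1 still has room
```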
Shared Resources
Heart of Hyperthreading:
SMT unaware
Hyper-Threading Architecture
Overview
Confusing Notions
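Is a Hyper-threaded Processor the same as a Dual Core?
Answer: NO
A hyper-threaded processor is one physical core that presents two
logical processors; a dual-core chip has two complete physical cores.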
HT – System Requirements
HT enabled Processor
Pentium 4 3.06 GHz, Xeon
HT enabled Chipsets
Intel 945G Express
HT enabled System BIOS
HT enabled Operating System
Windows 2000, XP, Linux 2.4.12
HT – Requirements from User
Enable HT in BIOS
To utilize HT
Use multi-threaded applications
OR
Run multiple applications at same time
Performance Issues - 1
2 Logical Processors != Double Power
Performance Issues - 2
Death-Traps
Main cause: Shared Resources
Xeon Philosophy <-> Cooperative Multitasking OS
Cases:
Floating Point Unit (FPU):
One floating-point intensive thread takes up the FPU; another similar thread
contending for the same FPU gets stalled
Cache
No cache-coherency problem as in SMP
But, cache conflicts between the two logical processors
Worst case: two threads accessing different parts of memory and sharing no data =>
lots of thrashing
Benchmark results: non-SMT may perform better
With the wrong mix of code, hyper-threading decreases performance
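The cache worst case can be illustrated with a very small simulation. The Python sketch below assumes a direct-mapped cache with 4 lines of 64 bytes, which is far simpler than the set-associative caches of a real Xeon; the function names, sizes and addresses are hypothetical. Two threads each loop over their own 4 lines of data, and the two data regions happen to map onto the same cache lines.

```python
def hit_rate(accesses, num_lines=4, line_size=64):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    tags = [None] * num_lines
    hits = 0
    for addr in accesses:
        block = addr // line_size
        index, tag = block % num_lines, block // num_lines
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag  # miss: evict whatever was in this line
    return hits / len(accesses)


def loop_over(base, lines=4, passes=8, line_size=64):
    """Addresses touched by a thread that sweeps `lines` cache lines repeatedly."""
    return [base + i * line_size for _ in range(passes) for i in range(lines)]


thread_a = loop_over(0x0000)
thread_b = loop_over(0x1000)  # different data, same cache indices as thread_a
# Both threads running at once: their accesses alternate in the shared cache.
together = [addr for pair in zip(thread_a, thread_b) for addr in pair]

print("thread A alone :", hit_rate(thread_a))
print("thread B alone :", hit_rate(thread_b))
print("A and B sharing:", hit_rate(together))
```

Each thread alone misses only on its first pass, while in this constructed case the two threads sharing the cache evict each other on every access. Real workloads fall somewhere between these extremes.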
HT Hardware Hands-On
Need to Enable Hyperthreading through BIOS
Simple Test:
Run the same tasks together, with and without HT
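One possible way to run such a test is sketched below in Python. It starts 1, 2 and 4 copies of a CPU-bound loop at the same time and prints the elapsed wall-clock time; the script would be run once with HT enabled in the BIOS and once with it disabled, and the timings compared by hand. The function names and the loop size are arbitrary choices for illustration, and no particular result is implied.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def busy_work(n):
    """CPU-bound loop: sum of the squares of 0 .. n-1."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_workers(workers, n=2_000_000):
    """Run `workers` copies of busy_work at once; return elapsed seconds."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        list(pool.map(busy_work, [n] * workers))
    return time.perf_counter() - start


if __name__ == "__main__":
    for workers in (1, 2, 4):
        print(f"{workers} worker(s): {time_workers(workers):.2f} s")
```

Separate processes are used instead of Python threads so that the workers can actually run on different logical processors at the same time.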
4 Processors View in Task Manager
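The same view can be obtained programmatically. The Python sketch below assumes an x86 Linux system, where /proc/cpuinfo has one block per logical processor with "physical id" and "core id" fields; the function name and the sample text are made up for illustration. The sample models two hyper-threaded single-core processors, which the OS would show as 4 processors.

```python
import os


def count_processors(cpuinfo_text):
    """Return (logical processors, physical cores) from /proc/cpuinfo-style text."""
    logical = 0
    cores = set()
    physical_id = core_id = None
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            # A blank line ends the block of one logical processor.
            if physical_id is not None and core_id is not None:
                cores.add((physical_id, core_id))
            physical_id = core_id = None
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            physical_id = value
        elif key == "core id":
            core_id = value
    return logical, len(cores)


# Hypothetical sample: two packages, one core each, two logical CPUs per core.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 0

processor : 2
physical id : 3
core id : 0

processor : 3
physical id : 3
core id : 0
"""

print(count_processors(SAMPLE))  # (4, 2)
print("logical processors on this machine:", os.cpu_count())
```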
Key Point
Thank You
E-mail: zainvi.sf@gmail.com