
Lecture 17

I/O Interfaces and I/O Busses

I/O Interface

CS510 Computer Architectures

Lecture 17 - 1

Storage System Issues


Historical Context of Storage I/O
Storage I/O Performance Measures
Secondary and Tertiary Storage Devices
Processor Interface Issues
I/O Buses
Redundant Arrays of Inexpensive Disks (RAID)


Disk Time Example


Disk Parameters:
  Transfer size is 8K bytes
  Advertised average seek is 12 ms
  Disk spins at 7200 RPM
  Transfer rate is 4 MB/sec
  Controller overhead is 2 ms

Assume that disk is idle so no queuing delay
What is Average Disk Access Time for a Sector?
  Ave seek + ave rot delay + transfer time + controller overhead
  = 12 ms + 0.5/(7200 RPM/60) + 8 KB/(4 MB/s) + 2 ms
  = 12 + 4.17 + 2 + 2 = 20 ms (rounded)

Advertised seek time assumes no locality: with locality, actual seek is typically 1/4 to 1/3 of advertised seek time
Average Disk Access Time for a Sector: 20 ms => 12 ms (seek drops from 12 ms to about 4 ms)
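
The disk access time sum can be checked with a short calculation. This is a minimal sketch; the function name and the use of decimal units (8 KB = 8000 bytes, 4 MB = 4,000,000 bytes, which is what makes the transfer term exactly 2 ms) are assumptions, not part of the slide.

```python
def disk_access_time_ms(seek_ms, rpm, transfer_bytes, rate_bytes_per_s, controller_ms):
    """Average access time = seek + half a rotation + transfer + controller overhead."""
    rotation_ms = 0.5 / (rpm / 60.0) * 1000.0        # half a revolution, in ms
    transfer_ms = transfer_bytes / rate_bytes_per_s * 1000.0
    return seek_ms + rotation_ms + transfer_ms + controller_ms

KB = 1000          # decimal units (assumption)
MB = 1000 * KB

# Advertised 12 ms seek versus one third of it (locality case).
advertised = disk_access_time_ms(12.0, 7200, 8 * KB, 4 * MB, 2.0)
with_locality = disk_access_time_ms(12.0 / 3, 7200, 8 * KB, 4 * MB, 2.0)

print(f"advertised seek: {advertised:.2f} ms")
print(f"1/3 of seek:     {with_locality:.2f} ms")
```

Half a rotation at 7200 RPM is about 4.17 ms, so the two cases come out near 20 ms and 12 ms.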

Processor Interface Issues

Interconnections
  Busses

Processor interface
  Interrupts
  Memory mapped I/O

I/O Control Structures
  Polling
  Interrupts
  DMA
  I/O Controllers
  I/O Processors

Capacity, Access Time, Bandwidth


I/O Interface
Independent I/O Bus
  [Diagram: CPU connects to memory over the memory bus and to the peripheral interfaces over a separate I/O bus]
  Separate I/O instructions (in, out)

Common Memory & I/O bus
  [Diagram: CPU, memory, and the peripheral interfaces all share one bus]
  Lines distinguish between I/O and memory transfers
  Examples: VME bus, Multibus-II, Nubus
  40 Mbytes/sec optimistically
  10 MIPS processor completely saturates the bus!


Memory Mapped I/O


Single Memory & I/O Bus
No Separate I/O Instructions

[Diagram: CPU, memory (ROM and RAM), and the peripheral interfaces share one bus; I/O occupies part of the same address space as ROM and RAM]

[Diagram: CPU with cache ($) and L2 cache ($) on the memory bus with memory; a bus adapter connects the memory bus to the I/O bus]
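
As a rough illustration of memory-mapped I/O, the sketch below models one address space in which ordinary loads and stores reach either RAM or a device register, depending only on the address. The class name, the address split at `device_base`, and the register layout are hypothetical choices made for this example.

```python
class Bus:
    """One address space shared by memory and device registers (hypothetical layout)."""

    def __init__(self, ram_size, device_base):
        self.ram = [0] * ram_size
        self.device_base = device_base
        self.device_regs = {}              # register offset -> last value written

    def store(self, addr, value):
        if addr >= self.device_base:       # address decode selects the interface
            self.device_regs[addr - self.device_base] = value
        else:
            self.ram[addr] = value

    def load(self, addr):
        if addr >= self.device_base:
            return self.device_regs.get(addr - self.device_base, 0)
        return self.ram[addr]


bus = Bus(ram_size=0x1000, device_base=0x8000)
bus.store(0x0010, 42)      # ordinary memory write
bus.store(0x8000, 0x01)    # the same store operation, but it lands in a device register
print(bus.load(0x0010), bus.load(0x8000))
```

No special I/O instruction appears anywhere: the address alone decides whether memory or a peripheral responds.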

Programmed I/O (Polling)


[Flowchart: the CPU asks the IOC "is data ready?"; on "no" it spins in a busy-wait loop; on "yes" it reads the data from the device and stores it to memory; then it checks "done?" and repeats on "no"]

Busy waiting is not an efficient way to use the CPU, unless the device is very fast!
But checks for I/O completion can be dispersed among computationally intensive code
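
The polling flowchart can be sketched as two nested loops. The `Device` class here is a hypothetical stand-in that reports ready on every fifth status check; the counter of wasted checks shows the cost of busy waiting.

```python
class Device:
    """Hypothetical device that becomes ready on every `latency`-th status check."""

    def __init__(self, data, latency):
        self.data = list(data)
        self.latency = latency
        self.countdown = latency

    def ready(self):
        self.countdown -= 1
        return self.countdown <= 0

    def read(self):
        self.countdown = self.latency      # next word needs another wait
        return self.data.pop(0)


def polled_read(device, count):
    memory, wasted_checks = [], 0
    while len(memory) < count:             # done?
        while not device.ready():          # is data ready? no -> busy-wait loop
            wasted_checks += 1
        memory.append(device.read())       # yes -> read data, store data
    return memory, wasted_checks


memory, wasted = polled_read(Device([10, 20, 30], latency=5), count=3)
print(memory, wasted)
```

Each word costs four failed status checks in this model, which is CPU time that does no useful work.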

Interrupt Driven Data Transfer


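[Diagram: CPU, memory, IOC, and device; memory holds the user program (add, sub, and, or, nop) and the interrupt service routine (read, store, ..., rti)]

(1) I/O interrupt
(2) save PC
(3) interrupt service addr
(4) interrupt service routine transfers the data to memory, then rti returns to the user program

User program halts only during actual transfer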

1000 transfers/second:
  1000 interrupts @ 2 µsec per interrupt => 2 msec
  1000 interrupt services @ 98 µsec each => 98 msec
  Total: 100 msec = 0.1 CPU seconds
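
The interrupt overhead arithmetic can be written out as a small calculation. This sketch assumes the per-interrupt costs are in microseconds (2 µsec to take the interrupt, 98 µsec to service it), which is the only reading under which the totals add up to 0.1 CPU seconds; the function name is illustrative.

```python
def interrupt_cpu_seconds(transfers, interrupt_us=2.0, service_us=98.0):
    """CPU time consumed when every transfer raises its own interrupt."""
    return transfers * (interrupt_us + service_us) / 1e6

busy = interrupt_cpu_seconds(1000)
print(f"{busy:.3f} CPU seconds per 1000 transfers ({busy:.0%} of one second)")
```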

Direct Memory Access


CPU sends a starting address, direction (R/W), and word count to the DMAC, then issues "start".

Time to do 1000 xfers at 1 msec each:
  1 DMA set-up sequence @ 50 µsec
  1 interrupt @ 2 µsec
  1 interrupt service sequence @ 48 µsec
  Total: 100 µsec = .0001 second of CPU time

[Diagram: CPU, memory, DMAC, and IOC/device on one bus; memory-mapped I/O address space starting at 0 and covering ROM, RAM, peripherals, and the DMAC]

DMAC provides handshake signals for the peripheral controller, and memory addresses and handshake signals for memory.
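
A toy model of the DMA sequence: the CPU programs the controller once, the controller generates every memory address itself, and a single interrupt marks completion. The `DMAC` class and its `start` signature are hypothetical, and only the device-to-memory direction is modeled.

```python
class DMAC:
    """Toy DMA controller: programmed once by the CPU, then moves every word itself."""

    def __init__(self, memory):
        self.memory = memory
        self.interrupts = 0

    def start(self, address, device_words, count):
        # The DMAC, not the CPU, supplies the memory address for each word.
        for i in range(count):
            self.memory[address + i] = device_words[i]
        self.interrupts += 1               # one interrupt when the whole block is done


memory = [0] * 16
dmac = DMAC(memory)
dmac.start(address=4, device_words=[7, 8, 9], count=3)

# CPU cost in µsec, using the slide's figures: set-up + interrupt + service sequence.
cpu_us = 50 + 2 + 48
print(memory[4:7], dmac.interrupts, cpu_us / 1e6, "CPU seconds")
```

The CPU cost stays at 100 µsec no matter how many words are in the block, in contrast to one interrupt per transfer.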

Input/Output Processors
[Diagram: CPU and IOP share the main memory bus with memory (Mem); the IOP drives devices D1, D2, ..., Dn on the I/O bus]

(1) CPU issues instruction to IOP: OP, Device, Address (the target device, and where the commands are)
(2) IOP looks in memory for commands
(3) Device to/from memory transfers are controlled by the IOP directly; IOP steals memory cycles from CPU
(4) IOP interrupts CPU when done

Command format in memory: OP, Addr, Cnt, Other
  OP: what to do
  Addr: where to put data
  Cnt: how much
  Other: special requests
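
The command block that the IOP fetches from memory can be sketched as a small record with the four fields named on the slide. The field types, the `"read"` opcode, and the `run_iop` helper are assumptions for illustration; real IOPs define their own encodings.

```python
from dataclasses import dataclass

@dataclass
class IOPCommand:
    op: str          # what to do
    addr: int        # where to put data
    cnt: int         # how much
    other: int = 0   # special requests


def run_iop(device_data, commands, memory):
    """Walk the command list without CPU help; the return value models the final interrupt."""
    for cmd in commands:
        if cmd.op == "read":               # device -> memory transfer
            memory[cmd.addr:cmd.addr + cmd.cnt] = device_data[:cmd.cnt]
            device_data = device_data[cmd.cnt:]
    return "done"


memory = [0] * 8
status = run_iop([1, 2, 3, 4],
                 [IOPCommand("read", addr=2, cnt=2), IOPCommand("read", addr=6, cnt=2)],
                 memory)
print(status, memory)
```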

Relationship to Processor Architecture


I/O instructions and busses have largely disappeared
Interrupt vectors have been replaced by jump tables
  PC <- M [ IVA + interrupt number ]

Interrupts:
  Stack replaced by shadow registers
  Handler saves registers and re-enables higher priority interrupts
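
The dispatch rule PC <- M [ IVA + interrupt number ] amounts to one indexed memory read. In this sketch the base address `IVA` and the handler addresses are made-up values, and memory is modeled as a dictionary.

```python
IVA = 0x100                        # interrupt vector base address (hypothetical value)

# Memory words holding one handler address per interrupt number (hypothetical values).
M = {
    IVA + 0: 0x2000,
    IVA + 1: 0x2400,
    IVA + 2: 0x2800,
}

def dispatch(interrupt_number):
    """PC <- M[IVA + interrupt number]"""
    return M[IVA + interrupt_number]

pc = dispatch(1)
print(hex(pc))
```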


Relationship to Processor Architecture


Caches required for processor performance cause problems for I/O
  Flushing is expensive, I/O pollutes cache
  Solution is borrowed from shared memory multiprocessors: "snooping"

Virtual memory frustrates DMA
Stateful processors are hard to switch context


Interconnect Trends
Interconnect = glue that interfaces computer system components
High speed hardware interfaces + logical protocols
Networks, channels, backplanes

              Network            Channel            Backplane
Distance      >1000 m            10 - 100 m         1 m
Bandwidth     10 - 100 Mb/s      40 - 1000 Mb/s     320 - 1000+ Mb/s
Latency       high (>ms)         medium             low (<µs)
Reliability   low                medium             high
              Extensive CRC      Byte Parity        Byte Parity

Network end: message-based, narrow pathways, distributed arbitration
Backplane end: memory-mapped, wide pathways, centralized arbitration

Backplane Architectures
Metric                                        VME         FutureBus   MultiBus II      SCSI-I
Bus Width (signals)                           128         96          96               25
Address/data multiplexed?                     No          Yes         Yes              n/a
Data Width                                    16 - 32     32          32               8
Xfer Size                                     Single/Multiple (all four buses)
# of Bus Masters                              Multiple    Multiple    Multiple         Multiple
Split Transactions                            No          Optional    Optional         Optional
Clocking                                      Async       Async       Sync             Either
Bandwidth, Single Word (0 ns mem), MB/s       25          37          20               5, 1.5
Bandwidth, Single Word (150 ns mem), MB/s     12.9        15.5        10               5, 1.5
Bandwidth, Multiple Word (0 ns mem), MB/s     27.9        95.2        40               5, 1.5
Bandwidth, Multiple Word (150 ns mem), MB/s   13.6        20.8        13.3             5, 1.5
Max # of devices                              21          20          21               7
Max Bus Length                                .5 m        .5 m        .5 m             25 m
Standard                                      IEEE 1014   IEEE 896    ANSI/IEEE 1296   ANSI X3.131

Distinctions begin to blur:
  SCSI channel is like a bus
  FutureBus is like a channel (disconnect/reconnect)
  HIPPI forms links in high speed switching fabrics

Bus-Based Interconnect
Bus: a shared communication link between subsystems
  Low cost: a single set of wires is shared multiple ways
  Versatility: easy to add new devices, and peripherals may even be ported between computers using a common bus

Disadvantage
  A communication bottleneck, possibly limiting the maximum I/O throughput

Bus speed is limited by physical factors
  the bus length
  the number of devices (and, hence, bus loading)
  these physical limits prevent arbitrary bus speedup


Bus-Based Interconnect
Two generic types of busses:
  I/O busses (sometimes called a channel)
    lengthy
    many types of devices connected
    wide range in the data bandwidth
    follow a bus standard
  CPU-Memory buses (sometimes called a backplane)
    short
    high speed
    matched to the memory system to maximize memory-CPU bandwidth

To lower costs, low-cost (older) systems combine the two into a single bus

Bus transaction
  Sending address & receiving or sending data


Bus Protocols
[Diagram: bus master and bus slave connected by control lines, address lines, and data lines]

Multibus: 20 address, 16 data, 5 control, 50 ns pause

Bus Master: has ability to control the bus, initiates a transaction
Bus Slave: module activated by the transaction
Bus Communication Protocol: specification of sequence of events and timing requirements in transferring information
Asynchronous Bus Transfers: control lines (req., ack.) serve to orchestrate sequencing
Synchronous Bus Transfers: sequence relative to the common clock

Synchronous Bus Protocols


Read transaction
  [Timing diagram: Clock, Address, Data, Read, and Wait signals, from "begin read" to "read complete"]

Pipelined/Split transaction Bus Protocol
  [Timing diagram: the Address lines carry addr 1, addr 2, addr 3 while the Data lines carry data 0, data 1; the Wait line shows "wait 1" followed by "OK 1"]

Asynchronous Handshake
Write Transaction
[Timing diagram: Address, Data, Read, Req., and Ack. signals over t0 to t5; 4-cycle handshake; master asserts address and data, then the next address]

t0: Master has obtained control and asserts address, direction, data; waits a specified amount of time for slaves to decode target
t1: Master asserts request line
t2: Slave asserts ack, indicating data received
t3: Master releases req
t4: Slave releases ack
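
The write handshake steps t0 through t4 can be traced in a few lines. This is a sequential sketch, not a timing-accurate model: the `req` and `ack` lines are plain dictionary entries, and the function name and return values are illustrative.

```python
def async_write(address, data):
    """Return the ordered handshake events, the slave's storage, and the final line states."""
    slave_memory = {}
    lines = {"req": 0, "ack": 0}
    trace = []

    trace.append("t0: master asserts address, direction, data")
    lines["req"] = 1
    trace.append("t1: master asserts req")
    if lines["req"]:                       # slave acts only once it sees req
        slave_memory[address] = data
        lines["ack"] = 1
        trace.append("t2: slave asserts ack (data received)")
    if lines["ack"]:                       # master reacts to ack
        lines["req"] = 0
        trace.append("t3: master releases req")
    if not lines["req"]:                   # slave reacts to req going low
        lines["ack"] = 0
        trace.append("t4: slave releases ack")
    return trace, slave_memory, lines


trace, stored, lines = async_write(0x40, 7)
for event in trace:
    print(event)
```

Each step waits on the other side's previous signal change, which is why no common clock is needed.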

Asynchronous Handshake
Read Transaction
[Timing diagram: Address, Data, Read, Req, and Ack signals over t0 to t5; 4-cycle handshake; master asserts address, then the next address]

t0: Master has obtained control and asserts address, direction; waits a specified amount of time for slaves to decode
t1: Master asserts request line
t2: Slave asserts ack, indicating ready to transmit data; slave asserts data
t3: Master releases req, data received
t4: Slave releases ack

Time Multiplexed Bus: address and data share lines



Bus Arbitration
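Parallel (Centralized) Arbitration
  [Diagram: each master (M) has its own BR and BG lines to a central bus arbiter]
  BR: Bus Request
  BG: Bus Grant
  Bus sequence: BR, then BG

Serial Arbitration (daisy chaining)
  [Diagram: the arbitration unit (A.U.) sends BG down a chain of masters (M); each master receives it on BGi and passes it on through BGo; the masters share the BR, Busy, and data lines]

Polling
  [Diagram: the arbitration unit (A.U.) and the masters (M) share a BR line and the lines marked A]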
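
Serial (daisy-chain) arbitration can be sketched as a scan down the chain: the grant enters the first master and is passed along until it reaches one that is requesting the bus. The function name and the list-of-booleans representation are assumptions for this example.

```python
def daisy_chain_grant(requests):
    """requests[i] is True if master i (0 = closest to the arbitration unit) asserts BR.

    Returns the index of the master that keeps the grant, or None if nobody asked.
    """
    if not any(requests):          # the shared BR line is not asserted, so no BG is issued
        return None
    for i, wants_bus in enumerate(requests):
        if wants_bus:              # this master absorbs BGi and does not drive BGo
            return i
    return None


print(daisy_chain_grant([False, True, True]))    # master 1 wins over master 2
print(daisy_chain_grant([False, False, False]))
```

Priority is fixed by position in the chain, so a master far from the arbitration unit can be starved by closer ones.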

Bus Options
Option: High performance / Low cost

Bus width
  High performance: Separate address & data lines
  Low cost: Multiplex address & data lines
Data width
  High performance: Wider is faster (e.g., 32 bits)
  Low cost: Narrower is cheaper (e.g., 8 bits)
Transfer size
  High performance: Multiple words has less bus overhead
  Low cost: Single-word transfer is simpler
Bus masters
  High performance: Multiple (requires arbitration)
  Low cost: Single master (no arbitration)
Split transaction?
  High performance: Yes - separate Request and Reply packets gets higher bandwidth (needs multiple masters)
  Low cost: No - continuous connection is cheaper and has lower latency
Clocking
  High performance: Synchronous
  Low cost: Asynchronous

1990 Bus Survey


                      VME         FutureBus    Multibus II      IPI           SCSI
Signals               128         96           96               16            8
Addr/Data mux         no          yes          yes              n/a           n/a
Data width            16 - 32     32           32               16            8
Masters               multi       multi        multi            single        multi
Clocking              Async       Async        Sync             Async         either
MB/s (0ns, word)      25          37           20               25            1.5 (asyn), 5 (sync)
MB/s (150ns, word)    12.9        15.5         10               =             =
MB/s (0ns, block)     27.9        95.2         40               =             =
MB/s (150ns, block)   13.6        20.8         13.3             =             =
Max devices           21          20           21               8             7
Max meters            0.5         0.5          0.5              50            25
Standard              IEEE 1014   IEEE 896.1   ANSI/IEEE 1296   ANSI X3.129   ANSI X3.131


VME
3 96-pin connectors
128 defined as standard, rest customer defined
  32 address
  32 data
  64 command & power/ground lines


SCSI: Small Computer System Interface


Up to 8 devices communicate on a bus or string at sustained speeds of 4-5 MBytes/sec
SCSI-2 up to 20 MB/sec
Devices can be slave (target) or master (initiator)
SCSI protocol: a series of phases, during which specific actions are taken by the controller and the SCSI disks
  Bus Free: No device is currently accessing the bus
  Arbitration: When the SCSI bus goes free, multiple devices may request (arbitrate for) the bus; fixed priority by address
  Selection: informs the target that it will participate (Reselection if disconnected)
  Command: the initiator reads the SCSI command bytes from host memory and sends them to the target
  Data Transfer: data in or out, initiator <-> target
  Message Phase: message in or out, initiator <-> target (identify, save/restore data pointer, disconnect, command complete)
  Status Phase: target, just before command complete

SCSI Bus: Channel Architecture


peer-to-peer protocols
initiator/target
linear byte streams
disconnect/reconnect

Phase sequence:
  Command Setup
    Arbitration
    Selection
    Message Out (Identify)
    Command
  Disconnect to seek/fill buffer
    Message In (Disconnect)
    - - Bus Free - -
    Arbitration
    Reselection
    Message In (Identify)
  Data Transfer (follows Command Setup directly if no disconnect is needed)
    Data In
  Disconnect to fill buffer
    Message In (Save Data Ptr)
    Message In (Disconnect)
    - - Bus Free - -
    Arbitration
    Reselection
    Message In (Identify)
    Message In (Restore Data Ptr)
  Command Completion
    Status
    Message In (Command Complete)
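
The phase sequence for a read can be generated from the number of disconnect/reconnect cycles. This is a simplified sketch: it only models disconnects between Command and Data In, it leaves out the Save/Restore Data Ptr messages, and the function name is illustrative.

```python
def scsi_read_phases(disconnects=0):
    """Phase sequence for one read with `disconnects` disconnect/reselect cycles before data."""
    phases = ["Bus Free", "Arbitration", "Selection",
              "Message Out (Identify)", "Command"]
    for _ in range(disconnects):
        # Target frees the bus while it seeks, then arbitrates to reconnect.
        phases += ["Message In (Disconnect)", "Bus Free", "Arbitration",
                   "Reselection", "Message In (Identify)"]
    phases += ["Data In", "Status", "Message In (Command Complete)"]
    return phases


for phase in scsi_read_phases(disconnects=1):
    print(phase)
```

With `disconnects=0` the sequence goes straight from Command to Data In, matching the "if no disconnect is needed" path.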


1993 I/O Bus Survey


Bus                   SBus       TurboChannel   MicroChannel    PCI
Originator            Sun        DEC            IBM             Intel
Clock Rate (MHz)      16-25      12.5-25        async           33
Addressing            Virtual    Physical       Physical        Physical
Data Sizes (bits)     8,16,32    8,16,24,32     8,16,24,32,64   8,16,24,32,64
Master                Multi      Single         Multi           Multi
Arbitration           Central    Central        Central         Central
32 bit read (MB/s)    33         25             20              33
Peak (MB/s)           89         84             75              111 (222)
Max Power (W)         16         26             13              25


1993 MP Server Memory Bus Survey


Bus                   Summit      Challenge    XDBus
Originator            HP          SGI          Sun
Clock Rate (MHz)      60          48           66
Split transaction?    Yes         Yes          Yes?
Address lines         48          40           ??
Data lines            128         256          144 (parity)
Data Sizes (bits)     512         1024         512
Clocks/transfer       4           5            4?
Peak (MB/s)           960         1200         1056
Master                Multi       Multi        Multi
Arbitration           Central     Central      Central
Addressing            Physical    Physical     Physical
Slots                 16          9            10
Busses/system         1           1            2
Length                13 inches   12? inches   17 inches
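
The Peak (MB/s) row appears to follow from the clock rate, data size, and clocks per transfer. The relation below is an inference from the survey figures rather than something the slide states, and the function name is illustrative.

```python
def peak_mb_per_s(clock_mhz, data_size_bits, clocks_per_transfer):
    """Peak bandwidth = bytes per transfer * transfers per second (in millions)."""
    bytes_per_transfer = data_size_bits / 8
    return clock_mhz * bytes_per_transfer / clocks_per_transfer

summit = peak_mb_per_s(60, 512, 4)
challenge = peak_mb_per_s(48, 1024, 5)
xdbus = peak_mb_per_s(66, 512, 4)
print(summit, challenge, xdbus)
```

Summit and XDBus reproduce the listed 960 and 1056 MB/s exactly; Challenge comes out at about 1229 MB/s, which the survey lists as 1200.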


Communications Networks
Performance limiter is memory system, OS overhead, not protocols

[Diagram: the node processor and processor memory share the peripheral backplane bus with the network controller. The network controller contains a control register interface, request block DMA, receive block DMA, memory, and the network interface to the media. Processor memory holds a list of request blocks (data to be transmitted), a list of free blocks, and a list of receive blocks (data received).]

Send/receive queues in processor memories
Network controller copies back and forth via DMA
No host intervention needed
Interrupt host when message sent or received

I/O Controller Architecture


[Diagram: the host processor, with its cache and host memory, connects over the peripheral bus (VME, FutureBus, etc.) to the I/O controller. The I/O controller contains a peripheral bus interface/DMA, buffer memory, its own processor (Proc) with ROM, and an I/O channel interface.]

Request/Response block interface
Backdoor access to host memory

I/O Data Flow


Impediment to high performance: multiple copies, complex hierarchy

Application Address Space (Host Processor)
  | Memory-to-Memory Copy
OS Buffers, >10 MByte (Host Processor)
  | DMA over Peripheral Bus
HBA Buffers, 1 M - 4 MBytes (I/O Controller)
  | Xfer over Disk Channel
Track Buffers, 32K - 256 KBytes (Embedded Controller)
  | Xfer over Serial Interface
Head/Disk Assembly (I/O Device)
