Portions of this work are from the book, Digital Design: An Embedded Systems Approach Using VHDL, by Peter J. Ashenden, published by Morgan Kaufmann Publishers, Copyright 2007 Elsevier Inc. All rights reserved.
VHDL
Embedded Computers
A computer as part of a digital system
Performs processing to implement or control the system's function
Components
Processor core
Instruction and data memory
Input, output, and input/output controllers
For interacting with the physical world
Accelerators
High-performance circuit for specialized functions
Interconnecting buses
Digital Design Chapter 7 Processor Basics 2
Memory Organization
Von Neumann architecture
Single memory for instructions and data
Harvard architecture
Separate instruction and data memories
Most common in embedded systems
[Figure: block diagram with CPU, instruction memory, data memory, accelerator, input controller, output controller, and I/O controller]
Bus Organization
Single bus for low-cost, low-performance systems
Multiple buses for higher performance
[Figure: CPU, instruction memory, data memory, accelerator, input controller, output controller, and I/O controller connected by buses]
Bus Organization
Traditional Bus Topology
Bus Organization
Typical Switch Fabric Topology
Bus Organization
Altera's System Interconnect Fabric Example
Bus Organization
Altera's Memory-Mapped and Streaming System Interconnect Fabrics
SRIO: Serial RapidIO is a high-performance, point-to-point, packet-switched interconnect technology defined by the RapidIO Trade Association. Full-duplex point-to-point links are established with single or multiple high-speed serial lanes (1x and 4x are currently defined), and industry-standard 8B/10B-encoded data transmission at signaling rates of 1.25, 2.50, or 3.125 Gbaud for peak bandwidth of up to 20 Gbps.
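As a rough check on the quoted peak figure, the sketch below assumes that 8B/10B coding carries 8 payload bits in every 10 line bits, and that the 20 Gbps figure counts both directions of a 4x full-duplex link. The function name `srio_payload_gbps` is illustrative and not part of any RapidIO tool.

```python
def srio_payload_gbps(lanes: int, gbaud_per_lane: float) -> float:
    """Payload rate in one direction, in Gbps, after 8B/10B coding overhead."""
    return lanes * gbaud_per_lane * 8 / 10


# 4x link at the fastest listed signaling rate
one_direction = srio_payload_gbps(4, 3.125)
both_directions = 2 * one_direction  # full duplex: both directions at once
print(one_direction, both_directions)
```

Under these assumptions a 4x link at 3.125 Gbaud carries 10 Gbps each way, which is consistent with the 20 Gbps peak quoted above.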
Microprocessors
Single-chip processor in a package
External connections to memory and I/O buses
Most commonly seen in general-purpose computers
E.g., Intel Pentium family, PowerPC
Microcontrollers
Single chip combining
Processor
A small amount of instruction/data memory
I/O controllers
Microcontroller families
Same processor, varying memory and I/O
8-bit microcontrollers
Operate on 8-bit data
Low cost, low performance
NXP's 50-MHz ARM Cortex-M0-based LPC1100 microcontroller family represents the latest 32-bit challenge to 8- and 16-bit processors. The parts are available now with prices starting at 65 to 95 cents (10,000). CoreMark Benchmark measures 40 to 50% better code density for the LPC1100 than that of 8- and 16-bit microcontrollers.
Processor Cores
Processor as a component in an FPGA or ASIC
In FPGA, can be a fixed-function block
E.g., PowerPC cores in some Xilinx FPGAs
Instruction Sets
A processor executes a program
A sequence of instructions, each performing a small step of a computation
Instruction Execution
Instructions are encoded in binary
Stored in the instruction memory
Little endian: least significant byte (LSB) at lowest address
Big endian (e.g., PowerPC): most significant byte (MSB) at lowest address
[Figure: byte layout of 8-bit data, 16-bit data at addresses m and m+1, and 32-bit data at addresses n to n+3, from least significant byte to most significant byte, under each ordering]
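The two byte orderings can be illustrated with a short sketch. The value `0x12345678` is an arbitrary example, and index 0 of each byte string stands for the lowest address n.

```python
value = 0x12345678  # arbitrary 32-bit example value

little = value.to_bytes(4, "little")  # least significant byte at lowest address
big = value.to_bytes(4, "big")        # most significant byte at lowest address

for offset in range(4):
    print(f"n+{offset}: little={little[offset]:02X} big={big[offset]:02X}")
```

Reading the bytes back with the wrong ordering yields a different number, which is why the ordering has to be agreed on whenever multi-byte data moves between systems.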
Instruction set illustrates features typical of 8-bit cores and processors in general
Programs written in assembly language
Each processor instruction written explicitly
Translated to binary representation by an assembler
Gumnut Storage
General-Purpose Registers
How many registers should you encode for in the instruction? Two? Three? How many registers should there be?
[Figure: general-purpose registers r0 to r7, program counter (PC), a memory with addresses up to 255, and a memory with addresses up to 4095]
Arithmetic Instructions
Operate on register data and put result in a register
add, addc, sub, subc
Can have immediate value operand
Arithmetic Instructions
Examples
add r3, r4, r1
add r5, r1, 2
sub r4, r4, 1
add r4, r4, r3 ; double x
add r4, r4, 1 ; then add 1
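To make the register arithmetic concrete, here is a minimal sketch of 8-bit addition with an optional carry input, as used by add and addc. It assumes that Z is set when the 8-bit result is zero and that C holds the carry out of bit 7; the helper name `add8` is hypothetical.

```python
def add8(a: int, b: int, carry_in: int = 0):
    """Return (8-bit result, Z flag, C flag) for a + b + carry_in."""
    total = (a & 0xFF) + (b & 0xFF) + (carry_in & 1)
    result = total & 0xFF  # keep only the low 8 bits
    return result, int(result == 0), total >> 8


x = 5
doubled, _, _ = add8(x, x)       # double x
answer, z, c = add8(doubled, 1)  # then add 1
print(answer, z, c)

print(add8(200, 100))  # sum exceeds 255, so the carry flag is set
```

Passing the C flag of one addition as `carry_in` to the next is what allows addc to extend arithmetic beyond 8 bits.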
Logical Instructions
Operate on register data and put result in a register
and, or, xor, mask (and not)
Operate bitwise on 8-bit operands
Can have immediate value operand
Condition codes
Z: 1 if result is zero, 0 if result is non-zero
C: always 0
Logical Instructions
Examples
and r3, r4, r5
or r1, r1, 0x80 ; set r1(7)
xor r5, r5, 0xFF ; invert r5
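The bitwise behavior and condition codes described for the logical instructions can be sketched as follows. The function name `logical8` is hypothetical, and mask is modeled as "and not", as stated on the slide.

```python
def logical8(op: str, a: int, b: int):
    """Return (8-bit result, Z flag, C flag) for a bitwise operation."""
    a &= 0xFF
    b &= 0xFF
    if op == "and":
        result = a & b
    elif op == "or":
        result = a | b
    elif op == "xor":
        result = a ^ b
    elif op == "mask":
        result = a & ~b & 0xFF  # and-not: clear the bits of a that are set in b
    else:
        raise ValueError(f"unknown operation: {op}")
    return result, int(result == 0), 0  # C is always 0


print(logical8("or", 0x01, 0x80))   # sets bit 7
print(logical8("xor", 0x0F, 0xFF))  # inverts every bit
```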
Shift Instructions
Logical shift/rotate register data and put result in a register
shl, shr, rol, ror
Count specified as a literal operand
Condition codes
Z: 1 if result is zero, 0 if result is non-zero
C: the value of the last bit shifted/rotated past the end of the byte
Shift Instructions
Examples
shl r4, r1, 3
ror r2, r2, 4
shl r4, r4, 3
shl r1, r4, 1 ; multiply by 2
shl r4, r4, 3 ; multiply by 8
add r4, r4, r1
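The shift and rotate behavior, including the C flag rule, can be sketched as below. This assumes counts from 1 to 7 and returns (result, C); the function names mirror the mnemonics but are otherwise illustrative.

```python
def _check(count: int) -> None:
    if not 1 <= count <= 7:
        raise ValueError("this sketch assumes a count from 1 to 7")


def shl(value: int, count: int):
    _check(count)
    wide = (value & 0xFF) << count
    return wide & 0xFF, (wide >> 8) & 1  # C: last bit pushed out of bit 7


def shr(value: int, count: int):
    _check(count)
    value &= 0xFF
    return value >> count, (value >> (count - 1)) & 1  # C: last bit pushed out of bit 0


def rol(value: int, count: int):
    _check(count)
    value &= 0xFF
    result = ((value << count) | (value >> (8 - count))) & 0xFF
    return result, result & 1  # C: last bit wrapped from bit 7 into bit 0


def ror(value: int, count: int):
    _check(count)
    value &= 0xFF
    result = ((value >> count) | (value << (8 - count))) & 0xFF
    return result, result >> 7  # C: last bit wrapped from bit 0 into bit 7


# Multiply by 10 as 2x + 8x, ignoring overflow
x = 7
twice, _ = shl(x, 1)
eight_times, _ = shl(x, 3)
product = (twice + eight_times) & 0xFF
print(product)
```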
Memory Instructions
Transfer data between registers and data memory
Compute address by adding an offset to a base register value
Store from register to memory
Use r0 if base address is 0
Condition codes not affected
Memory Instructions
Increment a 16-bit integer in memory
Little-endian: address of lsb in r2, msb in next location
ldm r1, (r2)
add r1, r1, 1 ; increment lsb
stm r1, (r2)
ldm r1, (r2)+1
addc r1, r1, 0
stm r1, (r2)+1
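The same byte-at-a-time increment can be traced in a short sketch. It models data memory as a `bytearray` and assumes the carry out of the low-byte addition feeds the high-byte addition, as addc does; `increment16` is a hypothetical name.

```python
def increment16(memory: bytearray, addr: int) -> None:
    """Increment the little-endian 16-bit integer stored at addr and addr + 1."""
    low = memory[addr] + 1           # add r1, r1, 1
    carry = low >> 8                 # C flag from the low-byte addition
    memory[addr] = low & 0xFF        # stm r1, (r2)
    high = memory[addr + 1] + carry  # addc r1, r1, 0
    memory[addr + 1] = high & 0xFF   # stm r1, (r2)+1


memory = bytearray([0xFF, 0x01])  # holds 0x01FF, lsb first
increment16(memory, 0)
print(int.from_bytes(memory, "little"))
```

Here the low byte wraps from 0xFF to 0x00 and the carry bumps the high byte, giving 0x0200.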
Input/Output Instructions
I/O controllers have registers that govern their operation
Each has an address, like data memory address
Gumnut has separate data and I/O address spaces
Output to I/O register
Condition codes not affected
Further examples in Chapter 8
Branch Instructions
Programs can evaluate conditions and take alternate courses of action
Condition codes (Z, C) represent outcomes of arithmetic/logical/shift instructions
Branch Example
Elapsed seconds in location 100
Increment, wrapping to 0 after 59
ldm r1, 100
add r1, r1, 1
sub r0, r1, 60 ; Z set if r1 = 60
bnz +1 ; Skip to store if Z is 0
add r1, r0, 0
stm r1, 100
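The control flow of this example can be followed in a small sketch. It assumes that r0 always reads as 0, so the subtraction into r0 only sets the flags; `tick` is a hypothetical name.

```python
def tick(seconds: int) -> int:
    """Return the next seconds value, wrapping to 0 after 59."""
    r1 = (seconds + 1) & 0xFF          # add r1, r1, 1
    z = ((r1 - 60) & 0xFF) == 0        # sub r0, r1, 60: only the Z flag matters
    if z:                              # bnz +1 skips the next instruction when Z is 0
        r1 = 0                         # add r1, r0, 0
    return r1                          # stm r1, 100


print([tick(s) for s in (0, 58, 59)])
```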
Jump Instruction
Unconditionally skips forward or backward to specified address
Changes the PC to the address
Subroutines
A sequence of instructions that perform some operation
Can call them from different parts of a program using a jsb instruction
Subroutine returns with a ret instruction
[Figure: two jsb m instructions at different points in a program, each calling the subroutine at address m, which ends with ret]
Subroutine Example
Subroutine to increment second count
Address of count in r2
ldm r1, (r2)
add r1, r1, 1
sub r0, r1, 60
bnz +1
add r1, r0, 0
stm r1, (r2)
ret
Calls to the subroutine at address 20, for counts at locations 100 and 102
add r2, r0, 100
jsb 20
add r2, r0, 102
jsb 20
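How jsb and ret move the program counter can be traced with a toy interpreter. This is a sketch under assumptions: the return address is kept on a stack, the subroutine body is reduced to a single placeholder instruction, and the `nop` and `halt` opcodes are hypothetical additions used only to pad and stop the trace.

```python
def run(program: dict, start: int) -> list:
    """Execute a toy program and return the sequence of addresses visited."""
    pc, return_stack, trace = start, [], []
    while True:
        op, arg = program[pc]
        trace.append(pc)
        if op == "jsb":
            return_stack.append(pc + 1)  # remember where to resume
            pc = arg
        elif op == "ret":
            pc = return_stack.pop()      # resume after the most recent call
        elif op == "halt":
            return trace
        else:
            pc += 1                      # any other instruction falls through


program = {
    0: ("jsb", 20),     # first call
    1: ("jsb", 20),     # second call, from a different place
    2: ("halt", None),
    20: ("nop", None),  # stands in for the subroutine body
    21: ("ret", None),
}
trace = run(program, 0)
print(trace)
```

Each ret resumes at the instruction after the jsb that made the call, so the same subroutine can serve both call sites.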
Miscellaneous Instructions
Instructions supporting interrupts
See Chapter 8
reti: Return from interrupt
enai: Enable interrupts
disi: Disable interrupts
wait: Wait for an interrupt
stby: Stand by in low power mode until an interrupt occurs
Example Program
; Program to determine greater of value_1 and value_2
        text
        org 0x000 ; start here on reset
        jmp main
; Data memory layout
        data
value_1: byte 10
value_2: byte 20
result: bss 1
; Main program
        text
        org 0x010
main:   ldm r1, value_1 ; load values
        ldm r2, value_2
        sub r0, r1, r2 ; compare values
        bc value_2_greater
        stm r1, result ; value_1 is greater
        jmp finish
value_2_greater: stm r2, result ; value_2 is greater
finish: jmp finish ; idle loop
Gumnut encoding
18 bits per instruction
Divided into fields representing different aspects of the instruction
Opcodes and function codes
Register numbers
Addresses
The VAX has a computer architecture with easily the most complex instruction set. The instruction set has a highly variable format where the minimal instruction length is 1 byte and the longest instruction is 37 bytes (296 bits).
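Instruction formats (18 bits each, most significant field first, field widths in bits):
Arithmetic/logical register: 1110 (4) | rd (3) | rs (3) | rs2 (3) | unused (2) | fn (3)
Arithmetic/logical immediate: 0 (1) | fn (3) | rd (3) | rs (3) | immed (8)
Shift: 110 (3) | unused (1) | rd (3) | rs (3) | count (3) | unused (3) | fn (2)
Memory, I/O: 10 (2) | fn (2) | rd (3) | rs (3) | offset (8)
Branch: 111110 (6) | fn (2) | unused (2) | disp (8)
Jump: 11110 (5) | fn (1) | addr (12)
Miscellaneous: 1111110 (7) | fn (3) | unused (8)
Encoding Examples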
Encoding for addc r3, r5, 24
Arithmetic immediate, fn = 001
Fields (width): 0 (1) | fn (3) | rd (3) | rs (3) | immed (8)
Bits: 0 | 001 | 011 | 101 | 00011000
Hex: 05D18
Branch
Fields: 111110 | fn | disp
bnc -4
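The addc encoding above can be reproduced by packing the fields with shifts. This sketch follows the arithmetic immediate layout shown in the example (1-bit opcode 0, then fn, rd, rs, and an 8-bit immediate); the helper names are hypothetical. The second helper shows how a negative branch displacement such as -4 fits an 8-bit field in two's complement, which is an assumption about the disp field.

```python
def encode_alu_immed(fn: int, rd: int, rs: int, immed: int) -> int:
    """Pack an arithmetic/logical immediate instruction into 18 bits."""
    return (0 << 17) | (fn << 14) | (rd << 11) | (rs << 8) | (immed & 0xFF)


def twos_complement_8(value: int) -> int:
    """Return the 8-bit two's-complement bit pattern of a signed value."""
    return value & 0xFF


word = encode_alu_immed(fn=0b001, rd=3, rs=5, immed=24)  # addc r3, r5, 24
print(f"{word:018b}")
print(f"{word:05X}")
print(f"{twos_complement_8(-4):08b}")
```

The hexadecimal output for the first call matches the 05D18 given in the example.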
Processor/memory interfacing
Gluing the signals together
[Figure: gumnut core interfaced to memory. Gumnut signals: clk_i, rst_i, inst_cyc_o, inst_stb_o, inst_ack_i, inst_adr_o, inst_dat_i, data_cyc_o, data_stb_o, data_we_o, data_ack_i. Memory signals, including the data SRAM: clk, en, we, adr, dat_i, dat_o. A flip-flop (D, Q, clk) also appears.]
[Figure: memory interface signals P0, ALE, PSEN, WR, RD]
32-bit Memory
Four bytes per memory word
Little-endian: lsb at least address
Big-endian: msb at least address
Byte addresses within successive 32-bit words:
0 1 2 3
4 5 6 7
8 9 10 11
Partial-word read
Read all bytes, processor selects those needed
Partial-word write
Use byte-enable signals
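A partial-word write with byte enables can be sketched as follows. The mapping is an assumption for illustration: bit i of `byte_enable` selects byte lane i, and lane 0 is bits 7 to 0 of the word. The name `write_word` is hypothetical.

```python
def write_word(word: int, data: int, byte_enable: int) -> int:
    """Return the 32-bit word after writing only the enabled byte lanes."""
    for lane in range(4):
        if (byte_enable >> lane) & 1:
            mask = 0xFF << (8 * lane)
            word = (word & ~mask) | (data & mask)  # replace just this byte
    return word & 0xFFFFFFFF


old = 0x11223344
print(f"{write_word(old, 0xAABBCCDD, 0b0001):08X}")  # one byte written
print(f"{write_word(old, 0xAABBCCDD, 0b1100):08X}")  # upper two bytes written
```

Bytes whose enable bit is 0 keep their old contents, so the processor does not need to read the word before writing part of it.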
[Figure: circuit showing +V, Ready, and Clk signals]
Cache Memory
For high-performance processors
Memory access time is several clock cycles
Performance bottleneck
Cache memory
Small fast memory attached to a processor
Stores most frequently accessed items, plus adjacent items
Locality: those items are most likely to be accessed again soon
Cache Memory
Memory contents divided into fixed-sized blocks (lines)
Cache copies whole lines from memory
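The effect of copying whole lines can be seen in a toy model. This is a sketch under simplifying assumptions: a line size of 16 bytes, no limit on the number of cached lines (so no replacement policy), and read accesses only. The class name `TinyCache` is hypothetical.

```python
LINE_SIZE = 16  # assumed line size in bytes


class TinyCache:
    def __init__(self, memory: bytes):
        self.memory = memory
        self.lines = {}   # line number -> copy of that line's bytes
        self.misses = 0

    def read(self, addr: int) -> int:
        line_no, offset = divmod(addr, LINE_SIZE)
        if line_no not in self.lines:
            self.misses += 1  # slow path: fetch the whole line from memory
            base = line_no * LINE_SIZE
            self.lines[line_no] = self.memory[base:base + LINE_SIZE]
        return self.lines[line_no][offset]


memory = bytes(range(64))
cache = TinyCache(memory)
values = [cache.read(addr) for addr in range(32)]
print(cache.misses)
```

Reading 32 consecutive bytes touches only two lines, so only two accesses go to the slower memory; the rest are served from the cache because adjacent items arrived with the first miss.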
Burst transfers
Send starting address, then read successive locations
Pipelining
Overlapping stages of memory access
E.g., address transfer, memory operation, data transfer
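A back-of-the-envelope cycle count shows why overlapping the stages helps. The sketch assumes each stage takes exactly one clock cycle and that a new access can start every cycle once the pipeline is running; real memories may differ, and both function names are hypothetical.

```python
def sequential_cycles(accesses: int, stages: int) -> int:
    """Cycles when each access finishes all stages before the next starts."""
    return accesses * stages


def pipelined_cycles(accesses: int, stages: int) -> int:
    """Cycles when a new access enters the first stage every cycle."""
    if accesses == 0:
        return 0
    return stages + accesses - 1  # fill the pipeline once, then one result per cycle


# Three stages: address transfer, memory operation, data transfer
print(sequential_cycles(4, 3), pipelined_cycles(4, 3))
```

With these assumptions, four accesses take 12 cycles one at a time but 6 cycles when pipelined.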
Summary
Embedded computer
Processor, memory, I/O controllers, buses
Microprocessors, microcontrollers, and processor cores
Soft-core processors for ASIC/FPGA
Processor instruction sets
Binary encoding for instructions