
VLSI Architecture :: MEL G642

Dr. A. Amalin Prince


BITS Pilani K.K. Birla Goa Campus
Department of Electrical and Electronics Engineering

Memory addressing

• The datapath, the addressing path, and the memory subsystem are separated and work in parallel.
• The memory should be accessed just before the datapath operates on the accessed data.
• The memory access that stores a result should be executed just after the result is available.
• The latency (delay) of address computation should be minimized.
• Separating the datapath and the addressing path should not introduce excessive hardware cost.

Memory addressing modes

Addressing                 Algorithm specification

Implied addressing         Implicitly specified in the OP code
Memory direct              A <= immediate data of the instruction
Segment plus offset        A <= SEG + OFFSET
Register indirect          A <= selected GR
Register post-increment    A <= ADP, then INC(ADP)   /* ADP is an address pointer */
Register pre-decrement     DEC(ADP), then A <= ADP
Index addressing           A <= SEG + index GR
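
The addressing modes in the table can be sketched as a small behavioral model. This is only an illustration under assumptions: the class name `AddressUnit`, the method names, and the eight-entry general register list are hypothetical and not taken from the slides; only SEG, ADP, GR and the `A <= ...` rules come from the table.

```python
from dataclasses import dataclass, field


@dataclass
class AddressUnit:
    """Toy behavioral model of the addressing modes (illustrative names)."""
    seg: int = 0   # segment base register (SEG)
    adp: int = 0   # address pointer (ADP)
    gr: list = field(default_factory=lambda: [0] * 8)  # general registers (size assumed)

    def memory_direct(self, immediate):
        # A <= immediate data of the instruction
        return immediate

    def segment_plus_offset(self, offset):
        # A <= SEG + OFFSET
        return self.seg + offset

    def register_indirect(self, r):
        # A <= selected GR
        return self.gr[r]

    def post_increment(self):
        # A <= ADP, then INC(ADP): the old pointer value is the address
        address = self.adp
        self.adp += 1
        return address

    def pre_decrement(self):
        # DEC(ADP), then A <= ADP: the updated pointer value is the address
        self.adp -= 1
        return self.adp

    def index(self, r):
        # A <= SEG + index GR
        return self.seg + self.gr[r]
```

Note the asymmetry between the two pointer modes: post-increment returns the pointer before the update, pre-decrement returns it after the update, so a post-increment followed by a pre-decrement addresses the same location twice.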

Memory addressing circuit in general

[Figure: memory addressing circuit. Labels: inputs from the register file and the control path (instruction decoder); address calculation logic circuit; addressing feedback; initial address; keeper; control from the control path / instruction decoder; address pointer; combinational output; registered output.]

General addressing circuits

Address <= DU_Segment + column_length * (row_position - 1) + (column_position - 1)
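
As a quick check of the address formula, here is a minimal sketch. It assumes that `column_length` is the number of elements stored per row and that row and column positions are counted from 1, which is what the "- 1" terms suggest; the function name is hypothetical.

```python
def element_address(du_segment, column_length, row_position, column_position):
    """Address of a 2-D data unit element, with 1-based row and column positions.

    Assumption: column_length is the number of elements in one row.
    """
    return du_segment + column_length * (row_position - 1) + (column_position - 1)


# First element sits at the segment base; row 3, column 5 of an
# 8-wide data unit is 2 full rows plus 4 elements past the base.
first = element_address(0x200, 8, 1, 1)
later = element_address(0x200, 8, 3, 5)
```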

Modulo addressing of FIFO buffer

• Implicit use of post-increment (++) or post-decrement (--) addressing
• Implicit checking and assignment of the FIFO bottom and top registers
• Implicit checking that the address pointer stays inside the FIFO: bottom < address pointer < top
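
The pointer update implied by these points can be sketched as follows. This is a behavioral sketch, not the circuit: the function names are hypothetical, and it assumes that the bottom and top addresses themselves belong to the FIFO (inclusive bounds).

```python
def modulo_post_inc(ptr, bottom, top):
    """Return (address_used, next_ptr) for modulo post-increment addressing."""
    if not bottom <= ptr <= top:
        raise ValueError("address pointer is outside the FIFO")
    # At the top of the FIFO, wrap around to the bottom; otherwise count up.
    next_ptr = bottom if ptr == top else ptr + 1
    return ptr, next_ptr


def modulo_post_dec(ptr, bottom, top):
    """Return (address_used, next_ptr) for modulo post-decrement addressing."""
    if not bottom <= ptr <= top:
        raise ValueError("address pointer is outside the FIFO")
    # At the bottom of the FIFO, wrap around to the top; otherwise count down.
    next_ptr = top if ptr == bottom else ptr - 1
    return ptr, next_ptr


def walk(ptr, bottom, top, steps):
    """Addresses produced by `steps` consecutive post-increment accesses."""
    addresses = []
    for _ in range(steps):
        used, ptr = modulo_post_inc(ptr, bottom, top)
        addresses.append(used)
    return addresses
```

For a four-entry FIFO from 0x10 to 0x13, `walk(0x12, 0x10, 0x13, 5)` visits 0x12, 0x13, 0x10, 0x11, 0x12: the pointer never leaves the buffer and the program never compares it against the bounds explicitly.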

Hardware Accelerated Memory Addressing

• Specification of implied addressing for convolution:

1. Coefficient memory supplying coefficients without modulo addressing.
2. Data memory supplying data within a FIFO. The FIFO has:
   1. BTMR to avoid underflow of the FIFO addressing.
   2. TOPR to avoid overflow of the FIFO addressing.
   3. DAR, data address pointer, pointing to data within the FIFO.
3. Post-increment (or post-decrement) addressing when:
   1. reaching the top of the FIFO, jump to the bottom of the FIFO.
   2. reaching the bottom of the FIFO, jump to the top of the FIFO.
4. Two memories supply data and coefficient simultaneously.
5. All address operations should be executed in parallel with the data processing.
6. Data and coefficients must be available before computing.
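
The specification above can be illustrated with a software model of one convolution (FIR) output. This is a sketch under assumptions: the function name `fir_step`, the memory layout, and the choice to store the newest sample at DAR and read older samples with modulo post-decrement are hypothetical. In hardware the coefficient and data reads happen in the same cycle, in parallel with the MAC; the loop below serializes them.

```python
def fir_step(coeff_mem, data_mem, btmr, topr, dar, new_sample):
    """Compute one convolution output and return (result, next_dar).

    coeff_mem : coefficients, read with plain linear addressing
    data_mem  : data memory; the FIFO occupies addresses btmr..topr
    dar       : data address pointer, where the newest sample is written
    """
    if topr - btmr + 1 != len(coeff_mem):
        raise ValueError("FIFO length must equal the number of coefficients")

    data_mem[dar] = new_sample          # newest sample overwrites the oldest
    acc = 0
    coeff_ptr = 0                       # coefficient pointer, no modulo needed
    data_ptr = dar                      # start at the newest sample
    for _ in range(len(coeff_mem)):
        acc += coeff_mem[coeff_ptr] * data_mem[data_ptr]
        coeff_ptr += 1
        # Modulo post-decrement: at the bottom, jump to the top of the FIFO.
        data_ptr = topr if data_ptr == btmr else data_ptr - 1

    # Modulo post-increment of DAR: at the top, jump to the bottom.
    next_dar = btmr if dar == topr else dar + 1
    return acc, next_dar


def impulse_response(coeffs, n_samples):
    """Feed a unit impulse through fir_step; the FIFO is placed at address 2."""
    data_mem = [0] * 8
    btmr, topr = 2, 2 + len(coeffs) - 1
    dar = btmr
    outputs = []
    for n in range(n_samples):
        y, dar = fir_step(coeffs, data_mem, btmr, topr, dar, 1 if n == 0 else 0)
        outputs.append(y)
    return outputs
```

Feeding a unit impulse returns the coefficients in order, which is a convenient sanity check that the wrap-around addressing pairs each coefficient with the right sample.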

Modulo++ addressing circuit

Modulo-- addressing circuit

Bit-Reversed Addressing
Given a buffer size of 8, an index register set to 4, and an initial
pointer set to 0x100, the sequence of accesses to the buffer is shown in
Table
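
The access sequence for this example can be regenerated with a short sketch. It models the bit-reversed update as an addition with reversed carry propagation on the low n bits of the pointer, which is the usual description of this mode; the function names are hypothetical.

```python
def bit_reverse(value, nbits):
    """Reverse the order of the low `nbits` bits of `value`."""
    result = 0
    for _ in range(nbits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bit_reversed_sequence(base, size, index):
    """Addresses visited by `size` accesses with bit-reversed pointer update."""
    nbits = size.bit_length() - 1
    mask = size - 1
    if size != 1 << nbits:
        raise ValueError("buffer size must be a power of two")
    if base & mask:
        raise ValueError("buffer base must have its n LSBs equal to zero")

    addresses = []
    offset = 0
    for _ in range(size):
        addresses.append(base + offset)
        # Reverse-carry add: add in the bit-reversed domain, then reverse back.
        total = bit_reverse(offset, nbits) + bit_reverse(index, nbits)
        offset = bit_reverse(total & mask, nbits)
    return addresses


sequence = [hex(a) for a in bit_reversed_sequence(0x100, 8, 4)]
```

For a buffer size of 8, an index of 4 and a pointer starting at 0x100, `sequence` is 0x100, 0x104, 0x102, 0x106, 0x101, 0x105, 0x103, 0x107, which is the order in which a radix-2 FFT needs its data.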

A few restrictions apply when using this method:

1) The size of the buffer must be a power of two (2^n).
2) The buffer must be aligned so that the n least significant bits of its starting address are zero.
3) All pointer updates must use the bit-reversed update mode (*BR0+/-). This requires that the desired increment/decrement value be stored in the index register (AR0) in bit-reversed form.

How can bit-reversed addressing be integrated with modulo (circular) addressing? Finally, integrate it with the AGU circuits.

RF fundamentals

Datapath in a DSP processor

[Figure: DSP processor block diagram. Labels: the data path (DP) with RF, ALU and MAC; control path (CP); processor memory and register busses; DM1, DM2 and PM; addressing path (AGU) with AGU1 and AGU2.]

General register file

• A general register file (RF) consists of a group of registers used as the first level of computing storage buffers.
• A physical data memory (with a single read/write port) can access only one data word at a time; a read and a write cannot be executed simultaneously.
• A register file supports simultaneous reads and writes in one clock cycle; given enough read ports, an RF can supply several operands at the same time.


• The RF gets data from the data memories by running load instructions while preparing for the execution of a subroutine.
• While a subroutine is running, the register file is used as a set of computing buffers.
• After the subroutine has run, the results in the RF are stored into the data memories by running store instructions.
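
The load, compute, store flow can be sketched with a toy register file model. Everything named here is an assumption for illustration: the class `RegisterFile`, a port configuration of two read ports (OPA, OPB) and one write port, and the tiny three-word data memory.

```python
class RegisterFile:
    """Toy RF with two read ports and one write port (assumed configuration)."""

    def __init__(self, n_regs=32, width=16):
        self.regs = [0] * n_regs
        self.mask = (1 << width) - 1

    def cycle(self, opa_sel, opb_sel, write_sel=None, write_data=0):
        """One clock cycle: both reads see the old contents, then the write commits."""
        opa, opb = self.regs[opa_sel], self.regs[opb_sel]
        if write_sel is not None:
            self.regs[write_sel] = write_data & self.mask
        return opa, opb


def run_subroutine(data_mem):
    """Load two words, add them in the RF, store the sum to data_mem[2]."""
    rf = RegisterFile()
    # Load phase: one data memory access per cycle.
    rf.cycle(0, 0, write_sel=0, write_data=data_mem[0])
    rf.cycle(0, 0, write_sel=1, write_data=data_mem[1])
    # Compute phase: both operands are read in the same cycle.
    a, b = rf.cycle(0, 1)
    rf.cycle(0, 0, write_sel=2, write_data=a + b)
    # Store phase: move the result back to data memory.
    result, _ = rf.cycle(2, 2)
    data_mem[2] = result
    return data_mem
```

The point of the model is the contrast in `cycle`: two reads and a write share one clock, whereas each load or store touches the single-port data memory once per cycle.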

Register file

[Figure: RF (register file) with a write circuit and a read circuit, selected by reg_select.]

Store circuit

Physical design: Gate count problem

• The gate count of an RF will be larger than that of a simple MAC.
• Consider a register file with 32 registers of 16 bits:
  – It contains 32×16 = 512 flip-flops.
    o The gate count of 512 flip-flops is about 5.12k gates.
    o The gate count for all 32×16 keepers is 32×16×6 ≈ 3.1k gates.
  – The gate count for operand selection, including OPA and OPB, is 2×16×(32+16+8+4+2+1) ≈ 2.0k gates.
  – The gate count of the data selection control is at least 16×(8+4+2+1) = 0.24k gates.
  – The total gate count of the register file without driving buffers is about 10.4k gates.
  – Including extra driving buffers, the gate count of this register file will be around 15k. (For the approximate gate counts, refer to the slides at the end of this presentation.)
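
The individual terms of this estimate can be added up with a short sketch. The per-cell figures (10 gates per flip-flop, 6 gates per keeper) are the assumptions stated in the slides at the end of the presentation; the function name and the dictionary keys are hypothetical.

```python
def register_file_gate_estimate(n_regs=32, width=16, ff_gates=10, keeper_gates=6):
    """Gate-count terms for a 32 x 16-bit RF, before driving buffers."""
    flip_flops = n_regs * width * ff_gates                  # 512 flip-flops
    keepers = n_regs * width * keeper_gates                 # one keeper per bit
    # Two operand ports (OPA, OPB), one 32:1 selection tree per bit.
    operand_selection = 2 * width * (32 + 16 + 8 + 4 + 2 + 1)
    # Data selection control, as given on the slide.
    data_selection = width * (8 + 4 + 2 + 1)
    return {
        "flip_flops": flip_flops,
        "keepers": keepers,
        "operand_selection": operand_selection,
        "data_selection": data_selection,
        "total": flip_flops + keepers + operand_selection + data_selection,
    }


estimate = register_file_gate_estimate()
```

With the default values the four terms are 5120, 3072, 2016 and 240 gates, 10448 in total, so roughly two thirds of the estimated 15k gates are storage and selection logic and the rest is left for driving buffers.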
Physical design: fan-in fan-out problem

• The fan-out of the clock driver is 512×2 = 1024.
• The fan-out of the operand selection control pins from the CP to the RF is very high.
• Consider a register file with 32 registers of 16 bits:
  – A fan-out of 512 is very heavy.
    o This is a typical hidden critical path between the control path and the datapath.
    o It might not be recognized until logic synthesis of the RTL code, because the critical path cannot be identified from the datapath schematic.
  – Overlooking it is a typical mistake of an inexperienced design team.

Fan-out of the operand selection control signal at each stage of the selection tree, from the 32 registers in a register file down to the selected operand:

First stage:  16*16*2 = 512
Second stage: 16*8*2 = 256
Third stage:  16*4*2 = 128
Fourth stage: 16*2*2 = 64
Fifth stage:  16*1*2 = 32
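
The per-stage fan-out figures follow a simple halving pattern, sketched below. The function name is hypothetical, and the meaning of the factor 2 is an assumption: it is treated here as two loads per 2:1 mux driven by each select signal.

```python
def select_fanout_per_stage(n_regs=32, width=16, loads_per_mux=2):
    """Fan-out of the select signal at each stage of an n_regs:1 mux tree.

    Stage 1 has n_regs/2 two-input muxes per bit, and every later stage has
    half as many. loads_per_mux=2 mirrors the factor 2 in the slide figures.
    """
    fanouts = []
    muxes_per_bit = n_regs // 2
    while muxes_per_bit >= 1:
        fanouts.append(width * muxes_per_bit * loads_per_mux)
        muxes_per_bit //= 2
    return fanouts


stages = select_fanout_per_stage()
```

With the defaults, `stages` is 512, 256, 128, 64, 32. The first-stage select signal is the one that needs buffering or pipelining, since it alone carries half of the total load.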

How to manage critical path

• The pipeline around the register file includes the pipeline of result store (toward the RF) and the pipeline of operand fetch (from the RF).
  – RF >> memory?
    o An operand might be data to be stored to a data memory.
    o A data memory might be placed far away from the register file.
  – RF >> execution unit?
  – RF << execution unit?
    o Not always needed for the ALU.
    o Required for MUL/MAC or other accelerated instruction units.
    o Now you know why an accumulator is required.
  – If the result of a multiplication is stored in an accumulator register instead of in the RF, the pipeline of "result store" can be merged into the pipeline of "execution" for most ASIP DSPs.
Special registers in general register file

• Registers in an RF can be either general registers or special registers.
  – Special registers can be allocated inside a general register file, typically in an ASIP when a special register is frequently involved in ALU computing.
    o If a general register also carries a special register function, extra logic circuits are added around that register.
    o For example, if a register in the register file is both a general register and an address pointer, addressing logic is added around this register (see the figure on the next slide).
  – A register file consisting of general and special registers is called a multiple-function RF.

Special registers in general register file

[Figure: a general register in a register file (register input, register output) next to a special register in a register file (register input, specific function, register output).]

[Figure: address calculation logic for a special register.]

Special function registers are in RF?

• Not all special registers can be integrated inside the general register file.
  – For example, the special registers of peripheral devices are usually allocated in the peripheral modules.
  – The loop counter and stack registers are usually allocated in the control path.


• If a special register is not integrated in the general register file, it cannot be used as an operand supplying data directly to the ALU.
  – In this case, processing the data of a special register takes three steps:
    1) Move the data from the special register to a general register.
    2) Process the data and store the result in the general register.
    3) Move the result from the general register back to the special register.
  – Three clock cycles are needed for one data manipulation in this case.
  – The decision depends on whether the data in the special register is needed frequently.

RF multiple data write

• Double-precision data from the MAC accumulator register
• Swap instruction

The End :: Thank you for your attention

Questions?

These slides only provide additional information, under certain assumptions.

• A flip-flop is made up of 10 gates using transmission-gate logic.

• Keepers are basically 2:1 muxes and are realized in transmission-gate logic.
• Number of gates = 6: two transmission gates (4 gates) and one inverter (2 gates).

• Operand selection is implemented using complementary pass-transistor logic.
• A PMOS transistor is used to eliminate the need for an inverter to generate S'.

[Figure: implementation of an 8:1 mux using 2:1 muxes.]

• Similarly, 32:1 and 16:1 muxes can be built.
• Each 2:1 mux requires 2 transistors.
• A 32:1 mux built from 2:1 muxes requires 5 levels.
• Therefore the gate count for a 32:1 mux is (32+16+8+4+2).
• The final stage has a level-restorer transistor, which accounts for one extra gate.
• The output buffer is not counted.
• The final gate count is (32+16+8+4+2+1).
• The gate count for an 8:1 mux can be calculated similarly.
• The number of driving buffers depends on the load capacitance to be driven and on the number of stages used.
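
The counting rule for the mux tree can be written out for any power-of-two input count. This sketch follows the slide's convention of counting each pass transistor as one gate and adding one level-restorer transistor at the output; the function name and its parameters are hypothetical.

```python
def mux_tree_gate_count(n_inputs, transistors_per_mux=2, level_restorer=1):
    """Return (levels, gate_count) for an n_inputs:1 mux built from 2:1 muxes.

    An n:1 tree needs n - 1 two-input muxes arranged in log2(n) levels.
    The output buffer is not counted, as on the slide.
    """
    if n_inputs < 2 or n_inputs & (n_inputs - 1):
        raise ValueError("n_inputs must be a power of two, at least 2")
    levels = n_inputs.bit_length() - 1
    n_muxes = n_inputs - 1
    return levels, n_muxes * transistors_per_mux + level_restorer


counts = {n: mux_tree_gate_count(n) for n in (8, 16, 32)}
```

For a 32:1 mux this gives 5 levels and 63 gates, matching (32+16+8+4+2+1); an 8:1 mux gives 3 levels and 15 gates, matching the (8+4+2+1) term used for the data selection control.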
