
VLSI Architecture :: MEL G642

Dr. A. Amalin Prince

BITS Pilani K.K. Birla Goa Campus
Department of Electrical and Electronics Engineering

Memory addressing

 The datapath, the addressing path, and the memory subsystem are separated and work in parallel.
 Memory should be accessed just before the datapath operates on the accessed data.
 The memory access that stores a result should be executed just after the result is available.
 Latency or delay of address computing shall be
 The separation of the datapath and the addressing path should not introduce excessive hardware cost.

Memory addressing modes

Addressing                 Algorithm specification

Implied addressing         Implicitly specified in the OP code
Memory direct              A <= immediate data of the instruction
Segment plus offset        A <= SEG + OFFSET
Register indirect          A <= selected GR
Register post-increment    A <= ADP and INC(ADP)   /* ADP is an address pointer */
Register pre-decrement     DEC(ADP) and A <= ADP
Index addressing           A <= SEG + index GR
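
The addressing algorithms listed above can be sketched as a small behavioural model. This is only an illustration, not the circuit from the slides: the class name `AddressGenerator`, the method names, the register count of 8, and the increment step of 1 are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AddressGenerator:
    """Toy behavioural model of the addressing modes (all names are illustrative)."""
    seg: int = 0   # SEG: segment base address
    adp: int = 0   # ADP: address pointer
    gr: list = field(default_factory=lambda: [0] * 8)  # general registers (count assumed)

    def memory_direct(self, immediate):
        return immediate            # A <= immediate data of the instruction

    def segment_plus_offset(self, offset):
        return self.seg + offset    # A <= SEG + OFFSET

    def register_indirect(self, n):
        return self.gr[n]           # A <= selected GR

    def post_increment(self):
        address = self.adp          # A <= ADP ...
        self.adp += 1               # ... and INC(ADP), step of 1 assumed
        return address

    def pre_decrement(self):
        self.adp -= 1               # DEC(ADP) ...
        return self.adp             # ... and A <= ADP

    def index(self, n):
        return self.seg + self.gr[n]  # A <= SEG + index GR


agu = AddressGenerator(seg=0x100, adp=0x20)
agu.gr[3] = 5
print(hex(agu.post_increment()), hex(agu.adp))  # address used, then updated pointer
print(hex(agu.index(3)))
```

Note that post-increment returns the old pointer value while pre-decrement returns the new one, which is the only behavioural difference between the two update orders.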

Memory addressing circuit

[Figure: memory addressing circuit. Labels: register file / control path (instruction decoder), address pointer, addressing feedback logic circuit, address.]

General addressing circuits

Address <= DU_Segment + column_length*(row_position-1) + (column_position-1)
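
The address formula can be checked with a few lines of code. This sketch assumes that row and column positions are 1-based and that `column_length` is the number of elements stored per row (the address stride between consecutive rows); the function name `matrix_address` is hypothetical.

```python
def matrix_address(du_segment, column_length, row_position, column_position):
    """Address <= DU_Segment + column_length*(row_position-1) + (column_position-1)."""
    # Assumption: positions are 1-based, and a column position never exceeds
    # column_length (taken here as the number of elements per row).
    if row_position < 1 or not 1 <= column_position <= column_length:
        raise ValueError("positions are 1-based and must lie inside the matrix")
    return du_segment + column_length * (row_position - 1) + (column_position - 1)


# First element sits at the segment base; row 3, column 2 of a 4-wide matrix
# is 2 full rows plus 1 element further on.
print(hex(matrix_address(0x200, 4, 1, 1)))
print(hex(matrix_address(0x200, 4, 3, 2)))
```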

Modulo addressing of FIFO buffer

 Implicit use of post-increment (++) or post-decrement (--) addressing
 Implicit checking / assignment of the FIFO bottom and top
 Implicit checking that bottom < address pointer < top
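
A minimal sketch of the pointer update behind modulo addressing follows. It assumes that `bottom` and `top` are the first and last valid addresses of the FIFO (both inclusive) and that the step is 1; the function names are illustrative only.

```python
def modulo_post_increment(pointer, bottom, top):
    """Return (address to use now, next pointer); wraps from top back to bottom."""
    next_pointer = bottom if pointer == top else pointer + 1
    return pointer, next_pointer


def modulo_post_decrement(pointer, bottom, top):
    """Return (address to use now, next pointer); wraps from bottom back to top."""
    next_pointer = top if pointer == bottom else pointer - 1
    return pointer, next_pointer


# Walk a 4-entry FIFO located at addresses 0x10..0x13 for six accesses.
pointer, visited = 0x12, []
for _ in range(6):
    address, pointer = modulo_post_increment(pointer, 0x10, 0x13)
    visited.append(address)
print([hex(a) for a in visited])
```

The comparison against the boundary is what the hardware performs implicitly, so the program never issues an explicit wrap instruction.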

Hardware Accelerated Memory Addressing

 Specification of implied addressing for convolution:

1. The coefficient memory supplies coefficients without modulo addressing.
2. The data memory supplies data within a FIFO. The FIFO has:
   1. BTMR, to avoid underflow of the FIFO addressing.
   2. TOPR, to avoid overflow of the FIFO addressing.
   3. DAR, the data address pointer, pointing to data within the FIFO.
3. Post-increment (decrement) addressing wraps around:
   1. on reaching the top of the FIFO, jump to the bottom of the FIFO.
   2. on reaching the bottom of the FIFO, jump to the top of the FIFO.
4. The two memories supply data and coefficient simultaneously.
5. All address operations should be executed in parallel with the data
6. Data and coefficients must be available before computing.
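
The specification can be illustrated with a software model of one convolution output. This is a sketch under stated assumptions, not the hardware from the slides: samples are assumed to be written with post-increment so that older samples sit at lower (wrapped) addresses, DAR is assumed to start at the newest sample, the FIFO is assumed to hold exactly as many samples as there are coefficients, and `car` is a hypothetical name for the coefficient address pointer.

```python
def convolution_output(data_memory, coefficient_memory, btmr, topr, dar):
    """Compute one output sample; the data side uses modulo post-decrement."""
    accumulator = 0
    car = 0  # hypothetical coefficient pointer: plain post-increment, no modulo
    for _ in range(len(coefficient_memory)):
        # In hardware both memories are read in the same cycle as the MAC;
        # here the two reads are simply written on one line.
        accumulator += data_memory[dar] * coefficient_memory[car]
        car += 1
        dar = topr if dar == btmr else dar - 1  # wrap from bottom to top
    return accumulator


# FIFO occupies addresses 4..7; the newest sample is at address 5.
data_memory = [0, 0, 0, 0, 1, 2, 3, 4]
coefficients = [1, 10, 100, 1000]
print(convolution_output(data_memory, coefficients, btmr=4, topr=7, dar=5))  # 3412
```

The samples are visited in the order 5, 4, 7, 6, so the result is 2*1 + 1*10 + 4*100 + 3*1000.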

Modulo++ addressing circuit

Modulo-- addressing circuit

Bit-Reversed Addressing
Given a buffer size of 8, an index register set to 4, and an initial
pointer set to 0x100, the sequence of accesses to the buffer is shown in

A few restrictions apply when using this method:

1) The size of the buffer must be a power of two (2^n).

2) The buffer must be aligned so that the starting address of the buffer has its n LSBs equal to zero.
3) All pointer updates must use the bit-reversed update mode (*BR0+/-). This requires that the desired increment/decrement value be stored in the index register (AR0) in bit-reversed form.
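
For the example above (buffer size 8, index register set to 4, initial pointer 0x100), the access sequence can be reproduced with a short model of reverse-carry addition. This is a sketch only: the function names are hypothetical, and the update is modelled as "bit-reverse, add normally, bit-reverse back" on the n low-order bits, which is equivalent to adding the index with the carry propagating from MSB to LSB.

```python
def bit_reverse(value, bits):
    """Reverse the order of the lowest `bits` bits of value."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bit_reversed_sequence(base, size, index):
    """Addresses visited by repeated bit-reversed post-increment of the pointer."""
    bits = size.bit_length() - 1
    if size != 1 << bits:
        raise ValueError("buffer size must be a power of two")
    if base % size != 0:
        raise ValueError("buffer start must have its n LSBs equal to zero")
    mask = size - 1
    offset, addresses = 0, []
    for _ in range(size):
        addresses.append(base + offset)
        # Reverse-carry add of the index register to the n-bit offset.
        total = (bit_reverse(offset, bits) + bit_reverse(index, bits)) & mask
        offset = bit_reverse(total, bits)
    return addresses


print([hex(a) for a in bit_reversed_sequence(0x100, 8, 4)])
# ['0x100', '0x104', '0x102', '0x106', '0x101', '0x105', '0x103', '0x107']
```

The index value 4 is the 3-bit reversal of an increment of 1, which is why the offsets come out as 0, 4, 2, 6, 1, 5, 3, 7.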

How can bit-reversed addressing be integrated with modulo (circular) addressing? Finally, integrate it with the AGU circuits.

RF fundamentals

Datapath in a DSP processor

[Figure: DSP processor block diagram. Labels: data path (DP), control path (CP), addressing path (AGU), processor memory and register buses.]

General register file

 A general register file (RF) consists of a group of registers used as the first level of computing storage.
 A physical data memory (with a single read/write port) can access one data item at a time; a read and a write cannot be executed simultaneously.
 A register file supports simultaneous reads and writes in one clock cycle, so several operands can be supplied by the RF at the same time, limited only by its number of read ports.

 The RF gets data from data memories by running load instructions while preparing for the execution of a subroutine.
 While running a subroutine, the register file is used as computing buffers.
 After running the subroutine, results in the RF are stored into data memories by running store instructions.

Register file

[Figures: RF (register file) write circuit, read circuit, and store circuit.]

Physical design: Gate count problem

 The gate count of an RF will be more than that of a simple MAC.

 Consider a register file with 32 registers of 16 bits:
– It contains 32×16 = 512 flip-flops.
o The gate count of 512 flip-flops is about 5.12k gates.
o The gate count for all 32×16 keepers is 32×16×6 ≈ 3.1k gates.
– The gate count for operand selection, including OPA and OPB, is 2×16×(32+16+8+4+2+1) ≈ 2.0k gates.
– The gate count of the data selection control is at least 16×(8+4+2+1) = 0.24k gates.
– The total gate count of the register file without driving buffers is about 10.4k gates.
– Including extra driving buffers, the gate count of this register file will be around 15k. (For the approximate per-cell gate counts, see the slides at the end of this presentation.)
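
The estimate can be reproduced by evaluating the expressions exactly as they are written on the slide. In this sketch the per-cell figures (10 gates per flip-flop, 6 per keeper) come from the slides at the end of the presentation; the function name, the dictionary keys, and the assumption that the register count is a power of two are illustrative. Driving buffers are left out because the slide gives no formula for them.

```python
def register_file_gate_count(registers=32, width=16,
                             gates_per_flip_flop=10, gates_per_keeper=6):
    """Evaluate the slide's gate-count expressions (driving buffers excluded)."""
    bits = registers * width
    # Per-bit selection tree: 32+16+8+4+2, plus 1 for the level restorer.
    # Assumes `registers` is a power of two.
    mux_tree = sum(registers >> level for level in range(registers.bit_length()))
    counts = {
        "flip_flops": bits * gates_per_flip_flop,
        "keepers": bits * gates_per_keeper,
        "operand_selection": 2 * width * mux_tree,      # OPA and OPB
        "selection_control": width * (8 + 4 + 2 + 1),   # taken from the slide as-is
    }
    counts["total"] = sum(counts.values())
    return counts


for name, gates in register_file_gate_count().items():
    print(f"{name:18s} {gates:6d}")
```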
Physical design: fan-in fan-out problem

 The fan-out of the clk driver must be 512×2 = 1024.

 The fan-out of the operand selection control pin from CP to RF is very high.
 Consider a register file with 32 registers of 16 bits:
– A fan-out of 512 is very heavy.
o This is a typical hidden critical path between control path and
o It might not be recognized until logic synthesis of the RTL code, because the critical path cannot be identified from the datapath schematic.
– It is a typical mistake of an inexperienced design team.

Fan-out of the control signal at each stage of operand selection, from the 32 registers in the register file down to the selected operand:

Stage     Fan-out of the control signal
First     16*16*2 = 512
Second    16*8*2 = 256
Third     16*4*2 = 128
Fourth    16*2*2 = 64
Fifth     16*1*2 = 32
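
The per-stage fan-out figures follow a simple pattern: word width, times the number of 2:1 muxes in that stage, times 2. The sketch below just evaluates that pattern; the function name is hypothetical and the factor of 2 is taken from the slide without interpretation.

```python
def selection_fanout(registers=32, width=16, factor=2):
    """Control-signal fan-out for each stage of a 2:1 mux selection tree."""
    fanouts = []
    muxes = registers // 2          # 2:1 muxes per bit in the first stage
    while muxes >= 1:
        fanouts.append(width * muxes * factor)
        muxes //= 2                 # each later stage has half as many muxes
    return fanouts


print(selection_fanout())  # [512, 256, 128, 64, 32]
```

The first stage dominates, which is why the select bit driving it is the one most likely to need buffering or pipelining.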

How to manage critical path

 The pipeline around the register file includes the pipeline of result store (toward the RF) and the pipeline of operand fetch (from the RF).
– RF >> memory?
o An operand might be data to be stored to a data memory.
o A data memory might be allocated far away from the register file.
– RF >> execution unit?
– RF << execution unit?
o Not always for the ALU
o Required for MUL/MAC or other accelerated instruction units
o Now you know why an accumulator is required.
– If the result of a multiplication is stored in an accumulator register instead of in the RF, the pipeline of “result store” can be merged into the pipeline of “execution” for most ASIP DSPs.
Special registers in general register file

 Registers in an RF can be either general registers or special registers.
– Special registers could be allocated inside a general register file in an ASIP when a special register is frequently involved in ALU computing.
o In case a register is a general register carrying special register functions, extra logic circuits will be added around the register.
o For example, if a register in the register file is both a general register and an address pointer, addressing logic will be added around this register. (figure in next slide)
– A register file consisting of general and special registers is called a multiple-function RF.

Special registers in general register file

[Figure: a general register in a register file compared with a special register in a register file; each has a register input and a register output, and the special register has additional logic around it.]

Special function registers are in RF?

 Not all special registers can be integrated inside the general register file.
– For example, special registers of peripheral devices are usually allocated in peripheral modules.
– Loop counter and stack registers are usually allocated in the control path.

 If a special register is not integrated in the general register file, it cannot be used as an operand supplying data directly to the ALU.
– In this case, the data process for a special register is:
1) Move the data in the special register to a general register.
2) Process the data and store it in the general register.
3) Move the result stored in the general register to the special register.
– Three clock cycles are needed for one data manipulation in this case.
– The decision depends on whether the data in a special register is needed very frequently.

RF multiple data write

 double-precision data from the MAC accumulator

 Swap instruction



These slides only provide additional information, under certain assumptions.

 A flip-flop is made up of 10 gates, using transmission gates.
 Keepers are basically 2:1 muxes and are realized with transmission gate logic.

 Number of gates = 6 (two transmission gates (4 gates) and one inverter (2 gates)).

 Operand selection is implemented using complementary pass transistor logic.
 PMOS is used to eliminate the need for an inverter to generate S’.

[Figure: implementation of an 8:1 mux using 2:1 muxes.]

 Similarly, 32:1 and 16:1 muxes can be built.
 Each 2:1 mux requires 2 transistors.
 For a 32:1 mux built from 2:1 muxes, the number of levels required is 5.
 Therefore the gate count for a 32:1 mux is (32+16+8+4+2).
 The final stage has a level restorer transistor, which accounts for an extra gate. The output buffer is not counted.

The final gate count is (32+16+8+4+2+1).

Similarly, the gate count for an 8:1 mux can be calculated.

The gate count of the driving buffers depends on the load capacitance to be driven and on the number of stages used.
