
Lecture 07

The Thumb Instruction Set

ychang@CS.NCHU, 2012 ychang@CS.NCHU,

The Thumb Instruction Set


The Thumb instruction set addresses the issue of code density. It may be viewed as a compressed form of a subset of the ARM instruction set. Thumb is not a complete architecture: it supports only common application functions, allowing recourse to the full ARM instruction set where necessary. ARM processors that support the Thumb instruction set (IS) can also execute the standard 32-bit ARM IS. Not all ARM processors are capable of executing Thumb instructions.


The Thumb bit in the CPSR


The interpretation of the instruction stream at any particular time is determined by bit 5 of the CPSR, the T bit (see p. 40). If T is set, the processor interprets the instruction stream as 16-bit Thumb instructions; otherwise it interprets it as standard ARM instructions.
CPSR format: bits 31-28 N Z C V | bits 27-8 unused | bit 7 I | bit 6 F | bit 5 T | bits 4-0 mode
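
As a minimal sketch of the T-bit test described above, the following Python snippet checks bit 5 of a CPSR value. The function name and the sample CPSR values are illustrative assumptions and are not taken from the lecture.

```python
T_BIT = 1 << 5  # bit 5 of the CPSR: 1 selects Thumb decoding, 0 selects ARM


def instruction_set(cpsr):
    """Return the instruction set used to decode the stream for this CPSR."""
    return "Thumb" if cpsr & T_BIT else "ARM"


# Hypothetical CPSR values that differ only in the T bit.
print(instruction_set(0x000000D3))  # T bit clear
print(instruction_set(0x000000F3))  # T bit set
```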

Not all ARM processors are capable of executing Thumb instructions; those that are have a T in their name, such as the ARM7TDMI described in Section 9.1 on page 248.

Thumb 16-bit compressed instruction set.
On-chip Debug support, enabling the processor to halt in response to a debug request.
An enhanced Multiplier, with higher performance than its predecessors and yielding a full 64-bit result.
Embedded ICE hardware to give on-chip breakpoint and watchpoint support.

Thumb entry
The normal way to switch to executing Thumb instructions is to execute a Branch and Exchange instruction (see BX, p. 115, and the example on p. 116).
This instruction sets the T bit if the bottom bit of the specified register is set, and switches the PC to the address given in the remainder of the register.

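Example_1: MOV R0, #0x43   ; bottom bit of R0 is set
           BX  R0          ; T bit set: Thumb state, PC = 0x42
Example_2: MOV R0, #0x40   ; bottom bit of R0 is clear
           BX  R0          ; T bit clear: ARM state, PC = 0x40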
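
The effect of BX on the two register values used in the examples can be sketched in Python. This is only a model of the rule stated above (bottom bit to T, remaining bits to PC); the function name `bx` is a hypothetical helper, and alignment rules for ARM targets are deliberately ignored.

```python
def bx(rm):
    """Model Branch and Exchange: return (t_bit, new_pc) for register value rm."""
    t_bit = rm & 1                  # the bottom bit selects Thumb (1) or ARM (0)
    new_pc = rm & ~1 & 0xFFFFFFFF   # the remaining bits give the target address
    return t_bit, new_pc


for value in (0x43, 0x40):
    t_bit, pc = bx(value)
    state = "Thumb" if t_bit else "ARM"
    print(f"R0 = {value:#x}: {state} state, PC = {pc:#x}")
```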

Thumb exit
An explicit switch back to an ARM instruction stream can be caused by executing a Thumb BX instruction (see the example on p. 117). An implicit return takes place whenever an exception is taken, since exception entry is always handled in ARM code.


Thumb Systems
A typical embedded system will include a small amount of fast 32-bit memory on the same chip as the ARM core and will execute speed-critical routines (such as digital signal processing algorithms) in ARM code from this memory. The bulk of the code will not be speed-critical and may execute from a 16-bit off-chip ROM.


The Thumb Programmer's Model


The Thumb IS gives full access to the eight Lo general-purpose registers, r0 to r7, and makes extensive use of r13 to r15 for special purposes:

r13 is used as a stack pointer.
r14 is used as the link register.
r15 is the program counter (PC).

The remaining registers (r8 to r12 and CPSR) have only restricted access:

A few instructions allow the Hi registers (r8 to r15) to be specified.
The CPSR condition code flags are set by arithmetic and logical operations and control conditional branching.
Figure: Thumb-accessible registers. Lo registers: r0 to r7. Hi registers: r8 to r12, SP (r13), LR (r14) and PC (r15). CPSR. Registers with restricted access are shaded in the original figure.

Thumb-ARM similarities
All Thumb instructions are 16 bits long. They map onto ARM instructions so they inherit many properties of the ARM instruction set:
The load-store architecture.
Support for 8-bit byte, 16-bit half-word and 32-bit word data types.
A 32-bit unsegmented memory.

Thumb-ARM differences
In order to achieve a 16-bit instruction length a number of characteristic features of the ARM IS have been abandoned:
Most Thumb instructions are executed unconditionally. (All ARM instructions are executed conditionally.)
Many Thumb data processing instructions use a 2-address format, in which the destination register is the same as one of the source registers. (ARM data processing instructions, with the exception of the 64-bit multiplies, use a 3-address format.)
Thumb instruction formats are less regular than ARM instruction formats, as a result of the dense encoding.


Thumb Branch Instructions


The ARM branch instructions have a large (24-bit) offset field which clearly will not fit in a 16-bit instruction format. Therefore the Thumb instruction set includes various ways of subsetting the functionality. Typical uses of branch instructions include:

Short conditional branches to control (for example) loop exit.
Medium-range unconditional branches to go to other sections of code.
Long-range subroutine calls.
ARM handles all these with the same instruction, typically wasting many bits of the 24-bit offset in the first two cases. Thumb has to be more efficient, using different formats for each of these cases.
Thumb branch instruction binary encodings (bit positions in brackets):

(1)  B<cond> <label>   [15:12] 1101 | [11:8] cond | [7:0] 8-bit offset            Thumb target
(2)  B <label>         [15:11] 11100 | [10:0] 11-bit offset                       Thumb target
(3)  BL <label>        [15:12] 1111 | [11] H | [10:0] 11-bit offset               Thumb target
(3a) BLX <label>       [15:11] 11101 | [10:1] 10-bit offset | [0] 0               ARM target
(4)  B{L}X Rm          [15:8] 01000111 | [7] L | [6] H | [5:3] Rm | [2:0] 000     ARM or Thumb target
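
To illustrate formats (1) and (2), the sketch below computes the branch target from a 16-bit encoding. It assumes the usual Thumb convention, which the slide does not spell out: the offset field is sign-extended, counts half-words, and is added to the address of the branch plus 4. The function names are hypothetical helpers.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def thumb_branch_target(addr, instr):
    """Return the target of a format (1) or format (2) branch located at addr."""
    if (instr >> 12) == 0b1101 and ((instr >> 8) & 0xF) < 0b1110:
        offset = sign_extend(instr & 0xFF, 8)     # B<cond>: 8-bit offset
    elif (instr >> 11) == 0b11100:
        offset = sign_extend(instr & 0x7FF, 11)   # B: 11-bit offset
    else:
        raise ValueError("not a format (1) or (2) branch")
    # Assumed convention: PC reads as addr + 4, offset is in half-words.
    return (addr + 4 + (offset << 1)) & 0xFFFFFFFF


print(hex(thumb_branch_target(0x8000, 0xE7FE)))  # unconditional branch to itself
print(hex(thumb_branch_target(0x8000, 0xD0FE)))  # conditional branch to itself
```

Under these assumptions the 8-bit form reaches roughly 256 bytes either side of the branch and the 11-bit form roughly 2 Kbytes, which matches the short and medium-range uses listed above.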

Thumb Software Interrupt Instruction


Binary encoding: [15:8] 11011111 | [7:0] 8-bit immediate

The instruction causes the following actions:

The address of the next Thumb instruction is saved in r14_svc.
The CPSR is saved in SPSR_svc.
The processor disables IRQ, clears the Thumb bit and enters supervisor mode by modifying the relevant bits in the CPSR.
The PC is forced to address 0x08.

The ARM instruction SWI handler is then entered. The normal return instruction restores the Thumb execution state.

Assembler format: SWI <8-bit immediate>

The equivalent ARM instruction has an identical assembler syntax; the 8-bit immediate is zero-extended to fill the 24-bit field in the ARM instruction.
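
The entry sequence listed above can be modelled as a small state update. This is a sketch only: the function and dictionary keys are hypothetical, register banking is not modelled, and the constants assume the standard CPSR layout (I at bit 7, T at bit 5, supervisor mode encoded as 10011 in bits 4-0).

```python
I_BIT = 1 << 7        # IRQ disable
T_BIT = 1 << 5        # Thumb state
MODE_MASK = 0x1F      # mode field, bits 4-0
MODE_SVC = 0b10011    # supervisor mode


def thumb_swi_entry(cpsr, swi_addr):
    """Return the state after a Thumb SWI located at swi_addr is taken."""
    return {
        "r14_svc": swi_addr + 2,   # next Thumb instruction (16 bits later)
        "spsr_svc": cpsr,          # old CPSR preserved for the return
        "cpsr": (cpsr & ~(MODE_MASK | T_BIT)) | MODE_SVC | I_BIT,
        "pc": 0x08,                # SWI exception vector
    }


state = thumb_swi_entry(cpsr=0x30, swi_addr=0x8000)  # user mode, Thumb state
print({name: hex(value) for name, value in state.items()})
```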


Thumb Data Processing Instruction (1)


Thumb data processing instructions comprise a highly optimized set of fairly complex formats covering the operations most commonly required by a compiler.
Thumb data processing instruction binary encodings (bit positions in brackets):

(1) ADD|SUB Rd,Rn,Rm              [15:10] 000110 | [9] A | [8:6] Rm | [5:3] Rn | [2:0] Rd
(2) ADD|SUB Rd,Rn,#imm3           [15:10] 000111 | [9] A | [8:6] #imm3 | [5:3] Rn | [2:0] Rd
(3) <Op> Rd/Rn,#imm8              [15:13] 001 | [12:11] Op | [10:8] Rd/Rn | [7:0] #imm8
(4) LSL|LSR|ASR Rd,Rn,#shift      [15:13] 000 | [12:11] Op | [10:6] #sh | [5:3] Rn | [2:0] Rd
(5) <Op> Rd/Rn,Rm/Rs              [15:10] 010000 | [9:6] Op | [5:3] Rm/Rs | [2:0] Rd/Rn
(6) ADD|CMP|MOV Rd/Rn,Rm          [15:10] 010001 | [9:8] Op | [7] D | [6] M | [5:3] Rm | [2:0] Rd/Rn
(7) ADD Rd,SP|PC,#imm8            [15:12] 1010 | [11] SP or PC select | [10:8] Rd | [7:0] #imm8
(8) ADD|SUB SP,SP,#imm7           [15:8] 10110000 | [7] ADD or SUB select | [6:0] #imm7
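
As an illustration of how these fields are picked apart, the following sketch decodes formats (1) and (2) into assembler text. It assumes the field positions of the standard Thumb encoding and that the A bit is 0 for ADD and 1 for SUB; `decode_add_sub` is a hypothetical helper, not part of the lecture.

```python
def decode_add_sub(instr):
    """Decode a format (1) or (2) Thumb instruction into assembler text."""
    if (instr >> 11) != 0b00011:
        raise ValueError("not a format (1) or (2) instruction")
    immediate = (instr >> 10) & 1      # 0: register Rm, 1: 3-bit immediate
    op = "SUB" if (instr >> 9) & 1 else "ADD"   # the A bit (assumed 1 = SUB)
    field = (instr >> 6) & 0b111       # Rm or #imm3
    rn = (instr >> 3) & 0b111
    rd = instr & 0b111
    last = f"#{field}" if immediate else f"r{field}"
    return f"{op} r{rd}, r{rn}, {last}"


print(decode_add_sub(0x1888))
print(decode_add_sub(0x1E63))
```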


Thumb Data Processing Instruction (2)


Instructions that use the Lo general-purpose registers (r0-r7):

ARM instruction              Thumb instruction
MOVS Rd, #<#imm8>            ; MOV Rd, #<#imm8>
MVNS Rd, Rm                  ; MVN Rd, Rm
CMP  Rn, #<#imm8>            ; CMP Rn, #<#imm8>
CMP  Rn, Rm                  ; CMP Rn, Rm
CMN  Rn, Rm                  ; CMN Rn, Rm
TST  Rn, Rm                  ; TST Rn, Rm
ADDS Rd, Rn, #<#imm3>        ; ADD Rd, Rn, #<#imm3>
ADDS Rd, Rd, #<#imm8>        ; ADD Rd, Rd, #<#imm8>
ADDS Rd, Rn, Rm              ; ADD Rd, Rn, Rm
ADCS Rd, Rd, Rm              ; ADC Rd, Rm
SUBS Rd, Rn, #<#imm3>        ; SUB Rd, Rn, #<#imm3>
SUBS Rd, Rd, #<#imm8>        ; SUB Rd, Rd, #<#imm8>
SUBS Rd, Rn, Rm              ; SUB Rd, Rn, Rm
SBCS Rd, Rd, Rm              ; SBC Rd, Rm
RSBS Rd, Rn, #0              ; NEG Rd, Rn


Thumb Data Processing Instruction (3)


Instructions that use the Lo general-purpose registers (r0-r7):

ARM instruction              Thumb instruction
MOVS Rd, Rm, LSL #<#sh>      ; LSL Rd, Rm, #<#sh>
MOVS Rd, Rd, LSL Rs          ; LSL Rd, Rs
MOVS Rd, Rm, LSR #<#sh>      ; LSR Rd, Rm, #<#sh>
MOVS Rd, Rd, LSR Rs          ; LSR Rd, Rs
MOVS Rd, Rm, ASR #<#sh>      ; ASR Rd, Rm, #<#sh>
MOVS Rd, Rd, ASR Rs          ; ASR Rd, Rs
MOVS Rd, Rd, ROR Rs          ; ROR Rd, Rs
ANDS Rd, Rd, Rm              ; AND Rd, Rm
EORS Rd, Rd, Rm              ; EOR Rd, Rm
ORRS Rd, Rd, Rm              ; ORR Rd, Rm
BICS Rd, Rd, Rm              ; BIC Rd, Rm
MULS Rd, Rd, Rm              ; MUL Rd, Rm


Thumb Data Processing Instruction (4)


Instructions that operate with or on the Hi registers (r8-r15), in some cases in combination with a Lo register:

ARM instruction              Thumb instruction
ADD Rd, Rd, Rm               ; ADD Rd, Rm (1/2 Hi regs)
CMP Rn, Rm                   ; CMP Rn, Rm (1/2 Hi regs)
MOV Rd, Rm                   ; MOV Rd, Rm (1/2 Hi regs)
ADD Rd, PC, #<#imm8>         ; ADD Rd, PC, #<#imm8>
ADD Rd, SP, #<#imm8>         ; ADD Rd, SP, #<#imm8>
ADD SP, SP, #<#imm7>         ; ADD SP, SP, #<#imm7>
SUB SP, SP, #<#imm7>         ; SUB SP, SP, #<#imm7>

All the data processing instructions that operate with and on the Lo registers update the condition code bits (the S bit is set in the equivalent ARM instruction).
The instructions that operate with and on the Hi registers do not change the condition code bits, with the exception of CMP, which only changes the condition codes.
The instructions that are indicated above as requiring 1 or 2 Hi regs must have one or both register operands specified in the Hi register area.
#imm3, #imm7 and #imm8 denote 3-, 7- and 8-bit immediate fields respectively. #sh denotes a 5-bit shift amount field.

Thumb Single Register Data Transfer Instructions (1)


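Thumb single register data transfer instruction binary encodings (bit positions in brackets):

(1) LDR|STR{B} Rd,[Rn,#off5]        [15:13] 011 | [12] B | [11] L | [10:6] #off5 | [5:3] Rn | [2:0] Rd
(2) LDRH|STRH Rd,[Rn,#off5]         [15:12] 1000 | [11] L | [10:6] #off5 | [5:3] Rn | [2:0] Rd
(3) LDR|STR{S}{H|B} Rd,[Rn,Rm]      [15:12] 0101 | [11:9] Op | [8:6] Rm | [5:3] Rn | [2:0] Rd
(4) LDR Rd,[PC,#off8]               [15:11] 01001 | [10:8] Rd | [7:0] #off8
(5) LDR|STR Rd,[SP,#off8]           [15:12] 1001 | [11] L | [10:8] Rd | [7:0] #off8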

These instructions are a carefully derived subset of the ARM single register transfer instructions, and have exactly the same semantics as the ARM equivalent. In all cases the offset is scaled to the size of the data type, so the range of the 5-bit offset is 32 bytes in a load or store byte instruction, 64 bytes in a load or store half-word instruction and 128 bytes in a load or store word instruction.
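
The offset scaling can be checked with a few lines of arithmetic. In this sketch the helper name `byte_offset` is an assumption for illustration; it multiplies the 5-bit field by the size of the data type being transferred.

```python
SIZES = {"byte": 1, "half-word": 2, "word": 4}   # data type sizes in bytes


def byte_offset(off5, data_type):
    """Scale a 5-bit offset field by the size of the transferred data type."""
    if not 0 <= off5 < 32:
        raise ValueError("off5 must fit in 5 bits")
    return off5 * SIZES[data_type]


for data_type, size in SIZES.items():
    largest = byte_offset(31, data_type)
    print(f"{data_type}: offsets 0 to {largest} in steps of {size}, "
          f"covering {32 * size} bytes")
```

The largest encodable offsets are therefore 31, 62 and 124 bytes, which correspond to the 32-, 64- and 128-byte ranges quoted above.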


Thumb Single Register Data Transfer Instructions (2)


Thumb Single Register Data Transfer Instructions (3)


Thumb Multiple Register Data Transfer Instructions (1)


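Thumb multiple register data transfer instruction binary encodings (bit positions in brackets):

(1) LDMIA|STMIA Rn!, {<reg list>}     [15:12] 1100 | [11] L | [10:8] Rn | [7:0] reg list
(2) POP|PUSH {<reg list>{,R}}         [15:12] 1011 | [11] L | [10:9] 10 | [8] R | [7:0] reg list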

The block copy forms of the instruction use the LDMIA and STMIA addressing modes. The base register may be any of the Lo registers, and the register list may include any subset of these registers but should not include the base register. The stack forms use SP (r13) as the base register. In addition to the eight registers which may be specified in the register list, the link register (LR, or r14) may be included in the PUSH instruction and PC (r15) may be included in the POP form.
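
The stack forms can be illustrated by expanding an encoding into its register list. This sketch assumes the standard Thumb bit positions (L at bit 11, the extra-register bit R at bit 8, one list bit per Lo register); `decode_push_pop` is a hypothetical helper.

```python
def decode_push_pop(instr):
    """Return (mnemonic, register names) for a Thumb PUSH or POP encoding."""
    if (instr >> 12) != 0b1011 or ((instr >> 9) & 0b11) != 0b10:
        raise ValueError("not a PUSH or POP instruction")
    is_pop = bool((instr >> 11) & 1)              # L bit: 1 = load (POP)
    regs = [f"r{n}" for n in range(8) if (instr >> n) & 1]
    if (instr >> 8) & 1:                          # R bit adds LR or PC
        regs.append("pc" if is_pop else "lr")
    return ("POP" if is_pop else "PUSH"), regs


print(decode_push_pop(0xB510))
print(decode_push_pop(0xBD10))
```

A PUSH that includes LR paired with a POP that includes PC is the usual way a Thumb subroutine saves its return address on entry and returns on exit.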


Thumb Multiple Register Data Transfer Instructions (2)


Thumb Implementation
The Thumb instruction set can be incorporated into a 3-stage pipeline ARM processor macrocell with relatively minor changes. The 5-stage pipeline implementations are trickier. The biggest addition is the Thumb instruction decompressor in the instruction pipeline; this logic translates a Thumb instruction into its equivalent ARM instruction.
Figure: the Thumb instruction decompressor organization. Blocks shown: data in from memory, instruction pipeline, mux (select high or low half-word), Thumb decompressor, mux (select ARM or Thumb stream), ARM instruction decoder, and immediate fields passed to the B operand bus.



Thumb Instruction Mapping


The Thumb decompressor performs a static translation from the 16-bit Thumb instruction into the equivalent 32-bit ARM instruction. This involves:

Performing look-up to translate the major and minor opcodes.
Zero-extending the 3-bit register specifiers to 4-bit specifiers.
Mapping other fields across as required.

Example
Thumb: ADD Rd, #imm8
ARM:   ADDS Rd, Rd, #imm8

The simplicity of the decompression logic is crucial to the efficiency of the Thumb instruction set. There would be little merit in the Thumb architecture if it resulted in complex, slow and power-hungry decompression logic.


Example
Thumb: ADD Rd, #imm8
ARM:   ADDS Rd, Rd, #imm8

Since the only conditional Thumb instructions are branches, the 'always' condition is used in translating all other Thumb instructions.
Whether or not a Thumb data processing instruction should modify the condition codes in the CPSR is implicit in the Thumb opcode; this must be made explicit in the ARM instruction.
The Thumb 2-address format can always be mapped into the ARM 3-address format by replicating a register specifier.
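Figure: mapping of the Thumb instruction onto the ARM instruction (bit positions in brackets).

Thumb instruction (16 bits):
[15:13] 001       major opcode, format 3: MOV/CMP/ADD/SUB with immediate
[12:11] 10        minor opcode denoting ADD & set CC
[10:8]  Rd        destination and source register
[7:0]   #imm8     immediate value

ARM instruction (32 bits):
[31:28] 1110      always condition
[27:26] 00
[25]    1
[24:21] 0100
[20]    1
[19:16] 0 Rd      source register (zero-extended)
[15:12] 0 Rd      destination register (zero-extended)
[11:8]  0000      zero shift
[7:0]   #imm8     immediate value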
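
The mapping for this one instruction can be written out as a short sketch. It covers only the ADD Rd, #imm8 case of format 3 and uses the field positions from the figure; the function name `decompress_add_imm8` is an assumption, and a real decompressor would handle every Thumb format in hardware.

```python
def decompress_add_imm8(thumb):
    """Translate Thumb 'ADD Rd, #imm8' into the ARM 'ADDS Rd, Rd, #imm8' word."""
    if (thumb >> 11) != 0b00110:       # major opcode 001, minor opcode 10
        raise ValueError("not a Thumb ADD Rd, #imm8 instruction")
    rd = (thumb >> 8) & 0b111          # 3-bit specifier, zero-extended below
    imm8 = thumb & 0xFF
    return (
        (0b1110 << 28)                 # 'always' condition
        | (0b00 << 26)                 # data processing class
        | (1 << 25)                    # immediate operand
        | (0b0100 << 21)               # ADD opcode
        | (1 << 20)                    # S bit: set condition codes
        | (rd << 16)                   # source register = Rd (replicated)
        | (rd << 12)                   # destination register = Rd
        | (0b0000 << 8)                # zero shift (rotate) field
        | imm8
    )


print(hex(decompress_add_imm8(0x3105)))  # Thumb ADD r1, #5
```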

Thumb applications
Thumb Properties
The Thumb code requires 70% of the space of the ARM code.
The Thumb code uses 40% more instructions than the ARM code.
With 32-bit memory, the ARM code is 40% faster than the Thumb code.
With 16-bit memory, the Thumb code is 45% faster than the ARM code.
Thumb code uses 30% less external memory power than ARM code.

So where performance is all-important, a system should use 32-bit memory and run ARM code. Where cost and power consumption are more important, a 16-bit memory system and Thumb code may be a better choice.
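
The first two figures are consistent with each other, as a back-of-envelope check shows: 40% more instructions at half the width gives 70% of the size. The sketch below only reproduces that arithmetic; the function name is hypothetical and it assumes every instruction is exactly 16 or 32 bits wide.

```python
ARM_BITS = 32     # width of an ARM instruction
THUMB_BITS = 16   # width of a Thumb instruction


def thumb_size_ratio(instruction_ratio):
    """Thumb code size relative to ARM, given Thumb/ARM instruction counts."""
    return instruction_ratio * THUMB_BITS / ARM_BITS


# 40% more instructions means an instruction count ratio of 1.4.
print(f"{thumb_size_ratio(1.4):.0%} of the ARM code size")
```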

Thumb Systems
A high-end 32-bit ARM system may use Thumb code for certain non-critical routines to save power or memory requirements.
A low-end 16-bit system may have a small amount of on-chip 32-bit RAM for critical routines running ARM code, but use off-chip Thumb code for all non-critical routines.

