14. ARM Architecture and Instruction Sets.
14.1. Introduction
The MOS Technology 6502 was used in many personal computers [121], including the BBC
Computer from ACORN. In the early 1980s ACORN were looking to upgrade their
product and designed a RISC device known as the ACORN RISC Machine, or simply
ARM (version V1). Apple Computers, ACORN and VLSI combined to further develop
the ARM, and it is now known as the Advanced RISC Machine. The ARM business was
converted to sell processor cores, the first product being the ARM6.
The ARM CPU is contained in a range of devices from a range of manufacturers. This has
led to a diverse family line as illustrated in figure 151. Currently the CPU core is known
as Cortex. There are 3 different Cortex cores available: the A range, which is enhanced for
large applications, the R range, enhanced for real time operations, and the M range for
microcontrollers. Within the M series there is the mid range M3, the larger M4 with DSP
(Digital Signal Processing) capability and the smaller M0, which is designed to compete
with 8 and 16 bit microcontrollers [122].
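[Figure: the ARM1, ARM7, ARM9 and ARM11 lead to the Cortex family, which divides into
the A range (A5, A8, A9, A15) for applications, the R range (R4, R5, R7) for real time
and the M range (M0, M3, M4) for microcontrollers.]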
Figure 151. Evolution of the ARM Processors

In the A or applications series the A5 is the entry level device while the A9 is for
multiprocessor operations. The R4 includes high speed interrupt handling capacity for real
time applications, while the R5 and R7 are enhancements of the R4.
Included in the evolution of the CPU core was the evolution of the instruction set, starting
with the ARM V1 that had high performance but low code density, through the Thumb,
which sacrificed performance for code density, to Thumb 2, which achieved both high
performance and high code density.
14.2. The Cortex Core

With the ARM core different manufacturers added their own implementation of features
such as exception and interrupt handling. See figure 152. The Cortex enhanced/expanded
the ARM to include system features such as uniform interrupt handling, system tick timer,
debug system and memory map. See figure 153.
[121] Personal computer here is used in the general sense and applies to the era prior to a
personal computer implying an IBM compatible PC.
[122] Not shown in the figure is the M1 variant, which is designed to be implemented in an
FPGA (field programmable gate array).
Microcontroller Systems: Notes for Google sites. John Kneen
[Figure: blocks of the ARM CPU core: NVIC interface, ETM interface, hardware divider,
32 bit ALU, single cycle 32 bit multiplier, control logic, Thumb/Thumb2 decode,
instruction interface and data interface.]
Figure 152. ARM CPU Core.

The Cortex M3 has a 4GByte memory space that is split into well defined areas for code,
SRAM, peripherals and system peripherals (Figure 154). In contrast to its predecessor the
ARM7, the Cortex M3 has a Harvard bus architecture [123] that allows it to perform
operations in parallel. Further, the Cortex M3 allows unaligned data access.
[Figure: the Cortex CPU core surrounds the ARM core of figure 152 with a configurable
NVIC, DAP, ETM (external trace macro), serial wire viewer, memory protection unit, data
watch points, flash patch and a bus matrix leading to the code interface and the SRAM &
peripheral interface.]
Figure 153. Cortex CPU Core
[123] Separate instruction and data buses.
0xE0000000->0xFFFFFFFF   Internal/Private Peripherals
0xA0000000->0xDFFFFFFF   External Devices
0x60000000->0x9FFFFFFF   External Memory
0x40000000->0x5FFFFFFF   Peripherals
0x20000000->0x3FFFFFFF   SRAM
0x00000000->0x1FFFFFFF   Code/FLASH
Figure 154. Cortex M3 Memory Map
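The memory map can be read as a simple lookup from address to region. The following Python sketch is illustrative only: the name region_of and the table layout are assumptions made for this example, and the external memory region is taken to end at 0x9FFFFFFF so that no two regions overlap.

```python
# Cortex M3 memory regions as (start, end, name), following figure 154.
# Assumption: external memory ends at 0x9FFFFFFF so the regions do not overlap.
REGIONS = [
    (0x00000000, 0x1FFFFFFF, "Code/FLASH"),
    (0x20000000, 0x3FFFFFFF, "SRAM"),
    (0x40000000, 0x5FFFFFFF, "Peripherals"),
    (0x60000000, 0x9FFFFFFF, "External Memory"),
    (0xA0000000, 0xDFFFFFFF, "External Devices"),
    (0xE0000000, 0xFFFFFFFF, "Internal/Private Peripherals"),
]


def region_of(address):
    """Return the name of the region that contains a 32 bit address."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    for start, end, name in REGIONS:
        if start <= address <= end:
            return name
    # The six regions cover the whole 4GByte space, so this is never reached.
    raise AssertionError("address not covered by any region")
```

For example region_of(0x20000800) names the SRAM region, the address used in the load example later in this chapter.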
14.3. The ARM Registers.

One of the characteristics of a RISC machine is that there are a large number of identical
registers. It is possible to perform all operations on any register. If the 6502 of figure 30
followed this approach, the registers A, X and Y could all be used as either accumulators or
index registers [124]. Figure 155 illustrates the registers in the ARM microcontroller.
A second characteristic of RISC machines is that every instruction has the same fixed
length: one byte (8 bit machines), one word (16 bit machines) or one double word (32 bit
machines). In terms of the 6502 the LDA #xx instruction, which occupies two bytes, would
need to be modified to fit into one byte. The equivalent ARM instruction is the mov. With
the ARM processor there is no mov instruction that allows a general [125] 32 bit number to
be directly moved into one of the registers. Likewise there is no mov instruction that
directly specifies a 32 bit source or destination address.
[124] This list would also include the stack pointer, program counter and the status registers.
[125] The word general is used here as it is possible to move small numbers and patterns. For
example there is code for mov #0x33333333 where in this case each nibble of the operand is
the same. However, mov #0x12345678 is not a valid instruction.
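[Figure: R0 to R7 are the general purpose low registers and R8 to R12 the general purpose
high registers. R13 is the stack pointer, with main (MSP) and process (PSP) versions. R14 is
the link register (LR) and R15 the program counter (PC). The special registers are the XPSR,
the interrupt mask registers and the control register.]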
Figure 155. ARM Registers
14.4. Assembly Language Instructions.

The MOV instruction.
The syntax for the move instruction is:
MOV Rd,operand
where Rd is the destination register and operand can be another
register or an immediate number.
Since the instructions are only 32 bits long the mov instruction cannot be used to perform
a data load from memory or to load an arbitrary 32-bit immediate constant into a register
in a single instruction.
The MOV instruction can load:
- Any 16 bit immediate number. Eg 0x1234.
- Any 8 bit constant shifted left by any number. Eg 0x56000000.
- Any 8-bit bit pattern duplicated in all four bytes. Eg 0x33333333.
- Any 8-bit bit pattern duplicated in bytes 0 and 2, with bytes 1 and 3 set to 0. Eg
  0x00560056.
- Any 8-bit bit pattern duplicated in bytes 1 and 3, with bytes 0 and 2 set to 0. Eg
  0x78007800.
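The rules for MOV immediates can be checked mechanically. The Python sketch below is a model of the five forms listed above, not of the actual instruction encoding; the function name fits_mov_immediate is an assumption made for this example.

```python
def fits_mov_immediate(value):
    """Return True if a 32 bit constant matches one of the MOV forms listed."""
    value &= 0xFFFFFFFF
    if value <= 0xFFFF:
        return True                      # any 16 bit immediate number
    low_byte = value & 0xFF
    if value == low_byte * 0x01010101:
        return True                      # pattern duplicated in all four bytes
    if value == low_byte * 0x00010001:
        return True                      # pattern in bytes 0 and 2
    if value == ((value >> 8) & 0xFF) * 0x01000100:
        return True                      # pattern in bytes 1 and 3
    # 8 bit constant shifted left: every set bit lies inside one 8 bit window.
    lowest = (value & -value).bit_length() - 1
    highest = value.bit_length() - 1
    return highest - lowest < 8
```

With this model the five example constants above are accepted, while 0x12345678 is rejected because its set bits span more than 8 positions and it matches none of the repeated patterns.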
Other forms of the MOV instruction are:
MOVT    Load the 16 bit immediate operand into the top 16 bits of the destination
        register.
MVN     Load the complement of the operand into the destination register. MVN is only
        available with the 8 bit operands or the patterns noted above.
MOV32   A pseudo instruction or directive. The assembler will generate two instructions,
        MOV and MOVT, to perform the operation. Eg MOV32 R0,#0x12345678 becomes
        MOV R0,#0x5678 and MOVT R0,#0x1234 [126].
MOVS    Adding an S will change the status bits as appropriate. Eg MOVS R0,R1 will set
        the Z flag if the contents of R1 are zero.
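The MOV32 expansion can be followed with a small register model. This Python sketch is an illustration only; the function names split_mov32, mov and movt are assumptions for this example and simply mirror the behaviour described in the text: MOV with a 16 bit immediate clears the upper half of the register, and MOVT replaces the upper half while leaving the lower half alone.

```python
def split_mov32(value):
    """Split a 32 bit constant into the (MOV, MOVT) 16 bit operands."""
    value &= 0xFFFFFFFF
    return value & 0xFFFF, value >> 16


def mov(register, imm16):
    """MOV Rd,#imm16: the old register contents are lost, upper half is 0."""
    return imm16 & 0xFFFF


def movt(register, imm16):
    """MOVT Rd,#imm16: write the top 16 bits, keep the lower 16 bits."""
    return ((imm16 & 0xFFFF) << 16) | (register & 0xFFFF)
```

Applying mov and then movt with the two halves of 0x12345678 rebuilds the constant. Applying them in the opposite order leaves only 0x5678 in the register, which is why the order of the two generated instructions matters.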
The Load and Store Instructions.
The load LDR and store STR instructions have two forms.
LDR Rt,[Re+offset] / STR Rt,[Re+offset]
In this form the register Re plus the offset defines the effective memory address and Rt
is the target register. Often the program counter is used as Re. Consider the following code:
0x08001234   LDR R0,[pc+0x120]
0x08001238   Next instruction
             Other instructions
             End of function/procedure
0x08001358   DCD 0x12345678
0x0800135C   more data
Figure 156. Sample code fragment using the LDR instruction.
1. The LDR instruction will be decoded as LDR R0,[pc+0x120].
2. The offset 0x120 will be added to the current program counter to give an effective
   address of 0x08001238+0x0120 = 0x08001358.
3. The contents of memory location 0x08001358 are loaded into register R0. R0
   becomes 0x12345678.
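The address arithmetic of step 2 can be written as a one line calculation. The Python sketch below assumes Thumb state, where the program counter used by a pc relative load reads as the address of the LDR instruction plus 4, rounded down to a word boundary. The function name literal_address is an assumption for this example.

```python
def literal_address(instruction_address, offset):
    """Effective address of LDR Rt,[pc+offset] for an LDR at the given address."""
    # The PC reads as the instruction address plus 4, aligned down to a word.
    pc = (instruction_address + 4) & ~0x3
    return pc + offset
```

For the LDR at 0x08001234 with offset 0x120 this gives 0x08001358, matching step 2. The same calculation applied to the 16 bit LDR r3,[pc,#36] at 0x0800002C in figure 166 gives 0x08000054.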
LDR Rd,=Address / STR Rd,=Address
The second form of the load and store instructions is actually a pseudo instruction to the
assembler. Since 32 bit operands are not permitted the assembler generates two
program lines:
- The address or operand is placed in memory using the define constant double
  (DCD) pseudo instruction.
- At run time this Address is loaded into the register using the indirect addressing
  [pc+offset] [127], where the assembler has calculated the offset.
In summary the LDR Rd,=Address will generate the code shown bold in figure 156.
Figure 157 expands the example for the situation where the contents of a memory address
need to be loaded into a register. The address is first loaded into one register as per figure
156 and then this register is used as the pointer or index to the target memory.
;Require contents location 0x20000800 to be loaded into register R1
;Pseudo Code LDR R0,=0x20000800     ;address of location with data
            LDR R0,[PC+offset]      ;R0 = 20000800. Assembler replaces the pseudo
                                    ;code with this line and the DCD below.
            LDR R1,[R0]             ;R1 = 12345678
            Etc
            DCD 0x20000800          ;Offset is measured from the program counter
                                    ;when the offset is calculated.
20000800    DCD 0x12345678
Figure 157. Loading a register with the contents of memory.

Comparison with a CISC machine.
As an example assume it is required to load a value into an IO_register. In C language the
code would be of the form:
IO_register = 0x01234567;
For a CISC this would be implemented with an assembly language instruction of the
form:
MOV IO_register,#0x01234567
This instruction would occupy 3 double words in memory and would take 4 clock cycles
to execute. See figure 158.
[126] Note the order is important. The MOV instruction will clear the upper 2 bytes whereas
the MOVT instruction will not modify the lower 2 bytes.
[127] PC equates to R15.
Cycle 1   Fetch the instruction
Cycle 2   Decode the instruction and read the first operand
Cycle 3   Fetch the second operand
Cycle 4   Execute the instruction
Figure 158. CISC Instruction execution.

Following figure 157 a compiler for the ARM will translate this same operation into the
assembly language code of figure 159.
;Code area of memory
;offset1 = offset from current PC to where address of IO_Register is placed
;offset2 = offset from current PC to where value is placed
        LD      R0,[PC+offset1]
        LD      R1,[PC+offset2]
        ST      R1,[R0 + 0]
;ROM Data
        DCD     Address of IO_Register
        DCD     Value to be stored (0x01234567)
Figure 159. RISC assembly language code to initialise IO_register.

While the work of the compiler for the CISC is relatively straightforward, with the ARM
it is more complex. Both the address of the IO_register and the value to be stored are
placed into ROM at compile time. At run time these are loaded into registers using
indexed addressing where the program counter plus an offset is the index. Part of the
compiler's job is to calculate the value of the offset. The operation is completed
using an indexed store instruction where R0 contains the target address.
The operation as illustrated will require 5 double words in memory [128]. It will take 3
cycles to execute. Note there will be some performance gains if a second IO_register is to
be initialised. There will be no need to reload an address. The store instruction could be
modified to
        ST      R1,[R0 + offset]
where offset is the difference between the new address and the value in R0. This will add
some complexity to the compiler.
[128] This explanation assumes the original ARM instructions. With the Thumb2 instruction
set there will be a reduction in the size of the instructions.
Arithmetic and Logic Operations
Mathematical and logic operations using the ARM are of the following form:
ADD Rd,Rs,operand2
where Rd is the destination register
Rs is one of the source registers or operand1
operand2 is the second operand and could be a register or an
immediate number as summarised with the mov instruction.
Note appending an S to the instruction will cause the flags to be modified as appropriate.
Link and stack operations.
All programs use functions or subroutines. Since the same block of code may be used
many times, subroutines reduce the memory requirements. However there is a
performance penalty in pushing and popping return addresses on and off the stack [129].
The ARM addresses the performance challenge by using an intermediate register known
as the link register, R14. On a jump to subroutine or function call a register to register
transfer occurs, the program counter R15 being transferred to the LR. The operation is
reversed for the return.
void main( )
{
    fn();       BL.W fn       ;LR = ret address1, PC = address fn()
}
void fn( )
{               PUSH {LR}     ;Stack = LR = ret add1
    fn1();      BL.W fn1      ;LR = ret address2, PC = address fn1
}               POP {PC}      ;Stack -> PC
void fn1( )
{
}               BX LR         ;LR -> PC
Figure 160. Saving the return address in the ARM link register.
Consider the example of figure 160. The main program will invoke the function fn( ). The
assembly instruction to do this is BL.W fn. The BL instruction will:
(i) Place the return address into the link register. This is effectively a MOV
R14,R15 operation, a register to register operation which the ARM processor has
been optimised to perform.
(ii) Branch to the function. This is effectively an ADD R15,R15,#offset, another
register instruction the ARM has been optimised to perform.
The first function fn() invokes a second function fn1( ). If the previous procedure is
repeated the return address would be lost, as R14 the link register is overwritten. Hence at
the start of the first function the LR is pushed onto the stack.
If no further functions are called, as in fn1(), the PUSH to save the LR is not required.
To exit from a function, that is to execute a return from subroutine, the instruction is BX
LR. This is effectively a MOV R15,R14 instruction.
[129] For the software, functions have the advantages that they may be written once and
reused, they are easier to test and they make the code more readable.
Since the first function saved the LR on the stack it should be restored before the exit.
The full code will be POP {LR} to restore the LR and BX LR to return from the function.
This operation may be optimised by bypassing the LR and placing the return address
directly into the program counter, that is POP {PC}.
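The nested call sequence of figure 160 can be traced with a very small model. The Python sketch below is a simplification made for illustration: the class Core, its method names and all addresses are hypothetical, pc is treated as the address of the instruction being executed, and the Thumb state bit that the real LR carries in bit 0 is ignored.

```python
class Core:
    """Only the registers needed to follow the link register mechanism."""

    def __init__(self):
        self.pc = 0
        self.lr = 0
        self.stack = []

    def bl(self, target, instruction_size=4):
        self.lr = self.pc + instruction_size  # return address goes to LR
        self.pc = target                      # branch to the function

    def push_lr(self):
        self.stack.append(self.lr)            # PUSH {LR}

    def pop_pc(self):
        self.pc = self.stack.pop()            # POP {PC}

    def bx_lr(self):
        self.pc = self.lr                     # BX LR


def nested_call_demo():
    core = Core()
    core.pc = 0x100          # hypothetical address of BL.W fn in main
    core.bl(0x200)           # main calls fn, LR = 0x104
    core.push_lr()           # fn saves ret address1 before calling fn1
    core.pc = 0x204          # hypothetical address of BL.W fn1 in fn
    core.bl(0x300)           # fn calls fn1, LR is overwritten with 0x208
    core.bx_lr()             # fn1 returns through LR
    after_fn1 = core.pc
    core.pop_pc()            # fn returns through the stacked value
    return after_fn1, core.pc
```

The demo returns to 0x208 inside fn after fn1, and then to 0x104 inside main. Without the push_lr step the second bl would have destroyed the only copy of the return address into main.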
If-then-else operations.
Microcontrollers make decisions. If a test results in a true condition the microcontroller
may execute one path in the program, or if it is false it may execute a second. Traditionally
there can be significant overheads in the assembler code to execute the if-then-else
operations. See figure 161.
;if (R0 == R1)
_if_then_else
        cmp r0,r1
        bne _else
;then { then operations }
        ;handle true condition
        b   _exit
;else { else operations }
_else
        ;handle else condition
_exit
Figure 161. Structure of if-then-else statement.

One of the possibilities with the ARM instructions is conditional execution. For example
MOVEQ R0,R1 is only executed when the Z flag is set. This may be used for an if-then
statement when there is only one conditional instruction.
With the conditional instruction as above the assembler will insert an if-then (IT)
instruction. That is MOVEQ R0,R1 becomes:
        IT EQ
        MOVEQ R0,R1
The IT instruction can be enhanced to include up to four conditional instructions. For
example with 2 true and 2 else instructions sample code will be [130]:
        ITTEE GT
        ADDGT R0,R1
        ADDGT R1,R2
        ADDLE R1,R0
        ADDLE R2,R1
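The link between the IT mnemonic and the condition required on each following instruction can be tabulated. The Python sketch below is illustrative; the function name it_conditions and the table name INVERSE are assumptions for this example. The T and E letters after the initial IT select the stated condition or its inverse.

```python
# Each ARM condition code paired with its logical inverse.
INVERSE = {
    "EQ": "NE", "NE": "EQ", "CS": "CC", "CC": "CS",
    "MI": "PL", "PL": "MI", "VS": "VC", "VC": "VS",
    "HI": "LS", "LS": "HI", "GE": "LT", "LT": "GE",
    "GT": "LE", "LE": "GT",
}


def it_conditions(mnemonic, condition):
    """Return the condition suffix required on each instruction of an IT block."""
    mnemonic = mnemonic.upper()
    condition = condition.upper()
    if not mnemonic.startswith("IT") or len(mnemonic) > 5:
        raise ValueError("expected IT followed by up to three T or E letters")
    if condition not in INVERSE:
        raise ValueError("unknown condition code")
    result = [condition]                 # the first instruction is always 'then'
    for letter in mnemonic[2:]:
        if letter == "T":
            result.append(condition)
        elif letter == "E":
            result.append(INVERSE[condition])
        else:
            raise ValueError("only T and E may follow IT")
    return result
```

For ITTEE GT the result is GT, GT, LE, LE, which matches the suffixes on the four ADD instructions above.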
14.5. Assembly Language Example.

1. The code template or null code.
Figure 162 gives working code that matches the ARM assembler syntax. Starting in
column 1 are the labels used. Across at least one space are the instructions followed by
the operands. In this case there is only one instruction, the B or branch always [131]. The
remaining instructions are all pseudo instructions or directives to the assembler.
[130] Combinations between IT and IT<x><y><z>, where x, y and z are either T or E, are
possible. There must be at least one true conditional instruction. After the first true/then
instruction the then and else may be mixed. For example ITETE is acceptable. The conditions
on the instructions must match the IT instruction.
[131] The assembler will accept the syntax B . where . represents the current program counter.
The B Loop has been used here just for clarity.
                AREA    RESET, DATA, READONLY
                EXPORT  __Vectors
__Vectors       DCD     0
                DCD     Reset_Handler
                AREA    |.text|, CODE, READONLY
                EXPORT  Reset_Handler
Reset_Handler   PROC
;                                       ;<<< Add CODE here
Loop            B       Loop
                ENDP
                                        ;<<<<Add DATA here
                END
Figure 162. Code template block.

The AREA pseudo defines where the next block of code will be located. The first block
will be in the area RESET and it is data that is read only. This area contains all the vectors,
and the assembler will assemble all the data starting at location 0x00000000, the
reset/vector area in the ARM. As given only two vectors are defined. Since this program
does not use a stack the stack pointer is specified as 0x00000000 [132]. At location 0x04 the
ARM expects to find the start address of the program. This is given the label
Reset_Handler. The assembler will determine the value to be placed at address 0x04.
The second AREA is the code. The assembler will generate this code starting at location
0x08000000, the start address of Flash memory in the ARM.
If any label is required by other programs the pseudo EXPORT is used [133].
2. Program to add an array of numbers.
This example will write a fragment of code in assembly to add 16 bytes.
The first step is to draw up a graphical representation of what the program is to do. Only
when this is satisfactory should coding commence. See figure 163.
[132] While in this example the stack pointer is not required, a value is necessary to move
the assembler forward to the next location which is required.
[133] The ARM simulator and debugger on start up will run to the label Reset_Handler so this
label is required. Users cannot substitute their own.
[Flow chart: Initialise all variables to be used. Next: add next number to running sum,
then increment counters and pointers. Done? If more numbers remain return to Next; if yes,
exit.]
Figure 163. Flow chart of proposed program.

Initialising the variables.
The first step is to decide what variables are required. In this example it is anticipated that
4 will be required. One will contain the number of numbers in the array to be added (16),
a second will count how many numbers have been added, the third will hold the partial
result and the fourth will point to the memory location containing the number to be added.
These variables will be arbitrarily assigned to registers R0, R1, R2 and R3 [134].
Since R0, R1 and R2 are initialised to small numbers mov rd,#num may be used.
However R3 will contain the address of the array or table so will be a 32 bit number. The
ldr r3,=Address instruction is used. See section 14.4. The code will be:
        mov r0,#16      ;number of entries in tables
        mov r1,#0       ;counter
        mov r2,#0       ;clear result
        ldr r3,=Tbe     ;address of table
This code may be entered and its operation verified by single stepping in either the
debugger or simulator and observing the register contents.
Adding the next number.
The next task is to add the number pointed to by R3 to the partial sum contained in R2.
The ADD instruction will add the contents of registers so the number must be transferred
to a register for example R4. The full code will be:
Next    ldrb r4,[r3]    ;load value in table
        add r2,r2,r4    ;add to partial result
Incrementing pointer and counter.
In this example the code will be:
        add r3,#1       ;point to next
        add r1,#1       ;how many done
Testing if the operation is complete.
To test if all numbers have been added the counter R1 will be compared with the number
of elements R0 using the compare instruction. If these are equal the Z flag is set. If they
are not equal Z is cleared. When the result is not equal the addition must be repeated. The
required code is:
        cmp r1,r0       ;all done ?
        bne Next        ;no - do next
[134] All designs are an iterative process. At this point the number of variables and their
assignment is not cast in concrete. Changes can be made as necessary.
The array of numbers.
If the data is fixed it will be placed in ROM. Typically it will be coded between the ENDP
and END directives in the code template of figure 162. Possible data might be:
        AREA Table, DATA, READONLY
Tbe     DCB 0x12,0x34,0x56,0x78,0x21,0x43,0x65,0x87
        DCB 0x93,0x39,0x05,0x50,0x75,0x57,0x21,0x81
The table has been given the label Tbe. Each element in the table is a byte defined by
the directive define constant byte DCB. Since each element is a byte it is loaded into R4
using the LDRB instruction and the pointer is incremented by one after each pass.
If the data were 32 bits (DCD directive) the LDR rather than the LDRB instruction would
be used and the pointer (R3) incremented by 4 after each pass.
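Before single stepping the assembly it can help to know what result to expect. The Python sketch below mirrors the register usage of the example one instruction at a time. It is a model for checking the arithmetic, not a replacement for the assembly; the names TBE and add_array are assumptions for this example, and r3 holds an index into the table rather than a real address.

```python
# The 16 bytes defined at label Tbe.
TBE = bytes([0x12, 0x34, 0x56, 0x78, 0x21, 0x43, 0x65, 0x87,
             0x93, 0x39, 0x05, 0x50, 0x75, 0x57, 0x21, 0x81])


def add_array(table, count):
    """Follow the example loop; count must be at least 1, as in the assembly."""
    r0 = count                        # mov r0,#16   number of entries
    r1 = 0                            # mov r1,#0    counter
    r2 = 0                            # mov r2,#0    clear result
    r3 = 0                            # ldr r3,=Tbe  pointer (index here)
    while True:
        r4 = table[r3]                # ldrb r4,[r3]
        r2 = (r2 + r4) & 0xFFFFFFFF   # add r2,r2,r4 (32 bit register)
        r3 += 1                       # add r3,#1
        r1 += 1                       # add r1,#1
        if r1 == r0:                  # cmp r1,r0 / bne Next
            break
    return r2
```

For the table above the loop finishes with R2 = 0x4F3 (1267 decimal). After the first two passes the partial sum is 0x46.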
14.6. STM32F107-ARM Architecture.

The ARM group developed the CPU and debug core and sells the design to chip
developers to use in their products. Over the years the CPU has evolved through different
variants. The Keil MCBSTM32C uses the STMicroelectronics STM32F107VC device.
This device uses the Cortex M3 CPU, which succeeds the ARM7 core and implements the
ARMv7-M architecture.
As illustrated in figure 164 the Cortex M3 core and the debug system are developed by
ARM and licensed to different manufacturers who include in their silicon their own
memory, peripherals and input output features.
[Figure: the Cortex M3 core and debug system connect through an internal bus to the
peripherals, memory, clock & reset and input output blocks.]
Figure 164. Block diagram of the ARM microcontroller.
14.7. Thumb and Thumb2 Instruction Sets.

The original ARM microcontrollers used the 32 bit ARM instruction encoding. To
improve code density later variants used the Thumb instruction set. This instruction set
uses 16 bit encoding of a subset of the ARM instructions. While permitting denser code,
this instruction set often required multiple instructions to duplicate the full ARM
instruction set.
The Thumb2 instruction set was introduced in 2003 and enhances the Thumb instruction
set with additional 32 bit instructions. Overall the Thumb2 gives a code density similar to
the Thumb but with a performance similar to the original ARM. Figure 165 illustrates the
relationship between the ARM, Thumb and Thumb2 instruction sets. As illustrated,
Thumb2 enhances the ARM instruction set with some additional instructions. The
STMicroelectronics STM32F107VC used in the Keil MCBSTM32C EVB uses the
Thumb2 instruction set.
[Figure: overlapping sets labelled ARM Instruction Set, Thumb Instruction Set and Thumb2
Instruction Set.]
Figure 165. ARM and Thumb2 Instruction Sets
14.8. ARM/Thumb2 Instruction Execution.

Single stepping or simulating the program of section 14.5 gives the results of figure 166.
Address      Opcode     Instruction           Explanation/Result
0x08000020   F04F0010   MOV  r0,#0x10         R0 initialised to 0x10. (16 decimal)
0x08000024   F04F0100   MOV  r1,#0x00         R1=0
0x08000028   F04F0200   MOV  r2,#0x00         R2=0
0x0800002C   4B09       LDR  r3,[pc,#36]      Replaces LDR r3,=Tbe where pc+36 = 0x08000054.
                                              At this address is 0x08000058, the address of Tbe.
0x0800002E   781C       LDRB r4,[r3]          R4=0x12, the data at the address given by R3.
0x08000030   4422       ADD  r2,r2,r4         R2=R2+R4 = 0x12 on the 1st pass.
                                              0x12+0x34 = 0x46 on the 2nd pass.
0x08000032   F1030301   ADD  r3,#0x01         R3 = R3+1, the next address in the array.
0x08000036   F1010101   ADD  r1,#0x01         R1 = R1+1, how far into the array counted.
0x0800003A   4281       CMP  r1,r0            Compare registers.
                                              PSR = 0x81000000 not equal.
                                              PSR = 0x61000000 equal.
                                              Z flag is bit 30. See Yiu Table 4.25.
0x0800003C   D1F7       BNE  0x0800002E       Branch to location 0x0800002E if Z flag cleared.
                                              Otherwise continue.
Figure 166. Program trace.
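The two PSR values in the trace can be decoded by picking out the top four bits. The Python sketch below assumes the usual layout of the program status register, with N in bit 31, Z in bit 30, C in bit 29 and V in bit 28; the function names decode_flags and bne_taken are assumptions for this example.

```python
def decode_flags(psr):
    """Extract the N, Z, C and V condition flags from a PSR value."""
    return {
        "N": (psr >> 31) & 1,   # negative
        "Z": (psr >> 30) & 1,   # zero
        "C": (psr >> 29) & 1,   # carry (no borrow after a compare)
        "V": (psr >> 28) & 1,   # overflow
    }


def bne_taken(psr):
    """BNE branches when the Z flag is cleared."""
    return decode_flags(psr)["Z"] == 0
```

For PSR = 0x81000000 only N is set, so Z is clear and the BNE is taken. For PSR = 0x61000000 both Z and C are set, so the loop falls through.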

The first three instructions are full 32 bit instructions. The first instruction commences at
address 0x08000020 and the program counter progresses by 4 bytes after each instruction.
The next 3 instructions are 16 bit instructions so the program counter progresses by 2 after
each instruction.
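Whether an instruction occupies 2 or 4 bytes can be read from its first halfword: in the Thumb2 encoding a halfword whose top five bits are 11101, 11110 or 11111 is the first half of a 32 bit instruction. The Python sketch below uses this rule to reproduce the address column of figure 166. The names is_32bit, instruction_addresses and TRACE are assumptions for this example.

```python
def is_32bit(first_halfword):
    """True if this halfword starts a 32 bit Thumb2 instruction."""
    return (first_halfword >> 11) in (0b11101, 0b11110, 0b11111)


def instruction_addresses(start, first_halfwords):
    """Return the address of each instruction in a straight line sequence."""
    addresses = []
    address = start
    for halfword in first_halfwords:
        addresses.append(address)
        address += 4 if is_32bit(halfword) else 2
    return addresses


# First halfword of each opcode in figure 166, in program order.
TRACE = [0xF04F, 0xF04F, 0xF04F, 0x4B09, 0x781C,
         0x4422, 0xF103, 0xF101, 0x4281, 0xD1F7]
```

Starting from 0x08000020 the computed addresses step by 4 for the three MOV instructions and the two ADD immediates, and by 2 for the others, ending at 0x0800003C for the BNE.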
Each area of code must start with the last nybble of the address at xxxxx0, 4, 8 or C. As
implied in the example one area of code might end at address xxxxx2, 6, A or E, in which
case the assembler will need to add an extra 2 bytes [135].
The ARM Cortex M3 processor has a three stage pipeline, illustrated in figure 167. This
figure represents register to register instructions. Other instructions may take multiple
cycles to execute, which will stall the pipeline. When executing branch instructions the
pipeline will be flushed. The processor will have to fetch instructions from the branch
destination to fill up the pipeline. To reduce the overhead of branch instructions the
processor includes a prediction unit that will calculate the branch address so the
destination address is available if required.
[135] The assembler will give a warning message. Normally no action is required by the
programmer.
Instruction      1       2       3
N                Fetch   Decode  Execution
N+1                      Fetch   Decode     Execution
N+2 (Branch)                     Fetch      Decode     Execution
Next
Branch Taken                                                      Fetch   Decode  Execution
Figure 167. The ARM-Cortex pipeline.

With 16-bit instructions the processor will fetch up to two 16 bit instructions in one go so
the processor may not fetch instructions in every cycle since the next one is already inside
the processor.
The instruction pre-fetch unit of the processor core contains an instruction buffer that
prevents the pipeline being stalled when the instruction sequence contains 32-bit Thumb-2
instructions that are not word aligned [136].
14.9. Review Questions

1. What is a RISC processor?
2. With the ARM instructions of the form mov r0,#0x12345678 are [valid / invalid].
3. The instruction LDR R5,=Mem is a valid [ARM instruction / assembler directive].
4. The instruction MOV R0,#0 [will/will not] set the Z flag.
5. The instruction ADDS R0,R1,R2 [will/will not] set the Z flag.
6. Thumb instructions are .. bits.
7. Thumb instructions are a [subset/super set] of the ARM instructions.
8. Thumb2 instructions are an extension of the Thumb instruction set. [True/false]
9. Thumb2 instructions contain [all/most] of the ARM instruction set.
14.10. Exercises.
1. In the example of section 14.5 one register (R0) was used to contain the size of the
   array while a second (R1) was used to count the elements of the array that had
   been used. When R1=R0 the operations were complete. An alternative approach is
   to set the counter to the size of the array and on each operation decrement the
   counter. Modify the code to use this technique/approach [137].
2. The following data is stored in memory:
   Tbe     DCB 0x12,0x34,0x56,0x78,0x21,0x43,0x65,0x87
           DCB 0x93,0x39,0x05,0x50,0x75,0x57,0x21,0x81
   Using the flowchart of figure 168 as a guide write the code to sort the data and
   place into a second table at address Tbe2.
3. Register-register operations in the ARM microcontroller take one clock cycle.
   With a 50MHz clock each instruction takes 20ns. Some instructions take
   additional cycles. When the branch instruction is taken the pipeline must be
   flushed. Assume that figure 167 describes the timing profile of critical code.
   Estimate the average instruction time.
[136] The last 2 ADD instructions of figure 166 are examples of 32 bit instructions that are
not word aligned.
[137] In undertaking this exercise review the consequences of adding S to the instruction. Ie
ADDS instead of ADD, SUBS instead of SUB etc.
[Flow chart: Counter1 = 16. Counter2 = 15, pointer to first entry. Compare with next entry;
if larger, swap entries. Increment pointer and decrement Counter2; if not done, repeat the
compare. Decrement Counter1; if not done, start another pass.]
Figure 168. Flow chart of proposed program to sort numbers.
14.11. Appendix: ARM Instruction Set.

Mnemonic   Description
ADC        Add with carry
ADD        Add constant or register to register
AND        Logical AND
ASR        Arithmetic shift right
B          Branch conditional or unconditional
BIC        Bit Clear
BKPT       Enter debug state Breakpoint
BL         Branch with link
BLX        Branch with link and exchange
BX         Branch and exchange
CMN        Compare negated register
CMP        Compare constant or register
EOR        Exclusive OR
LDMIA      Load multiple registers from memory
LDR        Load 32-bit word from memory
LDRB       Load byte from memory
LDRH       Load 16-bit half-word from memory
LDRSB      Load signed byte from memory
LDRSH      Load signed 16-bit half-word from memory
LSL        Logical Shift Left
LSR        Logical Shift Right
MOV        Move constant or register to register
MUL        Multiply register
MVN        Move inverted register to register
NEG        Negate
ORR        Logical OR
POP        Load multiple registers from stack
PUSH       Save multiple registers to stack
ROR        Rotate right
SBC        Subtract with Carry
STMIA      Store multiple registers to memory
STR        Store 32-bit word to memory
STRB       Store byte to memory
STRH       Store 16-bit half-word to memory
SUB        Subtract constant or register
SWI        Call software interrupt function
TST        Test bits
Figure 169. ARM Thumb Instruction Set

Mnemonic   Operation                                        Description
ADC        Rd := Rd + Rm + C                                Add with carry
ADD        Rd := Rn + { imm, Rm }                           Add constant or register to register
AND        Rd := Rd & Rm                                    Logical AND
ASR        Rd := (signed) Rm >> { imm5, Rs }                Arithmetic shift right
B          R15 := label                                     Branch conditional or unconditional
BIC        Rd := Rd AND NOT Rm                              Bit Clear
BKPT                                                        Enter debug state Breakpoint
BL         R14 := address of next instruction,              Branch with link
           R15 := label
BLX        R14 := address of next instruction,              Branch with link and exchange
           R15 := Rm AND 0xFFFFFFFE
BX         R15 := Rm AND 0xFFFFFFFE                         Branch and exchange
CMN        Set CPSR flags on: Rn + Rm                       Compare negated register
CMP        Set CPSR flags on: Rn - { imm8, Rm }             Compare constant or register
EOR        Rd := Rd EOR Rm                                  Exclusive OR
LDMIA      Load register list                               Load multiple registers from memory
LDR        Rd := [address][31:0]                            Load 32-bit word from memory
LDRB       Rd := ZeroExtend ([address][7:0])                Load byte from memory
LDRH       Rd := ZeroExtend ([address][15:0])               Load 16-bit halfword from memory
LDRSB      Rd := SignExtend ([address][7:0])                Load signed byte from memory
LDRSH      Rd := SignExtend ([address][15:0])               Load signed 16-bit halfword from memory
LSL        Rd := Rm << { imm5, Rs }                         Logical Shift Left
LSR        Rd := (unsigned) Rm >> { imm5, Rs }              Logical Shift Right
MOV        Rd := { imm, Rm }                                Move constant or register to register
MUL        Rd := Rm * Rs                                    Multiply register
MVN        Rd := NOT Rm                                     Move inverted register to register
NEG        Rd := -Rm                                        Negate
ORR        Rd := Rd OR Rm                                   Logical OR
POP        Pop register list                                Load multiple registers from stack
PUSH       Push register list                               Save multiple registers to stack
ROR        Rd := Rd ROR Rs[7:0]                             Rotate right
SBC        Rd := Rd - Rm - NOT C                            Subtract with Carry
STMIA      Store register list                              Store multiple registers to memory
STR        [address][31:0] := Rd                            Store 32-bit word to memory
STRB       [address][7:0] := Rd[7:0]                        Store byte to memory
STRH       [address][15:0] := Rd[15:0]                      Store 16-bit halfword to memory
SWI        Software interrupt                               Call software interrupt function
SUB        Rd := Rn - { imm, Rm }                           Subtract constant or register
TST        Set CPSR flags on: Rn AND Rm                     Test bits