
Computer Architecture

Interlude: Example ISA ARM

Nov 14, 2013 Josef Weidendorfer (replacement today for Prof. Gerndt)

Chair for Computer Architecture LRR-TUM (Informatik 10)

Technische Universität München

ISAs (still) Used Today


CISC: x86 = IA-32 (Intel); x86_64 = Intel64 = amd64 (AMD/Intel); s390 (IBM mainframes)
RISC: MIPS, SPARC, PowerPC, ARM
VLIW: Itanium (EPIC, Intel)
Old ISAs: 680x0 (old CISC, Motorola); PA-RISC, Alpha (old RISC)
WS12/13


ARM, The Company


ARM Limited, formerly Advanced RISC Machines, originally Acorn RISC Machine, UK
Designs RISC processors for embedded systems
Two licence models:
cores ready for production with a given manufacturing process (e.g. 28nm TSMC)
cores as VHDL code
fee per core + fee per built device

for easy integration with other parts into custom chips for the embedded market: SoC
includes GPU, network, controllers, ...

Market for ARM Processors


> 1 billion cores in Q4 2009
> 60% in mobile sets
only 7% are application processors (running Linux/Android/iOS/WP8, responsible for GUI)
other cores are part of controllers for GSM, network, GPS, Bluetooth, WLAN, ...


Processors with ARM ISA


Adhere to an ISA version: ARMv6, ARMv7, ...
ARM ISA is extensible: opcode space reserved for co-processors
customers can add their own co-processors
ARM provides optional co-processors in its own implementations
examples:
floating point co-processor VFP-3
vector co-processor NEON

Different encoding schemes


ARM: 32-bit
Thumb 1/2: 16-bit / + mixed 32-bit
Jazelle (native execution of Java bytecode)


ISA Versions, Implementations & Features


SoCs using ARM Implementations


Samsung S5L8900
used in iPod Touch 1G, original iPhone
has an ARM1176JZ-S core
family ARM11, adhering to ARMv6 ISA

Apple A5
used in iPad 2, iPhone 4S
manufactured by Samsung
Cortex A9, ARMv7 ISA

Others
Texas Instruments OMAP3/4, NVIDIA Tegra2/3

Example: OMAP3 / BeagleBoard


ARMv7 ISA
RISC (Reduced Instruction Set Computing)
few, simple instructions (~32)
fixed format (4 bytes for 32-bit encoding)

load/store architecture
explicit instructions for memory access
simple + indexed addressing modes
multiple load/store with modifying index

arithmetic/logic operations combined with barrel shifter
conditional execution



ARM Register File


16 general registers (r0 - r15)
some duplicated e.g. for interrupt mode

Special
r13: stack pointer
r14: link register (return address for function calls)
r15: program counter
CPSR: status register (+ SPSR saved)
including N/Z/C/V
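
To make the N/Z/C/V bits concrete, here is a minimal Python sketch of how a 32-bit comparison (a subtraction whose result is discarded) could set them. The helper name `cmp_flags` is hypothetical and not from the slides; the sketch assumes the ARM convention that C means "no borrow" after a subtraction.

```python
MASK32 = 0xFFFFFFFF

def cmp_flags(rn: int, op2: int) -> dict:
    """Model the flags produced by CMP rn, op2 (computes rn - op2 on 32 bits)."""
    rn &= MASK32
    op2 &= MASK32
    result = (rn - op2) & MASK32
    n = result >> 31                     # N: copy of result bit 31
    z = int(result == 0)                 # Z: result is zero
    c = int(rn >= op2)                   # C: no borrow (unsigned rn >= op2)
    # V: signed overflow, i.e. operands differ in sign and the
    # result's sign differs from rn's sign
    v = (((rn ^ op2) & (rn ^ result)) >> 31) & 1
    return {"N": n, "Z": z, "C": c, "V": v}
```

With this model, the EQ condition corresponds to Z == 1 and NE to Z == 0.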

NEON
separate vector registers: 32 x 64-bit / 16 x 128-bit

Data Processing Instructions


Consist of:
Arithmetic: ADD ADC SUB SBC RSB RSC
Logical: AND ORR EOR BIC
Comparisons: CMP CMN TST TEQ
Data movement: MOV MVN

Work on registers, NOT memory. Syntax:

<Operation>{<cond>}{S} Rd, Rn, Operand2


Comparisons set flags only - they do not specify Rd
Data movement does not specify Rn
Second operand is sent to the ALU via barrel shifter.
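
As an illustration of the second operand passing through the barrel shifter, the following Python sketch models an instruction such as `ADD r0, r1, r2, LSL #2`. The function names `barrel_shift` and `add_shifted` are hypothetical, and the model is restricted to immediate shift amounts 0..31; it ignores flag updates and the special encodings (e.g. RRX).

```python
MASK32 = 0xFFFFFFFF

def barrel_shift(value: int, kind: str, amount: int) -> int:
    """Apply an ARM-style shift to a 32-bit value (amount limited to 0..31 here)."""
    if not 0 <= amount <= 31:
        raise ValueError("this sketch only models shift amounts 0..31")
    value &= MASK32
    if amount == 0:
        return value
    if kind == "LSL":                    # logical shift left
        return (value << amount) & MASK32
    if kind == "LSR":                    # logical shift right, zero fill
        return value >> amount
    if kind == "ASR":                    # arithmetic shift right, sign fill
        shifted = value >> amount
        if value >> 31:
            shifted |= (MASK32 << (32 - amount)) & MASK32
        return shifted
    if kind == "ROR":                    # rotate right
        return ((value >> amount) | (value << (32 - amount))) & MASK32
    raise ValueError(f"unknown shift kind: {kind}")

def add_shifted(rn: int, rm: int, kind: str = "LSL", amount: int = 0) -> int:
    """Model ADD Rd, Rn, Rm, <kind> #amount: shift Operand2 first, then add."""
    return (rn + barrel_shift(rm, kind, amount)) & MASK32
```

For example, `add_shifted(r1, r1, "LSL", 2)` computes r1 * 5 in what would be a single data processing instruction.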

Conditional execution examples


C source code:
    if (r0 == 0) { r1 = r1 + 1; } else { r2 = r2 + 1; }

ARM instructions, unconditional:
        CMP r0, #0
        BNE else
        ADD r1, r1, #1
        B end
else    ADD r2, r2, #1
end     ...
5 instructions, 5 words, 5 or 6 cycles

ARM instructions, conditional:
        CMP r0, #0
        ADDEQ r1, r1, #1
        ADDNE r2, r2, #1
        ...
3 instructions, 3 words, 3 cycles
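
The two instruction sequences can be compared with a small Python model. This is only a sketch: the function names are hypothetical, it counts instructions that pass through the pipeline rather than cycles, and it does not model branch penalties (which is where the "5 or 6 cycles" of the branching version would come from).

```python
def run_branching(r0: int, r1: int, r2: int):
    """Model CMP / BNE / ADD / B / ADD; returns (r1, r2, instructions executed)."""
    executed = 1                 # CMP r0, #0
    z = (r0 == 0)                # Z flag set when r0 equals 0
    executed += 1                # BNE else
    if z:
        r1 += 1                  # ADD r1, r1, #1
        executed += 2            # the ADD plus the following B end
    else:
        r2 += 1                  # else: ADD r2, r2, #1
        executed += 1
    return r1, r2, executed

def run_conditional(r0: int, r1: int, r2: int):
    """Model CMP / ADDEQ / ADDNE; both ADDs are always fetched, one takes effect."""
    z = (r0 == 0)                # CMP r0, #0
    fetched = 1
    fetched += 1                 # ADDEQ r1, r1, #1
    if z:
        r1 += 1
    fetched += 1                 # ADDNE r2, r2, #1
    if not z:
        r2 += 1
    return r1, r2, fetched
```

Both variants produce the same register values; the conditional one does so without any change of control flow.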

ARM Branches and Subroutines


B <label>
PC-relative, ±32 MB range.
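
The 32 MB range follows from the encoding: in the 32-bit ARM encoding, B carries a signed 24-bit word offset, and the PC reads as the instruction address plus 8. A minimal Python sketch of that arithmetic (the function names are hypothetical; addresses are assumed to be word aligned):

```python
def encode_branch_offset(pc: int, target: int) -> int:
    """Return the 24-bit immediate of an ARM-state B located at address pc."""
    offset = target - (pc + 8)           # PC reads as instruction address + 8
    if offset % 4 != 0:
        raise ValueError("target must be word aligned relative to pc")
    words = offset >> 2                  # offset is stored in units of 4 bytes
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target out of range for B")
    return words & 0xFFFFFF              # two's complement, 24 bits

def decode_branch_target(pc: int, imm24: int) -> int:
    """Inverse of encode_branch_offset: sign-extend, shift left by 2, add PC + 8."""
    words = imm24 - (1 << 24) if imm24 & (1 << 23) else imm24
    return pc + 8 + (words << 2)
```

A signed 24-bit word offset covers 2^23 words in each direction, i.e. 2^23 * 4 bytes = 32 MB.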

BL <subroutine>
Stores return address in LR
Returning implemented by restoring the PC from LR
For non-leaf functions, LR will have to be stacked

(caller)
        :
        BL func1
        :
func1
        STMFD sp!, {regs,lr}
        :
        BL func2
        :
        LDMFD sp!, {regs,pc}
func2
        :
        MOV pc, lr
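
Why LR has to be stacked in a non-leaf function can be traced with a small Python model of a full descending stack. This is a sketch only: `stmfd`, `ldmfd` and `func1_demo` are hypothetical helper names, memory is a plain dictionary, and the addresses are made up for illustration.

```python
def stmfd(memory: dict, sp: int, values: list) -> int:
    """Push values onto a full descending stack; return the updated sp."""
    sp -= 4 * len(values)                # stack grows towards lower addresses
    for i, value in enumerate(values):   # first listed register at lowest address
        memory[sp + 4 * i] = value
    return sp

def ldmfd(memory: dict, sp: int, count: int):
    """Pop count words from a full descending stack; return (values, updated sp)."""
    values = [memory[sp + 4 * i] for i in range(count)]
    return values, sp + 4 * count

def func1_demo(return_address: int):
    """Trace func1: save lr, call func2 (which overwrites lr), return via pc."""
    memory = {}
    sp = 0x1000                          # hypothetical initial stack pointer
    r4, lr = 42, return_address
    sp = stmfd(memory, sp, [r4, lr])     # STMFD sp!, {r4, lr}
    lr = 0x9008                          # BL func2 overwrites lr (made-up address)
    r4 = 0                               # func1 body is free to clobber r4
    (r4, pc), sp = ldmfd(memory, sp, 2)  # LDMFD sp!, {r4, pc}
    return r4, pc, sp
```

Loading the saved LR value directly into PC restores the registers and returns to the caller in one instruction.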


Thumb 1/2: not really RISC any longer


Code size is important for power consumption
Thumb encoding uses 16-bit instructions
Possible by reducing flexibility: fewer bits for immediates, no conditional execution
Fast switch between encoding modes:
Bit 0 of the address written to PC specifies the mode
Switch modes via jumps
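
The mode switch can be sketched in Python as follows. The function name `bx` is borrowed from the branch-and-exchange instruction, but the model itself is a simplified assumption: it only shows that bit 0 of the jump target selects the encoding and is not part of the resulting address.

```python
def bx(target: int):
    """Model an interworking jump: returns (new pc, thumb mode flag)."""
    thumb = bool(target & 1)     # bit 0 set: continue in Thumb encoding
    pc = target & ~1             # bit 0 is cleared in the actual address
    return pc, thumb
```

Because instructions are at least halfword aligned, bit 0 of a code address is otherwise unused, which is what makes this trick possible.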

Thumb-2: add some 32bit instructions


flexible, without need to switch modes
adds bit field manipulation, conditional execution for a set of following instructions

default for GCC with ARMv7


Current ARM implementations


Cortex A8 (ARMv7)
Single core, 16-32 kB L1I/D, 0-1 MB L2 (only L2 cache controller part of core), up to 1 GHz
13-stage superscalar (dual-issue) pipeline, in-order, branch prediction, VFP-3/NEON optional
Apple A4, OMAP3 (BeagleBoard)

Cortex A9
1-4 cores, 16-64 kB L1I/D, 0-8 MB L2
speculative out-of-order
Apple A5, OMAP4 (PandaBoard), Tegra2, Tegra3 (Nexus 7)

Cortex A15
more aggressive OoO, more stages, VFP-4
OMAP5, Samsung Exynos 5x (Nexus 10)
can compete with current x86 Atom processors


ARMv8: 64bit for Servers


new 64-bit mode: AArch64
still supports 32-bit = AArch32
completely new instruction set:
instructions still 32-bit
operands and addresses 64-bit
31 general registers with 64 bits
conditional execution removed (good branch prediction gives better performance ?!)
mandatory extended NEON: 32 regs (128-bit)

for massively parallel server loads (simple, but lots of cores?)
implementations expected for 2014 by AMD, Broadcom, Samsung, ...


More information
ARM Architecture Reference Manual (ARM ARM)
CRE podcast about ARM, Wikipedia
@TUM: new introductory lab course with recently sponsored BeagleBoards (~40)
inauguration event on Nov 23, 2013 with representatives from ARM, TI, STMicro


TOP 500: November 2012


[Table: TOP500 ranking of November 2012, with each system's rank in June 2012; several entries marked NEW]


