
Advanced Computer Architecture

Conditions of Parallelism
The exploitation of parallelism in computing
requires understanding the basic theory associated
with it. Progress is needed in several areas:
computation models for parallel computing
interprocessor communication in parallel architectures
integration of parallel systems into general environments
Data and Resource Dependencies
Program segments cannot be executed in parallel unless
they are independent.
Independence comes in several forms:
Data dependence: data modified by one segment must not be
modified by another parallel segment.
Control dependence: if the control flow of segments cannot be
identified before run time, then the data dependence between the
segments is variable.
Resource dependence: even if several segments are independent
in other ways, they cannot be executed in parallel if there aren’t
sufficient processing resources (e.g. functional units).
Dependence Graph
A dependence graph is used to describe the
relations between the statements.
The nodes correspond to the program statements,
and directed edges with different labels show the
ordering relations among the statements.
Data Dependence - 1
The ordering relationship between statements is
shown by data dependence. The following are the five
types of data dependence:
Flow dependence (denoted S1 → S2): a statement S2 is
flow-dependent on statement S1 if an execution path
exists from S1 to S2 and if at least one output of S1
is used as an input to S2.
Antidependence (denoted by a crossed arrow, S1 ↛ S2):
statement S2 is antidependent on statement S1 if S2
follows S1 in program order and if the output of S2
overlaps the input of S1.
Data Dependence - 2
Output dependence (denoted by S1 → S2 with a small
circle on the arrow): two statements are output-dependent
if they produce the same output variable.
I/O dependence (denoted by an arrow labeled I/O): Read
and Write are I/O statements. I/O dependence occurs
when the same file is referenced by both I/O
statements.
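As a small illustration (the statements are hypothetical, not from the slides), the following C snippet marks one dependence of each kind:

/* Hypothetical statements showing flow, anti-, and output dependence. */
int a, b, c, d;

void dependence_demo(void) {
    a = b + c;   /* S1: writes a                                        */
    d = a * 2;   /* S2: reads a             => flow dependence S1 -> S2 */
    b = d - 1;   /* S3: writes b read by S1 => antidependence S1 -> S3  */
    d = b + c;   /* S4: rewrites d of S2  => output dependence S2 -> S4 */
}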
Data Dependence - 3
Unknown dependence: the dependence relation cannot be
determined at compile time in cases such as the following
(each case is sketched in the code after this list):
The subscript of a variable is itself subscripted.
The subscript does not contain the loop index variable.
A variable appears more than once with subscripts
having different coefficients of the loop variable (that is,
different functions of the loop variable).
The subscript is nonlinear in the loop index variable.
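A minimal sketch of the four cases (hypothetical loops; array bounds and the contents of idx are assumed to be valid):

/* Hypothetical loops whose subscripts a compiler cannot analyze. */
void unknown_dep(int *a, const int *idx, int k, int n) {
    for (int i = 0; i < n; i++)
        a[idx[i]] = a[i] + 1;     /* subscript is itself subscripted  */
    for (int i = 0; i < n; i++)
        a[k] = a[k] * 2;          /* subscript lacks the loop index i */
    for (int i = 0; 3*i < n; i++)
        a[2*i] = a[3*i] + 1;      /* different coefficients of i      */
    for (int i = 0; i*i < n; i++)
        a[i*i] = 0;               /* nonlinear in the loop index      */
}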
Parallel execution of program segments which do
not have total data independence can produce
non-deterministic results.
I/O dependence example
S1: Read (4), A(I)
S2: Rewind (4)
S3: Write (4), B(I)
S4: Rewind (4)
S1 and S3 are I/O-dependent (S1 I/O→ S3), since both
reference the same file (unit 4), so they cannot be
reordered or executed in parallel.
Control Dependence
Control dependences arise when the order of
execution of statements cannot be determined
before run time.
For example, conditional statements (IF) are not
resolved until run time. Different paths taken
after a conditional branch may introduce or
eliminate data dependences among instructions.
Dependence may also exist between operations
performed in successive iterations of a loop.
Control Dependence
The successive iterations of the following loop are
control-independent:
for (i=0;i<n;i++) {
a[i] = c[i];
if (a[i] < 0) a[i] = 1;
}
Control Dependence
The successive iterations of the following loop are
control-dependent, since the condition in iteration i
depends on a value that iteration i−1 may have written:
for (i=1;i<n;i++) {
if (a[i-1] < 0) a[i] = 1;
}
Compiler techniques are needed to get around
control dependence limitations.
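One such technique, shown here as an illustrative sketch (if-conversion; the slides do not name a specific method), replaces the branch in the first loop with a conditional select, turning the control dependence into a data dependence:

/* If-conversion: the IF becomes a select, so no branch outcome has to
   be known before the iterations are scheduled.                      */
void if_converted(int *a, const int *c, int n) {
    for (int i = 0; i < n; i++) {
        a[i] = c[i];
        a[i] = (a[i] < 0) ? 1 : a[i];   /* predicated form of the IF */
    }
}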
Resource Dependence
Data and control dependences are based on the
independence of the work to be done.
Resource dependence is concerned with
conflicts in using shared resources, such as
registers, integer and floating-point ALUs, etc.
ALU conflicts are called ALU dependence.
Memory (storage) conflicts are called storage
dependence.
The transformation of a sequentially coded
program into a parallel executable form can be
done manually by the programmer, using explicit
parallelism, or automatically by a compiler
detecting implicit parallelism.
In both approaches, the decomposition of the
program is the primary objective.
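A sketch of the explicit route (OpenMP is chosen here only as a familiar notation; the slides do not prescribe one):

/* Explicit parallelism: the programmer asserts that the iterations are
   independent. Without OpenMP support the pragma is simply ignored.  */
void vector_add(double *c, const double *a, const double *b, int n) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}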
Bernstein’s Conditions - 1
Bernstein revealed a set of conditions which
must hold if two processes are to execute in parallel.
Notation:
Let P1 and P2 be two processes.
Ii is the set of all input variables for process Pi.
Oi is the set of all output variables for process Pi.
If P1 and P2 can execute in parallel (written
P1 || P2), then:
I1 ∩ O2 = ∅
I2 ∩ O1 = ∅
O1 ∩ O2 = ∅
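A minimal sketch of the test, with each variable encoded as one bit of a set (the encoding and the example processes are hypothetical):

#include <stdbool.h>

/* Bernstein's conditions over bitmask variable sets. */
bool can_run_in_parallel(unsigned i1, unsigned o1,
                         unsigned i2, unsigned o2) {
    return (i1 & o2) == 0 &&   /* I1 ∩ O2 = ∅ */
           (i2 & o1) == 0 &&   /* I2 ∩ O1 = ∅ */
           (o1 & o2) == 0;     /* O1 ∩ O2 = ∅ */
}

/* Example: P1: C = A + B  =>  I1 = {A,B}, O1 = {C}
            P2: D = A * E  =>  I2 = {A,E}, O2 = {D}
   All three intersections are empty, so P1 || P2
   (the shared input A is allowed).                 */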
Bernstein’s Conditions - 2
In terms of data dependencies, Bernstein’s
conditions imply that two processes can execute in
parallel if they are flow-independent, ant
independent, and output-independent.
The parallelism relation || is commutative (Pi || Pj
implies Pj || Pi ), but not transitive (Pi || Pj and Pj || Pk
does not imply Pi || Pk ) . Therefore, || is not an
equivalence relation.
Intersection of the input sets is allowed.
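For example (hypothetical statements): with P1: A = B + C, P2: G = E + F, and P3: B = D + 1, Bernstein’s conditions give P1 || P2 and P2 || P3, but not P1 || P3, because P3 writes B while P1 reads it.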
Hardware and Software Parallelism
For the implementation of parallelism, special
hardware and software support is needed.
Compilation support is also needed to close the gap
between hardware and software.
Hardware Parallelism
Hardware parallelism is defined by machine architecture
and hardware multiplicity.
Hardware parallelism is a function of cost and performance
trade-offs.
It displays the resource utilization patterns of
simultaneously executable operations.
It can also indicate the peak performance of the
processor resources.
It can be characterized by the number of
instructions that can be issued per machine cycle.
If a processor issues k instructions per machine
cycle, it is called a k-issue processor.
Conventional processors are one-issue machines.
Examples: the Intel i960CA is a three-issue processor
(arithmetic, memory access, branch); the IBM RS/6000
is a four-issue processor (arithmetic, floating-point,
memory access, branch).
A machine with n k-issue processors should be
able to handle a maximum of nk threads
simultaneously.
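For example, four 2-issue processors could issue at most 4 × 2 = 8 instructions per cycle.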
Software Parallelism
Software parallelism is defined by the control and
data dependence of programs, and is revealed in
the program’s flow graph.
It is a function of algorithm, programming style, and
compiler optimization.
The program flow graph displays the patterns of
simultaneously executable operations.
Parallelism in a program varies during the
execution period.
Mismatch between software and hardware parallelism - 1
Maximum software parallelism (L = load, X/+/− = arithmetic):
Cycle 1: L1 L2 L3 L4
Cycle 2: X1 X2
Cycle 3: +  −
yielding the two results A and B (eight operations in three cycles).
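The figures appear to correspond to a computation of this shape (variable names and the use of multiplication for X1/X2 are assumed for illustration):

/* Eight operations: four loads, two multiplies, an add, a subtract. */
void mismatch_example(const double in[4], double *A, double *B) {
    double l1 = in[0], l2 = in[1], l3 = in[2], l4 = in[3];  /* L1..L4 */
    double x1 = l1 * l2;                                    /* X1 */
    double x2 = l3 * l4;                                    /* X2 */
    *A = x1 + x2;                                           /* +  */
    *B = x1 - x2;                                           /* -  */
}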
Mismatch between software and hardware parallelism - 2
The same problem, but considering the parallelism available
on a two-issue superscalar processor:
Cycle 1: L1
Cycle 2: L2
Cycle 3: X1 L3
Cycle 4: L4
Cycle 5: X2
Cycle 6: + (A)
Cycle 7: − (B)
Mismatch between software and hardware parallelism - 3
The same problem, with two single-issue processors.
The stores S1, S2 and loads L5, L6 are inserted for
interprocessor synchronization:
Cycle 1: L1 L3
Cycle 2: L2 L4
Cycle 3: X1 X2
Cycle 4: S1 S2
Cycle 5: L5 L6
Cycle 6: + (A)  − (B)
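A rough sketch of that two-processor schedule (the shared variables and the lock-step timing are idealizations taken from the figure, not a real synchronization protocol):

/* Each processor computes one product, stores it (S1/S2), loads the
   other's product (L5/L6), then finishes its own add or subtract.
   Real code would need a barrier between the store and the load.   */
double x1_shared, x2_shared;

double proc1(double l1, double l2) {     /* processor 1 */
    double x1 = l1 * l2;                 /* L1, L2, X1  */
    x1_shared = x1;                      /* S1          */
    double x2 = x2_shared;               /* L5          */
    return x1 + x2;                      /* +  => A     */
}

double proc2(double l3, double l4) {     /* processor 2 */
    double x2 = l3 * l4;                 /* L3, L4, X2  */
    x2_shared = x2;                      /* S2          */
    double x1 = x1_shared;               /* L6          */
    return x1 - x2;                      /* -  => B     */
}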
Types of Software Parallelism
Control Parallelism – two or more operations can
be performed simultaneously. This can be
detected by a compiler, or a programmer can
explicitly indicate control parallelism by using
special language constructs or dividing a program
into multiple processes.
Control parallelism appears in the form of pipelining
or multiple functional units, and is limited by the pipeline
length and by the multiplicity of functional units.
Both are handled by the hardware.
Types of Software Parallelism
Data parallelism – multiple data elements have the
same operations applied to them at the same time.
This offers the highest potential for concurrency
(in SIMD and MIMD modes). Synchronization in
SIMD machines is handled by the hardware.
Data-parallel code is easier to write and to
debug.
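A minimal sketch (function and names assumed): the same operation is applied to every element, and no iteration depends on another:

/* Data-parallel loop: SIMD hardware can apply the operation to many
   elements in lock step because the iterations are independent.    */
void scale(float *y, const float *x, float k, int n) {
    for (int i = 0; i < n; i++)
        y[i] = k * x[i];
}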
Solving the Mismatch Problems
Develop compilation support
Redesign hardware for more efficient exploitation
by compilers
Use large register files and sustained instruction
pipelining.
Have the compiler fill the branch and load delay
slots in code generated for RISC processors.
The Role of Compilers
Compilers are used to exploit hardware features to
improve performance.
Interaction between compiler and architecture
design is a necessity in modern computer
development.
It is not necessarily the case that more software
parallelism will improve performance in
conventional scalar processors.
The hardware and compiler should be designed at
the same time.
