TU Dortmund,
Dortmund, Germany
2011 08 08
These slides use Microsoft clip arts.
Microsoft copyright restrictions apply.
P. Marwedel, 2011
Disappearing computer,
Ubiquitous computing,
Pervasive computing,
Ambient intelligence,
Post-PC era,
Cyber-physical systems.
Basic technologies:
Embedded Systems
Communication technologies
p. marwedel,
informatik 12, 2011
- 2-
- 3-
- 4-
Multiple networks
Multiple networked
processors
p. marwedel,
informatik 12, 2011
- 5-
P. Marwedel, 2011
- 6-
p. marwedel,
informatik 12, 2011
- 7-
- 8-
Characteristics of cyber-physical/
embedded systems
Must be dependable
Must be efficient
energy, code-size,
run-time, weight,
cost efficient
Dedicated towards a certain application
Dedicated user interface
Many CPS/ES must meet real-time constraints
Connected to physical environment
Hybrid systems (analog + digital parts).
Typically, CPS/ES are reactive systems:
(In continual interaction with is environment)
p. marwedel,
informatik 12, 2011
- 9-
P. Marwedel, 2011
p. marwedel,
informatik 12, 2011
- 10 -
Real-time constraints
CPS must meet real-time constraints
A real-time system must react to stimuli from the
controlled object (or the operator) within the time
interval dictated by the environment.
For real-time systems, right answers arriving too
late are wrong.
A real-time constraint is called hard, if not
meeting that constraint could result in a
catastrophe [Kopetz, 1997].
All other time-constraints are called soft.
A guaranteed system response has to be explained
without statistical arguments
p. marwedel,
informatik 12, 2011
- 11 -
- 12 -
Application Knowledge
2: Specification &
Modeling
Design
repository
3: ES-hardware
Design
8: Test *
6: Application
mapping
4: System
software (RTOS,
middleware, )
7: Optimization
5: Evaluation &
Validation (energy, cost,
performance, )
* Could be
integrated
into loop;
not included
in the
current
course
Generic loop: tool chains differ in the number and type of iterations
Numbers denote sequence of chapters
p. marwedel,
informatik 12, 2011
- 13 -
Specification
- 14 -
Models
p. marwedel,
informatik 12, 2011
- 15 -
Specification of CPS/ES:
Requirements for models
Hierarchy
Compositional behavior
Timing behavior
State-oriented behavior
Event-handling
Concurrency
Synchronization and communication
Presence of programming elements
Executability
Support for the design of large systems
No obstacles for efficient implementation
Domain-specific support
Non-functional properties
p. marwedel,
informatik 12, 2011
No single
model
will meet
all
requirements
compromises
- 16 -
Models of computation
- Definition -
C-1
C-2
p. marwedel,
informatik 12, 2011
- 17 -
Communication
Shared memory
Comp-1
memory
Comp-2
p. marwedel,
informatik 12, 2011
- 18 -
thread b {
..
u=5
}
- 19 -
Non-blocking/
asynchronous message passing
Sender does not have to wait until message has arrived;
send ()
receive ()
- 20 -
Rendez-vous, Blocking/
synchronous message passing
send ()
receive ()
- 21 -
Organization of computations
within the components (1)
Finite state machines
Data flow
(models the flow of data in a distributed system)
Differential equations
2x
b
2
t
p. marwedel,
informatik 12, 2011
- 22 -
Organization of computations
within the components (2)
Discrete event model
queue
5
a 6
b 7
c8
10
13
15
19
time
action
- 23 -
Models of computation
considered in this course
Communication/
Shared
local computation memory
Undefined
components
Communic. finite
state machines
Data flow
Petri nets
Discrete event
(DE) model
Von-Neumann
model
Message passing
Synchronous | Asynchronous
SDL
Kahn networks, SDF
C/E nets, P/T nets,
Only experimental systems, e.g.
distributed DE in Ptolemy
C, C++, Java with libraries
CSP, ADA
|
- 24 -
Uses cases
Similar to SW specification
p. marwedel,
informatik 12, 2011
- 25 -
Models of computation
considered in this course
Communication/
Shared
local computation memory
Undefined
components
Communic. finite
state machines
Data flow
Petri nets
Discrete event
(DE) model
Von-Neumann
model
Message passing
Synchronous | Asynchronous
SDL
Kahn networks, SDF
C/E nets, P/T nets,
Only experimental systems, e.g.
distributed DE in Ptolemy
C, C++, Java with libraries
CSP, ADA
|
- 26 -
StateCharts
Extending classical automata to model ES & CPS
Adding timing with timed automata ( tutorial by Larsen)
Adding hierarchy:
Complex graphs cannot be understood by humans.
Introduction of hierarchy StateCharts [Harel, 1987]
StateChart = the only unused combination of
flow or state with diagram or chart
Used here as a (prominent) example of a
model of computation based on shared
memory communication, appropriate only for
local (non-distributed) systems
p. marwedel,
informatik 12, 2011
- 27 -
Introducing hierarchy
p. marwedel,
informatik 12, 2011
- 28 -
Definitions
Current states of FSMs are also called active
states.
States which are not composed of other states are called
basic states.
States containing other states are called super-states.
Super-states S are called OR-super-states, if exactly one
of the sub-states of S is active whenever S is active.
superstate
substates
p. marwedel,
informatik 12, 2011
- 29 -
- 30 -
Concurrency
Convenient
ways of
describing
concurrency
are required.
Example:
AND-superstates:
FSM is in all
(immediate)
sub-states of a
super-state.
p. marwedel,
informatik 12, 2011
- 31 -
Timers
- 32 -
p. marwedel,
informatik 12, 2011
- 33 -
Questions?
Q&A?
p. marwedel,
informatik 12, 2011
- 34 -
Application Knowledge
2: Specification &
Modeling
Design
repository
3: ES-hardware
Design
8: Test *
6: Application
mapping
4: System
software (RTOS,
middleware, )
7: Optimization
5: Evaluation &
Validation (energy, cost,
performance, )
* Could be
integrated
into loop;
not included
in the
current
course
Generic loop: tool chains differ in the number and type of iterations
Numbers denote sequence of chapters
p. marwedel,
informatik 12, 2011
- 35 -
- 36 -
Example
- 37 -
Example (2)
Executing the right state first would assign the old value of a
(=1) to a and b.
The result would depend on the execution order.
p. marwedel,
Use pen on tablet
informatik 12, 2011
- 38 -
- 39 -
Steps
Execution of a StateMate model consists of
a sequence of (status, step) pairs
Status
phase
phase 2
Other implementations of
StateCharts do not have
these 3 phases!
p. marwedel,
informatik 12, 2011
- 40 -
- 41 -
- 42 -
Models of computation
considered in this course
Communication/
Shared
local computation memory
Undefined
components
Communic. finite
state machines
Data flow
Petri nets
Discrete event
(DE) model
Von-Neumann
model
Message passing
Synchronous | Asynchronous
SDL
Kahn networks, SDF
C/E nets, P/T nets,
Only experimental systems, e.g.
distributed DE in Ptolemy
C, C++, Java with libraries
CSP, ADA
|
- 43 -
- 44 -
Determinate?
Let tokens be arriving at FIFO at the same time:
Order in which they are stored, is unknown:
- 45 -
Models of computation
considered in this course
Communication/
Shared
local computation memory
Undefined
components
Communic. finite
state machines
Data flow
Petri nets
Discrete event
(DE) model
Von-Neumann
model
Message passing
Synchronous | Asynchronous
SDL
Kahn networks, SDF
C/E nets, P/T nets,
Only experimental systems, e.g.
distributed DE in Ptolemy
C, C++, Java with libraries
CSP, ADA
|
- 46 -
www.ece.ubc.ca/~irenek/techpaps/vod/vod.html
p. marwedel,
informatik 12, 2011
- 47 -
- 48 -
- 49 -
Example
levi animation
- 50 -
- 51 -
Models of computation
considered in this course
Communication/
Shared
local computation memory
Undefined
components
Communic. finite
state machines
Data flow
Petri nets
Discrete event
(DE) model
Von-Neumann
model
Message passing
Synchronous | Asynchronous
SDL
Kahn networks, SDF
C/E nets, P/T nets,
Only experimental systems, e.g.
distributed DE in Ptolemy
C, C++, Java with libraries
CSP, ADA
|
- 52 -
- 53 -
produce N
channel
N
FIFO
f AN fBM
number of tokens produced
fire B {
consume M
Schedulable statically
In the general case, buffers may be needed at edges.
Decidable:
buffer memory requirements
deadlock
Adopted from: ptolemy.eecs.berkeley.edu/presentations/03/streamingEAL.ppt
p. marwedel,
informatik 12, 2011
- 54 -
Models of computation
considered in this course
Communication/
Shared
local computation memory
Undefined
components
Communic. finite
state machines
Data flow
Petri nets
Discrete event
(DE) model
Von-Neumann
model
Message passing
Synchronous | Asynchronous
SDL
Kahn networks, SDF
C/E nets, P/T nets,
Only experimental systems, e.g.
distributed DE in Ptolemy
C, C++, Java with libraries
CSP, ADA
|
- 55 -
Introduction
Introduced in 1962 by Carl Adam Petri in his PhD
thesis. Focus on modeling causal dependencies;
no global synchronization assumed (message passing only).
Key elements:
Conditions
Either met or no met.
Events
May take place if certain conditions are met.
Flow relation
Relates conditions and events.
Conditions, events and the flow relation form
a bipartite graph (graph with two kinds of nodes).
p. marwedel,
informatik 12, 2011
- 56 -
Preconditions
p. marwedel,
informatik 12, 2011
- 57 -
p. marwedel,
informatik 12, 2011
- 58 -
p. marwedel,
informatik 12, 2011
- 59 -
Models of computation
considered in this course
Communication/
Shared
local computation memory
Undefined
components
Communic. finite
state machines
Data flow
Petri nets
Discrete event
(DE) model
Von-Neumann
model
Message passing
Synchronous | Asynchronous
SDL
Kahn networks, SDF
C/E nets, P/T nets,
Only experimental systems, e.g.
distributed DE in Ptolemy
C, C++, Java with libraries
CSP, ADA
|
- 60 -
10
13
15
19
p. marwedel,
informatik 12, 2011
time
action
- 61 -
a gate1:
b
process (a,b)
begin
c <= a nor b;
end;
a gate2:
c
b
process (a,b)
begin
c <= a nor b;
end;
R
Processes will wait for changes on their input ports.
If they arrive, processes will wake up, compute their code and
deposit changes of output signals in the event queue and wait
for the next event.
If all processes wait, the next entry will be taken from the
event queue.
p. marwedel,
informatik 12, 2011
00
01
00
11
00 00 01
11 10 10
- 62 -
-simulation cycles
Simulation of an RS-Flipflop
2nd
0000
00011
1st
3rd
11000
0111
0ns
0ns 0ns+
0ns+ 0ns+2
0ns+2 0ns+3
0ns+3
RR 11
11
11
11
SS 00
00
00
00
QQ
nQ
nQ
11
00
00
00
00
11
00
11
gate1:
process (S,Q)
begin
nQ <= S nor Q;
end;
gate2:
process (R,nQ)
begin
Q <= R nor nQ;
end;
cycles reflect the fact that no real
gate comes with zero delay.
should delay-less signal
assignments be allowed at all?
p. marwedel,
informatik 12, 2011
- 63 -
Models of computation
considered in this course
Communication/
Shared
local computation memory
Undefined
components
Communic. finite
state machines
Data flow
Petri nets
Discrete event
(DE) model
Von-Neumann
model
Message passing
Synchronous | Asynchronous
SDL
Kahn networks, SDF
C/E nets, P/T nets,
Only experimental systems, e.g.
distributed DE in Ptolemy
C, C++, Java with libraries
CSP, ADA
|
- 64 -
- 65 -
Shared memory
Potential race conditions (inconsistent results possible)
Critical sections = sections at which exclusive access to
resource r (e.g. shared memory) must be guaranteed.
task a {
..
P(S) //obtain lock
.. // critical section
V(S) //release lock
}
task b {
..
P(S) //obtain lock
.. // critical section
V(S) //release lock
}
Race-free access
to shared memory
protected by S
possible
- 66 -
Communication/synchronization
Special communication libraries for ES & CPS
OSEK/VDX COM
- 67 -
- 68 -
Questions?
Q&A?
p. marwedel,
informatik 12, 2011
- 69 -
Application Knowledge
2: Specification &
Modeling
Design
repository
3: ES-hardware
Design
8: Test *
6: Application
mapping
4: System
software (RTOS,
middleware, )
7: Optimization
5: Evaluation &
Validation (energy, cost,
performance, )
p. marwedel,
informatik 12, 2011
* Could be
integrated
into loop;
not included
in the
current
course
- 70 -
Presentation by K. E. rzn
p. marwedel,
informatik 12, 2011
- 71 -
- 72 -
11
10
01
00
Vref /4
p. marwedel,
informatik 12, 2011
Vref /2
3Vref /4
Vref
h(t)
- 73 -
Processing units
Need for efficiency (power + energy):
Power is considered as the most important constraint in
embedded systems
[in: L. Eggermont (ed): Embedded Systems Roadmap 2002, STW]
p. marwedel,
informatik 12, 2011
- 74 -
p. marwedel,
informatik 12, 2011
- 75 -
P C L Vdd2 f with
: switching activity
C L : load capacitance
Vdd
with
k CL
2
Vdd Vt
Vt : threshhold voltage
f : clock frequency
- 76 -
http://www.mpsoc-forum.org/2007/slides/Hattori.pdf
Multiprocessor systems-on-a-chip
(MPSoCs)
p. marwedel,
informatik 12, 2011
- 77 -
Multiprocessor systems-on-a-chip
(MPSoCs) (2)
p. marwedel,
~50% inherent power efficiency
of silicon
informatik
12, 2011
- 78 -
Reconfigurable Logic
Custom HW may be too expensive, SW too slow.
Combine the speed of HW with the flexibility of SW
HW with programmable functions and interconnect.
Use of configurable hardware;
common form: field programmable gate arrays (FPGAs)
Applications: bit-oriented algorithms like
encryption,
fast object recognition (medical and military)
Adapting mobile phones to different standards.
Very popular devices from
XILINX (XILINX Vertex II are recent devices)
Actel, Altera and others
p. marwedel,
informatik 12, 2011
- 79 -
- 80 -
- 81 -
Memory
Memories?
Oops!
Memories!
- 82 -
CP
(1 U
.5 P
-2 er
p. for
a. m
)
an
ce
Speed
2x
every 2
years
07 p
.
1
(
M
RA
1
0
.a.)
5 years
[P. Machanik: Approaches to Addressing the Memory Wall, TR Nov. 2002, U. Brisbane]
p. marwedel,
informatik 12, 2011
- 83 -
p. marwedel,
informatik 12, 2011
- 84 -
Index
way 0
Tags
data block
$ ()
way 1
Data
p. marwedel,
informatik 12, 2011
- 85 -
Hierarchical memories
using scratch pad memories (SPM)
SPM is a small,
physically separate
memory mapped into
the address space
Hierarchy
Hierarchy
main
Address space
Example
Example
scratch pad memory
FFF..
no tag memory
select
SPM
SPM
Selection is by an
appropriate address
decoder (simple!)
ARM7TDMI
cores, wellknown for
low power
consumption
processor
p. marwedel,
informatik 12, 2011
- 86 -
9
8
7
Energy per access [nJ]
6
Scratch pad
5
3
2
1
0
256
512
1024
2048
4096
memory size
8192
16384
[R. Banakar, S. Steinke, B.-S. Lee, 2001]
p. marwedel,
informatik 12, 2011
- 87 -
Application Knowledge
2: Specification &
Modeling
Design
repository
3: ES-hardware
Design
8: Test *
6: Application
mapping
4: System
software (RTOS,
middleware, )
7: Optimization
5: Evaluation &
Validation (energy, cost,
performance, )
p. marwedel,
informatik 12, 2011
* Could be
integrated
into loop;
not included
in the
current
course
- 88 -
- 89 -
- 90 -
Definitions
Let X: m-dimensional solution space for the
design problem. Example: dimensions correspond to # of
processors, size of memories, type and width of busses etc.
Let F: n-dimensional objective space for the design problem.
Example: dimensions correspond to speed, cost, power
consumption, size, weight, reliability,
Let f(x)=(f1(x),,fn(x)) where xX be an objective function.
We assume that we are using f(x) for evaluating designs.
objective space
solution space
f(x)
x
x
p. marwedel,
informatik 12, 2011
- 91 -
Pareto points
We assume that, for each objective, a total
order < and the corresponding order are defined.
Definition:
Vector u=(u1,,un) F dominates vector v=(v1,,vn) F
i {1,..., n} : ui vi
i {1,..., n} : ui vi
Definition:
Vector u F is indifferent with respect to vector v F
neither u dominates v nor v dominates u
p. marwedel,
informatik 12, 2011
- 92 -
Pareto Point
Objective 1
(e.g. energy
consumption)
indifferent
worse
Pareto-point
better
indifferent
Objective 2
(e.g. run time)
- 93 -
distribution of times
worst-case performance
worst-case guarantee
Lower
timing
bound
BCET
WCET
Upper
timing
bound
time
timing predictability
WCETEST
marwedel
informatik 12, 2011
- 94 -
ILP model
- 95 -
Example (1)
Program
CFG
int main()
{
int i, j = 0;
WCETs of BB
(aiT 4 TriCore)
_main: 21 cycles
_L1: 27
_L3: 2
_L4: 2
_L5: 20
_L6: 13
_L2: 20
- 96 -
Example (2)
Virtual start node
Virtual end node
Virtual end node per function
Variables:
1 variable per node
1 variable per edge
Constraints: Kirchhoff equations per node
Sum of incoming
edge variables =
flux through node
Sum of outgoing
x6
edge variables =
flux through node
x7
x9
x10
_main: 21 cycles
_L1: 27
_L3: 2
_L4: 2
_L5: 20
_L6: 13
_L2: 20
x8
ILP
/* Objective function = WCET to be maximized*/
21 x2 + 27 x7 + 2 x11 + 2 x14 + 20 x16 + 13 x18 + 20 x19;
/* CFG Start Constraint */ x0 - x4 = 0;
/* CFG Exit Constraint */ x1 - x5 = 0;
/* Constraint for flow entering function main */
x2 - x4 = 0;
/* Constraint for flow leaving exit node of main */
x3 - x5 = 0;
/* Constraint for flow entering exit node of main */
x3 - x20 = 0;
/* Constraint for flow entering main = flow leaving main */
x2 - x3 = 0;
/* Constraint for flow leaving CFG node _main */
x2 - x6 = 0;
/* Constraint for flow entering CFG node _L1 */
x7 - x8 - x6 = 0;
/* Constraint for flow leaving CFG node _L1 */
x7 - x9 - x10 = 0;
/* Constraint for lower loop bound of _L1 */
x7 - 101 x9 >= 0;
/* Constraint for upper loop bound of _L1 */
x7 - 101 x9 <= 0; .
p. marwedel,
informatik 12, 2011
- 97 -
p-J p
3
3
2
2p
3p
p-J
2p
p
p-J
3p
p+J
p. marwedel,
informatik 12, 2011
- 98 -
s
p
u
s p-s
l
p
p+s
p. marwedel,
informatik 12, 2011
2p
- 99 -
12
WCET=4
(Defined
only for
an integer
number of
events)
8
BCET=3
4
1
3
p. marwedel,
informatik 12, 2011
e
- 100 -
,
l
,
l
' , u '
RTC
RTC
', '
l
RTC
Theoretical results
allow the computation
of properties of
outgoing streams
p. marwedel,
informatik 12, 2011
- 101 -
' min ,
u
' min ,
l
' 0
Where:
f g t inf f (t u ) g (u )
0 u t
f g t inf
f (t u ) g (u )
u 0
f g t supf (t u ) g (u )
0 u t
f g t supf (t u ) g(u )
u 0
p. marwedel,
informatik 12, 2011
- 102 -
Summary
ES hardware
HW in a loop
Sensors, discretization
Processors
FPGAs
Memories
Communication ( presentation by L. Almeida)
Back to the analog world
Evaluation and validation
Multiple objectives, Pareto optimality
Computation of worst case execution times (WCETs)
Real-time calculus
p. marwedel,
informatik 12, 2011
- 103 -
Questions?
Q&A?
p. marwedel,
informatik 12, 2011
- 104 -
Application Knowledge
2: Specification &
Modeling
Design
repository
3: ES-hardware
Design
8: Test *
6: Application
mapping
4: System
software (RTOS,
middleware, )
7: Optimization
5: Evaluation &
Validation (energy, cost,
performance, )
p. marwedel,
informatik 12, 2011
* Could be
integrated
into loop;
not included
in the
current
course
- 105 -
p. marwedel,
informatik 12, 2011
- 106 -
fi
fi
fi
- 107 -
Executing task
p. marwedel,
informatik 12, 2011
- 108 -
Earlier deadline
preemption
Later deadline
no preemption
p. marwedel,
informatik 12, 2011
- 109 -
Periodic scheduling
T1
T2
Each execution instance of a task is called a job.
Notion of optimality for aperiodic scheduling does not make
sense for periodic scheduling.
For periodic scheduling, the best that we can do is to design
an algorithm which will always find a schedule if one exists.
A scheduler is defined to be optimal iff it will find a
schedule if one exists.
p. marwedel,
informatik 12, 2011
- 110 -
di
ci
li
p. marwedel,
informatik 12, 2011
t
- 111 -
Independent tasks:
Rate monotonic (RM) scheduling
Most well-known technique for scheduling
independent periodic tasks [Liu, 1973].
Assumptions:
All tasks that have hard deadlines are periodic.
All tasks are independent.
di =pi, for all tasks.
ci is constant and is known for all tasks.
The time required for context switching is negligible.
For a single processor and for n tasks, the following
equation holds for the average utilization :
n
ci
n(21/ n 1)
i 1 pi
p. marwedel,
informatik 12, 2011
- 112 -
p. marwedel,
informatik 12, 2011
- 113 -
i 1
ci
n(21/ n 1)
pi
lim(n(21/ n 1) ln(2)
n
p. marwedel,
informatik 12, 2011
- 114 -
- 115 -
Failing RMS
Task 1: period 5, execution time 3
Task 2: period 8, execution time 3
=3/5+3/8=24/40+15/40=39/40 0.975
2(21/2-1) 0.828
p. marwedel,
informatik 12, 2011
- 116 -
T1
tt
T2
No problem if p2 = m p1, m :
should be
completed
T1
tt fits
T2
p. marwedel,
informatik 12, 2011
- 117 -
Application Knowledge
Design
repository
3: ES-hardware
Design
8: Test *
6: Application
mapping
4: System
software (RTOS,
middleware, )
7: Optimization
5: Evaluation &
Validation (energy, cost,
performance, )
p. marwedel,
informatik 12, 2011
* Could be
integrated
into loop;
not included
in the
current
course
- 118 -
Example:
for j ..{ }
while ...
Repeat
main
memory
call ...
Array ...
Scratch pad
memory,
capacity SSP
Processor
?
Array
Overlaying (dynamic)
allocation:
Int ...
- 119 -
IP representation
- migrating functions and variablesSymbols:
S(vark ) = size of variable k
n(vark) = number of accesses to variable k
e(vark ) = energy saved per variable access, if vark is migrated
E(vark ) = energy saved if variable vark is migrated (= e(vark) n(vark))
x(vark ) = decision variable, =1 if variable k is migrated to SPM,
=0 otherwise
K = set of variables; Similar for functions I
Integer programming formulation:
Maximize k K x(vark) E(vark ) + iI x(Fi ) E(Fi )
Subject to the constraint
k K S(vark) x(vark ) + i I S(Fi ) x(Fi) SSP
p. marwedel,
informatik 12, 2011
- 120 -
Cycles [x100]
Energy [J]
Multi_sort
(mix of sort
algorithms)
Measured processor / external memory energy +
CACTI values for SPM (combined model)
- 121 -
Small is beautiful:
addresses
background memory
p. marwedel,
informatik 12, 2011
- 122 -
C e j x j ,i ni
j
j : x j ,i Si SSPj
i
i : x j ,i 1
j
- 123 -
Considered partitions
Example of considered memory partitions for a total
capacity of 4096 bytes
# of
partitions
7
6
5
4
3
2
1
4k
0
0
0
0
0
0
1
2k
1
1
1
1
1
2
0
64
2
0
0
0
0
0
0
- 124 -
Working set
p. marwedel,
informatik 12, 2011
- 125 -
SPM size
memory-aware
compiler
executable
ARMulator
Actual
performance
aiT
Worst case
execution time
p. marwedel,
informatik 12, 2011
- 126 -
Architectures considered
ARM7TDMI with 3 different memory
architectures:
1. Main memory
LDR-cycles: (CPU,IF,DF)=(3,2,2)
STR-cycles: (2,2,2)
* = (1,2,0)
2. Main memory + unified cache
LDR-cycles: (CPU,IF,DF)=(3,12,6)
STR-cycles: (2,12,3)
* = (1,12,0)
3. Main memory + scratch pad
LDR-cycles: (CPU,IF,DF)=(3,0,2)
STR-cycles: (2,0,0)
* = (1,0,0)
p. marwedel,
informatik 12, 2011
- 127 -
References:
Wehmeyer, Marwedel: Influence of Onchip Scratchpad Memories on WCET: 4th Intl Workshop on
worst-case execution time (WCET) analysis, Catania, Sicily, Italy, June 29, 2004
Second paper on SP/Cache and WCET at DATE, March 2005
p. marwedel,
informatik 12, 2011
- 128 -
p. marwedel,
informatik 12, 2011
- 129 -
A vg. R elative WC ET
EST
[%]
110%
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
marwedel
informatik 12, 2011
- 130 -
Memory
Memory
- 131 -
100
128
200
Scratchpad Size (Bytes)
p. marwedel,
informatik 12, 2011
256
avg.
- 132 -
Saving/Restoring
P3
at context switch
P2
P1
Process P3
Process P2
P1
Scratchpad
- 133 -
Process P1
P3
P2
P1
Process P2
Process P3
Scratchpad
- 134 -
Process P1
Saving/Restoring
P3
at context switch
P2
Process P2
Process P3
Process
ProcessP2
P1
P3
P1,P2, P3
P1
Scratchpad
p. marwedel,
informatik 12, 2011
- 135 -
Research monographs
- 136 -
Textbook(s)
Several editions/translations:
Peter
Marwedel
1st edition
Peter
Marwedel
English
- Original hardcover version
- Reprint, soft cover, 2006
German, 2007
Chinese, 2006
Macedonian, 2010
Peter
Marwedel
- 137 -
Overall Summary
Introduction, Motivation and Overview
Motivation
Common characteristics
Specifications and Modeling
Models of computation
Early phases
FSM-based models, Data flow, Petri nets, discrete
event-based models, Von-Neumann models
Comparison
Exploitation of the memory hierarchy
Scratch pad memories
- Non-overlaying allocation
- Overlaying allocation
p. marwedel,
informatik 12, 2011
- 138 -
Morning
Afternoon
Monday
Marwedel
Marwedel
Tuesday
Madsen
Larsen
Wednesday
Madsen
Larsen
Thursday
Almeida
Arzen
Friday
Almeida
Arzen
p. marwedel,
informatik 12, 2011
- 139 -