4, NOVEMBER 2010
for specific applications, it can be extended by system components providing optional services on top of the core services. The set of optional services is open and can be extended.

• Architectural services addressing the technological challenges: Fundamental core and optional services supporting complexity management and robustness are identified in this paper.
• Mapping an automotive example application to a prototype of the MPSoC: A prototype of the introduced MPSoC is used to implement an automotive example application and provide information about performance and resource requirements.

The GENESYS MPSoC leads to a significant reduction of fragmentation in the area of embedded systems. By advancing from domain-specific solutions towards a cross-domain MPSoC, a significantly larger market for embedded system technologies (e.g., semiconductor devices, embedded components, software, and tools) emerges. A large market and the resulting ability to exploit economies of scale are of particular importance for the semiconductor industry, where the engineering costs associated with the design and tooling of chips grow rapidly. According to the International Technology Roadmap for Semiconductors (ITRS), the production of masks and exposure systems is becoming a bottleneck for the development of chips with finer geometries [4]. For example, consider the design investments in the Cell Broadband Engine (CellBE) processor, which amount to 400 million USD [6]. The cross-domain usability helps to amortize such nonrecurring costs.

The paper is structured as follows. Section II provides an overview of related work in the area of MPSoCs and system architectures. An overview of the GENESYS MPSoC is the topic of Section III. The stable set of core services is detailed in Section IV. Section V is devoted to the optional services. Section VI presents the prototype implementation and the automotive example. This paper ends with a discussion in Section VIII.

II. RELATED WORK

Several system architectures exist for specific application domains (e.g., Automotive Open System Architecture (AUTOSAR) [7], Integrated Modular Avionics (IMA) [8], and Network on Terminal Architecture (NoTA) [9]). Also, MPSoCs and NoC architectures (e.g., CellBE [10], AEthereal [11], Sonics [12], CoMPSoC [13], Nostrum [14], and Spidergon [15]) have been developed with a focus on specific applications.

The major novelty of the presented GENESYS MPSoC architecture lies in its cross-domain applicability and, in particular, its suitability for safety-critical applications. The domain-independent core of the GENESYS MPSoC with its core architectural services is the result of discussions among industrial partners of five application domains. Additional capabilities of the platform are realized by optional services on top of the core services, thereby customizing the GENESYS MPSoC to specific applications. The set of core architectural services is kept minimal and includes only those capabilities that cannot be introduced through optional services, but are required as the foundation for the addressed application domains. The small set of core architectural services is kept simple and deterministic in order to enable certification and the use of the GENESYS MPSoC in safety-critical applications.

Furthermore, the introduced MPSoC addresses the challenges identified by the European embedded systems initiative ARTEMIS [3]. The GENESYS MPSoC provides architectural services for robustness (e.g., a trusted subsystem that performs fault isolation, state restoration to enable the recovery from transient faults). In order to facilitate complexity management, the GENESYS MPSoC raises the level of abstraction at the network interface of the Network-on-a-Chip (NoC).

The NoC of GENESYS is controlled by a TDMA scheme, which can also be found in the AEthereal architecture [11], the Sonics SiliconBackplane Network [12] and Nostrum [14], to provide bandwidth and latency guarantees. However, these architectures provide a shared memory abstraction to the attached IP cores via a transaction-based master/slave protocol as used in OCP [16] or AXI [17]. These protocols define low-level signals, such as address, data, interrupt, reset, or clock signals, which are typically found at the interfaces of processors, memory subsystems, or bus bridges. In contrast, the GENESYS MPSoC introduces system and application components, which are self-contained computational units (e.g., a processor core with local memory) that provide their functionality solely over application-level message-based interfaces.

III. OVERVIEW OF GENESYS MPSOC

The GENESYS MPSoC is designed for the component-based development of embedded systems. A component is a self-contained hardware/software unit that performs computations, is aware of the progression of real-time, and interacts with its environment exclusively by the exchange of messages. A component is a self-contained IP core, which is formed by the component-hardware and the allocated software, the job. A component is a replaceable part of a system that encapsulates design and implementation and exposes four different message interfaces, which are detailed in Section III-E. We call the timed sequence of messages that a component exchanges across an interface with its environment (e.g., at the network-on-chip) the behavior of the component at that interface. The intended behavior is called the application service of the component. An unintended behavior is called a component failure.

The GENESYS MPSoC provides a framework for the implementation of components and the emergence of global application services, which come about by the interaction of the application services of the components. An example of an emergence in the automotive domain would be the Electronic Stability Program (ESP), which computes control values for braking actuation based on information from a diversity of sensors. In this example, the prior application services of the components on the GENESYS MPSoC can include an acceleration measurement service provided by a sensor component, a braking actuation service provided by an actuator component and a control service for computing a braking actuation value by a controller component.

As a foundation for the implementation of components, the GENESYS MPSoC offers platform services with mature solutions to generic problems (e.g., security) in order to simplify
550 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 6, NO. 4, NOVEMBER 2010
failure of a component (e.g., babbling idiot) cannot affect the message exchanges of other components.
• Trusted Resource Manager (TRM): The communication interfaces must be configured to establish knowledge in the interface about the permitted temporal behavior of the component. The Trusted Resource Manager (TRM) performs this configuration. A component cannot modify the parameters of its communication interface, such as the instants when a message is sent and how long a message transmission may last. It is impossible for a faulty component (caused by a hardware or software fault within that component) to modify the communication schedule set by the TRM.

The NoC, the communication interfaces and the TRM are trusted, because all components build on these architectural elements. A failure of the trusted subsystem would have the potential of causing a global failure of the entire MPSoC. Therefore, the trusted subsystem is kept simple (i.e., only inclusion of absolutely required capabilities) in order to enable a rigid development and validation. We assume that the trusted subsystem is free of design faults.

Likewise, a physical fault affecting the trusted subsystem has the potential to cause a global failure of the entire MPSoC. The probability of such a fault depends on the chip area consumed by the trusted subsystem, which is typically a small fraction of the entire chip. However, from the perspective of safety-criticality, a whole chip is assumed to fail with a probability of FIT. Today’s technology does not support the manufacturing of chips with failure rates low enough to meet the reliability requirements of ultra-dependable systems (i.e., failure rate of critical failures per hour [20]). Component failure rates are usually in the order of to (e.g., [21] uses a large statistical basis and reports 100 to 500 failures out of 1 million Electronic Control Units (ECUs) in 10 years) and ultra-dependable applications require the system as a whole to be more reliable than any one of its components. This can only be achieved by utilizing fault-tolerant strategies that enable the continued operation of the system in the presence of chip failures.

The optional platform services and the application services are provided by components in the application-specific subsystem of the GENESYS MPSoC. A failure of a component (due to a design fault or a physical fault) is contained by the trusted subsystem in order to prevent the propagation to a global chip-wide failure.

C. Component Types

From the point-of-view of service provision, GENESYS distinguishes between system components and application components. A system component is a self-contained component that provides some architectural service and can be considered to form a part of the GENESYS architecture. System components can be widely reused by different applications.

An application component is a component that implements the specified application functionality. Application components use the services of the available system components to reduce the effort required to implement the application functionality. An application designer is only concerned with the development of application components.

It is one of the significant contributions of the GENESYS MPSoC to eliminate the need for a complex monolithic central operating system to control all resources of the platform. In the GENESYS MPSoC, many of the global operating system functions are provided by a set of self-contained system components. These system components map ideally onto the IP cores of the MPSoC. If they become stable, they can be implemented in hardware, thus significantly reducing energy consumption and silicon area. Additionally, each component may have a small local (possibly heterogeneous) operating system that manages the local resources of the component and that is not visible at the architectural level.

While this approach is already successfully used in the field of distributed systems, the specific technological constraints of an MPSoC require new architectural solutions. The proposed GENESYS MPSoC takes into consideration the unique hardware properties, faults and threats of an MPSoC. In particular,
the point-of-view of the LIF, only the timing and the meaning of the information exchanged across a local interface are of relevance, while the detailed structure, naming, and access mechanisms of the local interface are intentionally left unspecified. A modification of the local access mechanisms, e.g., the replacement of direct I/O by a LIN fieldbus [30], will not have any effect on the LIF specification, and consequently on the users of the LIF specification, as long as the meaning and the timing of the relevant data at the local interface are the same.

For example, consider a component that provides temperature measurements in an automotive application. The LIF specification of the component will cover the semantics of the local interface (e.g., temperature at a certain location within the car) and the timing (e.g., delay until a change of temperature is indicated). However, the implementation technology for interacting with the sensor is unspecified at the LIF (e.g., analogue input with digital/analogue conversion, LIN fieldbus).

4) The Technology Dependent Interface (TDI): The TDI provides the means to look inside a component and to observe internal variables of a component (e.g., for debugging). The TDI is intended for a person who has a deep understanding of the component internals (e.g., a maintenance engineer or a validation engineer). The TDI is of no relevance for the user of the LIF services of the component nor for the system engineer who configures a component. The precise specification of the TDI depends on the technology of the component implementation, and will be different if the same functionality of a component is realized by software running on a CPU, by an FPGA or by an ASIC. An example of a realization of the TDI is a boundary scan [31].

IV. CORE SERVICES OF THE TRUSTED SUBSYSTEM

The four core services are mandatory and thus part of any instantiation of the GENESYS MPSoC. These four services represent capabilities that are universally important for all considered application domains and absolutely necessary to build higher services or to maintain the desired properties of the architecture.

The GENESYS MPSoC supports the construction of mixed criticality systems, in which safety-critical components and non safety-critical components coexist on one chip. Due to certification requirements [32], [33], safety-critical components are typically kept simple with static preplanned schedules and no dynamic reconfiguration. In contrast, dynamic system capabilities are effective in non safety-critical components in order to adapt to varying resource availability and environmental conditions.

In order to support mixed criticality systems, the core services of the GENESYS MPSoC must ensure fault containment and be deterministic and simple. The core services ensure that a fault of a non safety-critical system or application component cannot affect the correct operation of the safety-critical components. Hence, only the core services need to be considered in the certification of the safety-critical components, while the optional services of the non safety-critical components can be abstracted from.

In order to make the core services amenable to certification and keep them deterministic and simple, the implementation of powerful dynamic system capabilities in the GENESYS MPSoC is partitioned into a small core service and a more intricate optional service. Consider, for example, a dynamic message scheduler, which must be part of any dynamic resource management. Such a dynamic scheduler is not included in the core services. However, a much simpler checker that verifies the properties of a schedule and ascertains that the constraints of a static safety-critical schedule have not been violated by the dynamic scheduler is part of the core services.

A. Basic Time Services

In the GENESYS MPSoC, each component and the NoC can reside in their own clock domains. The time-triggered NoC performs clock synchronization in order to provide a common time for all components despite the existence of multiple clock domains. The common time allows the temporal coordination of distributed activities (e.g., synchronized messages), as well as the interrelation of timestamps assigned at different components.

1) Common Time: The GENESYS common time service provides to each component a local clock, which is globally synchronized within the MPSoC. If the clock is read by a component at a particular point in time, it is guaranteed that its value can differ by at most one tick from the value that is read at any other correct component at the same point in time. This property is also known as the reasonableness of the common time ([34, p. 52]), which states that the precision of the local clocks at the components is smaller than the granularity of the common time. Optionally, the common time can also be synchronized with an external time source (e.g., GPS). In this case, the accuracy denotes the maximum deviation of any local clock of the MPSoC from the external reference time.

The main rationale for the provision of a common time is the ability to coordinate activities temporally and to establish a deterministic communication infrastructure. A system behaves deterministically if and only if, given a full set of initial conditions (the initial state) at the initial instant, and a sequence of future timed inputs, the outputs at any future instant are entailed [35]. This determinism is achieved by providing the predictable time-triggered NoC. The time-triggered NoC requires a common notion of time among all components for assigning conflict-free sending slots to the components.

In addition to determinism and predictability, the time-triggered operation of the NoC enabled by the common time base results in a simpler and more energy-efficient interconnect, because there is no need for dynamic arbitration. The common time also allows end-to-end latencies to be minimized through temporal alignment of the computational activities of components and the communication activities at the NoC. Furthermore, a relationship can be established for timestamps assigned at different components, and timestamps can be used to determine the temporal accuracy ([34, p. 102]) of information from another component.

The proposed common time is opposed to Globally Asynchronous Locally Synchronous (GALS) designs or solutions with a single clock domain. The provision of a single clock signal for an entire chip has become prohibitively expensive [36], e.g., because of clock skew and the integration of components with different implementation technologies. GALS, on the other hand, does not support the presented temporal coordination and alignment.
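As a rough illustration of the reasonableness condition described for the common time, the following Python sketch models each local clock as the reference time plus a fixed offset and checks that simultaneous readings differ by at most one tick whenever the precision is smaller than the granularity. The integer-nanosecond clock model and all function names are assumptions made for this example; they are not part of the GENESYS specification.

```python
from typing import Iterable, Sequence

# Illustrative sketch only: the fixed-offset clock model and every name below
# are assumptions for this example, not part of the GENESYS specification.


def global_tick(ref_time_ns: int, offset_ns: int, granularity_ns: int) -> int:
    """Tick value read from a local clock that deviates by offset_ns."""
    return (ref_time_ns + offset_ns) // granularity_ns


def precision_ns(offsets_ns: Sequence[int]) -> int:
    """Maximum mutual deviation between any two local clocks."""
    return max(offsets_ns) - min(offsets_ns)


def is_reasonable(offsets_ns: Sequence[int], granularity_ns: int) -> bool:
    """Reasonableness: the precision is smaller than the granularity."""
    return precision_ns(offsets_ns) < granularity_ns


def max_tick_spread(offsets_ns: Sequence[int], granularity_ns: int,
                    instants_ns: Iterable[int]) -> int:
    """Largest difference between tick values read at the same instant."""
    spread = 0
    for t in instants_ns:
        ticks = [global_tick(t, o, granularity_ns) for o in offsets_ns]
        spread = max(spread, max(ticks) - min(ticks))
    return spread
```

With offsets of 0, 300 and -200 ns and a granularity of 1000 ns, the precision is 500 ns, so simultaneous readings never differ by more than one tick. With offsets of 0 and 1500 ns the condition is violated, and readings taken at the same instant can differ by two ticks.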
OBERMAISSER et al.: A CROSS-DOMAIN MULTIPROCESSOR SYSTEM-ON-A-CHIP FOR EMBEDDED REAL-TIME SYSTEMS 555
The disadvantage of the common time base lies in the overhead for clock distribution or clock synchronization.

2) Time Format: We can distinguish between two different representations of time, the linear time representation and the cyclic time representation. The linear time representation follows the arrow of time, starting at a defined starting point, the epoch, and counting the number of seconds up to the instant now and beyond.

In the cyclic model of time, the continuum of time is partitioned into an infinite set of cycles. A cycle is a period of physical time between the repetitions of regular (equidistant) events. A cycle is specified by the duration of its period and the position of its start, the cycle start phase relative to the common time. Since many embedded systems exhibit a cyclic behavior (e.g., control systems, signal processing systems, and multimedia systems), the cyclic model of time is well suited to designate progress in these systems. The cyclic model of time is also most appropriate for the description of the behavior of time-triggered systems. In a time-triggered system, a cycle is associated with every time-triggered process. Whenever the cycle-start instant occurs, a control signal is generated to start the time-triggered process.

GENESYS restricts the duration of cycles such that all cycles are in a harmonic relationship to each other. Every cycle must be a power-of-two product of a smallest cycle, with the additional restriction that the duration of one of the cycles must be exactly the duration of the physical second. Hence, the GENESYS time format is a binary format that counts full seconds in binary arithmetic and subdivides the second into binary (power-of-two) fractions. These restrictions are introduced in order to simplify the interleaving of cycles, the generation of cyclic schedules and the synchronization of the activity of a system with an external time reference. This time format has also been standardized by the OMG [37].

The drawback of this restriction is that overhead can be incurred by mapping cycles to power-of-two products. A smaller cycle and more communication and computational resources might be used than actually required for a given application (e.g., to ensure the stability of a control loop).

B. Basic Communication Services

The GENESYS MPSoC provides services for the communication among components: periodic message multicasting, sporadic message multicasting and primitive multicast streaming. These three communication modes support all targeted types of application. Periodic message multicasting is ideal for the exchange of state variables [34] in cyclic applications such as control loops. Sporadic message multicasting is effective for signaling events (e.g., alarm event, error indication reported to a diagnostic component). Streaming serves for the realization of multimedia applications and the transfer of new job images to components.

The reasons for the choice of message passing as the basic interaction mechanism in the GENESYS MPSoC are as follows.
• Explicit timing. In contrast to shared memory abstractions, the timing of message exchanges is explicitly defined and the need for separate synchronization is eliminated [38]. In the presented MPSoC, each periodic message possesses a period and phase with respect to the global time base. The timing of sporadic messages and streaming is characterized by a minimum message interarrival time. This knowledge is essential in real-time systems in order to perform a timing analysis of an application, to meet deadlines and to detect/contain failures (e.g., violation of minimum message interarrival time).
• Fault containment. A unidirectional message, which is used in the presented MPSoC architecture, has one sender component and one or more receiver components. The knowledge about the sender identity is essential for fault containment and diagnosis (e.g., masquerading failures). In the temporal domain, the knowledge concerning the timing (e.g., period of a message) enables the containment of temporal faults.
• Universality. Message passing is a universal model. Different interaction patterns, such as a shared memory, can be realized on top of message passing [39].

Message passing as the basic interaction mechanism is in contrast to the shared memory model that is prevalent in today’s MPSoCs. Shared memories typically result in temporal unpredictability, since the access of components is not preplanned and concurrent access needs to be resolved dynamically. Message passing also eliminates the communication overhead and hardware complexity of the coordination protocol needed for cache coherence [40]. Memory hierarchies and coherence protocols significantly contribute to temporal unpredictability [41].

As explained in [42], message passing is also superior to shared memories in case of a high computation/communication ratio, which is typical for embedded systems.

1) Periodic Exchange of Messages: The periodic message exchange service, also called the time-triggered message service, sends a message at its period and phase, the start instant, from one sending component to one or more receiving components, i.e., this service provides multicast communication. The data that is disseminated with this service is state information.

Messages are associated with ports. During establishment of a port by the TRM, the configuration parameters are assigned to the port, such as the period and phase of a time-triggered message, the size of the message and the set of receivers of the message.

Before sending a message, the sender updates the state variable associated with the respective port. The multicast of the message is performed autonomously by the communication infrastructure at the previously configured period and phase. At the receiver side, the communication infrastructure overwrites the state variable associated with the receiving port with the data contained in the message.

Since the transmission of messages is triggered by the progression of the common time, this service relies on the availability of the common time service.

2) Sporadic Exchange of Messages: The sporadic message exchange service, also called the event-triggered message service, supports the exchange of event information at arbitrary instants, constrained only by a minimum interarrival time of events. During normal operation, this message service supports exactly-once semantics, i.e., every message is delivered exactly once.
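The timing parameters named above (period and phase for periodic messages, minimum interarrival time for sporadic messages) can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: times are integer ticks of the smallest cycle, the power-of-two check mirrors the harmonic-cycle restriction of the time format, and the classes `PeriodicPort`, `StatePort` and `SporadicGuard` as well as the function `multicast` are hypothetical names, not GENESYS interfaces.

```python
from dataclasses import dataclass
from typing import Any, List, Optional

# Illustrative sketch only: all names are hypothetical, and times are integer
# ticks of the smallest cycle (an assumption made for this example).


def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


@dataclass
class StatePort:
    """Receiving port holding a state variable that is overwritten in place."""
    state: Any = None


@dataclass
class PeriodicPort:
    """Sending port of a time-triggered message with period and phase."""
    period: int
    phase: int
    state: Any = None

    def __post_init__(self) -> None:
        if not is_power_of_two(self.period):
            raise ValueError("period must be a power-of-two multiple of the smallest cycle")
        if not 0 <= self.phase < self.period:
            raise ValueError("phase must lie within one period")

    def is_send_instant(self, now: int) -> bool:
        return now % self.period == self.phase


def multicast(sender: PeriodicPort, receivers: List[StatePort], now: int) -> bool:
    """At a send instant, overwrite every receiver's state variable."""
    if not sender.is_send_instant(now):
        return False
    for receiver in receivers:
        receiver.state = sender.state
    return True


@dataclass
class SporadicGuard:
    """Accepts an event only if the minimum interarrival time has elapsed."""
    min_interarrival: int
    last_accepted: Optional[int] = None

    def accept(self, now: int) -> bool:
        if self.last_accepted is not None and now - self.last_accepted < self.min_interarrival:
            return False  # violation of the minimum interarrival time
        self.last_accepted = now
        return True
```

In this sketch a receiver only ever sees the most recent state value, whereas the guard rejects events that arrive too early. How a real communication interface reacts to such a violation (e.g., dropping the message or reporting an error) is not modeled here.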
The sending component places the message in the specified event-triggered output port. An event-triggered output port includes an outgoing message queue that resides in the memory space of the sender. As soon as the communication infrastructure is ready to accept the next message, it fetches the message from this queue (consuming read) and transports the message to the receivers that are associated with the sending port. The message transport is realized via a time-triggered channel. As soon as the message arrives at the receiver(s), it is placed in the input queue(s) associated with the receiver’s ports. The size of the message queues (outgoing queues at sending components as well as incoming queues at receiving components) is determined by the respective component itself. If the message queue is already full and the component tries to place a new message in the queue, an error is reported via the TII.

Message reception is realized by retrieving a message from the incoming message queue (consuming read operation). For handling empty incoming message queues, the sporadic message exchange service supports two types of read operations: If non-blocking read is chosen, the read call returns instantly and signals the caller that no message has been retrieved. If blocking read is chosen, the read operation waits until a new message arrives.

3) Primitive Multicast Streaming: The primitive multicast streaming service can be utilized for the transmission of a sequence of variable-size data elements (e.g., frames in a video stream) with corresponding temporal properties (e.g., average data rate with bursts). Similar to the sporadic message exchange service, this service uses queues at the sender and receiver side. Queues compensate for irregular data rates that are typical for streaming applications (e.g., MPEG4). For retrieving a data element out of the queue, a blocking and a non-blocking mode of operation are supported. The primitive streaming service does not exercise any flow control from the receiver to the sender.

C. Basic Configuration Services

The basic configuration services are provided to load the jobs to the available hardware units of the MPSoC, thus forming the components of the MPSoC. In case of a static configuration, the allocation of jobs to the hardware units may be outside the scope of architectural considerations, because the jobs are already permanently assigned to the hardware units (e.g., in case of an ASIC implementation of a component, where the software is part of the hardware). The basic configuration services are also needed if a dynamic reconfiguration of an MPSoC is performed by reassigning a job to a different hardware unit, because the original hardware unit has failed.

Configuration capabilities are required for the initial configuration of programmable hardware and to support different environmental conditions (e.g., on ground versus airborne in an avionic system), modifications of the computer system (e.g., addition of a component) and variable resource availability (e.g., low energy resources). Reconfiguration enables better resource utilization, improved dependability, and power/energy-aware system behavior.

The GENESYS MPSoC targets safety-critical applications, non safety-critical applications, and mixed criticality systems with both safety-critical and non safety-critical subsystems. Primarily due to safety concerns, reconfiguration of safety-critical systems is often reduced to selecting system-wide modes out of statically defined scheduling tables, which impairs the flexibility of resource management, but provides good analyzability and determinism. Certification standards such as DO-178B [32] in the avionic domain provide recommendations for static resource allocations.

In order to support safety-critical applications, the mandatory basic configuration services encompass only those capabilities that are indispensable for configuration and do not preclude certification in safety-critical applications: a basic boot service, an identification service and an inter-component channel configurator. In non safety-critical applications, configuration services going beyond these basic capabilities can be realized as part of the optional services.
• The basic boot service is the primitive configuration service that assigns a job to a hardware unit of the MPSoC, thus creating a component of the MPSoC. Every hardware unit must have a priori established boot access ports for the boot messages (i.e., access ports that are physically associated with the hardware) as part of its TII interface. The boot protocol links an off-chip repository holding the job images (e.g., the development system) with the physical runtime systems. By associating the logical names of jobs with the physical names of hardware units, this service establishes a logical name-space such that the components of the platform can be uniquely identified and addressed on the basis of their role in the given application. The basic boot service can be extended to a secure boot service by optional security services.
• The identification service provides a unique identifier of the chip. This identification is needed to inform the boot service about the type of available hardware and the attributes of the unit under consideration. In addition, it is required to distinguish individual manufactured physical units from each other. The implementation of the hardware identification within the MPSoC must be tamper resistant, i.e., it should be impossible to modify the unique chip identification without physically destroying the chip.
• The inter-component channel configurator supports the establishment, modification and removal of communication channels. The inter-component channel configurator, which is part of the TRM of the MPSoC, configures the chip-local inter-component communication system (i.e., the TT-NoC and the communication interfaces of the trusted subsystem) by establishing, naming, connecting, and disconnecting the ports and communication channels of the components of the MPSoC according to a time-triggered communication schedule. Communication links to the outside of the MPSoC are provided by system components providing gateway services. Such a gateway system component supports two interfaces, an inner interface to the TT-NoC and an outer interface to the chip environment (e.g., CAN, FlexRay). Viewed from the MPSoC, the inner interface is a chip-level LIF, while the outer interface is an (unspecified) local interface of the gateway system component. The gateway system component connects
(e.g., power gating, voltage level), time-management, scheduling, and other execution control issues.

The basic execution control services are essential to facilitate two key properties of the GENESYS MPSoC: robustness and energy efficiency. A component can recover from transient faults by being reset via the TII. In order to improve energy efficiency, a component can be stopped using the execution control services. When the component services are needed again, the component can be started and the state is restored using an execution request.

V. OPTIONAL SERVICES

The GENESYS MPSoC is designed for use in multiple application domains in order to exploit the economies of scale of the hardware. Using optional services in the waistline architecture, the core architectural services can be refined according to the requirements of specific domains (e.g., automotive) and applications (e.g., powertrain). This section presents some optional services (e.g., external memory management, state restoration) as examples for this refinement. The external memory management service is essential for components requiring more memory than can be provided on-chip within the components. The state restoration service allows transient faults to be tolerated through the external reset and restart of a component. State restoration is important to address increasing transient failure rates induced by trends in the semiconductor industry.

An optional service encapsulates a well-defined supportive functionality into a self-contained system component that interacts with the GEM of the application components by the exchange of messages. Alternatively, an optional service can be implemented directly in the GEM of an application component. Optional services can be useful across many application domains and may be needed on many different occasions. They simplify the system development process by providing ready building blocks that can be reused on the basis of their LIF specification without the need to know the internals of the component implementation.

The set of optional services is an open set that can be extended and modified as new services are identified and conceptualized into a self-contained entity.

The partitioning of the software on a GENESYS MPSoC into a set of self-contained system and application components that interact with each other solely by the exchange of messages takes advantage of the enormous and cheap bandwidth of the deterministic NoC that connects the components. It is thus possible to partition the software cleanly according to functional and fault-containment criteria without causing an undue performance penalty because of the distributed nature of the implementation.

As any other component, a system component forms its own fault-containment region. If a transient fault affects a system component without internal state, the component can be reset immediately. If the component contains internal state, then this internal state must be repaired before the component can continue to provide its services.

A. Domain-Independent Optional Services

The domain-independent services build upon the core services and are generic in the sense that they can be used in multiple application domains. These services are optional in the sense that they are not required in every instantiation of the architecture. If needed, developers can pick them out of component libraries with existing, validated architectural services.

At present, the domain-independent optional services are grouped into the following categories (cf. Fig. 7): robustness services, external memory management services, security services, resource management services, gateway services, mobility services, and higher communication services. Within the GENESYS project¹ these services were identified with the support of experts from the automotive, industrial control, avionics, consumer electronics and mobile domains. Selected optional services were implemented and tested for the case study in Section VI, namely, gateway, diagnosis, untrusted resource manager, and input/output services.

¹ www.genesys-platform.eu.

An example of a domain-independent optional service, which has already been introduced in the explanation of the GEM, is the external memory management service. In many applications built with the GENESYS MPSoC there may be the need to provide, in addition to the internal local memories of the components, a (large) external memory. While the organization and attributes of the component-internal memory are a local issue of the component and of no relevance at the architecture level, the external memory that is provided in the form of a system component and accessed by many application components must be properly managed from the point-of-view of inter-component data integrity. Access to the external memory is controlled by this memory system component, which acts as an intelligent memory controller that communicates with the application components exclusively by the exchange of messages through the LIF.

Another example of a domain-independent optional service is the state externalization service. For various reasons, such as state validity checking, state exchange, logging purposes, global state snapshot, and power gating for energy saving, components should periodically externalize their internal state at predefined cyclic recovery instants. State externalization is very important for the fast recovery of components affected by transient faults. The externalized information can be used to: 1) allow error detection and/or 2) enable checkpointing and retry mechanisms.

In a time-triggered system, the state externalization can be synchronized with the processing cycle within the component. After the component has entered its ground state at its ground state instant, i.e., an instant where the internal tasks of the last component cycle have been completed and all data that is relevant for the next cycle is stored in a ground-state data structure, the ground state can be sent in a ground-state message to a diagnostic system component.

The diagnostic system component will use the component restart service to reset and restart a failed component by sending a restart message to the TII interface of a failed component, with an internal state that is expected to be acceptable at the next future restart instant in order to force the component into a restart.
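The interplay of state externalization and component restart described above can be illustrated with a small model. This is only a sketch under stated assumptions: the class names `Component` and `DiagnosticComponent`, the dictionary used as ground-state data structure, and the policy of replaying the last externalized ground state unchanged are hypothetical and not taken from the GENESYS prototype, where the diagnostic system component may instead supply a state that is expected to be acceptable at the next restart instant.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Component:
    """Hypothetical application component with a cyclic ground state."""
    name: str
    state: Dict[str, int] = field(default_factory=dict)
    failed: bool = False

    def run_cycle(self) -> Dict[str, int]:
        """Run one processing cycle and return the ground-state message payload."""
        self.state["cycle"] = self.state.get("cycle", 0) + 1
        return dict(self.state)  # copy, as if sent in a ground-state message

    def restart(self, ground_state: Dict[str, int]) -> None:
        """Model of a restart message arriving at the component's TII."""
        self.state = dict(ground_state)
        self.failed = False


class DiagnosticComponent:
    """Hypothetical diagnostic system component storing externalized states."""

    def __init__(self) -> None:
        self.last_ground_state: Dict[str, Dict[str, int]] = {}

    def on_ground_state(self, name: str, ground_state: Dict[str, int]) -> None:
        """Record the most recent ground state received from a component."""
        self.last_ground_state[name] = ground_state

    def recover(self, component: Component) -> bool:
        """Restart a failed component with its last externalized ground state."""
        ground_state = self.last_ground_state.get(component.name)
        if ground_state is None:
            return False  # nothing externalized yet, no state to restore
        component.restart(ground_state)
        return True
```

In this model, a component that is corrupted after three cycles is restored to the ground state of cycle three, so at most the work of the current cycle is lost.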
The component restart service is also used to restart and perform state restoration in case of power gating.

B. Domain-Specific Optional Services

The domain-specific optional services are services that are specialized for the considered application domain. The domain-specific optional services build upon the core services and a well-defined subset of the domain-independent optional services. For example, in the automotive domain a gateway service for interacting with a Media-Oriented Systems Transport (MOST) [43] network can be a domain-specific optional service. MOST is a bus standard used specifically for interconnecting multimedia components in automobiles.

VI. PROTOTYPE IMPLEMENTATION AND AUTOMOTIVE EXAMPLE

This section describes a prototype implementation of the GENESYS MPSoC and presents an automotive example application. Like the in-vehicle electronic systems in today's premium cars, the example encompasses application services of different domains (i.e., a control subsystem and a multimedia subsystem). The prototype demonstrates the cross-domain suitability of the GENESYS MPSoC and the ability for interoperability between different application domains.

The example employs one GENESYS MPSoC, which is connected via an Ethernet gateway to a PC running the open-source racing car simulator TORCS [44]. The GENESYS MPSoC controls the vehicle in the simulator by processing user input from a USB driving wheel with gas and brake pedals (see Fig. 8).

In the prototype of the GENESYS MPSoC, an implementation of the trusted subsystem provides the core services, while system components realize selected optional services. Three system components offer the following domain-independent optional services: untrusted resource manager, diagnosis, and gateway service. In addition, each application subsystem contains a system component offering domain-specific optional services (i.e., display in the multimedia subsystem, input/output in the control subsystem). Finally, two application components in each subsystem realize the application services.
Fig. 9 gives an overview of the messages exchanged between the components on the TT-NoC.

A. Implementation of Core Services

1) Basic Time Services: Each communication interface maintains a counter value that is a local representation of the common time. All counter values are synchronized by means of a shared clock distribution network, which spans all communication interfaces and produces the macro ticks of the common time. The granularity of a macro tick obeys the reasonableness condition ([34, p. 52]).

2) Basic Communication Services: The time-triggered communication schedule, which controls the operation of the TT-NoC, is distributed across the communication interfaces. A dispatcher in each communication interface uses the schedule to determine the instants of the common time when a communication activity has to be processed. For this purpose, a separate control logic for each message period compares the current value of the common time with the phase of the next communication activity in that period. On compare match, the control interface triggers the processing of the communication activity.

For each sending activity listed in the schedule at a communication interface, there must be a matching entry in the schedule of at least one receiving communication interface. According to this mapping between senders and receivers, communication interfaces implicitly build single-, multi-, and broadcast channels.

The periodic and sporadic messages, which conceptually are accessible through ports of communication channels, physically reside in a memory buffer beyond the trusted subsystem within the component, the port memory. A port is realized as a distinct address range within the port memory. A port synchronization protocol [19] defines the physical arrangement of periodic and sporadic messages in ports and ensures memory consistency.

For periodic messages, the communication interface always disseminates the messages, which are either fetched from the associated port and injected into the TT-NoC (for send operations) or grabbed from the TT-NoC and stored in the corresponding port (for receive operations). For sporadic messages, a message transfer only takes place if the message queue at the sender side is not empty.

In the prototype, the messages travel predefined segmented routes, i.e., using source (wormhole) routing [19]. The TT-NoC consists of switching nodes, which embody the hops on the segmented route. In fact, this prototype instantiates six switching nodes in a $2 \times 3$ mesh. Each communication interface is attached to exactly one switching node at the communication interface, which can be organized in any non-bus topology, preferably a mesh. Generally, the connections between switching nodes are distinct and unidirectional, but two lanes in reverse direction are grouped to form an interconnect between a pair of switching nodes. Consequently, the TT-NoC supports quasi-bidirectional communication patterns.

The routing information, which characterizes the path from a sending to the receiving communication interfaces, is a priori known and is organized in such a way that it avoids any temporal or spatial collisions at switching nodes in the TT-NoC. The routing information for each communication channel resides in the sending communication interface. It precedes messages in a header, which is injected by the sending communication interface before it passes on the payload of the message at the communication interface.

As a consequence, switching nodes have no knowledge about the communication schedule or the routes of communication channels, but they simply forward data based on the routing information of a message.

3) Basic Configuration Services: The basic configuration service modifies the communication schedule in the communication interfaces in order to configure the intercomponent channels. For this purpose, the TRM maintains a dedicated communication channel to each communication interface, which embodies the realization of the component's TII. From the point-of-view of the TRM, these dedicated channels are sporadic communication channels. From the point-of-view of a given communication interface, they are directly processed and not visible to the application code. Whenever configuration data arrives, the communication interface redirects the data into its internal memories where the local communication schedule and the routing information reside. This process is under control of the TRM, while the communication interface is passive. The TRM does not overwrite the current configuration data of the communication schedule, but it delivers the new data in an
unused area of the internal memory. As a result, the operation of communication interfaces need not be interrupted. Hence, during an intercomponent channel configuration there exist two different configurations in each communication interface: the active and the shadow configuration. After the TRM has distributed new configuration data to all communication interfaces into their shadow areas, they switch to the new setup simultaneously at a common instant (the reconfiguration instant), which is also present in the communication schedule. In other words, we leverage the infrastructure of dispatching communication activities to create a global synchronization event, namely the reconfiguration instant.

4) Basic Execution Control Services: Besides configuration data of the communication schedule and routing information, the TRM is able to modify the parameters of execution control in each control interface. Each control interface offers a collection of special control registers, which are writable for the TRM and readable for the component. One of these special control registers is the host mode, which represents the basic execution control commands introduced in Section IV-D. Using the host mode, a component can be started, stopped and reset.

B. Implementation of Domain-Independent Optional Services

1) Robustness Service: The prototype offers a domain-independent robustness service, which serves for the detection of transient and permanent faults. To this end, a diagnostic system component collects failure indication messages that are sent by other components via their TII. For example, the GEM in the application components implements an end-to-end checksum mechanism [45] that checks the integrity of messages across different components from the creation at a sender, throughout transfer and use in a receiver. Also, the communication interface at a component reports errors to the diagnostic system component, e.g., information about a queue overflow at a port.

The diagnostic system component exploits information about the frequency and location of errors in order to compute a health state for the components [45]. The health state is an input for maintenance decisions and for reconfiguration activities (e.g., signal the resource manager to suspend a faulty component and activate a spare component).

2) Global Resource Manager: The Global Resource Manager (GRM) is a system component that leverages the basic configuration core service, particularly the inter-channel configuration service, to adapt the communication schedule and parameters of the communication interfaces. The reason for such an adaptation can be changes in the environment of the MPSoC, additional functions to be provided due to user requests, or reconfiguration requests by the robustness service. The GRM either maintains a set of predefined schedules and configurations or it calculates them dynamically at run-time according to requests [46].

The GRM is not permitted to write configuration data to the communication interfaces directly. It has to submit a proposal for a new configuration to the TRM to be checked first. If the TRM approves the new configuration, the TRM employs the interchannel configuration to write the configuration to the communication interfaces.

In the prototype, sporadic communication channels exist from the diagnostic system component and the application components to the GRM to communicate resource requests. Furthermore, the GRM possesses a communication channel to the TRM to transfer the proposal of the new configuration.

3) Gateway: The prototype possesses a gateway service for the redirection of messages between the TT-NoC and a chip-external Ethernet network. Information from multiple periodic or sporadic messages is used for the construction of an Ethernet message.

C. Domain-Specific Optional Services

In addition to the domain-independent services described in the previous section, the prototype contains domain-specific services for the automotive control and multimedia subsystems. An input/output service serves for the interaction with sensor and actuator devices (i.e., steering wheel and pedals in the prototype). This domain-specific service possesses local interfaces to the natural environment and provides input to the application services of the control subsystem. Due to the cyclic operation of the control subsystem, periodic communication channels are used for the communication with the application services.

The display service is a domain-specific service in the multimedia subsystem of the prototype. Its purpose is to depict visual content using a Liquid Crystal Display (LCD). The display service obtains this data from an application service through a sporadic communication channel.

D. Programming of Application Components

The component implementation has to ensure that the component behavior (i.e., sequence of transmitted messages as response to inputs, state and the progression of time) satisfies the operational and meta-level specifications of the linking interface (cf. Section III-E).

For this purpose, different implementation choices of a component are possible, ranging from a direct implementation of a state machine in hardware to a flexible software-based implementation with a general purpose CPU executing an operating system, middleware and application software. The implementation choice and the used operating system and programming languages in case of a realization based on a general purpose CPU are not visible to the users of a component. Other components perceive only the message exchanges on the time-triggered NoC.

The core services, which are offered by the communication interface, serve as the foundation for the component implementation. The core services are accessible via a memory-mapped interface consisting of a port interface and a control interface (see Fig. 10).

a) Port Interface: The port interface offers access for the communication interface to the physical memory (so-called port memory), where all application data of messages that are sent or received is stored. In the prototype implementation, the port memory in each component is realized as a dual-ported memory. This dual-ported memory is accessed by the communication interface according to the time-triggered
communication schedule. The host processor in a component can also use the time-triggered communication schedule to synchronize its read/write operations with the activities of the communication interface. Alternatively, explicit synchronization can be realized using the control interface.

The port interface is realized as a master on the communication interface side. Symmetrically, at the memory-side it is a slave interface. Different implementation choices are supported in the prototype for the slave and master interfaces such as Altera Avalon [47] and AMBA AHB [48].

The port memory contains state ports and event ports. A state port consists of the message data and a timestamp denoting the instant of the most recent message reception (w.r.t. the global timebase). The queue of an event port is realized as a ring buffer, which can store a given number of messages each consisting of message data and a timestamp.

b) Control Interface: The control interface is used for the configuration of individual ports and the synchronization of the access to messages in ports between the host processor and the communication interface.

Using the control interface, the host processor can write a port configuration memory. This memory permits the enabling/disabling of ports, the selection of port types (i.e., event or state port), queue lengths and the definition of addresses in the port memory for storing messages.

The port synchronization memory serves for ensuring consistency of the read/write operations to the port memory between communication interface and host. For receiving messages through state ports, counters for the Non-Blocking Write (NBW) protocol [49] are provided. For sending messages, the port synchronization memory can be used to switch between shadow buffers in the port memory. Thereby, the host is allowed to update a message in a shadow buffer while another buffer is active and used for the ongoing communication on the TT-NoC.

For event ports, read and write positions for the ring buffer are maintained in the port synchronization memory.

The register file offers the current value of the global time base, information about error conditions (e.g., queue overflow of an event port, watchdog miss), and control mechanisms for the watchdog and the timer services.

E. Application Services of the Case Study

The application services realize the application logic of the example application. For the control subsystem, the steering control service takes the sensor inputs from the input/output service and calculates the set values for the steering actuation. The ABS controller service exploits the revolution sensors and initiates the unlocking of a blocked wheel. The outputs of these two application services are sent to the gateway service to be relayed to the environmental simulation on the PC.

For the multimedia subsystem, the media source is an application service that produces a video stream for the display service. Furthermore, different quality levels are supported in the multimedia system in order to control the requirements concerning communication bandwidth at the TT-NoC and computational requirements at the components. The media control service manages the transition to a different quality level. It possesses sporadic communication channels to the GRM to send reconfiguration requests, which lead to the switching between precomputed configurations in the prototype.

In the following, the implementation of one of the application components is explained (i.e., steering application service). This component is realized as a Nios II soft-core CPU running the µC/OS-II [50] operating system. A driver library, which is linked to the application code, supports the access to the port memory and the control interface and abstracts from the low-level port synchronization and the memory layout. In the prototype, both the library and the application code were written in C.
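The low-level port synchronization that the driver library hides can be illustrated for state ports with a small single-threaded model of the Non-Blocking Write (NBW) protocol mentioned in Section VI-D. This is a sketch under assumptions, not the prototype's driver: it is written in Python rather than C for brevity, and the names `StatePort`, `nbw_read`, `max_retries` and the `between_reads` hook are hypothetical. The writer (here the communication interface) increments a counter before and after each update, and the reader (the host) accepts a value only if the counter was even and unchanged while it copied the data.

```python
from typing import Callable, Optional, Tuple


class StatePort:
    """Hypothetical state port guarded by an NBW concurrency control counter."""

    def __init__(self) -> None:
        self.ccf = 0        # even: data stable, odd: write in progress
        self.data = 0
        self.timestamp = 0

    def begin_write(self) -> None:
        self.ccf += 1       # counter becomes odd

    def store(self, data: int, timestamp: int) -> None:
        self.data = data
        self.timestamp = timestamp

    def end_write(self) -> None:
        self.ccf += 1       # counter becomes even again

    def write(self, data: int, timestamp: int) -> None:
        """Complete writer sequence; the writer is never blocked by readers."""
        self.begin_write()
        self.store(data, timestamp)
        self.end_write()


def nbw_read(port: StatePort,
             max_retries: int = 3,
             between_reads: Optional[Callable[[], None]] = None
             ) -> Optional[Tuple[int, int]]:
    """Return (data, timestamp), or None if no consistent copy was obtained."""
    for _ in range(max_retries):
        before = port.ccf
        value = (port.data, port.timestamp)
        if between_reads is not None:
            between_reads()  # hook used only to simulate a concurrent writer
        after = port.ccf
        if before == after and before % 2 == 0:
            return value     # no write overlapped with the copy
    return None
```

The design choice behind NBW is that only the reader ever retries, so the time-triggered writer keeps its schedule regardless of host activity.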
Fig. 11. Sequence of activities at the application component realizing the steering control service.
Fig. 11 depicts the temporal sequence of the activities of the component within one cycle. The communication interface signals the arrival of the input message via an interrupt (part of the control interface) to the host processor at a predefined instant of the global time base. The interrupt causes the execution of an interrupt service routine in the host processor. The interrupt service routine generates a token in a mailbox of µC/OS-II serving as an interprocess communication primitive. The token unblocks the application task so that the dispatcher of the operating system will schedule the application task. Upon execution, the application task reads the message using the port interface. The task invokes a send operation of the driver, which writes the computed control values as a message into the port memory. Finally, the communication interface sends the message on the TT-NoC according to the time-triggered communication schedule.

VII. PERFORMANCE AND RESOURCE REQUIREMENTS

The prototype and the demonstration application are based on FPGA technology [51], i.e., Stratix III series from Altera. Most parts of the trusted subsystems (communication interfaces, TT-NoC) as well as parts of GEM services (end-to-end checksum generation of robustness service) are realized as VHDL code.

The optional services and application services have been realized as software-based implementations. Each optional service and each application service in the case study is realized with a dedicated component (i.e., application component or system component) and a dedicated soft-core CPU. The presented case study performs a strict partitioning using a one-to-one mapping between optional services/application services and hardware blocks.

Each system component is built using an Altera Nios II soft-core CPU with additional peripherals such as memory controllers, I/O controllers etc. The service of a system component (e.g., health monitoring of the robustness service) is realized as a software-based implementation on that soft-core CPU.

Each application component uses an Aeroflex Gaisler LEON3 soft-core CPU with the corresponding peripherals as the computing element for the realization of the application services.

Fig. 12 summarizes the resource utilization of selected entities for Stratix III FPGAs [51] in terms of used logic elements, registers, and memory (RAM) cells of that target technology. These values are the final resource utilization after placement and routing (fitting) with Altera Quartus II 9.0sp2.

A. Basic Time Services

The granularity of the common time in the prototype is $2^{-20}$ s (953.67 ns). This is the granularity of the macro tick of the local replicated clocks. All communication activities are synchronized to this common time. The granularity of the local system operation frequency is 3 ns (333 MHz).

B. Basic Communication Services

The communication interface (16 communication channels, 4 periods supported) achieves a maximum stable frequency of more than 300 MHz on the Stratix III FPGA family; the switching nodes of the TT-NoC can be driven even faster. With a unit size of data (flit size in the NoC) of 32 bit, a single communication channel theoretically can transfer 9.6 Gbit/s on this target technology. In case of multiple concurrent communication channels this value scales linearly.

The latency of communication is made up of a constant processing time in the communication interface (from dispatching until the start of the message transfer) plus the (a priori known) propagation delay in the TT-NoC. A communication interface requires seven cycles of system operation frequency to start a send or receive operation. The propagation delay of a communication channel through the network is given by the number of hops on the route. For each hop, a switching node takes one cycle of system operation frequency [19]. So, a message that travels the longest (multicast) route of six hops in the $2 \times 3$ mesh has a total latency of $7 + 6 = 13$ cycles, thus 39 ns at 333 MHz system operation frequency.
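The figures quoted in this section (953.67 ns, 9.6 Gbit/s, 39 ns) can be reproduced with a few lines of arithmetic. The following is a minimal sketch; the constant and function names are hypothetical, and it assumes that one 32-bit flit is transferred per clock cycle, which is what the quoted 9.6 Gbit/s at 300 MHz implies.

```python
from fractions import Fraction

MACROTICK_S = Fraction(1, 2**20)  # macro tick granularity in seconds
CYCLE_NS = 3                      # system cycle time quoted for 333 MHz
CI_SETUP_CYCLES = 7               # cycles until a send or receive starts
CYCLES_PER_HOP = 1                # cycles per switching node on the route
FLIT_BITS = 32                    # unit size of data in the NoC


def macrotick_ns() -> float:
    """Macro tick granularity in nanoseconds."""
    return float(MACROTICK_S * 10**9)


def channel_bandwidth_gbit_s(clock_mhz: float) -> float:
    """Theoretical bandwidth of one channel, assuming one flit per cycle."""
    return FLIT_BITS * clock_mhz / 1000  # bit * MHz = Mbit/s, scaled to Gbit/s


def latency_cycles(hops: int) -> int:
    """Constant interface processing time plus one cycle per hop."""
    return CI_SETUP_CYCLES + CYCLES_PER_HOP * hops


def latency_ns(hops: int) -> int:
    """Message latency in nanoseconds for a route with the given hop count."""
    return latency_cycles(hops) * CYCLE_NS


if __name__ == "__main__":
    print(round(macrotick_ns(), 2))        # macro tick in ns
    print(channel_bandwidth_gbit_s(300))   # single-channel bandwidth in Gbit/s
    print(latency_cycles(6), latency_ns(6))  # longest route in the 2 x 3 mesh
```

Because the route length is known a priori, the same calculation gives the latency of any channel, e.g., a one-hop route takes 8 cycles or 24 ns under these assumptions.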
B. Improved Robustness

The GENESYS MPSoC supports robustness by establishing a framework for fault and error containment, the selective restart of components that have failed after a transient fault, and the masking of transient and permanent errors by the replication of components. The GENESYS MPSoC offers generic core and optional services for robustness such as the state externalization and component restart services that enable the recovery from a transient fault with state restoration. In particular, the trusted subsystem prevents a faulty component from affecting the timely exchange of messages by other components. This property ensures that components represent independent fault containment regions. In conjunction with the determinism of the GENESYS MPSoC, the fault containment allows active redundancy with exact voting [53] to be implemented.

C. Improved Complexity Management

The management of the ever-increasing cognitive complexity of embedded systems is a major concern in all application domains. GENESYS attacks this problem by lifting the design process to a higher level of abstraction, namely to the level of self-contained hardware/software components that communicate exclusively by the exchange of messages. Components can be reused on the basis of their interface specification without having to know the internals of the component implementation. Based on the predictable and deterministic NoC with inherent fault isolation, the GENESYS MPSoC supports the classic simplification strategies of abstraction, partitioning and segmentation [54].

The case study has demonstrated that heterogeneous automotive services (i.e., control service and multimedia services) with different criticality levels can be integrated. The communication interfaces, TRM and TT-NoC establish strict control on the component behavior and restrict message transmissions to the time intervals specified in the time-triggered communication schedule. Thereby, the latency of a message transmitted by a component is independent of the behavior of the other components.

In particular, the local failure of a component does not lead to a global failure of the entire chip. For example, the control services in the case study will remain operational despite the failure of a multimedia component (e.g., due to a design fault in the implementation of the multimedia services).

D. Performance and Resource Requirements

As indicated by the results, the GENESYS MPSoC exhibits competitive performance that is suitable for applications with stringent bandwidth and latency constraints (e.g., high-resolution video streams, time-critical control applications). In addition, the prototype demonstrates the moderate resource demands imposed by the domain-independent core of the GENESYS MPSoC. The required logic elements of the trusted subsystem are fewer than the logic elements of two components in the prototype. The memory requirements of the trusted subsystem are configurable and depend on the configuration such as the size of the communication schedule and the source routing information in the communication interfaces.

In the GENESYS MPSoC, each service is ideally implemented by a dedicated self-contained component in order to maximize fault isolation. This approach also results in simpler components, which have the potential for a more efficient use of the chip area. According to Pollack's rule [55], the increase in performance of a sequential computer is only about the square root of the increase in the number of devices, which implies that doubling the transistor count will lead to a performance improvement of about 40%. Fortunately, the inherent concurrency in an embedded application (i.e., the different application and optional services) offers the potential to circumvent Pollack's rule. If an application is partitioned into a set of nearly autonomous concurrent services, then a nearly linear performance improvement could be achieved by assigning a dedicated processing element to each of these concurrent services.

For a given application and a specific instantiation of the GENESYS MPSoC with a particular selection of processing cores, the definition of the jobs is also important for an efficient use of chip area. The computational requirements of the jobs should be aligned with the performance of the processing cores. If the computational requirements of a job exceed the performance of a processing core, jobs need to be further subdivided. Conversely, jobs with low computational requirements can lead to a low utilization of individual processing cores. Such a limited utilization of processing cores can also be desired to provide the opportunity for future extensions or to control power dissipation in a given application (e.g., thermal constraints).

Furthermore, the GENESYS MPSoC unleashes the potential for more efficient implementation technologies. Due to cross-domain reuse, implementations can target significantly larger markets. A larger market justifies the use of different implementation technologies (e.g., special purpose hardware instead of a general purpose CPU). The shift to a different implementation technology can result in improvements of several orders of magnitude w.r.t. performance, consumed die area, and energy efficiency ([26, p. 7]). Therefore, a different implementation technology can significantly outperform the benefits of optimizing for a single application or domain. In contrast, optimizing for a single application might reduce the size of potential applications (and thus the market), thereby preventing the shift to another implementation technology (e.g., ASIC with better energy efficiency, higher performance) due to the higher fixed cost.

REFERENCES

[1] M. Broy, "Challenges in automotive software engineering," in Proc. 28th Int. Conf. Software Engineering: ICSE'06, New York, 2006, pp. 33–42, ACM.
[2] J. Hosbond, "Mobile systems development: Challenges, implications and issues," in Proc. IFIP Int. Working Conf. Mobile Inform. Syst., Leeds, U.K., 2005, pp. 279–286.
[3] "ARTEMIS Final Report on Reference Designs and Architectures: Constraints and Requirements," ARTEMIS (Advanced Research & Technology for Embedded Intelligence and Systems) Strategic Research Agenda, 2006. [Online]. Available: http://www.artemis-sra.eu
[4] "International roadmap of semiconductors," 2007 Ed., Section Design, SIA, Tech. Rep.
[5] C. Constantinescu, "Trends and challenges in VLSI circuit reliability," IEEE Micro, vol. 23, no. 4, pp. 14–19, 2003.
[6] M. Day and P. Hofstee, "Hardware and software architectures for the cell broadband engine processor," in Proc. CODES + ISSS Conf., Austin, TX, 2005.
[7] AUTOSAR GbR, AUTOSAR: Technical Overview V2.0.1, Jun. 2006.
[8] “ARINC Specification 651: Design guide for integrated modular avionics,” Aeronautical Radio, Inc., Annapolis, MD, Nov. 1991.
[9] K. Kronlöf, S. Kontinen, I. Oliver, and T. Eriksson, “A method for mobile terminal platform architecture development,” Advances in Design and Specification Languages for Embedded Systems, pp. 285–300, 2007.
[10] IBM, Sony, and Toshiba, “Cell broadband engine architecture,” Tech. Rep., 2006.
[11] K. Goossens, J. Dielissen, and A. Radulescu, “The Aethereal network on chip: Concepts, architectures, and implementations,” IEEE Design and Test of Computers, vol. 22, pp. 414–421, 2005.
[12] Sonics, “Sonics network technical overview,” 2002. [Online]. Available: www.sonicsinc.com
[13] A. Hansson, K. G. M. Bekooij, and J. Huisken, “Compsoc: A template for composable and predictable multi-processor system on chips,” ACM Trans. Des. Autom. Electron. Syst., vol. 14, no. 1, pp. 1–24, 2009.
[14] M. Millberg, E. Nilsson, R. Thid, S. Kumar, and A. Jantsch, “The nostrum backbone - A communication protocol stack for networks on chip,” in Proc. 17th Int. Conf. VLSI Design, 2004, pp. 693–696.
[15] M. Coppola, R. Locatelli, G. Maruccia, L. Pieralisi, and A. Scandurra, “Spidergon: A novel on-chip communication network,” in Proc. Int. Symp. System-on-Chip, Nov. 2004, p. 15.
[16] OCP-IP Association, “Open core protocol specification 2.1,” 2005.
[17] ARM, “AXI protocol specification,” 2004.
[18] G. Engleder, “Time-triggered network-on-a-chip,” M.S. thesis, Faculty of Computer Science, Real-Time Systems Group, Vienna Univ. Technology, Vienna, Austria, 2007.
[19] C. Paukovits and H. Kopetz, “Concepts of switching in the time-triggered network-on-chip,” in Proc. 14th IEEE Int. Conf. Embedded Real-Time Comput. Syst. Appl. (RTCSA), 2008, pp. 120–129.
[20] N. Suri, C. Walter, and M. Hugue, Advances in Ultra-Dependable Distributed Systems. Los Alamitos, CA: IEEE Computer Society Press, 1995, ch. 1.
[21] B. Pauli, A. Meyna, and P. Heitmann, “Reliability of electronic components and control units in motor vehicle applications,” in VDI Berichte 1415, Electronic Systems for Vehicles, 1998, Verein Deutscher Ingenieure, pp. 1009–1024.
[22] ARINC Specification 653: Avionics Application Software Standard Interface, Part 1 - Required Services, Aeronautical Radio, Inc., Annapolis, MD, Mar. 2006.
[23] R. Obermaisser and B. Huber, “A multi-core platform for integrated modular avionics derived from a cross-domain embedded system architecture,” in Proc. SAE 2009 AeroTech Congr. Exhibition, Seattle, WA, Nov. 2009, p. 2009-01-3262, DOI: 10.4271/2009-01-3262.
[24] AUTOSAR GbR, AUTOSAR - Specification of RTE Software V1.0.1, Jul. 2006.
[25] B. Huber and R. Obermaisser, “An ARTEMIS cross-domain embedded system architecture and its instantiation for real-time automotive applications,” in Proc. 30th IFAC Workshop on Real-Time Programming and 4th Int. Workshop on Real-Time Software, Mragowo, Poland, Oct. 2009, pp. 19–26.
[26] R. Lauwereins, “Multi-core platforms are a reality ... but where is the software support?,” presented at the Proc. 6th Int. Forum on Embedded MPSoC and Multicore, 2006.
[27] H. Kopetz and N. Suri, “Compositional design of RT systems: A conceptual basis for specification of linking interfaces,” in Proc. 6th IEEE Int. Symp. Object-Oriented Real-Time Distrib. Comput., May 2003, pp. 51–60.
[28] R. Obermaisser and P. Gutwenger, “Model-based development of MPSOCS with support for early validation,” in Proc. 35th Annu. Conf. IEEE Ind. Electron. Soc. (IECON 2009), Porto, Portugal, Nov. 2009, pp. 2867–2873.
[29] “Road vehicles - Interchange of digital information - Controller area network (CAN) for high-speed communication,” Int. Standardization Organization, ISO 11898, 1993.
[30] LIN Specification Package Revision 2.0, LIN Consortium, Sep. 2003.
[31] “IEEE Std. 1149.1-1990, IEEE standard test access port and boundary-scan architecture description,” IEEE Standards Assoc., Tech. Rep., 1990.
[32] DO-178B: “Software considerations in airborne systems and equipment certification,” Radio Technical Commission for Aeronautics, Inc. (RTCA), Washington, DC, Dec. 1992.
[33] “IEC 61508-7: Functional safety of electrical/electronic/programmable electronic safety-related systems - Part 7: Overview of techniques and measures,” IEC: Int. Electrotechnical Commission, 1999.
[34] H. Kopetz, Real-Time Systems, Design Principles for Distributed Embedded Applications. Boston, MA: Kluwer, 1997.
[35] C. Hoefer, “Causality and determinism: Tension, or outright conflict,” Revista de Filosofia, vol. 29, no. 2, pp. 99–225, 2004.
[36] J. Muttersbach, T. Villiger, and W. Fichtner, “Practical design of globally-asynchronous locally-synchronous systems,” in Proc. 6th Int. Symp. Advanced Res. Asynchronous Circuits and Systems (ASYNC’00), Washington, DC, 2000, pp. 52–59.
[37] “Smart transducers interface, Version 1.0,” Object Management Group (OMG), Jan. 2003.
[38] S. Chandra, J. Larus, and A. Rogers, “Where is time spent in message-passing and shared-memory programs?,” in Proc. 6th Int. Conf. Architectural Support for Programming Languages and Operating Systems, 1994, pp. 61–73.
[39] Y. Dou, Z. Pang, and X. Zhou, “Implementing a software virtual shared memory on PVM,” in Proc. Adv. Parallel and Distrib. Comput., 1997, pp. 190–195.
[40] J. Leverich, H. Arakida, A. Solomatnikov, A. Firoozshahian, M. Horowitz, and C. Kozyrakis, “Comparing memory systems for chip multiprocessors,” in Proc. Int. Symp. Comput. Architecture (ISCA), 2007, pp. 358–368.
[41] Y. Li, V. Suhendra, Y. Liang, T. Mitra, and A. Roychoudhury, “Timing analysis of concurrent programs running on shared cache multi-cores,” in Proc. 30th IEEE Real-Time Syst. Symp., RTSS, Washington, DC, Dec. 1–4, 2009, pp. 57–67, DOI: 10.1109/RTSS.2009.32.
[42] F. Poletti, A. Poggiali, D. Bertozzi, L. Benini, P. Marchal, M. Loghi, and M. Poncino, “Energy-efficient multiprocessor systems-on-chip for embedded computing: Exploring programming models and their architectural support,” IEEE Trans. Comput., vol. 56, no. 5, pp. 606–621, May 2007.
[43] T. Gaul, W. Lowe, and M. Noga, “Specification in a large industry consortium - The MOST approach,” in Proc. 27th Annu. Conf. IEEE Ind. Electron. Soc. (IECON’01), Denver, CO, Nov. 2001, vol. 3, pp. 1828–1833.
[44] E. Onieva, D. A. Pelta, J. Alonso, V. Milanes, and J. Perez, “A modular parametric architecture for the TORCS racing engine,” in Proc. IEEE Symp. Computat. Intell. Games, 2009, pp. 256–262.
[45] H. Paulitsch, C. Paukovits, and C. E. Salloum, “Fault isolation with intermediate checks of end-to-end checksums in the time-triggered system-on-chip architecture,” in Proc. 4th Symp. Ind. Embedded Syst. (SIES’09), 2009, pp. 90–99.
[46] B. Huber, C. E. Salloum, and R. Obermaisser, “A resource management framework for mixed-criticality embedded systems,” in Proc. 34th Annu. Conf. IEEE Ind. Electron. Soc. (IECON 2008), Orlando, FL, 2008, pp. 2425–2431.
[47] Avalon Interface Specification, Chapter Avalon Memory-Mapped Interfaces, Altera Corporation, 2008. [Online]. Available: http://www.altera.com
[48] AMBA 2 AHB Specification 1.0, ARM Limited, 1999. [Online]. Available: http://www.arm.com
[49] H. Kopetz and J. Reisinger, “The non-blocking write protocol NBW: A solution to a real-time synchronization problem,” in Proc. 14th Real-Time Syst. Symp., Raleigh-Durham, NC, 1993, pp. 131–137.
[50] J. Labrosse, “MicroC/OS-II - The Real-Time Kernel,” CMPBooks, 2002.
[51] Altera Corp., “Stratix III Device Handbook,” 2009.
[52] B. Ames, “Radiation threatens avionics as chip geometries shrink,” Military Aerosp. Electron., 2004.
[53] S. Poledna, Fault-Tolerant Real-Time Systems: The Problem of Replica Determinism. Boston, MA: Kluwer, 1996.
[54] H. Kopetz, “The complexity challenge in embedded system design,” in Proc. 11th IEEE Int. Symp. Object-Oriented Real-Time Distrib. Comput., 2008.
[55] P. Gelsinger, “Microprocessors for the new millennium, challenges, opportunities, and new frontiers,” in Proc. Solid State Circuit Conf., 2001, pp. 22–25.