1. Zhonghai Lu, Yuan Yao Dynamic Traffic Regulation in NoC-Based Systems IEEE
Transactions on Very Large Scale Integration (VLSI) Systems (Volume: 25, Issue: 2, Feb.
2017) Page(s): 556 - 569

The authors propose dynamic traffic regulation to improve system performance for
NoC-based multi/many-processor systems-on-chip (MPSoC) and chip multi/many-core
processor (CMP) designs. It can be applied to MPSoCs for intellectual property
integration in an open-loop fashion by injecting traffic according to its run-time profiled
characteristics. It can also be applied to CMPs in a closed-loop fashion by admitting
traffic in a way that is fully adaptive to the traffic and network states. Through extensive
experiments, they show that both the open-loop and closed-loop dynamic regulation
techniques can significantly improve network and system performance.

2. Nezam Rohbani ; Zahra Shirmohammadi ; Maryam Zare ; Seyed Ghassem Miremadi
"LAXY: A Location-Based Aging-Resilient Xy-Yx Routing Algorithm for Network on
Chip" IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
( Volume: PP, Issue: 99 ) Date of Publication: 05 January 2017

Network on Chip (NoC) is a scalable interconnection architecture for the ever-increasing
communication demand between processing cores. However, at nanoscale technology sizes,
NoC lifetime is limited by the aging processes of Negative Bias Temperature Instability, Hot
Carrier Injection and electromigration. Usually, because of unbalanced utilization of NoC
resources, some parts of the network experience more thermal stress and a higher duty cycle
than other parts, which may accelerate chip failure. To slow down the aging rate of the NoC,
this paper proposes an oblivious routing algorithm called Location-based Aging-resilient
Xy-Yx (LAXY) to distribute packet flow over the entire network. LAXY is based on the fact
that dimension-ordered routing algorithms impose the highest traffic load on the central
nodes in mesh topologies. To balance the traffic over the network, certain routers at the east
and the west of the NoC, which otherwise uses dimension-order XY routing, are statically
configured as YX. Various configurations have been explored for LAXY, and the simulations
show that a specific configuration, called Fishtail, increases the mean time to failure of the
routers and interconnects by about 42% and 56%, respectively. Moreover, by balancing the
load over the network, LAXY improves overall packet latency by about 7% on average, with
negligible area overhead.
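
The idea of mixing XY and YX dimension-order routing by router location can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LAXY algorithm itself: the `yx_columns` parameter and the rule that the source router's column selects the dimension order for the whole path are hypothetical, and the Fishtail configuration from the paper is not reproduced here.

```python
def _walk(path, pos, target, axis):
    """Step one hop at a time along one axis until pos[axis] == target."""
    pos = list(pos)
    while pos[axis] != target:
        pos[axis] += 1 if target > pos[axis] else -1
        path.append(tuple(pos))
    return tuple(pos)


def xy_route(src, dst):
    """Dimension-order XY routing: travel along X first, then along Y."""
    path = [src]
    pos = _walk(path, src, dst[0], axis=0)
    _walk(path, pos, dst[1], axis=1)
    return path


def yx_route(src, dst):
    """Dimension-order YX routing: travel along Y first, then along X."""
    path = [src]
    pos = _walk(path, src, dst[1], axis=1)
    _walk(path, pos, dst[0], axis=0)
    return path


def location_based_route(src, dst, yx_columns):
    """Assumption: routers whose X coordinate is in yx_columns use YX,
    all other routers use XY, and the source decides for the whole path."""
    if src[0] in yx_columns:
        return yx_route(src, dst)
    return xy_route(src, dst)


if __name__ == "__main__":
    # In a 4x4 mesh, treat the westmost and eastmost columns as YX routers.
    print(location_based_route((0, 0), (2, 1), yx_columns={0, 3}))
    print(location_based_route((1, 0), (2, 1), yx_columns={0, 3}))
```

Packets injected at the edge columns turn early onto the Y dimension, so fewer of them cross the central rows on their X leg, which is the load-spreading effect the abstract describes.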
3. Minghua Tang ; Xiaola Lin ; Maurizio Palesi The Repetitive Turn Model for Adaptive
Routing IEEE Transactions on Computers ( Volume: 66, Issue: 1, Jan. 1 2017 )

For 2D mesh based Network-on-Chip (NoC), the prohibited turns of routing
algorithms should be repetitively distributed in order for the routing algorithms to be
implemented by logic-based circuits. In this paper, we aim to explore the design space
for logic-based routing algorithms, and propose new logic-based routing algorithms that
outperform the state-of-the-art counterparts. Toward this goal, we first construct all
routing algorithms for the 5 x 5 2D mesh topology. Then we select those routing algorithms
which have repetitive prohibited turns across both the network rows and columns. In
addition, we choose those routing algorithms that have smaller routing pressures than the
Odd-Even routing algorithm. Then the routing algorithms for 2D mesh topologies ranging
from 6 x 6 to 15 x 15 are constructed according to the prohibited turn distribution of
the selected routing algorithms. Two routing algorithms that have smaller routing
pressures than the Odd-Even routing algorithm are obtained for all considered networks. The
obtained logic-based routing algorithms are called the Repetitive Turn Model (RTM).
Simulation results show that RTM can achieve up to 51% performance improvement
compared to the Odd-Even routing algorithm.
4. Changlin Chen ; Yaowen Fu ; Sorin Cotofana Towards Maximum Utilization of
Remained Bandwidth in Defected NoC Links IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems ( Volume: 36, Issue: 2, Feb. 2017 )

To reach this target, we make the following contributions in this paper:

1) we propose a flit serialization (FS) method to efficiently utilize partially faulty links.
The FS approach divides the links into a number of equal width sections, and serializes
sections of adjacent flits to transmit them on all fault-free link sections to mitigate the
unbalance between the flit size and the actual link bandwidth;

2) we propose link augmentation with one redundant section as a low cost mechanism
to mitigate the FS drawback that a link's available bandwidth is reduced even if it contains
only one faulty wire;

3) we deactivate HD links when their fault level exceeds a certain threshold to diminish
congestion caused by HD links. The optimal threshold is derived by comparing the zero
load packet transmission latency on the HD links and that on the shortest alternative path.
Our proposal is evaluated with synthetic traffic and PARSEC benchmarks. Experimental
results indicate that the FS method can achieve a lower area*power/saturation_throughput
value than all state of the art link fault tolerant strategies.

With a redundant section in each link, the NoC saturation throughput can be improved
substantially over FS alone, e.g., by 18% when 10% of the NoC wires are broken.
Simulation results obtained at various broken-wire rate configurations indicate that we
achieve the highest saturation throughput if 4- or 8-section links with a flit transmission
latency longer than four cycles are deactivated.
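
The bandwidth arithmetic behind flit serialization can be made concrete with a small Python sketch. It assumes a simplified model that is not taken from the paper: a flit is as wide as `sections` link sections, one section crosses each fault-free section per cycle, and a redundant section simply adds one more physical section to the link. The function and parameter names are hypothetical.

```python
from fractions import Fraction


def usable_sections(sections, faulty_sections, redundant=0):
    """Fault-free sections that can carry data, capped at the flit width."""
    physical = sections + redundant
    if not 0 <= faulty_sections <= physical:
        raise ValueError("faulty_sections must lie between 0 and the section count")
    return min(physical - faulty_sections, sections)


def effective_bandwidth(sections, faulty_sections, redundant=0):
    """Fraction of the nominal link bandwidth that survives the faults."""
    return Fraction(usable_sections(sections, faulty_sections, redundant), sections)


def cycles_for_flits(num_flits, sections, faulty_sections, redundant=0):
    """Cycles needed to push num_flits back-to-back over the degraded link,
    packing sections of adjacent flits onto every fault-free section."""
    usable = usable_sections(sections, faulty_sections, redundant)
    if usable == 0:
        raise ValueError("link has no fault-free section left")
    total_sections = num_flits * sections
    return -(-total_sections // usable)  # ceiling division


if __name__ == "__main__":
    # A 4-section link with one faulty section keeps 3/4 of its bandwidth.
    print(effective_bandwidth(4, 1))
    # One redundant section restores the full bandwidth for a single fault.
    print(effective_bandwidth(4, 1, redundant=1))
    print(cycles_for_flits(3, 4, 1))
```

In this model a single faulty wire costs a whole section, which is the drawback that the redundant section in contribution 2 is meant to offset.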

5. Michael Opoku Agyeman ; Quoc-Tuan Vien ; Gary Hill ; Scott Turner ; Terrence Mak
An Efficient Channel Model for Evaluating Wireless NoC Architectures Computer
Architecture and High Performance Computing Workshops (SBAC-PADW), 2016
International Symposium on 26-28 Oct. 2016

Their proposed channel model demonstrates that the wireless channel in WiNoCs suffers
from not only dielectric propagation loss (DPL) but also molecular absorption attenuation
(MAA), both of which add to the total path loss and reduce the reliability of the system.
6. Michael Opoku Agyeman ; Wen Zong An Efficient 2D Router Architecture for
Extending the Performance of Inhomogeneous 3D NoC-Based Multi-Core
Architectures Computer Architecture and High Performance Computing Workshops
(SBAC-PADW), 2016 International Symposium on 26-28 Oct. 2016

This paper proposes a low-latency adaptive router with a low-complexity single-cycle
bypassing mechanism to alleviate the performance degradation due to the slow 2D
routers in inhomogeneous 3D NoCs. By combining the low-complexity bypassing
technique with adaptive routing, the proposed router is able to balance the traffic in the
network and reduce the average packet latency under various traffic loads. Simulation shows
that the proposed router can reduce the average packet delay by an average of 45% in 3D
NoCs.

7. Gwangsun Kim, Michael Mihn-Jong Lee, John Kim, Jae W. Lee, Dennis Abts, and
Michael Marty Low-Overhead Network-on-Chip Support for Location-Oblivious
Task Placement IEEE TRANSACTIONS ON COMPUTERS, VOL. 63, NO. 6, JUNE 2014.

Many-core processors will have many processing cores with a network-on-chip (NoC) that
provides access to shared resources such as main memory and on-chip caches. However,
locally-fair arbitration in a multi-stage NoC can lead to globally unfair access
to shared resources and impact system-level performance depending on where each task is
physically placed. In this work, we propose an arbitration scheme to provide equality-of-service
(EoS) in the network and provide support for location-oblivious task placement. We propose
using probabilistic arbitration combined with distance-based weights to achieve EoS and
overcome the limitation of the round-robin arbiter.
However, the complexity of probabilistic arbitration results in high area and long latency,
which negatively impacts performance. To reduce the hardware complexity, we
propose a hybrid arbiter that switches between a simple arbiter at low load and a complex
arbiter at high load. The hybrid arbiter is enabled by the observation that arbitration only
impacts the overall performance and global fairness at high load. We evaluate our arbitration
scheme with synthetic traffic patterns and GPGPU benchmarks. Our results show that a hybrid
arbiter that combines a round-robin arbiter with probabilistic distance-based arbitration reduces
performance variation as task placement is varied and also improves average IPC.
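
A hybrid arbiter of this kind can be sketched in software as follows. This is a behavioural illustration only, under assumptions that do not come from the paper: the weight of a request is taken to be the hop count its packet has already travelled, the switch between modes is a single `load_threshold`, and the class and parameter names are hypothetical.

```python
import random


class HybridArbiter:
    """Round-robin at low load, weighted-random (distance-based) at high load."""

    def __init__(self, num_ports, load_threshold, seed=0):
        self.num_ports = num_ports
        self.load_threshold = load_threshold  # assumed switching point
        self.next_port = 0                    # round-robin pointer
        self.rng = random.Random(seed)

    def grant(self, requests, load):
        """requests maps a requesting input port to its weight (hop count >= 1).
        Returns the granted port, or None if nobody is requesting."""
        if not requests:
            return None
        if load < self.load_threshold:
            # Simple arbiter: grant the first requester at or after the pointer.
            for offset in range(self.num_ports):
                port = (self.next_port + offset) % self.num_ports
                if port in requests:
                    self.next_port = (port + 1) % self.num_ports
                    return port
        # Complex arbiter: win probability proportional to distance travelled.
        ports = sorted(requests)
        weights = [requests[p] for p in ports]
        return self.rng.choices(ports, weights=weights, k=1)[0]


if __name__ == "__main__":
    arbiter = HybridArbiter(num_ports=4, load_threshold=0.5)
    requests = {1: 3, 2: 1}  # port 1 carries a packet from 3 hops away
    print([arbiter.grant(requests, load=0.1) for _ in range(4)])
    print([arbiter.grant(requests, load=0.9) for _ in range(8)])
```

At low load the two ports simply alternate; at high load the packet that has come from farther away wins more often, which is how distance-based weights counteract the unfairness that accumulates over multiple locally-fair arbitration stages.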
8. Junwen Luo, Graeme Coapes, Terrence Mak, Tadashi Yamazaki, Chung Tin, and Patrick
Degenaar Real-Time Simulation of Passage-of-Time Encoding in Cerebellum Using a
Scalable FPGA-Based System IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS
AND SYSTEMS, VOL. 10, NO. 3, JUNE 2016.

The cerebellum plays a critical role in sensorimotor control and learning. However,
dysmetria or delays in movement onsets resulting from damage to the cerebellum cannot be
cured completely at the moment. Neuroprosthesis is an emerging technology that can
potentially substitute for such a motor control module in the brain. A prerequisite for this to
become practical is the capability to simulate the cerebellum model in real time, with low
timing distortion for proper interfacing with the biological system. In this paper, we present a
frame-based network-on-chip (NoC) hardware architecture for implementing a bio-realistic
cerebellum model with neurons, which has been used for studying timing control or
passage-of-time (POT) encoding mediated by the cerebellum. The simulation results verify
that our implementation reproduces the POT representation by the cerebellum properly.
Furthermore, our field-programmable gate array (FPGA)-based system demonstrates excellent
computational speed: it can complete 1 s of real-world activity within 25.6 ms. It is also
highly scalable, in that it can maintain approximately the same computational speed even if
the neuron number increases by one order of magnitude. Our design is shown to outperform
three alternative approaches previously used for implementing spiking neural network models.
Finally, we show a hardware electronic setup and illustrate how the silicon cerebellum can be
adapted as a potential neuroprosthetic platform for future biological or clinical
application.

9. Raed Al-Dujaily, An Li, Robert G. Maunder, Terrence Mak, Bashir M. Al-Hashimi,
and Lajos Hanzo A Scalable Turbo Decoding Algorithm for High-Throughput
Network-on-Chip Implementation Southampton Wireless, ECS, University of Southampton,
Southampton SO17 1BJ, U.K.

Wireless communication at near-capacity transmission throughputs is facilitated by
employing sophisticated Error Correction Codes (ECCs), such as turbo codes. However,
real-time communication at high transmission throughputs is only possible if the challenge of
implementing turbo decoders having equally high processing
throughputs, using Networks-on-Chip (NoCs), which facilitate flexible and high-throughput
parallel processing. However, turbo decoders conventionally operate on
the basis of the Logarithmic Bahl-Cocke-Jelinek-Raviv (Log-BCJR) algorithm, which has an
inherently serial nature, owing to its data dependencies. This limits the
exploitation of the NoC's computing resources, particularly as the size of the NoC is scaled
up. Motivated by this, we propose a novel turbo decoder algorithm, which eliminates the data
dependencies of the Log-BCJR algorithm and therefore has an inherently parallel nature. We
show that by jointly optimizing the proposed algorithm with the NoC architecture, a
significantly improved utilization of the available computing resources is achieved. Owing to
this, our proposed turbo decoder achieves a factor.

10. S. Faralli, F. Gambini, P. Pintus, M. Scaffardi, O. Liboiron-Ladouceur, Y. Xiong,
P. Castoldi, F. Di Pasquale, N. Andriolli, and I. Cerutti Bidirectional
Transmission in an Optical Network on Chip With Bus and Ring Topologies DOI:
10.1109/JPHOT.2016.2526607, IEEE, 2016.

In photonic integrated networks on chip (NoCs), microrings are commonly used for adding or
dropping a single optical signal to be switched in the NoC. This paper
demonstrates the feasibility of adding or dropping two optical signals at the same wavelength
in the same microring of NoCs with bus and ring topologies. More specifically, the same
microring can be used to support simultaneous bidirectional transmissions of two signals to
be coupled in the NoC topology, leading to two different configurations, called shared
source-microring and shared destination-microring. Spectral characterization shows good
agreement between simulations and measurements taken on a silicon-based integrated NoC.
Bit-error-rate (BER) measurements indicate that the shared source-microring configuration
performs better, achieving a penalty as low as 1.5 dB for a BER of 10^-9 at 10 Gb/s in the
bus NoC. A higher penalty in the ring NoC for both configurations is due to higher crosstalk
in the interconnecting ring.

11. Chunhua Xiao and Weichen Liu Through Global Sharing to Improve Network Efficiency
for Radio-Frequency Interconnect Based Network-on-Chip Department of Computer Science,
Chongqing University, Chongqing 400044, China; Key Laboratory of Dependable Service
Computing in Cyber Physical Society of Ministry of Education, Chongqing 400044, China
Corresponding author: C. Xiao (xiaochhtky@163.com)
Received August 2, 2016, accepted August 29, 2016, date of
publication October 4, 2016, date of current version October 31, 2016.

According to the International Technology Roadmap for Semiconductors, improving the
characteristics of metal wires will no longer satisfy performance requirements, and new
interconnect paradigms are needed. Radio frequency interconnect (RF-I) enjoys better CMOS
compatibility compared with other alternatives, and is exploited as express shortcuts overlaid
on traditional network-on-chip (NoC) topologies.
However, the efficient utilization of the on-chip communication bandwidth provided by RF
interconnects still remains an open problem. To make effective use of scarce on-chip RF-I for
different traffic patterns, a system model of NoC with shared RF-I (SRFNoC) is constructed
for the first time in this paper, along with a detailed design methodology. A lightweight
arbitration mechanism is utilized for shared resource allocation, and a new
mapping algorithm based on communication weight and simulated annealing is proposed for
topology distribution. Both static and dynamic routing for SRFNoC are also discussed in
detail. The experimental results show that, compared with the NoC with long-range wired
links and a representative network-on-chip with exclusively allocated radio frequency
interconnect, the proposed network achieves better communication efficiency with less
resource overhead.

12. Dmitri Moltchanov, Alexander Antonov, Arkady Kluchev, Karolina Borunova, Pavel
Kustarev, Vitaly Petrov, Yevgeni Koucheryavy, and Alexey Platunov Statistical Traffic
Properties and Model Inference for Shared Cache Interface in Multi-Core CPUs
Received July 18, 2016, accepted August 9, 2016, date of publication August 25, 2016, date
of current version September 16, 2016.

The general-purpose network-on-chip (GP-NoC) has recently attracted the attention of
research and industry as a way to support the growing demands of computing systems. The
design and development of the communications and networking functions for such
large-scale versatile systems require knowledge of the traffic exchanged between the
computing nodes. The object of study in this paper is the
last-level shared cache interface, which is likely to be a traffic bottleneck in future GP-NoC
architectures. First, using direct measurements, we report on the stochastic traffic
properties at large scales and provide the first two moments and distribution functions.
Complementing measurements with fine-grained cycle-accurate CPU simulations, we then
analyze the small-scale traffic behavior. We show that even for the simplest applications, such
as reading or writing of data, the nature of the traffic is stochastic, depends on the number of
active cores, and, irrespective of the application type, has an explicit batch structure. We
further reveal that the batch sizes and inter-batch intervals can be well approximated by a
geometric distribution, and the approximation becomes better when the number of active cores
increases. These properties identify a simple arrival model that can be used in analytical
or simulation-based performance evaluation studies of the shared interface technologies in
prospective NoCs.
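
A batch arrival process with geometric batch sizes and geometric inter-batch intervals is easy to generate for simulation studies. The Python sketch below is a minimal illustration: the parameter values and the names `p_size` and `p_gap` are assumptions, not figures fitted in the paper, and time is counted in abstract slots.

```python
import random


def geometric(p, rng):
    """Sample from a geometric distribution on {1, 2, ...} with success
    probability p, i.e. the number of trials up to the first success."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    k = 1
    while rng.random() >= p:
        k += 1
    return k


def batch_arrivals(num_batches, p_size, p_gap, seed=0):
    """Return a list of (arrival_slot, batch_size) pairs.
    Gaps between batches and batch sizes are both geometric."""
    rng = random.Random(seed)
    slot = 0
    events = []
    for _ in range(num_batches):
        slot += geometric(p_gap, rng)           # inter-batch interval
        events.append((slot, geometric(p_size, rng)))  # requests in this batch
    return events


if __name__ == "__main__":
    events = batch_arrivals(num_batches=10, p_size=0.5, p_gap=0.1, seed=1)
    for slot, size in events:
        print(f"slot {slot}: batch of {size} request(s)")
```

The mean batch size is 1/p_size and the mean gap is 1/p_gap, so the two parameters can be set directly from measured first moments.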

13. Emmanuel Abbe and Emre Telatar Polar Codes for the m-User
Multiple Access Channel IEEE TRANSACTIONS ON INFORMATION THEORY, VOL.
, NO. , MONTH YEAR.

In this paper, polar codes for the m-user multiple access channel (MAC) with binary inputs
are constructed. It is shown that Arıkan's polarization technique applied individually to each
user transforms independent uses of an m-user binary
input MAC into successive uses of extremal MACs. This transformation
has a number of desirable properties: (i) the uniform sum rate of the original MAC is
preserved, (ii) the extremal MACs have uniform rate regions that are not only polymatroids
but matroids and thus (iii) their uniform sum rate can be reached
by each user transmitting either uncoded or fixed bits; in this sense they are easy to
communicate over. A polar code can then be constructed with an encoding and decoding
complexity of $O(n \log n)$ (where $n$ is the block length), a block error probability
of $o(\exp(-n^{1/2-\varepsilon}))$, and capable of achieving the uniform sum rate of any
binary input MAC with arbitrarily many users. Applications of this polar code construction to
channels with a finite field input alphabet and to the AWGN channel are also
discussed.
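
The O(n log n) encoding cost comes from the butterfly structure of Arıkan's polarization transform. The Python sketch below shows only that transform, x = u F^{(x)m} over GF(2) with F = [[1, 0], [1, 1]], applied separately to each user's input block as the abstract describes. The choice of frozen bit positions and the decoder are not covered, the bit-reversal permutation is omitted, and the function names are hypothetical.

```python
def polar_transform(u):
    """Compute x = u * F^(tensor m) over GF(2) for len(u) == 2**m,
    using log2(n) butterfly stages of n/2 XORs each."""
    n = len(u)
    if n == 0 or n & (n - 1):
        raise ValueError("block length must be a power of two")
    x = list(u)
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]  # upper branch absorbs the lower branch
        step *= 2
    return x


def encode_users(user_blocks):
    """Apply the transform individually to each user's block of bits."""
    return [polar_transform(block) for block in user_blocks]


if __name__ == "__main__":
    # Two users, block length 4.
    print(encode_users([[1, 0, 1, 1], [0, 1, 0, 0]]))
```

Because F is its own inverse over GF(2), applying `polar_transform` twice returns the original block, which is a convenient sanity check on any implementation.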

14. Michael Opoku Agyeman, Quoc-Tuan Vien, Ali Ahmadinia, Alexandre Yakovlev,
Kin-Fai Tong, and Terrence Mak A Resilient 2-D
Waveguide Communication Fabric for Hybrid Wired-Wireless NoC Design
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL.
28, NO. 2, FEBRUARY 2017.

Hybrid wired-wireless Network-on-Chip (WiNoC) has emerged as an alternative
solution to the poor scalability and performance issues of conventional wireline NoC design
for future System-on-Chip (SoC). The existing feasible wireless solution for WiNoCs, in the
form of millimeter wave (mm-Wave), relies on free space signal radiation, which has high power
dissipation with a high degradation rate in the signal strength per transmission distance.
Moreover, over the lossy wireless medium, combining wireless and wireline channels
drastically reduces the total reliability of the communication fabric. Surface wave has been
proposed as an alternative wireless technology for low power on-chip communication. With
the right design considerations, the reliability and performance benefits of the surface wave
channel could be extended. In this paper, we propose a surface wave communication fabric
for emerging WiNoCs that is able to match the reliability of traditional wireline NoCs. First,
we propose a realistic channel model which demonstrates that existing mm-Wave WiNoCs
suffer from not only free-space spreading loss (FSSL) but also molecular absorption
attenuation (MAA), especially in the high frequency band, which reduces the reliability of
the system. Consequently,
we employ a carefully designed transducer and a commercially available thin metal conductor
coated with a low cost dielectric material to generate surface wave signals with improved
transmission gain. Our experimental results demonstrate that the proposed communication
fabric can achieve a 5 dB operational bandwidth of about 60 GHz around the center
frequency (60 GHz). By improving the transmission reliability of the wireless layer, the
proposed communication fabric can improve the maximum sustainable load of NoCs by an
average of 20.9 and 133.3 percent compared to existing WiNoCs and wireline NoCs, respectively.

16. Gwangsun Kim, Michael Mihn-Jong Lee, John Kim, Jae W. Lee, Dennis
Abts, and Michael Marty Mapping of Irregular IP onto NoC Architecture with
Optimal Energy Consumption Received August 2, 2016, accepted August 29, 2016, date of
publication October 4, 2016, date of current version October 31, 2016.

Network on chip (NoC) architectures have been proposed to resolve complex on-chip
communication problems. An NoC-based mapping algorithm is presented in this paper. It can
map irregular intellectual property (IP) cores onto regular-tile 2-D mesh NoC architectures.
The basic idea is to decompose a large IP into several dummy IPs or integrate several small
IPs into one dummy IP, such that each dummy IP can fit into a single tile. It can also allocate
buffer space according to the input/output degree and avoid connection congestion by
adapting communication density. Experimental data indicate that, using the algorithm
proposed in this paper, the communication energy can be reduced by about 7%. Key words:
network on chip (NoC); communication matrix; router weight; communication density.

18. Ruilian Xie, Jueping Cai and Xin Xin Simple fault-tolerant method to balance
load in network-on-chip Received August 2, 2016, accepted August 29, 2016, date of
publication October 4, 2016, date of current version October 31, 2016.

19. Rohit Kumar and Ann Gordon-Ross ({kumar, ann}@chrec.org) NSF Center for
High-Performance Reconfigurable Computing, University of Florida, Gainesville, FL 32611
MACS: A Highly Customizable Low-latency Communication Architecture
DOI 10.1109/TPDS.2015.2390631, IEEE Transactions on Parallel and Distributed Systems.

Networks-on-chip (NoCs) are an increasingly popular communication infrastructure in
single chip VLSI design for enhancing parallelism and system scalability. Processing
elements (PEs) connect to a communication topology via NoC switches, which are
responsible for runtime establishment and management of inter-PE communication channels.
Since NoC switch design directly affects overall system performance and exploited
communication parallelism, much previous work focuses on efficient NoC switch design. In
this paper we present MACS, a highly parametric NoC switch architecture that provides
reduced data transfer latency, increased designer flexibility, and scalability as compared to
previous architectures by combining and enhancing several NoC design strategies. MACS
enhances inter-PE communication using a circuit switching technique with minimal adaptive
routing and a simple and fair path resolution algorithm to maximize bandwidth utilization.
We evaluate area and performance of an FPGA implementation of MACS, and, compared to
previous work, MACS offers a 2x to 7x decrease in average channel setup latency, a 1.7x to
2x reduction in area requirements, similar average packet latency, up to a 6x increase in the
network saturation point, and up to a 1.4x increase in bandwidth utilization. Additionally, we
illustrate MACS's low average channel setup latency using 6 network traffic patterns and 8
parallel JPEG decompression core trace simulations.

20. Chen Wu, Chenchen Deng, Leibo Liu, Jie Han, Jiqiang Chen, Shouyi Yin, and Shaojun
Wei An Efficient Application Mapping Approach for the Co-Optimization of Reliability,
Energy, and Performance in Reconfigurable NoC Architectures
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS
AND SYSTEMS, VOL. 34, NO. 8, AUGUST 2015.

In this paper, an efficient application mapping approach is proposed for the co-optimization
of reliability, communication energy, and performance (CoREP) in network-on-chip
(NoC)-based reconfigurable architectures. A cost model
for the CoREP is developed to evaluate the overall cost of a mapping. In this model,
communication energy and latency (as a measure of performance) are first combined in an
energy latency product (ELP), and then ELP is co-optimized with reliability
by a weight parameter that defines the optimization priority. Both transient and intermittent
errors in the NoC are modeled in CoREP. Based on CoREP, a mapping approach, referred to as
priority and ratio oriented branch and bound (PRBB),
is proposed to derive the best mapping by enumerating all the candidate mappings organized
in a search tree. Two techniques, branch node priority recognition and partial cost ratio
utilization, are adopted to improve the search efficiency. Experimental results show that the
proposed approach achieves significant improvements
in reliability, energy, and performance. Compared with the state-of-the-art methods in the
same scope, the proposed approach has the following distinctive advantages: 1) CoREP is
highly flexible to address various NoC topologies and routing
algorithms while others are limited to some specific topologies and/or routing algorithms; 2)
general quantitative evaluations for reliability, energy, and performance are made,
respectively, before being integrated into a unified cost model in a general context, while other
similar models only touch upon two of them; and
3) CoREP-based PRBB attains a competitive processing speed, which is faster than other
mapping approaches.
