Mentorpaper 102740

THE PCB ENGINEER’S GUIDE TO
SUCCESSFUL DDR BUS DESIGN
MENTOR, A SIEMENS BUSINESS
WHITE PAPER
PCB DESIGN
www.mentor.com
ABSTRACT
Designing high-speed DRAM DDR memory busses can be stressful. While the schematic design for such busses
may amount to simply wiring one pin to another, the PCB layout can be quite complex. There are three broad
reasons for this complexity. The first is that the input setup and hold time requirements for the data signals at the
DRAM must be met. The second is that the setup and hold time requirements for the address/command signals at
the DRAM must be met. And finally, the DQS and CLK need to line up approximately at each DRAM.
In general, DDR buses are challenging. Changing voltages and currents can create voltages and currents in
neighboring channels that result in unintended crosstalk. Parallel busses are especially susceptible to
crosstalk because most of the signals are single-ended.
Additionally, with signal integrity challenges such as inter-symbol interference (ISI) and crosstalk on the DDR bus, it
can be difficult to know what to validate in the first place. For the DRAMs, validation requirements are spelled out
in the JEDEC documentation. However, these documents aren’t always intuitive. On the controller side, the
validation requirements are usually more straightforward, as it’s in the best interest of the controller vendor for
designs to be completed quickly and with high quality. Nevertheless, it is still up to the designer to make sure that
the documents and requirements are well understood.
INTRODUCTION
This paper tackles the critical signal integrity concerns encountered when designing, simulating, and analyzing DDR
buses. The first section describes DDR bus design challenges that can be particularly problematic, even
intimidating, to designers. Subsequent sections describe how simulation and analysis speed up the design of a
functioning DDR system to reduce PCB spins and shorten the time to release and market.
DESIGNING PCBS FOR DDR BUSSES

The DDR bus is one of the most challenging commercial high-speed busses to design, and it can be particularly
intimidating to designers new to it. Fundamentally, though, the bus is quite simple, with just a few significant
concepts to keep in mind. Understanding its building blocks goes a long way toward meeting its important design
requirements.
The bus consists of a single controller on one end and a set of one or more DRAM chips on the other
end. Generally speaking, the greater the memory capacity requirement of the system, the greater the number of
DRAMs needed. The two major operations conducted by the controller are writing data to the DRAMs and reading
data from the DRAMs.
HOW DDR WORKS

Read/write operations transpire using two sets of signals: the data bus and the address/command bus, illustrated in
Figure 1.
Figure 1. DDR uses two sets of signals, like most memory.
The bidirectional data bus is broken up into lanes. Each lane consists of a unique DQS (strobe) signal, and the
associated data signals. Lanes commonly consist of eight data bits. However, DRAMs with just four data bit signals
also exist, so some lanes might consist of just four data bits.
Regardless of the number of bits in a lane, each data bit is latched in during the transition of the DQS. The data
signals are all single-ended. The DQS signals for DDR3 and DDR4 are always differential. For DDR2, the DQS signal
may either be single-ended or differential.
During write transactions, the controller launches the data signal approximately halfway between two DQS transitions
(Figure 2). This way, during the actual DQS transition, the data signal should be stable. Subsequently, the DRAM
latches in the data during a DQS transition. This is the first crucial element of the design: The input setup and hold
time requirements at the DRAM need to be met. Care needs to be taken during layout that the propagation delays
for the DQS and data signals of a given lane are similar to each other. The propagation delays across lanes are a bit
more flexible since the strobe is only used for the data bits within the lane under consideration.
Figure 2. For writes, the controller launches the signal approximately halfway between two DQS transitions.
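The effect of DQ-to-DQS skew on the margins at the DRAM can be illustrated with a short calculation. The sketch below does not come from any vendor tool. It assumes an idealized write in which the controller launches DQ exactly half a bit period away from the DQS edge, and it ignores jitter, slew-rate derating, and other SI effects. The function name and the setup/hold values are hypothetical; real values come from the DRAM datasheet.

```python
def write_margins_ps(bit_period_ps, dq_delay_ps, dqs_delay_ps,
                     t_setup_ps, t_hold_ps):
    """Return (setup_margin, hold_margin) in ps at the DRAM for one DQ bit.

    Assumption: at the controller, DQ transitions exactly half a bit
    period before and after the DQS edge (ideal center alignment).
    """
    # Positive skew means the data arrives late relative to its strobe.
    skew = dq_delay_ps - dqs_delay_ps
    # Late data shortens the stable time before the strobe edge (setup)
    # and lengthens the stable time after it (hold), and vice versa.
    setup_available = bit_period_ps / 2 - skew
    hold_available = bit_period_ps / 2 + skew
    return setup_available - t_setup_ps, hold_available - t_hold_ps


# Illustrative numbers only: 625 ps bit period (1600 MT/s), DQ routed
# 30 ps shorter than DQS, and placeholder 100 ps setup/hold requirements.
print(write_margins_ps(625, 1000, 1030, 100, 100))  # (242.5, 182.5)
```

With matched delays the two margins are equal; every picosecond of mismatch within the lane moves one margin up and the other down by the same amount.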
During read transactions, the DRAM launches the data signals approximately in line with the DQS (Figure 3). It is
then the controller’s job to delay the data and/or the strobe appropriately in order to latch in the data by using the
DQS.
Figure 3. During read transactions, unlike writes, the data signals approximately line up with the DQS.
In a single-rank system, each lane accesses only one DRAM. In a multi-rank system, a given lane (the data bits and the
strobe) can be connected to multiple DRAMs. During a transaction, only one of the DRAMs connected to a lane will
be accessed. The active DRAM is selected by asserting its Chip Select signal; the other DRAMs
sharing that lane will have their Chip Select signals deasserted. In general, the number of “ranks” in a system is
equal to the number of Chip Selects being used in the system, which will equal the number of DRAMs sharing a
given lane.
The address/command bus consists of several address bits (the exact number depending on the size of the DRAMs
that need to be accessed), bits to specify the command being issued to the DRAM, and finally the clock. The bus is
unidirectional; commands are sent only from the controller to the DRAMs.
The address and command signals are single-ended signals. The clock, however, is differential, and is measured at
the point when the + and – sides of the clock cross each other.
To specify a command, the appropriate command and address are launched by the controller approximately at
the falling edge of the clock (Figure 4). The signals stay stable through the rising edge of the clock, and then
transition to the new command at the subsequent falling edge. This way, the command is stable during the
rising edge of the clock.
Figure 4. For commands, the command and address are launched by the controller at approximately the time the clock signal is
going low.
The next important design consideration is that the setup and hold time requirements for the address/command
signals must be met at the DRAM. Because the controller launches the signals so that the address/command bits
are stable during the clock’s rising edge, care must be taken that the address/command signals are delayed by
the same amount as the clock. Since the address/command bus is shared across different DRAM chips,
the delay (including all SI effects) from the controller to each DRAM must be the same for all the address bits as
well as for the clock. So, even though the delay across DRAMs might vary, the delay for the address and the clock
must line up at any given DRAM.
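The per-DRAM nature of this requirement can be shown with a small check. This is a minimal sketch, assuming the delays from the controller to each DRAM have already been extracted (for example, from a simulation report). The data layout, the reference designators, and the function name are hypothetical.

```python
def worst_addr_clk_skew_ps(clk_delay_ps, addr_delay_ps):
    """For each DRAM, find the address/command signal whose arrival time
    differs most from the clock arrival time at that same DRAM.

    clk_delay_ps:  {dram: clock delay in ps}
    addr_delay_ps: {dram: {signal: delay in ps}}
    Returns {dram: (signal, skew_ps)}; positive skew = signal later than CLK.
    """
    worst = {}
    for dram, clk in clk_delay_ps.items():
        skews = {sig: d - clk for sig, d in addr_delay_ps[dram].items()}
        sig = max(skews, key=lambda s: abs(skews[s]))
        worst[dram] = (sig, skews[sig])
    return worst


# Hypothetical delays: the clock reaches U2 150 ps later than U1, which is
# acceptable as long as the address bits track the clock at each DRAM.
clk = {"U1": 500, "U2": 650}
addr = {
    "U1": {"A0": 505, "A1": 490},
    "U2": {"A0": 652, "A1": 675},
}
print(worst_addr_clk_skew_ps(clk, addr))
```

In this made-up data set, the large difference between U1 and U2 is not the problem; the 25 ps mismatch between A1 and the clock at U2 is what would deserve attention.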
Finally, there is a requirement at the DRAM between the DQS and CLK. This requirement ties together the data and
address busses. The DQS and CLK need to approximately line up at each DRAM. To ensure this, DDR2 required that
the delays from the controller to every DRAM for each of the DQS signals be equal to one another, and overall be
equal to the delay from the controller to the DRAM for the clock.
FLY-BY ROUTING
In DDR3, a new concept called fly-by routing was introduced. Fly-by routing permits the address and clock to
start from the controller and touch each DRAM in sequence. This, however, implies that the clock reaches each of
the DRAMs at a different time. If the DQS signals are all routed to similar lengths, then there is no guarantee that
the DQS/CLK requirement will be met.
In this case, the controller must be able to internally delay the DQS signals enough to permit each DQS signal to
reach its DRAM at approximately the same time as the CLK. Note that for DDR2, and for controllers that don’t
support the fly-by topology, the clock and DQS delays need to match.
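The internal delay a fly-by-capable controller has to add can be estimated from the flight times. The sketch below is a simplification under stated assumptions: it treats the alignment as a plain difference of arrival times, it does not model how a real controller trains or quantizes its delays, and it simply rejects the case where DQS would arrive after the clock. All names and numbers are hypothetical.

```python
def dqs_extra_delay_ps(clk_arrival_ps, dqs_flight_ps):
    """Extra DQS delay per DRAM so that DQS arrives together with CLK.

    clk_arrival_ps: {dram: time at which the fly-by clock reaches the DRAM}
    dqs_flight_ps:  {dram: point-to-point DQS flight time to that DRAM}
    """
    delays = {}
    for dram, clk_t in clk_arrival_ps.items():
        extra = clk_t - dqs_flight_ps[dram]
        if extra < 0:
            # Simplifying assumption: the controller can only add delay.
            raise ValueError(f"DQS reaches {dram} after CLK by {-extra} ps")
        delays[dram] = extra
    return delays


# Hypothetical fly-by chain: the clock reaches each DRAM 120 ps after the
# previous one, while every DQS is routed to the same 350 ps flight time.
clk_arrival = {"U1": 400, "U2": 520, "U3": 640}
dqs_flight = {"U1": 350, "U2": 350, "U3": 350}
print(dqs_extra_delay_ps(clk_arrival, dqs_flight))
# {'U1': 50, 'U2': 170, 'U3': 290}
```

The growing delay values along the chain show why equal-length DQS routing alone cannot satisfy the DQS/CLK requirement in a fly-by topology.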
PREPARING TO SIMULATE DDR MEMORY BUS INTERFACES

Next we’ll look at how to speed up the process of designing a functioning DDR system. Given the critical signal
integrity challenges described earlier, it is wise to simulate the design before fabricating the board. Doing so greatly
increases the odds of a functional board on the first pass and reduces the number of spins required for the design.
That said, the time required to set up, run, and analyze the simulation also needs to be fast enough to make the
investment in simulation worthwhile.
To do that, design engineers require tools that permit quick simulation setup, run simulations in a reasonable
amount of time, and provide accurate data that are easy to understand and parse, so that attention can go to the
critical data.
To shorten setup time, it’s beneficial to be familiar with the required data and have them ready by the time of the
simulation. There are really only a handful of items needed for a successful run. Most of these items are already
available to the designer, such as a board file, or are available through vendors. Think of the required items as a
checklist.
■ Driver & receiver models
■ IC pass/fail criteria
■ Board stackup
The first items required are the driver and receiver models for the memory controller and DRAMs. The controller’s
IBIS model can usually be accessed by contacting the controller vendor’s AE or directly from the manufacturer’s
website. Similarly, the DRAM IBIS models can be accessed from DRAM vendors such as Micron, Samsung, or Hynix.
The models should contain information for the different driver strength options, as well as for the different
on-die termination (ODT) options available to the respective chips.
Some vendors opt to provide SPICE models, rather than IBIS models. While SPICE simulations can be marginally
more accurate, they are often orders of magnitude slower to execute. A 2015 DesignCon presentation¹ highlights a
simulation setup that took 221 hours to run with the SPICE driver but only three hours to run with an IBIS driver,
with only a marginal hit to accuracy. So, when time is a concern, use an IBIS model to speed up the process. That
said, there have been cases of poorly created IBIS models, so it might be beneficial to request correlation data from
the vendor.
The next requirement is the pass/fail criteria for the different ICs. While simulation tools can generate waveforms,
only the chip vendor can provide the measurement criteria needed to categorize the resulting waveforms as being
acceptable or not. An example is the eye opening width required at the controller during read transactions on the
DQ signals. These width values are defined by JEDEC® for the DRAM memories and are readily available from the
DRAM datasheets or from the JEDEC website (jedec.org). Controller requirements can be obtained from the
controller datasheet or by talking with the vendor’s applications engineer.
Keep in mind these requirements are not simulation-specific but are needed for characterization of your bus once
the board comes back. After all, if there is an eye diagram on the oscilloscope, as shown in Figure 5, and no eye
mask to go with it, then there’s no way of knowing how much margin the design has in volume production. So, in
discussion with the vendor’s AE, it might be better to focus on system SI validation rather than on simulation
validation.
Figure 5. An eye diagram plots a large number of signal transitions on top of an eye mask (center rectangle) to reveal the
operating margins. In this example, the operating margins are very good.
Common requirements to request from the controller vendor are:
– The input timing relation requirement between DQ/DQS during read transactions.
– The output DQ/DQS variations during write transactions.
– The output DQS/CLK variations during write transactions.
– The output address/CLK variation during address/command transactions.
Note: The specification documents contain not just signal integrity-related information, but a whole host of data
that might not be pertinent to validating the robustness of the DDR channel. While going through the specs, make
sure to focus on the electrical section (Figure 6).
Figure 6. JEDEC specs contain all the information needed for simulation.
Next, the board stackup information needs to be accurate (Figure 7). These data are obtainable from the PCB
vendor. The information here includes the layer assignments, trace widths (for given impedances) and dielectric
properties. A tentative stackup should be available from the PCB vendor even before fabricating the board and can
be used in the exploratory stage of design, before layout begins.
Figure 7. Stackups define all the layers in the board, including trace widths and dielectric properties.
Moving on, it’s important to identify parts of the system where third-party models are required. Connectors, for
example, are best modeled by connector vendors. Connectors might be modeled as simple RLC networks or as a
more-detailed S-parameter model. It’s best to contact the vendor for these models. Similarly, if standard DIMMs are
used in the system, the board models for the appropriate DIMM need to be used. The board models for the DIMMs
can be obtained from JEDEC too.
Finally, regions of the board that might need special analysis need to be identified. There might be structures in the
designs best modeled using 3D solvers. Most often, these are vias without good return-path stitching. The higher
the data rate, the more likely a via needs to be modeled as a 3D structure. The best tools can do all of this
automatically, without the time-consuming, error-prone chores of manual identification, extraction, and modeling.
No more hybrid solvers or slow cut-and-stitch technology!
In general, simulation needs to be used as a tool to speed up the process of designing a functioning system.
Simulation should not in itself become a time hindrance for the project at hand. To ensure this, adequately
preparing for simulation runs will go a long way toward reducing the time required for simulating. This requires a
bit of time analyzing the board and sending a few emails to the DRAM, controller, and connector vendors to get
the models and timing requirement information but, with this information in hand, you should be well on your way
to validating your design.
ANALYZING THE DDR SIMULATION RESULTS

Even if no failures are detected, DDR simulation results might highlight a weakness in the design. The first step after
executing a set of DDR simulations is to review the requirements of the bus. This comes down to understanding
the receiver requirements of the controller (during read transactions) and of the DRAMs (during write and
command/address transactions). The information for the controller is in the datasheet of the controller. JEDEC and
the DRAM vendor’s website are good resources for the requirements of the DRAM. The reason this step is critical is
that, while looking at simulations, the user can understand the subtleties of the results only if the fundamental
requirements are known.
One way to ensure the results can be well-understood is to use a simulator that provides easy-to-understand
results, thus enabling you to focus on what is important. The DDR bus is a very wide bus, with each signal
containing many bits of transition information and each transition having a tremendous number of measurements.
These data can be overwhelming. Use the simulator’s tools to filter the information and prioritize
what needs further analysis. Usually, the worst-behaving signals are a good place to start.
Worst-case nets can be defined as nets showing the lowest margins on at least one measurement parameter. For
example, a signal that has the lowest setup time of all the nets can be considered a worst-case net. It is usually a
good idea to analyze the worst-behaving nets regardless of whether they are failing. Even if there is no failure in
the simulation, the results might be marginal enough to highlight a weakness in the design. This might manifest
itself as a failure, often intermittent, in the real system, which can subsequently take significant effort to debug. This
is especially true when one signal has an abnormally low margin for any given measurement. For example, if DQ0
has a 5 ps setup time margin, while all the other signals have at least, say, 40 ps of margin, it might be worth
taking a closer look at DQ0 (and its corresponding strobe).
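The filtering described above can be sketched in a few lines. This is one possible approach, not a feature of any particular simulator. It assumes the margins have been exported per net and per measurement, and it flags a net as an outlier when its worst margin is below half of the median worst margin. That 0.5 ratio, the data layout, and the function name are arbitrary assumptions, and the ratio test is only meaningful when the typical margin is positive.

```python
from statistics import median


def rank_worst_nets(margins_ps, outlier_ratio=0.5):
    """Sort nets by their lowest margin and flag abnormally low ones.

    margins_ps: {net: {measurement: margin in ps}}
    Returns a list of (net, measurement, margin, is_outlier), worst first.
    """
    rows = []
    for net, params in margins_ps.items():
        # The worst measurement for this net is the one with least margin.
        name = min(params, key=params.get)
        rows.append((net, name, params[name]))
    rows.sort(key=lambda row: row[2])
    typical = median(row[2] for row in rows)
    return [(net, name, m, m < outlier_ratio * typical)
            for net, name, m in rows]


# Hypothetical margins resembling the DQ0 example in the text.
margins = {
    "DQ0": {"setup": 5, "hold": 80},
    "DQ1": {"setup": 42, "hold": 45},
    "DQ2": {"setup": 40, "hold": 50},
    "DQ3": {"setup": 47, "hold": 44},
}
for row in rank_worst_nets(margins):
    print(row)
```

None of these nets fails, yet DQ0 is the only one flagged, which is exactly the kind of passing-but-suspicious result that merits a closer look.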
Bad signal behavior can have different root causes. Often, the setup or hold time margin is very low. If one (say
setup) is low while the other (in this case, hold time) is high, then the strobe (or clock) is not well-aligned with the
DQ (or address). In such cases, the feature-set of the controller can remedy the situation if it permits calibration of
each individual bit. If it does, then the simulation can be redone (automatically, if the simulation tool supports this)
with the appropriate delays for the signal bits. If the controller does not support calibration of each bit, then some
of the signal bits might need a larger or smaller electrical delay while routing.
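When one margin is low and the other is high, the size of the misalignment follows from simple arithmetic. The sketch below assumes that delaying the strobe relative to a data bit by some amount adds that amount to the setup margin and removes it from the hold margin, which is an idealization that ignores waveform shape. The function name is hypothetical.

```python
def centering_shift_ps(setup_margin_ps, hold_margin_ps):
    """Return (shift, balanced_margin) in ps.

    shift > 0: delay the strobe relative to the data bit by this amount
               (equivalently, make the data bit arrive earlier).
    shift < 0: delay the data bit relative to the strobe by |shift|.
    """
    # Solve setup + shift == hold - shift for shift.
    shift = (hold_margin_ps - setup_margin_ps) / 2
    return shift, setup_margin_ps + shift


# Illustrative values: 5 ps of setup margin and 85 ps of hold margin.
print(centering_shift_ps(5, 85))  # (40.0, 45.0)
```

The same number can serve either as a per-bit calibration target, if the controller supports it, or as the electrical delay to add or remove in routing.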
Another common cause of bad signal behavior is excessive ringing. Ringing can cause violations of multiple
parameters, including excessive maximum overshoot, excessive overshoot area, or setup/hold violations. Some
violations, such as overshoot, can damage the chips. Although this damage might not occur on the first bring-up, it
might happen over a longer duration of time. Thus, knowing about such violations early on can save significant
costs downstream.
Ringing is often caused by improper termination. This can usually be remedied by trying different ODT settings at
the receiver until the ringing comes down to an acceptable level. In Figure 8, the signal is shooting beyond the Vdd threshold
(1.5V) for some duration of time but it is not overshooting enough to cause a peak amplitude violation. (It would
have to go past 1.9V to do so.) However, the voltage-time area, shown in the red area of Figure 8, might be enough
to violate the “maximum overshoot area” requirement in the JEDEC spec.
Figure 8. Overshoot area.
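The distinction between a peak violation and an area violation can be made concrete with a small calculation on sampled waveform data. This is a rough sketch: it clips the waveform at Vdd and integrates with the trapezoidal rule, which slightly misestimates the area at the crossing points when the sampling is coarse. The sample points and the 0.05 V-ns area limit are placeholders chosen for illustration; the actual limits must be taken from the JEDEC spec for the device and speed grade in use.

```python
def overshoot_metrics(times_ns, volts, vdd):
    """Return (peak_overshoot_v, area_v_ns) of the waveform above vdd."""
    # Keep only the part of each sample that lies above Vdd.
    excess = [max(v - vdd, 0.0) for v in volts]
    area = 0.0
    for i in range(1, len(times_ns)):
        dt = times_ns[i] - times_ns[i - 1]
        area += 0.5 * (excess[i] + excess[i - 1]) * dt  # trapezoid
    return max(excess), area


# Hypothetical samples of a 1.5 V signal that rings up to 1.8 V.
t = [0.0, 0.1, 0.2, 0.3, 0.4]
v = [1.5, 1.7, 1.8, 1.6, 1.5]
peak, area = overshoot_metrics(t, v, vdd=1.5)

PEAK_LIMIT_V = 1.9      # absolute peak limit used in the Figure 8 example
AREA_LIMIT_V_NS = 0.05  # placeholder only, not a JEDEC value

print("peak violation:", 1.5 + peak > PEAK_LIMIT_V)
print("area violation:", area > AREA_LIMIT_V_NS)
```

With these made-up numbers the peak stays below 1.9 V while the area exceeds the placeholder limit, mirroring the situation described for Figure 8.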
A third common issue is excessive noise on the signals. This can happen when too much crosstalk energy is
injected into a signal from a nearby switching signal. This can be mitigated by increasing the spacing between
the signals.
Another common issue is the presence of stubs in the channel (Figure 9). Reflections caused by stubs
often look like “steps” in the waveform (Figure 10). If the stubs, such as those created by test
points, can be shortened, signal integrity will improve. Often, however, the step seen in the waveform is there
because the measurement was made on read transactions at the pin of the controller. This can quickly be remedied
by shifting the point of measurement at the controller to the die instead of the pin. This measurement issue is
often caused when the controller’s IBIS file does not explicitly call out a measurement at the die.
Figure 9. Stub in system.
Figure 10. Reflection effect from stub.
After identifying the root cause(s) of poor signal quality, the next step is to fix them on the board and execute a
subsequent simulation. On these “post-fix” simulations, it is often better to first simulate only the section changed
so as to save time. Only after the local fixes have proven adequate should the entire board/system be re-simulated.
Finally, once the board is manufactured, the simulated waveforms can be compared with waveforms obtained by
oscilloscope measurements. With oscilloscopes, it is not practical to measure every signal on the bus. Instead,
compare a few signals measured on the oscilloscope against the corresponding simulation results to increase
confidence that the simulation setup is correct.
CONCLUSION
With a good setup and proper analysis, simulation can help decrease the overall time required to design a high-
speed DDR subsystem. Understanding the requirements, and then simulating correctly, increases the odds the
board will be functional from the start. This results in fewer board spins and shorter time to release.
¹ Nitin Bhagwath, “DDR4 Board Design and SI Challenges,” DesignCon, January 2015.
For the latest product information, call us or visit: www.mentor.com

©2017 Mentor Graphics Corporation, all rights reserved. This document contains information that is proprietary to Mentor Graphics Corporation and may
be duplicated in whole or in part by the original recipient for internal business purposes only, provided that this entire notice appears in all copies.
In accepting this document, the recipient agrees to make every reasonable effort to prevent unauthorized use of this information. All trademarks
mentioned in this document are the trademarks of their respective owners.
MGC 01-18 TECH16890-w
