At-speed testing of memory interfaces in 45nm and below technologies has become critical for defect screening as well as for ascertaining the correct operational frequency of a device (fmax). In this paper, we describe our experience of using TetraMAX's RAM sequential ATPG on a production 45nm SoC to generate high-quality memory interface tests. We discuss the advantages and challenges of using the native tool features, suggest DFT hooks that assist the tool in ATPG, and propose a flow that uses the native sequential pattern generation engine to achieve high coverage of the targeted faults with low pattern counts. Results on our production SoC show that the achieved coverage is 3.5 times what is achievable conventionally. Further, the pattern count is reduced by nearly 75%, with an order of magnitude runtime improvement. We also share timing-based projections of the impact of these tests on the fmax of a device (silicon data collection is underway).
Table of Contents
1 Introduction and Motivation
2 Details of Production SoC
3 Conventional RAM Sequential ATPG
4 Challenges with Conventional RAM Sequential ATPG
5 Techniques to improve the performance of RAM Sequential ATPG
5.1 Memory interface fault list segregation and pruning
Table 2: Effect of fault pruning
5.2 Exploit the BIST resources for ATPG
5.2.1 Capture off BIST (COB)
5.2.2 Launch off BIST (LOB)
5.2.3 DFT Careabouts
6 Results
7 Conclusions
8 References
Table of Figures
Figure 1. Differences in paths from/to a typical memory in various modes of operation
Figure 2. Setup slack histogram of BIST versus functional paths to different memories of Chip A
Figure 3. Memory setup slack difference distribution view for the data in Figure 2
Figure 4. A memory tested using conventional RAM sequential ATPG
Figure 5. Sequence of Events in Capture off BIST (COB) based RAM sequential ATPG
Figure 6. Constraints, switches and sequential capture procedure used in Capture off BIST (COB) based RAM sequential ATPG
Figure 7. Sequence of Events in Launch off BIST (LOB) based RAM sequential ATPG
Figure 8. Constraints, switches and sequential capture procedure used in Launch off BIST (LOB) based RAM sequential ATPG
Figure 9. Pipelining Bistmode control of the memory
Figure 10. Projected fmax comparisons between LOC TFT and RAM sequential ATPG patterns
Table of Tables
Table 1. Memory details of CHIP A
Table 2. Performance of Conventional RAM sequential ATPG on CHIP A
Table 3. CHIP A: Comparison of proposed and conventional RAM sequential ATPG
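Figure 1. Differences in paths from/to a typical memory in various modes of operation (legend: functional and bist paths; labels: D, Q, TD, TADR, TM, D_mem, ADR_mem, Q_mem, Memory Array)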
The different paths in the functional and test modes of operation in Figure 1 have different timing characteristics. To quantify this problem, Figure 2 shows data from our 45nm production SoC (labeled Chip A henceforth). For different memories in the design, Figure 2 shows the worst-case setup slack for timing paths to the memory via the BIST and functional interfaces. The data shows that the BIST interface timing paths have relaxed timing compared to the functional timing paths in 37 out of the 39 memories shown. Figure 3 summarizes this data as a setup slack difference distribution and shows that most of the memories have a setup slack difference higher than 1.5 ns (up to 2.3 ns).
Figure 2. Setup slack histogram of BIST versus functional paths to different memories of Chip A (series: BIST, Functional; horizontal axis: Memory id)
Figure 3. Memory setup slack difference distribution view for the data in Figure 2 (horizontal axis: setup slack difference in ps, from -500 to 3000; vertical axis: number of memories)
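The slack difference distribution of Figure 3 can be derived from per-memory worst-case slacks along the following lines. This is only a sketch: the function name, the 500 ps bin width and the sample slack values are illustrative assumptions and are not Chip A data.

```python
from collections import Counter

def slack_difference_histogram(bist_slack_ps, func_slack_ps, bin_ps=500):
    """Count memories per bucket of (BIST slack - functional slack), in ps."""
    hist = Counter()
    for mem_id, bist in bist_slack_ps.items():
        diff = bist - func_slack_ps[mem_id]
        # Floor division places negative differences in the bucket below zero.
        lower_edge = (diff // bin_ps) * bin_ps
        hist[lower_edge] += 1
    return dict(sorted(hist.items()))

# Hypothetical worst-case setup slacks in ps (not taken from the paper).
bist = {"mem0": 2600, "mem1": 2100, "mem2": 900}
func = {"mem0": 700, "mem1": 300, "mem2": 1100}
print(slack_difference_histogram(bist, func))  # {-500: 1, 1500: 2}
```

A positive difference means the BIST path is more relaxed than the functional path to the same memory, which is the situation reported for most memories of Chip A.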
The above data clearly indicates that covering memory interface paths using only the test paths will lead to test escapes, or a DPPM problem, since the ignored functional paths are significantly more speed limiting. ATPG tools have begun to take cognizance of this issue, and Synopsys TetraMAX's RAM sequential ATPG is a critical feature that addresses this gap: it allows the DFT engineer to generate tests that target memory interface faults using complete or partial functional paths in a scan environment, using its native sequential ATPG engine. As mentioned in the abstract, this paper describes our experience of using TetraMAX's RAM sequential ATPG on a production 45nm SoC to generate high-quality memory interface tests. We propose a flow that uses the native sequential pattern generation engine to achieve high coverage of the targeted faults with low pattern counts. We also suggest DFT hooks that ease the process of RAM sequential ATPG. Results on our production SoC show that the achieved coverage is 3.5 times what is achievable conventionally. Further, the pattern count is reduced by nearly 75%, with an order of magnitude runtime improvement. We also share timing-based projections of the impact of these tests on the fmax of a device.
As seen later from the results presented in the paper, even for a medium-sized SoC with this many memories, the ATPG tool finds it challenging to perform RAM Sequential ATPG in an effective manner.
Figure 4. A memory tested using conventional RAM sequential ATPG (Bistmode = 0, ATPGmode = 0; the waveform shows CLK, Memory Enable (EN) and Write enable/Read disable (WE) over the shift phase and a capture phase of four launch cycles (L: memory write, memory write, memory read, memory read) followed by one capture cycle (C: capture D2))
In order to use the functional paths to/from the memory, ATPG is run with Bistmode = 0, ATPGmode = 0 and Memory enable = 1. In the capture phase of each pattern, the tool then follows the sequences explained below to achieve coverage on the memory input and output interface pins.
Memory input interface fault testing: In order to sensitize and observe the input interface faults of the memory, the following sequence is typically applied at speed during the capture phase of ATPG.
Step 1: Write D1 into an address, say A1. (To initialize the data and address lines of the targeted memory)
Step 2: Write D2 into another address A2. (To create a transition on the data and address lines)
Step 3: Read D2 from address A2. (To observe the transition on the input data and address lines by reading the data on the memory output)
Step 4: Capture. (To capture the response in the flops interfacing the memory output)
Memory output interface fault testing: In order to detect the output interface faults of the memory, the following steps are typically followed.
Step 1: Write D1 into an address, say A1. (To initialize memory location A1)
Step 2: Write D2 into another address A2. (To initialize memory location A2)
Step 3: Read D1 from address A1. (To initialize the output pins of the targeted memory)
Step 4: Read D2 from address A2. (To create a transition on the output pins of the targeted memory)
Step 5: Capture. (To capture the response in the flops interfacing the memory output)
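The two sequences above can be illustrated with a toy behavioral model of a synchronous RAM. This is a sketch under stated assumptions, not the tool's internal memory model: the class `Ram`, its `clock` method, the active-high EN and WE polarities and the sample data values are all hypothetical.

```python
class Ram:
    """Toy synchronous RAM: one access per clock, Q holds the last read value."""

    def __init__(self):
        self.cells = {}
        self.q = None  # output is unknown until the first read

    def clock(self, en, we, adr, d=None):
        if en and we:
            self.cells[adr] = d          # write cycle
        elif en:
            self.q = self.cells.get(adr)  # read cycle
        return self.q


def input_interface_test(ram, a1, d1, a2, d2):
    ram.clock(1, 1, a1, d1)       # Step 1: initialize data and address lines
    ram.clock(1, 1, a2, d2)       # Step 2: transition on data and address lines
    return ram.clock(1, 0, a2)    # Step 3: read D2; Step 4 captures this value


def output_interface_test(ram, a1, d1, a2, d2):
    ram.clock(1, 1, a1, d1)           # Step 1: initialize location A1
    ram.clock(1, 1, a2, d2)           # Step 2: initialize location A2
    q_init = ram.clock(1, 0, a1)      # Step 3: initialize the Q pins with D1
    q_final = ram.clock(1, 0, a2)     # Step 4: Q pins switch from D1 to D2
    return q_init, q_final            # Step 5 captures q_final


captured = input_interface_test(Ram(), 0x1, 0b0101, 0x2, 0b1010)
q_init, q_final = output_interface_test(Ram(), 0x1, 0b0101, 0x2, 0b1010)
print(bin(captured), bin(q_init ^ q_final))  # 0b1010 0b1111
```

Choosing complementary D1 and D2, as in this example, makes every Q bit toggle between the two reads, so each output pin sees a transition in the cycle just before capture.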
Table 2. Performance of Conventional RAM sequential ATPG on CHIP A
Coverage    #Patterns    Run-time
25.3%       3028         12087 s
It is intuitive to surmise that the biggest challenge for the ATPG tool is the high sequential controllability and observability cost of the memory interface pins when a sequence of write and read operations is performed using functional paths only (even though the flops can be initialized through scan). Each of the memory input signals is typically driven by complex combinational decoding logic in the functional cloud shown in Figure 4. As a result, the ATPG tool has the challenging task of computing the values required in multiple flops across several timeframes to cause the transitions on the memory address, data and output lines described in Section 3. For example, from the waveform in Figure 4, it can be seen that to gain coverage on the Q pins of the memory, complementary data has to be written to two different locations, for which WE (write enable) needs to be constant for at least 2 cycles and EN (memory enable) has to be static for at least 4 cycles. When multiple flops of the same clock domain have to be controlled to meet these requirements on the memory pins in specific cycles, the limitations of sequential ATPG come into play, and pattern coverage and runtime are impacted.
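The per-cycle requirement on the control pins can be stated as a simple check over the launch cycles. The sketch below assumes active-high EN and WE and a write, write, read, read launch order; the helper names are hypothetical and only illustrate the constraint that sequential ATPG has to justify through the functional logic.

```python
def held(signal, start, cycles, value):
    """True if signal keeps `value` for `cycles` consecutive cycles from `start`."""
    window = signal[start:start + cycles]
    return len(window) == cycles and all(v == value for v in window)


def output_test_controls_ok(en, we):
    """Check the control requirements of the output interface test.

    WE must stay at 1 for the two write cycles, and EN must stay at 1
    for all four launch cycles (assumed active-high polarities).
    """
    return held(we, 0, 2, 1) and held(en, 0, 4, 1)


# Launch cycles: write, write, read, read
print(output_test_controls_ok(en=[1, 1, 1, 1], we=[1, 1, 0, 0]))  # True
print(output_test_controls_ok(en=[1, 1, 0, 1], we=[1, 1, 0, 0]))  # False
```

Every entry in these lists is a value the tool must produce at a memory pin in a specific cycle by setting up the scan-loaded state of the flops feeding the decoding logic, which is why the search grows quickly with the number of launch cycles.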
Table 1. Memory details of CHIP A
Columns: Technology; #Total top-level memories (RAMs); #Memory clock domains (clock groups); #Top-level critical memories (RAMs)
Entries: 18; 16
The fault list used for ATPG is Memory_ip_int.flts as described in Section 5.1. An example of the cell constraints and pin constraints required to perform COB is provided in Figure 6.
Figure 5. Sequence of Events in Capture off BIST (COB) based RAM sequential ATPG (after the shift phase, the capture phase has three launch cycles (L) with Bistmode = 0: write data D1 @ A1, write data D2 @ A2 and read data D2 @ A2, all with the functional interface, followed by one capture cycle (C) with Bistmode = X)
// Constraints
set_drc clock seq_capture
set_delay nocommon
add_cell_constraints 0 Bistmode
add_pi_constraints 0 ATPGmode
// Fault model used
set_fault model transition
set_atpg noclk_constraint_fault_pruning
set_atpg full_seq_atpg
set_atpg capture 4
-------------------------------------------------------------------
// Capture procedure for Capture off BIST (COB) in the SPF file
"sequential_capture" {
    W "_default_WFT_";
    F { ;"lcd_pclk"=P;"lcd_vsync"=0; }
    V { "_pi"=\j \r196 #; }
    V { "DIV10_125"=P; } // Launch: Write D1(A1)
    V { "DIV10_125"=P; } // Launch: Write D2(A2)
    V { "DIV10_125"=P; } // Launch: Read D2(A2)
    V { "DIV10_125"=P; } // Capture
    V { "_po"=\j \r196 #; }
}
}
Figure 6. Constraints, switches and sequential capture procedure used in Capture off BIST (COB) based RAM sequential ATPG
In Figure 6, a four-cycle sequential capture procedure associated with a design internal clock DIV10_125 (a pseudo-primary input) is defined in the SPF. In order to detect the input interface faults of the memory, the tool uses 3 launch cycles and 1 capture cycle in every pattern generated. The signal Bistmode is controllable using a looped-back test register on scan. It is therefore associated with a cell constraint of 0 so that the functional path is selected on the input side. The signal ATPGmode, in this case, is a pseudo-primary input that is constrained to 0 using add_pi_constraints.
Figure 7. Sequence of Events in Launch off BIST (LOB) based RAM sequential ATPG (after the shift phase, the capture phase has four launch cycles (L) with Bistmode = 1, followed by one capture cycle (C) with Bistmode = X)
The fault list used for ATPG is Memory_op_int.flts as described in Section 5.1. An example of the cell constraints and pin constraints required to perform LOB is provided in Figure 8. In this case, a five-cycle sequential capture procedure associated with the design internal clock DIV10_125 (a pseudo-primary input) is defined in the SPF. In order to detect the output interface faults of the memory, the tool uses 4 launch cycles and 1 capture cycle in every pattern generated. The signal Bistmode is now associated with a cell constraint of 1 so that the test path is selected on the input side. The signal ATPGmode is constrained to 0 using add_pi_constraints. Further, to ensure that coverage credit is given only for capture in functional flops, the BIST flops in the transitive fanout of the memory Q outputs are masked during capture.
// Constraints
set_drc clock seq_capture
set_delay nocommon
add_cell_constraints 1 Bistmode
add_pi_constraints 0 ATPGmode
// Fault model used
set_fault model transition
set_atpg noclk_constraint_fault_pruning
set_atpg full_seq_atpg
set_atpg capture 5
----------------------------------------------------------------
// Capture procedure for Launch off BIST (LOB) in the SPF file
"sequential_capture" {
    W "_default_WFT_";
    F { ;"lcd_pclk"=P;"lcd_vsync"=0; }
    V { "_pi"=\j \r196 #; }
    V { "DIV10_125"=P; } // Launch: Write D1(A1)
    V { "DIV10_125"=P; } // Launch: Write D2(A2)
    V { "DIV10_125"=P; } // Launch: Read D1(A1)
    V { "DIV10_125"=P; } // Launch: Read D2(A2)
    V { "DIV10_125"=P; } // Capture
    V { "_po"=\j \r196 #; }
}
}
Figure 8. Constraints, switches and sequential capture procedure used in Launch off BIST (LOB) based RAM sequential ATPG
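The differences between the two phases can be summarized in a small data model. This is a sketch for comparison only: the `Phase` class and its field names are hypothetical, and the entry for masking in COB is an assumption, since the text describes masking of BIST flops only for LOB. The fault list names, Bistmode constraints and launch operations are taken from Figures 6 and 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    fault_list: str
    bistmode: int                  # cell constraint applied to Bistmode
    launch_ops: tuple              # one memory operation per launch cycle
    mask_bist_capture_flops: bool  # mask BIST flops fed by Q during capture

    @property
    def input_path(self):
        # Bistmode selects which path drives the memory input pins.
        return "bist" if self.bistmode else "functional"

    @property
    def capture_cycles(self):
        # Launch cycles plus the final capture cycle.
        return len(self.launch_ops) + 1


COB = Phase("COB", "Memory_ip_int.flts", 0,
            ("write D1@A1", "write D2@A2", "read D2@A2"),
            mask_bist_capture_flops=False)  # assumption: not stated in the text
LOB = Phase("LOB", "Memory_op_int.flts", 1,
            ("write D1@A1", "write D2@A2", "read D1@A1", "read D2@A2"),
            mask_bist_capture_flops=True)

for phase in (COB, LOB):
    print(phase.name, phase.fault_list, phase.input_path, phase.capture_cycles)
```

The computed cycle counts (4 for COB, 5 for LOB) correspond to the values passed to set_atpg capture in Figures 6 and 8.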
Figure 9. Pipelining Bistmode control of the memory (after the shift phase, Bistmode takes the values 1, 0, 1 over three launch cycles (L) and X in the capture cycle (C); A1 = some address location, A2 = another address location)
ATPGmode control: During RAM sequential pattern generation, when the multi load option is disabled, the outputs (Q pins) of the memory are not initialized to a known state at the end of shift. In LOB, for example, the flops capturing the Q pins of the memory would be in an unknown state for the 4 launch cycles of the capture phase. This in turn leads to X propagation and hence to high pattern inflation. This can be taken care of during scan insertion by having the ATPGmode pin of each memory controlled by scan flops during capture, such that ATPGmode = 1 during shift and ATPGmode = 1/0 during capture.
6 Results
The techniques described in Section 5 (the fault segregation and filtering technique, LOB and COB based RAM sequential ATPG, and the associated DFT changes) were used on the 45nm production SoC CHIP A (taped out in 2011). The performance of Synopsys TetraMAX ATPG (tool version: 2011.09) using the conventional and proposed techniques is listed in Table 3.
Table 3. CHIP A: Comparison of proposed and conventional RAM sequential ATPG
Technique       Coverage    #Patterns    Run-time
Conventional    25.3%       3028         12087 s
Proposed        84%         736          1046 s
From Table 3, we can see that the test coverage has improved by 3.5X, the pattern count has reduced by 75%, and the pattern generation time has improved by 10X. Using static timing analysis (STA) data from PrimeTime (PT), we projected the fmax for 4 clock domains by looking at the worst-case setup slack of the timing paths exercised by conventional TFT patterns as well as RAM sequential ATPG patterns. The fmax comparisons are shown in Figure 10. The results show that RAM sequential patterns are likely to bin the parts at a lower fmax than conventional TFT patterns. Silicon data collection is currently underway to validate the STA data.
Figure 10. Projected fmax comparisons between LOC TFT and RAM sequential ATPG patterns (vertical axis: Frequency (MHz); series: LOC TFT, RAM sequential)
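A projection of this kind can be sketched from the worst-case setup slack of the paths a pattern set exercises. The example assumes the common STA relation that the shortest workable clock period equals the constrained period minus the worst setup slack; the exact projection method is not spelled out here, and the function name and numbers below are illustrative only.

```python
def projected_fmax_mhz(clock_period_ns, worst_setup_slack_ns):
    """Project fmax from the worst setup slack of the exercised paths.

    Assumes minimum period = clock period - worst setup slack.
    """
    min_period_ns = clock_period_ns - worst_setup_slack_ns
    if min_period_ns <= 0:
        raise ValueError("slack must be smaller than the clock period")
    return 1000.0 / min_period_ns


# Hypothetical 400 MHz domain (2.5 ns period), not data from Figure 10.
print(projected_fmax_mhz(2.5, 0.5))  # paths with 0.5 ns slack: 500.0
print(projected_fmax_mhz(2.5, 0.0))  # paths with no slack:     400.0
```

Patterns that exercise paths with less slack project a lower fmax, which is the sense in which tests on the tighter functional memory paths are expected to bin parts lower than tests that miss those paths.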
7 Conclusions
In this paper, we saw how Synopsys TetraMAX's sequential ATPG engine can be leveraged to perform effective at-speed testing of memory interface faults. We provide data on a production SoC to show the challenges of a naive approach to using TetraMAX for RAM sequential ATPG. We further show that the performance of TetraMAX can be significantly improved through a combination of fault list segregation and pruning and a two-phased ATPG approach that exploits the DFT resources in a design, along with the necessary DFT hooks. This would be a critical enabler for making RAM sequential ATPG patterns part of a production test program.
8 References
[1] Synopsys, TetraMAX User Guide (tool version: 2011.09)
[2] Devanathan, V.R.; Hales, A.; Kale, S.; Sonkar, D., "Towards effective and compression-friendly test of memory interface logic," Test Conference (ITC), 2010 IEEE International, pp. 1-10, 2-4 Nov. 2010
[3] Devanathan, V.R.; Vooka, S., "Techniques to improve memory interface test quality for complex SoCs," Test Conference (ITC), 2011 IEEE International, pp. 1-10, 20-22 Sept. 2011