
A Complementary Architecture for High-Speed True Random Number Generator


Xian Yang

Ray C.C. Cheung

Department of Electronic Engineering


City University of Hong Kong
Email: xian.yang@cityu.edu.hk

Department of Electronic Engineering


City University of Hong Kong
Email: r.cheung@cityu.edu.hk

Abstract: In this paper, we introduce a novel FPGA-based design for a
true random number generator (TRNG). It harvests the timing difference
caused by the nonuniformity of integrated circuits (ICs) and uses it to
generate randomness. Compared with previous related work, this design
uses a complementary scheme that doubles the output data rate. The
proposed complementary design improves entropy and achieves higher
throughput. The prototype design has been implemented and verified on a
Xilinx Virtex-6 ML605 evaluation board. The generated random number
stream passes the statistical NIST and DIEHARD test suites, showing
reliable performance. Meanwhile, it stably reaches a maximum data rate
of 50 Mbps.

Fig. 1: The schematic of a typical true random number generator (TRNG):
an entropy source (noise), a digitizer (ADC) and a post-processor that
outputs the random bits, e.g. 10110001.

I. INTRODUCTION

Random numbers play a key role as a basic element in various areas,
such as encryption designs [1], mathematical simulations [2], and even
the gaming machines inside casinos [3]. Their wide range of applications
has attracted active research and engineering work, as well as interest
from the entertainment industry.
A. Related Work
Compared with the pseudo random number generator (PRNG), an ideal TRNG
is, as its name suggests, a bit source that is theoretically
unpredictable. A PRNG is usually built from a deterministic algorithm,
while a TRNG is always sourced from a physical phenomenon. Since the
20th century, people have been studying the use of electronic devices to
generate random bits, and several classic methods were developed during
this period. Typically, we can categorize them into analogue [4],
digital [6] and mixed-signal [5] designs. Fig. 1 gives a basic view of a
TRNG schematic.
From the analogue design perspective, physical noise, for instance
thermal (Johnson) noise, the photoelectric effect or other quantum
phenomena [4], is a reliable source from which to capture randomness
and build a TRNG. One concern is that the analogue source should be
independent of the external environment, which may otherwise impact the
reliability of the randomness generation. In the pure digital world,
however, the selection of randomness sources for a TRNG is more limited.
In past years, researchers have presented several methods to achieve a
digital TRNG circuit. In general, the most popular candidates for the
randomness source are the jitter of digital clock resources, including
the free-running ring oscillator (RO), phase-locked loop (PLL) or
delay-locked loop (DLL) [5], metastability behavior, and the noise in
the circuit. Among these works, the most cited is the design proposed by
Sunar et al. [6]. They introduced a TRNG that takes advantage of the
jitter from an array of ROs, each constructed from an odd number of
inverters. Following this work, Schellekens et al. improved the TRNG
performance by employing post-processing in the design [7] and provided
a detailed analysis of the performance of different correctors with
RO-structured TRNGs. However, ROs often occupy more area and are less
power efficient.
978-1-4799-6245-7/14/$31.00 ©2014 IEEE
Furthermore, Danger et al. proposed a metastability-based TRNG
structure [8]. As an open-loop circuit, this scheme uses a delay chain
and captures the metastable behavior of the D-latches placed along it.
The design demonstrates a better data rate than closed-loop designs; at
20 Mbps, it is reported as the fastest TRNG built from digital elements.
In contrast to the open-loop architecture, a typical closed-loop TRNG
design with delay feedback is reported by Majzoobi et al. [9], who use a
delay tuning loop to drive the DFFs into a metastable state. The
disadvantage of this RNG is that the randomness entropy is relatively
low because metastability is difficult to generate.
Apart from architectural design, some works focus on the development of
individual entropy components. These designs aim at randomness sources
with high entropy, including the RS latch of Hata and Ichikawa [10], the
Fibonacci and Galois ring oscillators of Dichtl and Golic [11], and the
transition effect ring oscillator (TERO) by Varchola et al. [12]. These
TRNG elements usually take advantage of a logic conflict in an FPGA or
digital system. This kind of TRNG uses far fewer resources and is very
power efficient. The drawback is that the data rate is limited by the
element itself [10].

Fig. 2: The proposed complementary scheme of the TRNG source: (a) the
hardware architecture, with data path delays d(0) ... d(2n+1), clock
path delays d(0)' ... d(2n+1)' behind a coarse delay d(c), and D-latch
outputs Q0 ... Q2n+1; (b) XORed output diagram (Q_XOR).

B. Contributions of this work

In this paper, we focus on achieving better randomness entropy and on
speeding up the data rate of random data generation. With this
motivation, we have designed a novel structure to harvest the timing
nonuniformity that exists within a purely digital device.


Our contributions include:
- A dual-rail complementary architecture is proposed using the delay
  variation in the TRNG circuit.
- A double-data-rate generation scheme is introduced to improve the
  TRNG entropy and throughput.
- Parameter tuning guidelines are provided to improve the TRNG quality
  with respect to the DIEHARD and NIST test suites.

This paper is organized as follows. Section II describes the proposed
TRNG design architecture and analyzes the flexibility of the randomness
result. Section III presents the statistical test results of the random
outputs, which are investigated and compared with other existing works.
Finally, Section IV summarizes the conclusions of this work.

II. PROPOSED DESIGN AND IMPLEMENTATION

In a solid-state silicon circuit, nonuniformity always exists because of
variation during fabrication. It induces several timing phenomena,
including skew and jitter. In our work, we provide a circuit
architecture that maximizes the harvest of this variation. As shown in
Fig. 2, an inverter-based delay chain is defined. At each delay step, a
D-latch captures the timing difference between the Data Path and the
Clock Path, at nodes d(i) and d(i)'. Here d(i) stands for the data path
delay at step i, and d(i)' for the clock path delay at step i. A
simulated output of the architecture is illustrated in Fig. 2(b). The
details of the scheme are elaborated as follows.
A. Delay chain
Here, we focus on the inverter-constructed delay line. One of its most
interesting features is that each delay step creates a 180-degree phase
shift. This forms the original idea of the double-data-rate
architecture. We are able to synthesize each inverter from a LUT inside
the FPGA. Apart from the inversion function, each inverter inherently
contributes a timing delay. With these two important features, the
inverter-based delay chain becomes one of the best candidates for our
design. The delay chain design is the key factor determining the RNG
quality, since it is the source of the randomness. The detailed design
procedure is discussed in Section III on the FPGA implementation.
Targeting the feature of the harvest logic unit, the delay combination
is typically set so that the two path delays d(i) and d(i)' are strictly
ordered, i.e. one is always the larger, consistent with the assumption
made in Section II-A. Meanwhile, a coarse delay is inserted in the clock
delay path. It reduces the number of steps needed to trigger
metastability in the latches. The coarse delay unit is built from an
even number of inverters in this design.

Fig. 3: The proposed sampler design.

B. Sampler

For the double-data-rate design, the sampling stage is designed as
shown in Fig. 3. It illustrates the basic idea of how the
double-data-rate receiver logic works. A positive-edge-triggered
flip-flop combined with a negative-edge-triggered one serve as the
first-stage registers. This stage samples the exclusive-or (XOR) of the
outputs of the steps of the TRNG delay chain. The input of Qr comes from
the odd steps and is captured on the rising edge of the clock, while the
input of Qf is connected to the even outputs. After the sampling
registers, a parallel-to-serial converter transfers the two-bit random
number into a one-bit stream. The clock frequency used for the
parallel-to-serial conversion is CLK_IN x2, twice that of the
first-stage registers, which run at CLK_IN. The result is a random data
stream at doubled speed. At the same time, since the final one-bit data
stream includes two kinds of nonuniformity information, from both
rising and falling edges, the random entropy is improved as well.
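The data flow of Sections II-A and II-B can be sketched as a small behavioural model. This is only an illustration under stated assumptions, not the authors' circuit: the latch is modelled as a simple threshold (random outcome inside a metastability window, clean capture outside it), a seeded software PRNG stands in for the physical randomness, the 0.4 ns coarse delay is a made-up value, and the order in which Qr and Qf are serialised is assumed. The step count (64), the per-step difference (0.012 ns) and the 100 ps window are taken from the text.

```python
import random
from functools import reduce
from operator import xor


def latch_chain(n_steps, delta_ns, coarse_ns, window_ns, rng):
    """One CLK_IN cycle of a toy latch chain (assumed model).

    Step i sees an accumulated data/clock skew of i*delta_ns - coarse_ns.
    Inside +/- window_ns the latch is treated as metastable and resolves
    randomly; outside it resolves deterministically. Each inverter stage
    flips polarity, hence the (i % 2) term.
    """
    bits = []
    for i in range(n_steps):
        skew = i * delta_ns - coarse_ns
        if abs(skew) <= window_ns:
            settled = rng.getrandbits(1)       # metastable: random outcome
        else:
            settled = 1 if skew > 0 else 0     # clean capture
        bits.append(settled ^ (i % 2))         # 180-degree shift per stage
    return bits


def ddr_sample(bits):
    """Return (Qr, Qf): XOR of odd-step outputs and XOR of even-step outputs."""
    qr = reduce(xor, bits[1::2], 0)   # odd steps, rising-edge register
    qf = reduce(xor, bits[0::2], 0)   # even steps, falling-edge register
    return qr, qf


def parallel_to_serial(pairs):
    """Interleave (Qr, Qf) pairs into one stream at twice the CLK_IN rate."""
    stream = []
    for qr, qf in pairs:
        stream.extend((qr, qf))       # assumed order: Qr first, then Qf
    return stream


rng = random.Random(2014)             # PRNG stand-in for physical noise
pairs = [ddr_sample(latch_chain(64, 0.012, 0.4, 0.1, rng)) for _ in range(1000)]
stream = parallel_to_serial(pairs)
print(len(pairs), "CLK_IN cycles ->", len(stream), "output bits")
```

Each CLK_IN cycle contributes two bits, which is where the doubled data rate comes from; the model says nothing about the real entropy of the hardware.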

C. Post-processing
Post-processing techniques are commonly used in TRNG design because
some bias in the raw random bits is always present. Post-processors
balance the bias between 1 and 0. In this work, we use an XOR method to
reduce the probability bias [13]. It is built from a 4-bit LFSR followed
by an XOR of all of its bits, which forms the final output. The
schematic is shown in Fig. 4. The advantage of the LFSR-based corrector
is that it keeps the same data rate.

Fig. 4: 4-step LFSR based XOR corrector.

D. Place and Route
From the above, timing management is the critical factor in making a
good randomness generator. One of the most important elements of the
scheme is the delay chain design. The LUT-based inverter is the basis
for transferring the data and clock and feeding them to the D-latches.
However, timing differences can be produced by the routing as well. To
make the generator controllable and reproducible on another platform,
the place and route (P&R) is done in a known manner. The general rule is
to generate a certain delta delay between the data and clock lines,
which is described as a strict inequality between d(i) and d(i)'.
Meanwhile, the delay difference Δ = |d(i) - d(i)'| should be small
enough to give the D-latches the opportunity to go metastable. In the
Virtex-6 device, the metastability window is within 100 ps. To achieve
that, a set of P&R experiments for the delay unit was carried out. The
final P&R is illustrated in Fig. 5. As verified in static timing
analysis (STA), the two per-step path delays are 0.392 ns and 0.404 ns,
giving Δ = 0.012 ns, an acceptable value for the design purpose. During
the FPGA implementation, the P&R is constrained using the LOC attribute
in the source file.

Fig. 5: Place and route of the delay path (D-latch, clock path, data
path).

III. EVALUATION AND TEST ANALYSIS

Following the description in the above section, the proposed scheme has
been implemented on a Xilinx Virtex-6 ML605 evaluation board. The
statistical test suites selected for the evaluation are DIEHARD and the
NIST test suite.

Fig. 6: TRNG output waveform (CH1) vs. sampling clock (CH2).

In the first round of tests, we use a prototype design with a delay
chain length of 64. The coarse delay unit is composed of 12 inverters.
In the evaluation stage, the XOR method is used for data rate
considerations. In total, 224 slices are used for the TRNG and the
post-processing. The whole evaluation platform occupies around 1% of the
total resources. A snapshot of the random number output, with the
sampling clock running at 50 MHz, can be seen in Fig. 6.
According to the NIST test suite requirements [14], we collected
sequences of megabit length for the tests. In total, 100 batches of data
were tested and passed. The test results are listed in Table I. Only
Runs and Approximate Entropy show relatively low pass proportions, 77%
and 71%. All of the p-values are greater than 0.0001, so the result can
be considered passed. The bit sequences passed the DIEHARD test as
well [15].
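The XOR corrector used in this round (Section II-C) can be illustrated with a short sketch. The text does not give the register's taps or feedback, so this assumes the simplest reading: a 4-stage shift register fed by the raw bits, whose four stages are XORed together to form one output bit per input bit. The 60% bias of the raw source and the seeded PRNG are illustrative assumptions only.

```python
import random
from collections import deque


def xor_corrector(raw_bits, taps=4):
    """Sliding XOR over the last `taps` raw bits, one output per input bit."""
    window = deque([0] * taps, maxlen=taps)   # shift register, cleared at start
    parity = 0                                # XOR of the register contents
    out = []
    for b in raw_bits:
        parity ^= window[0] ^ b   # remove the oldest bit, add the newest
        window.append(b)          # maxlen drops the oldest entry automatically
        out.append(parity)
    return out


rng = random.Random(7)            # PRNG stand-in for a biased raw source
raw = [1 if rng.random() < 0.6 else 0 for _ in range(200_000)]
corrected = xor_corrector(raw)
print(f"raw mean       {sum(raw) / len(raw):.4f}")
print(f"corrected mean {sum(corrected) / len(corrected):.4f}")
```

Because the windows overlap, the output keeps the input data rate, but neighbouring output bits share three of their four inputs; the sketch only shows the effect on the 1/0 balance, not on correlation.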
In the second round of tests, we selected delay chain lengths of 8, 16
and 32 with the clock frequency at 50 MHz. The results show that the
NIST and DIEHARD test suites are partially passed when the chain length
is 32 but fail when it is less than 16. During the STA, we found that
each delay step shows a maximum delay of 0.472 ns, which combines a
routing delay of 0.404 ns and an inverter propagation delay of 0.068 ns.
For the sampling frequency of 50 MHz, the half clock period at the delay
chain is 20 ns. For metastable behavior to occur, there should be at
least 20/0.472 ≈ 42 delay steps. To ensure good-quality randomness
generation, the more delay steps, the better. Thus, the basic guideline
for using this methodology is to keep a sufficient number of delay units
while keeping the delay resolution fine enough to trigger a metastable
state in the latches.
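The STA figures quoted in the text can be checked with a few lines of arithmetic. The sketch below only recombines numbers stated in Sections II-D and III; the constant names are made up for illustration, and rounding the step count up to a whole latch is an assumption about how the guideline would be applied.

```python
import math

ROUTING_NS = 0.404        # routing delay per step (from the STA)
INVERTER_NS = 0.068       # inverter propagation delay per step
HALF_PERIOD_NS = 20.0     # half clock period at the delay chain, as quoted
PATH_DELAYS_NS = (0.392, 0.404)   # the two per-step path delays from the STA
WINDOW_NS = 0.100         # Virtex-6 metastability window quoted in the text

step_ns = ROUTING_NS + INVERTER_NS
# 20 / 0.472 is about 42.4, so 43 whole steps cover the half period.
min_steps = math.ceil(HALF_PERIOD_NS / step_ns)
delta_ns = abs(PATH_DELAYS_NS[1] - PATH_DELAYS_NS[0])

print(f"delay per step   : {step_ns:.3f} ns")
print(f"minimum steps    : {min_steps}")
print(f"delta            : {delta_ns:.3f} ns "
      f"(inside window: {delta_ns < WINDOW_NS})")
for length in (8, 16, 32, 64):
    print(f"chain length {length:2d}: covers half period = {length >= min_steps}")
```

Of the chain lengths tried in the two test rounds, only 64 spans the full half period in this calculation, which matches the observation that shorter chains pass the suites only partially or not at all.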
To compare with related TRNG works implemented on FPGAs, some
performance figures are listed in Table II. With a similar amount of
logic resources (LUTs and slices), the complementary scheme used in this
work clearly improves the random bit throughput.

TABLE I: NIST test results for the TRNG at 50 MHz.

 C1  C2  C3  C4  C5  C6  C7  C8  C9 C10  P-VALUE   PROPORTION  STATISTICAL TEST
 14  14  10   9  10  11   8  12   5   7  0.574903  100/100     Frequency
 11   6  10   5  11  18   8   7  10  14  0.137282  100/100     BlockFrequency
 13  12  11   6  10   8  10   6  14  10  0.678686  100/100     CumulativeSums
 50   8   7  11   4   5   3   4   3   5  0.000891   77/100     Runs
 17  10   9   7   9  10   7   6  11  14  0.334538   98/100     LongestRun
  7  18   7  19  15   4   5   3   8  14  0.000216   99/100     Rank
 11  13   7  15   6  15   8   9   8   8  0.366918   98/100     FFT
  4   4   5  14  13  14  13  10   9  14  0.058984  100/100     NonOverlappingTemplate
  7  11  10  13  10   8  11  13  14   3  0.366918   99/100     OverlappingTemplate
 10  11  12  10  13  12   7   6   7  12  0.779188   98/100     Universal
 72  18   4   3   1   2   0   0   0   0  0.008879   71/100     ApproximateEntropy
 16  11  12   6   8   8  13   8  12   6  0.366918   99/100     Serial
 13   7   2   5  10  17  10  14  12  10  0.040108   94/100     LinearComplexity
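NIST SP 800-22 [14] describes two checks for reading a table like Table I: the proportion of passing sequences should lie inside a band around 1 - α, and the P-value of the uniformity test should be at least 0.0001. The sketch below applies both to the table's P-VALUE and PROPORTION columns. It assumes α = 0.01, the suite's default significance level, which the text does not state; the variable names are illustrative.

```python
import math

ALPHA = 0.01       # assumed significance level (suite default, not stated in the text)
SEQUENCES = 100    # number of batches tested, from the text

# (test name, uniformity P-value, sequences passed) as listed in Table I
TABLE_I = [
    ("Frequency", 0.574903, 100),
    ("BlockFrequency", 0.137282, 100),
    ("CumulativeSums", 0.678686, 100),
    ("Runs", 0.000891, 77),
    ("LongestRun", 0.334538, 98),
    ("Rank", 0.000216, 99),
    ("FFT", 0.366918, 98),
    ("NonOverlappingTemplate", 0.058984, 100),
    ("OverlappingTemplate", 0.366918, 99),
    ("Universal", 0.779188, 98),
    ("ApproximateEntropy", 0.008879, 71),
    ("Serial", 0.366918, 99),
    ("LinearComplexity", 0.040108, 94),
]


def min_pass_proportion(alpha, m):
    """Lower edge of the band p_hat - 3*sqrt(p_hat*(1 - p_hat)/m), p_hat = 1 - alpha."""
    p_hat = 1.0 - alpha
    return p_hat - 3.0 * math.sqrt(p_hat * (1.0 - p_hat) / m)


threshold = min_pass_proportion(ALPHA, SEQUENCES)
low_proportion = [name for name, _, passed in TABLE_I
                  if passed / SEQUENCES < threshold]
low_uniformity = [name for name, p_value, _ in TABLE_I if p_value < 0.0001]

print(f"minimum proportion   : {threshold:.4f}")
print("below proportion band:", low_proportion)
print("P-value below 0.0001 :", low_uniformity)
```

Under that assumption the lower edge of the band is about 0.960, so Runs, ApproximateEntropy and LinearComplexity (94/100) fall below it, while every row clears the 0.0001 uniformity bound used in the text.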

TABLE II: Throughput comparison of FPGA-based TRNGs (unit: Mbps).

Entropy Source                     Reference                         Device             Throughput [Mbps]  Speed Grade*
DLL                                S.H.M. Kwok, E.Y. Lam [5]         Xilinx XC2VP20     6.05               2.4
Free-running Ring Oscillator       D. Schellekens et al. [7]         Xilinx XC2VP30     2.5                1
D Latch                            J. Danger et al. [8]              Altera EP1S25      20                 8
D Flip-flop                        M. Majzoobi et al. [9]            Xilinx XC5LX50T    2                  0.8
RS Latch                           H. Hata, S. Ichikawa [10]         Xilinx XC4VFX20    12.5               5
Fibonacci Oscillator               M. Dichtl, J. Golic [11]          Xilinx XC2S200     12.5               5
Transition Effect Ring Oscillator  M. Varchola et al. [12]           Xilinx XC3S500E    0.25               0.1
D Latch                            Complementary Scheme (this work)  Xilinx XC6VLX240T  50                 20

* Speed grade is calculated with the free-running ring oscillator as the baseline for reference.
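The speed grade column follows directly from the footnote: each throughput is divided by that of the free-running ring oscillator design (2.5 Mbps). A minimal sketch of that normalisation, with dictionary keys chosen here purely as labels:

```python
# Throughput in Mbps as listed in Table II
THROUGHPUT_MBPS = {
    "DLL [5]": 6.05,
    "Free-running ring oscillator [7]": 2.5,
    "D latch [8]": 20.0,
    "D flip-flop [9]": 2.0,
    "RS latch [10]": 12.5,
    "Fibonacci oscillator [11]": 12.5,
    "TERO [12]": 0.25,
    "D latch, complementary scheme (this work)": 50.0,
}
BASELINE = "Free-running ring oscillator [7]"


def speed_grades(throughput, baseline):
    """Normalise every throughput to the baseline entry."""
    ref = throughput[baseline]
    return {name: mbps / ref for name, mbps in throughput.items()}


grades = speed_grades(THROUGHPUT_MBPS, BASELINE)
for name, grade in grades.items():
    print(f"{name:45s} {grade:5.1f}")
```

Rounded to one decimal place, the computed values reproduce the Speed Grade column, including the factor of 20 for this work.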

IV. CONCLUSIONS

In this work, we have introduced a complementary scheme for generating
true random numbers on a Xilinx Virtex-6 device. We are able to harvest
the timing nonuniformity successfully. Using the inverter as the delay
unit, the output readily achieves a higher speed and a double data rate.
As a result, the proposed TRNG can generate random numbers stably at
50 MHz. After the post-processing step, the processed output passes the
popular statistical test suites, both DIEHARD and NIST.
For future work, we will extend the harvesting approach to address the
problem of low entropy. Furthermore, we will verify the same scheme on
the newest Xilinx 7 series devices. We expect that more promising
results and higher performance can be achieved.

ACKNOWLEDGMENT

This work was partly supported by the Research Grants Council of the
Hong Kong Special Administrative Region, China (Project No. CityU
123612), and the Croucher Startup Allowance (Project No. 9500015).
The authors would like to thank Dr. Zahid Ullah for very helpful
comments.

REFERENCES

[1] R. C. C. Cheung, D. Lee, W. Luk, and J. D. Villasenor, "Hardware Generation of Arbitrary Random Number Distributions From Uniform Distributions Via the Inversion Method," IEEE Transactions on VLSI Systems, Vol. 15, No. 8, Aug. 2007.
[2] N. A. Woods, T. VanCourt, "FPGA Acceleration of Quasi-Monte Carlo in Finance," International Conference on Field Programmable Logic and Applications, FPL, pp. 335-340, 2008.
[3] P. Diaconis, J. Fulman and S. Holmes, "Analysis of Casino Shelf Shuffling Machines," The Annals of Applied Probability, Vol. 23, No. 4, pp. 1692-1720, 2013.
[4] T. Saito, K. Ishii, I. Tatsuno, S. Sukagawa, and T. Yanagita, "Randomness and Genuine Random Number Generator With Self-testing Functions," Joint International Conference on Supercomputing in Nuclear Applications and Monte Carlo, Japan, October 17-21, 2010.
[5] S. Kwok and E. Lam, "FPGA-based High-speed True Random Number Generator for Cryptographic Applications," IEEE TENCON, Nov. 2006.
[6] B. Sunar, W.J. Martin and D.R. Stinson, "A Provably Secure True Random Number Generator with Built-In Tolerance to Active Attacks," IEEE Transactions on Computers, Vol. 56, Issue 1, Jan. 2007.
[7] D. Schellekens, B. Preneel, and I. Verbauwhede, "FPGA vendor agnostic true random number generator," International Conference on Field Programmable Logic and Applications, FPL, pp. 1-6, 2006.
[8] J. Danger, S. Guilley and P. Hoogvorst, "High speed true random number generator based on open loop structures in FPGAs," Microelectronics Journal, Vol. 40, Issue 11, pp. 1650-1656, Nov. 2009.
[9] M. Majzoobi, F. Koushanfar and S. Devadas, "FPGA-Based True Random Number Generation Using Circuit Metastability with Adaptive Feedback Control," Cryptographic Hardware and Embedded Systems, CHES, Lecture Notes in Computer Science Vol. 6917, pp. 17-32, 2011.
[10] H. Hata and S. Ichikawa, "FPGA Implementation of Metastability-Based True Random Number Generator," IEICE Transactions on Information and Systems, Vol. E95-D, No. 2, Feb. 2012.
[11] M. Dichtl and J. D. Golic, "High-Speed True Random Number Generation with Logic Gates Only," Cryptographic Hardware and Embedded Systems - CHES, Lecture Notes in Computer Science Vol. 4727, pp. 45-62, 2007.
[12] M. Varchola, M. Drutarovský, "New High Entropy Element for FPGA Based True Random Number Generators," Cryptographic Hardware and Embedded Systems - CHES, Lecture Notes in Computer Science Vol. 6225, pp. 351-365, 2010.
[13] R. Davies, "Exclusive or (xor) and hardware random number generators," http://www.robertnz.net/pdf/xor2.pdf, 2002.
[14] NIST, "NIST Special Publication 800-22, Rev1a," Apr. 27, 2010.
[15] G. Marsaglia, "DIEHARD Battery of Tests of Randomness," http://www.stat.fsu.edu/pub/diehard/
