Abstract—The Physically Unclonable Functions (PUFs) are SRAM-PUF having two cross-coupled latches. The most re-
used in numerous security applications such as device authen- cent RS-LPUF is based on the location information of the
tication, secret key generation, FPGA intellectual property (IP) random latches and the practical advantage of it over SRAM-
protection, and trusted computing. In this paper, compact im-
plementations of Ring oscillator-based PUF (RO-PUF), Arbiter- PUF is that its behavior does not rely on a power-up condition.
based PUF (A-PUF) and RS Latch-based PUF (RS-LPUF) on an Furthermore, programmable delay line (PDL) based FPGA
FPGA (Field Programmable Gate Array) platform are presented. PUF [10] also possess the potential for on-chip applications.
The proposed schemes provide very competitive area trade-offs In this paper, effective approach for three distinct FPGA
and effectively enable smallest FPGA implementations, reported
based PUFs design are being reported where they make use
so far, of RS-LPUF, RO-PUF, and A-PUF respectively. The
designs have been validated by developing prototypes on Xilinx of programmable delay lines in conjunction with A-PUF, RO-
Spartan-6 FPGAs at core voltage of 1.2V and normal operating PUF, and RS-LPUF respectively to generate higher probability
temperature. Finally, a detailed comparison of statistical analysis of random variations. In summary, the key contributions of this
on their performance using measured PUF data have been paper are as follows:
carried out. It has been demonstrated that the proposed designs
exhibit significantly improved performance in terms of statistical • Compact design of A-PUF, RO-PUF, and RS-LPUF on
properties when compared to the existing works. a Xilinx Spartan-6 FPGAs (XC6SLX45-3CSG324) are
proposed. The programmable delays of FPGA LUTs have
Keywords-PUF; PDL; FPGA; RO PUF, Arbiter PUF, RS Latch
PUF; been used to achieve more random process variations.
• The Obtained results from these designs have been com-
162
through a UART interface to the 8-bit Galois LFSR and
that generates the 256 subsequent challenges. Then from
each of these subchallenges two different ROs are chosen
for comparison. The subchallenges consist of two parts: the
4 least significant bits (LSB) select one of the 16 ROs in
group 1 through the multiplexer 1 while the 4 most significant
bits (MSB) of the subchallenge select one of the 16 ROs
in group 2 through the multiplexer 2. The frequency of the
selected ROs in group 1 and group 2 are then obtained and fed
into the 32-bit respective counters. A crystal 100 MHz clock
signal generated by an on-board oscillator drives the 8-bit
reference counter. The counter 1, counter 2, and the reference
counter start counting at the same time and are forced to
stop when reference counter hits its maximum value. Then,
the comparison of counter 1 and counter 2 values generates
a response bit 0 or 1 for this RO pair depending on which
counter had the higher value. Subsequently, the generated
response is stored in a 256-bit shift register. In this case,
therefore, 256 response bits are generated for each 8-bit master
challenge. Finally, the generated response bits are sent to the
PC using a UART 8-bit interface for the PUF performance
analysis. The controller circuitry is responsible for starting and
stopping the ROs using the enable input, the ROs selection,
Fig. 3: The proposed design of RO-PUF. counting the oscillations, and returning the responses based on
their comparisons.
B. RS-LPUF
[9] and [16] provide the implementations of SR-latch based
PUF on Xilinx Spartan-6 FPGA device. In [9], PUF response
determines the final state of the output patterns (all zeros, all
ones, or a combination of zeros and ones) of each latch by
applying consecutive rising edges. If the final output (1000
repetitions of the experiment) patterns are all zeros then the
corresponding response bits are 00. Similarly, if the final
output patterns are all ones then the corresponding response
Fig. 5: PUF array con- bits are 11 and if the final output patterns combine ones and
Fig. 4: Implementation of figuration of 32 ROs on zeros then the corresponding response bits are 10. In [16], PUF
2 ROs per CLB. Spartan-6 device response is determined using the exact number of oscillations
at the output of each latch during the metastable state by
applying a rising-edge at a control signal. In the current work,
used for enabling the RO. Two ROs are implemented inside the PUF response is derived from the difference in counting the
a single CLB, where each RO is implemented inside a single numbers of 1’s of the selected pairs of RS-LPUFs by applying
slice of CLB as shown in Fig. 4, each with a different color a consecutive rising edges and the concept is similar to the
scheme. In this design, 32 ROs are configured at the center proposed RO-PUF.
of a chip in a 4 × 4 matrix of CLBs as shown in red in The implementation reported in [9] has two NAND gates
Fig. 5. The hard macro technique to place them at selected and one FF (for each SR-latch) implemented in same kinds
locations in order to make all the ROs identical has been used. of slices on different CLBs. On the other hand, the design
In order to generate programmable delays inside the 6-input in [16] has two NAND gates and two FFs (for each SR-
LUT, one LUT-input is the ring connection while one other latch) implemented on different kinds of slices on the same
input is essentially the challenge bit. All other remaining LUT- CLB. In the current work, two NAND gates and one FF (for
inputs are fixed to zero. In addition, it is important to note that each SR-latch) are implemented in a single slice on one CLB
one of the input to the LUT of AND gate is used for enabling shown in Fig. 6. Moreover, In this work only 32 RS latches
the RO (see Fig. 3). for generating the 256 response bits are used whereas the
designs in [9] and [16] require 128 RS latches and 512 latches
The architecture of the proposed RO-PUF, given in Fig. 3,
respectively. Additionally, the proposed approach employs the
implemented on Xilinx Spartarn-6 FPGA device starts with
concept of PDL in the design of PUF for improving the
initiation of 8-bit master challenge from a user PC (Matlab)
163
random response. The proposed RS-LPUF architecture is shown in Fig. 7. In
this design, 32 RS latches are configured at the center of a chip
in a 4 × 2 matrix of CLBs. A 100-MHz clock signal generated
by an on-board oscillator is applied to each FF, which divides it
into a 50-MHz clock signal that is applied to corresponding RS
latches. The D-type FF acts as frequency divider by feeding
back the output from Q to the input terminal D, the output
pulses at Q have a frequency that are exactly one half that
of the input clock frequency (see Fig. 7). The RS Latches are
enabled and stopped using the enable signal of the architecture
and before applying the enable signal the flip-flop (FF) is reset.
It is done to ensure that the latches always start with the same
initial state. In this design, similar to the RO-PUF approach,
the generation of 256 subsequent challenges and selection of
2 SR-latches are done in 2 groups based on the subsequent
challenge inputs.
Fig. 6: Implementation of 4 RS latches per CLB As per the architecture, RS latch output is 1 when input
is 0 in a stable state. When input of the RS latch changes
from 0 to 1 (i.e., the rising edge), the RS latch temporarily
enters a metastable state. It then enters a stable state with either
output 0 or 1. In every clock cycle, the outputs of selected RS
latches in group 1 and group 2 are fed into the 8-bit counters
1 and 2 respectively. Then both counter value is incremented
every time selected RS latches output 1’s. The counter 1,
counter 2, and 8-bit reference counter start counting at the
same time and are forced to stop when reference counter hits
its maximum value. The comparison of counter 1 and counter
2 values generate a response bit 0 or 1 for this RS latch pair,
depending on which counter had the higher value. Finally, the
generated response is stored in a 256-bit shift register. In this
case, 256 response bits are generated for each 8-bit master
challenge which are then sent to the PC using a UART 8-bit
interface for the PUF performance analysis.
C. A-PUF
In [10], PDL based A-PUF is presented on Xilinx Virtex 5
FPGA device. The design of the switching elements in PDL
based A-PUF uses non-swapping path based switches instead
of path-swapping switches to cancel out the delay skews
caused by asymmetries in routing on FPGAs. In their design,
Fig. 7: RS-LPUF design. the PDL (top and bottom part of switch) is implemented by 2
LUTs each acting as an inverter. The 2 LUTs are implemented
In the proposed work, an SR-latch is created from two LUTs in same kinds of slices on different CLB. They placed 16
and one flip-flop (FF) as shown in Fig. 6. Two LUTs are PUF instances in parallel on the FPGA to produce 16 bits of
used to create NAND gates while the FF in front of the two responses per challenge, where each PUF instance consists of
NAND gates is used to reduce the clock skew. Furthermore, 64 stages of PDLs. Finally, the PUF responses are determined
four latches are implemented inside a single CLB whereas two to apply 1000 random challenges to the PUF instances. If
RS Latches are implemented inside similar type of CLB slice more than half of the responses are 1’s, the PUF response
using hard macro, shown in Fig. 6, each with a different color is considered 1, and 0 otherwise. On the other hand, the
scheme. current work proposed in this paper utilizes the proposed RO-
These slices are placed manually at predefined location PUF design approach where PUF response is derived from
with predefined connections to minimize the skew of internal the difference in counted 1’s between the selected pairs of
signals. In order to generate programmable delays inside the PUF instances from 32 PUF instances by applying rising clock
6-input LUT, 3 inputs of each LUT are used for connection edges. It is imperative to note that each PUF instance here
purposes such as FF output, another LUT output, and one consists of 8 stages of PDLs. The PDL is implemented by 2
challenge bit while the remaining LUT-inputs are fixed to zero. LUTs each acting as buffer and the 2 LUTs are implemented
164
in different slice on the same CLB (see Fig. 8). configured delay paths of the PUFIs (see Fig. 9). A 100-MHz
clock signal generated by an on-board oscillator is also applied
to the PUFIs. In every clock cycle, the outputs of a selected
PUFI in group 1 and group 2 are fed into the 8-bit counters 1
and 2 respectively. Then both counter value is incremented
every time the selected PUFIs outputs 1’s. The counter 1,
counter 2, and the 8-bit reference counter start counting at
the same time and are forced to stop when reference counter
hits its maximum value. Next, the comparison of counter 1 and
counter 2 values generates a response bit 0 or 1 for this PUFIs
pair, depending on which counter had the higher value. Finally,
the generated response is stored in a 256-bit shift register. As
a result, 256 generated response bits for each 8-bit master
Fig. 8: Implementation of one PUFI on the 3 CLBs. challenge are sent to the PC for PUF response analysis.
165
TABLE I: Comparison of FPGA implementation results for the R EFERENCES
proposed RO-PUF, RS-LPUF and A-PUF with State-of-the-Art [1] P. S. Ravikanth, “Physical one-way functions,” in PH. D. THESIS.
Massachusetts Institute of Technology, Mar 2001.
[2] B. Gassend, D. E. Clarke, M. van Dijk, and S. Devadas, “Silicon
CLB for Area Response Target
Design PUF elements (Total bit length FPGA
physical random functions,” in Proceedings of the 9th ACM Conference
config.† Slices*) Device on Computer and Communications Security, CCS 2002, Washington,
Spartan-3E DC, USA, November 18-22, 2002. ACM, 2002, pp. 148–160.
[14] 128 — 127 (XC3S500E) [3] J. Guajardo, S. S. Kumar, G.-J. Schrijen, and P. Tuyls, “FPGA Intrinsic
Spartan-3E PUFs and Their Use for IP Protection,” in Proceedings of the 9th
[15] 130 727 283 (XC3S500E)
RO-PUF Spartan-6 International Workshop on Cryptographic Hardware and Embedded
[17] — 952 49 (XC6SLX45) Systems, ser. CHES ’07, 2007, pp. 63–80.
Our work Spartan-6 [4] H. Kang, Y. Hori, and A. Satoh, “Performance evaluation of the first
(Section III-A) 16 82 256 (XC6SLX45)
commercial PUF-embedded RFID,” in The 1st IEEE Global Conference
Spartan-6
[9] 256 — 256 (XC6SLX16)
on Consumer Electronics 2012, Oct 2012, pp. 5–8.
RS-LPUF 324 + Spartan-6 [5] G. E. Suh and S. Devadas, “Physical Unclonable Functions for Device
[16] 128 1 BRAM 256 (XC6SLX16) Authentication and Secret Key Generation,” in Proceedings of the 44th
Our work Spartan-6 Design Automation Conference, DAC 2007, San Diego, CA, USA, June
(Section III-B) 8 54 256 (XC6SLX45)
4-8, 2007. IEEE, 2007, pp. 9–14.
Virtex-5
[10] 2064 — 160 (XC5VLX30)
[6] Y. Alkabani, F. Koushanfar, and M. Potkonjak, “Remote activation of ICs
A-PUF Spartan-6 for piracy prevention and digital right management,” in 2007 IEEE/ACM
[18] — 4,288 64 (XC6SLX45) International Conference on Computer-Aided Design, Nov 2007, pp.
Our work Spartan-6 674–677.
(Section III-C) 96 234 256 (XC6SLX45)
[7] J. W. Lee, D. Lim, B. Gassend, G. E. Suh, M. van Dijk, and S. Devadas,
†
Number CLBs for elements configuration: 1. ROs in RO-PUF, 2. RS latches in RS-LPUF, “A technique to build a secret key in integrated circuits for identification
3. PUF Instances in A-PUF and authentication applications,” in VLSI Circuits, 2004. Digest of
*
The total slices for the PUF with control logic (without UART). Technical Papers. 2004 Symposium on, June 2004.
[8] S. S. Kumar, J. Guajardo, R. Maes, G. J. Schrijen, and P. Tuyls,
“Extended abstract: The butterfly PUF protecting IP on every FPGA,”
in Hardware-Oriented Security and Trust, 2008. HOST 2008. IEEE
International Workshop on, June 2008, pp. 67–70.
reliability and uniformity values are closer to the ideal values [9] D. Yamamoto, K. Sakiyama, M. Iwamoto, K. Ohta, M. Takenaka, and
as well. K. Itoh, “Variety enhancement of PUF responses using the locations of
random outputting RS latches,” J. Cryptographic Engineering, vol. 3,
no. 4, pp. 197–211, 2013.
TABLE II: Performance Comparisons with Previous PUFs [10] M. Majzoobi, F. Koushanfar, and S. Devadas, “FPGA PUF using
programmable delay lines,” in 2010 IEEE International Workshop on
Design Unique Relia Uni Information Forensics and Security, WIFS 2010, Seattle, WA, USA,
ness bility formity December 12-15, 2010. IEEE, 2010, pp. 1–6.
Ideal [11] Xilinx., “Spartan-6 Libraries Guide for HDL Designs,” December, 2009.
Value 50% 100% 50% Available at:, http://www.ftdichip.com/Support/SoftwareExamples/
[14] 43.50 94 — CodeExamples/CSharp.htm.
[15] 47.67 98.10 50.75 [12] M. Majzoobi, F. Koushanfar, and S. Devadas, “FPGA-Based True
RO-PUF [13] 46.99 99.13 50.56 Random Number Generation Using Circuit Metastability with Adaptive
Our work 47.13 99.16 50.61 Feedback Control,” in Cryptographic Hardware and Embedded Systems
(Section III-A) - CHES 2011, vol. 6917. Springer, 2011, pp. 17–32.
[13] A. Maiti, V. Gunreddy, and P. Schaumont, “A Systematic Method
[9] 49.0 99.14 — to Evaluate and Compare the Performance of Physical Unclonable
RS-LPUF [16] 49.4 98.87 — Functions,” IACR Cryptology ePrint Archive, vol. 2011, p. 657, 2011.
Our work 48.1 99.19 50.2 [Online]. Available: http://eprint.iacr.org/2011/657
(Section III-B) [14] A. Maiti and P. Schaumont, “Improving the quality of a Physical Unclon-
A-PUF [13] 18.37 99.76 55.69 able Function using configurable Ring Oscillators,” in 2009 International
Our work 44.30 96.00 48.45 Conference on Field Programmable Logic and Applications, Aug 2009,
(Section III-C) pp. 703–707.
[15] B. Habib, K. Gaj, and J. P. Kaps, “FPGA PUF Based on Programmable
LUT Delays,” in Digital System Design (DSD), 2013 Euromicro Con-
V. C ONCLUSION ference on, Sept 2013, pp. 697–704.
[16] B. Habib, J. Kaps, and K. Gaj, “Efficient SR-Latch PUF,” in Applied
In this work, optimized designs and implementations of Reconfigurable Computing - 11th International Symposium, ARC 2015,
RO-PUF, RS-LPUF, and A-PUF with significantly enhanced Bochum, Germany, April 13-17, 2015, Proceedings, vol. 9040. Springer,
2015, pp. 205–216.
performances have been presented. Here, the programmable [17] R. Maes, A. V. Herrewege, and I. Verbauwhede, “PUFKY: A Fully
delays of FPGA LUTs have been used to achieve more random Functional PUF-Based Cryptographic Key Generator,” in Cryptographic
PUF response. It has been demonstrated that the proposed Hardware and Embedded Systems - CHES 2012, vol. 7428. Springer,
2012, pp. 302–319.
implementations yield very area effective A-PUF, RO-PUF, [18] S. Katzenbeisser, Ü. Koçabas, V. van der Leest, A. Sadeghi, G. J.
and RS-LPUF implementations reported so far. The statistical Schrijen, H. Schröder, and C. Wachsmann, “Recyclable PUFs: Logi-
analysis of the features of proposed PUF show substantial cally Reconfigurable PUFs,” in Cryptographic Hardware and Embedded
Systems - CHES 2011, 2011, pp. 374–389.
improvement over existing PUFs. The proposed PUF im-
plementations, therefore, could be very good candidates for
lightweight security applications. In future, the proposed work
could be extended by validating these designs at varying
temperatures and operating voltages.
166