
Autonomous Device Identification Architecture for Internet of Things

Hirofumi Noguchi, Tatsuya Demizu, Naoto Hoshikawa, Misao Kataoka, Yoji Yamato
NTT Network Service Systems Laboratories
Nippon Telegraph and Telephone Corporation
3-9-11 Midori-cho, Musashino-shi, Tokyo 180-8585 Japan
Email: { hirofumi.noguchi.rs, tatsuya.demizu.av, naoto.hoshikawa.yu, misao.kataoka.ry, yoji.yamato.wa }@hco.ntt.co.jp

Abstract—A wide variety of devices is being installed in environments such as homes, factories, and streets with the rapid expansion of the Internet of Things (IoT). To properly and securely use IoT devices, the states of various devices that have different properties and protocols must be managed. However, it is difficult to recognize an individual device as the same one when its installation place or software changes, since many individual IoT devices do not have unique identifiers. In this paper, we propose an architecture that estimates individual devices by analyzing and combining multiple pieces of information that can be obtained from the device. This architecture estimates individual device identity based on the time change pattern of the feature amount extracted from a signal transmitted by a device. Results of a simulation revealed that this architecture could identify individual devices and find the correct device.

Index Terms—IoT, device, identification, management.

978-1-4673-9944-9/18/$31.00 ©2018 IEEE

I. INTRODUCTION

A wide variety of devices is being connected to the Internet as the Internet of Things (IoT) rapidly expands. It is expected that 50 billion devices will be connected to the Internet by 2020 [1], and more and more devices will be installed in various environments such as homes, factories, and streets [2][3][4]. These devices include sensors such as cameras and thermometers, small computers such as smartphones, and actuators such as speakers and displays. Furthermore, their computing resources and protocols are also diverse. To properly and safely use such a large variety of devices, device managers have to understand and manage the properties and states of each device.

However, there are some problems with managing IoT devices. First, devices are dynamically added and removed in the same environment. Furthermore, in OpenIoT [5], one device will be used for various purposes, so the installation location, the network connection state, and the type and version of the software will change dynamically.

For example, consider the location. In a home, the installation location of a sensor on an appliance will change in accordance with appliance movement. At a factory, while a production line is being refurbished, the sensors will be moved to another production line and reused there. Furthermore, the locations of laptops and web cameras will also change depending on the movement of the user. Therefore, changes in the installation location, such as the room, need to be accurately detected to know where the device is now.

Next, consider the network. Some devices have multiple network interfaces. For example, smartphones change the access network from Wi-Fi to a mobile network such as LTE. Changes of the network also need to be accurately detected to know where the device is now. Conventionally, a device connected to a network could be uniquely identified by its media access control (MAC) address, but recent operating systems (OSs) randomly generate a MAC address each time a network connection is made, for security reasons. Therefore, the MAC address can no longer be used as a consistent key.

Also, consider software operation on the device. Firmware and OSs are regularly updated. Thus, if the same device cannot be recognized before and after the software is updated, the device will be unknown. In this way, if the location of the device, the network, or the software changes, changes to an individual device need to be tracked to know where the device is now located in the real world and in the network.

In addition, from the viewpoint of security, individual devices that have dynamically changing states must be consistently tracked. For example, if it is not possible to track the state log of a certain device from the past to the present regardless of location or software changes, it is impossible to determine past behaviors or the affected range when failures happen. Also, from the viewpoint of device authentication, the identity of devices must be recognized before and after the change to judge safety without re-authentication when the state of the authenticated device changes within the authentication policy.

In this way, it is necessary to grasp the state of various devices with different properties and protocols and identify the devices without confusing them with others, even if the state of the device changes. One approach is to provide a database (DB) in an environment and manually manage the state of devices, but manually managing a huge number of devices is not realistic. Therefore, we are establishing technology to realize IoT resource management and high security by automatically detecting the state and change of various devices and
identifying individual ones. This is an important and challenging theme for expanding IoT.

The rest of this paper is organized as follows. Section II surveys the related work. Section III describes the device identification architecture. Section IV presents the simulation and experimental results. Section V discusses future challenges. Finally, Section VI concludes the paper.

II. RELATED WORK

There are some technologies for identifying individual devices.

One existing technology issues a computer certificate with EAP-TLS [6]. It can identify an individual computer by issuing a computer certificate for each device and installing it there. However, it can be applied only to resource-rich devices that can handle the EAP-TLS protocol, such as personal computers, even though many IoT devices such as sensors are resource-constrained. To manage a wide variety of devices in various environments, another approach is needed.

There is technology to identify the user of a device and its hardware configuration from web fingerprints [7][8]. It finds the characteristics of users and devices and identifies them with high accuracy by investigating the behaviors of the browser. However, it can only be applied to devices that can operate browsers, so another way is still needed to manage a wide variety of IoT devices.

There are methods for identifying individual devices from hardware fingerprints such as device radio wave intensity and output video characteristics [9][10]. These methods are effective for identifying counterfeit goods and detecting defective items, but they need high-load analysis using special equipment. Consequently, the current technology maturity level is not sufficient to manage a large number of devices at low cost.

There is a method for managing resource-constrained devices using IP-based management protocols such as SNMP and NETCONF [11]. It shows a method for utilizing IP-based protocols on devices with limited central processing units (CPUs) and memory. However, an OS must be able to handle these protocols. Therefore, applicable devices are still limited, and the range needs to be expanded to track devices that have changed state.

There is a method for identifying the OS and application operating on the device by analyzing the traffic of the device [12][13]. However, it can identify only the type of software, not individual devices.

As described above, there is no individual-device identification technology that can deal with various kinds of devices used in various environments and does not require any special measuring equipment. It is thus an important research theme for the future IoT expansion.

III. PROPOSED ARCHITECTURE

This section presents our proposed IoT device identification architecture. To automatically identify individual devices, we chose a method to estimate device identity. Many IoT devices do not have information that can uniquely identify individuals, such as certificates. It is also impossible to reliably identify individuals from communication information. However, it is possible to estimate an individual's identity by analyzing and combining multiple pieces of information. Even if the estimation is not completely precise, devices can be managed if the estimation accuracy is sufficiently high.

Figure 1 shows an overview of the architecture. This architecture estimates individual device identity based on the time change pattern of the feature amount extracted from a signal transmitted by a device such as a sensor. The change pattern of the device feature amount reflects the characteristics of each device and the unique way it is used in each environment.

In addition, we separate the data acquisition interface and the estimation logic, so the architecture can be extended to deal with various kinds of devices.

The process of individual device identification is described below.

A. Extraction of Device Features

The first step of individual device identification is the extraction of device feature amounts. Feature amounts are periodically extracted from signals such as sensor values or keep-alive signals, or from responses to actions such as port scanning. The feature amount includes information indicating the state of the device and the software version being executed. It also includes traffic characteristics such as the average traffic amount and the communication interval within a certain period of time. Since the communication protocol varies depending on the type of device, different devices have different methods to acquire data and extract the feature amount. Therefore, in this architecture, a protocol capture is provided for each protocol. To acquire and handle low-layer communication information such as a MAC frame, a protocol capture should preferably be implemented as a gateway in the local network environment.

B. Individual Device Identification

The proposed architecture generates a change pattern from the device feature amount. By comparing this change pattern with change patterns stored in the past, the architecture identifies individual devices.

Figure 2 shows details of processing in the device identifier. For each feature amount, the pattern detector generates a time change pattern for each individual and stores it in the device pattern DB. The change pattern is, for example, a difference in continuous values or an approximate curve. The change pattern and the individual device are stored in association with each other in the device pattern DB. When identifying a target device or storing a change pattern, the similarity calculator calculates the degree of similarity with patterns already stored in the device pattern DB.
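
As a rough illustration of the pattern detector described above, the following Python sketch derives two kinds of change pattern from a short time series of one feature amount: the differences between successive values, and the slope and intercept of a least-squares line. The function names, the sample data, and the choice of a straight line as the "approximate curve" are assumptions made for illustration only; the architecture itself does not prescribe a particular representation.

```python
from statistics import mean


def successive_differences(values):
    """Return the difference between each value and the one before it."""
    return [b - a for a, b in zip(values, values[1:])]


def linear_pattern(times, values):
    """Fit value = slope * time + intercept by ordinary least squares."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) samples")
    t_mean, v_mean = mean(times), mean(values)
    denom = sum((t - t_mean) ** 2 for t in times)
    if denom == 0:
        raise ValueError("all samples share the same timestamp")
    slope = sum((t - t_mean) * (v - v_mean)
                for t, v in zip(times, values)) / denom
    intercept = v_mean - slope * t_mean
    return slope, intercept


# Hypothetical samples: seconds since the first message, and one
# numeric feature amount (for example, a traffic amount).
times = [0, 10, 20, 30]
values = [100, 120, 140, 160]

print(successive_differences(values))  # [20, 20, 20]
print(linear_pattern(times, values))   # (2.0, 100.0)
```

Either result could be stored per device and per feature amount as the change pattern. A non-numeric feature amount, such as an access point name, would first have to be mapped to a number before a line can be fitted.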

Fig. 1. Overview of Architecture (diagram: devices and their signals, protocol captures, feature amounts, device identifier, device information, device DB, device pattern DB)

Fig. 2. Individual Device Identification Process (diagram: feature amounts over message time, feature amount storage, pattern detector, similarity calculator, device pattern DB with patterns for Device #1 to #3, unknown device)

Equations 1 and 2 show formulas for calculating the degree of similarity.

$$ c_i = 1 - \frac{\Delta x_i}{\max(\Delta x)} \tag{1} $$

$$ C = \sum_i k_i c_i \tag{2} $$

c is the degree of similarity for each feature amount, C is the comprehensive degree of similarity, and x is a digitized feature amount. c and C take a value in the range of 0 to 1. c is calculated by comparing the change pattern from the identification target device and the change pattern in the device pattern DB. k is the weight of each feature amount; the method of calculating it is explained later. After calculating the degrees of similarity for each feature amount and integrating these degrees with their weights, the similarity calculator obtains a comprehensive degree of similarity for the device.

The device identifier calculates the degree of similarity for all the patterns in the device pattern DB and estimates that the device with the highest degree of similarity is the same as the identification target device. When no device has a similarity degree higher than a certain level, the device identifier estimates that the identification target device is a new device connected to the network.

C. Calculation of Weight of Feature Amounts

The proposed architecture calculates the weight of each feature amount. The feature amount change pattern is preferably unique enough not to overlap among multiple individuals. For example, in an environment containing many mobile devices, the locations of the majority of devices change frequently and should not be referenced for individual identification. As another example, in an environment where the software of devices of the same model is updated simultaneously in a constant cycle, its change pattern should not be referenced for individual identification either. If the proposed architecture handles all the feature amount changes uniformly, it assigns close degrees of similarity to many devices and the accuracy of estimation is reduced. This is a major problem, particularly when only a small number of feature amounts can be obtained. Therefore, this architecture evaluates the uniqueness of each feature amount. The similarity calculator calculates the variance value of the degree of similarity of all change patterns. The architecture regards a feature amount with large variance as a unique feature amount and gives it greater weight. Equation 3 shows the formula for calculating the weight.

$$ k_i = \frac{v_i}{\sum_j v_j} \tag{3} $$

k is the weight of each feature amount, and the sum of k is 1. v is the variance value of the degree of similarity for all change patterns. k is calculated in accordance with the ratio of the variance values.

D. Applying Identification Results

The final step is applying the identification result to the device DB. The device DB records the state of the managed device, such as its location or executing software. This architecture updates the state information of the device in the device DB by using the estimation result. In addition, this architecture updates the device pattern DB by registering new feature amount change patterns in it.

As described above, it is possible to automatically identify an individual device whose state changes.

IV. RESULTS OF SIMULATION

We evaluated the proposed architecture by simulation. As an initial experiment, we tried to confirm that this architecture can distinguish individuals based on the degree of similarity calculated from the feature amount of the device. This is because if the similarity values of many devices in an environment are close, the architecture cannot find the matching device properly.

Simulation conditions are shown in Table I. Mobile devices that constantly move in the environment are assumed. There are 10 devices of the same model, and all have 5 feature amounts: access point, communication delay, communication interval, average communication amount, and OS version. Assuming the movement of the mobile device, each device always selects one access point and changes it with a certain probability within a certain range. The communication delay depends on the access point, whereas the communication interval and communication amount depend on the application being executed. The device always executes one of two kinds
of applications, and in the initial state the two applications are executed by an equal number of devices. An executed application is changed at a low frequency. The OS version is the same and unchanged for all devices. Under the above conditions, we generated random values as data sets of feature amounts. Each data set includes 10 consecutive sets of feature amounts that change according to the change frequency. We generated such data sets for 10 devices.

On the basis of the change pattern of one device, we calculated the degree of similarity with all other devices using the proposed individual identification method. To calculate the change pattern of a feature amount, we used an approximate linear function of the feature amount with respect to time. For calculation, the access point and the OS are replaced with a corresponding number. To calculate the degree of similarity for each feature amount, we use the slope and intercept of the approximate linear function of each feature amount change. Figure 3 shows the distribution of the similarity degree for 10 devices. The variance value of the similarity degree for 10 devices is 0.095. The proposed architecture was able to separate devices into similar and dissimilar ones based on the change pattern. Therefore, the architecture will be able to find the correct device. In addition, when all the weights of each feature amount are handled at an equal ratio, the variance value is 0.053. The weighting thus increased the variance (from 0.053 to 0.095) and improved the estimation accuracy.

TABLE I. Simulation Configuration

Feature Amount   Candidate    Change Frequency   Change Range
Access Point     1,2,3,4,5    20%                (1,2,3),(4,5,6)
Application      A,B          1%                 (A,B)
OS               X            0%                 -

Access Point   Delay (ms)
1              10
2              15
3              30
4              20
5              10

Application   Amount (Kbyte)   Interval (ms)
A             100              200
B             200              350

Fig. 3. Degree of Similarity for Each Device from Simulation (plot: degree of similarity, 0 to 1, and variance, 0 to 0.1, for the No Weight and With Weight cases)

V. DISCUSSION

This research is still in its early stages, and further research is required for implementation of the proposed architecture. Although the simulation results show the effectiveness of the architecture in a specific use case model, a detailed method of individual device identification must be studied in order to identify various IoT devices.

First, variation patterns in appropriate time ranges need to be extracted for each feature amount because patterns vary depending on the feature amount of the device. Sensor data may change in a cycle of several minutes, whereas software versions may change in cycles of several days. Considering how to generate change patterns with various time ranges for various feature amounts is a challenge.

Second, the change pattern must be extracted and the degree of similarity calculated in accordance with the characteristics of the feature amount. For a continuous feature amount, the proposed architecture can calculate the degree of similarity by finding an approximate function from the changing trajectory. On the other hand, for a discontinuous feature amount, considering how to formulate the change pattern and calculate the degree of similarity is a challenge. We plan to do more simulations to improve the architecture. Also, we plan to verify the effectiveness of the architecture by applying it in an actual IoT operational environment.

Furthermore, applying this architecture to device authentication is a next challenge. It is hard to identify a malicious device accurately because it can impersonate another device by sending the same change pattern. One measure is to make it difficult to impersonate a device by identifying devices using a large number of feature amounts.

Finally, as the future outlook on this research, we are aiming to realize autonomous IoT service building. We are considering an approach to combine this architecture with the service control technology [14][15]. We believe that we can use autonomous device identification in the network to find appropriate devices for the service.

VI. CONCLUSION

A wide variety of devices is being installed in various environments such as homes, factories, and streets with the rapid expansion of the Internet of Things (IoT). To properly and securely use IoT devices, the state of various devices that have different properties and protocols must be managed. However, it is difficult to recognize an individual device as the same one when its installation place or software changes, since many individual IoT devices do not have unique identifiers.

In this paper, we proposed an architecture that solves the above problem by estimating individual devices by analyzing and combining multiple pieces of information that can be obtained from the device. This architecture estimates individual device identity based on the time change pattern of the feature amount extracted from a signal transmitted by a device. Simulation results of the model case assuming a mobile device revealed that the proposed architecture can identify
individual devices by using the degree of similarity calculated from the feature amount of the device.

This research is in its early stage, and we proposed only a basic architecture to automatically identify IoT devices. In the future we will study various devices and computation methods and implementations corresponding to their features, and will verify the effectiveness of the architecture by using it in an actual IoT operational environment.

REFERENCES

[1] D. Evans, "The Internet of Things – How the Next Evolution of the Internet is Changing Everything," Cisco Internet Business Solutions Group (IBSG), April 2011.
[2] Gao Chong, Ling Zhihao, and Yuan Yifeng, "The research and implement of smart home system based on internet of things," Electronics, Communications and Control (ICECC), International Conference on. IEEE, 2011.
[3] Jay Lee, "Smart factory systems," Informatik-Spektrum, 38.3, pp. 230-235, 2015.
[4] Aditya Gaur, Bryan Scotney, Gerard Parr, and Sally McClean, "Smart city architecture and its applications based on IoT," Procedia Computer Science, 52, pp. 1089-1094, 2015.
[5] Jaeho Kim and Jang-Won Lee, "OpenIoT: An open service framework for the Internet of Things," Internet of Things (WF-IoT), 2014 IEEE World Forum on. IEEE, 2014.
[6] RFC 5216, "https://www.rfc-editor.org/rfc/rfc5216.txt."
[7] Gunes Acar, Christian Eubank, Steven Englehardt, Marc Juarez, Arvind Narayanan, and Claudia Diaz, "The web never forgets: Persistent tracking mechanisms in the wild," Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, 2014.
[8] Gunes Acar, Marc Juarez, Nick Nikiforakis, Claudia Diaz, Seda Gürses, Frank Piessens, and Bart Preneel, "FPDetective: dusting the web for fingerprinters," Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, 2013.
[9] Gianmarco Baldini and Gary Steri, "A survey of techniques for the identification of mobile phones using the physical fingerprints of the built-in components," IEEE Communications Surveys & Tutorials, 2017.
[10] Qiyue Li, Hailong Fan, Wei Sun, Jie Li, Liangfeng Chen, and Zhi Liu, "Fingerprints in the Air: Unique Identification of Wireless Devices Using RF RSS Fingerprints," IEEE Sensors Journal, 17.11, pp. 3568-3579, 2017.
[11] Anuj Sehgal, Vladislav Perelman, Siarhei Kuryla, and Jürgen Schönwälder, "Management of resource constrained devices in the internet of things," IEEE Communications Magazine, 50.12, 2012.
[12] Takashi Matsunaka, Akira Yamada, and Ayumu Kubota, "Passive OS fingerprinting by DNS traffic analysis," Advanced Information Networking and Applications (AINA), IEEE 27th International Conference On. IEEE, 2013.
[13] Dario Bonfiglio, Marco Mellia, Michela Meo, Dario Rossi, and Paolo Tofanelli, "Revealing skype traffic: when randomness plays with you," ACM SIGCOMM Computer Communication Review, Vol. 37, No. 4, ACM, 2007.
[14] Yoji Yamato, Hiroyuki Ohnishi, and Hiroshi Sunaga, "Development of Service Control Server for Web-Telecom Coordination Service," IEEE International Conference on Web Services (ICWS 2008), pp. 600-607, Sep. 2008.
[15] Hiroshi Sunaga, Yoji Yamato, Hiroyuki Ohnishi, Masashi Kaneko, Masami Iio, and Miki Hirano, "Service Delivery Platform Architecture for the Next-Generation Network," ICIN2008, Session 9-A, Oct. 2008.
