Anda di halaman 1dari 111

Communications

in Computer and Information Science 26


Patrick Bond (Ed.)

Communications
and Networking
in China
1st International Business Conference, ChinacomBiz 2008
Hangzhou, China, August 2008
Revised Selected Papers

13
Volume Editor

Patrick Bond
ICST, Begijnhoflaan 93a, 9000 Gent, Belgium
E-mail: patrick.bond@icst.org

Library of Congress Control Number: 2009920523

CR Subject Classification (1998): C.2, J.1, B.4

ISSN 1865-0929
ISBN-10 3-642-00204-8 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-00204-5 Springer Berlin Heidelberg New York

The copyright of the content of this book is governed by the Belgian Copyright Act (30 Jun 1994). With
the exceptions stated explicitly by the Act, no part of the content of the book is allowed to be translated,
modified, copied, stored in a database or electronic retrieval system or made public without the prior written
consent of ICST.

ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering 2009
Printed in Germany

springer.com
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 12607983 06/3180 543210
Preface

ChinacomBiz 2008 was the first international business-to-business (B2B) conference


held in collaboration with the Chinacom scientific event in Hangzhou, China, on
August 28. It was specifically tailored to produce more effective dialogue between the
research, technology and business communities on technical developments in China
and around the world. The event had industry support via well-know industry names
such as the WIFI Alliance, Springer, CREATE-NET and Arris.
The main focus of the event was on transport networks and infrastructures, video
distribution systems and methods and the associated software systems that tie them all
together. A total of 24 authors submitted their papers and a final 11 registrations were
accepted. It was an excellent start to the B2B program and feedback was positive from
all attendees.
The presentations this year originated from the USA, Denmark, Germany, Brazil
and of course China, providing the audience with a wide variety of topics and perspec-
tives in all the three categories mentioned above. The presentations from the event are
available for download from the website itself (see http://www.chinacombiz.org/).
Next years event is already being planned and it will again be collocated with the
Chinacom scientific event. We look forward to having another great interaction with
science, business and technology from a China perspective.

August 2008 Patrick Bond


Table of Contents

Undetectable Manipulation of CRC Checksums for Communication


and Data Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Frank Schiller, Tina Mattes, Uwe Weber, and Rainer Mattes
Survey of PHY and LINK Layer Functions of Cognitive Radio Networks
for Opportunistic Spectrum Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Liang Hu, Villy B. Iversen, and Lars Dittmann
Optimizing TCP Performance over UMTS with Split TCP Proxy . . . . . . 25
Liang Hu and Lars Dittmann
Use of Photonic Networks in Digital Cinema Postproduction Dailies
Workow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Michal Krsek, Paul Hearty, Laurin Herr, and Jurgen Schulze
Visualizing TCP/IP Network Flows to Control Congestion and
Bottlenecks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Nanjun Li and Werner Zorn
ZBMF: Zone Based Message Ferrying for Disruption Tolerant
Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Bahadir K. Polat, Moazzam Khan, Nikhil Tuli, and Pooja Kailay
Improving Query Mechanisms for Unstructured Peer-to-Peer
Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Guangwei Fang and Xiao Zheng
Layer 3 Initialization Procedures Recommendation in Next Generation
Heterogeneous Mobile Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Sebastiao Boanerges Ribeiro Junior and
Paulo Roberto de Lira Gondim
Broadcast Data Scheduling in Wireless Environment . . . . . . . . . . . . . . . . . 76
Xianjun Sun, Chuang Lin, Weidong Liu, and Yanping Xiao
Analysis on Imai-Shins LR-AKE Protocol for Wireless Network
Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Yingjie Wang, Wei Luo, and Changxiang Shen
Surveys on the Intrusion Tolerance System . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Kuo Zhao, Nurbol, Fei Ren, Jianfeng Chu, and Liang Hu
Study on the DS/CDMA-CSMA Multi-user Communication Systems . . . 98
Yun-xiao Zu, Xiao Wu, and Yu-qin Lv

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105


Undetectable Manipulation of CRC Checksums for
Communication and Data Storage

Frank Schiller1, Tina Mattes1, Uwe Weber2, and Rainer Mattes2


1
Technische Universitt Mnchen, Institute of Information Technology in Mechanical
Engineering, Garching near Munich, Germany
{mattes,schiller}@itm.tum.de
2
Siemens AG, Industry Sector, Nuremberg, Germany

Abstract. The Cyclic Redundancy Check (CRC) is an efficient and widespread


coding method to detect random errors in industrial and business data
transmission. Because of its simple implementation and low residual error
probability, it is also applied to data stored in digital memory. If no error
occurred then the checksum (FCS, Frame Check Sequence) is consistent to the
original data. Additionally, the actual value of the FCS is often used as a crite-
rion for e.g. the correct version of the software in digital memories or to detect
its manipulation. Although it is well known that data protected by CRC can be
manipulated undetectably, CRC is applied for the detection of manipulations
but not random errors. The paper demonstrates and proves in detail how easily
data, like business software or video streams, can be manipulated such that the
manipulation is not detectable even for CRC of high order.

Keywords: Cyclic Redundancy Check, checksum, manipulation, data


transmission, data storage.

1 Introduction
The Cyclic Redundancy Check (CRC) is an efficient and widespread coding method
to detect random errors in industrial and business data transmission [1, 2, 3]. CRC can
be easily implemented and results in a very low residual error probability in relation
to the number of redundant bits or length of the checksum, respectively [4]. For
instance, by means of 32 redundant bits, a residual error probability of 2-32 is
achievable. Because of its simple implementation and low residual error probability, it
is also applied to data stored in digital memory.
Using CRC, the integrity of data is ensured by a checksum (FCS, Frame Check
Sequence) that is the result of a polynomial modulo operation [5, 6, 7]. If no error
occurred then the FCS is consistent to the original data. This consistency is typically
the evaluation criterion in data transmission.
For data storage, the actual value of the FCS is used in addition to the consistency
check by practitioners in industry as a criterion for e.g. the correct version of the
software in digital memories or in order to detect its manipulation. It is assumed that
each change of the data will lead to another consistent FCS. Therefore, it appears to
be sufficient to check the value of the FCS. This nave approach does not consider

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 19, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
2 F. Schiller et al.

that CRC is a mean for detection of random but not intelligent errors. It is well known
that data protected by CRC can be manipulated undetectably (cf. e.g. [8]).
The paper demonstrates and proves in detail how data can be manipulated actually
such that a manipulation is undetectable in data transmission and storage. It empha-
sizes that CRC must not be applied to data if intelligent sources of errors, i.e. humans
supported by intelligent tools in adversarial attacks, are to be faced. In such cases,
additional checking of data and other coding methods of cryptography are necessary.
The paper is structured as follows. The general functionality of the CRC is ex-
plained in Section 2. Section 3 describes some necessary aspects of implementation of
CRC. Since errors in form of random faults and manipulations are to be investigated
later, Section 4 introduces the modeling of errors. In Section 5, the general kind of
undetectable manipulation is developed first regarding manipulation and consistency
check, before a simple way of manipulation of data stored in digital memory will be
presented in Section 6. There, in addition to the consistency check, the value of the
FCS is compared to the original one. The approach is illustrated by means of an ex-
ample in Section 7. The paper finishes with some conclusions.

2 Functionality of CRC
Since CRC has been developed for the detection of errors in data transmission or
communication, respectively, the principles of CRC are explained in connection with
communication. There CRC is applied as follows: A polynomial modulo operation
with a binary polynomial, the so-called generator polynomial g(x) of degree r, is used
to calculate the FCS of the given binary net data ND in the following way: enclosed
in parentheses and set on the right margin.

fcs ( x ) = ( nd ( x ) x r ) mod g ( x ) . (1)

In (1), nd(x) and fcs(x) denote the polynomial counterpart of ND and FCS that
build the telegram T = [ND, FCS]. Note that FCS consists of r bits. T is sent to the
receiver and is possibly falsified to T during the transmission. In the receiver, check
(2) proceeds:

t ' ( x) mod g ( x) = ((nd ' ( x) x r ) + fcs' ( x)) mod g ( x) = 0 ? (2)

If (2) is not true, T has been falsified. If (2) holds, T is regarded to be transmitted
correctly and ND is used in the intended way in the application.
For instance, generator polynomial g(x) = x3+x+1 and net data ND = [1110011] are
given. ND leads to the binary polynomial nd(x) = 1x6+1x5+1x4+0x3+0x2+1x1+1x0 =
x6+x5+x4+x+1.
The degree of g(x) is r = 3. According to (1), the polynomial counterpart of FCS is
calculated by fcs(x) = ((x6+x5+x4+x+1) x3) mod (x3+x+1)=x. Thus, FCS = [010] and the
telegram T that is sent to the receiver is T = [ND, FCS] = [1110011010]. Obviously,
(2) holds, if T = T. But if for instance T = [1110001011] then t(x) mod g(x) =
(x9+x8+x7+x3+x+1) mod (x3+x+1) = x2+x+1 0. Hence, the falsification is detected.
Undetectable Manipulation of CRC Checksums for Communication and Data Storage 3

The determination of the FCS in the sender and the consistency check in the
receiver are realized e.g. by a linear feedback shift register (LFSR, see e.g. [4, 7]) or a
corresponding table method and will be explained in the following section.

3 Implementation of CRC
Realizations of CRC have been developed for hardware and software, cf. e.g. [4, 9]. A
standard way to implement the polynomial modulo operation (1) is the application of
a linear feedback shift register (LFSR, see Fig. 1). The number of bits of the register
is equal to the degree r of g(x). The gi denote the coefficients of g(x). The overall bit
pattern of z = [zr-1 z0] represents the state of the register. The input u stands for the
bits shifted stepwise into the register. For u(x) = nd(x)xr, the final state corresponds to
the solution of (1). That means, that ND is bitwise shifted into the register followed
by r zeros. For u(x) = t(x), the final state corresponds to the solution of (2).

z r-1 zr-2 z r-3 ... z1 z0 u


logical XOR
logical AND
gr =1 gr-1 gr-2 g1 g0 = 1
...

Fig. 1. Implementation of CRC by a Linear Feedback Shift Register (LFSR)

A much more sophisticated LFSR with additional feedback is sometimes applied


where the result of the modulo operation of the product of nd(x) and xr is available at
each step k, cf. [4]. Furthermore, the same functionality can be achieved by means of a
specific table approach with a much lower processing time.
For the explanation of the overall approach, only the implementation of Fig. 1 is
needed. It is formulated in MATLAB code in Listing 1. The length of the input pattern
is l. At each step k, the next state of the register z(k+1) is calculated based on the pre-
sent state z(k) and the present input u(k). Obviously, whenever bit zr-1 is at value 1, i.e.
z(k) 2r-1, the feedback of the LFSR becomes active (cf. Fig. 1). This behavior is han-
dled by a distinction of case in Listing 1.
Listing 1. Simple MATLAB implementation of CRC
z(1)=0; % exemplary initial state
for k=1:l
if z(k)>=2^(r-1)
z(k+1)=bitxor(2*z(k)+u(k),g);
else z(k+1)=2*z(k)+u(k);
end; end;
Since it is needed in Section 5, a special kind of inversion of CRC is shown in
Listing 2. The final state z(l+1) is given, and at each step k, the present state of the
register z(k) is calculated based on the given corresponding state z(k+1) and the present
input u(k). The activation of the feedback at step k is detected based on the difference
between u(k) and the most right hand bit of z(k+1).
4 F. Schiller et al.

Listing 2. MATLAB implementation of inverted CRC


z(l+1)=733298177; % exemplary final state
for k=l:-1:1
if mod(z(k+1),2)~=u(k)
z(k)=floor(bitxor(z(k+1),g)/2);
else z(k)=floor(z(k+1)/2);
end; end;

4 Modeling of Errors
Obviously, CRC can not detect all errors. If T is equal to T = [1100101010] in the
example of Section 2, then (2) holds for t(x), hence the falsification is not detectable.
Errors can be modeled by superimposed error patterns F. These patterns have the
same length like T. A bit of F is allocated by value 0, if the corresponding bit in T is
correct, and it is allocated by value 1, if the corresponding bit in T is falsified.
Consequently, T is superimposed by F such that T = T+F holds (cf. Fig. 2). Note that
+ stands for exclusive-or in the space of binary polynomials and binary words.

mod g ( x)

m r
nd (x) fcs(x) =0

Error: + f (x)

nd ' ( x) fcs' ( x) = 0?

Fig. 2. Modeling of errors by superimposition

An error is undetectable by CRC if and only if f(x), the polynomial counterpart of


F, is divisible by the generator polynomial g(x), since:
t ' ( x) mod g ( x) = (t ( x) + f ( x)) mod g ( x)
= t ( x) mod g ( x) + f ( x) mod g ( x) . (3)
= f ( x) mod g ( x)
holds. Therefore, t(x) mod g(x) is equal to zero if and only if f(x) mod g(x) is equal to
zero. This feature (3) is applied in the following to get manipulations undetectably.

5 Manipulation and Consistency Check


In general, each error pattern F whose polynomial expression f(x) is a multiple of g(x)
cannot be detected. If f(x) is a result of a manipulation and not divisible by g(x), an
additional manipulation fadd(x) has to take place for undetectability such that:
t ' ( x) mod g ( x) = (t ( x) + f ( x) + f add ( x)) mod g ( x)
= ( f ( x) + f add ( x)) mod g ( x) (4)
=0
Undetectable Manipulation of CRC Checksums for Communication and Data Storage 5

holds. Equation (4) is equivalent to


f ( x) mod g ( x) = f add ( x) mod g ( x) .
The most practicable form of the additional error pattern Fadd is a frame of r bits that
is shifted (cf. Fig. 3). Obviously, a length of r bits is sufficient for Fadd since

f ( x) mod g ( x) = ( f add ( x) x l ) mod g ( x)


= (( f add ( x) mod g ( x)) x l ) mod g ( x)
holds and fadd(x) mod g(x) is of length r. Therefore, fadd can also be determined by
sampling its 2r alternatives. This is not considered here because of complexity.
The shift parameter l should be chosen such that there is no practical impact on the
original manipulated data by F. Otherwise, the result of the main data manipulation
would be changed by Fadd again. For instance, ND = ND holds for l = 0.

mod g ( x)

m r
nd (x) fcs(x) =0
Main data
+ f (x)
manipulation:

nd ' ( x) fcs' ( x)
Manipulation + fadd (x) l
for consistency:

nd " ( x) fcs" ( x) = 0!

Fig. 3. Additional manipulation for consistency

After the main manipulation of nd(x) by means of f(x) resulting in nd(x), the
necessary fadd(x) for undetectability can be calculated as follows (cf. (2)):

(nd " ( x) x r + fcs" ( x)) mod g ( x) = 0


(nd ' ( x) x r + fcs' ( x) + f add ( x) x l ) mod g ( x) = 0 (5)
(nd ' ( x) x r + fcs' ( x)) mod g ( x) = ( f add ( x) x l ) mod g ( x)
That means, that the result of the modulo operation of the data after the main
manipulation has to be calculated first before fadd can be determined:
Algorithm 1:
Given: data after main manipulation nd ' ( x) x r + fcs ' ( x)
generator polynomial g(x)
shift parameter l
Find: additional error pattern fadd(x) such that (5) holds
Solution: 1. result1 ( x) = (nd ' ( x) x r + fcs' ( x)) mod g ( x)
2. ( f add ( x) x l ) mod g = result1 ( x)
6 F. Schiller et al.

The first part of the solution is solved by means of a typical implementation of the
modulo operation like e.g. in Fig. 1 or Listing 1. The second part is solved by means of
the inverted CRC in Listing 2. Starting with state result1(x) and considering an input u
of l zeros, fadd(x) is determined (see Listing 3).

Listing 3. Determination of Fadd for consistency (in MATLAB)


z(l+1)=result_1;
for k=l:-1:1
if mod(z(k+1),2)==1
z(k)=bitxor(z(k+1),g)/2;
else z(k)=z(k+1)/2;
end; end;

That means, a data frame like a telegram consisting of ND and FCS is consistent.
Any manipulation F can be covered by means of an adequate Fadd.

6 Manipulation and Consistent Identical Checksum


In addition to the consistency between net data nd(x) and checksum fcs(x), the value
of fcs(x) has to be equal to the original one fcs(x), see Fig. 4.

mod g ( x)

m r
nd (x) fcs(x) =0
Main data
+ f nd (x)
manipulation:

nd ' ( x)
Manipulation
for consistency + f add (x) l
and identical FCS:

nd " ( x) fcs(x) = 0!

Fig. 4. Additional manipulation for consistency and identical checksum

Therefore, after the main manipulation of nd(x) by means of f(x) resulting in nd(x),
the necessary fadd(x) for undetectability is calculated as follows (cf. (5)):

(nd " ( x) x r + fcs( x)) mod g ( x) = 0


(nd ' ( x) x r + f add ( x) x l + fcs( x)) mod g ( x) = 0 (6)
(nd ' ( x) x r + fcs( x)) mod g ( x) = ( f add ( x) x l ) mod g ( x)
That means, that a modulo operation of the net data after the main manipulation
ND attached by the original FCS has to be executed before fadd can be determined:
Undetectable Manipulation of CRC Checksums for Communication and Data Storage 7

Algorithm 2:
Given: net data after first manipulation nd(x)
original checksum fcs(x)
generator polynomial g(x)
shift parameter l
Find: additional error pattern fadd(x) such that (6) holds
Solution: 1. result2 ( x) = ( nd ' ( x) x r + fcs( x)) mod g ( x)
2. ( f add ( x) x l ) mod g = result2 ( x)

Again, the first part of the solution is solved by means of a typical implementation
of the modulo operation like e.g. in Fig. 1 or Listing 1. The second part is solved by
means of the inverted CRC like in Listing 3, but starting with state result2(x).
That means, a data frame like a telegram consisting of ND and FCS is consistent
and FCS is equal to the original one that is expected and checked. Any manipulation
occurred before the introduction of Fadd is not detectable.

7 Example
Consider the part of a program in Instruction List (cf. [10]) in Table 1 (first column)
running e.g. on a fail-safe industrial programmable controller (PLC). The program is
supposed to be critical and therefore a FCS of the program calculated by CRC is at-
tached. For maintenance reasons, the program has to be modified but the FCS should
be the same as before since its value is checked by independent authorities. The pro-
gram contains the instruction LD 0xFFFFFFFF that is meaningless for the program
execution, but the value of the constant (in hexadecimal notion) will be changed finally
in order to get the manipulations undetectably. The actual basis, that means ND, for the
determination of the FCS is supposed to be the machine code (see Table 1, second
column). A fictitious processor is assumed. The FCS is calculated with generator poly-
nomial 0x104C11DB7 that it is applied e.g. in Ethernet, FDDI, ZIP and PNG.

Table 1. Original program

original program fictitious machine hex code representation


LD x C442078
LD y 4C442079
-D 2B44
T z 54207A
LD 0xFFFFFFFF 4C44203078FFFFFFFF
LD x 4C442078
PUSH 50555348
+D 2B44
T y 542079
FCS: 471CE694

The original program is now manipulated by changing the operation subtraction in


the third row into addition, see Table 2.
Since the FCS has not been changed during the manipulation of the program is de-
tectable by a consistency check according to (2). If the old FCS would be substituted
8 F. Schiller et al.

by the new consistent FCS according to (1), the manipulation is still detectable by
comparison of the new FCS to the expected original one that is usually mentioned in
the documentation. Therefore, not the FCS but another part of the program has to be
substituted. Here, the constant 0xFFFFFFFF is used in the example.
The first manipulation (main data manipulation) should not affect the constant to be
modified later for undetectability, see Fig. 5. Otherwise, the result of the main ma-
nipulation would be changed, again. Therefore, a bit pattern of zeros of length r is
introduced at the appropriate position.

mod g ( x)

m r
nd (x) fcs(x) =0
Main data
+ fnd 1 (x) 0L0 fnd 2 ( x)
manipulation:

nd ' ( x)
Manipulation
for consistency + fadd (x) l
and identical FCS:

nd " ( x) fcs" ( x) = 0!
= fcs(x)!

Fig. 5. Manipulation schema

Now, Algorithm 2 (see Section 6) is applied:


Given: nd(x) (see Table 2, second column)
original checksum fcs(x) is 0x471CE694
generator polynomial g(x) is 0x104C11DB7
shift parameter l=136
Find: additional error pattern fadd(x) such that (6) holds
Solution: 1. result 2 ( x) is 0x2BB53E01
2. f add (x) is 0x8EB047EC

That means that the constant 0xFFFFFFFF has to be superimposed with


0x8EB047EC in the program. Finally, the new value of the constant is therefore
0x714FB813, see Table 2.

Table 2. Manipulated program (undetectable manipulation)

manipulated program fictitious machine hex code representation


... ...
+D 2C44
T z 54207A
LD 0x714FB813 4C44203078714FB813
... ...
FCS: 471CE694
Undetectable Manipulation of CRC Checksums for Communication and Data Storage 9

The change of the value of the constant does not have any influence on the
execution of the program at runtime but prevents the detection of the manipulation.
The consistency as well as the identical FCS is ensured.

8 Conclusions
CRC cannot be applied to errors caused by intelligent sources. It can never substitute
sophisticated methods and algorithms of security [11, 12]. In data communication, the
only restriction is the time necessary to execute the undetectable modification. In the
near future, telegrams can be manipulated on-line without any chance of detection
afterwards in the receiver. Even in the case of the application of CRC to ensure data
stored in digital memories, where in addition to the consistency between the net data
ND and FCS the actual value is checked by independent authorities, CRC is not at all
sufficient. Here the overall software has to be at least compared to a master copy or
appropriate cryptographic codes and algorithms should be applied.
Deciders in business and industry have to be aware of such issues in order to be able
to make the correct decision for the protection of data.

References
1. Peterson, W., Weldon, E.J.: Error Correcting Codes. MIT Press, Cambridge (1996)
2. MacWilliams, F.J., Sloane, N.J.: Theory of Error-Correcting Codes. North-Holland,
Amsterdam (1991)
3. Blahut, R.E.: Algebraic Codes for Data Transmission. Cambridge University Press,
Cambridge (2003)
4. Schiller, F., Mattes, T.: An Efficient Method to Evaluate CRC-Polynomials for Safety
Critical Industrial Communication. Journ. of Appl. Computer Science 14, 5780 (2006)
5. Koopman, P., Chakravarty, T.: Cyclic Redundancy Code (CRC) Polynomial Selection for
Embedded Networks. In: DSN 2004, Florence, Italy, pp. 145154 (2004)
6. Castagnoli, G.: On the Minimum Distance of Long Cyclic Codes and Cyclic Redundancy-
Check Codes, ETH Zurich, Diss. No. 8979 (1989)
7. Schiller, F., Mattes, T.: Analysis of CRC-Polynomials for Safety-Critical Communication
by Deterministic and Stochastic Automata, SAFEPROCESS, Beijing, China, pp. 1003
1008 (2006)
8. Wikipedia (2008),
http://en.wikipedia.org/wiki/Cyclic_redundancy_check
9. Williams, R.N.: A Painless Guide to CRC Error Detection Algorithms. Rocksoft Pty Ltd.,
Hazelwood Park (1993)
10. International Electrotechnical Commission: IEC 61131-3, edn. 2.0 (2003)
11. Menezes, J., van Oorschot, P.C., Vanstone, S.A.: Handbook of Applied Cryptography.
CRC Press, Boca Raton (2001)
12. Bellare, M., Rogaway, P.: Introduction to Modern Cryptography. Univ. of California
(2005)
Survey of PHY and LINK Layer Functions of Cognitive
Radio Networks for Opportunistic Spectrum Sharing

Liang Hu, Villy B. Iversen, and Lars Dittmann

Department of Communications, Optics & Materials


Technical University of Denmark, Lyngby, Denmark
{lh,vbi,ld}@com.dtu.dk

Abstract. Cognitive Radio recently arises as the enabling technology for


opportunistic spectrum sharing, has drawn significant attentions from the wide
range of research communities, including signal processing, machine learning,
networking and algorithms. We survey the research challenges of designing
cognitive radio networks for opportunistic spectrum sharing, with the aim of
sparking new research interests in this field. We follow a layered approach
where PHY and LINK layer functions are classified respectively and associated
open research questions are proposed. In our work, firstly various proposed
cognitive radio network architectures are reviewed associated with open
research questions for each of those network architectures. Secondly, we survey
the cognitive radio physical layer functions and propose open research
questions at physical layer. Thirdly, we focus on link layer functions of
cognitive radio networks and propose the open research questions.

Keywords: Cognitive radio, opportunistic spectrum sharing, spectrum sensing,


spectrum analysis, spectrum selection, spectrum coordination.

1 Introduction
A recent report published by the federal communication commission (FCC) in US has
shown a surprising finding, which highlights a different cause of the shortage of
frequency resource: In many bands, spectrum access is a more significant problem
than physical scarcity of spectrum, in large part due to legacy command-and-control
regulation that limits the ability of potential spectrum users to obtain such access.
According to FCC, 1) a large amount of licensed spectrum is used sporadically where
the temporal and geographical variations in the utilization of the licensed spectrum
range from 15% to 85%; 2) It is the complicated and old regulations that prevent us
having open and flexible access to abundant spectrum. It becomes obvious that the
bottleneck of spectrum scarcity is not the physical spectrum scarcity but spectrum
allocation and regulation scarcity i.e. the imbalance of spectrum utilization between
licensed spectrum and unlicensed one and the inefficient spectrum allocation scheme
and regulations. Apparently, in order to increase the efficiency of natural spectrum
resource utilization, more flexible spectrum management techniques and regulations
are required. Cognitive Radio is one of enabling technologies to achieve such goal.
In the following sections, we firstly define the terminologies of cognitive radio
networks in section II. In section III, we review various cognitive radio network

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 1024, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
Survey of PHY and LINK Layer Functions of Cognitive Radio Networks 11

architectures proposed for opportunistic spectrum sharing. In section IV, we present the
heterogeneous cognitive radio network architecture and corresponding network functions
in Protocol stack. The Physical layer and Link layer research challenges are investigated
in section V and VI respectively. Finally, the conclusion is drawn in section VII.

2 Definitions in the Context of Cognitive Radio Network


In the following, we define the terminologies in the context of cognitive radio systems
which will be employed throughout this paper.
Cognitive Radio
It is a radio that can change its transmitter parameters based on the interaction with
the environment in which is operates. It has two capabilities: 1) environment aware
capability--capture information from its radio environment e.g. by either spectrum
sensing or Spectrum Usage Policy updates. 2) Re-configurability: certain operation
parameters, such as transmission power, modulation scheme, and carrier frequency
and so on, can be dynamically changed in real-time according to the current radio
environment. Cognitive Radio is a way to break the spectrum access constraints in
traditional spectrum regulations e.g. imbalance between spectrum allocation and
spectrum usage, in order to improve spectrum utilizations supporting increased
emerging wireless applications while minimize the interference and provide highly
reliable ubiquitous wireless communications.
Primary System (PS)
Primary System is defined as an entity that legally owns some frequency band and has
primary access to those frequency bands. There are several examples of Primary
Systems: Mobile Cellular system, TV broadcast system, and Emergency Service
system. The frequency bands for Primary Systems are allocated by spectrum authorities
according to certain spectrum regulations and also called licensed spectrums.
Spectrum Opportunities (SO)
Spectrum Opportunities is also called spectrum hole or spectrum gap. It is defined as
a band of frequencies assigned to a Primary System, but, at a particular time and
specific geographic location, the band is not being utilized (therefore can be rent to or
borrowed to other systems).
Secondary System (SS)
Secondary System is an entity that wants to acquire unused spectrum of license owners
(Primary System) for its own communication. When SS coexist with PS, by detecting
new available spectrum in the environment, SS can use a frequency band as long as the
corresponding PS is not using it. After acquiring the licensed spectrum, it must
continuously monitor the re-appearance of PS system and minimize its interference to
the PS communications while coordinate the spectrum access among all SSs.

3 Overview of Physical and Link Layer Functions in Cognitive


Radio Networks
To evolve future cognitive radio-based heterogeneous wireless networks, from
network level point of view, there are new technical challenges in network design
12 L. Hu, V.B. Iversen, and L. Dittmann

compared to existing wireless networks. Basically, the challenges can be categorized


into two aspects: protection of PS network functionalities from SS network
interference and support of QoS in SS networks.
Protection of PS network from SS network interference
For supporting QoS of PSs, it is essential to protect primary network functionalities
from secondary network interference. Whenever secondary networks opportunistically
share licensed spectrum with primary networks, the communications between PSs must
not be interfered significantly by SSs. This is because PSs are the owners of the
spectrum and ideally their network services should be guaranteed as if there were no
secondary networks. Primary systems are already deployed and cannot be modified
technically in most cases any more thus, if spectrum is shared the secondary systems
have to provide all required functionality. Hence, secondary networks, rather than
Primary networks, should have cognitive radio capabilities to protect primary network
functionalities, which have to address the following open issues:
1. The SSs should reliable identify active PSs during their opportunistic operation
on some licensed bandwidth and before they switch to another licensed bandwidth.
While SSs are opportunistically using a licensed bandwidth, the SS should
continuously periodically monitor that channel to detect whether PSs are
reclaiming the channel. If SSs cannot detect active PSs in time, the collision
between PSs signal and SS signal will unfortunately happen thus the PSs
communication might be harmed. Besides, before SS switch to a new spectrum
opportunity, it should also reliable identify whether there is already active PSs
communication on that spectrum opportunity to avoid harming ongoing PS
communications.
2. After reliable identifying active PSs, group of SSs (the secondary network)
should exchange their information of active PS identification to achieve group
consensus regarding PS activities. This information exchange can help more
reliable and efficient identifying and protecting active PSs globally by
cooperation of various SSs. Upon information exchange, each SS may have a
shared global knowledge of active PSs. Moreover, achieving consensus is an
important requirement to protect primary systems. It is no help to detect them
reliably but being unable to achieve consensus about the sensing results in the
secondary network.
3. Upon identifying that PSs are reclaiming their spectrum, SS should vacate the
spectrum of PSs as soon as possible and switch to some new spectrum
opportunity for its ongoing communications.
QoS support in SS networks
The support of QoS in SS networks is another aspect in designing CR-based wireless
networks. Supporting of QoS in existing wireless networks is already not easy due to
the characteristics of wireless channels while supporting QoS in CR-based SS
networks becomes even more challenging. SS network services may always be
additionally interrupted when PSs reclaim spectrum, as SSs have to give radio
resource access priority to PSs according the spectrum regulation. From time to time,
the QoS of SSs might degrade severely if PSs reclaim the spectrum too frequently or
even get stopped when SS cannot identify new spectrum opportunities for some
Survey of PHY and LINK Layer Functions of Cognitive Radio Networks 13

period. However, supporting some basic (maybe stochastically justified) QoS in SSs
communication is essential for the popularity and wide deployment of CR-based
wireless networks.
To cope with those technical challenges, new network functionalities are needed to
be developed, which are introduced in the following section. We strongly believe that
new network functionalities should be implemented in SS networks. As CR based
wireless networks are built on top of existing wireless network infrastructures, PSs
should remain unchanged while adding the required additional CR capabilities to the
SSs. Thus, all newly introduced complexity is put in the SS networks and hidden from
PS networks. Ideally, the network services in PS networks are not disrupted by SS
network in the sense that it looks as if there were no SSs. In Figure 1, the protocol
stack for CR networks is presented with different CR functions mentioned above.
Physical Layer: Spectrum Sensing, Data reconfigurable transmission based
on Software Defined Radio (SDR).
Link Layer functions: Spectrum Analysis, Spectrum Selection (including
spectrum adjustment), and Spectrum Coordination.

Fig. 1. Physical and Link Layer functions of cognitive radio networks

We provide an overview of different new network functionalities in this section with


respect to two networks coexistence scenarios: SS networks coexisting with PS
networks and SS networks coexisting with other SS networks. More descriptions
about these various functions are described in sections later.
SS networks coexisting with PS networks
There exists temporally unused spectrum either temporally or geographically in
licensed spectrum used by primary network. Thus SS network can exploit those
unused spectrum from the PS network through its cognitive radio capabilities.
There are several functionalities that SS networks should have. Firstly, SS within
one cell should be able to sense and detect spectrum opportunities precisely from PS
networks. This is defined as Spectrum Sensing. Secondly, it should also be able to
14 L. Hu, V.B. Iversen, and L. Dittmann

estimate the current state of the wireless channel and predict its capacity. This is
defined as Spectrum Analysis. Then SS should be able to select a set of spectrum
bands appropriate according to spectrum rules or spectrum policy while reconfiguring
the physical layer RF to switch to the newly selected frequencies. This process is
called Spectrum Selection. Thirdly, as primary users in general are not aware of
SSs, SS networks should be able to vacate the spectrum opportunities immediately
and move to a new available spectrum band as soon as PS reclaims them. Thus the
interference to PS can be minimized. Besides, in order to support a certain QoS of SS
payload communications, the SS should be able to identify and select new spectrum
opportunities as soon as possible, in order to mitigate the interference from PS. This is
called Spectrum adjustment or re-selection which maintains the ongoing wireless
connection of SS while protecting primary system. The mutual interference between
SS networks and PS networks depends very much on a) the activity pattern of PSs and
users (either deterministic or random); b) the spectrum opportunity detection and
vacation scheme of SSs. Finally, the Spectrum Coordination function is needed to
coordination the secondary usage of the spectrum resource and transmitter-receiver
handshake in the context of heterogeneous spectrum opportunities.

SS networks coexisting with other SS networks


There are two types of coexistence among SS networks. The first one is the coexistence
of SS networks in unlicensed spectrum, for example in the ISM band. The second one is
secondary networks coexisting in virtually unlicensed spectrum which is temporally
or spatially unused licensed bands. Since the two networks share the same spectrum for
their own operations, they may interfere with each other without proper coordination
schemes, thus reducing the spectrum efficiency and QoS support at SS networks. In
addition, in contrast to coexistence among PS and SS networks, every SS network now
has the equal right to access the unlicensed or virtually unlicensed spectrum and do not
have to vacate the spectrum resource when other networks reclaim it. Instead, different
SS networks may compete for the shared spectrum resource where in an extreme case
one network may even behave quite egoistically in order to obtain as much spectrum
as possible. Thus, coexistence of SS networks requires new sophisticated spectrum
coordination function and local spectrum usage policy compared with above case. This
new coordination function and local spectrum usage policy should a) improve the
efficiency of spectrum usage; b) minimize the mutual interference among different SS
networks and support QoS; c) provide spectrum access fairness and mitigate egoistic
misbehavior.

4 Review of Physical Layer Functions


Cognitive radio communication is strictly conditional on the reliable detection of
unoccupied spectrum. One distinguished function of cognitive radio physical layer is
Spectrum Sensing: to sense the environment over all available degrees of freedom
(time, frequency, and space) in order to identify frequency bands currently available
for transmission. Secondly, after identifying the spectrum opportunity, physical layer
should have flexible and adaptive Data Transmission schemes that provide best
spectrum utilization and capacity while avoiding interference to any PS. The
Survey of PHY and LINK Layer Functions of Cognitive Radio Networks 15

capabilities and implementation limits of physical layer has a strong impact on the
upper layer performance in the sense that it provides the realistic physical models to
upper protocol design. In this section, we focus on function of the spectrum sensing.
We firslty review the design challenges in RF front-end design for reliable spectrum
sensing. Then we present various spectrum detection methods: non-cooperative
transmitter signal based detection, cooperative transmitter based detection and
Interference based receiver detection. To the end, the open research problems of
spectrum sensing are outlined.

Fig. 2. Cognitive Radio Transceiver

4.1 Wideband RF Front-End


The architecture of cognitive radio transceiver is showed in figure 2 where RF front-end
and Signal detection are two most important blocks for spectrum sensing. Figure 2
shows an architecture of wideband RF front-end capable of simultaneous sensing over
several GHz wide spectrum. In this architecture, a wideband signal is received through
the RF front-end, sampled by the high speed analog-to-digital (A/D) converter, and
measurements are performed for the detection of the licensed user signal.

4.2 Spectrum Detection Methods


After reliable reception and sampling of a wideband signal, advanced digital signal
processing techniques can be used to improve a) the RF front-end sensitivity by
processing gain and b) primary user knowledge of the signal characteristics. The most
efficient way of detect spectrum opportunities is to detect the primary users that are
receiving data within the communication range of a secondary user. However, in reality,
cognitive radio aware station can hardly have a direct measurement of a channel
between a primary receiver and a transmitter. Instead, most recent research focuses on
detecting the primary transmitters by the local observation of secondary users. In
general, spectrum opportunity detection can be classified into either non-cooperative or
cooperative transmitter detection and interference based receiver detection.
4.2.1 Non-cooperative Transmitter Signal Based Detection
There are three types of transmitter detection techniques: a match filter detector, an
energy detector, and a cyclostationary feature detector. In order to evaluate the
16 L. Hu, V.B. Iversen, and L. Dittmann

performance of those detectors, several aspects are discussed [15]: processing gain for
a given probability of detection, sensitivity to unknown noise and interference, and
implementation complexity.
1) The match filter detector is the optimal detector in stationary Gaussian noise when
the signal of primary system is known to the receiver. It can maximize the signal-to-
noise ratio (SNR) among all other detectors. The implementation complexity for
match filter is rather high compare with other detector in the sense that it needs a
separate matched filter based receiver for every primary system.
2) When receiver cannot get sufficient information about primary signal, the optimal
detector is energy detector which only integrates squared samples of received signal.
Energy detector does not work well for detecting spread spectrum signals (direct
sequence and frequency hopping signals). However, the implementation of energy
detector is relatively simple which makes it a candidate for primary signal detection.
3) Cyclostationary feature detectors can extract distinct features of modulated signals
such as sine wave carrier, symbol rate, and modulation type. These features are detected
by analyzing a spectral correlation function that is a two dimensional transform, in
contrast to power spectrum management being one dimensional transform. The main
advantage of cyclostationary feature detector is that it can discriminate the noise energy
from modulated signal energy, thus it has more robustness against unknown noise
variance which may result in dismiss detection.

4.2.2 Cooperative Transmitter Signal Based Detection


Cooperative transmitter detection refers to spectrum sensing methods where
information from multiple SSs is incorporated for primary user detection. The
cooperative transmitter detection can be implemented in either distributed manner as
in [28], or centralized manner as in [13]. In centralized method, there is a master node
in secondary network that collect all sensing information from the SSs and detect the
spectrum opportunities collaboratively. In contrast, in distributed method, there is no
master node among SSs but instead local sensing information is exchanged among
every SSs. For this, a signaling or control channel is needed to exchange the data
between different CR nodes. The investigation of the existence of such a channel is
also presented in [28]. There is a clear trade-off that the sensing performance is
improved at the cost of overhead. In [35], the author study a cooperative sensing
method based on energy detector that can mitigate the multi-path fading and
shadowing effects. This cooperative scheme improves the detection probability in a
heavily shadowed environment. In [33], cooperative sensing is introduced as a mean
to reduce the sensitivity requirements on an individual Cognitive Radio.

4.2.3 Interference Based Receiver Detection


Traditionally, regulatory bodies minimize interference in a transmitter-centric way
through coordination of bands of operation, radiated power, out-of-band emissions,
and location of individual transmitters and so on. However, interference actually takes
place at the receivers. Therefore, recently FCC introduced a new model for measuring
interference and refers to it as interference temperature, showed in figure 3. In
contrast to current transmitter-centric approach, this model tries to regulate the
interference at the receivers where the interference really occurs. Under this new
Survey of PHY and LINK Layer Functions of Cognitive Radio Networks 17

model, a maximum noise level would be set for an entire band and new systems could
be placed in service if it is anticipated that the interference temperature limit would
not be exceeded. In other words, the interference temperature model accounts for the
cumulative radio frequency energy from transmissions and set a maximum cap on
their aggregate level.

Fig. 3. Interference Temperature Model

In [34], a interference based receiver detection is studied, where the local oscillator
(LO) leakage power emitted by the RF front-end of the primary receiver is exploited
for the detection of primary receivers. In order to detect the LO leakage power, low-
cost sensor nodes cab be mounted close to the primary receivers. The sensor nodes
detect the leakage LO power to determine the channel used by the primary receiver
and this information is used by the SSs to determine the spectrum opportunities.

4.3 Open Research Issues in Spectrum Sensing

Receiver based primary signal detection


The effective Interference Temperature measurement is a significant challenge for
receiver-based detection models. Under the collocated sensing architecture, even though
a SS is naturally aware of its transmission power and its precise location, it still cannot
practically, effectively measure or estimate the interference temperatures at nearby
primary receivers. Primary receivers are passive devices, thus a SS cannot identify their
locations and cannot estimate the effects of their transmission on all possible receivers.
In this case the Interference Temperature model alone does not effectively protect the
licenses. Sensor network based architecture may facilitate the interference temperature
based receiver detection by deploying a separate network for spectrum sensing and local
interference temperature estimation at primary receiver. However, the cost and
feasibility of deploying a separate sensor network is still under studied.
Cooperative Sensing
Cooperative Sensing can increase the probability of reliable spectrum opportunity
detection while reducing the sensitivity requirement of individual SS. The cooperative
sensing in general helps the mitigation of the misdetection spectrum opportunities in
fading and shadowing environment but on condition that cooperating users more or less
face independent fading. To more precisely study the performance gain by cooperative
18 L. Hu, V.B. Iversen, and L. Dittmann

sensing, a comprehensive model relating different parameters such as detection


probability, number and spatial distribution of spectrum sensors and more importantly
propagation characteristics is yet to be developed. Secondly, energy based cooperative
sensing is susceptible to uncertainty in noise power, cyclic feature based cooperative
detection is the subject for future research to solve this problem. Last but not least, the
un-trusted SSs fail in un-modeled way impose an upper bound on the achievable
performance gain of cooperative sensing. Therefore, how to enhance the cooperation
among SSs and misbehaving SS detection && punishment are the potential research
areas in cooperative sensing.

5 Review of Link Layer Functions

5.1 Spectrum Opportunity Analysis

In SS networks, the available spectrum opportunities may show different spatial and
temporal characteristics. It is very important for SSs to understand the characterization
of different spectrum bands so that the best available channel can be selected according
to the user requirement. In order to describe the dynamic nature of SS network, each
spectrum opportunity should be classified not only by the time-variant radio
environment but also by primary user activities and spectrum band information such as
operating frequency and bandwidth. Thus we need to define the parameters which
indicate the quality of spectrum opportunity such as interference level, wireless link
errors, path loss, link layer delay, and spectrum holding time and so on.
Interference Level
Different spectrum bandwidths may have different interference characteristics, e.g.
some unlicensed spectrum is more crowded than licensed one. On the primary
receiver side, one can detect the amount of interference as the indicator of maximum
channel capacity according Shannon Capacity. Depending on the interference level of
the spectrum band, secondary user can decide its transmission power on this band or
even decide whether it should change to another channel with less interference.
Spectrum Holding Time
Holding time refers to the expected duration that the SS can occupy a licensed band
before get interrupted by primary users re-appearance. This parameter depends on the
primary traffic patterns. If the primary users use a specific band very frequently, the
spectrum holding time of this band would be too small to be used by secondary users.
Obviously, the longer the channel holding time, the less interruptions secondary users
experience thus the better quality of the spectrum opportunities would be. Since
frequent spectrum handover can decrease the holding time, previous statically patterns
of handover should be considered while designing SS networks with large expected
holding time [18].

5.2 Spectrum Opportunity Selection

Once all the available spectrum bands are characterized, appropriate operating
spectrum band should be selected for the current transmission considering the QoS
Survey of PHY and LINK Layer Functions of Cognitive Radio Networks 19

requirements of end users, spectrum usage policies set by the regulatory body, and the
spectrum characteristics.
Based on the user requirement, the data rate, acceptable error rate, delay bound, the
transmission mode, and the bandwidth of the transmission can be determined. Then,
according to spectrum selection rules and policies, the set of appropriate spectrum
bands can be chosen. Haitao [20] proposed five spectrum selection rules that tradeoff
fairness and utilization with communication cost and algorithms complexity, though
the assumption that all channels have similar throughput does not always hold. In
particular, conflicts-free assignment based rules let secondary users reserve channels
which are preferable when the spectrum is finely partitioned, and when users have
constant traffic, e.g. video streaming. Contention-based rules require fine granularity
in time to efficiently utilize spectrum, providing improved utilization for coarsely
partitioned spectrum and busty data traffic. E.Knightly[21] proposed an opportunistic
frequency channel skipping protocol for searching better quality channel, where the
channel decision is based on SNR. In [22], in order to consider the activities of
primary users, the spectrum selection has considered the number of handover
happened in a certain band.
The above spectrum selection methods take a reactive sense-and-avoid approach to
impulsively reconfigure spectrum usage without knowledge of future dynamics. In
[18], a proactive spectrum selection method is proposed to examine the performance
of cognitive radio system in the TV-broadcast scenario. SSs utilize past observations
of spectrum band holding time to build predictive models on spectrum availability
and select spectrum opportunities to maximize utilization and minimize disruptions of
primary users.
Finally, the spectrum selection needs to incorporate spectrum adjustment or re-
selection function. The spectrum adjustment takes place when the quality of current
spectrum band degrades due to user mobility or the primary user re-appears. Daniel
[23] studied a reliable link maintenance model for cognitive radio network which is
essentially a spectrum re-selection model.
The spectrum selection structure is indicated in figure 4:

Fig. 4. Spectrum Selection Structure

The spectrum status is read in, and checked against the spectrum rules or spectrum
policy. If the user is satisfied, no change is necessary and the selector terminates.
Otherwise, the rules are used to determine the appropriate spectrum bands, and the
spectrum selection is done accordingly.
20 L. Hu, V.B. Iversen, and L. Dittmann

5.3 Spectrum Usage Coordination

5.3.1 Inter-network Spectrum Coordination


Inter-network spectrum coordination refers to spectrum sharing among multiple SS
networks being deployed in overlapping locations and shared spectrum as show in the
figure 5.

Fig. 5. Spectrum sharing and coordination

Traditionally, one of the focuses in inter-network spectrum coordination is the


coexistence of WLAN and Bluetooth networks in ISM band, where the spectrum
coordination is done by a spectrum etiquette protocol. This spectrum etiquette
protocol facilitates spectrum coordination between WLAN and Bluetooth by
announcing the radio and service parameters on common spectrum coordination
channel (CSCC) based on a low rate 802.11b radio. Besides, another focus has been
in the area of spectrum usage coordination of different access points in WLAN.
In the context of cognitive radio, the inter-network spectrum coordination can be
classified into two categories:
1) Decentralized Spectrum Coordination
2) Centralized Spectrum Coordination
In the first category, different SS networks use different radio access methods coexisting
in the same unlicensed band. Nodes within SS networks of different wireless interfaces
cannot directly exchange control information with each other, thus they can only
mitigate the interference by local observation. One of the interesting scenarios is the
coexistence of IEEE 802.11b and IEEE 802.16a. IEEE 802.16a can provide wireless
connectivity to homes and offices while short-range IEEE 802.11b WLAN offers
complementary wireless local area network services within a home or offices. In [24],
a reactive interference avoidance algorithm has been proposed to deal with the
coexistence problem of IEEE 802.11b and 802.16a network where interference
mitigation is done by controlling transmit frequency, transmit power and spectrum time
occupancy. With a higher level protocol complexity, X.Jing proposed [25] a proactive
algorithm to improve spectrum coordination between radio nodes using a common
spectrum coordination channel at the edge of the shared frequency band or even at
licensed band.
The centralized inter-network spectrum coordination is common in many practical
environments such as home and offices. It has a central spectrum broker as we
defined above that can process detailed information about wireless network, which
Survey of PHY and LINK Layer Functions of Cognitive Radio Networks 21

allows highly efficient network configuration and better enforcement of a complex set
of policies. In [26], a dynamic spectrum access protocol (DSAP) is proposed as one
example of centralized inter-network spectrum coordination. In this work, a central
entity acts as the spectrum broker to lease the spectrum to SSs. The spectrum lease
acts in a way analogous to how DHCP lease IP address to host in a network.Another
example can be found in [27], where a distributed QoS based dynamic channel
reservation (D-QDCR) scheme is used.

5.3.2 Intra-network Spectrum Coordination


In Intra-network spectrum coordination, the focus is how SSs can try to access the
spectrum opportunities in licensed band without causing interference to the primary
users. Typically, it refers to scenario where one SS network coexists with primary
network. Intra-network Spectrum Coordination function not only coordinates
spectrum usage within the SS network, but also should always protect primary
network traffic and minimize the interference from SS to PS. Intra-network spectrum
coordination can be classified into two types.
1) Cooperative coordination
In cooperative coordination, SSs act somewhat selfless, occasionally sacrificing local
performance to improve the system utility by explicitly exchanging control
information with its neighbors. In particularly, each SS exchanges spectrum
opportunities information with neighboring users so that SS within vicinity have a
common view of spectrum availability and each secondary user also knows the
spectrum opportunities of its neighbors. Thus each secondary user selects a spectrum
band not only based on its own utility but also need to consider the usage of its
neighbors. Jun [28] proposed a cooperative distributed coordination scheme in a
scenario where a secondary wireless ad-hoc network coexist with a primary network.
It aims to alleviate the saturation problem of the common control channel.
2) Non-cooperative coordination
By frequent coordination information exchange, the cooperative spectrum coordination
may heavily stress the communication resources of constrained networks such as sensor
and mobile ad-hoc networks. The non-cooperative coordination can significantly reduce
the coordination and control traffic by enabling SS to act independently based on local
observations of the interference pattern and neighboring nodes. Haitao [30] proposes a
device-centric spectrum management scheme which eliminates the need of common
control channel for spectrum sharing. P.Papadimitratos [31] propose another non-
cooperative spectrum coordination MAC AS-MAC for an ad-hoc overlay network
(SS network) coexisting with GSM networks (PS network) to exploit the unused
spectrum left by them.

5.4 Open Research Problems in Link Layer

The Common Control Channel is both needed in Inter-network Spectrum Sharing and
Intra-network Spectrum Sharing to provide many Cognitive Radio functions such as
interference mitigation [17], distributed cooperative sensing, transmitter-receiver
handshake, communication with central entity [18], and so on. There are problems
with traditional CCC with respects to scalability, device cost and security. In the
22 L. Hu, V.B. Iversen, and L. Dittmann

context of work [17], there are several drawbacks of CCC for coexisting WiMAX and
WLAN: First, it requires a static assignment of licensed spectrum before deployment,
increasing complexity and cost. Second, the licensed channels fixed bandwidth limits
scalability in terms of device density, traffic and spectrum ranges. Finally, a simple
jamming attack of the fixed control channel would disrupt the entire network.
Secondly, a set of dynamic CCCs can be constructed by selecting them from dynamic
available data channel lists shared among a cluster of neighbouring SSs as indicated
in [20]. By organizing users into groups, coordination and control traffic are
distributed onto multiple CCC, which is more scalable and robust to jamming attack.
However, the drawback is the group and discovery formulation might result in a large
overhead and decrease the SSs throughput when group updates are triggered too often
due to primary traffic dynamics. The third approach is to eliminate the need for the
Common Control Channel as in [22] where SSs act independently based on the local
interference observation and spectrum selection rules instead of collaborating with
other neighbouring SSs or [24] where a receiver-oriented method can be implemented
for transmitter-receiver handshake.

6 Conclusion
Traditional fixed spectrum allocation has been proved to be inefficient in terms of
spectrum utilization which potentially leads to spectrum scarcity. The emerging
cognitive radio technology forms the foundations of opportunistic spectrum sharing to
improve the spectrum utilization and efficiency. We survey the research challenges in
physical and link layer of cognitive radio networks. In particular, at the physical layer,
the most challenging problems lie in spectrum sensing: the design of wideband
sensing RF-Front-End and reliable spectrum detection algorithms, as well as flexible
data transmission schemes over non-continuous, highly dynamic, heterogamous
spectrum opportunities. In the link layer, the main research challenges lie in:
designing comprehensive framework of spectrum analysis and selection, designing
MAC protocols for spectrum coordination/sharing among secondary systems and
primary systems. The issues covered in this review address broad aspects of research
challenges in designing cognitive radio networks for opportunistic spectrum sharing.
We hope this survey could shed a light on various open research questions in this
field, bring the research community an overview picture of cognitive radio network
research, and mostly importantly spark new research interests/directions.

References
[1] Webb, W., Marks, P.: Radio spectrum pricing. IEEE Review (March 1996)
[2] Zander, J.: On the cost structure of future wideband wireless access. In: IEEE 47th
Vehicular Technology Conference, vol. 3, pp. 17731776 (May 1997)
[3] Mixed signals on the airwaves: The allocation of spectrum is neither efficient nor
effective. Financial Times (December 28, 2002)
[4] Why 3G may still come up short. Business Week Online (March 22, 2004)
[5] Petrin, A., et al.: Measurement and analysis of urban spectrum usage. In: 6th International
Symposium on Advanced Radio Technologies (ISART), pp. 4549 (March 2004)
Survey of PHY and LINK Layer Functions of Cognitive Radio Networks 23

[6] Engelman, R., et al.: Fedral communications commission, spectrum policy task force,
report of the spectrum efficiency working group. Technical report (November 2002)
[7] Peha, J.M.: Approaches to spectrum sharing. IEEE Communication Magazine (February
2005)
[8] Grossglauser, M., Tse, D.: Mobility increase capacity of ad hoc wireless networks.
IEEE/ACM Transactions on networking (August 2002)
[9] Weiss, T.A., Jondral, F.K.: Spectrum pooling: an innovative strategy for the enhancement
of spectrum efficiency. IEEE Radio Communication Magazine (March 2004)
[10] Brodersen, R.W., et al.: Corvus: a cognitive radio approach for usage of virtual
unlicensed spectrum, Berkeley Wireless Research Center (BWRC) White paper (2004)
[11] Cabric, D., et al.: A Cognitive Radio approach for usage of virtual unlicensed spectrum.
In: Proc 14th IST Mobile and Wireless Communication (June 2005) (submit)
[12] Buddhikot, M.M., et al.: DIMSUMNet: new directions in wireless networking using
coordinated dynamic spectrum access. In: Proc. IEEE WoWMoM 2005, pp. 7885 (June
2005)
[13] Cordeiro, C., et al.: IEEE 802.22: the first worldwide wireless standard based on
cognitive radios. In: Proc. IEEE DySPAN 2005 (November 2005)
[14] Zekavat, S.A., et al.: User-Central Wireless System: Ultimate Dynamic Channel
Allocation. In: Proc. IEEE DySPAN 2005 (November 2005)
[15] Cabric, D., Brodersen, R.W.: Physical Layer Design Issues Unique to Cognitive Radio
Systems. In: IEEE 16th International Symposium on Personal, Indoor and Mobile Radio
Communications (September 2005)
[16] Wild, B., Ramchandran, K.: Detecting Primary Receivers for Cognitive Radio
Applications. In: Proc.IEEE DySPAN 2005, pp. 144150 (November 2005)
[17] Tang, H.: A Unified Approach to Wireless System Design, PhD thesis, University of
California at Berkeley (2002)
[18] Acharaya, P., Singh, S., Zheng, H.: Reliable Open Spectrum Communication Through
Proactive Spectrum Access. In: IEEE TAPAS 2006, Boston, US (August 2006)
[19] Tang, H.: Some physical layer issues of wide-band cognitive radio system. In: Proc. IEEE
DySPAN 2005, pp. 151159 (November 2005)
[20] Zheng, H.: Device-centric spectrum management. In: Proc. IEEE DySPAN 2005, pp. 56
65 (November 2005)
[21] Sadeghhi, B., et al.: Opportunistic media access for multirate ad hoc netoworks. In: Proc.
ACM/IEEE MOBICOM 2002, Atlanta, CA (2002)
[22] Krishnamurthy, S., et al.: Control Channel based MAC layer configuration, routing and
situation awareness for cognitive radio networks. In: Proc. First IEEE Workshop on
Wireless Mesh Networks (WiMesh) (September 2005)
[23] Willkomm, D.: Reliable Link Maintenance in Cognitive Radio Systems. In: Proc. IEEE
DySPAN 2005 (November 2005)
[24] Jing, X., et al.: Reactive Cognitive Radio Algorithms for Co-Existence between IEEE
802.11b and 802.16a Networks. In: Proc. IEEE Globecom (2005)
[25] Jing, X., et al.: Spectrum Co-existence of IEEE 802.11b and 802.16a Networks using the
CSCC Etiquette Protocol. In: Proc. IEEE DySPAN 2005(November 2005)
[26] Brik, V., et al.: DSAP: A protocol for Coordinated Spectrum Access. In: Proc. IEEE
DySPAN 2005 (November 2005)
[27] Marias, G.: Spectrum Scheduling and brokering based QoS demands of competing
WISPs. In: Proc. IEEE DySPAN 2005 (November 2005)
[28] Zhao, J., et al.: Distributed coordination in dynamic spectrum allocation networks. In:
Proc. IEEE DySPAN 2005 (November 2005)
24 L. Hu, V.B. Iversen, and L. Dittmann

[29] Ma, L., et al.: Dynamic open spectrum sharing MAC protocol for wireless ad-hoc
network. In: Proc. IEEE DySPAN 2005 (November 2005)
[30] Zheng, H., et al.: Device-centric spectrum management. In: Proc. IEEE DySPAN 2005
(November 2005)
[31] Papadimitratos, P., et al.: A bandwidth sharing approach to improve licensed spectrum
utilization. In: Proc. IEEE DySPAN 2005 (November 2005)
[32] Zhao, Q., Tong, L., Swami, A.: Decentralized cognitive MAC for dynamic spectrum
access. In: Proc. IEEE DySPAN 2005 (November 2005)
[33] Mishra, S.M., Sahai, A., Brodersen, R.W.: Cooperative Sensing among Cognitive Radios.
In: Proc. IEEE ICC 2006 (June 2006)
[34] Wild, B., Ramchandran, K.: Detecting Primary Receivers for Cognitive Radio
Applications. In: Proc. IEEE DySPAN 2005, pp. 144150 (November 2005)
[35] Ghasemi, A., Sousa, E.S.: Collaborative Spectrum Sensing for Opportunistic Access in
Fading Environment. In: Proc. IEEE DySPAN 2005, pp. 131136 ( November 2005)
[36] Shellhammer, S.J., Shankar, S.: Performance of Power Detector Sensors of DTV Signals
in IEEE 802.22 WRANs. In: TAPSA 2006 (2006)
Optimizing TCP Performance over UMTS with Split
TCP Proxy

Liang Hu and Lars Dittmann

Department of Communications, Optics & Materials


Technical University of Denmark, Lyngby, Denmark
{lhua,ladit}@fotonik.dtu.dk

Abstract. The TCP performance over UMTS network is challenged by the


large delay bandwidth product. Large delay bandwidth product is mainly caused
by the latency from the link layer ARQ retransmissions and diversity technique
at physical layer which are used to cope with radio transmission errors. To cope
with large delay bandwidth product, we propose a novel concept of split TCP
proxy which is placed at GGSN between UMTS network and Internet. The split
proxy divides the bandwidth delay product into two parts, resulting in two TCP
connections with smaller bandwidth delay products which can be pipelined and
thus operating at higher speeds. Simulation results show, the split TCP proxy
can significantly improve the TCP performance in terms of RLC throughput
under high bit rate DCH channel scenario (e.g.256 kbps). On the other hand, it
only brings small performance improvement under low bit rate DCH scenario
(e.g.64 kbps). Besides, the split TCP proxy brings more performance gain for
downloading large files than downloading small ones. To the end, for the
configuration of the split proxy, an aggressive initial TCP congestion window
size (e.g. 10 MSS) at proxy is particularly useful for radio links with high data
rates DCH channels with large delay bandwidth product.

Keywords: Delay Bandwidth Product, Split TCP proxy, UMTS, TCP congestion
control.

1 Introduction
Ubiquitous Internet access is regarded as a key success factor for third generation
mobile communication system. WCDMA/UMTS [1, 2] shows this trend by providing
efficient support for packet-switched data services with data rates up to 384 kb/s for
wide area coverage and maximum 2M bits/s for hot spot areas. The mobile internet
access over UMTS is expected to bring one or two order of magnitude higher data
rates than the previous 2G cellular networks. However, it is important that the
combination of Internet application and underlying transport layer can make good use
of the underlying large network capacity provided by WCDMA air interface. One of
the main problems of UMTS network is TCP throughput degradation in the large
delay bandwidth products (up to around 20k bytes). A number of studies can be found
in the literature [3, 4, and 7] have shown TCP performs poorly over wireless links.
However, those studies primarily either focuses on optimize TCP to cope with radio
transmission errors [4] or the long propagation delay in satellite system [7]. In

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 2535, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
26 L. Hu and L. Dittmann

contrast, we target at the large delay bandwidth product problem in UMTS access
network. The large delay bandwidth product is mainly caused by either the variable
end-to-end delay due to link layer ARQ retransmissions or the enhanced physical
layer transmission bandwidth (which can be up to 2M kbps in UMTS and 10M kbps
in High Speed Packet Access (HSPA)).
The contributions of this work are two-folds: firstly, the TCP performance over
UMTS dedicated transport channels (DCH) is studied under various delay bandwidth
product scenarios. Unlike [5], our simulator incorporates the impact of packet loss
rate of Internet. Secondly, a novel TCP split proxy concept is proposed to cope with
the poor TCP performance under large delay bandwidth product scenarios. The split
proxy is implemented at GGSN. With split proxy, a TCP connection from an Internet
server is terminated in the proxy located at GGSN, and a second TCP connection is
used towards the mobile client. The split proxy divides the bandwidth delay product
into two parts, resulting in two TCP connections with smaller bandwidth delay
products which can be pipelined and thus operating at higher speeds.
The rest of the paper is organized as follows. In section 2, we introduce the UMTS
network architecture. In section 3, we present the concept of split TCP proxy. In
section 4, we describe the simulation settings for evaluating TCP performance with
and without split proxy. The simulation results and analysis are described in section 5.
We draw final conclusions in section 6.

2 Overview of UMTS Network


2.1 UMTS Network Architecture
Figure 1 shows the WCDMA network architecture in packet switch operation [6]. The
network functionality is divided into three groups: User Equipment (UE), UMTS
Terrestrial Radio Access Network (UTRAN) and Core Network (CN). UTRAN
consists of Node B and Radio Network Controller (RNC). The CN comprises two
basic nodes: Serving GPRS Support Node (SGSN) and Gateway GPRS support Node
(GGSN). It acts as the backbone for the radio network and the connectivity toward
external networks. GGSN provides inter-working with external packet-switched
networks such as public internet or corporate intranets via the GI interface. SGSN is

Fig. 1. WCDMA network architecture in PS domain


Optimizing TCP Performance over UMTS with Split TCP Proxy 27

Fig. 2. WCDMA protocol architecture-User-Plane. RLC provides reliable data transfer over
error prone radio link by ARQ retransmission.

connected to RNC via the IuPS interface. SGSN is connected to RNC via the luPS
interface. UE is connected to UTRAN over the WCDMA radio interface Uu.

2.2 UMTS Network Protocol Architecture

Figure 2 depicts the WCDMA protocol architecture for the transmission of user data
plane data which is generated by TCP or UDP based applications. The applications as
well as the TCP/IP protocol suite are located at the end-nodes: UE and host.
Basically, WCDMA provides two types of channels: dedicated channels (DCH)
exclusively used by one user and common channels shared between users. In this
work, only DCH channels are considered.
Four different protocols are involved in the transmission over the air interface:
Packet Data Convergence Protocol (PDCP)
Radio Link Control protocol (RLC)
Medium Access Control (MAC)
Physical layer (PHY)
The PDCP provides header compression functionality which improves spectra
efficiency for transmitting IP packets over the radio interface.
The RLC provides link layer ARQ functionality which is typically required for
dealing with packet loss due to radio transmission errors. RLC can operate in three
different modes: The acknowledged mode, unacknowledged mode and transparent
mode. Due to the TCP performance is sensitive to packet loss due to transmission
errors, it is nature to user RLC acknowledge mode for TCP traffics. The
acknowledged mode provides reliable data transfer over the error-prone radio
interface via ARQ protocol. In the unacknowledged mode, the data transfer over the
radio interface is not error free but with no additional delay due to retransmission. The
functionality of transparent mode is similar to unacknowledged mode but no protocol
information is attached to the PDU. RLC protocol in WCDMA provides the means to
optimize the protocol for delay or throughput, or a trade-off between them. Basically,
the protocol uses a selective repeat request strategy, where the acknowledgements are
triggered by either poll flags sent by the data sender or the detection of erroneous
blocks at the receiver, or based on periodic events.
28 L. Hu and L. Dittmann

The Medium Access Control (MAC) layer provides a set of logical channels to RLC.
Besides, the MAC layer is responsible for controlling the amount of data sent according
to negotiated minimum and maximum data rates during the radio bearer setup. The
instantaneous data rate is determined for every transmission time interval (TTI). The
MAC layer requests the amount of data buffered at the RLC layer, which is ready for
transmission. Based on available radio resources and the amount of data to transmit, the
MAC layer submits the appropriate number of RLC blocks to the PHY layer. The
Physical layer (PHY) contains besides all radio frequency functionalities, the signal
processing including Rake receiver, channel estimation, power control FEC, interleaving,
and rate match.

3 Split TCP Proxy for UMTS


Performance Enhancing Proxies are introduced when the performance suffers due to
characteristics of a link or sub-networks [9]. An important subclass of PEPs are split
connections proxies. This means a TCP connection from one end system is terminated
at the proxy and another connection originate there. With TCP proxy, local
acknowledgement in the proxy about the state of a TCP connection can be used to
enhance the performance by shortcutting the transmission of ACKs or
retransmissions. In the context of UMTS, the proxy is only used for shortcutting the
transmission of ACKs, as RLC takes care of the retransmissions. The placement of
TCP proxy in UMTS network is indicated in figure 3. In UMTS, a TCP connection
from an Internet server is terminated in the proxy located in the CN, and a second
TCP connection is used towards the mobile client. To cope with the large delay
bandwidth of UMTS network, the split proxy divides the bandwidth delay product
into two parts, resulting in two TCPs with smaller bandwidth delay products and thus
operating at higher speeds. Unlike previous work [10] [11], the purpose of
introducing a TCP proxy is to increase the throughput during TCP slow start, rather
than the fast packet loss recovery due to radio transmission errors , as the link layer of
UMTS are carefully designed for radio transmission errors.
Figure 4 shows the difference in the information flow for the two cases with and
without Split Proxy supporting local ACKs. While in case a, without Split Proxy the
server has to wait for the response from the client resulting in a poor slow start
performance, the Split Proxy speed up the data transfer in case b. The Split Proxy
effectively implements pipelining for the two TCP connections. On the connections
toward the client the PEP sends the first segment, and in parallel requests already new
data from the server.
Secondly, to further enhance of UMTS radio link utilization, the TCP connection
from proxy to client can be configured with an initial window size large than the two
segment proposed by [12], or the three segment proposed by [13]. This approach is
motivated by the observation: the RTT for the TCP connection traversing the Internet
is typically smaller than the RTT for the connection over UMTS network. This means
that during a conventional slow start, data arrive faster at the proxy than they can be
sent toward the client when both two split TCP connections use the classic initial
window size. To balance the speed of data transfer, a more aggressive TCP initial
window over the radio bearer can enable the proxy forward more segments without
Optimizing TCP Performance over UMTS with Split TCP Proxy 29

Fig. 3. The Schematic of Split TCP Proxy in the end-to-end TCP path. The split TCP proxy is
placed in between UMTS core network and Internet.

Fig. 4. Schematic TCP transfer with and without Proxy. In the case of using split TCP proxy,
by using the local ACK, two TCP connections can be pipelined for transmissions.

waiting for ACKs from the client. Thus, during TCP slow start, the radio link can be
more efficiently utilized since the TCP sending window can more quickly reach the
maximum capacity of the radio link.

4 Simulation Settings
We firstly evaluate the end-to-end TCP performance over WCDMA DCH channels
without split proxy. Secondly, we compare the TCP performance with split proxy and
without it. The performance evaluation is based on OPNET simulator [8]. This model
assumes that an application transfers files on request from a server in the internet to a
mobile client connect to UMTS network. A TCP Reno is used as the transport protocol,
which is configured for a mobile environment according to the recommendations in
[14]. All radio protocols have been implemented in details as described earlier. The
Internet and Core Network have been simplified by add a constant delay to transmit IP
packets which is 100 ms for one way. In addition, independent packet losses are
considered in the internet. For the radio link, it is assumed a dedicated physical channel
is used. The power control is assumed to work perfectly, resulting in a constant block
error rate on the radio link. Block errors are assumed to be independently distributed.
The RLC layer is configured in acknowledged mode and PDU delivery is in
sequence. The RLC transmitting and receiving window is both 1024 PDUs while the
SDU from upper layer is discard after MaxDAT times RLC layer retransmissions.
Polling is not supported in OPENT model. The status report timer is 100 ms.
30 L. Hu and L. Dittmann

The TCP receiver advertised window (awnd) is 32768 Byte. The slow start
threshold (ssthresh) is set to the size of awnd. The TCP retransmission timeout (RTO)
is set to minimum 1 second and maximum 60 second. Fast retransmit is triggered after
sender receives three duplicated ACKs from receiver. The simulation parameters are
given in the table below:

Table 1. Simulation Parameters

Fig. 5. Simulation Setup

As indicated in figure 5, the simulation setup is based on the system architecture


discussed in the previous section. The network traffic is HTTP and FTP traffic, where
the user behaviors of requesting files using HTTP and FTP are modeled such that the
Optimizing TCP Performance over UMTS with Split TCP Proxy 31

inter-request time follows exponential distribution with average value 30 seconds.


The file size is in between 50, 100, or 200 Kbytes.

5 Simulation Results

5.1 TCP Performance over DCH without Split Proxy

A. Impact of TCP slow start


From the efficiency point of view, the impact of the TCP slow start after the TCP
connection establishment depends on the overall amount of requested data. For small
files (e.g. WWW pages), the document transmission is affected by the poor radio link
utilization during the TCP slow start, with the subsequent throughput degradation. On
the other hand, for larger files (e.g. ftp downloads) the effect is expected to be minor.
In the following, we aim to study the influence of slow start to the RLC throughput,
with various file sizes and DCH bit rates.

RLC throughput VS File Size RLC throughput VS TCP initial counter


160 85

80
140 DCH 128 kbps
DCH 64 kbps 75
DCH 256 kbps
120 70
RLC throughput (kbps)
RLC Throughput (kbps)

65 DCH 256 kbps


100 DCH 64 kbps
60

80
55

50
60
45
40
40

20 35
50 100 150 200 1 1.5 2 2.5 3 3.5 4
File Size ( kBytes) initial counter (MSS)

Fig. 6. The impact of TCP Slow Start. When file Fig. 7. The impact of TCP initial congestion
size reduces from 200 to 50 Kbytes, RLC window. Large Initial window improves
throughput drops dramatically on DCH 256 kbps TCP performance in slow start.
while only drops slightly on DCH 64 kbps.

As shown in the graph, in general, for all DCH channel bit rates (64, 128,256
kbps), small file size has a lower RLC throughput than large file size. This is due to
the fact that: The RLC throughput suffers in very TCP slow start phase; Assuming
continuous downloading files, downloading small files involves more TCP slow start
phases compare to downloading large files. Thus downloading small files has more
degradation of RLC throughput.
In particularly, for a higher bit rate DCH channel, small file size degrades the RLC
throughput significantly. For low speed DCH channel, the throughput of RCL
decreases very little when the file size evolves from large size to small size. As it is
indicated in the figure 6, for the DCH 256 kbps, the RLC throughput decrease almost
113 kbps when the file size evolves from 200 Kbytes to 50 Kbytes. In contrast, for
DCH 64 kbps, the RLC throughput decreases only 20 kbps when the file size becomes
50 Kbytes. The reason is that: a high data rate DCH channel indicates a high delay
32 L. Hu and L. Dittmann

bandwidth product; With a large delay bandwidth product, the RLC throughput of
small file decreases more dramatically due to low utilization of network capacity
during TCP slow start phase; On the other hand, in a low delay bandwidth product
environment such as a low bit rate DCH case, there are few difference between good
and bad utilization of network capacity during TCP slow start because the network
capacity is anyway very small.
B. Impact of TCP initial congestion window
(Other simulation parameters: file size= 100k bytes, Internet loss ratio=0, RLC
MaxDAT=10)
As it is shown in figure 7, a large TCP initial congestion window can improve the
TCP performance for the slow start phase. Because a larger TCP initial window can
enable the sender to fill up the TCP pipe much more quickly than a small value in
the slow start phase. Thus the utilization of radio link is increased via using a large
slow start initial counter. This is especially the case, when the delay bandwidth
product is large. As shown in the figure 7, for DCH 256 kbps, the RLC throughput
increases more than 30% when the initial counter increases from 1 MSS (Maximum
Segment Size) to 4 MSS. In contrast, for DCH 64 kbps, the RLC throughput has a
small increase less than 20% when the initial counter is set to 4 MSS.

Imapct of Internet Loss Ratio Influence of RLC MaxDAT to TCP performance


150 160

DCH 128 kbps


140
DCH 256 kbps 140

130
120
RLC throughput (kbps)

120
RLC throghput (kbps)

110 100

100
80

90
60 BLER 10%
80 BLER 20%

70
40

60 20
0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 0.11 3 4 5 6 7 8 9 10
Internet Packet Loss Ratio MaxDAT

Fig. 8. The impact of Internet Loss Ratio Fig. 9. The impact of RLC MaxDAT

C. Impact of Internet loss rate


The impact of Internet packet loss ratio is studied in this section. We assume the
following simulation setting: 128 kbps and 256 at radio link, Internet loss rate (1%,
5%, and 10%); Inter-request time Exponential with mean 30s, file size 200k bytes.
The conclusion we get from the simulation are: 1) for high bandwidth radio link, the
impact of Internet loss ratio is significant to the RLC throughput; As shown in the
figure 8, the RLC throughput drops around 36% when the Internet loss ratio increases
from 1% to 10%. 2) For low bandwidth radio link, the impact of Internet loss ratio to
the RLC throughput is very small. As shown in the figure 8, for DCH 128 kbps, the
Optimizing TCP Performance over UMTS with Split TCP Proxy 33

throughput degradation is only 23% when the internet loss ratio evolves from 1% to
10%. The reason is that a high bandwidth radio link indicates a large Bandwidth
Delay Product (BDP); RLC throughput is more vulnerable to Internet packet loss rate
when the delay bandwidth product is large, compare to the case of small delay
bandwidth product.

D. RLC interaction with TCP protocol


UMTS radio access network introduces the RLC layer to recover the block erasures
caused by the link level errors of the radio channel. With a sufficiently number of
RLC retransmissions, the SDU error ratio provided to Transport layer can be
negligible. Figure 9 plots the RLC throughput as function of RLC parameter
MaxDAT for DCH data rate (64,128,256 kbps) respectively. The results clearly show
that the more reliable the RLC, the higher RCL throughput can be achieved. In order
to test the reliability of RLC, the MaxDAT is set to 3, 5 and 10. It is shown that the
more reliable the RLC, the better RLC throughput can be achieved. In order to reach a
high degree of reliability, different MaxDAT is needed for different BLER targets. As
indicated in figure 9, for BLER 10%, when MaxDAT is set to 5, the RLC throughput
is maximized while BLER 40% requires a MaxDAT larger than 10 to achieve max
RLC reliability.

5.2 Comparison of the Performance with Split TCP Proxy and without Split
TCP Proxy

From figure 10, the average RLC throughput is presented for different sizes of files
being transmitted. It is clear that, for every small file sizes, very little performance
gain can be achieved with a proxy. For a larger file size, a proxy gives significant
performance gain. The major benefit of a proxy at rates of 64 and 128 kb/s is that it
reduces the time spent in slow start. Since the proportion of slow start as part of the
total transmission time decreases as the file sizes increases, the gain of TCP proxy
increases. Besides, for a data rate of 256 kb/s, the gain increases further for larger file
sizes. In this case, the link is underutilized not only during slow start, but also during
fast retransmit and fast recovery after packet losses. For large files, fast retransmit and
fast recovery becomes important. With proxy, the RLC throughput can significantly
increase during fast retransmit and fast recovery after packet losses. To sum up, a
proxy increase TCP performance for small data rates DCH mainly during the slow
start, while for high data rates DCH both slow start phase and fast retransmit phase.
Figure 11 compares, for different radio bearer rates, the average RLC throughput
with and without proxy. For the proxy, different initial windows have been
investigated. For a TCP configuration according to table 1, a proxy provides small
gain for link rates of 64 and 128 kbps. Also, a large initial window of 10 segments in
the proxy only marginally improves the results for these data rates. However, for the
data rate of 256 kbps, a proxy with an initial window of 3 and 10 segments achieves
the gain of 33% and 66% respectively. This shows that a aggressive initial TCP
congestion window at proxy is particularly useful for radio links with high data rates
whose delay bandwidth product is large.
34 L. Hu and L. Dittmann

256 with Proxy 250


250 256 without proxy
128 with proxy
128 without proxy
64 with proxy 200
200 64 without proxy 64 kbps
128 kbps

Average RLC Throughput


256 kbps
RLC throuhgput (kbps)

150
150

100
100

50 50

0 0
0 50 100 150 200 250 300 350 400 450 500 1 2 3 4 5 6 7 8 9 10
file size (k bytes) initial window at Split Proxy

Fig. 10. Average RLC throughput at different Fig. 11. Average RLC throughput at different
DCH rates for varying file sizes DCH rates for different proxy configuration

6 Conclusion
We firstly study the TCP performance without split proxy over UMTS DCH channel
under various delay bandwidth product. The simulation results show that: TCP
performance in terms of RLC throughput is more vulnerable to the impact of TCP
slow start and Internet packet loss under high bite rate DCH channels (thus large
delay bandwidth scenarios). A large TCP initial congestion window can improve TCP
performance degradation due to TCP slow start, especially under high DCH bite rates
cases (thus large delay bandwidth product). For a given Block Error Rate (BER)
caused by the radio transmission errors, a proper configuration of RLC MaxDAT can
minimize the SDU error ratio provided to upper transport layer.
To cope with large delay bandwidth product, we propose a split TCP proxy placed
in between UMTS access network and Internet. By shortcutting the transmission of
ACKs and pipelining for two split TCP connections, it can significantly improve the
TCP performance under large delay bandwidth product scenarios. Simulation results
show that: a split proxy increases TCP performance only during the slow start under
low data rates DCH, while both slow start phase and fast retransmit phase under for
high data rates DCH. Thus it brings more performance gain under high data rats DCH
channels (thus high delay bandwidth product scenarios). Besides, the performance
gain offered by proxy is larger for downloading large files than downloading small
files. Finally, for the configuration of the split proxy, an aggressive initial TCP
congestion window size (e.g. 10 MSS) at proxy is particularly useful for radio links
with high data rates whose delay bandwidth product is large.

References
[1] Kaaranen, H., et al.: UMTS Networks. Architechture, Mobility and Services. John-Wiley
& Sons, New York
[2] 3GPP, http://www.3gpp.org
Optimizing TCP Performance over UMTS with Split TCP Proxy 35

[3] Caceres, R.,, L.: Improving the Performance of Reliable Transport Protocols in Mobile
Computing Environments. IEEE Journal on Selected Areas in Communications 13(5)
(1995)
[4] Balakrishnan, H., et al.: Improving Reliable Transport and Handoff Performance in
Cellular Wireless Networks. ACM Wireless Networks 1(4) (1995)
[5] Ameigeiras, P.J., Mogensen, P.: Imapct of TCP flow control on the radio resource
managemnt of WCDMA Networks. In: VTC 2002 (2002)
[6] 3GPP, Network Architechture, TS 23.002 (March 2000)
[7] Luglio, M., et al.: On-board satellite split TCP proxy. IEEE Journal on Selected Areas
in Communications 22(2) (2004)
[8] http://www.opent.com
[9] RFC 3135, http://rfc-ref.org/RFC-TEXTS/3135/kw-proxy.html
[10] Balakrishnan, H., et al.: Improving TCP/IP Performance over Wireless Networks. In: 1st
ACM Conf.Mobile Commun. and Networking, Berkeley, CA (November 1995)
[11] Bakre, A., Badrinath, B.R.: I-TCP: indirect TCP for Mobile hosts. In: 15th Intl Conf.
Distributed Communication Systems (May 1995)
[12] Henderson, T.R., Katz, R.H.: Transport Protocols for Internet-Compatible Satelite
Networs
[13] Allman, M.: Increasing TCPs initial Window, RFC 2414 (September 1998)
[14] Inamura, H., et al.: TCP over 2.5 G and 3 G Wireless Networks, RFC 3481 (February 2003)
Use of Photonic Networks in Digital Cinema
Postproduction Dailies Workflow

Michal Krsek1, Paul Hearty2, Laurin Herr3, and Jurgen Schulze4


1
CESNET, z.s.p.o., Zikova 4, Praha 6, 160 00, The Czech Republic
michal.krsek@cesnet.cz
2
Rogers Communications Centre, Ryerson University, Toronto, Canada, M5B 2K3
phearty@ryerson.ca
3
CineGRID Inc., 5756 Ayala Avenue, Oakland, California 94609, USA
laurin@cinegrid.org
4
California Institute for Telecommunications and Information Technology
University of California at San Diego, 9500 Gilman Drive, MC 0436
La Jolla, California 92093-0436, USA
jschulze@ucsd.edu

Abstract. We describe a recent proof of concept demonstration of international,


grid-based collaboration in Post Production for Digital Cinema. In the
demonstration, very high resolution (4,096 x 2,160 Pixels x 24 Frame per
second) Digital Cinema content captured in the Czech Republic was rendered
using processing resources in the United States. Editing and color correction
were carried out in real time by an editor/colorist in Canada under the artistic
direction of a cinematographer in the Czech Republic. Final conforming,
product assembly, and exhibition were carried out in the Czech Republic. The
creative workflow demonstrated was made possible by a 10 GE fiber
networking infrastructure and by grid-like resource management.

Keywords: CineGrid, digital cinema, Post Production, high-speed photonic


networks, networked collaboration.

1 Introduction
CineGrid is a non-profit membership organization whose mission is to build an
interdisciplinary community focused on the research, development and demonstration
of networked collaborative tools, enabling the production, use and exchange of very
high-quality digital media over high-speed photonic networks.
One such application is the Digital Cinema. With resolutions of up to 4096 pixels
horizontally x 2160 vertically at 24 frames per second, and with each pixel requiring
up to 16 bits per Red, Green and Blue color channel, Digital Cinema provides not
only exceptionally high quality imagery for exhibition, but also exceptional
challenges in Production and Post Production. For example, when 4K x 2K image
data is captured and output in raw, rather than in Red-Green-Blue format, the
processing resources required to render the data from the sensor in a time-frame that
is viable for Production are both substantial and rare. And, finally, the technical and

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 3639, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
Use of Photonic Networks in Digital Cinema Postproduction Dailies Workflow 37

creative resources necessary to edit and color-correct digital cinema content are rare
and geographically distributed, particularly given the realities of on-location shooting.
CineGrid members believe that the existence of broadband photonic networks and
of grid-based tools for the discovery, acquisition, provisioning, and management of
creative and processing resources can facilitate Production and Post Production in
Digital Cinema. They further believe that the network management, distributed
resource management, and other tools adapted or developed to support such
applications will be applicable to other high-demand applications, such as scientific
visualization and collaboration.
The importance of high-speed networks in enabling the Digital Cinema applications
is evident when one considers the basic parameters of the medium.

2 Demonstration Objective and Concept


The underlying objective of the demonstration was to show that Production and Post
Production in Digital Cinema can be facilitated by using high-speed photonic
networks and by grid-like management of distributed resources. The demonstration
was conceived in two parts:
Capture in a 4K format at one location with rendering at another, in this case
Prague and San Diego, respectively;
Edit and color correct with the editor/colorist at one location and the cinematographer
at another, in this case Toronto and Prague respectively.

3 Approach, Strategy and Basic Technology


For the first part of the demonstration, we captured 4K x 2K digital cinema imagery
in Prague using a Dalsa Origin 16 camera and used processing-cluster resources at the
California Institute of Telecommunications and Information Technology (Calit2) at
the University of California San Diego to render the raw Bayer images1 to RGB. The

Fig. 1. Link and signal structure used for editing/color-correction demonstration

1
In the Bayer format used, the image is captured against a target 4096 x 2160 x 24 sampling
structure, but the approximately 8.8 megapixel spatial sampling lattice is divided into 4-pixel
blocks in which Green is sampled at twice the frequency of Red and Blue.
38 M. Krsek et al.

key factor was to allow transfer of the raw 4K x 2K data from Prague to San Diego
and of the rendered RGB data back to Prague within a viable time window for full-
scale Production and Post Production.
For the second part of the demonstration, we established a near-real-time workflow
between the cinematographer in Prague and an editor/colorist in Toronto. The key
factor at this stage was to establish a transcontinental interconnect that would support
the delivery of workable image data from Prague to Toronto, a two-way telepresence
between the cinematographer and the colorist, and bi-directional transfer of control
data between the colorists control surface in Toronto and the cinematographers base
processing system in Prague. The link and signal structure were as shown in Figure 1.

3.1 Network Connections

The demonstration described here was presented at the 2007 meeting of the Global
Lambda Integrated Facility2 in Prague. The network connections were established ad
hoc and were as shown in Figure 2.

Fig. 2. Diagram of the Network used for the demonstration

4 Challenges and Results


Several challenges had to be solved while preparing this demonstration.

2
GLIF is an international virtual organization that promotes standards for fiber optic (lambda)
networking, involving dedicated, high-capacity circuits based on optical wavelengths, which
terminate at exchange points known as GOLEs (GLIF Open Lightpath Exchanges).
Use of Photonic Networks in Digital Cinema Postproduction Dailies Workflow 39

First, tuning up all the network equipment as well as processing nodes to work
with very long high-speed circuits was necessary (e.g. jumbo frames had to be set on
all the routers and switches and tune-up TCP window size on processing nodes to get
maximum throughput).
Second, we needed to render (de-Bayer) raw image data using parallel processing.
Processing of one second of image data took about 23 seconds on one machine, so
parallel processors were used to get nearly real-time performance. Time distribution
of the de-Bayering is shown in Table 1.

Table 1. Time Distribution of de-Bayering from Prague at San Diego cluster

Step Duration per one second of footage


Transfer Prague -> San Diego 4.4 seconds
Processing on cluster 9.6 seconds
Transfer San Diego -> Prague 9.1 seconds

The processing devices used for this demonstration were Linux based personal
computers or servers that had well defined interfaces HD-SDI (SMPTE 292M) and
1000Base-T (IEEE 802.3ab). We tweaked those interfaces for best performance.
Third, it was necessary to set up and maintain telepresence between Toronto and
Prague. This two-way connection had to support the exchange of HDTV video as we
as audio, with minimum latency and jitter.
And, finally, we needed to tweak console/cluster communication for the color
correction system used, a Baselight FOUR3. In its design configuration, this system is
intended to support of separations of up to 50 meters between the control surface and
the base cluster. In our case, we had a separation of approximately 7,500 km,
resulting in lengthier communication delays.

5 Conclusions
We demonstrated the feasibility of using of high-speed photonic networks for real-
time Post Production of Digital Cinema source material.
Both the colorist in Toronto and the cinematographer in Prague provided positive
feedback insertion of the long-distance photonic network between them had no
effect on the collaborative creative activity. The high speed, low latency and zero
jitter on the network, combined with the high resolution telepresence, allowed
collaboration as if in the same room.

3
BaseLight FOUR is a cluster-based color correction device developed by UK based vendor
FilmLight.
Visualizing TCP/IP Network Flows to Control
Congestion and Bottlenecks

Nanjun Li1 and Werner Zorn2


1
Institute for Digital Communications, University of Edinburgh,
EH9 3JL, United Kingdom
n.li@ed.ac.uk
2
Hasso Plattner Institute, University of Potsdam, 14480, Germany
zorn@hpi.uni-potsdam.de

Abstract. Internet technological trends indicate that the link transmission speed
has been significantly improved in the last decade, while the network nodes
(routers and switches) have become the common bottleneck of Internet. Based
on this insight, this paper applies the newly developed Visualized IP-based
Network Simulator (VINS) to study TCP/IP networks formed on fast speed
links. VINS design complies with the related RFCs and BSD operating system
in three ways: IP networking; socket hosting and interfacing; service on the
node. Two sample scenarios are presented regarding IP networking and TCP
end-to-end congestion control for bulk data delivery such as video files, and
handling packet overflow and delay on intermediate nodes.

Keywords: TCP/IP, Performance Evaluation.

1 Introduction
Several recent studies indicate that the bottleneck of todays Internet has been shifted to
the node, such as routers and switches, since the link speed has been significantly
improved and even exceeded the traffic demands. However the existing network
simulators still consider congestion issue over the links. This paper applies the newly
developed Visualized IP-Network Simulator (VINS) into TCP/IP network performance
evaluation where overflow and delay are frequent on the nodes.
In Section 2 we briefly review the TCP/IP stack in BSD operating system [1], related
IETF RFCs (i.e. [2][3][4]) and some existing network simulators. Section 3 introduces the
design of VINS, which is a C++ Win32 application. We summarize VINS contribution as:
IP networking, socket hosting/interfacing, and node store/relay services. The first example
is presented in Section 4, illustrating the capability of VINS to simulate IP network with
complex topology. The second scenario is presented in Section 5, as a study on TCP
end-to-end congestion control and avoidance when packets can be delayed and lost on
an intermediate node. By varying parameterized background traffic, we compare the
performance of four TCP mainstream variants Tahoe, Reno, NewReno and SACK, and
find new insights using our evaluation method. We conclude in Section 6 and brief plan
in future.

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 4047, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
Visualizing TCP/IP Network Flows to Control Congestion and Bottlenecks 41

2 TCP/IP Networks and Existing Simulators


To network unknown number of nodes, addressing/subnetting schemes are designed
as Classful [2] and CIDR [3], which grant free and easy network maintenance and
extension. The network topology is the result of all nodes routing tables (layer-3)
and link connectivity as infrastructure (layer-2). In Internet, IP and routing table mis-
configuration by mistakes (i.e., BGP mistakes) or malicious attacks is commonly
observed [4][5], and may result in TCP disruption because of falsification of
reachability. Besides the intermediate nodes, host nodes provide access points to users
as well platforms for socket-based applications. These sockets choose TCP [6][7] and
UDP [8] as transport modules.
Each individual node in an IP network is an autonomous system working in
asynchronous mode to handle unpredictable network events, such as timeout, packet
arrival, departure and user request (Fig. 1). In BSD, IP interrupt queue is monolithic
with default capacity to 50, and normally scheduled as first-in-first-out (FIFO). This
design prevents parallelism causing packet flow disorder, which is found to be critical
to TCP performance [10]. In the link layer, a node may have multiple interfaces and
associated interface-send-queues for output.
IP interrupt routine serves packets in two ways: 1) hosting, by triggering the
transport routines like TCP, if the packet has reached its destination; 2) forwarding,
by looking-up the routing table for the longest-prefix-matching neighbor node and
relaying (Fig. 1).

Fig. 1. A BSD node has a monolithic FIFO queue at the IP layer

S. Floyd [11] and J. Stone [12] claim that in the wired networks, the majority of
packet losses are caused by traffic congestion, while the losses due to link corruption
instead of congestion are rare. Other studies (i.e. [13]) find that the node has become
the main bottleneck of todays Internet since the link transmission speed has been
significantly improved and even exceeding traffic demands. We infer that congestion
commonly occurs when the IP interrupt queue is overflowed despite of the router may
42 N. Li and W. Zorn

still have large buffer. We also studied three existing simulators: NS-2 [14], OPNET
[15] and OMNeT++ [16], which provide much needed tools and are widely used in
network research. However their simulated networks might have some reality
concerns while comparing with the implementation in real world: 1) IP module has
been missing. 2) It might be expected if simulators can provide store-relay function
and service time on the nodes. 3) Socket is not included in simulation, which is the
interface bridging the user byte-streams and discrete protocol unit in the networks. In
BSD, the indicators in the TCP control block (tcpcb), such as snd_nxt, rcv_nxt,
snd_una etc, are pointing to the bytes in the so_snd and so_rcv buffers of a socket,
with which TCP can guarantee datas reliability and pursue high performance by
properly adjusting the congestion window (cwnd := snd_nxt snd_una).

3 Innovative Features of VINS


As for network simulator classification (S.Y. Wang et al [17]), VINS is a traditional
one for object-oriented and discrete-event features, an application layer software.
Each node in VINS has a 32-bit IP address and forwarding prefix, which indicates its
domain and potential size. Once the FIFO queue is overflow, new arriving packets
will be dropped. The packet time-to-live (TTL) is initialized on the source host as 64.
Router decrement packets TTL when relays it, or discards it if any of following
events occurs: 1) TTL expiry. 2) Destination (of packet) is unreachable. 3) Routing
table update / redirect. 4) Other ICMP errors stated in RFC792 and [18].
Service time for packet is simulated with a stochastic value taking certain
probability distribution functions (PDF). VINS currently provides four choices:
Deterministic, Gaussian, Uniform and Exponential. Generating random numbers takes
unpredictable time, i.e., to create two Gaussian numbers we need the Box-Muller
Transformation [19] contains uncertain times of loop, square and logarithm. We split
off a process to generate the normalized random numbers for all nodes need to do is a
linear transformation.
VINS link is parameterized as latency and loss ratio per million packets. Packet is
delayed and dropped in the given values, though in the wired networks such as
Ethernet, the physical delay is not significant and link corruption (i.e. CRC
corruption) is very rare.

4 Scenario 1: IP-Networking
The first sample scenario (Fig. 2) demonstrates VINS capability in simulating multi-
domain networks with diverse transport modules. There are 14 applications running
on different ports of the host nodes, exchanging protocol data units with UDP and
TCP, including Tahoe, Reno, NewReno and SACK [20][21]. Two of them are set
mute mode to receive data only: A:1312 (Tahoe) and N:880 (UDP), while the others
are in talk mode and sending data to their peers. Mute UDP applications do not send
any packet into the network, while mute TCP applications can acknowledge the
received data with pure ACK packets.
Visualizing TCP/IP Network Flows to Control Congestion and Bottlenecks 43

Fig. 2. Scenario 1 demonstrates VINS simulation of various transport modules, including UDP
and TCP variances over a complex IP network

Table 1 is excerpted from the system report, showing the routes of end-to-end
flows may be asymmetric on a bi-directional connection (as light-grayed): packets
from B:80 to H:8080 goes through path B->O->E->V->R->G->H; the counterpart
flow through path H->G->R->J->O->B. The routes depend on the IP configuration
and connectivity of the network.

Table 1. Routes and Protocols of Streams in Scenario 1

Src. Route Dest. Reachability Proto Mode


M:917 M->L->J->R->G->H H:799 True TAHOE TALK
H:799 H->G->R->J->L->M M:917 True NEWRENO TALK
B:80 B->O->E->V->R->G->H H:8080 True NEWRENO TALK
H:8080 H->G->R->J->O->B B:80 True RENO TALK
M:289 M->L->J->R->F->I I:909 True NEWRENO TALK
I:909 I->F->R->J->L->M M:289 True RENO TALK
A:1312 A->D->E->J->L->P P:303 True TAHOE MUTE
P:303 P->L->J->E->D->A A:1312 True RENO TALK
S:2771 S->K->U->W->R->G->H H:137 True SACK TALK
H:137 H->G->R->V->U->K->S S:2771 True SACK TALK
A:1001 A->D->E->V->R->F->N N:880 True UDP TALK
N:880 N->F->R->V->E->D->A A:1001 True UDP MUTE
X:365 X->T->R->V->E->D->Y Y:1224 True SACK TALK
Y:1224 Y->D->E->V->R->T->X X:365 True SACK TALK
44 N. Li and W. Zorn

This sample also exhibits that UDP flows can be flooding and slow down other
flows performance, especially to TCP. Application from A:1001 to N:880 is an UDP
application, VINS reports this flow possesses 30.7% packets that are relayed in the
network. This is because the UDP protocol does not control the rate of injecting
packets. UDP flood attack is commonly seen as flooding a victim router and forcing
it to send many ICMP [9] messages, eventually quench the clients. This issue has
also been described and addressed in [22].

5 Scenario 2: TCP Performance Evaluation


TCP congestion control is performed on the two end-nodes over a connection. A
commonly used evaluation scheme for TCP is to delay and drop packets on the links
by varying the link latency and packet drop ratio within [0, 1]. However, this scheme
does not properly reproduce the congestion issue in real world since packet losses are
not independent. Second, the packet loss ratio can be reduced by TCP congestion
avoidance, thus imposing constant loss ratio is not fair evaluate a TCP variance.
Further, packet delay and loss are related issues, which might be expected to consider
together. Our scheme is to vary parameterized background traffic across an
intermediate node and investigate: statistics on the intermediate node; throughput on
endpoint; impact of background traffic and TCP control variables (Fig. 3).

Fig. 3. TCP connection between A1:11 and B1:1011: all nodes have same mean packet service
time 40 ms in exponential distribution. Background traffic is sent from X to Y.

The background traffic is send from X1, with interval t exponentially distributed
and mean departure rate (packets/sec). The probability of interval t helds (1):
t
P ( , t ) = e (1)
Visualizing TCP/IP Network Flows to Control Congestion and Bottlenecks 45

The probability of A packets arrival per second at R1 takes (2):

A e
P ( A, ) = (2)
A!
We interest the utilization of R1, which is calculated with function (3):
Di
U i = Di X i = (3)
Bi

where X i is the mean service time to the packet; B i is the service rate, the reciprocal
of X i , indicating the maximum departure rate of node i.

Fig. 4. Utilization of R1 illustrates the capability of TCP in exploring free bandwidth

Fig. 5. Tuning the departure rate from X, the arrival rate and departure rates of R1

We found Tahoe performs the best when background traffic is under 40 pkts/sec,
but goes down as background traffic increasing. SACK exhibits its strength in a
congestive environment. A series of experimental tests (R. Bruyero et al [23]) claim
this ratio ranges between 1.16 and 1.29, which backs up our simulation results (as
46 N. Li and W. Zorn

Fig. 6. Comparison among Tahoe, Reno, NewReno and SACK, for special interest we give the
ratio of SACK on Reno

Fig. 7. Diagram of throughput on delay shows Tahoe is always under others

Fig. 6). Plotting throughput on given loss ratio and delay (i.e. Fig. 7), Tahoe always
under-performs, however as we explained, not a fair test.

6 Conclusion
To match the trend of fast link speed in simulation, we designed VINS as a tool for
network design and research. We also devise a fairer method for TCP performance
evaluation rather than imposing constant loss ratio and delay on the links, and find
that Tahoe, which is deemed to be too gentle, performs the best over other three
variances if network is not congestive. With increasing the background traffic rate,
SACK achieves the best performance.
Visualizing TCP/IP Network Flows to Control Congestion and Bottlenecks 47

VINS is being developed to accommodate wireless networks where contention and


congestion may coexist, and nodes are demand to route for destination without IP
configuration (Ad hoc networks). TCPs traditional congestion avoidance needs
improvements to deal with the new issues such as differentiating contention and
congestion, and handling frequent route change etc.

References
1. Wright, G.W., Stevens, W.R.: TCP/IP Illustrated, vol. II. Addison Wesley, Reading (1994)
2. Postel, J.: Internet Protocol, RFC 791
3. Fuller, V.: Classless Inter-Domain Routing, CIDR: an Address Assignment and
Aggregation Strategy, RFC 1519
4. Hengartner, U., Moon, S., Mortier, R., Diot, C.: Detection and analysis of routing loops in
packet traces. In: 2nd ACM SIGCOMM Workshop on Internet measurement (2002)
5. Mahajan, R., Wetherall, D., Anderson, T.: Understanding BGP Misconfiguration. In:
Proceedings of SIGCOMM, Pittsburgh, PA (2002)
6. Postel, J.: Transmission Control Protocol, RFC 793
7. Jacobson, V.: TCP Extensions for High Performance, RFC 1323
8. Postel, J.: User Datagram Protocol, RFC 768
9. Postel, J.: Internet Control Message Protocol, RFC 792
10. Bennett, J.C., Partridge, C., Shectman, N.: Packet Reordering is Not Pathological Network
Behavior. IEEE/ACM Transaction on Networking 7 (1999)
11. Floyd, S.: A Report on Some Recent Developments in TCP Congestion Control. IEEE
Communications Magazine (2001)
12. Stone, J., Partridge, C.: When the CRC and TCP Checksum Disagree. In: SIGCOMM
Symposium on Communications Architectures and Protocols (2000)
13. Molinero-Fernndez, P.: Circuit Switching in the Internet, Ph.D. Thesis, Stanford
University (2003)
14. Fall, K., Varadhan, K.: The NS Manual (2006)
15. OPNET Technologies, Inc., http://opnet.com/
16. Bless, R., Doll, M.: Integration of the FreeBSD TCP/IP-Stack into the Discrete Event
simulator OMNET++. In: Proceedings of the 2004 Winter Simulation Conference (2004)
17. Wang, S.Y., Chou, C.L., Lin, C.C.: The Design and Implementation of the NCTUns
Network Simulation Engine. Simulation Modeling Practice and Theory 15, 5781 (2007)
18. Jacobson, V.: Congestion Avoidance and Control. ACM SIGCOMM (1988)
19. Box, G.E.P., Muller, M.E.: A Note on the Generation of Random Normal Deviates. Annals
Math. Stat. 29, 610611 (1958)
20. Mathis, M., Mahdavi, J., Floyd, S., et al.: TCP Selective Acknowledgment Options, RFC
2018
21. Floyd, S., Mathis, M.: An Extension to the Selective Acknowledgement (SACK) Option
for TCP, RFC 2883
22. CERT Advisory CA-1996-01 UDP Port Denial-of-Service Attack (1996)
23. Bruyeron, R., Hemon, B., Zhang, L.: Experimentations with TCP selective acknowledgment.
ACM SIGCOMM Computer Communication Review 28(2), 5477 (1998)
ZBMF:
Zone Based Message Ferrying
for Disruption Tolerant Networks

Bahadir K. Polat, Moazzam Khan, Nikhil Tuli, and Pooja Kailay

College of Computing
Georgia Institute of Technology
{baha,moazzam,nikhiltuli,poojakailay}@gatech.edu

Abstract. We introduce Zone Based Message Ferrying (ZBMF), a hybrid pro-


tocol for Disruption Tolerant Networks (DTN) which combines the concept of
message ferrying with the well known DTN routing protocols. ZBMF, trans-
forms the disrupted network space into a connected network topology by divid-
ing it into overlapping ferry routing zones. ZBMF gives DTN message routing a
sense of direction, thus providing both contained dissemination of copies of
messages and faster delivery of messages in the network. ZBMF does not dic-
tate a strict movement pattern to message ferries, which can be very useful in
the real world. Rescue helicopters, school buses, tanks or soldiers in the battle
field, all can be designated as message ferries and carry messages without
changing their activity and movement schedules. Due to its low overhead,
ZBMF can be used to cover large areas of disrupted communication such as cit-
ies and large disaster areas. Finally, we implement our ZBMF on the Opportun-
istic Network Environment (ONE) Simulator and evaluate it as performing
better in terms of overhead and latency than other well known protocols for
Disruption Tolerant Networks while maintaining equally high delivery ratios.

Keywords: Disruption Tolerant Networks, Message Ferrying, Space Path,


Time Path.

1 Introduction
Disruption tolerant networks (DTN) are special networks in the wireless communica-
tion concept, in which there exist no space path between the nodes throughout most of
the network duration. In disruption tolerant networking, the nodes communicate when
they meet each other. The communication duration in these meeting times is limited.
Regular routing protocols for wireless networks such as AODV, DSR, DSDV, and
ZRP all require that there exists a space path between the source and the destination
[1]. In these protocols, the messages between source and destination are divided into
packets and routed through known routes that exist for a significant amount of time.
Since, in DTNs, there is no stable path between the source and destination, these rout-
ing protocols suffer from low delivery success ratios and high routing overheads. In
DTN, these protocols will spend impractical amounts of time and control messages to

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 4859, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
ZBMF: Zone Based Message Ferrying for DTNs 49

discover a path between the source and destination, which does not exist most of the
time in DTNs. So these protocols may never discover a path between the source and
destination. Even in circumstances when such a path is found, the chances of disrup-
tion are high, which initiates the discovery process again. As a result, these protocols
will spend the majority of the network time and energy in this discovery-disruption
cycle, creating high overheads and will have extremely low delivery success ratios,
with high delays. From the argument above, we conclude that DTNs need a different
approach for routing. The main scheme for routing in DTN requires that the nodes
transfer complete messages instead of dividing them into packets. The routing proto-
cols in DTN perform message based transfers between nodes. The main paradigm in
DTN routing is store and carry forward. In this scheme the nodes after receiving
messages destined to other nodes store these messages and carry them for a certain
amount of time. When the nodes meet with each other they exchange messages obey-
ing certain rules specified in the particular routing protocol. This way messages are
carried physically in the DTN and reach their destinations. There are many protocols
such as Epidemic [4], Prophet [5], MaxProp [6], Spray and Wait [7] that specify cer-
tain rules for carrying and forwarding messages among the nodes in the DTN.
We will talk about certain characteristics of these routing protocols while we ex-
plain our protocol. Message Ferrying is a particular concept in DTN [2]. The DTN
routing protocols mentioned above depend on the fact that the communicating nodes
are mobile enough to meet with each other to exchange messages. What happens in
the case when the nodes have limited radio range, very low or no mobility and are
positioned far (out of radio range) from each other so that they have almost no chance
of meeting with each other. In message ferrying this problem is solved by designating
mobile nodes, ferries, which travel the area and physically carry messages among
these nodes.

2 Zone Based Message Ferrying


Zone based message ferrying (ZBMF) can be seen as a hybrid routing protocol that
uses certain concepts from DTN routing and message ferrying. ZBMF uses zone in-
formation to route messages in the network. ZBMF makes the following assumptions
about the mobility model. The disrupted ordinary nodes, in the network, are grouped
together into message ferrying zones based on geographical position. The message
ferries designated to a certain zone based on either their local movement pattern or on
a particular specification stay inside their local zones most of the time, but are free to
move to carry out their job inside their local zones. (Refer to Section 4 for details on
plausible solutions for the initial setup of zones). Under the aforementioned mobility
pattern the routing domain is divided into overlapping zones. The overlap areas be-
tween the zones constitute links among the zones. This way, the zones form a con-
nected network. Figure 1 illustrates the mapping of the disrupted network space into a
connected network of zones. ZBMF uses message ferrying inside the zones. When a
message reaches to the zone of its destination it is carried physically to its destination
by a message ferry local to that zone. ZBMF forwards the messages through zone
routes. The protocol uses the shortest zone route to forward the messages to the desti-
nation. The major difference between ZBMF and other DTN routing protocols is that
50 B.K. Polat et al.

ZBMF incorporates a sense of direction into the routing process. A simple example
can show us this difference. Let us say that a node S has a message destined to a node
D (see figure 1 for visual representation). Node S only infects the ferries in the zones
that belong to the shortest zone path to the destination D. Since the zones are con-
nected and we pick the shortest zone path we are assured that the message will get to
its destination D with very small delay, and high delivery success ratio. We also ob-
tain minimum overhead, since we are not infecting any nodes from other zones that
are not in the zone path. This way, our protocol can cover big areas and provide high
success ratios, small delays and minimum overhead.

2.1 Zone Based Message Ferrying, the Protocol

We would like to present the entities and concepts in ZBMF. In ZBMF, the nodes are
identified either as ferries or ordinary nodes. Message Ferry: A message ferry is a
mobile wireless node that physically carries/forwards the messages of the ordinary
nodes. Ordinary Node: An ordinary node is a wireless node with limited radio range
and very low or no mobility. Ordinary nodes are the sources and destinations of the
data messages in the network. Ordinary Nodes use the ferries to carry their messages
for them. In zone based message ferrying the communication domain is partitioned
into overlapping zones. From this concept emerge two types of communication, intra-
zone communication and inter-zone communication. Intra-zone routing uses the prin-
ciples of message ferrying, where the ferries of the zone carry and forward the
messages of the ordinary nodes in their zones. In inter-zone routing the ferries from
different zones meet with each other and selectively forward messages destined for
other zones than their own. The address of any node in the network consists of the
zone id and the network address of the node. The ordinary nodes maintain the ad-
dresses of all the ordinary nodes in the network. The ferries maintain a routing table
in which they store next hop zones only. Since the addresses already contain the zone
ids, ferries do not need to maintain long routing tables for each ordinary node in the
network. These concepts and entities are presented in figure 1 to give us an easy to
understand visual example. The protocol consists of two subsections, intra-zone rout-
ing and inter-zone routing. Intra-zone routing: Inside the zones, the protocol uses
message ferrying principles. The ordinary nodes (ON) maintain a record of all the
addresses of the other ONs in the network. The address of any node in the network
consists of the zone id and the network address of the node. When the ON encounters
a ferry, it first extracts the zone id of the ferry from the ferrys address. If the ferry
belongs to the same zone as the ON does, it then checks to see if the ferry already has
the message by checking the message id list sent to it by the ferry. If the ferry does
not have the message, the ON forwards the message to the ferry. Otherwise it does not
forward the message. When a ferry sees another ON it forwards its message list to the
ON. After the ON performs the checks mentioned above, it sends the message to the
ferry or not. The ferry also checks its message list to see if it has any messages des-
tined to that ON, if yes, the ferry forwards the messages to the ON. When the ferries
contact each other they exchange message lists. The format of the messages is illus-
trated by Figure 2. The ferries first extract each others zone ids and check to see if
they belong to the same zone. If the ferries belong to the same zone they do not for-
ward each other the messages that are destined to ONs in their own zones, they
ZBMF: Zone Based Message Ferrying for DTNs 51

deliver those messages directly. The messages that have a different destination zone
are exchanged between the ferries of the same zone, if the ferries do not have these
messages already. Although this exchange adds some overhead due to infection, it
also speeds up the forwarding of messages through the intermediate zones on the zone
path. In case the ferries belong to different zones then they follow the protocol for
inter-zone routing. Inter-zone routing: Inter-zone routing is concerned with the proper
forwarding of the messages between zones. In the inter-zone routing the contacting
elements of the network are the ferries. When two ferries contact each other, they
exchange message lists. The ferries extract each others zone IDs.

Fig. 1. Zone based Message Ferrying

Source Address Destination Address


Zone_ID + Network Address Zone_ID + Network Address TTL
Flags Msg. Address of the Ferry
Msg_List_Exch. 0/1 ID Zone_ID + Network Address
Zone_Advertisement 0/1
Message Body

Fig. 2. Message Format

Destination Zone Next Hop Zone

Zone 1 Zone 3
Zone 2 Zone 4

Fig. 3. Routing Table of Ferries

They first check to see if they have any messages destined to that zone. The routing
table maintained by each ferry (shown in Figure 3) contains the mapping of the
destination zone with the next hop zone according to the shortest path between
the source and destination zone. If they do, then they check the message list sent by
52 B.K. Polat et al.

Input: Message
Extract: Zone_ID
Check: If Message = Msg_List
If (Msg_List)
If Zone_ID = Own_Zone_ID
Check: Msg_List
if message_m is not in the Msg_List && message_m_destination != own_zone
(ensures that a ferry delivers a message destined to its zone itself)
Forward: message, if the other ferry is not sending you a message
else message_m = forward_queue->next_message;
Loop: until forward_queue = empty or connection is lost.
Else Zone_ID != Own_Zone_ID
Check: Routing table for the next hop zones for messages in the forward
queue which are destined to zones other than own_zone
if have a message_m destined for the other ferry with Zone_ID
Check: Msg_List
if message_m is not in the Msg_List
Forward: message _m, if the other ferry is not sending a
message
else message_m = forward_queue->next_message
Loop: until forward_queue = empty or connection is lost.
else have no message destined for the other ferry with Zone_ID
Else (!Msg_List)
Store: received message in the forward queue

Fig. 4. Algorithm for message forwarding between ferries

the other ferry to see if that ferry already has the message. If the other ferry does have
the message already no further step is taken. If the other ferry does not have the mes-
sage, the ferries forward their messages. If the ferries belong to the same zone they
follow the protocol for intra-zone routing mentioned above. The algorithm for mes-
sage exchange between ferries is presented in Figure 4.

3 Evaluation
The evaluation is done in a simulation environment with a special Delay Tolerant
Network simulator called Opportunistic Network Environment (ONE)[3], developed
by researchers at the Helsinki Institute of Technology. As the simulation environment
the placement of the zones and the membership of individual zones (i.e. the ordi-
nary nodes and the ferries) needed to be consistent for comparison with other proto-
cols, we develop multiple standard network topologies. This enables us to record
repeatable and fair measurements during the evaluation of the ZBMF protocol.

3.1 The Simulation Environment

We compared ZBMF routing protocol to four traditional routing schemes, enumerated


as follows:
i. Epidemic Routing: The basic routing protocol that simply floods a message
that is generated by an ordinary node. The ferries transmit the message
ZBMF: Zone Based Message Ferrying for DTNs 53

without any sense of direction or path, to all the ferries/ordinary nodes that
fall in its movement path. Expectedly, the protocol has a disadvantage of
extremely high message overhead but it makes up for it with its high delivery
ratio [4].
ii. PROPHET Routing: The PROPHET scheme is a novel probabilistic
routing scheme whose algorithm relies on calculation of delivery
predictability to forward messages to the reliable node. The probability is
used to decide if one node is reliable to forward message to. The routing has
lesser overhead than epidemic scheme but the actual values of probabilities
are crucial for success of the scheme [5].
iii. MaxProp Routing: MaxProp is based on prioritizing both the schedule of
packets transmitted to other peers and the schedule of packets to be dropped.
These priorities are based on the path likelihoods to peers according to
historical data and also on several complementary mechanisms, including
acknowledgments, a head-start for new packets, and lists of previous
intermediaries [6].
iv. Spray and Wait Routing: A scheme that sprays a fixed number of copies
into the network, and then waits till one of these nodes meets the
destination. This scheme expectedly needs to be tuned for each topology
because the number of sprayings have to be sufficient enough for giving
the scheme a reasonable chance to deliver the message packet [7].

3.2 Simulation Parameters

The simulation environment as explained above involves setting up network topolo-


gies, each of whose zones are connected to each other by overlapping zone bounda-
ries. The ordinary nodes in the network have a transmit range of 50m and the message
ferries have a transmit range of 100m. The messages distributed across the network
are consistent for all the routing protocols and in all topologies. The message TTLs
(time to live) are set at 2000mSecs and the simulations are run for 20,000mSecs.
There are two ferry speeds for each simulation to measure the effects of ferry speed to
the simulation results.

3.3 Simulation Metrics

Delivery Success Percentage: The percentage of the messages delivered success-


fully to the intended destination station, among the total message events gener-
ated.
Overhead Ratio: The ratio of the number of the overhead messages generated
while transmitting messages among nodes/ferries to the total number of generated
messages.
Median Latency: The time delay between the generation of a message event and
the delivery of the message is called the latency of the message event. The me-
dian latency for the simulation is the median of all the latency values generated
across the simulation.
54 B.K. Polat et al.

Fig. 5. Delivery Success Ratio for Slow Ferry Speed

3.4 Simulation Results

The delivery success ratio graph Fig. 5 indicates that the success ratio for ZBMF is
almost equally as high as Epidemic routing, PROPHET routing and MaxProp Routing
schemes because the ZBMF Routing scheme uses message ferrying zones that are con-
nected through overlapping nodes and boundaries. The topology 4 consists of a straight
line of zones and thus the PROPHET scheme does not do well here as it relies on prob-
abilities to route messages through zones, which drop to low values as the network
extends in one direction. Spray and Wait routing scheme gives a success percentage of
nearly 25 % because the number of sprayings have a drastic effect on the delivery suc-
cess. In Spray and Wait, the number of copies that a message can have, is a fixed num-
ber. If this number is too small, the messages have a lower chance of delivery. On the
other hand if the number of allowed copies is too big, then the schemes suffer from high
overhead. The graph Fig. 6 for the delivery success ratio when the speed of the ferries is
set to fast is almost identical to the previous graph. This is because the message TTL
throughout the simulations is set to a generous value (i.e. 2000 mSecs) as compared to
the total time of simulation. This ensures that as long as the packets are not lost or are
not forwarded to zones with no path to the destination zone, the packets have a reason-
able chance of getting to the destination node. The fast ferry results in only a small
improvement of the success ratio for the five routing protocols. The graph Fig. 7 shows
the most beneficial aspect of the ZBMF Routing protocol. We see that the Epidemic
Routing scheme has a mean overhead of 65, which is in line with expectations because
Epidemic Routing simply floods the message across all nodes and ferries. Prophet and
MaxProp are probabilistic based routing schemes and thus they have a significant de-
crease in the overhead as compared to Epidemic Scheme because they do not flood the
packet across all nodes but transmit packets based on probability values. Spray and Wait
protocol has a low overhead because the algorithm restricts the number of packets that
can be replicated which decreases the number of successfully delivered messages. The
ferries transmit copies of packets to nearby ferries and try to deliver the last replicated
packet themselves; but since very few of the ferries can travel to the transmission range
of the destination, the ferries fail to deliver the last packet to the destinations most of
the time. As a result, the Spray and Wait protocols low overhead has to be taken in
correct perspective the restriction in number of packets translates into lower number
ZBMF: Zone Based Message Ferrying for DTNs 55

of overhead messages and hence lower overhead but also into significant loss of deliv-
ery success. The overhead in the fast ferry case is slightly more for epidemic routing,
and that is because the ferries transmit more packets per simulation and hence the over-
head rises. The Spray and Wait and the probabilistic protocols remain unaffected by
ferry speed, as expected, due to their methodology of selective transmissions as opposed
to the flooding approach taken by Epidemic Routing.

Fig. 6. Delivery Success Ratio for Fast Ferry Speed

Fig. 7. Overhead Ratio for Slow Ferry Speed

The mean latency graph of Figure 9 plots the median latency values of successful
packets for all topologies and routing protocols. Our ZBMF Protocol has the second
lowest median latency for all the topologies. This can be attributed to the sense of
direction and shortest zone path approach that our protocol implements which result
in faster delivery of packets, hence decreasing the median latency. It is interesting to
note that the Spray and Wait protocol has the minimum median latency values among
all the protocols. This can be explained on the basis of the restriction on the number
of packets allowed for replication. The replicated packets are either delivered to the
destination zone by the ferries or they are dropped because more replications cannot
be permitted. Thus, the only successful packets are the ones lying in close proximity
of the originating node and hence, the lower latency. The median latency graph for
56 B.K. Polat et al.

the fast ferry case is essentially unchanged except for the SpraynWait protocol where,
due to the limited number of the successful packets, the ferry speed change results in
significant savings in time taken.

Fig. 8. Overhead Ratio for Fast Ferry Speed latency

Fig. 9. Median Latency for Slow Ferry Speed

Fig. 10. Median Latency for Slow Ferry Speed


ZBMF: Zone Based Message Ferrying for DTNs 57

4 Zone Setup for ZBMF


ZBMF uses zone information to route messages in the network. Successful creation of
overlapped zones through which messages propagate from one zone to another is
crucial for ZBMF. For this purpose we have devised scheme that sets up the zones in
the network space for ZBMF. The detailed implementation of this scheme is left as
future work. The scheme for setting up the zones in the network for ZBMF introduces
only a one-time delay and overhead at the beginning. Once the set up is completed,
ZBMF performs routing and no additional overhead is introduced. Our scheme also
requires that we know the geographic coordinates of the static ordinary nodes in the
network. The position information of the static ordinary nodes is accumulated at a
central node which receives feedback from the mobile ferries about the frequencies
with which they detect the static ordinary nodes. This information is used to divide
the network space into zones and designate ferries to each zone based on their fre-
quency of detecting other nodes. All the nodes flood the network with the Mac ad-
dresses of all the nodes they detect. After a certain settling period P, whose length
depends on the size of the network all the nodes learn the Mac Address of all the other
nodes in the network. The static ordinary node with the smallest Mac Address be-
comes the central node CN. Since CN has the smallest Mac Address it is automati-
cally recognized by all the nodes in the network as the central node. Once the CN is
configured, the ferries send the geographical coordinates of the static ordinary nodes
they detect to the CN. Once the coordinates of the static ordinary nodes are known,
the CN makes an initial setup of zones and groups the static ordinary nodes into
zones. The initial set up of the zones is dependent solely on the coordinates of the
static ordinary nodes. The CN gathers detection frequency messages from the ferries.
These messages contain the Mac addresses of the static ordinary nodes ferries detect
and the ferries' frequency of detecting them. The CN then gathers this information and
makes the necessary modifications to the zone boundaries if necessary so that no
single zone is left without an assigned ferry. After this set up is complete each static
ordinary node and ferry is assigned to a zone. This assignment is based on the fact
that if a ferry visits a specified percentage (node visit percentage np%) of the nodes
in a zone with a specified frequency (node visit frequency threshold nv%), then the
ferry is assigned to that zone. And finally this information is flooded to the network

Ferry ID Ordinary Node Detected Frequency of


detection

Mac Address of Ord. Node 1 24


Mac
Address of ... ...
Ferry n
Mac Address of Ord. Node n 17

Fig. 11. Data collected by Central Node for each ferry for the predetermined duration T
58 B.K. Polat et al.

for all nodes. At the end of this procedure, all the nodes and the ferries in the network
get forwarded the zone configuration of the network and extract their respective zones
and zone routing tables.

5 Conclusion
We proposed a new routing protocol for Disruption Tolerant Networks. Zone based
message ferrying protocol (ZBMF) combines the concept of message ferrying with
traditional DTN routing protocols. ZBMF routes messages through shortest paths of
ferry zones. The major motivation behind ZBMF is to effectively route messages in one
direction through the shortest paths possible. We then implemented and evaluated the
performance of ZBMF. Our evaluations demonstrate the potential of the ZBMF Proto-
col, especially its ability to use routing tables which leads to decrease in the overhead
and latency while still maintaining a high rate of delivery success. ZBMF allows mobile
nodes, designated as message ferries, to operate freely in their zones while carrying
messages among disrupted nodes. This way no special configuration or strict schedule is
needed for message ferries. After the initial zone setup overhead, ZBMF routes mes-
sages with very low overhead through shortest paths. These features enable ZBMF
protocol to be applied to large networks with low costs and render high performance.

6 Future Work
The current work is done under the assumption of connected zones in the topology.
But to partition the challenged network into zones is a nontrivial problem, as dis-
cussed in previous sections. The zone partitioning is important because we can con-
struct the routing tables only on the basis of the relative placement of the zones. One
such method could be electing a central node which would collect information from
the individual nodes and then partition them into zones. There are multiple issues that
need to be taken care of in this scenario as discussed in the section for Zone Setup.
We also found that the performance of the ZBMF Protocol cannot be compared in the
true sense with the SpraynWait Protocol without modifying the latters parameters,
like number of allowed replications, according to the actual topology in each case. We
also need to understand the possible effect of the changes in the environment vari-
ables that were kept consistent till now, like the transmit ranges of the ferries and
zones, the message TTL, scalability problems in case of large number of zones. In the
future, we would like to perform further analysis of the effects of these factors also.
Acknowledgments. We acknowledge the supervision provided to us by Dr. Mostafa
Ammar during the course of this project. His valued advice enabled us to achieve
valuable results and a new scheme for DTNs.

References
1. Fall, K.: A Delay-Tolerant Network Architecture for Challenged Internets, Applications,
Technologies, Architectures, and Protocols for Computer Communication. In: Proceedings
of the 2003 conference on Applications, technologies, architectures, and protocols for com-
puter communications, Karlsruhe, Germany, pp. 2734 (2003)
ZBMF: Zone Based Message Ferrying for DTNs 59

2. Zhao, W., Ammar, M., Zegura, E.: A Message Ferrying Approach for Data Delivery in
Sparse Mobile Ad Hoc Networks International Symposium on Mobile Ad Hoc Networking
and Computing. In: Proceedings of the 5th ACM international symposium on Mobile ad hoc
networking and computing, Roppongi Hills, Tokyo, Japan, pp. 187198 (2004)
3. Opportunistic Network Environment (ONE), Delay-tolerant Networking at TKK Netlab
Helsinki Institute of Technology,
http://www.netlab.hut.fi/~jo/dtn/index.html#one
4. Vahdat, A., Becker, D.: Epidemic routing for partially connected ad hoc networks. Duke
University (2000)
5. Lindgren, A., Doria, A., Schelen, O.: Probabilistic routing in intermittently connected net-
works. ACM SIGMOBILE Mobile Computing and Communications Review (2003)
6. Burgess, J., Gallagher, B., Jensen, D., Levine, B.N.: MaxProp: Routing for Vehicle-Based
Disruption-Tolerant Networks. In: Proc. IEEE INFOCOM (April 2006)
7. Spyropoulos, T., Psounis, K., Raghavendra, C.S.: Spray and wait: An Efficient Routing
Scheme for Intermittently connected mobile networks. In: Proceedings of the 2005 ACM
SIGCOMM workshop on Applications, Technologies, Architectures, and Protocols for
Computer Communication Delay-tolerant networking (2005)
Improving Query Mechanisms for Unstructured
Peer-to-Peer Networks

Guangwei Fang1 and Xiao Zheng2


1
College of Mathematics and Computer Science, Yichun University,
336000 Yichun, China
fanggw@263.net
2
School of Computer Science, Anhui University of Technology,
243002 Maanshan, China
zhx@ahut.edu.cn

Abstract. For unstructured peer-to-peer networks, one of the meth-


ods for improving query performance is informed routing. This paper
introduces an ant-like query routing method. This method views query
messages as articial ants, and utilizes pheromone as routing hints which
direct transferring query messages to nodes owning more resources. This
paper presents generation and update rule of pheromone, routing pol-
icy for articial ants as well. In addition, entering and quitting of nodes
is considered. Our method supports high scalability and is suitable for
large-scale P2P networks. By adjusting the number of articial ants dy-
namically, a better tradeo between cost and quality could be achieved.

Keywords: P2P networks, query routing, ant colony system.

1 Introduction
Peer-to-peer(P2P) networks have been extensively studied in recent years. One
of the key open challenges is the organization and resources query mechanism in
P2P systems. Although with the advantage of faster query, structured P2P net-
works are extremely sensitive to nodes entering and leaving the system. However
for unstructured P2P networks there is no limit on data placement, and extra
process on peers entering and leaving, e.g. resources migration, and position re-
calculation of resources store, it have availability, and suits dynamic computing
situation. In unstructured P2P networks, topology and query routing are the
key factors of inuencing query performance, in which topology impact average
search delay, whereas the latter impact recall and cost. The query routing tech-
nique can be categorized to blinded search and informed search. Blinded search
is adopted by earlier unstructured P2P networks, e.g. Gnutella, the largest open
P2P network in operation, uses a breadth-rst search (BFS) traversal with depth
limit to forward query messages. Henceforth some improved algorithms appear.
But these algorithms have less eciency, which produce more load and useless
query messages.
This paper introduces a pheromone based search technique inspired by the
ant colony algorithm. Dierent from the methods of building an index over the

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 6067, 2009.



c ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
Improving Query Mechanisms for Unstructured Peer-to-Peer Networks 61

data of all neighbors or limiting the message ooding scope to improve search
performance, our technique puts dierent amount of pheromone on dierent
nodes in terms of success ratio of visiting resources, which makes the nodes
owning more resources receive more query.

2 Related Work

Search approaches of unstructured P2P networks can be categorized to blinded


search and informed search. Gnutella uses blinded ooding-based query algo-
rithm to forward messages. Although with the advantages of simple implemen-
tation mechanism, quick response speed, and suiting dynamic joining and leaving
of nodes, the algorithm is not extremely scalable. Each individual query gener-
ates a large amount of trac so that large systems quickly become overwhelmed
by the queryinduced load.
Reference [2] improved Gnutellas algorithm and presented a Random Walk al-
gorithm. In Random Walk, each node forwards k query messages, called walkers,
randomly to its k neighbors. Random Walk can adjust the number of walkers in
order to achieve a trade-o between search success ratio and the number of mes-
sages. This algorithm has become a most popular message forward method. For
example, Freenet uses this algorithm with k=1[9]. Each node forwards the query
to a single neighbor, and waits for a denite response from the neighbor before
forwarding the query to another neighbor (if the query was not satised), or for-
warding results back to the query source (if the query was satised). Reference
[4] studies the behavior of Random Walk in a power-law random network. They
found when selecting those nodes with more degrees, better search performance
will be got. However its scalability lowers because almost every query message
will be sent to these nodes, which will burden almost all query messages in the
whole network.
These improvements of messages forward methods only make the number of
message under control, and can not increase search eciency. Because every node
has the same probability of being queried, the capability of successful search lies
on the density of resources replica and the number of traversed nodes. Beverly
Yang etc. suggested a Directed BFS technique and a Local indices technique
which are dierent from the blinded search mentioned above [1]. The Directed
BFS technique queries a restricted set of nodes intelligently selected to maximize
the probability that the query will be answered. If we allow nodes to answer
queries on behalf of other nodes, then we can still reduce the number of nodes
that process a query without decreasing the number of results. In the Local
Indices technique, each node n maintains an index over the data of all nodes
within r hops of itself, where r is a system-wide variable known as the radius of
the index. When a node receives a Query message, it can process the query on
behalf of every node within r hops. In this way, the data of many nodes can be
searched by processing the query at few nodes, thereby maintaining satisfaction
and number of results while keeping costs low. When r is small, the amount
of metadata a node must index is also quite small. As a result, Local Indices
62 G. Fang and X. Zheng

with small r should be easily accepted by a loosely controlled system such as


Gnutella.
Our work aims to search technique in decentralized and unstructured P2P
networks[2]. The presented algorithm uses accumulated pheromone, which can
be seen as experience, in previous searching to direct latest query routing. Each
node need not maintain the index over the data of its neighbors, which lowers
the cost of maintenance.

3 Problem Descriptions
An unstructured P2P network is dened as a quaternion G(V,E,R,C ). V is a set
of nodes denoting all peers in the network, i.e. V = {vi |i = 1, ..., n}. E is a set
of edges which denotes all links in the network, i.e. E = {eij (vi , vj )|vi , vj V }.
R = ri |i = 1, ..., m is a resources set denoting all shared resources in G. C is a
resource function C : V P(R) that maps each node v V to a subset of R,
where P (R) is a power set of R.
The problem is that given such G and r R, how to solve V subject to
V = {v|r C(v) v V }. In a real decentralized and unstructured P2P
network, nobody knows the global distribution of shared resources, and there is
not any index of whole resources, still not any position information of resources.
Accordingly, the algorithm must traverse nodes in terms of a certain policy, i.e.
form a source node, along edges to reach some nodes, and query the wanted
resource r at having reached nodes. When the number of elements in set V is
very large, a tradeo must be made between recall and cost, e.g. response time,
load. A well-performed algorithm should strike a best recall in a desired response
time, and simultaneously make use of fewer query messages, to lower the load
in the network as possible.

4 Algorithm Design
The principle of Ant Colony System(ACS) is using pheromone remaining in paths
to indirectly exchange the path information among ants, whereas the pheromone
is deposited by former ants passing by the path, which represents some experi-
ence knowledge.
Several researchers have attempted to use the notion of ant pheromones for
network-routing mechanisms [3] and service selection technique [6]. Our work
also follows these thoughts. For each query request, n query messages are gener-
ated. A query message is corresponding to an articial ant. An articial ant se-
lects a neighbor as its next hop in terms of a routing rule, and deposits pheromone
on its passing path. When nding the wanted resource on a certain node, it will
update the pheromone around the node. Moreover, the pheromone remaining on
the path will volatilize in period. Every articial ant has a life-span. When its
life value is zero, the message denoted by an articial ant will be abandoned.
The objective of articial ants is to nd target resources as many as possible
under certain restricted conditions.
Improving Query Mechanisms for Unstructured Peer-to-Peer Networks 63

4.1 Message Routing


In order to transfer the message after it reaching a node, every node should
maintain a message routing table. Simulating the behaviors of ants, our algo-
rithm uses pheromone to direct the routing of query messages. Accordingly, the
message routing table is a pheromone table composed of pheromone on the paths
to neighbors. The item of a pheromone table is composed of a neighbor node
and the pheromone on the relative path, i.e. a pair (n, p). n denotes a neigh-
bor node, p denotes pheromone amount, and (n, p) denotes that the amount of
pheromone on the path from the local node to neighbor n is p. The amount of
pheromone represents the success ratio of previous searching at the correspond-
ing node. More pheromone denotes higher success ratio. Therefore transferring
query messages to the node having more pheromone may get a higher searching
ratio.
The probability of articial ant k selecting neighbor j at node i is dened as


ph(i, j)
, j J(i) T abu(k)
pk (i, j) = uJ(i)T abu(k) ph(i, u) (1)

0, j  J(i) T abu(k)

where J (i) is a set of neighbors of node i, Tabu(k ) represents the nodes ant k
having passed, and ph(i, j ) represents the pheromone amount on the path from
i to j.

4.2 Pheromone Generation and Updating


Pheromone plays an important role in ant routing which directs ant selecting
path. So how to generate and update pheromone is a key to inuence algorithm
performance. In ACS, local update and global update rule is applied to update
the pheromone on the selected path. Our pheromone update policy based on the
principle of ACS and makes some improvement to satisfy search requirements.
In the following cases, pheromone will be generated and updated.
1) As behaviors of real ants, query messages will deposit pheromone on the
path passed by. Its new generated pheromone will add to the pheromone re-
mained on the path. Let phi (j) denote the pheromone amount on the path from
local node i to neighbor j,
phi (j) = phi (j) + (1 )p1 (2)
where (0, 1) and p1 is a constant.
2) When having found a target resource at node n, the articial ant would
diuse pheromone to all neighbors. Receiving a message of pheromone update,
a neighbor would update pheromone in the corresponding item of its pheromone
table.
phm (n) = phm (n) + (1 )p2 , m J(n) (3)
where (0, 1) and p2 is a constant. m belongs to a set of neighbors of
nodes n.
64 G. Fang and X. Zheng

3) For each item of the pheromone table, an update will be done in period.

phi (n) = phi (n), with (0, 1) (4)

Thus if a neighbor is not always visited, its pheromone amount will be closer
to zero. Such node may lack resources.
4) If a new node enters the P2P network, it will send update messages to all its
neighbors. In this situation, update neighbors pheromone table by formula (3).
5) If a new resource is stored in a node, it will send update messages to all its
neighbors. In this situation, update neighbors pheromone table by formula (3).

4.3 Life-Span Control


Every articial ant has life-span, which could control the number of messages
transmitting in the network and insure there are no more messages transferred
ceaselessly after a query is over. Our algorithm also uses TTL to control messages
life-span. When an articial ant is created, its TTL will be set to an initial value.
Hereafter at each passing node, the articial ants TTL will be updated. If not
found target resources, the ant will decrease TTL. If found, nothing done so that
the ant will visit more nodes. If all neighbors are in the tabu table, the ants TTL
will be set to zero. So the ant will not be transferred and be killed. When ant k
at node i, the updating rule of its TTL is:

T T L(k) 1, r(k)  C(i)

T T L(k) = T T L(k), r(k) C(i) (5)


0, J(i) T abu(k)

where r (k ) represents the target resource of ant k wanting to nd. C (i) is the
set of resources at node i. Based on above discussion, pheromone table update
algorithm and message routing algorithm are described as follows.
Algorithm 1. Pheromone table updating algorithm. This algorithm runs after
receiving a pheromone update message. In the following function, the parameter
node1 is a local node, and node2 is a neighbor.
UpdatePheromone(Node node1,node2,float Ph) {
node1.PheromoneTable[node2]=blta*
node1.PheromoneTable[node2]+(1-blta)*Ph }
Algorithm 2. Message routing algorithm.
boolean HandleMessage (Message Query, Node node) {
//if the node ever received this message, delete it.
if SeenMessage(Query,node)
return false;
//if there is not a target resource, the TTL minus 1,
otherwise diffuse pheromone to neighbors.
if (NoFound)
Improving Query Mechanisms for Unstructured Peer-to-Peer Networks 65

Query.TTL--;
else
for any nodei in node.neighbour
UpdatePheromone(node,nodei,ph);
//if all neighbor have been visited, set TTL to zero.
if (all node.neighbour in Query.Tabu)
Query.TTL=0;
//if TTL equals to zero, delete the message.
if(Query.TTL==0)
return false;
//Select a neighbor by (1), and forward the message to it.
nextnode=SelectNextNode(node.neighbour);
ForwardMessage(Query,
nextnode);
//Update the pheromone on the path to the next hop by (2).
Update(node,nextnode);
return true; }

5 Simulations and Performance Analysis

NeuroGrid simulator[7] is a Java based P2P network simulator which emphasizes


particularly on simulation and performance test for unstructured P2P network.
The simulator is initially used to evaluate a search algorithm called NeuroGrid[8].
Due to open source software, it also can be modied to simulate user designed
search algorithm. This paper uses NeuroGrid simulator to simulate our algorithm
and evaluate its performance. The parameters in our experiment are showed in
table 1. The experiment selects randomly a node as a start node of query. Each
experiment consists of at least 10000 trials and the average values are calculated
as experiment results subsequently. This paper implements two experiments. Ex-
periment 1 calculates our algorithms search eciency and the total number of
query messages under the variety number of ants. Search eciency is the ratio
of the number of success search to the total. The number of messages could
impact network load and computing resource in node. Under the precondition

Table 1. Comparison of four algorithms

Conguration Value
No. of nodes 1000
No. of documents 1000
No. of keys 100
No. of neighbors at each node 4
No. of documents at each node 4
No. of keys at each document 2
Initial TTL 7
66 G. Fang and X. Zheng

600 60

Search efficiency (%)


500 50
No.of messages

400 40
300 30
200 20

100 10
0 0
5 10 15 20 25 30 5 10 15 20 25 30
No. of ants No. of ants

Fig. 1. Relation between the number of ants and algorithm performance

Table 2. Comparison of four algorithms

Metrics Our algorithm NeuroGrid Gnutella Random Walk


Average path length 5.3 4.6 6.3 9.2
No. of messages 320 154 5400 2650

of ensuring search eciency, decreasing the number of messages is an eective


method of improving search performance. Fig.1 shows that with the number of
ants increasing, search eciency will properly increase, but the number of mes-
sages will increase simultaneously. Experiment 2 compares our algorithm with
other classical algorithms, e.g. NeuroGrid, Gnutella and Random Walk. Two
metrics are calculated that are average path length and the number of messages.
As showed in Table 2, the performance of our algorithm is little less than Neu-
roGrid. However, each NeuroGrid node maintains a knowledge base that stores
associations between keywords and other NeuroGrid nodes. The maintenance of
NeuroGrid nodes is higher than the node in our algorithm.

6 Conclusion and Future Work

This paper presents an ant-like query message routing mechanism for resource
searching in unstructured P2P networks. The method views the query message
as an articial ant, and target resources as the food ants wanting to search.
The algorithm utilizes pheromone as heuristic information that direct an ant to
select the next hop. By adjust the number of ants, a better tradeo between
search eciency and cost could be achieved. Moreover, the maintenance of a
pheromone table is small, which hardly increase the burden of the resident node.
The future works are improving our algorithm and farther performance analysis,
which include the update policy of pheromone, the relation between the number
of ants and the size of a network, and the adaptation while node entering and
leaving frequently.
Improving Query Mechanisms for Unstructured Peer-to-Peer Networks 67

Acknowledgments. This work is supported by Natural Science Research Foun-


dation of Anhui Higher Education, China under Grants No. KJ2007B308ZC.

References
1. Yang, B., Garcia-Molina, H.: Improving Search in Peer-to-Peer Networks. In: Proc.
of the 22nd International Conf. on Distributed Computing Systems (ICDCS 2002),
pp. 514. IEEE Press, New York (2002)
2. Lv, Q., Cao, P., Cohen, E., Li, K., Shenker, S.: Search and replication in unstructured
peer-to peer-networks. In: Proc. of the Intl. Conf. on Measurements and Modeling
of Computer Systems, pp. 8495 (2002)
3. Dorigo, M., Gambardella, L.M.: Ant Colony System: A Cooperative Learning Ap-
proach to the Traveling Salesman Problem. IEEE transactions on evolutionary com-
putation 1, 5366 (1997)
4. Adamic, L.A., Lukose, R.M., Puniyani, A.R., Huberman, B.A.: Search in power law
networks. Phys. Rev. E 64, 4613546143 (2001)
5. Caro, G.D., Dorigo, M.: AntNet: Distributed stigmergetic control for communica-
tions networks. J. of Articial Intelligence Research 9, 317365 (1998)
6. Zheng, X.: Ant Colony Intelligence Based Solution for Grid Services Selection. In:
Proc. of the 7th World Congress on Intelligent Control and Automation, pp. 2512
2517. IEEE Press, New York (2008)
7. Joseph, S.: An Extendible Open Source P2P Simulator. P2P Journal, 115 (2003)
8. Joseph, S., Hoshiai, T.: Decentralized Meta-Data Strategies: Eective Peer-to-Peer
Search. IEICE Trans. on Communications 6, 17401753 (2003)
9. Freenet website, http://freenet.sourceforge.net
Layer 3 Initialization Procedures Recommendation in
Next Generation Heterogeneous Mobile Networks

Sebastio Boanerges Ribeiro Junior and Paulo Roberto de Lira Gondim

Departamento de Engenharia Eltrica, Universidade de Braslia, Brazil


sbribeiro@brasiltelecom.com.br, pgondim@ene.unb.br

Abstract. The wireless service evolution in conjunction with widespread of


WLAN and multi-mode handsets combining heterogeneous radio technologies
leads to the possibility of ubiquitous service access thru the offer of vertical
handovers techniques. This combined with the multimedia real time service of-
fering unveils the need of new layer 3 initialization procedures in order to keep
the end-user quality and experience. This paper presents an evaluation of the
network connection procedures against the requirements of the new scenario
and proposes changes to the standards.

Keywords: Heterogeneous wireless networks, vertical handover, multimedia


real time service, Wi-Fi association and authentication, layer 3 initialization.

1 Introduction
The technological evolution being seen in the last years lead us to the offer of distinct
high speed packet data access networks suitable for real time multimedia services and
to the availability of multi-mode handsets in the manner to allow the ubiquitous con-
nection on those access networks. In conjunction with the broad acceptation of Ses-
sion Initiation Protocol (SIP) [1] based service and web based services, this evolution
is addressing the end user wish for seamless connection regardless the location and
network being used.
The end user need of seamless connection means that the systems, which can be
triggered by the network or by the handset, automatically choose the best access net-
work based on the better balance of predefined requirements like cost, speed and
quality of service (QoS). Nowadays, the most comprehensive packet data access
network regarding cost and data rate are those based on IEEE 802.11 (Wi-Fi) [2].
Seamless wireless multimedia communication data service that wants to get all the
advantages of the surrounding access networks must be able to execute simple, fast
and effective handover procedure between different accesses networks technologies,
this kind of handovers are best known on the literature as vertical handover [3].
In the case of WWAN (e.g. GSM and UMTS networks) to WLAN (e.g. IEEE
802.11) handover the traditional mechanisms based on quality connection monitoring
in the physical layer cannot be applied due to the WLAN coverage is include in the
WWAN coverage in most of the cases [4]. As the handset is able to scan the available
WLAN networks even when the WWAN signal quality is acceptable, the handover

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 6875, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
Layer 3 Initialization Procedures Recommendation 69

decision to WLAN is handset based. Emerging standards like IEEE 802.21 Media
Independent Handover are introducing network mechanisms to facilitate the vertical
handover decision based on information services supplied from the network to the
handsets [5].
Fast handover is crucial for multimedia real time services mainly when in the di-
rection WWAN (e.g. WCDMA or GSM networks) to WLAN (e.g. IEEE 802.11g/Wi-
Fi), due to the better data rate and cost per bit offered in the existing Wi-Fi networks
[4]. The better end user experience in this case could be gotten as a measurement of
the overall handover delay to WLAN, as this means more data rate availability and
less cost using the service. In this paper, the overall handover delay is considered as
the time used by the system with all the process related to the switch between the
radio bearer access networks until the complete availability of the new wireless net-
work to the service being requested by the end user.
However the process of handover can cause disruption of the service being re-
quested due to the technology and limitations on the handset resources. The pace for
the broad acceptation of seamless vertical handover services is tied with the market
offer of low cost multi-mode handsets even if that means resource constrains. As
example, a handset can switch off the previous radio interface before initiate attach-
ment procedure to the new access network due to lack of resource to keep both radio
interface active or as a strategy to save battery life. The different characteristics be-
tween the heterogeneous access technologies can also cause delays due to the change
in the handset network interface configuration.
An important process that account for the overall handover delay and for the dis-
ruption of service is not related with the radio link itself, but with the network layer
(ISO/OSI layer 3) initialization procedures. These procedures are required by the
handset network interface to setup the Internet Protocol (IP) parameters for the ser-
vice connection to the new access network. The procedures include the network pa-
rameters retrieving from specific servers or from auto configuration methods and the
succeeding address duplication detection method. For each version of IP (aka IPv4
and IPv6) commercially available there are different layer 3 initialization procedures
and protocols defined for these procedures.
Measurements performed by the authors unveiled delay up to 1.5 seconds during
the layer 3 initialization procedures after Wi-Fi association in a handset with Win-
dows Mobile Operational System (OS). This delay value does not guarantee the Qual-
ity of Service (QoS) during multimedia real time data services offering, including
video streaming and video communication service. Literature recommendation stands
that interruption of the service should not be greater of 40 ms for voice service quality
during handover procedures [6].
The layer 3 initialization procedures were evaluated and the standards addressing
duplication detection methods found as the main offender in the delay measurements.
The standards used for address duplication detection are described in this paper and
suggestions for improvements aiming the target delay are proposed.
In this paper we describe the main general procedures that will take place during
the handset vertical handover to Wi-Fi networks before the service being requested
to be ready to re-connect to the operators data network. These procedures are the
authentication and association to the Wi-Fi network and the layer 3 initialization
and connection procedures in IPv4 and IPv6 networks. The results of the authors
70 S.B. Ribeiro Junior and P.R. de Lira Gondim

measurements of the layer 3 connection procedures are also described and we con-
clude with suggestions for standards improvement to the scenario under analysis.

2 IEEE 802.11i Authentication and Handover Delay

The original IEEE 802.11 specification defines two subtypes of authentication ser-
vices: Open System and Shared Key [2]. Open System authentication can be defined
as a null authentication algorithm. Shared Key authentication is based on the ex-
change of a shared secret key and can be used with the WEP (Wired Equivalent Pri-
vacy) privacy mechanism. Due to security weakness these security mechanisms were
superseded by the IEEE 802.11i [7], published as an amendment to the IEEE 802.11
standard.
The 802.11i architecture define the use of IEEE 802.1X [8] standard for authenti-
cation and a new mechanism for confidentiality and data integrity based on RSN
(Robust Security Network) together with the security protocols TKIP e CCMP [7].
Due to backward compatibility with the previous state machine the IEEE 802.11i
defines that an open system authentication shall be requested before the association
procedure to the selected Access Point (AP). With IEEE 802.11i, after a successful
association the mobile station supplicant initiates the IEEE 802.1X authentication and
the key management, in a different way, the original IEEE 802.11 standard allows the
exchange of data frames just after the association state.
The IEEE 802.1X requires the use of some EAP method for the authentication pro-
cedure. Heterogeneous WWAN and WLAN services based on dual mode handsets
look up to use EAP-SIM [9] or EAP-AKA [10] methods in view of the fact that they
are based on the pre-existing WWAN (GSM/WCDMA) credentials stored in the
SIM/USIM Card.
The use of WWAN pre-existing credentials allow seamless authentication of the
handset in any supported WLANs, but may introduce some extra delay due to the
authentication request being sent to the home WWAN.
After the successful IEEE 802.1X authentication, a method of security association
is established for the purpose of data confidentiality. The IEEE 802.11i defines also
key management procedures and protocols to provide fresh keys for the security asso-
ciation called 4-Way Handshake and Group Key Handshake. The Fig. 1 illustrates the
IEEE 802.11 operations described.
After a successful key management procedure the handset can start the network
connection, based on layer 3 procedures. As the handset IEEE 802.11 MAC manage-
ment entity had issued the primitive confirming the association success and had
changed the IEEE 802.11 state variable [2] (State 3: Authenticated, Associated) be-
fore the IEEE 802.1X procedures [7] the handset could also had started the disconnec-
tion from the previous access network. This means that any delay due to the network
connection procedures represents an interruption on the end-users service. Depend-
ing of the IP version supported, different procedures and different delay figures are
applied, as described in the sequence.
Layer 3 Initialization Procedures Recommendation 71

Fig. 1. IEEE 802.11 and 802.11i initial operations

3 Connection Procedures on IPV4 Networks


Most of the data networks deployed today is still based on the old IPv4 standards due
to the lack of ubiquitous end user clients/handset supporting to new IPv6 standards.
Another reason for the current scenario is that some of address space problems aimed
to be solved by IPv6 networks have been temporary postponed as result of the work
done in areas as NAT (Network Address Translation) and address allocation, as the
IETF Classless Inter-Domain Routing (CIDR) standard [11].
IPv4 dynamic connection procedures rely on DHCP [12] protocol. DHCP standard
defines that some method should be used by the client station to check that the IP
address granted is not in use by any other station or host. The industry chosen method
is based on the ARP (Address Resolution Protocol) protocol. The ARP protocol is an
IETF standard [13] defined to find a layer 2 address (e.g. Ethernet MAC address)
based on the layer 3 information (e.g. IP address). For address duplication detection
purposes a special ARP packet is send to the network with the same sender and target
72 S.B. Ribeiro Junior and P.R. de Lira Gondim

address, this packet is called Gratuitous ARP, and it is intended to not be replied.
Most of the handsets models and mobile computer OS deployed nowadays apply
three sequential Gratuitous ARP packets as its address duplication algorithm. The
issue with this approach is the timer value used. For each new Gratuitous ARP packet
sent the timer expiration time is multiplied by a factor (usually 2). As the success
criteria (i.e. no duplicate IP address informed) means no reply to the Gratuitous ARP
request, the address duplication detection means unnecessary delays due to the se-
quence of timer expired requests. Commercial handset with Windows Mobile OS
implementations can show delays as long as 1.5 seconds as result of the sequential
Gratuitous ARP timer expiration, which means an unacceptable delay in the case of
seamless handover of multimedia communication applications.

Fig. 2. IPv4 Layer 3 Initialization Procedure

The authors did performed measurements on different handsets and operational


systems of the delay to execute the network connection procedures. In the worst case,
which was verified with Windows Mobile 5 OS, the handset sent out three Gratuitous
ARP request, the first immediately after the DHCP procedures (DHCP ACK), the
second after 0.5 seconds and the third one was delivered one second after the second
packet, as show in Fig. 2. This means a delay of 1.5 seconds between the first and the
Layer 3 Initialization Procedures Recommendation 73

third Gratuitous ARP packet just to check whether IP address defined to the interface
was duplicated. The best case was seamed in a GF210 handset from UTStarcom with
proprietary OS and means transmit of two gratuitous ARP packets with 0.5 seconds
delay between them. Even in the best case verified the delay due the ARP expiration
timer is not appropriated for the offer of multimedia real time services during the
vertical handover procedure.
There is no definition in the ARP standard [13] about the complete procedure for
the use of gratuitous ARP to address detection. The authors believe that there is no
room for modifications on this old protocol to define a new procedure for multimedia
real time vertical handover services, and the best approach should be based on the
introduction of the new IPv6 protocols suite.

4 Connection Procedures on IPV6 Networks


The IPv6 standard definition came up with a complete set of new protocols to deal
with the IP address configuration issues. The IPv6 suite defines two different standard
methods for network layer configuration: the autoconfiguration [14] defines a state-
less IP address configuration procedure based on the subnet prefix and a local identi-
fier [15] and the DHCPv6 [16] defines a stateful protocol and procedure for address
and network configuration of host interfaces.
All IPv6 configuration methods shall use a duplicate address detection (DAD) algo-
rithm [15], defined in [14], which is based on the use of the new ND (Neighbor Dis-
covery) protocol [17]. In a similar way that ARP protocol implementation, the DAD
algorithm defines that the station shall sends packets containing Neighbor Solicitation
messages with source address as undefined address and target address as the tentative
address (IP address candidate to the interface). The lack of response after a timer expi-
ration delay means that the tentative address can be assigned to the interface.
Even on this new protocol the retransmission timer proposed in the original stan-
dard (RETRANS_TIMER) is set to 1,000 milliseconds [17], an extremely long time
period for multimedia real time communications. In the same standard document is
recognized that the proposed timer values can be overridden by specific changes re-
garding the link layer being used.
Considering the proposed scenario of multimedia real time service during
vertical handover procedure the authors recommend that the ND protocol timers [17]
should be reviewed and updated. This review should include the retransmission
timer (RETRANS_TIMER) and the maximum retransmission solicitation delay
(MAX_RTR_SOLICITATION_DELAY) values, based on the expected maximum
disruption time the specified service should support to keep the end user experience
and quality of the service.
The MAX_RTR_SOLICITATION_DELAY is defined as the upper limit value for
the random time delay selected by the node before send the first message after an
interface initialization. The purpose is alleviating possible congestion problem when
many interfaces are initialized at the same time, such as after a general failure. This
principle could be applicable in the vertical handover scenario proposed due to the
possibility of a group of handsets arrive at the same time under the coverage of the
same AP or WLAN hotspot. But, the value defined in the original standard (1,000 ms)
74 S.B. Ribeiro Junior and P.R. de Lira Gondim

must be updated to attend the proposed scenario and because the AP coverage limits
the amount of concurrent handsets associations.
For voice services the European Telecommunications Standards Institute (ETSI)
recommends a maximum interruption of the service during handover of 40 ms [6].
This figure was recommended for horizontal WWAN handover scenario with
switched voice service, but should be considered as indicative of the tight boundaries
necessary for multimedia real time communications services.
The simple redefinition of the timer expiration time to meet the expected maximum
disruption time could not be feasible for all the deployed Wi-Fi hotspots, due to the
packet travel time in hotspot subnet. The adoption of smaller value for retransmission
timer should lead to a re-engineering of the hotspots network to guarantee that the ND
solicitations messages will be processed by all the nodes before the expiration timer.

5 Conclusion
The broad availability of different high speed wireless data access technologies cover-
ing the same region, together with the evolution of multi-mode handsets and the end
user demands of multimedia real time service has lead to the development of novel
seamless vertical handover methods. This paper scenario is based on vertical hand-
over procedure from WWAN to Wi-Fi access network. The attractiveness of this
scenario is due to the higher data rate and lower cost per bit provided by WLANs
networks in comparison with commercial WWANs (e.g. GSM and UMTS networks).
These new vertical handover scenarios should push the review of handset authenti-
cation and layer 3 connection procedures, in the manner to address the new require-
ments of connection delays and quality of service while the handover is being
executed. The main issue is regarding the disruption of service during a multimedia
real time communication service, as per example a video conference. As the network
layer 3 connection procedures are executed after the Wi-Fi association and authentica-
tion, i.e. the link layer handover, any delay on it affect directly the service quality and
the end-user experience.
We have shown that the duplicate address detection mechanism is the main cause
of the network connection delay, as result of its success criteria based on timer expira-
tion request.
The IPv4 networks offer no method to avoid disruption of the service during the
network connection on the proposed scenario. The methods being utilized account up
to 1.5seconds delay and the authors recommendation is to upgrade the existing ac-
cess networks and handset to support IPv6 for the evaluated scenario.
With the development of IPv6 protocol suite, novel methods for IP configuration
and for duplicate address detection were also developed. Even with these new meth-
ods the published standard defines a long timer limit value which is not feasible for
multimedia real time service handover delay. The authors recommendation for IPv6
Wi-Fi based networks subjected to vertical handover is the definition of new tighter
timer values for the ND protocol specific variables. The use of tighter timer limits
should also be accompanied with engineering guidelines for the local area networks
serving the Wi-Fi hotspots, as extensive hotspots networks can not be able to respond
timely the duplicate address detection requests.
Layer 3 Initialization Procedures Recommendation 75

The adoption of the recommendations of this paper can assist and stimulate the de-
velopment of low tier multi-mode handsets. The capacity to use the same network
interfaces regardless the wireless access network, and continuing being able to offer
seamless handover service can simplify the handset design and save production cost.

References
1. Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Hand-
ley, M., Schooler, E.: SIP: Session Initiation Protocol, RFC 3261, IETF (June 2002)
2. IEEE Computer Society, ANSI/IEEE Std 802.11, Part 11: Wireless LAN Medium Access
Control (MAC) and Physical Layer (PHY) Specifications, 1999 edn. (June 2003)
3. Nasser, N., Hasswa, A., Hassanein, H.: Handoffs in fourth generation heterogeneous net-
works. IEEE Communications Magazine, 96103 (October 2006)
4. Nie, J., Wen, J.C., Dong, Q., Zhou, Z.: A seamless handoff in IEEE 802.16a and IEEE
802.11n hybrid networks. In: 2005 International Conference on Communications, Circuits
and Systems, pp. 383387 (May 2005)
5. de la Oliva, A., Bernados, C.J., Melia, T., Soto, I., Vidal, A., Banchs, A.: A case study:
IEEE 802.21 enabled mobile terminals for optimized WLAN/3G handovers. Mobile Com-
puting and Communications Review 11(2), 2940 (2007)
6. European Telecommunications Standards Institute, TR 122 925 V3.1.1 (2000-2002) Uni-
versal Mobile Telecommunications System (UMTS); Service aspects; Quality of Service
and Network Performance (3G TR 22.925 version 3.1.1 Release 1999) (April 1999)
7. IEEE Computer Society, 802.11i, Part 11: Wireless LAN Medium Access Control (MAC)
and Physical Layer (PHY) Specifications Amendment 6: Medium Access Control
(MAC) Security Enhancements (July 2004)
8. IEEE Computer Society, IEEE Std 802.1X-2004, IEEE Standard for Local and Metropoli-
tan Area Networks Port Based Network Access Control (December 2004)
9. Haverinen, H., Salowey, J.: Extensible Authentication Protocol method for Global System
for Mobile Communications (GSM) Subscriber Identity Modules (EAP-SIM), RFC 4186,
IETF (January 2006)
10. Arkko, J., Haverinen, H.: Extensible Authentication Protocol method for 3rd generation
Authentication and Key Agreement (EAP-AKA), RFC 4187, IETF (January 2006)
11. Rekhter, Y., Li, T.: An architecture for IP address allocation with CIDR, RFC 1518, IETF
(September 1993)
12. Droms, R.: Dynamic Host Configuration Protocol, RFC 2131, IETF (March 1997)
13. Plummer, D.C.: An Ethernet Address Resolution Protocol, RFC 826, IETF (1982)
14. Thomson, S., Narten, T.: IPv6 Stateless Address Autoconfiguration, RFC 2462, IETF
(1998)
15. Donz, F.: IPv6 autoconfiguration. The Internet Protocol Journal 7(2), 1216 (2004)
16. Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., Carney, M.: Dynamic Host Con-
figuration Protocol for IPv6 (DHCPv6), RFC 3315, IETF (July 2003)
17. Narten, T., Nordmark, E., Simpson, W.: Neighbor Discovery for IP Version 6 (IPv6), RFC
2461, IETF (December 1998)
Broadcast Data Scheduling in Wireless Environment

Xianjun Sun, Chuang Lin, Weidong Liu, and Yanping Xiao

Dept. of Computer Science and Technology Tsinghua University, 100084 Beijing, China
sunxj06@mails.tsinghua.edu.cn, chlin@tsinghua.edu.cn,
liuwd@mail.tsinghua.edu.cn, ypxiao@csnet1.cs.tsinghua.edu.cn

Abstract. Broadcast data scheduling is widely used in traffic information sys-


tem, stock price, weather information and news distribution system. This paper
proposes a novel approach to disseminate real-time information of these envi-
ronments, which delivers data through multiple push channels and one pull
channel. The simulation results show that our approach achieves better per-
formance than the previous work in practical applications.

Keywords: Broadcast scheduling, multi-channel, rate adjustment.

1 Introduction
Broadcast data delivery is a good method to disseminate information to large clients
in satellite or wireless asymmetric environment, where bandwidth of downlink chan-
nel is much higher than that of uplink channel. The main aim of broadcast scheduling
is to provide desirable data items for clients with less delay and lower energy con-
sumption. Average expected delay (AED) is an effective metric to evaluate the per-
formance of broadcast scheduling. There are a lot of broadcast scheduling algorithms
to minimize this metric, including off-line and on-line. Off-line algorithms assume
both the entire set of data items and the demand probability for data items are known
in advance. On the contrary, on-line algorithms decide which item to transmit first
without a-priori knowledge of data items. In general, both on-line and off-line algo-
rithms adopt approximate approach to get near-optimal value, and the conclusions are
related to specific assumptions and parameter setting. Compared to off-line algorithm,
the on-line version is suitable for real applications, although it has an inferior to opti-
mization value.
Performance of multi-channel broadcast scheduling is measured by multi-channel
average expected delay (MCAED), which is related with parameters such as access
probabilities, database size and channel bandwidth. Applications, such as meteoro-
logical data dissemination, news dissemination, etc., are expected to delivering bulk
incremental data items to clients. In this situation, database size is variable, which is
considered less by previous work. This motivates us to develop a new on-line sched-
uling algorithm of multiple push channels and one pull channel (MP+P). Our algo-
rithm focus on multi-channel on-line algorithm, which considers more parameters,
such as variable database size, date item length and variable channel bandwidth.
The rest of the paper is organized as follows. Section II presents related work on
multi-channel broadcast scheduling in recent years. Section III describes the system

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 7683, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
Broadcast Data Scheduling in Wireless Environment 77

model and performance metric. Section IV introduces the MP+P scheduling algo-
rithm. Section V presents simulation results, and the last section concludes this paper.

2 Related Work
Pervious research on multi-channel scheduling can be divided into two catalogues: 1)
designing algorithms of data allocation among multi-channels[1], [2], [3], [4]. 2)
investigating into hybrid mechanism of broadcast scheduling in multi-channel envi-
ronment [5], [6].
The FLAT algorithm adopts a simple but not so efficient allocation algorithm,
equally allocating the items to each channel [1]. Peng and Chen proposed VFk algo-
rithm, which skews the amount of data allocated to each channel [2]. Yee and
Navathe proposes the DP algorithm, which using dynamic programming to partition
the data items on multi-channels and achieved an optimal solution [3]. Although these
algorithms share the same objective, they assign different weights to complexity and
performance.
Navrati Saxena and Cristina Pinotti propose a balanced approach to partition the
data items into multi-channels averagely, which adopts a hybrid push-pull scheduling
for each single channel (denoted by BH). In each channel, the push and pull sets are
served in an interleaved [5]. BaiHua Zhang, Xia Wu, Xing Jin and Dik Lun Lee pro-
pose a two-level optimization scheduling algorithm (denoted by TOSA), where high-
level optimization strategy clusters data items into channels according to available
bandwidth, and low-level optimization schedules the items on each channel in order
to guarantee the average performance [6].
However, both average and proportional data allocation among channels can not
guarantee maximum utilization of bandwidth. So we address the issues by a heuristics
approach. In our model, we use a rate function to adjust rate among channels during
the broadcast. Issues of rate control among multiple channels in wireless network
have been focused by several researches recently [7], [8]. In this paper, we combine
scheduling method with rate adjustment scheme, which proves to achieve a perform-
ance tradeoff between efficiency and simplicity.

3 Preliminary

3.1 System Model

Suppose the database contains N data items, denoted by D={d1,d2, ...,dN}. Incremental
data items is D = {(n1 , l1 ), (n2 , l 2 ),...(n m , l m ) , where (ni, li) denotes data item set i consists
of ni data items with equal length li. Let N denote the number of data items in D .
Each item di can be characterized by a demand probability pi.
First, we give a model for multi-channel push and one pull scheduling (MP+P) by
definition.
Definition 1. A broadcast program S for MP+P scheduling is a partition of incre-
mental data into multi-channels, and a placement of data item selected from data set
78 X. Sun et al.

by clients outstanding requests in another channel. This program determine rate of


each channel by a dynamic bandwidth allocation scheme.
For brevity and clarity sake, we formulate the model above by
S1k F ( R (D, k ), R ( D), C (k + 1, B) . Where R (D, k ) denotes a partition of data set
D among k channels, R (D) denotes a cycle scheduling of data item selected from D
according to requests probability, and C(k+1,B) denotes assigning bandwidth to each
channel.
Fig. 1 shows a simple data allocation and bandwidth assignment scheme of MP+P
scheduling, where k=3, R1=3, R2=3, R3=4, R=10. The broadcast program transmits
3, 3 and 4 unit length data item of channel C1, C2 and C3 respectively by a round-bin
mode.

8 6 7 9 C3 Incremental Cyclical

5 3 4 C2 8 6 7 9 5 3 4 1 2 1

R3=4 R1=3 R2=3


2 1 C1 time

Fig. 1. MP+P Scheduling

3.2 Performance Metric

Average Expected Time for pure-push-based schedule can be given by


N
AED = w p Where w denotes expected item delay for items i on S, assuming that,
i =1
i i i

at any instant of time, clients start to listen with the same probability.
Performance metric to evaluate pure-pull-based schedule is often using Average
N w(r j ) p j


r j R j
Access Time, ATT = .Where w(rj) is the actual delay between the
| Rj |
i =1
request r for item j and the transmission time of item i. R and |Rj| denote the set of
request for item i and its size respectively.
Suppose the number of data items is N, that of data items allocated to channel i is
Ni, and Ni=N. Then average wait time of multiple push-based channels is meas-
ured as follows.

1 k 1 N w(r j ) p j
wi pi + j =1
N r j R j
MCAED = . (1)
k 1 N i i =1 | Rj |
Broadcast Data Scheduling in Wireless Environment 79

4 Designing Algorithms of MP+P Scheduling


In this section, we discuss algorithms of MP+P scheduling firstly. Then, we present
rate adjustment scheme for MP+P scheduling. Finally, we give a theoretical analysis
of performance metric lower bound.

4.1 MP+P Scheduling

MP+P scheduling allocates M data flows to k channels according to the data items
arrival character, keeping data item from the same flow in a channel. After a simple
data item partition, our approach begins to order sequence data items in each channel
towards minimizing value of MCAED. MP+P scheduling can be completed by three
steps.
1) Data Allocation. Picking up a data flow with maximum Ai and assigning it to a
channel with minimum Aj, where Ai is a sum value of product of each data items pi
and li in chanel i .
2) Permutation. Sorting items in non increasing order of Ai. As algorithm 2 shows,
this step moves the scheduling towards the near optimization.
3) Rate adjustment. Assigning total link bandwidth to each channel by formula 3
and determining rate of the broadcast program for each channel.

4.2 Rate Adjustment

To the best of our knowledge, there are two kinds of broadcast in the literature, flat
broadcast and non-flat broadcast, which adopt average assign and ratio assign of
bandwidth respectively. For flat broadcast, the expected wait time for an item on the
broadcast is the same for all items regardless of their relative importance to the cli-
ents. Alternatively, the server can broadcast different items with different frequency,
so that the broadcast can emphasize the most popular items and de-emphasize the less
popular ones. Theoretically, non-flat broadcast programs can be considered as a
bandwidth allocation problem.
We assign bandwidth among channels by the way similar to non-flat broadcast
programs, which make bandwidth B proportional to Ai in each channel.
Ai = d C (li / pi ) .We consider data items in each channel as a queue. When one
i i

or more channel queues transmission finished, the others can acquire available band-
width through recalculating rate. This step of rate adjustment will be implemented
during the broadcast. Note that, although recalculation of rate can be made at any
time, we suppose that each data item is transmitted by a constant rate, which implies
any channel can only recalculate rate until it finishes current items transmission at a
fixed rate. Then, the average wait time for channel i to acquire bandwidth after chan-
nel j finishing its task is li/2Ri(t) (ij).
At the beginning of the broadcast program, we calculate a initial value Ri(0) of
each channels rate. Let Ra(t) denote available bandwidth after a certain queue finish-
ing transferring. Ri(0) and Ra(t) meetRi(0)=R and Ra(0)=0.
80 X. Sun et al.

Algorithm 1. Initialization and data allocation


Input:( , k, p)
Output: partition M data flows to K channels
Incremental data flow ={D1,D2,Dm}
Channel queues ={Q1,Q2,...Qk}
1: while () do
2: find Max(Ai) in
3: find Min(Aj) in
4: Qj +Di
5: Remove Di from
6: End while

In order to help the broadcast program know when and which channel is free in
time, we let server maintain a boolean value qi(t) for each channel. if queue i is null,
qi(t)=0,else qi(t)=0.

Ra (t ) = Ri (t ) qi (t ) . (2)

According to rule of random allocation, the rate of each channel can be obtained by

Ri (t ) = Ri (t ) + Ra (t ) . (3)

Algorithm 2. Permutation and rate adjustment


Input :(k, B, p)
Output: broadcast program S
1: For i1 to K do
2: For j1 to |Di|
3: Ai Ascending sort
4: End for
5: Ri(t)Ri(t)+Ra(t) Compute channel rates
6: Ra(t) 0
7: End for

4.3 Complexity and Lower Bound

In this section, we discuss the time complexity and lower bound of MP+P scheduling.
Operation in algorithm 1, such as finding a maximum and a minimum on the maxi-
mum heap m and k can be done in O(n) time. The time complexity for algorithm 2 to
sort, compute boolean value qi(t) and rate of queues in each channel is O(n+nlogn).
So the time complexity of MP+P scheduling algorithm is O(n+nlogn).
For the rest of this subsection, we will present a lower bound of MCAED for
MP+P scheduling. We suppose that the demand probability pi=1, which means each
incremental data items should be transferred. Letting (N ) denote items selected by
outstanding request, we can derive.
Broadcast Data Scheduling in Wireless Environment 81

Theorem 1. For all sets of data items access probabilities,


(N )
1 N + k k N i
MCAEDlower _ bound =
R
( i
2k i =1 j =1
l +
j =1
lj) . (4)

Proof: For multiple push channels, there is wi = li/Ri(t). We replace wi with li/Ri(t) ,
then formula 3 can be modified as below
(N ) l
1 k 1 ni 1
MCAED = + j

k i =1 ni j =1 Ri (t ) j =1 R1 (t )
. (5)

k (N )

R (t)
1 ni + 1 lj
= + (6)
k 2 Ri (t ) 1
i =1 j =1

As analysis of above, the average wait time for channel i to acquire bandwidth of
channel j is li/2Ri (t).Thus, when the broadcast program transfer data items of these
two queues in turns by a total rate Ri(t)+Rj(t), the following holds
li lj li + l j
+ . (7)
Ri (t ) R j (t ) Ri (t ) + R j (t )

Applying (7) to (5), we can obtain the formula 4. Theorem 1 shows that MCAED
can be expressed by a function of data items, rate and access probabilities and reach a
lower bound value when total bandwidth is used during the broadcast.

5 Simulation Results
We make a simulation in wireless environment by a multicast program designed for a
meteorology data disseminate system. Table 1 shows parameters in simulations. We
assume that the access probabilities of data items follow the Zipf distribution. Length
of data items follows average distribution between [1, MaxItemLength].
We compare our approach with BH scheduling under various values, various
numbers of channels, various Item Length and sizes. From Fig. 2(a), we can see that
with the increasing of , the improvement of MP+P exceeds BH by 1.8% and 6.2%
under unit item length and non-uniform item length with MaxItemLength is 50.
Fig. 2(b) shows the performance under different item length distributions with the
number of channels K changing from 3 to 10, and 10000 items in the database. Com-
pared with BH, MP+P improves the performance by 11.9% and 34.9% on average
respectively. Fig. 2(c) depicts the performance affected by incremental item numbers,
which also consider two cases: unit item length and non-uniform item length. Com-
pared with BH, our algorithm improves performance by 17.6% and 22.3%. Fig. 2(d)
shows that MP+P improve performance by 23.9% on average with increase of Max-
ItemLength.
Simulations show that our approach achieves better performance under more gen-
eral environment.
82 X. Sun et al.

Table 1. Parameter and Value

Parameter Denotation Value


Database size N 10000
Channel numbers K 3-10
Bandwidth B 512
Access arrive rate 10
data item length l 1,25,50,75,100,125,150,175
Skew coefficient 0.25,0.5,0.75,1,1.25,1.5

(a) (b)

(c) (d)

Fig. 2. (a) Performance vs. . (b) Performance vs. Channels. (c) Performance vs. Item Number.
(d) Performance vs. MaxItemlength.

6 Conclusion
Broadcast data delivery is rapidly becoming the method of choice for dissemi-
nating information to a massive user population in many new application areas where
Broadcast Data Scheduling in Wireless Environment 83

client-to-server communication is limited. The main advantage of broadcast delivery is


its scalability: it is independent of the number of users the system is serving. In this
paper, we have investigated into strategies broadcast data scheduling. Contributions of
this paper are as follows: 1) data broadcasting is considered as a efficient way, in terms
of delay and bandwidth, for the distribution of information to a large number of users
in wireless communication environment. 2) We present a broadcast scheduling model
of multiple push channels and one channel (denoted by MP+P). 3) We design an on-
line scheduling algorithm based on our model, and analyzes the lower bound value of
its performance metric MCAED.
In future, we plan to extend our work to save power consumption of wireless cli-
ents by predicting accurate receival time of data items on broadcast scheduling.

References
1. Huang, J.-L., Chen, M.-S.: Dependent data broadcasting for unordered queries in a multiple
channel mobile environment. IEEE Transactions on Knowledge and Data Engineering
(TKDE) 16(9), 11431156 (2004)
2. Lee, G., Yeh, M.S., Lo, S.C., Chen, A.: A strategy for efficient access of multiple data items
in mobile environments. In: Proceedings of 3rd International onference on Mobile Data
Management (MDM 2002), Singapore, pp. 7178 (January 2002)
3. Yee, W.G., Navathe, S.B.: Efficient data access to multi-channel broadcast programs. In:
Proceedings of International Conference on Information and Knowledge Management
(CIKM 2003), New Orleans, Louisiana, USA, pp. 153160 (November 2003)
4. Liu, C.-M., Lin, K.-F.: Broadcasting Dependent Data with Minimized Access Latency in a
Multi-channel Environment. In: IWCMC 2006, Vancouver, British Columbia, Canada, July
36 (2006)
5. Saxena, N., Pinotti, M.C.: On-line Balanced K-Channel Data Allocation with Hybrid
Schedule per Channel. In: IEEE Intl. Conf. in Mobile Data Management (MDM) (2005)
6. Zheng, B., Wu, X., Jin, X., Lee, D.L.: TOSA: a near-optimal scheduling algorithm for
multi-channel data broadcast. In: Proceedings of the 6th International Conference on Mobile
Data Management, Ayia Napa, Cyprus, May 09-13 (2005)
7. Song, Y., Zhang, C., Fang, Y.: Throughput Maximization in Multi-channel Wireless Mesh
Access Networks. In: ICNP 2007, Beijing,China, October 16-19 (2007)
8. Choi, J., Na, J., Park, K., Kim, C.-k.: Adaptive Optimization of Rate Adaption Algorithms
in Multi-Rate WLANS. In: ICN 2007, Beijing,China, October16-19 (2007)
Analysis on Imai-Shins LR-AKE Protocol for Wireless
Network Security

Yingjie Wang, Wei Luo, and Changxiang Shen

Beijing Remote Sensing Institution, Beijing 100192, China


yingjie.wang@gmail.com

Abstract. In 2005 Imai and Shin proposed a leakage-resilient authenticated key


exchange protocol(LR-AKE) for wireless network security. For simplicity, the
protocol is based on password authentication plus additional secrets to fit wire-
less environment (e.g., computation constraint). In this paper we show that
Imai-Shins scheme is vulnerable to both client and server impersonation attacks
and needs to be improved to provide strong security for wireless network.

Keywords: Authentication, Key exchange, Password, Security analysis.

1 Introduction
As wireless technology has become more pervasive along with the popularity of mobile
devices, traditional electronic commerce and transactions have come true over wireless
networks, which have great potential for improving access to services. Nevertheless
there is a lot of concern on wireless communication carrying sensitive information, and
two fundamental security issues must be considered: authentication and confidentiality.
Both can be achieved through an authenticated key exchange (AKE) protocol at the end
of which the two parties share a common session key (e.g., the key is used for confi-
dentiality and/or data integrity). The security problem demands for secure algorithms
and protocols that could be able to persuade common users. The level of security is
directly related with the trust on the session key used to protect the sensitive data. This
way the key exchange algorithms and protocols are of great importance in the security
of wireless environments in general. To achieve authenticated key exchange, two
methods are usually used: PKI (Public Key Infrastructure) based authentication and
password based authentication. When it comes to wireless networks, it must be con-
cerned that there are yet natural constrains of the small size and low resource equip-
ments, whose battery limitation strictly compromises its processing capabilities. So
concerning the energy consumption constrains of wireless mobile devices, password
based authentication solutions are better suited due to the user friendliness and small
simplicity of its algorithms.
In 2005 Imai and Shin proposed a leakage-resilient password authenticated key
exchange protocol (LR-AKE)[1] for wireless network security, assuming that no
TRM(Tamper-Resistant Modules) storage is available on both user and server sides.
They claimed the scheme is efficient and secure. The main contribution of this paper is
to expose that Imai-Shins key exchange scheme for wireless environments is not

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 8489, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
Analysis on Imai-Shins LR-AKE Protocol for Wireless Network Security 85

secure enough to most real world applications. We show that impersonation attacks can
be mounted successfully on Imai-Shins scheme: both client impersonation attack
or/and server impersonation attack.
The paper is organized as follows. In Section 2 Imai-Shins protocol is reviewed. In
section 3 we construct client/server impersonation attacks which can be mounted
successfully on Imai-Shins protocol. Conclusions are presented in Section 4.

2 Review of Imai-Shins Protocol [1]


Imai-Shins protocol consists of the four phases: an initialization phase, a secrecy am-
plification phase, a verification phase and a session-key generation phase (see Fig. 1).

Fig. 1. Imai-Shin Leakage-resilient authenticated key exchange (LR-AKE) protocol, where the
enclosed values in rectangle represent stored secrets of client and server, respectively

Preliminary. The protocol is defined over a finite cyclic group G =< g > , where
G = q and G is a prime order subgroup over a finite field F p . Let g and h are
two generators of G so that the Discrete Logarithm Problem is hard to solve [2]. Both
g and h may be given as system parameters.
86 Y. Wang, W. Luo, and C. Shen

1) Initialization. A client C registers a secret value generated by his password pw to


the server S i ( i 1 , a client may register to one or more servers).
The client picks a distinct random polynomial pi (x ) of degree 1 with coefficient
ai1 randomly chosen in ( Z / qZ ) * ,

pi ( x) = a i 0 + ai1 x mod q , (1)

and sets ai 0 = pw , where pw is the users password. After computing pi (1) with
the above polynomial, the client registers securely a value h pi (1) to server S i as
follows:

S i h pi (1) . (2)

pi ' ( x) instead of pi (x) in its


Then the client just stores a secret polynomial
mobile devices, such as mobile phones or PDA devices, and keeps password pw in
mind.

pi ' ( x) = pi ( x) pw = a i1 x mod q . (3)

Where pi ' ( x) is obtained from pi (x) by removing the password.


2) Secrecy Amplification. When the client C wants to share a session key with server
S i , he first recovers the polynomial pi (x) by adding his password pw to the
polynomial pi ' ( x) stored in device. With pi (x) the client computes h pi (1) . Also
he chooses a random number r1 R ( Z / qZ ) * to compute y1 g r1 h pi (1) ,
then sends y1 to S i . Server S i also computes y 2 g 2 h i with a random
r p (1)

number r2 R ( Z / qZ ) * and the secret value h pi (1) registered by the client in the
initializing phase. The server transmits y 2 to the client along with the authenticator
Ver2 as showed in Fig. 1 below. On both ends, the clients keying material becomes
MK C ( y 2 h pi (1) ) r1 , and the servers keying material becomes
MK Si ( y1 h pi (1) ) r2 .
Only if the client uses the right password pw and the corresponding secret value
pi (1)
h to server S i and S i uses the right secret value h pi (1) , both of them can share
the same keying material

MK C = ( y 2 h pi (1) ) r1 = g r1r2 = ( y1 .h pi (1) ) r2 = MK Si . (4)


Analysis on Imai-Shins LR-AKE Protocol for Wireless Network Security 87

3) Verification. The client and the server computes Ver1 = MAC ( SID MK C 10)
and Ver2 = MAC ( SID MK Si 01) , respectively, using a secure message authenti-
cation code with the keying materials as input, and SID = C S i y1 y 2 . Upon
exchanging Ver2 and Ver1 , the client verifies
Ver2 = ? MAC ( SID MK C 01) , (5)

and the server verifies

Ver1 = ? MAC ( SID MK Si 10) . (6)

If both equations hold, they proceed to the session key generation phase. Otherwise, the
session will be closed.
4) Session Key Generation. If both of the above verifications succeed, the two parties
generate the session key using the verified keying materials as follows:

SK C MAC ( SID MK C 11) , (7)

SK S I MAC ( SID MK S I 11) . (8)

3 Cryptanalysis of Imai-Shins Protocol


Imai-Shins protocol requires no tamper-resistant module storage available, and it is
claimed secret leakage resilient. In this section, we show that Imai-Shins scheme is not
as secure as it is claimed, and it is vulnerable to both client and server impersonation
attacks.
Since Imai-Shins protocol does not require a secure server storage, an intruder can
easily get access to secrets table on the server S i , that is, the adversary can get the
p (1)
value h i . Then the adversary can choose to impersonate the client or/and the server
in the run of the protocol. Below we first show how the intruder can impersonate the
client C to communicate with the server.
1) Client Impersonation Attack. The adversary chooses a random number
r1 ' R ( Z / qZ ) * and with h pi (1) he can compute y1 ' g r1 ' h pi (1) , then he
sends (C , y1 ' ) to the server S i . Upon receiving y1 ' , S i computes
y 2 ' g r2 ' h pi (1) with a random number r2 ' R ( Z / qZ ) * and the secret value
h pi (1) stored. In the following steps, the adversary impersonates the client C to
exchange messages ( S i , y 2 ' , Ver2 ' ) and Ver1 ' with the server S i like the client
does in Fig. 1. Ver2 ' and Ver1 ' are computed as:
88 Y. Wang, W. Luo, and C. Shen

Ver1 ' = MAC ( SID' MK C ' 10) , (9)

Ver2 ' = MAC ( SID' MK Si ' 01) , (10)

where

MK C ' ( y 2 'h pi (1) ) r1 ' = ( g r2 ' h pi (1) h pi (1) ) r1 ' = g r1 'r2 ' (11)

is computed by the adversary;

MK Si ' ( y1 'h pi (1) ) r2 ' = ( g r1 ' h pi (1) h pi (1) ) r2 ' = g r1 'r2 ' (12)

is computed on the server side, and

SID' = C S i y1 ' y 2 ' . (13)

It can be seen that MK C ' = MK Si ' , so the adversary and the server can pass mu-
tual authentication. Thus they can share the same session key at the end of the protocol.

2) Server Impersonation Attack. When the adversary gets h pi (1) by corrupting the
secret table on the server S i , he can compute y 2 " g 2 h i by h i and a
r " p (1) p (1)

random number r2 " R ( Z / qZ ) * for communicating with the client C . Since the
p (1)
adversary can share the same secret h i with the client C as the server S i does,
similar to analysis in client impersonation attack, it can be easily concluded that the
adversary and the client can authenticate each other successfully and share a session
key in the end of the protocol. Thus server impersonation attack succeeds.

4 Conclusion

In this paper, we have demonstrated that Imai-Shins scheme is vulnerable to both


server and client impersonation attacks in wireless networks. Thus the scheme is not
secure enough for most real world applications in wireless environment. For authen-
ticated key exchange (AKE) in wireless systems, power and computation constraint
must be concerned, due to simplicity of the algorithm and user friendliness, password
based authentication schemes are usually better suited than those of PKI based. But so
far few schemes based on password authentication can achieve strong security. Our
future research would be focused on how to achieve strong security as well as sim-
plicity of the algorithm/protocol for authenticated key exchange in wireless environ-
ments, one option is to how to make improvements on Imai-Shins scheme to get
enough security for wireless access.
Analysis on Imai-Shins LR-AKE Protocol for Wireless Network Security 89

References
1. Imai, H., Shin, S.H., Kobara, K.: Authenticated key exchange for wireless security. In: IEEE
Wireless Communications and Networking Conference, vol. 2, pp. 11801186. IEEE Press,
New Orleans (2005)
2. Schneier, B.: Applied cryptography: Protocols, algorithms and source code in C, 2nd edn.
John Wiley & Sons Inc., New York (1996)
Surveys on the Intrusion Tolerance System

Kuo Zhao*, Nurbol, Fei Ren, Jianfeng Chu, and Liang Hu**

Department of Computer Science and Technology,


Jilin University Changchun 130012, China
Tel.: 86-431-85168277; Fax: 86-431-85166494
{zhaokuo,hul}@jlu.edu.cn

Abstract. In classical dependability, fault tolerance has been the workhorse of


many solutions. Classical security related work has on the other hand privi-
leged, with few exceptions, intrusion prevention. Intrusion tolerance (IT) is a
new approach that has slowly emerged during the past decade, and gained im-
pressive momentum recently. Instead of trying to prevent every single intrusion,
these are allowed, but tolerated: the system triggers mechanisms that prevent
the intrusion from generating a system security failure. In this paper, we de-
scribe the preliminaries of intrusion tolerance system (ITS) such as the concept
and objective of ITS, discuss the main mechanisms and strategies of ITS, and
present some example systems.

Keywords: Intrusion tolerance system, mechanism, strategies.

1 Introduction
According to the report of CSI/FBI, 95% organizations have the firewalls, 61% have
the Intrusion Detection Systems, and 90% adaptive access controls, but attacks still
could seep into the system. This obviously indicated that another kind of mechanism
is needed to implement the precaution against intrusions due to the limitation of cur-
rent intrusion detection mechanism and intrusion prevention mechanism.
The research on IT has been an active field, and the development of commercial
products continuously varies besides academic discussions. The IT is a brand-new
network security technology which integrates the cryptography and the fault-tolerant
technology with the characteristic of keep providing the corresponding network ser-
vice, and could guarantee the confidentiality and integrity of systems essential data
even if some parts of the system have been already destroyed or controlled by the
attacker successfully. The IT is called the third generation safety work - the core of
survival technology in some materials.
The paper is organized as follows. In Section 2 we describe the preliminaries of in-
trusion tolerance system. Section 3 discusses the main mechanisms of ITS. Section 4

* This work is supported by National Nature Science Foundation of China under Grant NO.
60873235, Program for New Century Excellent Talents in University of China under Grant
No.NCET-06-0300, and Graduate Innovation Program for 985 Project in Jilin University of
China under Grant NO.20080244.
** Corresponding author.

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 9097, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
Surveys on the Intrusion Tolerance System 91

provides the main strategies of ITS. Section 5 presents some example systems. Fi-
nally, Section 6 concludes the paper.

2 Preliminaries
The term intrusion tolerance appeared originally in a paper by Fraga and Powell
[1]. Later their scheme Fragmentation-Redundancy-Scattering was used in the
DELTA-4 project to develop an intrusion-tolerant distributed server composed by a
set of insecure sites. In afterward several years, many independent research works has
carried on under the concept of IT, which were mainly in the protocol aspect, but not
until 2000 did this field making an eruption-like progress. In Europe and America, the
research has been made on the concept, the mechanism and the architecture of IT in
two main projects: OASIS and MAFTLA. A main reason was the appearance of mali-
cious wrong causes the distributional system to have the fundamentality problem.
On the other hand, the traditional fault-tolerant is following an architecture, which
cannot be completely suitable for the aspect of malicious mistake with intention.

2.1 Intrusion Tolerance System Concept


Intrusion Tolerance System (ITS) [2] refers to the system that could provide the ser-
vice while in the situation of suffering a certain invasion.
The ITS not only can resist traditional attack but can also resist attacks that can not
detected by prophylaxis methods such as firewall system, authentication and encryp-
tion system. The system will take some essential measures to guarantee that the key
application or the key service's function works correctly and continuously, these
measures including limit the suspected code and data, reconfigure the hardware and
software resource and so on.
The ITS can be defined as the system that can provide serves continually even if he
server is in the condition of being attack with the ability to filter, restore and adapt
intrusions.

2.2 Objectives of ITS


The main objective of ITS can be divide into four levels:
Firstly: Protective system's safe operation.
The ITS should prevent and stop the impediment of further aggressive behavior, warn-
ing and tries hard to repair systems corresponding flaws automatically when the attacks
were not yet initiated or the attacks were going on but have not create the destruction.
Then: Guarantee service usability.
After attacker successfully attacked the application system, ITS needs to enable the
application systems to provide the complete services or the degradation services and
tried hard to make the system return to (health) normal condition by reorganization
and restoration.
Again: Guarantees on the secret of server data.
ITS needs to guarantee that the intruder cannot obtain the plaintext data when try-
ing to grasped the server data access authority through a illegal way.
Last: Guarantees on the authenticity and integrity of server data.
92 K. Zhao et al.

When the network services were under intrusions, the ITS needs to protect server's
application data and the system data from illegal distortion, or can discover the
distorted part and recovery the data when necessary.
The four levels above are the main objective of ITS. These four levels are the cas-
cade progressive; each latter level provides in-depth protection to the system com-
pared to the preceding level.

3 Intrusion Tolerance Mechanisms


The implementation of ITS is aimed for certain safety mechanisms [3] such as secure
communication mechanism, intrusion detection mechanisms, intrusion containment
mechanism, and error handling mechanism.

3.1 Secure Communication Mechanism

In the network environment, secure communication mechanism is necessary for en-


suring communication between safe and reliable communication, Preventing and
stopping attack like Eavesdropping, camouflage, DOS and so on. The secure commu-
nications mechanism of IT usually applied technologies such as encryption, authenti-
cation, filtering, and classic fault-tolerant correspondence.
During the interaction of IT, there is no design against threshold secret sharing sys-
tem technology for considering that each correspondence side in a domain environ-
ment, threshold secret sharing system technology will increases system's burden and
complexity of achieving the system, unable to guarantee system's Realtime.
Under the demand of Guaranteeing security, you can use Trusted Timely Comput-
ing Base (TTCB) [4] component to implement all parts of the system between the
group communications, its a feasible plan, and TTCB is a real time safe Wormhole
TTCB is a simple component which contains a collection of limited service. Its objec-
tive is to support execution IT protocol and the mixed method of system frame.

3.2 Intrusion Detection Mechanism

The intrusion detection system monitor and analyze the event which occurs through
the computer system or the network, detect and respond to possible attacks, intrusion,
security crack exist in the system. Intrusion detection system bind crack analysis and
attacks forecast technology for predicting the occurrence of error, finding the reason
of attack or security crack. Intrusion detection system can also bind Audit mechanism,
record System's behavior and security event, examine and analyze the reason and the
security problem.
Detection response and tests of system security can predict the error which will
happen by analyzing vulnerability, predicting attack and so on, finding the reason of
attack or security crack, the union audit record module realizes to the situation has the
diary report, and the way of recording the system behavior and the way of security
event, going a step further to exam and analyze the reason and the security problem,
carries on the linkage through the internal response sub module, calling security con-
figuration strategy in advance to implement the system initiative IT.
Surveys on the Intrusion Tolerance System 93

3.3 Intrusion Containment Mechanism

Resource redundant and the design multiplicity can be used to increases the difficulty
and the cost of attack. We can also use safety septa, structure reconfigure and other
method to limit invasion, impediment the further proliferation of invasion.
Resources redundancy and design multiplicity such as increases system isomer of
the server and increases the separation to realize the isolation of the damaged compo-
nent, can be used to increase the difficulty and the cost of invasion, limit and impedi-
ment the proliferation of invasion.

3.4 Error Handling Mechanism

Error handling mechanism prevents disastrous expiration from existence, and it is


composed of Error detection mechanism and Error recovery mechanism.
1. Error detection mechanism
Error detection mechanism has two parts, Complete examination and Audit Log:
Tests through the system safety non-periodically to system's module's security testing,
realizes the examination to the system integrity; Analyze the system module's diary
schedule audit analysis by the audit record module, carries on through the system log
audit analysis to the systematic error examination. Through error detection composed
of two way, Confirms oneself whether create Error recovery strategy because of false
triggering, avoids occurring false triggering itself, can limit the wrong further dis-
semination. However, if we trigger the false triggering, we can localization the reason
of creating error, provides the report for wrong recovery, guarantee system can restore
from the mistake before creating own false triggering this kind of situation.
2. Error recovery mechanism
The purpose of error recovery mechanism is to recover from wrong state for intrusion,
and maintenance or restores the essential data, the key service even the complete
serves, error recovery mechanism contains forward recovery, backward recovery, and
error masking.
z The system continues to a forward condition, which guarantee of providing
the correct service.
z The system recovers to a true state and run anew.
z The system use redundancy to shield wrong for providing correct serves.
Main safeguard mechanism contain: component redundancy, threshold cryp-
tography, system voting operation, Byzantine negotiation and interactive con-
sistency.
Used heterogeneous operating system through each module and increases between
the module to deploy structure these two ways, through system voting operation,
Byzantine negotiation and interactive consistency, under permitted condition, Even if
the partial modules have the security or the wrong question, the system can support
normal service. Error masking is the design mentality of prevent problem from
spreading. But forward recovery and backward recovery the system recover to a true
state and run anew.
94 K. Zhao et al.

Different error detection methods vary in the affection on the validity of recovery.
Error shielding mechanism should be considered primordially, forward recovery and
backward recovery should be unified simultaneously in order to guarantees the integ-
rity of error processing.

4 Intrusion Tolerance Strategies


Not surprisingly, intrusion tolerance strategies derive from a confluence of classical
fault tolerance and security strategies [5]. Strategies are conditioned by several fac-
tors, such as: type of operation, classes of failures (i.e., power of intruder); cost of
failure (i.e., limits to the accepted risk); performance; cost; available technology.
Technically, besides a few fundamental tradeoffs that should always be made in any
design, the grand strategic options for the design of an intrusion-tolerant system de-
velop along a few main lines that we discuss in this section. We describe what we
consider to be the main strategic lines that should be considered by the architect of IT
systems, in a list that is not exhaustive. Once a strategy is defined, design should
progress along the guidelines suggested by the several intrusion-tolerance frameworks
just presented.

4.1 Fault Avoidance vs. Fault Tolerance

The first issue we consider is oriented to the system construction, whereas the remain-
ing are related with its operational purpose. It concerns the balance between faults
avoided (prevented or removed) and faults tolerated.
On the one hand, this is concerned with the zero-vulnerabilities goal taken in
many classical security designs. The Trusted Computing Base paradigm [6], when
postulating the existence of a computing nucleus that is impervious to hackers, relies
on that assumption. Over the years, it became evident that this was a strategy impos-
sible to follow in generic system design: systems are too complex for the whole de-
sign and configuration to be mastered. On the other hand, this balance also concerns
attack prevention. Reducing the level of threat improves on the system resilience, by
reducing the risk of intrusion. However, for obvious reasons, this is also a very lim-
ited solution. As an example, the firewall paranoia of preventing attacks on intranets
also leaves many necessary doors (for outside connectivity) closed in its way.
Nevertheless, one should avoid falling in the opposite extreme of the spectrum
assume the worst about system components and attack severity unless the criticality
of the operation justifies a minimal assumptions attitude. This is because arbitrary
failure protocols are normally costly in terms of performance and complexity.
The strategic option of using some trusted components for example in critical
parts of the system and its operation may yield more performant protocols. If taken
under a tolerance (rather than prevention) perspective, very high levels of dependabil-
ity may be achieved. But the condition is that these components be made trustworthy
(up to the trust placed on them, as we discussed earlier), that is, that their faulty be-
havior is indeed limited to a subset of the possible faults. This is achieved by employ-
ing techniques in their construction that lead to the prevention and/or removal of the
precluded faults, be them vulnerabilities, attacks, intrusions, or other faults (e.g. omis-
sion, timing, etc.).
Surveys on the Intrusion Tolerance System 95

The recursive (by level of abstraction) and modular (component-based) use of fault
tolerance and fault prevention/removal when architecting a system is thus one of the
fundamental strategic tradeoffs in solid but effective IT system design. This approach
was taken in previous architectural works [7], but has an overwhelming importance in
IT, given the nature of faults involved.

4.2 Confidential Operation

When the strategic goal is confidentiality, the system should preferably be architected
around error masking, resorting to schemes that despite allowing partial unauthorized
reads of pieces of data, do not reveal any useful information. Or schemes that by re-
quiring a quorum above a given threshold to allow access to information, withstand
levels of intrusion to the access control mechanism that remain below that threshold.
Schemes relying on error detection/recovery are also possible. However, given the
specificity of confidentiality (once read, read forever...), they will normally imply
some form of forward, rather than backward recovery, such as rendering the unduly
read data irrelevant in the future. They also require low detection latency, to mitigate
the risk of error propagation and eventual system failure

4.3 Perfect Non-stop Operation

When no glitch is acceptable, the system must be architected around error masking, as
in classical fault tolerance. Given a set of failure assumptions, enough space redun-
dancy must be supplied to achieve the objective. On the other hand, adequate protocols
implementing systematic error masking under the desired fault model must be used
(e.g. Byzantine-resilient, TTP-based, etc.). However, note that nonstop availability
against general denial-of-service attacks is still an ill-mastered goal in open systems.

4.4 Reconfigurable Operation

Non-stop operation is expensive and as such many services resort to cheaper redun-
dancy management schemes, based on error recovery instead of error masking. These
alternative approaches can be characterized by the existence of a visible glitch. The
underlying strategy, which we call reconfigurable operation, is normally addressed at
availability- or integrity-oriented services, such as transactional databases, web serv-
ers, etc.
The strategy is based on intrusion detection. The error symptom triggers a recon-
figuration procedure that automatically replaces a failed component by a correct
component, or an inadequate or incorrect configuration by an adequate or correct
configuration, under the new circumstances (e.g. higher level of threat). For example,
if a database replica is attacked and corrupted, it is replaced by a backup. During
reconfiguration the service may be temporarily unavailable or suffer some perform-
ance degradation, whose duration depends on the recovery mechanisms. While the
attack lasts), the service may resort to configurations that degrade QoS in trade for
resilience, depending on the policy used (e.g., temporarily disabling a service that
contains a vulnerability that cannot be removed, or switching to more resilient but
slower protocols).
96 K. Zhao et al.

4.5 Recoverable Operation


Disruption avoidance is not always mandatory, and this may lead to cheaper and sim-
pler systems. Furthermore, in most denial-of-service scenarios in open systems (Inter-
net), it is generically not achievable.
Consider that a component crashes under an attack. An intrusion-tolerant design
can still be obtained, if a set of preconditions hold for the component: (a) it takes a
lower-bounded time Tc to fall; (b) it takes a upper-bounded time Tr to recover; (c) the
duration of blackouts is short enough for the applications needs.
Moreover, the crash, which is provoked maliciously, must not give rise to incorrect
computations. This may be achieved through several techniques, amongst which we
name secure check-pointing and logging. Recoverable exactly-once operation can be
achieved with intrusion-tolerant atomic transactions [8]. In distributed settings, these
mechanisms may require secure agreement protocols.
This strategy concerns applications where at the cost of a noticeable temporary ser-
vice outage, the least amount of redundancy is used. The strategy also serves longrun-
ning applications, such as data mining or scientific computations, where availability is
not as demanding as in interactive applications, but integrity is of primary concern.

5 Some Example Systems


Since the term intrusion tolerance appeared originally, a number of isolated IT
protocols and systems emerged. BFT [9] is an efficient state-machine replication
algorithm. It has been used to implement an intrusion-tolerant NFS server. Rampart
provides tools for building IT distributed services: reliable multicast, atomic multicast
and membership protocols. SecureRing is a view-synchronous group communication
system based on the Totem single-ring protocols [10]. Both Rampart and SecureRing
can be used to build servers using the state-machine replication approach. Fleet [11]
use Byzantine quorum systems [12] to build IT data stores, respectively for data ab-
stractions like variables and locks, and for Java objects. The protocol suite CLIQUES
supports group key agreement operations for dynamic groups of processes [13]. More
recently, two projects have focused on intrusion tolerance, OASIS and MAFTIA,
developing several results that will be detailed ahead.

6 Conclusions
The intrusion tolerance technology is the third generation security technology which
refers to the core survival technology. We have presented a survey of the main con-
cepts, mechanisms and strategies relevant to intrusion tolerant system. In our opinion,
Intrusion tolerance as a body of knowledge is, and will continue to be for a while, the
main catalyst of the evolution of the area of dependability.

References
1. Fraga, J.S., Powell, D.: A fault- and intrusion-tolerant file system. In: 3rd International
Conference on Computer Security, pp. 203218 (1985)
2. Paulo, E.V., et al.: Intrusion-Tolerant Architectures: Concepts and Design. LNCS.
Springer, Heidelberg (2003)
Surveys on the Intrusion Tolerance System 97

3. Fengmin, G., Katerina, G.P., Feiyi, W., Rong, W., et al.: Characterizing Intrusion Tolerant
Systems Using A State Transition Model. In: DARPA Information Survivability Confer-
ence and Exposition (DISCEX II), pp. 211221. IEEE Press, Anaheim (2001)
4. Correia, M., Verissimo, P., Neves, N.F.: The design of a COTS real-time distributed secu-
rity kernel. In: Bondavalli, A., Thvenod-Fosse, P. (eds.) EDCC 2002. LNCS, vol. 2485,
pp. 234254. Springer, Heidelberg (2002)
5. Verssimo, P., Rodrigues, L.: Distributed Systems for System Architects. Kluwer Aca-
demic Publishers, Dordrecht (2001)
6. Verssimo, P., Casimiro, A., Fetzer, C.: The Timely Computing Base: Timely actions in
the presence of uncertain timeliness. In: International Conference on Dependable Systems
and Networks, pp. 533542. IEEE Computer Society, Los Alamitos (2000)
7. Powell, D., et al.: Delta-4: A Generic Architecture for Dependable Distributed Processing.
Research Reports ESPRIT. Springer, Heidelberg (1991)
8. Cachin, C., Poritz, J.A.: Hydra: Secure replication on the internet. In: International Confer-
ence on Dependable Systems and Networks, pp. 167176. IEEE Press, New York (2002)
9. Castro, M., Liskov, B.: Practical Byzantine fault tolerance. In: 3rd Symposium on Operat-
ing Systems Design and Implementation, pp. 173186. ACM SIGOPS, Berkeley (1999)
10. Kihlstrom, K.P., Moser, L.E., Melliar-Smith, P.M.: The SecureRing group communication
system. ACM Transactions on Information and System Security 4, 371406 (2001)
11. Malkhi, D., Reiter, M.K., Tulone, D., Ziskind, E.: Persistent objects in the Fleet system. In:
2nd DARPA Information Survivability Conference and Exposition (DISCEX II), pp. 126
136. IEEE Press, New York (2001)
12. Alvisi, L., Malkhi, D., Pierce, E., Reiter, M.K., Wright, R.N.: Dynamic Byzantine quorum
systems. In: IEEE International Conference on Dependable Systems and Networks, pp.
283292. IEEE Press, New York (2000)
13. Ateniese, G., Steiner, M., Tsudik, G.: New multi-party authentication services and key
agreement protocols. IEEE J. of Selected Areas on Communications 18, 628639 (2000)
Study on the DS/CDMA-CSMA Multi-user
Communication Systems

Yun-xiao Zu, Xiao Wu, and Yu-qin Lv

School of Electronic Engineering, Beijing University of Posts and Telecommunications,


Beijing 100876, China
zuyx@tsinghua.edu.cn

Abstract. The multi-user random communication system is a new researchful


subject. Now most researches focus on fixed assignment multiple access method.
A new approach that combine the fixed multiple access technology with the
random multiple access technology is proposed in this paper, that is the
DS/CDMA-time slot-nonpersistent CSMA access approach. The Markov-Chain
mathematics model is set up. And then the transition of channel states is depicted
and the formula of the network throughput is found. The factors influencing the
performance of the DS/CDMA time-slot nonpersistent CSMA multi-user
communication systems have been analyzed by Monte-Carlo simulation. It has
been proved that the performance of the communication systems can be
improved by using this approach.

Keywords: DS/CDMA, time-slot nonpersistent CSMA, multi-user commu-


nication systems, Markov-Chain, Monte-Carlo.

1 Introduction

The most common MAC layer protocols that used to manage the uplink multiple access
is CSMA/CD (Carrier Sense Multiple Access with Collision Detection) in
DS/CDMA-CSMA multi-user communication systems[1-3]. However, the transmission
delay in this protocol has decreased the channel utilization. Considering the fixed
multiple access technology, for example CDMA, can not only suppress the uplink noise
effectively, but also utilize the bandwidth of the uplink channel availably, and the
random multiple access technology has the better performance for unpredictable effects
on signal propagation. So a new approach that combine the fixed multiple access
technology with the random multiple access technology is proposed in this paper, that
is the DS/CDMA time-slot nonpersistent CSMA access approach. The performance of
DS/CDMA-CSMA multi-user communication systems been studied using this
approach. The throughput of the system in different conditions has been investigated by
Monte-Carlo method.
This work shows that this transmission process is more efficient than traditional
transmission approaches and will improve network performance, particularly for
networks of today and the future which require more and more bandwidth.

P. Bond (Ed.): ChinacomBiz 2008, CCIS 26, pp. 98104, 2009.


ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2009
Study on the DS/CDMA-CSMA Multi-user Communication Systems 99

2 DS/CDMA Time-Slot Nonpersistent CSMA Multi-user Random


Access Model

2.1 Multiple Access Model

Multiple access protocol is used to manage the users sharing the medium resources
efficiently. At present, there are two main multiple access protocols, that are ALOHA
series and CSMA series. The maximum throughput S max of the system for ALOHA
[4]
series and CSMA series is shown in table 1 .

Table 1. The maximum throughput of the system for ALOHA series and CSMA series

Access Mode Smax


ALOHA 0.184
Slot time ALOHA 0.368
Persistent CSMA 0.529
Slot time persistent CSMA 0.531
P(0.1)- persistent CSMA 0.791
nonpersistent CSMA 0.815
P(0.03)-persistent CSMA 0.827
slot time-nonpersistent CSMA 0.857

From Table 1 we can see that the system performance using time-lot nonpersistent
CSMA multiple access technology is better than that of others.

2.2 DS/CDMA Time-Slot Nonpersistent CSMA Uplink Multi-users Random


Access Model

The free time is divided into time slot with width in time-slot nonpersistent CSMA
protocol as shown in Fig.1[4]. The slot length equals the end-to-end propagation delay.
If the frame arrives at the free time slot, then it will be transmitted during the next time
slot. The frame length L is an integral times of time slot.

Fig. 1. Principle model of time-slot nonpersistent CSMA


100 Y. Zu, X. Wu, and Y. Lv

The uplink channel is divided into M independent sub-channels with CDMA


technology, that is the uplink random terminals are divided into M groups and each
group has a code. The time-slot nonpersistent CSMA multiple access protocol is used
in each independent sub-channels.
The time-slot nonpersistent CSMA protocol is bus structure. A single transmission
medium is shared by all terminals. The terminals represent computer resources which
are connected to the network and compete for the access to the channel. When a
terminal receives a message frame and transmits it again during a slot time, the terminal
monitors the channel in order to find if there are other on-going transmissions. The
terminal will capture the channel and transmit the message if the channel is free.
Otherwise, if the channel is not free, the terminal will try again after some random
delay. The maximum channel utilization is determined by the length of the frame and
the propagation time.
For simple analyzing the hypotheses are given.
(1) There are infinite users.
(2) The frame arrives in Poisson[5].
(3) The length of the frame is in geometrical distribution[6]. That is

(1 )x 1 , x = 1, 2 ,3 K
p [ T h e le n g th o f f ra m e = x ] = (1)
0 , o th e r s

Where p [ T h e le n g th o f fr a m e = x ] is the probability of that the length of frame


equals to x. The unit of x is time slot and its average value is 1 / .
(4) There are reliable channel and ideal power control. The capture effect is
neglected[6].
(5) The terminal sends acknowledge signal immediately after it received data
correctly. The length of the acknowledge frame is fixed and the transmission time is
neglected.
(6) The propagation delay is time slots.
(7) The network load G is the sum of the service load of every channel.

2.3 Markov Model for DS/CDMA Time-Slot Nonpersistent CSMA Protocol

There is only one uplink channel if M =1. The channel state of the n+1 time-slot only
dependent on that of the n time-slot and have not any relations with the channel state
before. So the discrete time Markov model can be constructed.
There are five channel states in time-slot nonpersistent CSMA protocol. The
transition of the channel state is shown in Fig.2. The meaning of every state is as
follows:
Study on the DS/CDMA-CSMA Multi-user Communication Systems 101

6

6 6

6 6

Fig. 2. Transition of the channel state

S1 idle slot, which means that the channel is free, no transmission and no
propagation.
S2 message transmission slot, which means that the channel is transmitting
message.
S3 response transmission slot, which means that the channel is transmitting
response.
S4 collision slot between message and response. When the message and the
response of the last message are transmitted at the same slot, there will be a collision.
S5 collision slot between messages. Two or more terminals attempt to transmit
message at the same slot, a collision is said to occur.

The probability of no message arriving is e G because the frame arrives in Poisson.


So the following state transition matrix P can be obtained.

e G Ge G 0 0 1 e G Ge G

0 1 e G
(
1 e G
) 0
P = 1 0 0 0 0


1 0 0 0 0
1
0 0 0 0 (2)
In which the element P (i,j) is the transition probability from state i to state j.
Let = [ 1 2 3 4 5 ] indicates the steady state probability vector.
Where j is the steady state probability of state Sj. Then from j = i Pij and
j = 1 we can get:
1 = Ge G e G + 2

2 = Ge G Ge G e G + 2
102 Y. Zu, X. Wu, and Y. Lv

3 = Ge2 G (Ge G e G + 2 )

4 = Ge G (1 e G ) Ge G e G + 2

5 = (1 e G Ge G ) Ge G e G + 2

The network throughput is the sum of the throughput of message transmission frame
and response transmission frame, so the following normalized network throughput S
can be obtained.

S = 2 + 3 = Ge G (1 + e G ) ( Ge G e G + 2 ) (3)

The average network load of each CDMA channel is G/M if M>1. The network
throughput of each CDMA channel is:
S = G e G (1 + e G ) ( G e G e G + 2 ) (4)
G = G M
The total network throughput is the sum of that of independent CDMA sub-
channels, that is,

S M = MS = M [Ge G (1 + e G )] (Ge G e G + 2 ) (5)

3 Simulation and Analysis of DS/CDMA Time-Slot Nonpersistent


CSMA Multi-user Communication Systems

3.1 The Pseudo-random Code Effects on Network Throughput

The Pseudo-random code, such as Walsh code, m-sequence code, Gold code and
Kasami code, are used as the spreading code of DS/CDMA. There are some noises in

0.9

Walsh Codes
0.8
m-sequemce
Gold Codes
0.7
Kasami Codes

0.6

0.5
S

0.4

0.3

0.2

0.1

0
0 100 200 300 400 500 600 700 800 900 1000
G

Fig. 3. Network throughput S versus network load G with different spreading codes
Study on the DS/CDMA-CSMA Multi-user Communication Systems 103

the channel except signal when signal is transmitted in one channel. The noises will
influence the transmission performance including the network throughput. Now there
are three kinds of noises are considered in this paper and they are AWGN (Gaussian
white noise), pulse jamming and multiple access interference.
Assuming the transmission delay =0.01, the number of users M =20 and the
parameter of the frame length =0.2, then the relationship between the network
throughput and the network load can be achieved by Monte-Carlo simulation in
different spreading codes such as Walsh Code, m-sequence, Gold Code and Kasami
Code. It is shown in Fig.3.
It can be seen from Fig.3 that the impact of different spreading codes on system
throughput is obvious. Compared with other spreading codes, m-sequences is better
because it has good correlation. So we just discuss the system that m-sequence is used
as the spreading code in the following.

3.2 The Frame Length Effects on Network Throughput

The frame length is an important factor that influences the network throughput from
formula (5). The impact of the frame length on network throughput is depicted in Fig.4
under the conditions that the transmission delay =0.01, the user number M = 20,
= 0.2, 0.4, and 0.6 respectively and m-sequence as the spreading code.

0.7

0.6

=0.2
0.5

0.4
S

0.3 =0.4

0.2
=0.6

0.1

0
0 100 200 300 400 500 600 700 800 900 1000
G

Fig. 4. Network throughput S versus network load G in different frame lengths

Fig. 4 shows that the network throughput strongly depends on the frame length. The
bigger the parameter of the frame length, or the smaller the frame length, the smaller
the network throughput, the worse the performance of the network.

3.3 The Transmission Delay Effects on Network Throughput

The transmission delay is another important factor that influences the network
throughput from formula (3). Fig. 5 shows the relationship of network throughput
104 Y. Zu, X. Wu, and Y. Lv

versus network load for the transmission delay =0.01, 0.02, and 0.03, the user number
M=20, the parameter of the frame length =0.2, and m-sequence as the spreading code.
It can be seen from Fig.5 that the smaller the transmission delay, the bigger the
network throughput.

0.7
=0.01
0.6 =0.02
=0.03

0.5

0.4
S

0.3

0.2

0.1

0
0 100 200 300 400 500 600 700 800
G

Fig. 5. Network throughput S versus network load G in different delays

4 Conclusions
A mixed multiple access protocol, that is DS/CDMA slot-time nonpersistent CSMA
multi-user access protocol is developed in this paper based on fixed multiple access
technology and random multiple access technology. A Markov model for DS/CDMA
time-slot nonpersistent CSMA protocol is established. The factors influencing the
performance of the multi-user communication systems are analyzed by Monte-Carlo
simulation. The results indicate that the DS/CDMA slottime nonpersistent CSMA
multi-user access mode has improved the performance of the original CSMA access
mode greatly.

References
1. Garg, V.K.: Spread Spectrum (SS) and CDMA Systems. Wireless Communications &
Networking, 317367 (2007)
2. Aad, I., Ni, Q., Barakat, C., Turletti, T.: Enhancing IEEE 802.11 MAC in congested
environments. Computer Communications 28(14), 16051617 (2005)
3. Chen, J., Sheu, S.-T.: Distributed multichannel MAC protocol for IEEE 802.11 ad hoc
wireless LANs. Computer Communications 28(9), 10001013 (2005)
4. Ting, Z.: The research of checking control tree collision resolution in random multi-access
system [Master degree thesis], Yunnan University (2004)
5. Xinlin, C.: Queuing Theory in Modern Communications. Publishing House of Electronics
Industry (2000)
6. Fei-yan, S., Zhao-yang, Z., Wen-zheng, C.: On Throughput Performance of the CDMA
Reservation-ALOHA Multiple Access System for the HFC Network. Journal of Electronics
29(11), 15521554 (2001)
Author Index

Chu, Jianfeng 90 Nurbol 90

Dittmann, Lars 10, 25 Polat, Bahadir K. 48


Fang, Guangwei 60
Ren, Fei 90
Gondim, Paulo Roberto de Lira 68 Ribeiro Junior, Sebastiao Boanerges 68

Hearty, Paul 36 Schiller, Frank 1


Herr, Laurin 36 Schulze, Jurgen 36
Hu, Liang 10, 25, 90 Shen, Changxiang 84
Sun, Xianjun 76
Iversen, Villy B. 10

Kailay, Pooja 48 Tuli, Nikhil 48


Khan, Moazzam 48
Wang, Yingjie 84
Krsek, Michal 36
Weber, Uwe 1
Li, Nanjun 40 Wu, Xiao 98
Lin, Chuang 76
Liu, Weidong 76 Xiao, Yanping 76
Luo, Wei 84
Lv, Yu-qin 98 Zhao, Kuo 90
Zheng, Xiao 60
Mattes, Rainer 1 Zorn, Werner 40
Mattes, Tina 1 Zu, Yun-xiao 98

Anda mungkin juga menyukai