
Very well...

You have probably always heard that telecom networks are based on the circuit-switching
paradigm. That was correct up to about 15 years ago, when a movement started to change
networks to the packet-switching paradigm. It has been a long, long road, which will be
practically complete with the deployment of 4G mobile networks. Our first step is to
understand why this paradigm change was deemed necessary.
Circuit switching means that the communication channels between user pairs are rigidly
allocated for the whole duration of the communication session. Although there are
statistical formulae for capacity planning of circuit-switched networks (see the Wikipedia
article about the Erlang traffic unit), capacity is wasted every time either party isn't
using its communication channel (which is usually full-duplex).

On the other hand, packet switching doesn't allocate full-session circuits. Transmission
capacity in either direction is granted to users just for the time needed to forward a single
data packet. This packet interleaving keeps the waste of transmission capacity to a
minimum.

Unfortunately, there's no such thing as a free lunch. Adopting packet switching has its
trade-offs. The major one is accepting the possibility of congestion, because any network
node can suddenly have more packets to send through an interface than its transmission
capacity allows. Usually that's dealt with using transmission buffers, so we're in the realm
of queuing systems statistics (Erlang C) instead of the more familiar blocking systems
statistics (Erlang B). This and a few other details were the basis for wrong notions about
the unfeasibility of carrier-class telecom services (telephony in particular) over
packet-switching networks. With these articles I expect to bury them at last.
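
To make the Erlang B versus Erlang C distinction concrete, here is a minimal sketch in Python. The function names are mine, and the formulas assume the textbook model (Poisson arrivals, exponentially distributed holding times), so applying Erlang C to real packet buffers is only a rough approximation.

```python
def erlang_b(traffic: float, channels: int) -> float:
    """Blocking probability for `traffic` erlangs offered to `channels` circuits.

    Uses the standard recurrence B(0) = 1, B(n) = A*B(n-1) / (n + A*B(n-1)),
    which avoids computing large factorials.
    """
    blocking = 1.0  # with zero channels every call is blocked
    for n in range(1, channels + 1):
        blocking = traffic * blocking / (n + traffic * blocking)
    return blocking


def erlang_c(traffic: float, servers: int) -> float:
    """Probability that an arrival has to wait in the queue.

    Only meaningful for traffic < servers; otherwise the queue grows
    without bound, so 1.0 is returned.
    """
    if traffic >= servers:
        return 1.0
    b = erlang_b(traffic, servers)
    return servers * b / (servers - traffic * (1.0 - b))


if __name__ == "__main__":
    offered = 8.0  # hypothetical offered traffic, in erlangs
    for n in (10, 12, 15):
        print(n, round(erlang_b(offered, n), 4), round(erlang_c(offered, n), 4))
```

For the same offered traffic and number of servers, the waiting probability (Erlang C) is never lower than the blocking probability (Erlang B): a blocking system throws the excess calls away, while a queuing system keeps them around.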
The next basic question to answer is: why IP and not any other packet-switching network
architecture? Why not full-fledged OSI, for instance? The answer is quite simple: other
network architectures were considered and discarded because their adoption would be too
difficult, or too expensive. The Internet Protocol suite, on the other hand, was immediately
available and was reliable, cheap and simple. With the Internet boom of the 1990s the
choice of IP became unquestionable and practically irreversible.
Here in TelecomHall there's a brief explanation of the 7-layer OSI Reference Model.
Likewise, the IP network architecture is structured in 4 layers which match all the
functionalities of the OSI-RM layers. Look at the diagram below.

The first thing you'll probably say is: wait a minute! You've said four layers, and this
diagram shows five. Why so? The answer is quite simple: the sockets API isn't a real layer;
that's why it's shown in a dotted box. When the TCP/IP architecture was first deployed there
was a need for something to keep different user sessions properly separated. The sockets
API was devised to that effect, became a de facto standard, and was ported to all kinds of
operating systems.
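
As a rough illustration of how the sockets API keeps sessions apart, the sketch below opens two TCP connections to the same local server port and shows that each one is identified by its own client address and port. It uses Python's standard `socket` module; the function name and the messages are my own, and it assumes the loopback interface is available.

```python
import socket


def demo_separate_sessions():
    """Open two TCP sessions to one server port and return what each one carried."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the operating system pick a free port
    server.listen(2)
    server_addr = server.getsockname()

    # Two clients connect to the very same server address and port.
    clients = []
    for text in (b"session A", b"session B"):
        client = socket.create_connection(server_addr)
        client.sendall(text)
        clients.append(client)

    # Each accepted connection is a distinct socket, keyed by the peer's (IP, port).
    received = {}
    for _ in clients:
        conn, peer = server.accept()
        received[peer] = conn.recv(1024)
        conn.close()

    for client in clients:
        client.close()
    server.close()
    return received


if __name__ == "__main__":
    for peer, data in demo_separate_sessions().items():
        print(peer, data)
```

Both sessions share the server's address and port; what tells them apart is the client side of the pair, which the operating system tracks on behalf of the application.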
Talking of operating systems, one of the great advantages of the TCP/IP network
architecture is the simple division of work between hardware (network interface card)
and software (operating system and user application). It's easy, it's simple, and above all, it
works.
Technically, we call an application any program that runs under the control of the
operating system and takes advantage of its services. That's a fairly reasonable definition
for our purposes, since all networking architectures are devised to allow communication
between applications, not people. Each application has its own way of handling
human-machine interaction (if it exists at all). We're not concerned with this here. All we
want to explain is how applications can reliably exchange data among themselves.

And here we arrive at the first paradigm-breaking aspect of the change from the
circuit-switching-based plain old telephone service (POTS) to the IP-packet-switching-based
next-generation network (NGN).
POTS networks are organized in such a way that you've got dumb (and reasonably cheap)
user terminals connected through a smart (and very, very expensive) network. Every time
the user wants to use network services (and for a very long time there was only one:
telephony) he/she has to ask the network for it. By means of sound-based network-to-user
signaling and key-pressing user-to-network signaling (see DTMF and ITU-T Recommendations
Q.23 and Q.24) the user says "I want to talk with this user", and the network makes the
arrangements to provide the end-to-end circuit which the communicating parties will use.
IP-based networks, of which the Internet is the major example, were built assuming the
user terminals are smart (and not overwhelmingly expensive) and the network doesn't have
to be any smarter than necessary to perform a single function: take the data packets
from one side to the other with reasonable reliability. All the aspects of communication that
telephony engineers are used to calling call control are negotiated directly between user
applications. This is the function of the so-called application-layer protocols.
So we have, so to speak, two different philosophies to handle the call control (which is
another way to say session control): the network-in-the-middle approach, and the
end-to-end principle. The schematic call-flow diagrams below give an example of the
differences between them.

Generally speaking there are two ways of application interaction, both widely used:
peer-to-peer and client-server. In peer-to-peer sessions the communicating parties have the
same status, and any of them can request or offer services to the other. Client-server
sessions, on the other hand, have a clear role distinction between the parties: one requests
services (the client) and the other fulfills the service requests (the server).

Most Internet applications use the client-server model, and that goes quite well with the
end-to-end principle. NGN telecommunication services, on the other hand, go both ways.
There are services that are a clear fit to the client-server model, like video or audio
streaming, and there are services that use peer-to-peer, like voice and video telephony (by
the way, videoconferencing can go both ways).
This and a few other issues (security, mostly) forced the NGN call-control architecture to use
client-server interactions for signaling, and peer-to-peer or client-server for data exchange,
according to service characteristics. The diagram below is an example of this.

The packet routers between the elements are not shown, and this picture is a gross
oversimplification of NGN architecture. I will not go into details about this, but if you want
a more rigorous approach to this subject I recommend you start by reading ITU-T
Recommendations Y.2001, "General overview of NGN", and Y.2011, "General principles and
general reference model for Next Generation Networks".
Roughly speaking, the AAA (authentication, authorization and accounting) server role goes
to the IP Multimedia Subsystem (IMS), which was initially standardized by 3GPP/ETSI (see
ETSI TS 123 228 V9.4.0, "IP Multimedia Subsystem") and later adopted by the ITU
(Recommendation Y.2021, "IMS for Next Generation Networks"). Actually it does much more
than simply AAA functions. It's the entry point for all NGN signaling, which is based on the
Session Initiation Protocol (SIP) and the Session Description Protocol (SDP) (see ETSI TS 124
229 V9.10.2, "IP multimedia call control protocol based on SIP and SDP"; IETF RFC 3261,
"SIP: Session Initiation Protocol"; and IETF RFC 4566, "SDP: Session Description Protocol").

And so we shall. I must warn you, though: you'd better fasten your seat belts, because
there's turbulence ahead. Few things can be more intellectually intimidating than the writing
style of telecom standards. Truth be told, they're getting better, but they're still hard
to read. Even the pictures can be daunting. So I urge you: don't let this
picture scare you out of reading the rest of this article.

This picture comes from ITU-T Recommendation Y.2021. Look at the shaded round-cornered
rectangle. It has "core IMS" written on it, and it really is that. But we're interested in a
single entity in there: the Call Session Control Function (CSCF), and its relationship with the
user equipment (desktop, laptop or handheld computers, smartphones, tablets, whatever)
identified by UE in the picture.
Each line connecting entities is called an interface (the formal term is reference point,
but that doesn't matter). The lines depict logical relationships between the entities, and
each interface uses an application-layer protocol (more than one, sometimes). The signaling
interface between CSCF and UE is identified as Gm in the picture, and the application-layer
protocols used in the Gm interface are SIP and SDP (I'm not explaining some acronyms
because they're already explained elsewhere; I really believe that you're following these
articles from the beginning).
And what does CSCF do? It's the AAA server (and more) that we talked about in the last
article. Since it seems that most TelecomHall readers have a mobile background, we
can explain CSCF functionalities this way: it's a kind of fusion of HLR (Home Location
Register) and AuC (Authentication Center).
But there are actually three entities called CSCF, differing by a prefix letter: P (proxy); I
(interrogating); and S (serving). These three flavors of CSCF exist because we're talking
about telecom services here. So there are the operator's own subscribers, and there can be
roaming users.
Whether the user is local or roaming, one of the first things he/she has to do when
connecting to the network is make contact with the P-CSCF. Item 5.1.1 of ETSI TS 123
228 offers two alternative methods for P-CSCF discovery. I think that the practical way is
to combine both:
Dynamic Host Configuration Protocol (DHCP, for IPv4 or IPv6 networks) gives the UE the IP address (v4 or v6) of the primary and secondary Domain Name System (DNS) servers which are capable of resolving the I-CSCF fully-qualified domain name (FQDN) to its IPv4 and/or IPv6 primary and secondary addresses;

During initial configuration, or in the ISIM (IP Multimedia Services Identity Module), or even via over-the-air (OTA) procedures, the UE receives the FQDN of the I-CSCF.

The I-CSCF forwards all user requests to the S-CSCF that's assigned to serve the user. If the
user is local, then that's all. If the user is a roamer, then the S-CSCF of the visited network
acts as an I-CSCF and forwards all user requests to the S-CSCF of the user's home
network.
To understand the remaining entities in the core IMS we first have to understand that
NGN-based services won't simply kick the present telecom services out of the market. They'll
have to live together, side by side, for a long time yet. So there's a definite need for NGN
and traditional telecom services to interwork. That is: it should be possible for calls
originating in NGN-connected UEs to terminate on common telephony devices, and vice versa.
About ten years ago operators started to replace traditional telephony switches with
softswitches.
A softswitch is a distributed system (logically, and possibly also geographically), and can be
built (more or less) with an open architecture. Its main building blocks are:
One Media Gateway Controller (MGC), which handles signaling between the softswitch and the rest
of the network elements;

One or more Media Gateways (MGs), which translate media streams between
different physical interconnections.

The MGC controls the MGs assigned to it through an IP-carried signaling protocol whose
specifications are found in ITU-T Recommendation H.248.1, "Gateway Control Protocol:
version 3". The picture below shows how the softswitch elements interconnect with IP and
the Public Switched Telephone Network (PSTN), and the signaling protocols used.

So the Media Gateway Control Function (MGCF) is the IMS element responsible for setting
up the Media Gateway which will bridge the IP data stream to a conventional telephony
circuit. Every IMS-enabled MGC has an instance of MGCF within it.
And that brings up another question: since there can be many instances of MGCF available, in
the operator's network and in other operators' networks which are interconnected, which one
is the best option to bridge between the NGN and the PSTN for each call? Deciding this is the
job of the Breakout Gateway Control Function (BGCF).
Last, but not least, there's the Multimedia Resource Function Controller (MRFC). Certain
application servers (see AS-FE in the picture) need help to deliver services to the UEs. Such
help can be:
According to ITU-T Recommendation Y.2021: multi-way conference bridges, announcement playback and media transcoding;

According to ETSI TS 123 228: mixing of incoming media streams (e.g. for multiple parties), media stream source (for multimedia announcements), media stream processing (e.g. audio transcoding, media analysis), floor control (i.e. manage access rights to shared resources in a conferencing environment).

Note that the MRFC only controls these activities. The actual execution is handled by
Multimedia Resource Function Processors (MRFPs) in ETSI parlance, or Multimedia
Resource Processor Functional Entities (MRP-FEs) in ITU-T jargon; both names refer to
the same software object.
And something very important to keep in mind: P-CSCF, S/I-CSCF, BGCF, MRFC and
MGCF are logical functions which are implemented in software, so they can exist in one
single host machine, or can be distributed among many host machines. Logically it doesn't
matter, but each vendor's physical implementation can vary, which can be confusing if you're
not aware of this.
And then we finally get to NGN signaling protocols: SIP and SDP. The picture below was
extracted from RFC 3261 and gives a fairly good example of a SIP dialog between two
users, Alice and Bob.

The entities involved in the call setup are called User Agents (UA). UAs which request
services are called User Agent Clients (UAC), and those which fulfill requests are called User
Agent Servers (UAS). Although the basic operating mode is end-to-end, the model supports
the use of intermediate proxy servers, which work as back-to-back User Agents (B2BUA)
relaying requests from one user to the other. In the picture, Alice's softphone and Bob's SIP
phone are the end-to-end user agents, while there are two proxy servers: atlanta.com and
biloxi.com. Linking this with what we already know about IMS, we can identify the P-CSCF
as a SIP proxy server, while the communicating UEs are the end-to-end UAs.

Note: Also visit my blog Smolka et Catervarii (Portuguese-only content for the moment).
Quoting RFC 3261:
SIP does not provide services. Rather, SIP provides primitives that can be used to
implement different services. For example, SIP can locate a user and deliver an opaque
object to his current location. If this primitive is used to deliver a session description written
in SDP, for instance, the endpoints can agree on the parameters of a session. If the same
primitive is used to deliver a photo of the caller as well as the session description, a caller
ID service can be easily implemented. As this example shows, a single primitive is typically
used to provide several different services.
SIP primitives are:
REGISTER: indicates a UA's current IP address and the Uniform Resource Identifiers (URIs) for which it would like to receive calls;
INVITE: used to establish a media session between UAs;
ACK: confirms reception of the final response to an INVITE (see below);
PRACK (Provisional ACK): confirms reception of provisional responses that were sent reliably (see below). This was added by RFC 3262;
OPTIONS: requests information about the capabilities of a SIP proxy server or UA, without setting up a call;
CANCEL: terminates a pending request;
BYE: terminates a session between two UAs.
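
To give an idea of what such a request looks like on the wire, here is a minimal sketch that assembles the text of a SIP request. The function name and parameters are my own, the header values are borrowed from the Alice and Bob example in RFC 3261, and the result is illustrative only: a real UA adds more headers and validates everything.

```python
def build_sip_request(method, request_uri, from_uri, to_uri, call_id,
                      cseq, via_host, branch, tag, contact_uri, body=""):
    """Assemble a SIP request as text: start line, headers, blank line, body."""
    lines = [
        f"{method} {request_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {via_host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <{to_uri}>",
        f"From: <{from_uri}>;tag={tag}",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} {method}",
        f"Contact: <{contact_uri}>",
    ]
    if body:
        # This sketch assumes that any body is an SDP session description.
        lines.append("Content-Type: application/sdp")
    lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # SIP, like HTTP, separates lines with CRLF and headers from body with an empty line.
    return "\r\n".join(lines) + "\r\n\r\n" + body


if __name__ == "__main__":
    print(build_sip_request(
        method="INVITE",
        request_uri="sip:bob@biloxi.com",
        from_uri="sip:alice@atlanta.com",
        to_uri="sip:bob@biloxi.com",
        call_id="a84b4c76e66710@pc33.atlanta.com",
        cseq=314159,
        via_host="pc33.atlanta.com",
        branch="z9hG4bK776asdhds",
        tag="1928301774",
        contact_uri="sip:alice@pc33.atlanta.com",
    ))
```

Note how the method name appears twice, in the start line and in the CSeq header; this is what lets a UA match a response to the request that caused it.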
Typically a SIP request has to get a response. Like HTTP, SIP responses are identified
with three-digit numbers. The leftmost digit indicates the category to which the response
belongs:

Provisional (1xx): request received and being processed;
Success (2xx): request was successfully received, understood, and accepted;
Redirection (3xx): further action needs to be taken by the sender to complete the request;
Client Error (4xx): request contains bad syntax or cannot be fulfilled at the destination server/UA;
Server Error (5xx): the destination server/UA failed to fulfill an apparently valid request;
Global Failure (6xx): the request cannot be fulfilled at any server/UA.
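
The mapping from the leftmost digit to the response category can be sketched in a few lines of Python. The names below are mine; the categories are the ones listed above.

```python
SIP_RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def classify_sip_response(status_code: int) -> str:
    """Return the category of a SIP response from its three-digit status code."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a valid SIP status code: {status_code}")
    return SIP_RESPONSE_CLASSES[status_code // 100]  # leftmost digit picks the class


if __name__ == "__main__":
    for code in (100, 180, 200, 302, 404, 486, 503, 603):
        print(code, classify_sip_response(code))
```

A UA that receives a status code it doesn't know can still react sensibly by looking at the category alone, which is the whole point of grouping the codes this way.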

The Session Description Protocol (SDP) is described in RFC 4566 (warning: the IETF MMUSIC
working group is preparing an Internet-Draft which will eventually supersede RFC 4566).
As a matter of fact, it should be called Session Description Format, since it's not a protocol
in the usual sense. SDP data can be carried over a number of protocols, and SIP is one of them
(although RFC 3261 says that all SIP UAs and proxy servers must support SDP for session
parameter characterization).
Quoting RFC 4566:
An SDP session description consists of a number of lines of text of the form:
<type>=<value>
where <type> MUST be exactly one case-significant character and <value> is structured
text whose format depends on <type>. In general, <value> is either a number of fields
delimited by a single space character or a free format string, and is case-significant unless a
specific field defines otherwise. Whitespace MUST NOT be used on either side of the "=" sign.
An SDP session description consists of a session-level section followed by zero or more
media-level sections. The session-level part starts with a "v=" line and continues to the first
media-level section. Each media-level section starts with an "m=" line and continues to the
next media-level section or end of the whole session description. In general, session-level
values are the default for all media unless overridden by an equivalent media-level value.
Some lines in each description are REQUIRED and some are OPTIONAL, but all MUST appear
in exactly the order given here (the fixed order greatly enhances error detection and allows
for a simple parser). OPTIONAL items are marked with a "*".

Here's an example of an actual SDP session description:
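
As a rough sketch of the type=value format and of the split between the session-level section and the media-level sections, the Python code below parses a session description. The description used is the example given in section 5 of RFC 4566, which is not necessarily the one originally pictured here, and the function name is my own. It does not check the order of the lines or the contents of each value.

```python
EXAMPLE_SDP = """\
v=0
o=jdoe 2890844526 2890842807 IN IP4 10.47.16.5
s=SDP Seminar
i=A Seminar on the session description protocol
u=http://www.example.com/seminars/sdp.pdf
e=j.doe@example.com (Jane Doe)
c=IN IP4 224.2.17.12/127
t=2873397496 2873404696
a=recvonly
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 99
a=rtpmap:99 h263-1998/90000
"""


def parse_sdp(text):
    """Split an SDP description into session-level lines and media-level sections.

    Returns (session, media): `session` is a list of (type, value) pairs and
    `media` is a list of such lists, one per "m=" section.
    """
    session, media = [], []
    current = session
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # <type> is exactly one character, immediately followed by "=".
        if len(line) < 2 or line[1] != "=":
            raise ValueError(f"not a <type>=<value> line: {line!r}")
        kind, value = line[0], line[2:]
        if kind == "m":  # every "m=" line opens a new media-level section
            current = []
            media.append(current)
        current.append((kind, value))
    if not session or session[0][0] != "v":
        raise ValueError('the session-level section must start with a "v=" line')
    return session, media


if __name__ == "__main__":
    session, media = parse_sdp(EXAMPLE_SDP)
    print("session-level lines:", len(session))
    for section in media:
        print("media section:", section[0][1])
```

In this description the session-level "a=recvonly" attribute applies to both media streams, since neither media-level section overrides it.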

Very well. I think that's enough for you to understand how NGN signaling works. Now it's time
to go one step down the TCP/IP protocol stack, so in our next article we'll start to
talk about transport protocols, and will understand how the sockets API is used to create
separate sessions over the transport protocols' service.
