
Visit techguide.com

The Technology Guide Series

Next-Gen VoIP Services and Applications Using SIP and Java

This Guide has been sponsored by

Don't let our sexy curves and cool colors fool you. The internet-age Pingtel xpressa phone, and its virtually limitless Java repertoire of revenue-enhancing possibilities, such as hosted IP voice services, is a very serious money maker indeed. To learn about the opportunities the world's most intelligent phone can bring you, go to www.pingtel.com/mintmoney. Or send an e-mail to us at hostedvoiceservices@pingtel.com and we'll get back to you.

For Service Providers, it's a mini branch of the U.S. Mint.

TECHNOLOGY GUIDE

Visit our Web site to read, download, and print all the Technology Guides in this series.

Table of Contents
Abstract
Introduction
Architecture Models
Technology Enablers for Next Generation Voice Services and Applications
Next Generation IP Voice Services and Applications
Summary
Glossary
Appendix A: Session Initiation Protocol (SIP) Concepts and Operation

Editorial Writing Team

techguide.com
Over 100 Technology Guides in the Following Categories:
Software Applications Network Management Enterprise Solutions Network Technology Telecommunications Convergence/CTI Internet Security

ATG's Technology Guides and White Papers are produced according to a structured methodology and proven process. Our editorial writing team has years of experience in IT and communications technologies, and is highly conversant in today's emerging technologies.

The Guide format and main text of this Guide are the property of The Applied Technologies Group, Inc. and is made available upon these terms and conditions. The Applied Technologies Group reserves all rights herein. Reproduction in whole or in part of the main text is only permitted with the written consent of The Applied Technologies Group. The main text shall be treated at all times as a proprietary document for internal use only. The main text may not be duplicated in any way, except in the form of brief excerpts or quotations for the purpose of review. In addition, the information contained herein may not be duplicated in other books, databases or any other medium. Making copies of this Guide, or any portion for any purpose other than your own, is a violation of United States Copyright Laws. The information contained in this Guide is believed to be reliable but cannot be guaranteed to be complete or correct. Any case studies or glossaries contained in this Guide or any Guide are excluded from this copyright. Copyright 2001 by The Applied Technologies Group, Inc. 209 West Central Street, Suite 301, Natick, MA 01760 Tel: (508) 651-1155, Fax: (508) 651-1171 E-mail: info@techguide.com, Web site: http://www.techguide.com


Abstract
This Technology Guide explains the unique benefits of using the Web architectural model with SIP and Java as the enabling technologies for next generation IP voice services and applications. Using the Web as a reference model for rapid innovation, the Guide contrasts the limitations of circuit-switched telephony and first generation VoIP architectures with the Web model. It summarizes the limitations of centralized-processing models such as traditional telephony, MGCP, and Megaco as compared to peer-to-peer models such as SIP and H.323. The Guide then explains in more detail the unique benefits of using SIP for call control and Java for making phones intelligent. SIP is compared with H.323 in terms of innovation, scalability, simplicity, ease of deployment, and standardization. The Guide also includes an explanation of SIP concepts and operation and a description of the Java features that support new voice services and applications. It concludes with examples of new voice services and applications made possible exclusively by SIP and Java.

Introduction
Traditional telephony has hit a wall in terms of innovation, ease of use, and cost reduction. The core components of traditional telephony (the terminal or telephone, the PBX, the central office switch, and the switching network) are struggling and failing to keep up with the rate of innovation on the Internet. The archaic telephony framework, with PBXs and Custom Local Area Signaling Services (CLASS) switches providing Centrex and enhanced residential services (call waiting, call forwarding, caller ID, etc.), cannot provide the types of features that are needed by a contemporary business in the age of e-commerce. The traditional business telephony solutions are complicated for both service administrators and users. Because of the daunting complexity of PBX and CLASS/Centrex user interfaces, users typically know and use only a fraction of the total feature set.

Now imagine telephony services in the context of current business needs. Users would still like to use a phone for making and receiving calls and playing voice-mail messages. However, they would also like to have the phone appliance integrated with a browser-based PC for managing phone books and seamlessly interfacing with other applications, such as customer relationship management (CRM), sales force automation (SFA), supply chain management (SCM), time accounting, etc. In other words, they want to perform the tasks most suitable for the PC on the PC and those most suitable for the telephone on a phone appliance, with the two devices seamlessly integrated. Today's telephone just cannot deal with this new business imperative.

In contrast, the Internet and Web-based communications have revolutionized the business environment and users' personal lifestyles through inexpensive, standards-based innovations. We already have data, multimedia, video, and music applications on the Internet. The Internet is already serving as the underpinning of critical business and IT solutions. In the last few years alone, the Internet and the Web have generated more innovations than traditional telephony has produced in its entire history. The next frontier for the Web is to apply the same degree of innovation to telephony. Most market surveys have verified that IP telephony is already supplementing traditional telephony, and it is expected that the IP telephony architecture will ultimately replace the traditional telephony model.

This Technology Guide explains the architecture of the new IP telephony model using Session Initiation Protocol (SIP) and Java. The Guide also demonstrates the power of SIP and Java in terms of scalability, ease of use, and innovative services and applications.


Figure 1B: First-generation IP telephony architectures (LAN PBX with a "call manager"; IP Centrex with a softswitch "gatekeeper")

Architecture Models
Circuit-Switched and First-Generation IP Telephony Architectures
The traditional telephony architecture is based on a centralized processing model. First generation IP telephony architectures use the Media Gateway Control Protocol (MGCP), Megaco, or vendor-proprietary protocols such as Cisco's Skinny Client Control Protocol (SCCP), all of which are also centralized architectures similar to traditional telephony.
Figure 1A: Traditional circuit-switched telephony architectures (PBX; Centrex with a CLASS 5 switch)

Both models have all of their intelligence in a centralized switch or server, which performs all of the telephony functions such as call setup, call forwarding, conference calling, etc. All requests, responses, and state changes must be processed by the central switch or server, with the end station being a dumb terminal. The following are the salient characteristics of the traditional telephony environment:

Archaic, Host-to-Dumb Terminal Architecture: Voice service architecture has not changed for generations. Today, PBX and Centrex services are delivered using switches that contain all application intelligence, just as mainframes and minicomputers did for IBM 3270 or VT100 terminals in old computer systems.

Dumb Terminal (The Telephone): Voice service delivery assumes a dumb terminal, which in telephony parlance is the telephone. The end-user interface for these services on the dumb telephone requires non-intuitive flash sequences and star codes. No options exist for making telephony features easier to use and increasing user productivity.

Hardware-Specific Software: The voice features reside in software that is usually hardware-specific and/or proprietary. This environment requires highly specialized software engineers who are expensive and hard to find. Even simple software modifications require extensive regression testing of feature interaction.

Limited Next-Generation Platforms: Next-generation voice service platforms still fall short of business needs. Most first-generation IP telephony systems, for both service providers and enterprises, do exploit IP for transport, and some feature a Java or XML software environment. However, this open environment is not easily extended by anyone other than the vendor or possibly a service provider; certainly not by the enterprise or an independent software vendor with a great idea. These systems, consequently, still perpetuate the same 1960s host-terminal architecture with a dumb telephone as the endpoint. The IP PBX is a host computer with all the smarts driving dumb IP phones. VoIP gateways, softswitches, and their feature servers are merely physically distributed mainframes talking to dumb terminals.


Web Architecture
The Web represents the most successful application architecture in history. The Web features many intelligent servers located everywhere on the network and an intelligent, browser-based client device (a PC or a low-cost Internet appliance). It is the client device, not the server, that both initiates and controls all communications with the server. When a user simply clicks on an icon to access an application, the browser pulls content in the form of HTML and applications (Java, JavaScript, Flash, ActiveX, etc.) from the server and runs them on the PC.

There is a complete disaggregation of services in the Web model. Not only do the services come from different servers, they may be provided by different and multiple service providers. Some of the examples (shown in figure 2) include Yahoo for news; Amazon for shopping; MSN for instant messaging; ASP services (such as Corio) for customer relationship management (CRM), sales force automation (SFA), and enterprise resource planning (ERP); and MP3.com for music. An enterprise can outsource as few or as many services as suits its business model.

Key characteristics of the Web architecture include:
Intelligent end devices (clients)
Distributed, intelligent servers (no central switch or server for services)
An open architecture leading to innovation, rapid application development, and lower costs

Figure 2: Web application architecture (intelligent servers such as CRM/SFA, MSN Instant Messenger, amazon.com, MP3.com, doubleclick.com, yahoo.com, and Virtualcart.com deliver Java, MP3, Flash, ActiveX, HTML, and cookies to intelligent clients)

Comparing the Architectures


The Web has revolutionized the world of business. It has enabled a whole new business paradigm in the form of e-business, portals, e-tailers, and collaborative applications. The Web has enabled businesses to reach business partners and customers worldwide with a click of the mouse. Telephony services must change dramatically to become a functional member of this business revolution. However, given their limitations, it is virtually impossible for the current telephony architectures to satisfy emerging requirements.

Innovation

Traditional and First-Generation IP Telephony

The telephone was invented more than 125 years ago. Since then it has enabled people to talk and do only a handful of other things, like use voice mail. All of the features and services in the voice world are solely defined and developed by PBX and CLASS switch manufacturers, just as mainframe applications were defined by the vendors. The PBX and CLASS switch vendors, their ideas, their bureaucratic practices, and their business motivations have held innovation in the voice world hostage. Voice features reside in software on the switch that is hardware-specific and vendor-specific. It is a proprietary environment that is not openly extensible. Even modest new functions require onerous regression testing of feature interaction. The centralized, closed-software environment offers no way for enterprises to add their own innovations or enhancements to telephony features, let alone individual users or software developers with really good ideas. Some features are impossible to implement because of the dumb telephone as the endpoint. Consequently, innovation is and will remain dead, especially when compared to the revolutions on the Web.

Web

Innovation on the Web occurs at the edges of the network, where anyone (businesses and individuals alike) can create Web sites that are immediately open for other users to interact with. On the Web, in contrast to traditional telephony, a new page or feature can be created in a few minutes. More importantly, the Web page can be conceived, created, delivered, and personalized by anyone: Yahoo, eBay, GE, a company, an individual, their kids, or their grandparents. Several million Web sites are in existence today, up from a few thousand in 1993. These sites satisfy everyone's personal and business needs for news, buying, entertainment, chat, sports, sex, etc., regardless of gender, race, religion, ethnic background, industry, and occupation. Amazon.com would not have happened if the world needed to rely on data communications vendors such as Alcatel, Cisco, Lucent, or Nortel to invent the service and add the features to a router or a switch.

Ease of Use

Traditional and First-Generation IP Telephony

For most telephone users, cryptic, impossible-to-remember flash sequences and * codes are the interface to thousands of PBX and CLASS features. For the fortunate few with block character displays, even IBM 3270 and VT100 terminals appear attractive. Users don't know what voice features exist and, if they do, they do not know how to use them. While most voice service platforms such as PBX and CLASS switches offer hundreds or thousands of features (300-400 features in a typical PBX, 3000-4000 in a CLASS 5 switch), most users know no more than a few: transfer, hold, last number redial. In research conducted by WorldCom, 9 out of 10 executives could not even transfer a call without resorting to the help scream: "Do I dial flash first and then the number, or the other way around?" Trying to set up just a 3-party conference call over a PBX is an even bigger nightmare. It's no wonder that the assisted conference calling businesses of AT&T, Sprint, and WorldCom are so big and profitable. For many, the most difficult part of changing jobs is learning a new phone system: "What do I dial to get an outside line?" Consequently, for the vast majority, ignorance is bliss, yet very expensive in user productivity.

Web

On the Web, millions of sites with billions, perhaps trillions, of pages can be easily navigated by pointing and clicking at pictures or words displayed on an intelligent, browser-based PC. In contrast to telephony feature usage, anyone from kids to their great-grandparents can easily discover and use any site on the Web. The browser's graphical user interface means that users do not have to memorize features as in the world of telephony. The use of any Web site is an intuitive discovery process, performed simply by pointing and clicking at images and words.

Scalability and Capacity

Traditional and First-Generation IP Telephony

In the telephony world, big centralized boxes have all the smarts. Whenever the telephone (the "terminal" in the parlance of telephone equipment vendors) sends a flash sequence or * code, it's the PBX or CLASS switch that figures out what it means. The PBX or the switch also must actively manage each and every call. Consequently, it just does not scale. Support for just one more user may end up requiring a hugely expensive replacement or addition.

Web

A Web site, however, can support millions of users. Scalability is achieved through the connectionless nature of IP and by adding more and bigger servers to the Web site. It is also achieved by exploiting an intelligent endpoint: the browser-based PC. In fact, it's the browser software that interprets Web objects and puts a Web page together. For example, in accessing a typical e-commerce site, it's the browser, not a server, that:
Retrieves and displays the source HTML page and embedded product images individually
Retrieves and runs a Java applet, JavaScript, Flash, ActiveX, or other application components
Retrieves and displays a dynamic advertisement from DoubleClick.com
Retrieves shopping cart services from ShoppingCart.com
Stores cookies to identify users and maintain states
Encrypts credit card numbers

Manageability

Traditional and First-Generation IP Telephony

An expert, the equivalent of the proverbial rocket scientist, must perform all maintenance and management tasks for the PBX or the switch. Tools for managing moves/adds/changes tend to be horrendous and, consequently, administrators learn only the basic coping skills. This makes it extremely costly to administer the switch. According to some estimates, it can cost as much as $300-$500 per PBX move/add/change. For a Centrex line, it can take weeks for a change to be implemented by the telephone company.

Web

Self-service by users is the normal operating model here for registration, buying things, personalizing information, etc. Every office device, including printers, copiers, and now intelligent IP phones, has a built-in Web server that enables remote configuration over the net via a browser interface. Every office device and home appliance is becoming more intelligent and capable of running automated diagnostics, reporting the findings, and ordering replacements before service is disrupted.

Exploiting the Web Architecture for Next Generation Voice Services and Applications

Figure 3 shows what telephony would look like if migrated to a Web-like architecture. In this model, services and applications are resources on the network and are accessed and controlled by the phone and not by a central switch or a gatekeeper. Nor does a central switch or gatekeeper control what the phone can do.

An enterprise has the option of providing PBX services locally through a premises-based system, or these services could be outsourced to a network-based service. The outsourced service not only eliminates capital costs but may actually provide richer services than those available from a PBX. The figure also shows some illustrative services such as unified messaging, presence messaging, instant messaging, and CRM integration, all of which can be provided by separate service providers offering best-of-breed solutions for an enterprise's or even an individual user's specific requirements.

Figure 3: Web architecture for next-generation voice services and applications (intelligent servers: CRM/SFA, presence and IM, audio auctions, hosted PBX service, IP PBX, unified messaging, PSTN gateways; intelligent clients: phone-to-phone data and app exchange, HTML, Java, MP3, PC app integration)

PCs and other phones are simply resources on the network that provide services to users. In this model, the PC may provide services for the phone, such as integration with desktop applications, or the phone may provide services for the PC, such as causing the phone to ring and automating conference calls in Microsoft Outlook.


Technology Enablers for Next Generation Voice Services and Applications


Clearly, while the model in figure 3 is quite pedestrian in the Web world, it is quite revolutionary in the context of traditional telephony. The components needed to implement this model for telephony are as follows:

Intelligent Servers: These are distributed resources that interact with intelligent clients (PCs and phones). In terms of hardware and software, these servers are standard Unix, Linux, and Microsoft Windows platforms. Compared to traditional PBXs, these servers offer a choice of multiple vendors and competitive pricing with an open application development environment.

Intelligent Phones: These phones should provide much more than incoming call ringing. In order to maintain their independence from a central switch, they must also provide local capabilities such as call hold, transfer, forwarding, redial, caller ID, multi-party conferencing, and many other traditional telephony features. The intelligent phones should be thin-client computing devices that can interoperate with PCs and servers on the network. These devices must support dynamic loading and management of applications such as Java applets. For ease of use, they should incorporate functions such as graphical and audio helpers to ease the use of traditional and next generation applications.

Phone Intelligence Technology: An ability to support small-footprint applications is the key to incorporating intelligence in phones. A powerful yet easy-to-use programming language, used widely for Web-enabling Internet appliances, is required. In addition to rich functionality for traditional Web applications, features developed specifically for telephony and security are mandatory. Lastly, the language must already be used by hundreds of thousands of programmers worldwide in order for innovation to happen rapidly.

Extensible, Scalable Call Control Protocol: A call control protocol is used for call-related functions such as setting up, monitoring, and terminating calls. However, in the new IP telephony model, the call control protocol must differ from traditional telephony and the first generation IP telephony protocols. For maximum scalability, the new call control protocol must support peer-to-peer communications whereby two or more phones can set up and communicate directly without requiring anything more than location services from a call control server. In addition, the protocol must allow the peer-to-peer exchange of applications and data in addition to voice communications. The call control protocol must support a wide range of environments, from the home office to the largest enterprise and from the smallest to the largest service provider. Thus, the protocol must be highly scalable as well as cost-effective in a diverse range of configurations. Since it is not possible to predict all future applications of IP telephony, the protocol must also be extensible in order to accommodate unforeseen requirements.
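The peer-to-peer requirement described above can be illustrated with a toy model. The Python sketch below is an illustration only: the class names LocationService and Phone, the example.com addresses, and the dictionary that stands in for the IP network are all assumptions made for this example and are not part of SIP or of any product. It shows a call control server that does nothing more than answer a location query, while the phones hold the call state themselves.

```python
class LocationService:
    """Hypothetical server that only maps a user address to a network contact."""

    def __init__(self):
        self._contacts = {}

    def register(self, address, contact):
        self._contacts[address] = contact

    def lookup(self, address):
        # Returns None when the user has not registered.
        return self._contacts.get(address)


class Phone:
    """Hypothetical intelligent endpoint that keeps its own call state."""

    def __init__(self, address, contact, location_service):
        self.address = address
        self.contact = contact
        self.active_calls = []
        self._location = location_service
        location_service.register(address, contact)

    def call(self, callee_address, network):
        # Step 1: the only help requested from a server is a location lookup.
        contact = self._location.lookup(callee_address)
        if contact is None:
            return False
        # Step 2: the caller reaches the callee directly, peer to peer.
        # `network` is a plain dict standing in for IP routing.
        callee = network[contact]
        callee.active_calls.append(self.address)
        self.active_calls.append(callee_address)
        return True
```

In this sketch, adding another user adds one dictionary entry at the location service and no per-call work, which is the scaling property the paragraph above asks of the protocol.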

SIP (Session Initiation Protocol): The Call Control Protocol
SIP introduces the benefits of the Web architecture to IP telephony. It provides a powerful, extensible, scalable, and easy-to-deploy protocol for call control and media exchange. Several standards are available for building IP telephony solutions. These include the Session Initiation Protocol (SIP) from the IETF; H.323, an ITU-T umbrella standard; the Media Gateway Control Protocol (MGCP) from the IETF; Media Gateway Control (Megaco), a joint protocol by the IETF and ITU-T; and proprietary protocols such as Cisco's Skinny Client Control Protocol (SCCP). A high-level comparison of these protocols is included in table 1.


Table 1: IP telephony standards

Attribute            SIP                           H.323                         MGCP           Megaco          Proprietary
Architectural model  Peer-to-peer                  Peer-to-peer                  Master/slave   Master/slave    Master/slave
Media types          Voice, video, data            Voice, video, limited data    Voice          Voice, video    Voice
Network scope        Intranet, extranet, Internet  Intranet, extranet, Internet  Intranet only  Intranet only   Intranet only
Extensibility        High                          Low                           Medium         Medium          Low
Scalability          High                          Medium                        Low            Low             Low
Ease of deployment   High                          Low                           Medium         Medium          Medium
Standardization      IETF                          ITU-T                         IETF           IETF and ITU-T  None

Why SIP

Of the protocols listed in table 1, only SIP and H.323 are peer-to-peer protocols. MGCP, Megaco, and Cisco's proprietary SCCP represent the old centralized model and suffer from that model's limitations, discussed earlier. Thus, the real choice for a protocol with Web-like benefits comes down to one of the peer-to-peer protocols: H.323 or SIP.

H.323, the older of the two protocols, was originally designed for video conferencing over the LAN. Since then it has been adapted to support voice and video over the WAN as well. SIP, however, was designed from the beginning for multimedia sessions and conferences over the WAN. Because of these differences in their design objectives, SIP offers numerous compelling advantages over H.323 in the areas of extensibility, scalability, and ease of deployment.

Today there are more products available supporting H.323 than SIP. However, since its introduction, SIP has rapidly been becoming the preferred protocol. A January 2001 survey of Voice over IP vendors in Network World found that while 75% of the vendors offered products based on one of the four H.323 versions, an approximately equal number of them were already planning to offer SIP-based products by June 2001. However, the more telling statistic was that less than 25% of the vendors were planning to upgrade their products from H.323 Version 2 to Version 3, and even fewer to Version 4, the latest version of H.323. According to the same survey, most vendors expected H.323 to become a legacy protocol. In contrast, the list of vendors supporting or planning to support SIP is growing rapidly. Service providers embracing SIP as of March 2001 include WorldCom, Level 3, Net2Phone, Telia, Webley, iBasis, LipStream, and TalkingNets, with many more anticipated.

The reasons for the rapid ascendancy of SIP become obvious when we compare it with H.323 in the areas of innovation, scalability, ease of deployment, manageability, and the standardization process. Appendix A provides additional details on SIP concepts, definitions, and operation.

Innovation

SIP enables new services and applications not possible with H.323 (or other IP telephony protocols) and easily empowers service providers, application developers, and enterprises to create unique, differentiated services and applications. For example, SIP uses a simple text-based encapsulation (based on the Internet standard MIME), which enables it to transmit data and application programs with the voice call, making it easy to send business cards, photos, and/or MP3-encoded information during a call. SIP also supports third-party call control, in which simple applications modify SIP messages to enable functions such as sending office calls to a home phone after 5:00 PM or forwarding video calls to a PC. Lastly, SIP anticipates the need for extensions (new protocol headers, methods, bodies, and parameters) to implement new and innovative applications. By design, not all products are required to support these extensions, just the endpoints (servers or phones) that want to use them.

Scalability

Being peer-to-peer protocols, both SIP and H.323 eliminate the need for central servers to control everything. Peer-to-peer protocols reduce the cost of the network and server infrastructure equipment necessary to support a user population of a given size. Among peer-to-peer protocols, SIP is much more efficient and less complex, and therefore more scalable, than H.323. H.323 is actually an umbrella specification that includes several protocols from other ITU-T standards. Tables 4 through 6 cover three categories of such protocols within H.323: Registration, Admission and Status (RAS); Q.931 for call control; and H.245 for transmission of non-telephony signals on the line. As shown in tables 2 and 3, SIP has a total of 6 methods (commands) and 6 classes of response codes, while H.323 has 21 commands/messages across the three protocols. SIP can be implemented as a stateless protocol and does not need to maintain any call states, which further increases the scalability of SIP. SIP also shows a substantially higher efficiency than H.323 during call set-up by using approximately 50% fewer messages. Figures 4 and 5 show call set-up messages for SIP and H.323, respectively. While H.323 requires a total of 13 message exchanges, SIP requires only 7.

SIP Methods and Response Codes

Table 2: SIP methods

Method    Function
INVITE    User or service is being invited to participate in a session.
ACK       Client has received a final response to an INVITE request.
OPTIONS   Server is being queried about capabilities.
BYE       User agent client indicates to server to release the call.
CANCEL    Cancels a pending request.
REGISTER  Client registers address with a SIP server.

Table 3: SIP response codes

Code  Class           Meaning
1xx   Informational   Request received, continuing to process request.
2xx   Success         Action successfully received, understood, and accepted.
3xx   Redirection     Further action required to complete request.
4xx   Client Error    Request contains bad syntax or cannot be executed at server.
5xx   Server Error    Server failed to execute an apparently valid request.
6xx   Global Failure  Request cannot be executed at any server.
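The response classes in table 3 depend only on the first digit of the status code, so a client can act on a code it has never seen before. The Python sketch below illustrates that idea; the function names classify_response and is_final are hypothetical and chosen for this example, not taken from any SIP library. It also reflects the ACK entry in table 2, which refers to a "final" response: in SIP, 1xx responses are provisional and every other class is final.

```python
# Class names as listed in table 3, keyed by the first digit of the code.
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def classify_response(status_code):
    """Return the table 3 class name for a three-digit SIP status code."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return RESPONSE_CLASSES[status_code // 100]


def is_final(status_code):
    """1xx responses are provisional; all other classes end the request."""
    return classify_response(status_code) != "Informational"
```

For example, a phone that receives 180 while a call is ringing keeps waiting, while 200, 404, or 603 each tell it that the request has been answered one way or another.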

22

TECHNOLOGY GUIDE
H.323 Commands/Messages

Next-Gen VoIP Services and Applications Using SIP and Java

23

Table 6: H323/H.248 commands and responses


H.248
Command/Message: Function
Master-Slave Determination: Determines which terminal is the master and which is the slave. Possible replies: Acknowledge, Reject, Release (in case of a time out).
Terminal Capability Set: Contains information about a terminal's capability to transmit and receive multimedia streams. Possible replies: Acknowledge, Reject, Release.
Open Logical Channel: Opens a logical channel for transport of audiovisual and data information. Possible replies: Acknowledge, Reject, Confirm.
Close Logical Channel: Closes a logical channel between two endpoints. Possible replies: Acknowledge.
Request Mode: Used by a receive terminal to request particular modes of transmission from a transmit terminal. General mode types include VideoMode, AudioMode, DataMode, and Encryption Mode. Possible replies: Acknowledge, Reject, Release.
Send Terminal Capability Set: Commands the far-end terminal to indicate its transmit and receive capabilities by sending one or more Terminal Capability Sets.
End Session Command: Indicates the end of the H.245 session. After transmission, the terminal will not send any more H.245 messages.

Table 4: H.323 RAS commands and responses

RAS
Command/Message: Function
RegistrationRequest (RRQ): Request from a terminal or gateway to register with a gatekeeper. Gatekeeper either confirms or rejects (RCF or RRJ).
AdmissionRequest (ARQ): Request for access to packet network from terminal to gatekeeper. Gatekeeper either confirms or rejects (ACF or ARJ).
BandwidthRequest (BRQ): Request for changed bandwidth allocation, from terminal to gatekeeper. Gatekeeper either confirms or rejects (BCF or BRJ).
DisengageRequest (DRQ): If sent from endpoint to gatekeeper, DRQ informs gatekeeper that endpoint is being dropped; if sent from gatekeeper to endpoint, DRQ forces the call to be dropped. Gatekeeper either confirms or rejects (DCF or DRJ). If DRQ is sent by gatekeeper, endpoint must reply with DCF.
InfoRequest (IRQ): Request for status information from gatekeeper to terminal.
InfoRequestResponse (IRR): Response to IRQ. May be sent unsolicited by terminal to gatekeeper at predetermined intervals.
RAS Timers and Request in Progress (RIP): Recommended default timeout values for response to RAS messages and subsequent retry counts if response is not received.

Table 5: H.323/Q.931 commands and responses

Q.931
Command/Message: Function
Alerting: Called user has been alerted; phone is ringing. Sent by called user.
Call Proceeding: Requested call establishment has been initiated and no more call establishment information will be accepted. Sent by called user.
Connect: Acceptance of call by called entity. Sent from called entity to calling entity.
Setup: Indicates a calling H.323 entity's desire to set up a connection to the called entity.
Release Complete: Indicates release of call if H.225.0 (Q.931) call signaling channel is open. Afterwards, call reference value can be reused. Sent by a terminal.
Status: Responds to an unknown call signaling message or to a Status Inquiry message. Provides call state information.
Status Inquiry: Requests call status. Can be sent by endpoint or gatekeeper to another endpoint.

Ease of Deployment
Deploying and supporting SIP is similar to deploying and supporting HTTP. SIP uses standard protocols and functions that already exist in current IP networks and are well understood by system administrators and technical support personnel. SIP has the following HTTP characteristics:
Standard Internet addressing: SIP uses the standard IP addressing format for both names and addresses, e.g., sip:username@abcorp.com or sip:1.781.938.5306@abcorp.com.
Clear text protocol: SIP uses clear text for its protocol encapsulation, unlike H.323, which uses binary encoding. This makes SIP easier to diagnose and troubleshoot.
Simple error messages: SIP uses familiar error messages with numeric prefixes such as 10x, 20x, etc.
Leverages other Internet protocols: SIP uses other familiar Internet protocols such as MIME and Session Description Protocol (SDP), again eliminating the need for new technical training or expertise.
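Because SIP responses are clear text, a status line can be inspected with ordinary string handling. The following is a minimal sketch, written in Python purely for illustration, that splits a response status line and maps the first digit of the code to its response class. The names parse_status_line and RESPONSE_CLASSES are assumptions made for this example and do not come from any SIP library.

```python
# Response classes keyed by the first digit of the SIP status code.
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def parse_status_line(line):
    """Split a SIP status line such as 'SIP/2.0 200 OK' into its parts.

    Hypothetical helper for illustration only.
    """
    # The reason phrase may contain spaces, so split at most twice.
    version, code_text, reason = line.strip().split(" ", 2)
    if not version.startswith("SIP/"):
        raise ValueError("not a SIP status line: " + line)
    code = int(code_text)
    return {
        "version": version,
        "code": code,
        "reason": reason,
        "class": RESPONSE_CLASSES.get(code // 100, "Unknown"),
    }
```

A troubleshooting script could apply such a helper to captured traffic to separate provisional responses (such as 100 Trying) from final ones (such as 200 OK).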
Figure 4: SIP Operation in Proxy Mode (Endpoint1@Site1 at Site 1 sends INVITE Endpoint2@Site2 to the proxy at Site 2; the proxy consults the location server and sends INVITE to Client2@Site2; 100 Trying and 200 OK are returned, followed by ACK)

Figure 5: H.323 Call set-up sequence (RAS, Q.931, and H.245 messages exchanged among Endpoint 1, the gatekeeper, and Endpoint 2: Admission Request and Admission Confirm, Setup, Call Proceeding, Alerting, Connect, Terminal Capability Set, Master/Slave Determination, Open Logical Channel, media (RTP), Close Logical Channel, End Session Command, Release Complete, Disengage Request and Disengage Confirm)

Standardization
The ITU-T, organized under the auspices of the United Nations, defines traditional telephony and H.323 standards. It is a slow-moving body with a highly political process. Participation in ITU-T activities is limited to paid members. Most ITU-T documents are written in very dense language, which makes it virtually impossible for the uninitiated to fathom their intent. Most ITU-T standards tend to be very complex. For example, the H.323 specification with its co-requisite protocols runs some 700 pages, compared to about 150 pages for SIP. The ITU-T specifications are not freely available and have to be purchased. As of February 2001, you could not even buy the H.323 specifications from the ITU-T bookstore because ITU-T still had not made them available for purchase.

In contrast, the Internet standardization process is geared toward rapid innovation. It has an open and democratic process which draws architects from industry, academia, and government, as well as individuals who are experts in specific technology areas. All Internet specifications are available for free to anyone and can simply be downloaded from the Internet. Lastly, Internet standardization is rooted in proof of concept, i.e., a prototype implementation must exist for a standard to achieve approved status. The standards documents often include model code to document the standard. Additionally, the actual code implementing a prototype is almost always available on the Internet for free download and use.


Java the Applications Engine

A key element of the proposed architecture for next-generation IP voice services and applications is an intelligent phone. Java is the ideal application engine technology for intelligent phones. Java has already proven itself as one of the most innovative technologies fueling Internet innovation, and Java applications are at the core of contemporary Web pages. Java applications do not reside permanently on thin clients and thus do not consume any resources on the phone when not needed. They are typically designed with very small footprints so that they can run on minimalist appliances. Simple Java applets can be developed in anywhere from a few minutes to a few hours. Key features of Java include:

Network Orientation
Java applications, called applets, run on thin clients. Java applets are network-aware and can open and access objects across the Internet via URLs.
The Remote Method Invocation (RMI) feature of Java allows the building of distributed applications. RMI-based applications can connect to other Java applications as well as legacy applications.
Java Naming and Directory Interface (JNDI) provides a unified interface to multiple heterogeneous naming and directory services, including LDAP directories. JNDI enables seamless connectivity to these services. Developers can build powerful and portable directory-enabled Java applications using this industry-standard interface.
Java Database Connectivity (JDBC) is an application programming interface (API) that provides cross-DBMS connectivity to a wide range of SQL databases. Using JDBC, an application can establish connectivity with nearly any enterprise or service provider database from a Java-enabled phone.
Java also features specifications and supporting products which can automate the process of distributing new versions of applications over the network. These include Java Management Extensions (JMX), the specification, and Java Dynamic Management Kit (JDMK), Sun's product which implements this specification.

Powerful APIs for Telephony and Speech Applications
Java has two APIs specially designed for telephony and speech applications:
Java Telephony API (JTAPI) defines interfaces to access the following functional areas: call control, telephone physical device control, media services, and telephony administrative services. JTAPI functions can be used with both wired and wireless phones, and its core functions can be extended to build applications such as call logging and tracking, auto-dialing, screen-based telephone applications, call routing applications, automated attendants, interactive voice response (IVR) systems, call management centers, voicemail, etc.
Java Speech API (JSAPI) allows developers to incorporate speech technology into the user interface of their Java applets and applications. This API specifies a cross-platform interface to support command and control recognizers, dictation systems, and speech synthesizers.

Security
Java has a built-in security framework, or sandbox, that can protect basic phone operation, such as making and receiving calls, from rogue or misbehaving applets. Java enables the construction of virus-free, tamper-free appliances like phones. It also incorporates authentication techniques based on public-key encryption. Java's security features also allow enterprises to control access to resources via policy-based permissions.

Support for a Wide Variety of Devices and User Interfaces
Java applets can run on virtually any platform due to their platform independence. A Java applet can be written once and run on virtually any operating system, including cell phone OSs, HP-UX, IBM AIX, Palm OS, Sun Solaris, VxWorks, Microsoft Windows, and various other varieties of Unix and Linux systems. To enable a Java application to execute anywhere on the network, the Java compiler generates an architecture-neutral object file, and the compiled code is executable on any processor that is running a Java runtime environment. Consequently, a Java applet written for an IP phone appliance can run without modification on a PC-based softphone supporting Java.

Ease of Development
Sun makes developing applications quick and easy with the tools in its Java Development Kit. In addition, Java is supported by numerous tools, components, and applications that are available from many vendors. In fact, many are available for free on the Internet. These tools include application and user interface (UI) components, authoring and workflow tools, and integrated development environments. A wide variety of Java training options, ranging from classroom to Web-based, are also available. Lastly, due to Java's tremendous popularity, Java software engineers are readily available on a permanent or contract basis to assist in development.

Next Generation IP Voice Services and Applications


SIP and Java also enable a whole new generation of applications which are impossible with other telephony architectures. These applications can generally be divided into three categories:
Personal productivity applications
Occupation-specific and industry-specific applications
Web-telephony integration (WTI) applications
Listed below are a few examples of each.

Personal Productivity Applications
Electronic business cards: send an enriched electronic virtual business card (vCard), including a photo and audio file, automatically with every call as caller ID information (or selectively during the middle of a call). This information can be added to any personal contact database such as Microsoft Outlook, a corporate CRM, or a Supply Chain Management (SCM) database with the push of a button.
Presence and instant messaging: use an instant messenger service to determine when geographically distributed colleagues are available for a quick conference call with a customer. Simply click or automatically camp on your buddy list to create the conference call.
Call filters: have every call from that very important customer ring at every phone: business phone, cell phone, home phone, vacation phone, etc. The call is completed to the first device on which the user picks up.
Phone book: use multiple phone books (corporate, personal, Internet, etc.) on the phone and simply point to an entry to make the call. The phone books can be synchronized with the data on a PC or any server.
Personalized music on hold: play personalized announcements or music from a favorite MP3 recording or Internet radio station while callers are on hold.
Voice tag elimination: deliver customized messages to people trying to reach busy contacts and eliminate phone tag.
Automated conference calling: create conference call appointments in Microsoft Outlook. The application would automatically set up the conference call at the specified time.
Distinctive rings: play unique rings from any sound file based on caller ID or personal directory information. Separate rings could be set up for a boss, spouse, kids, or anyone else.

Industry and Occupation-Specific Applications

Telecommuters: get all office telephony functionality at home, including extension dialing, call transfer, intranet intercom, call billing, etc.
Consultants: start the clock automatically for time accounting or billing when picking up the phone or dialing the number of a client, using caller ID or contact database information.
Sales reps: integrate voice and data information collected during a call with sales force automation applications such as ACT or Goldmine, or an ASP like sales.com.
Public relations: click-to-dial personalized and up-to-date press, analyst, and vendor contact lists, and track and report time on the phone by client using a public relations ASP like mediamap.com.

Web-Telephony Integration (WTI) Applications

Auction site for purchasing agents of electronic components: create a live audio auction for excess DRAM inventory and use the heat of a real-time event to pump up prices and the auctioneer's commission. Use Java applets on the phone to manage the bidding process and to track who raised a hand to bid first, etc.
Virtual call center ASP: support the integrated voice and data requirements of call center agents working from their homes.
Airline reservations: use a Java applet to visually display interactive voice response (IVR) options rather than forcing users to wait through very long recorded instructions and go through multi-level menus requiring the use of a telephone keypad.


Summary
The Web has revolutionized the world of business. Traditional telephony, however, cannot fulfill the needs of the emergent e-business model. The traditional telephony model is constrained by an inflexible and inefficient architecture based on centralized processing and the dumb terminal. This environment inhibits innovation, is nearly impossible to use, and simply perpetuates the old, cumbersome, and limited-functionality services. IP telephony needs to embrace the Web architectural model in order to achieve rapid and cost-effective innovation. Old definitions of enhanced services and features do not come anywhere near even the simplest applications made possible by technologies such as SIP and Java. SIP, coupled with Java, can bring the same revolutionary innovations and mindset to the world of IP telephony that the Web has brought to IT and the data world.

GLOSSARY
API: Application Programming Interface, a set of programming functions and calls supported by a language or a software product. APIs are used by software developers to develop programs in a specific language or to enhance or extend the capabilities of a product.
ASN.1: Abstract Syntax Notation 1, an object-oriented language used by various architectures such as OSI, ITU-T, and SNMP to define objects including data structures.
ASP: Application Services Provider, a service provider that provides applications over a network for a usage-based fee.
CLASS: Custom Local Area Signaling Services, services such as caller ID and ring back provided by a telephone company. Devices in the telephone central office that provide such services are called CLASS switches.
CPU: Central Processing Unit, the arithmetic and logic unit in a computer. Examples include the Intel Pentium family, the AMD Athlon, and the IBM RISC processors.
CRM: Customer Relationship Management software, used with applications such as ACT or Goldmine to keep track of customer contacts and sales information.
H.323: An ITU-T specification for multimedia conferences over IP for LAN-attached stations. It is a peer-to-peer protocol, as opposed to MGCP and Megaco, which require central control.
HTTP: Hyper Text Transfer Protocol, used for encoding and transferring Web objects from Web servers to Web browsers.
IVR: Interactive Voice Response, a system used for generating voice prompts and menus and for accepting and processing user responses.
JTAPI: Java Telephony API, an extension to Java that provides telephony functions such as call control.
JSAPI: Java Speech API, an extension to Java that provides functions for controlling dictation systems and speech synthesizers.
JNDI: Java Naming and Directory Interface, an extension to Java that provides a unified interface to multiple naming and directory services.
Megaco: Media Gateway Control, a VoIP protocol jointly developed by ITU-T and IETF. It uses softswitches and gatekeepers for central control of calls and conferences.
MGCP: Media Gateway Control Protocol, a VoIP protocol developed by the IETF. It uses softswitches and gatekeepers for central control of calls and conferences.
MIME: Multipurpose Internet Mail Extensions, an Internet standard used for encapsulating e-mail messages in clear text.
PBX: Private Branch Exchange, a customer-premises-based telephone switch for intra-campus and outside telephone calls.
PSTN: Public Switched Telephone Network, a general reference to telephone networks using circuit switching and time division multiplexing.
Q.931: An ITU-T call control protocol for ISDN, also used in H.323. It defines procedures for setting up and clearing calls.
RAS: Registration, Admission, and Status, a component of H.323 that defines procedures whereby users can register themselves with a gatekeeper as a preliminary step to setting up a call.
RMI: Remote Method Invocation, a component of Java that allows building of distributed applications that can connect to other Java applications as well as legacy applications.
RTCP: RTP Control Protocol, the control protocol for RTP that allows multimedia session partners to monitor the quality of their sessions.
RTP: Real-time Transport Protocol, an IP standard for encapsulating multimedia streams for transmission over IP networks. It includes information such as packet timestamps to help implement quality of service for a session.
SCCP: Skinny Client Control Protocol, a Cisco proprietary protocol for voice over IP that uses central control with gatekeeper-like functions.
SCM: Supply Chain Management, used in reference to application programs used for managing purchases and suppliers.
SDP: Session Description Protocol, an IETF standard to advertise multimedia conferences. SDP is intended for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation.
SFA: Sales Force Automation, used in reference to application programs used for managing sales activities such as capturing customer contact information, generating contracts, and generating order forms.
SIP: Session Initiation Protocol, an IETF standard for peer-to-peer multimedia sessions and IP telephony. An alternative to the ITU-T H.323 protocol.
VoIP: Voice over IP, a general reference to several technologies and protocols that allow voice telephony implementation over IP networks. Examples of components and technologies that enable VoIP include codecs, IP PBXs, softswitches, gateways, H.323, SIP, MGCP, and Megaco.

APPENDIX A


Session Initiation Protocol (SIP) Concepts and Operation


SIP is an Internet protocol defined in Request for Comments 2543 (RFC 2543). SIP is not just for voice communications; it supports data and multimedia in its core specification.

Figure 6: SIP and other Internet Protocols (application protocols Gopher, Kerb, SMTP, Telnet, FTP, SIP, SNMP, and RPC over TCP and UDP, which run over IP and a LAN or WAN interface)

In TCP/IP terminology, as shown in figure 6, SIP is an application-level protocol and runs over UDP but may use TCP. SIP is based on existing and well-understood Internet protocols and extends them to support IP telephony.

SIP Concepts

Session
A SIP session is a multimedia session consisting of a set of multimedia senders and receivers and the data streams flowing from senders to receivers. The session is the basic building block in SIP. All calls and conferences are established by setting up sessions among users.

Conference
A conference is a multimedia session, identified by a common session description. A conference can have zero or more members and includes the cases of a multicast conference, a full-mesh conference, and a two-party phone call, as well as combinations of these. Any number of calls can be used to create a conference.

Call
A call consists of all participants in a conference invited by a common source. A SIP call is identified by a globally unique call-ID.

Figure 7: SIP clients and servers (each endpoint contains a user agent client and a user agent server; SIP servers: proxy, redirect, location, registrar)

SIP Components

User Agent Clients and Servers
A user agent is a program that runs on a SIP device (e.g., the phone). It contains a client function and a server function.
The user agent client (UAC) is a program that initiates SIP requests, such as initiating a call. A UAC is also known as the calling user agent.
A user agent server (UAS) is a program that receives SIP requests, such as an incoming call, and sends back responses to those requests. A UAS is also known as the called user agent.

SIP Servers

Location Server
A location server is used to obtain information about a callee's possible location. A location is the IP address of the domain where a user is located. To locate a user, the name of the user is sent to the location server, and the location server returns zero or more locations (IP addresses of domains) where the callee may be found. If the caller already knows the IP address of the destination server, the caller can directly contact the callee's UAS.
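The lookup behavior described above, where a name maps to zero or more locations, can be sketched with a small in-memory table. This is an illustration in Python only; the class LocationService, its method names, and the example addresses are hypothetical, and a real location server would sit behind a directory or database rather than a dictionary.

```python
from collections import defaultdict


class LocationService:
    """Toy location table: user name -> list of known locations."""

    def __init__(self):
        self._bindings = defaultdict(list)

    def register(self, user, location):
        # Record a location for the user, ignoring exact duplicates.
        if location not in self._bindings[user]:
            self._bindings[user].append(location)

    def lookup(self, user):
        # Return zero or more locations; unknown users yield an empty list.
        return list(self._bindings.get(user, []))


service = LocationService()
service.register("endpoint2@site2.example", "client2@site2.example")
service.register("endpoint2@site2.example", "client2@site3.example")
```

Returning a list rather than a single value reflects the point that a callee may be reachable at several places, or at none.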

Proxy Servers
A proxy server is an intermediary program that acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally by a proxy server or forwarded, possibly after translation, to other servers. A proxy interprets and, if necessary, rewrites a request message before forwarding it.

Redirect Server
A redirect server is a server that accepts a SIP request, maps the address into zero or more new addresses, and returns these addresses to the client. Unlike a proxy server, it does not initiate its own SIP requests. Unlike a user agent server, it does not accept calls.

Registrar
A registrar is a server that accepts REGISTER requests. A client uses the REGISTER request to let a proxy or redirect server know the location where the client can be reached. It provides a means whereby users can register their locations with a SIP server dynamically. As users move to different locations, they can register their new locations with the local location server. To supplement information obtained through user registrations, a location server may also use one or more TCP/IP protocols, such as finger, rwhois, LDAP, multicast-based protocols, or operating-system-dependent mechanisms to actively determine the end system where a user might be reachable.

SIP Addressing
SIP uses traditional Internet names as addresses, which consist of a user name and a domain name. This is an important issue because it means that the existing Internet naming, addressing, and routing services can process SIP addresses without modification. Examples of SIP addresses include:
SIP:user01@bigcorp.com
SIP:user@25.16.10.8
SIP:1-212-555-1212@business.com
These addresses are similar to HTTP URL addresses except that they start with SIP instead of HTTP. The first example shows a user being identified via a typical e-mail address. The second example shows an address where the IP address of the destination is known. The last example shows how a phone-number-like address could be used under SIP. The major advantages of this addressing scheme are:
It invents no new directory structure and can be processed by existing IP servers.
Users can use familiar e-mail or URL addresses to make phone calls and have one less thing to remember, the phone number.

Domain Name Services (DNS)
DNS is a standard Internet service that converts names, e.g., the domain bigcorp.com in user01@bigcorp.com, into IP addresses, e.g., 172.30.10.20, that can be used for finding user locations and routing calls. Because SIP uses standard IP naming and addressing, existing, standard DNS services can be used for SIP without any modification.
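To make the addressing scheme concrete, the sketch below splits an address of the user@domain form shown above into its user and host parts and reports whether the host is already a literal IP address, in which case no DNS lookup of the name would be needed. It is a simplified illustration in Python: the function name parse_sip_address is hypothetical, and ports and other address parameters are deliberately not handled.

```python
import ipaddress


def parse_sip_address(address):
    """Split 'sip:user@host' into user and host (simplified sketch)."""
    scheme, sep, rest = address.partition(":")
    if not sep or scheme.lower() != "sip":
        raise ValueError("not a SIP address: " + address)
    user, at, host = rest.rpartition("@")
    if not at or not user or not host:
        raise ValueError("expected user@host in: " + address)
    try:
        # A literal IP address can be contacted without a name lookup.
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {"user": user, "host": host, "host_is_ip": host_is_ip}
```

Applied to the three example addresses, the sketch reports a domain name for the first and third and a literal IP address for the second.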

SIP Messages
SIP messages include SIP methods and responses to the methods. These are listed in tables 5 and 6.

SIP Message Encapsulation: MIME
Multipurpose Internet Mail Extensions (MIME) is the Internet standard for describing different types of content on the Internet, including video and image types. It is already used by HTTP for composing Web pages and by e-mail systems for encoding e-mail messages. SIP uses this well-established standard for encoding information, eliminating the need to invent a new technique for encoding voice and multimedia over the Internet.

SIP Call Setup
SIP is inherently capable of carrying voice, video, and multimedia calls. In the examples below, the setup flows remain the same irrespective of the type of call. These scenarios illustrate a call setup where the caller knows the name but not the IP address of the callee, necessitating the use of a SIP server. If the caller knew the IP address of the callee, the caller would not need services from the SIP servers. With the callee's destination IP address known, the caller's user agent client only needs to select the protocol (UDP by default), the port (5060 by default), and the IP address of the SIP user agent server to which the INVITE request should be sent.

A successful SIP call setup consists of two requests, an INVITE followed by an ACK. The INVITE request asks the callee to join a particular conference or establish a two-party conversation. It also includes information about the media types and formats that are allowed for the session. If the callee wishes to accept the call, it responds to the invitation by returning a similar description listing the media and format it wishes to use.
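As an illustration of an INVITE that carries a media description, the following Python sketch assembles a clear-text request whose body is an SDP description offering a single audio stream. It is a minimal sketch under stated assumptions, not a complete or authoritative message: the function name build_invite, the example addresses, the RTP port 49170, and the choice of payload type 0 are all hypothetical values chosen for the example.

```python
import uuid


def build_invite(caller, callee, caller_host, rtp_port=49170, call_id=None):
    """Return the text of an INVITE request with an SDP body.

    All addresses and the RTP port are hypothetical example values.
    """
    if call_id is None:
        # The call-ID must be globally unique; random ID plus host is one way.
        call_id = uuid.uuid4().hex + "@" + caller_host
    user = caller.split("@")[0]
    # SDP body offering one audio stream over RTP (payload type 0).
    sdp_lines = [
        "v=0",
        "o=" + user + " 0 0 IN IP4 " + caller_host,
        "s=call",
        "c=IN IP4 " + caller_host,
        "t=0 0",
        "m=audio " + str(rtp_port) + " RTP/AVP 0",
    ]
    body = "\r\n".join(sdp_lines) + "\r\n"
    header_lines = [
        "INVITE sip:" + callee + " SIP/2.0",
        "Via: SIP/2.0/UDP " + caller_host + ":5060",
        "From: sip:" + caller,
        "To: sip:" + callee,
        "Call-ID: " + call_id,
        "CSeq: 1 INVITE",
        "Content-Type: application/sdp",
        "Content-Length: " + str(len(body.encode("utf-8"))),
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(header_lines) + "\r\n\r\n" + body
```

The Content-Type header is where the MIME convention appears: it tells the receiver that the body is a session description, which the callee can answer with a similar description in its response.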


When the callee sends a response to the INVITE request agreeing to participate in the call, the caller sends an ACK to confirm the callee's response.

Call Setup Using a Proxy Server
To initiate a SIP call, a caller first locates the appropriate proxy server and then sends a SIP invitation request to the proxy server. The location of the proxy server is locally configured on the user station. The proxy server can also be discovered automatically by the caller using a variety of mechanisms such as DHCP options, DNS SRV, and others. Instead of directly sending the call to the intended callee, the proxy server may redirect the SIP request or trigger a chain of new SIP requests to other proxies or location servers. Figure 4 shows the detailed flows for SIP call setup using a proxy server, which are described below:
1. Endpoint1@Site1 sends an INVITE request for Endpoint2@Site2 to the proxy server.
2. The proxy server contacts the location service for Endpoint2.
3. The proxy server receives a more precise location for Endpoint2 as Client2@Site2 from the location server.
4. The proxy server issues an INVITE request to the address(es) returned by the location service. The INVITE request carries a Call-ID. (Upon receiving the INVITE request, the called user agent alerts the user by generating a phone ring.)
5. The called user agent returns a 100 Trying response indicating that it is processing the INVITE request.
6. The called user agent returns a 200 OK response to indicate successful processing of the INVITE request.

7. The calling user agent sends an ACK to complete the handshake. The call is now in place.

Call Setup Using Redirect Server
Again we assume that the IP address of the callee is not known to the caller's user agent, thereby necessitating the services of the local SIP server, a redirect server in this case. The key difference compared to the proxy server is that the redirect server cannot initiate an INVITE request.
Figure 8: SIP Operation in Redirect Mode (elements shown: Endpoint1@Site1 at Site 1; redirect server, location server, and Endpoint 2 at Site 2; Client2@Site3 at Site 3. Messages shown: INVITE Endpoint2@Site2, 302 Moved Temporarily with Contact: Client2@Site3, ACK, INVITE Client2@Site3, 100 Trying, 200 OK, ACK)

The flow of requests and responses for figure 8 is as follows:
1. Endpoint1@Site1 sends an INVITE request to the redirect server for Endpoint2@Site2.
2. The redirect server contacts the location server for location information about Endpoint2.
3. The location server returns information that this client can be found at Site3.
4. The redirect server forwards precise location information to the calling user agent using a 302 Moved Temporarily message: Contact Client2@Site3.
5. The calling user agent acknowledges the information with an ACK.
6. The calling user agent sends an INVITE request directly to the called user agent.
7. The called user agent returns a 100 Trying response indicating that it is processing the request.
8. The called user agent returns a 200 OK response to indicate successful processing of the INVITE request.
9. The calling user agent sends an ACK to complete the handshake. The call is now in place.
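The difference between the proxy and redirect flows can be made visible by writing both out as message traces. The Python sketch below is a simplified model rather than a protocol implementation: the function names are hypothetical, each trace entry is a (sender, receiver, message) tuple, the location lookup is shown as a plain dictionary access, and the intermediate hops that responses take are omitted.

```python
def proxy_mode(caller, callee, locations):
    """Trace for a call set up through a proxy server (seven steps)."""
    target = locations[callee]
    return [
        (caller, "proxy", "INVITE " + callee),
        ("proxy", "location server", "lookup " + callee),
        ("location server", "proxy", target),
        ("proxy", target, "INVITE " + target),  # the proxy sends the INVITE
        (target, caller, "100 Trying"),
        (target, caller, "200 OK"),
        (caller, target, "ACK"),
    ]


def redirect_mode(caller, callee, locations):
    """Trace for a call set up through a redirect server (nine steps)."""
    target = locations[callee]
    return [
        (caller, "redirect", "INVITE " + callee),
        ("redirect", "location server", "lookup " + callee),
        ("location server", "redirect", target),
        ("redirect", caller, "302 Moved Temporarily; Contact: " + target),
        (caller, "redirect", "ACK"),
        (caller, target, "INVITE " + target),  # the caller sends a new INVITE
        (target, caller, "100 Trying"),
        (target, caller, "200 OK"),
        (caller, target, "ACK"),
    ]


def invites_sent_by(trace, sender):
    """Count the INVITE requests that a given element sends in a trace."""
    return sum(1 for s, _, m in trace if s == sender and m.startswith("INVITE"))
```

Counting INVITE requests per sender in each trace shows the key difference stated above: the proxy issues an INVITE of its own, while the redirect server issues none and the caller sends a second INVITE directly to the returned address.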


Telephonic no longer rhymes with moronic.

Pingtel xpressa, the world's first Java-based IP phone, does just about anything a clever Java programmer could dream up. To see what your Java colleagues have taught our phone to do already, go to www.pingtel.com/payphone now and check out our App Dev Zone. A good idea of your own and who knows? You just might get rich. Or famous. Real fast.

For Java Developers, it's a pay phone.

This Technology Guide is one in an ongoing series of over 100 solutions-focused Guides. These Guides assist IT professionals in making informed business decisions about specific aspects of technology development and strategic deployment. The Technology Guide Series offers a broad array of titles, each presenting objective information and practical guidance in a non-biased, easy-to-understand style and tone. Our editorial writing team has many years of experience in IT and communications technologies, and is highly conversant in today's emerging technologies. The Technology Guide Series and techguide.com are supported by a consortium of leading technology providers. The Sponsor has lent its support to produce and publish this Guide. This Guide, as well as the entire Technology Guide Series, is made available to view and print at no charge by visiting techguide.com.
