
COMPUTER NETWORKS

WHAT IS A COMPUTER NETWORK?


It is a collection of autonomous computers interconnected by a single technology.
First, let us see the difference between a computer network and a distributed system.
The key distinction is that in a distributed system, a collection of independent computers appears
to its users as a single coherent system. Usually, it has a single model or paradigm that it presents
to the users. Often a layer of software on top of the operating system, called middleware, is
responsible for implementing this model. A well-known example of a distributed system is the
World Wide Web. It runs on top of the Internet and presents a model in which everything looks
like a document (Web page).
In a computer network, this coherence, model, and software are absent. Users are exposed to the
actual machines, without any attempt by the system to make the machines look and act in a
coherent way. If the machines have different hardware and different operating systems, that is
fully visible to the users. If a user wants to run a program on a remote machine, he has to log
onto that machine and run it there.
USES OF CN:
We will start with traditional uses at companies, then move on to home networking and recent
developments regarding mobile users, and finish with social issues.
1. Business Applications
2. Home Applications
3. Mobile Users.
4. Social Issues.
Business Applications:
Resource sharing is one of the important business applications. An obvious and
widespread example is having a group of office workers share a common printer. None of the
individuals really needs a private printer, and a high-volume networked printer is often cheaper,
faster, and easier to maintain
than a large collection of individual printers.
Client server model:
In the simplest of terms, one can imagine a company's information system as consisting of one
or more databases with company information and some number of employees who need to
access them remotely. In this model, the data are stored on powerful computers called servers.
Often these are centrally housed and
maintained by a system administrator. In contrast, the employees have simpler machines, called
clients, on their desks, with which they access remote data. The client and server machines are
connected by a network, as illustrated.

If we look at the client-server model in detail, we see that two processes (i.e., running programs)
are involved, one on the client machine and one on the server machine. Communication takes the
form of the client process sending a message over the network to the server process. The client
process then waits for a reply message. When the server process gets the request, it performs the
requested work or looks up the requested data and sends back a reply.
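The request/reply exchange described above can be sketched with TCP sockets. The host, port, and message format below are illustrative assumptions, not anything specified in the text:

```python
# Minimal client-server sketch: the client process sends a request and
# blocks waiting for a reply; the server performs the "requested work".
import socket
import threading

def server(sock: socket.socket) -> None:
    """Server process: wait for a request, do the work, send back a reply."""
    conn, _ = sock.accept()
    with conn:
        request = conn.recv(1024).decode()
        reply = f"result for {request!r}"   # stand-in for the real work
        conn.sendall(reply.encode())

def client(host: str, port: int, request: str) -> str:
    """Client process: send a request message, then wait for the reply."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(request.encode())
        return conn.recv(1024).decode()

if __name__ == "__main__":
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))        # ephemeral port on localhost
    listener.listen(1)
    port = listener.getsockname()[1]
    threading.Thread(target=server, args=(listener,)).start()
    print(client("127.0.0.1", port, "GET record 42"))   # prints the reply
```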

Home applications:
1.Access to remote information: Access to remote information comes in many forms. It can be
surfing the World Wide Web for information or just for fun. Information available includes the
arts, business, cooking, government, health, history, hobbies, recreation, science, sports, travel,
and many others. Fun comes in too many ways to mention, plus some ways that are better left
unmentioned.
2. Person-to-Person Communication: The second broad category of network use is person-to-person
communication, basically the 21st century's answer to the 19th century's telephone. E-mail is
already used on a daily basis by millions of people all over the world and its use is
growing rapidly. It already routinely contains audio and video as well as text and pictures. Smell
may take a while. Any teenager worth his or her salt is addicted to instant messaging. This
facility, derived from the UNIX talk program in use since around 1970, allows two people to
type messages at each other in real time. There are multi-person messaging services too, such as
the Twitter service that lets people send short text messages called tweets to their circle of
friends or other willing audiences.
Network hardware:
Personal area network : PANs (Personal Area Networks) let devices communicate over the range of a
person. A common example is a wireless network that connects a computer with its peripherals.
Almost every computer has an attached monitor, keyboard, mouse, and printer. Without using
wireless, this connection must be done with cables. So many new users have a hard time finding
the right cables and plugging them into the right little holes (even though they are usually color
coded) that most computer vendors offer the option of sending a technician to the user's home to
do it. To help these users, some companies got together to design a short-range wireless network
called Bluetooth to connect these components without wires.

LAN:
The next step up is the LAN (Local Area Network). A LAN is a privately owned
network that operates within and nearby a single building like a home, office or factory. LANs
are widely used to connect personal computers and consumer electronics to let them share
resources (e.g., printers) and exchange information. When LANs are used by companies, they are
called enterprise networks.

MAN:
A MAN (Metropolitan Area Network) covers a city. The best-known examples of
MANs are the cable television networks available in many cities. These systems grew from
earlier community antenna systems used in areas with poor over-the-air television reception. In
those early systems, a large antenna was placed on top of a nearby hill and a signal was then
piped to the subscribers' houses.

WAN (Wide Area Network):


The WAN as we have described it looks similar to a large wired LAN,
but
there are some important differences that go beyond long wires. Usually in a WAN, the hosts and
subnet are owned and operated by different people. In our example, the employees might be
responsible for their own computers, while the company's IT department is in charge of the rest
of the network. We will see
clearer boundaries in the coming examples, in which the network provider or telephone company
operates the subnet. Separation of the pure communication aspects of the network (the subnet)
from the application aspects (the hosts) greatly simplifies the overall network design.

NETWORK SOFTWARE:
Terminologies:
Protocol: An agreement or rule between the communicating parties on how the communication is to be
carried out, i.e., how data is transferred between the systems.
Architecture: A set of layers and protocols together form a network architecture.
Interface: It acts as a mediator between two adjacent layers and defines which services the lower
layer makes available to the upper layer.
Protocol Hierarchies:
To reduce their design complexity, most networks are organized as a stack of layers or levels,
each one built upon the one below it. The number of layers, the name of each layer, the contents
of each layer, and the function of each layer differ from network to network. The purpose of each
layer is to offer certain services to the higher layers while shielding those layers from the details
of how the offered services are actually implemented. In a sense, each layer is a kind of virtual
machine, offering certain services to the layer above it.
This concept is actually a familiar one and is used throughout computer science, where it is
variously known as information hiding, abstract data types, data encapsulation, and
object-oriented programming. The fundamental idea is that a particular piece of software (or hardware)
provides a service to its users but keeps the details of its internal state and algorithms hidden
from them.
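The layering idea can be illustrated with a toy encapsulation sketch. The layer names and the header format below are invented purely for illustration:

```python
# Toy illustration of protocol layering: on the way down, each layer
# prepends its own header; on the way up, each layer strips the header
# added by its peer and passes the rest to the layer above.
HEADERS = ["APP", "TRANSPORT", "NETWORK", "LINK"]  # top layer to bottom

def send_down(payload: str) -> str:
    """Each layer encapsulates whatever the layer above handed it."""
    for layer in HEADERS:
        payload = f"{layer}|{payload}"
    return payload

def receive_up(frame: str) -> str:
    """Each layer removes its peer's header and passes the rest up."""
    for layer in reversed(HEADERS):
        header, _, rest = frame.partition("|")
        assert header == layer, "header must belong to this layer's peer"
        frame = rest
    return frame

wire = send_down("hello")
print(wire)               # LINK|NETWORK|TRANSPORT|APP|hello
print(receive_up(wire))   # hello
```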

CONNECTION ORIENTED:
Connection-oriented service is modeled after the telephone system. To talk to
someone, you pick up the phone, dial the number, talk, and then hang up. Similarly, to use a
connection-oriented network service, the service user first establishes a connection, uses the
connection, and then releases the connection. The essential aspect of a connection is that it acts
like a tube: the sender pushes objects (bits) in at one end, and the receiver takes them out at the
other end. In most cases the order is preserved so that the bits arrive in the order they were sent.
In some cases when a connection is established, the sender, receiver, and subnet conduct a
negotiation about the parameters to be used, such as maximum message size, quality of service
required, and other issues. Typically, one side makes a proposal and the other side can accept it,
reject it, or make a counterproposal.
A circuit is another name for a connection with associated resources, such as a fixed bandwidth.

This dates from the telephone network in which a circuit was a path over copper wire that carried
a phone conversation.
CONNECTION ORIENTED
1. CONNECTION
2. DATA TRANSFER
3. DISCONNECT
Feature: a route (path) from the sending host to the receiving host is identified when the
connection is set up, and the same route is used to transfer all the packets.
VARIATIONS OF CONNECTION ORIENTED:
1. Reliable message stream
2. Reliable byte stream
3. Unreliable connection
Connectionless:
Connectionless service is modeled after the postal system. Each message (letter) carries the full
destination address, and each one is routed through the intermediate nodes inside the system
independent
of all the subsequent messages. There are different names for messages in different contexts; a
packet is a message at the network layer. When the intermediate nodes receive a message in full
before sending it on to the next node, this is called store-and-forward switching. The
alternative, in which the onward transmission of a message at a node starts before it is
completely received by the node, is called cut-through switching. Normally, when two
messages are sent to the same destination, the first one sent will be the first one to arrive.
However, it is possible that the first one sent can be delayed so that the second one arrives
first.
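As a contrast with the connection-oriented service above, a connectionless exchange can be sketched with UDP datagrams: each datagram carries the full destination address, and no connection is set up first. The addresses and message contents below are illustrative:

```python
# Connectionless sketch: each UDP datagram is addressed and sent
# independently, with no prior connect/accept handshake. Note that,
# unlike a connection, UDP gives no ordering guarantee in general.
import socket

def send_and_receive(messages: list[str]) -> list[str]:
    recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    recv.bind(("127.0.0.1", 0))
    dest = recv.getsockname()             # the "full destination address"
    send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for text in messages:
        send.sendto(text.encode(), dest)  # no connection established first
    got = [recv.recvfrom(1024)[0].decode() for _ in messages]
    send.close()
    recv.close()
    return got

print(send_and_receive(["first letter", "second letter"]))
```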
Variations:
Unreliable datagram connectionless service.
Acknowledged datagram service.
Request-reply connectionless service.
Layers:
OSI (open system interconnection)
The OSI model (minus the physical medium) is based on a proposal developed by
the International Standards Organization (ISO) as a first step toward international standardization
of the protocols used in the various layers (Day and Zimmermann, 1983). It was revised in 1995
(Day, 1995). The model is called the ISO OSI (Open Systems Interconnection) Reference Model
because it deals with connecting open systems, that is, systems that are open for communication
with other systems. We will just call it the OSI model for short.
The OSI model has seven layers. The principles that were applied to arrive at the seven layers

can be briefly summarized as follows:


1. A layer should be created where a different abstraction is needed.
2. Each layer should perform a well-defined function.
3. The function of each layer should be chosen with an eye toward defining internationally
standardized protocols.
4. The layer boundaries should be chosen to minimize the information flow across the interfaces.
5. The number of layers should be large enough that distinct functions need not be thrown
together in the same layer out of necessity and small enough that the architecture does not
become unwieldy.

1.Physical layer: The physical layer is concerned with transmitting raw bits over a
communication
channel. The design issues have to do with making sure that when one side sends a 1 bit it is
received by the other side as a 1 bit, not as a 0 bit. Typical questions here are what electrical
signals should be used to represent a 1 and a 0, how many nanoseconds a bit lasts, whether
transmission may proceed simultaneously in both directions, how the initial connection is
established, how it is torn down when both sides are finished, how many pins the network
connector has, and what each pin is used for. These design issues largely deal with mechanical,
electrical, and timing interfaces, as well as the physical transmission medium, which lies below
the physical layer.
2.Data link layer: The main task of the data link layer is to transform a raw transmission facility
into a line that appears free of undetected transmission errors. It does so by masking the real

errors so the network layer does not see them. It accomplishes this task by having the sender
break up the input data into data frames (typically a few hundred or a few thousand bytes) and
transmit the frames sequentially. If
the service is reliable, the receiver confirms correct receipt of each frame by sending back an
acknowledgement frame.
Another issue that arises in the data link layer (and most of the higher layers as well) is how to
keep a fast transmitter from drowning a slow receiver in data. Some traffic regulation mechanism
may be needed to let the transmitter know when the receiver can accept more data. Broadcast
networks have an additional issue in the data link layer: how to control access to the shared
channel. A special sublayer of the data link layer, the medium access control sub layer, deals
with this problem.
The network layer: The network layer controls the operation of the subnet. A key design issue is
determining how packets are routed from source to destination. Routes can be based on static
tables that are wired into the network and rarely changed, or more often they can be updated
automatically to avoid failed components. They can also be determined at the start of each
conversation, for example, a terminal session, such as a login to a remote machine. Finally, they
can be highly dynamic, being determined anew for each packet to reflect the current network
load.
If too many packets are present in the subnet at the same time, they will get in one another's
way, forming bottlenecks. Handling congestion is also a responsibility of the network layer, in
conjunction with higher layers that adapt the load they place on the network. More generally, the
quality of service provided (delay, transit time, jitter, etc.) is also a network layer issue.
Transport layer:
The basic function of the transport layer is to accept data from above it, split it up into smaller
units if need be, pass these to the network layer, and ensure that the pieces all arrive correctly at
the other end. Furthermore, all this must be done efficiently and in a way that isolates the upper
layers from the inevitable changes in the hardware technology over the course of time.
The transport layer also determines what type of service to provide to the session layer, and,
ultimately, to the users of the network. The most popular type of transport connection is an
error-free point-to-point channel that delivers messages or bytes in the order in which they were sent.
However, other possible kinds of transport service exist, such as the transporting of isolated
messages with no guarantee
about the order of delivery, and the broadcasting of messages to multiple destinations. The type
of service is determined when the connection is established. (As an aside, an error-free channel is
completely impossible to achieve; what people really mean by this term is that the error rate is
low enough to ignore
in practice.)
The session layer: The session layer allows users on different machines to establish sessions
between them. Sessions offer various services, including dialog control (keeping track of whose
turn it is to transmit), token management (preventing two parties from attempting the same
critical operation simultaneously), and synchronization (check pointing long transmissions to
allow them to pick up from where they left off in the event of a crash and subsequent recovery).

The presentation layer: Unlike the lower layers, which are mostly concerned with moving bits
around, the presentation layer is concerned with the syntax and semantics of the information
transmitted. In order to make it possible for computers with different internal data
representations to communicate, the data structures to be exchanged can be defined in an abstract
way, along with a standard encoding to be used on the wire. The presentation layer manages
these abstract data structures and allows
higher-level data structures (e.g., banking records) to be defined and exchanged.
Application layer: The application layer contains a variety of protocols that are commonly
needed by users. One widely used application protocol is HTTP (Hypertext Transfer Protocol),
which is the basis for the World Wide Web. When a browser wants a Web page, it sends the name
of the page it wants to the server hosting the page using HTTP. The server then sends the page
back. Other application protocols are used for file transfer, electronic mail, and network news.
2. Tcp/Ip reference model: Let us now turn from the OSI reference model to the reference model
used in the grandparent of all wide area computer networks, the ARPANET, and its successor,
the worldwide Internet. Although we will give a brief history of the ARPANET later, it is useful
to mention a few key aspects of it now. The ARPANET was a research network sponsored by the
DoD (U.S. Department of Defense). It eventually connected hundreds of universities and
government installations, using leased telephone lines. When satellite and radio networks were
added later, the existing protocols had trouble interworking with them, so a new reference
architecture was needed. Thus, from nearly the beginning, the ability to connect multiple
networks in a seamless way was one of the major design goals. This architecture later became
known as the TCP/IP Reference Model, after its two primary protocols. It was first described by
Cerf and Kahn (1974), and later refined and defined as a standard in the Internet community
(Braden, 1989). The design philosophy behind the model is discussed by Clark (1988).
The link layer: All these requirements led to the choice of a packet-switching network based
on a connectionless layer that runs across different networks. The lowest layer in the model, the
link layer describes what links such as serial lines and classic Ethernet must do to meet the
needs of this connectionless internet layer. It is not really a layer at all, in the normal sense of the
term, but rather an interface between hosts and transmission links. Early material on the TCP/IP
model has little
to say about it.
The Internet layer: The internet layer is the linchpin that holds the whole architecture together.
It is shown in Fig. 1-21 as corresponding roughly to the OSI network layer. Its job is to permit
hosts to inject packets into any network and have them travel independently to the destination
(potentially on a different network). They may even arrive in a completely different order than
they were sent, in which case it is the job of higher layers to rearrange them, if in-order delivery
is desired. Note that "internet" is used here in a generic sense, even though this layer is present
in the Internet.

The transport layer: The layer above the internet layer in the TCP/IP model is now usually
called the transport layer. It is designed to allow peer entities on the source and destination
hosts to carry on a conversation, just as in the OSI transport layer. Two end-to-end transport
protocols have been defined here. The first one, TCP (Transmission Control Protocol), is a
reliable connection-oriented protocol that allows a byte stream originating on one machine to be
delivered without error on any other machine in the internet. It segments the incoming byte
stream into discrete messages and passes each one on to the internet layer. At the destination, the
receiving TCP process reassembles the received messages into the output stream. TCP also
handles flow control to make sure a fast sender cannot swamp a slow receiver with more
messages than it can handle.
The internet:
The Internet is not really a network at all, but a vast collection of different networks that use
certain common protocols and provide certain common services. It is an unusual system in that it
was not planned by anyone and is not controlled by anyone. To better understand it, let us start
from the beginning and see how it has developed and why. For a wonderful history of the
Internet, John Naughton's (2000) book is highly recommended. It is one of those rare books that
is not only fun to read, but also has 20 pages of ibid.'s and op. cit.'s for the serious historian.
Some of the material in this section is based on this book.
Architecture of the Internet: The architecture of the Internet has also changed a great deal as it
has grown explosively. In this section, we will attempt to give a brief overview of what it looks
like today. The picture is complicated by continuous upheavals in the businesses of telephone
companies (telcos), cable companies and ISPs that often make it hard to tell who is doing what.
One driver of these upheavals is telecommunications convergence, in which one network is used
for previously different
uses. For example, in a triple play one company sells you telephony, TV, and Internet service
over the same network connection on the assumption that this will save you money.
Consequently, the description given here will be of necessity somewhat simpler than reality. And
what is true today may not be true tomorrow.
The big picture is shown in Fig. 1-29. Let us examine this figure piece by piece, starting with a
computer at home (at the edges of the figure). To join the Internet, the computer is connected to
an Internet Service Provider, or simply ISP, from whom the user purchases Internet access or

connectivity. This lets the computer exchange packets with all of the other accessible hosts on
the Internet.The user might send packets to surf the Web or for any of a thousand other uses, it
does not matter. There are many kinds of Internet access, and they are usually distinguished by
how much bandwidth they provide and how much they cost, but
the most important attribute is connectivity.

ATM (ASYNCHRONOUS TRANSFER MODE)

1. ATM is considered better than OSI.
2. Basically, ATM offers services for moving IP packets.
3. ATM networks are connection oriented, and the connection is called a virtual circuit.
4. After establishing the virtual circuit, either side can begin transmission of data.
5. ATM transmits fixed-size 53-byte packets called cells: 5 bytes for the header and 48
bytes for the payload (the actual data).

PHYSICAL LAYER
---The theoretical basis for data communication: information can be transmitted through wires by
varying some physical property such as voltage or current.
---By representing the value of the voltage as a single-valued function of time f(t), the
behavior of the signal can be modelled.
Fourier series:
Any reasonably behaved periodic function g(t) with period T can be constructed as the sum
of a (possibly infinite) number of sines and cosines:

g(t) = c/2 + Σ (n=1 to ∞) a_n sin(2πnft) + Σ (n=1 to ∞) b_n cos(2πnft)

where f = 1/T is the fundamental frequency, a_n and b_n are the sine and cosine amplitudes
of the nth harmonic, and c is a constant. Such a decomposition is called a Fourier series.

The amplitudes a_n, b_n, and c can be computed as

a_n = (2/T) ∫ from 0 to T of g(t) sin(2πnft) dt

b_n = (2/T) ∫ from 0 to T of g(t) cos(2πnft) dt

c = (2/T) ∫ from 0 to T of g(t) dt
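The coefficient formulas above can be checked numerically. The square wave used below is an illustrative choice of g(t), with T = 1:

```python
# Numerical check of the Fourier sine coefficients a_n. As an
# illustrative choice, g(t) is a square wave: 1 for 0 <= t < T/2,
# else 0, with period T = 1.
import math

T = 1.0
f = 1.0 / T
N = 100_000                       # rectangle-rule integration steps

def g(t: float) -> float:
    return 1.0 if t < T / 2 else 0.0

def a(n: int) -> float:
    """a_n = (2/T) * integral over one period of g(t) sin(2*pi*n*f*t) dt."""
    dt = T / N
    return (2 / T) * sum(
        g(i * dt) * math.sin(2 * math.pi * n * f * i * dt) * dt for i in range(N)
    )

# Closed form for this square wave: a_n = 2/(pi*n) for odd n, 0 for even n.
print(round(a(1), 3), round(a(3), 3))   # 0.637 0.212
```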

--Bandwidth: The range of frequencies transmitted without being modified or altered is called
bandwidth.
--The bandwidth is a physical property of the transmission medium and depends on the
construction, thickness, and length of the medium.

--Magnetic media.
--Twisted pair: A twisted pair consists of two insulated copper wires, each about 1 mm thick.
--These wires are twisted together because two parallel wires form a fine antenna; twisting
reduces the radiated interference.
--Twisted pairs can run several kilometers without amplification, but for longer distances
repeaters are required.
--The bandwidth depends on the thickness of the wire and several megabits/sec transmission rate
can be achieved.
--They are cheap and perform well.

Coaxial Cables:

i.)Coaxial cable consists of stiff copper wire surrounded by an insulating material.

ii.)The insulator is encased by a cylindrical conductor as a closely braided mesh.


iii.)The outer conductor is covered by a protective plastic sheath.
iv.)The bandwidth depends on the cable quality and length; modern cables have a
bandwidth of up to 1 GHz.

Ex: The cable wiring used for cable TV and MANs.

Fiber Optics:
An optical transmission has three key components
i.)The light source.
ii.)The transmission media.
iii.)Detector.

--A pulse of light indicates a one bit and the absence of light indicates a zero bit.

--The transmission medium is an ultrathin fiber of glass and the detector generates an electrical
pulse when light falls on it.
--By attaching a light source to one end of an optical fiber and a detector to the other end, we
obtain a unidirectional data transmission system that accepts an electrical signal, converts and
transmits it as light pulses, and then reconverts the output to an electrical signal at the
receiving end.
--When light passes from one medium to other the ray is refracted at the boundary the amount of
refraction depends on the media.

Attenuation:
-- The loss of energy as the signal propagates outwards.
-- The loss of energy is expressed in decibels per unit length of the medium.

Noise:
An unwanted energy from sources other than the transmitter.

Modems:
To avoid the problems associated with DC signalling, AC signalling by means of modems is used.

a.)It shows a binary signal of 0s and 1s.


b.)The second sine wave is an example of amplitude modulation for two different amplitudes
which are used to represent 0 and 1.
c.)Shows frequency modulation where two different frequencies are used to represent 0 and
1.
d.)It represents the simplest form of phase-shift modulation, in which the sine wave carrier is
systematically shifted between 0° and 180°.

Quadrature phase shift keying (QPSK):

--In this scheme we use phase angles of 45°, 135°, 225°, and 315°, with constant amplitude.

--The phase of a dot is indicated by its angle from the origin; there are four valid combinations,
so each symbol can be used to transmit 2 bits.
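The four-phase mapping can be sketched as follows. The Gray-coded bit-to-phase assignment is one common convention, chosen here purely for illustration:

```python
# QPSK sketch: each 2-bit group selects one of four phases at constant
# amplitude. The bit-to-phase table is an illustrative Gray coding.
import cmath
import math

PHASES = {"00": 45, "01": 135, "11": 225, "10": 315}   # degrees

def modulate(bits: str) -> list[complex]:
    """Map each 2-bit group to a unit-amplitude constellation point."""
    symbols = []
    for i in range(0, len(bits), 2):
        angle = math.radians(PHASES[bits[i:i + 2]])
        symbols.append(cmath.rect(1.0, angle))   # amplitude 1, given phase
    return symbols

# Two symbols, constant amplitude, phases 45° and 135°:
for sym in modulate("0001"):
    print(round(abs(sym), 3), round(math.degrees(cmath.phase(sym)), 1))
```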

Multiplexers:
--Telephone companies have developed elaborate schemes for multiplexing many conversations over
a single trunk. One of them is frequency division multiplexing (FDM).
--The frequency spectrum is divided into frequency bands, with each user having exclusive
possession of some band.

---Time division multiplexing (TDM):

-The users take turns, each one periodically getting the entire
bandwidth for a little burst of time.
---Baud rate:
-The number of samples (symbols) per second is measured in bauds.
--Bit rate:
-The number of bits transferred in one second is called the bit rate.
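The two rates are related by the number of bits each symbol carries. A small worked example (the 2400-baud line speed is an illustrative number):

```python
# Relation between baud rate and bit rate: bit rate = baud rate * bits
# per symbol. A symbol distinguishing `levels` values carries
# log2(levels) bits, so QPSK (4 phases) carries 2 bits per symbol.
import math

def bit_rate(baud: int, levels: int) -> float:
    """Bits per second for a line running at `baud` symbols/second."""
    return baud * math.log2(levels)

print(bit_rate(2400, 4))   # QPSK at 2400 baud: 4800.0 bit/s
```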

Switching:
When a message is created at the sender and is intended for the receiver, there are three
varieties of switching:
i.)Circuit switching: In circuit switching a circuit, i.e., a dedicated path from sender to
receiver, is established first; the entire message then passes through the circuit bit by bit.

ii.)Message switching: In this form of switching no physical path is established in advance
between sender and receiver.
--When the sender has a block of data to be sent, it is stored in the first router and then
forwarded later.
--Each router inspects the block for errors and then transmits it to the next router, until it
reaches the receiver.
--A network using this technique is called a store-and-forward network.

iii.)Packet Switching: Packet switching is very similar to message switching, except that a tight
upper limit is placed on the size of each block (packet); the packets are held in a router's main
memory and then forwarded toward the receiver.

Differences b/w Circuit Switching and Packet Switching:

Circuit Switching                          | Packet Switching
Circuit setup is required.                 | No circuit setup is required.
Reservation of bandwidth is required.      | Reservation of bandwidth is not required.
Fault tolerance is very low.               | Fault tolerance is higher.
No congestion (bandwidth is reserved).     | Congestion is possible.
Bandwidth is wasted in some situations.    | No wastage of bandwidth.

2.DATA LINK LAYER


FUNCTIONALITIES:
a. Provides well-defined services and acts as an interface to the network layer.
b. Deals with transmission errors.
c. Regulates the flow of data so that slow receivers are not swamped by fast senders.
Relationship between packets and frames:

Services provided by Data Link Layer to Network Layer:


a)Unacknowledged connectionless service.
b)acknowledged connectionless service.
c)acknowledged connection- oriented service.
FRAMING METHODS:
a)Character count.
b)Character stuffing.
c)Bit stuffing.
Breaking the incoming bit streams into frames is one of the functions of the data link layer. It is
basically used to check for transmission errors in the data.
Fig.: A character stream (a) without errors (b) with errors.

Fig.: (a) A frame delimited by flag bytes. (b) Four examples of byte sequences before and after
stuffing.
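Framing method (c), bit stuffing, can be sketched as follows: after five consecutive 1 bits the sender inserts a 0, so the flag pattern 01111110 can never appear inside the data. The sample bit string is illustrative:

```python
# Bit-stuffing sketch: the sender inserts a 0 after every run of five
# 1s; the receiver removes those stuffed 0s to recover the data.
FLAG = "01111110"

def stuff(bits: str) -> str:
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:              # five 1s in a row: insert a 0
            out.append("0")
            run = 0
    return "".join(out)

def unstuff(bits: str) -> str:
    out, run, i = [], 0, 0
    while i < len(bits):
        b = bits[i]
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:              # the next bit is a stuffed 0: skip it
            i += 1
            run = 0
        i += 1
    return "".join(out)

data = "011111101111101"
print(stuff(data))                    # 01111101011111001
print(unstuff(stuff(data)) == data)   # True
```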

CYCLIC REDUNDANCY CHECK:

It is one of the error-detecting methods and is used to detect transmission errors.
In this process, we have a generator and a frame.
For example, if the generator has n digits, we append n-1 zeroes at the end of the frame and
carry out mod-2 division.
After performing the division as mentioned above, if the remainder obtained is 0, we
conclude there is no error and the data was correctly transmitted.
If the remainder is not 0, there is an error; the message is
discarded and the receiver asks the sender to retransmit the data.
The figure below illustrates the calculation for the frame 1101011011 using the generator
G(x) = x^4 + x + 1.
Calculation of the polynomial code checksum:
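The division can also be carried out in code. This sketch reproduces the calculation for the frame and generator above (10011 is the bit pattern of x^4 + x + 1):

```python
# Mod-2 (XOR) long division for the CRC example above: frame 1101011011
# with generator 10011, i.e. G(x) = x^4 + x + 1. Four zeros (the degree
# of G) are appended before dividing; the last four bits are the remainder.
def crc_remainder(frame: str, generator: str) -> str:
    k = len(generator) - 1
    bits = [int(b) for b in frame + "0" * k]   # append k zero check bits
    gen = [int(b) for b in generator]
    for i in range(len(frame)):                # slide the generator along
        if bits[i] == 1:
            for j, gbit in enumerate(gen):
                bits[i + j] ^= gbit            # mod-2 subtraction is XOR
    return "".join(str(b) for b in bits[-k:])  # remainder = last k bits

rem = crc_remainder("1101011011", "10011")
print(rem)                   # 1110
print("1101011011" + rem)    # transmitted frame: 11010110111110
```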

Elementary Data Link Protocols:


When the data link layer accepts a packet, it encapsulates the packet in a frame by
adding a data link header and a trailer to it. A frame thus consists of an embedded packet, some
control information, and a checksum.
When a frame arrives at the receiver, the hardware computes the checksum. If an error is found,
the data link layer is informed. If there is no error, the data link layer checks the control
information in the header and passes the packet to the network layer.

A frame consists of 4 fields.


1) Kind
2) Sequence
3) Acknowledgement
4) Information
The first 3 contain control information and the 4th contains the actual data to be transferred.
These control fields are collectively called the frame header.
The kind field tells whether there are any data in the frame, because some of the protocols
distinguish frames containing only control information from those containing data as well. The
seq and ack fields are used for sequence numbers and acknowledgements, respectively. The info
field of a data frame contains a single packet.
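The four-field layout can be sketched as a data structure. The field types here are illustrative; a real protocol packs these into fixed-width bit fields in the header:

```python
# The kind/seq/ack/info frame layout described above, as a data class.
from dataclasses import dataclass

@dataclass
class Frame:
    kind: str        # "data" or "ack": does the frame carry data?
    seq: int         # sequence number of this frame
    ack: int         # sequence number being acknowledged
    info: bytes      # the embedded network-layer packet

def header(f: Frame) -> tuple:
    """The control fields of the frame (everything except info)."""
    return (f.kind, f.seq, f.ack)

f = Frame("data", seq=0, ack=1, info=b"packet payload")
print(header(f))   # ('data', 0, 1)
```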
1) Unrestricted Simplex Protocol.
Assumptions:
1. Data is transmitted only in one direction.
2 .Both transmitting and receiving layers are always busy.
3. Processing time is ignored.
4. Infinite buffer space is available.
5. No chance of damaged frames.
6. This protocol consists of 2 separate procedures, i.e., sender1 and receiver1.
In this case, the only event is frame arrival. In the sender's procedure, a packet is collected
from the network layer and the frame is forwarded to the physical layer.
In the receiver's procedure, the frame is collected from the physical layer and
converted into a packet, then forwarded to the network layer. It is a very simple,
highly unrealistic, imaginary protocol.

2) Simplex Stop-and-Wait Protocol

Assumptions:
1)One direction flow of data from sender to receiver.
2) The receiver has only finite buffer capacity.
3) Communication is error free.
4) The receiver has a finite processing speed.
The advantage of this protocol is that it prevents the sender from flooding the
receiver with data at a faster rate than the receiver can handle.
The sender procedure builds one frame from the packet received from the
network layer and forwards it to the physical layer. The sender then waits for the
acknowledgement frame for the previously transmitted frame before sending the next one.
The receiver procedure handles two frames: one is the frame received from the physical
layer, which is converted to a packet and forwarded to the network layer; the other is
sent back through the physical layer as an acknowledgement to the sender for the already
received frame.

Disadvantages:
1) The sender always waits for the acknowledgement frame before sending the next frame.
2) If the acknowledgement frame gets lost or corrupted, the sender waits forever: the protocol deadlocks.
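The lock-step exchange described above can be sketched as a toy simulation; the frame tuples and the function name are illustrative, not part of any real protocol stack:

```python
def stop_and_wait(packets):
    """Simulate an error-free stop-and-wait exchange. The sender may
    never have more than one unacknowledged frame outstanding."""
    events, outstanding = [], 0
    for seq, data in enumerate(packets):
        assert outstanding == 0          # previous frame was acknowledged
        events.append(("DATA", seq, data))
        outstanding += 1
        events.append(("ACK", seq))      # dummy frame from the receiver
        outstanding -= 1                 # sender may now fetch next packet
    return events

events = stop_and_wait(["p0", "p1"])
# Every DATA frame is immediately followed by its ACK:
assert events == [("DATA", 0, "p0"), ("ACK", 0),
                  ("DATA", 1, "p1"), ("ACK", 1)]
```

The `outstanding` counter makes the flow-control property explicit: the sender blocks after each frame until the acknowledgement arrives.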

3) Simplex Protocol for a Noisy Channel:

Assumptions:
1) Data transfer is only in one direction.
2) Separate sender and receiver.
3) Finite processing capacity and speed at the receiver.
4) Since it is a noisy channel, errors in the data frames or acknowledgement frames are
expected.
5) Every frame has a unique sequence number.
6) After a frame has been transmitted, a timer is started for a finite time. If the
acknowledgement is not received before the timer expires, the frame is retransmitted.
7) Without the timer, when the acknowledgement gets corrupted or the sent data frame gets
damaged, the sender would wait forever before transmitting the next frame.
This protocol is also called Positive Acknowledgement with Retransmission (PAR) or
Automatic Repeat reQuest (ARQ).
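A minimal sketch of the PAR behaviour described above; the channel model (a set listing which transmission attempts get lost) and the function names are hypothetical, for illustration only:

```python
def par_send(packets, drop_events):
    """Positive Acknowledgement with Retransmission over a channel that
    loses the transmissions whose (seq, attempt) pairs are in drop_events.
    A lost frame (or lost ACK) means the timer expires and the frame is
    sent again; sequence numbers let the receiver reject duplicates."""
    delivered, trace = [], []
    expected = 0                               # receiver's next expected seq
    for seq, data in enumerate(packets):
        attempt = 0
        while True:
            trace.append(("send", seq, attempt))
            if (seq, attempt) in drop_events:  # frame or its ACK lost:
                attempt += 1                   # timeout -> retransmit
                continue
            if seq == expected:                # new frame: accept it
                delivered.append(data)
                expected += 1
            break                              # ACK received, next packet
    return delivered, trace

delivered, trace = par_send(["a", "b"], drop_events={(0, 0)})
assert delivered == ["a", "b"]
assert ("send", 0, 1) in trace   # frame 0 was retransmitted once
```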
Sliding Window Protocols:
1. Piggy Backing
2. One Bit Sliding Window Protocol
In the elementary data link protocols, data frames are transmitted in one direction only, whereas in
sliding window protocols, data is transmitted in both directions.
1. Piggybacking:
When a data frame arrives, instead of immediately sending a separate acknowledgement
frame, the receiver waits for its own next outgoing data packet and attaches the acknowledgement
to that outgoing frame.
When the receiver has no data packet to send, it should still send an acknowledgement
to the sender before the sender's timer elapses, to avoid retransmission of data.
2. One-Bit Sliding Window Protocol:
In this protocol, each frame contains a sequence number that is either 0 or 1. In
general, the sequence numbers run from 0 to 2^n - 1, where n is the number of bits in the sequence
number field. This protocol is similar to the elementary data link protocols except that data flows
in both directions. Next frame to send tells which frame the sender is trying to send. Frame
expected tells which frame the receiver is expecting. In both cases, 0 and 1 are the only
possibilities.
The acknowledgement field contains the number of the last frame received without
error. If this number agrees with the sequence number of the frame the sender is trying to send,
the sender knows that frame was received correctly, and the next frame can be fetched from
the network layer into the buffer. If the sequence number disagrees, it must continue to send the same frame.

SLIDING WINDOW PROTOCOLS:

1. Pipelining
2. Round Trip Time
3. Protocol 5(Go Back N)
4. Protocol 6 (Selective Repeat)
1. Pipelining:
In all the previous protocols, the rule is that a sender has to wait for an
acknowledgement before sending another frame. Pipelining is a technique which introduces
parallelism into the protocols and allows the sender to transmit several frames before blocking.
With an appropriate choice of window size, the sender is able to transmit frames continuously for
a time equal to the round trip time.
2. Round Trip Time:
The sum of the time taken for a frame to travel from the sender to the receiver and the
time for its acknowledgement to travel from the receiver back to the sender.
3. Protocol 5 (Go Back N):
A pipelining protocol which deals with errors. When a frame is lost or damaged, the receiver simply
discards all subsequent frames, sending no acknowledgements for the discarded frames: the receiving data
link layer refuses to accept any frame except the next one it must give to the network layer. If the
sender's window fills up before the timer runs out, the pipeline begins to empty. Eventually the sender
times out and retransmits all the unacknowledged frames in order, starting with the lost frame.

Advantages:
i. Minimum buffer management is required.
ii. It requires minimum buffer space.

Disadvantages:
i. More retransmissions in case of a high error rate.
ii. Much of the bandwidth is wasted.
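The go-back-n retransmission pattern can be illustrated with a small simulation. The model is a deliberate simplification (frames are lost only on their first transmission, and the timeout is folded into the loop); the function name and frame numbering are invented for illustration:

```python
def go_back_n(n_frames, window, lost):
    """Go-Back-N: on a lost frame, the receiver discards everything after
    it, so the sender resends from the lost frame onward. `lost` holds
    frame numbers lost on their first transmission only."""
    sent = []
    base = 0
    while base < n_frames:
        # fill the pipeline with up to `window` outstanding frames
        burst = list(range(base, min(base + window, n_frames)))
        sent.extend(burst)
        # receiver accepts in order until the first lost frame
        for f in burst:
            if f in lost and sent.count(f) == 1:
                base = f            # timeout: go back to frame f
                break
        else:
            base = burst[-1] + 1    # whole burst acknowledged
    return sent

sent = go_back_n(n_frames=4, window=3, lost={1})
# frames 0,1,2 go out; 1 is lost, so 1 AND 2 are resent even though 2 arrived
assert sent == [0, 1, 2, 1, 2, 3]
```

The repeated transmission of frame 2 is exactly the wasted bandwidth the disadvantage above refers to.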

4. Protocol 6 (Selective Repeat):
This is another data link layer flow control protocol, which handles
transmission errors with the pipelining technique. When pipelining is used, a bad
frame that is received is discarded, but good frames received after it are buffered. When the
sender times out, only the unacknowledged frame is retransmitted. If that frame arrives correctly,
the receiver can deliver the buffered frames to the network layer in sequence. Selective Repeat is
often combined with negative acknowledgements, which the receiver sends when it detects an error.

Advantages:
i. A lot of bandwidth is saved.
ii. Minimum number of retransmissions.
iii. Highest efficiency is achieved.

Disadvantages:
1. It requires a large amount of buffer space at the receiver end.

MEDIUM ACCESS CONTROL SUBLAYER

In a point-to-point connection there is no competition among users, since only 2 users use the
channel; in broadcast networks, however, when many users share the channel, the problem of
deciding who may use it arises.
DYNAMIC CHANNEL ALLOCATION:
1) STATION MODEL: The model consists of n independent stations (terminals), each of which can
generate frames for transmission. Once a frame has been generated, the station is
blocked and does nothing until the frame has been successfully transmitted.
2) SINGLE CHANNEL ASSUMPTION: A single channel is available for communication; all stations can
transmit on the channel and all can receive from it. As far as the hardware is concerned, all
stations are equivalent.
3) COLLISION ASSUMPTION: If two frames are transmitted simultaneously, they overlap in time and the
resulting signal is garbled (loss of signal); this is called a collision.
4) CONTINUOUS TIME: Frame transmissions can begin at any instant of time (no slot boundaries).
5) SLOTTED TIME: Time is divided into discrete intervals or slots; frame transmission always begins
at the start of a slot. A slot may contain zero or more frames.
6) CARRIER SENSE: Stations can tell if the channel is in use before trying to use it. If the channel is
sensed as busy, no station will attempt to use it until it goes idle.
CSMA PROTOCOLS

CARRIER SENSE MULTIPLE ACCESS PROTOCOLS


Protocols in which stations listen for a carrier and act accordingly are called carrier sense
protocols.
a) 1-persistent CSMA: When a station has data to send, it first listens to the channel to see if
anyone else is transmitting at the moment. If the channel is busy, the station waits until it
becomes idle.
When the station detects an idle channel, it transmits a frame. If a collision
occurs, the station waits a random amount of time and starts again. The
protocol is called 1-persistent because the station transmits with a probability of 1
when it finds the channel idle.
b) p-persistent CSMA: It applies to slotted channels and works as follows: when a station becomes
ready to send a frame, it senses the channel. If the channel is idle, it transmits with probability p;
with probability q = 1 - p it defers until the next slot. This process is repeated until either the
frame has been transmitted or another station has begun transmitting.
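The per-slot decision rule can be sketched as follows; the function name and the action labels are illustrative:

```python
import random

def p_persistent_step(channel_busy, p, rng=random):
    """One slot of p-persistent CSMA: return this station's action."""
    if channel_busy:
        return "wait"                 # keep sensing until the channel idles
    if rng.random() < p:
        return "transmit"             # transmit with probability p
    return "defer"                    # with probability q = 1 - p, defer
                                      # to the next slot and repeat

random.seed(1)
actions = [p_persistent_step(False, p=0.5) for _ in range(1000)]
share = actions.count("transmit") / len(actions)
assert 0.4 < share < 0.6              # roughly half the idle slots transmit
```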

CARRIER SENSE MULTIPLE ACCESS WITH COLLISION DETECTION (CSMA/CD): If two stations sense the
channel to be idle and begin transmitting simultaneously, the above protocols may not work.
In the CSMA/CD conceptual model, after a station has finished transmitting its frame, any other
station having a frame to send may attempt to transfer it. If two or more stations
decide to transmit simultaneously, there will be a collision. If a station detects a collision, it
aborts its transmission, waits a random period of time, and then transmits again.
CSMA/CD can be in one of three states: contention, transmission, or idle.

TRANSMISSION PERIOD:
Stations can send the frames without any collision.
CONTENTION PERIOD:
When more than one station is ready with data, the stations contend among themselves for the channel.
IDLE PERIOD:
The channel is not busy and no station is ready with frames.
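The "waits a random period of time" step is, in classic Ethernet, binary exponential backoff; a sketch, assuming the usual cap of the randomization interval after 10 collisions:

```python
import random

def backoff_slots(attempt, rng=random):
    """Binary exponential backoff as used by classic Ethernet CSMA/CD:
    after the i-th collision, wait a random number of slot times chosen
    uniformly from 0 .. 2**min(i, 10) - 1."""
    k = min(attempt, 10)
    return rng.randrange(2 ** k)

random.seed(0)
waits = [backoff_slots(3) for _ in range(200)]
assert all(0 <= w <= 7 for w in waits)   # after 3 collisions: 0..7 slots
```

Doubling the interval on each collision spreads retries out quickly when many stations collide, at the cost of longer average waits.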
ALOHA:
The basic idea of ALOHA is very simple: users transmit whenever they have data to
send. There will be collisions, and all the colliding frames will be damaged. Systems in which
multiple users share a common channel in a way that can lead to conflicts are known as
contention systems.
In pure ALOHA, frames are transmitted at completely arbitrary times.
When two stations try to occupy the channel at the same time, the two frames collide and both
are lost. Even if only the first bit of a new frame overlaps the last bit of a frame almost
finished, both frames must be retransmitted.
The frame time denotes the amount of time needed to transmit the standard fixed-length frame;
frame generation is modelled by a Poisson distribution.
N = number of new frames generated by all the users in one frame time, 0 < N < 1.
G = number of transmission attempts per frame time (old frames plus new frames).

At low load, N ≈ 0, i.e., few collisions => G ≈ N.
At high load, N > 1, i.e., many collisions => G > N.
P0 = probability of a successful transmission.
S = G * P0,
where S = throughput.
SLOTTED ALOHA: In this protocol, a station is not allowed to transmit whenever it is ready with a
data frame; it has to wait for the beginning of the next slot.
Since the vulnerable period is halved, the probability that no other station uses the same slot is
higher, which doubles the throughput:
the maximum efficiency of pure ALOHA is about 18%, and
the maximum efficiency of slotted ALOHA is about 37%.
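For reference, the standard throughput formulas are S = G·e^(-2G) for pure ALOHA and S = G·e^(-G) for slotted ALOHA; a small sketch checks them numerically:

```python
import math

def pure_aloha(G):
    """Throughput of pure ALOHA at offered load G: S = G * e^(-2G)."""
    return G * math.exp(-2 * G)

def slotted_aloha(G):
    """Throughput of slotted ALOHA at offered load G: S = G * e^(-G)."""
    return G * math.exp(-G)

# Maxima: pure ALOHA peaks at G = 0.5, slotted ALOHA at G = 1.
assert abs(pure_aloha(0.5) - 1 / (2 * math.e)) < 1e-12     # ~0.184
assert abs(slotted_aloha(1.0) - 1 / math.e) < 1e-12        # ~0.368
assert pure_aloha(0.5) < slotted_aloha(1.0)                # slotted wins
```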
Collision-Free Protocols
These protocols resolve the contention for the channel without any collisions at
all. They are also called reservation protocols.
Assume that there are exactly N stations, each with a unique address from 0 to N-1.

Bit-Map Protocol:
In this protocol each contention period consists of N slots. When station j has a frame to send, it
announces this by inserting a 1 bit into slot j. After all N slots have passed, each station has
complete knowledge of which stations wish to transmit.

The stations then begin transmitting their frames in numerical order, which
avoids collisions.
Protocols like this, in which the desire to transmit is broadcast before the actual transmission,
are called reservation protocols.
Performance: The performance of the protocol is calculated under two circumstances:
i. At low load
ii. At high load
The efficiency at low load = d/(d+N)
The efficiency at high load = d/(d+1)
where d = data bits per frame and N = number of stations (the contention overhead is N bits per
frame at low load, but only 1 bit per frame at high load).
Advantages:
1. There are no collisions.
2. All stations have an equal chance of reserving a slot for transmission.
Disadvantages:
1. Low numbered stations have to wait on an average of 1.5N slots.
2. High numbered stations have to wait on an average of 0.5N slots.
3. Efficiency can be further improved.
Binary Countdown:
The bit-map protocol is not suitable for larger networks where thousands of stations exist.
In the binary countdown protocol, every station has a binary address. A station which
wants the channel broadcasts its address as a binary bit string,
starting with the high-order bit; all addresses are assumed to be the same length.
The bits in each address position from different stations are Boolean ORed together
by the channel.
When stations 0010, 0100, 1001 and 1010 compete for the channel, the last two
stations have a high-order 1 bit.
Therefore, the first two stations give up for the current round. The competition is now
between the last two stations; the next bit is 0 in both station addresses, so the
competition continues. In the next bit position only station 1010 has a 1.
Therefore 1001 gives up and the winner is 1010.
Channel efficiency is d/(d+log2N).
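The arbitration just described can be sketched directly; the wired-OR channel is modelled with Python's bitwise OR, and the function name is invented for illustration:

```python
def binary_countdown(addresses, width):
    """Bitwise arbitration: stations broadcast address bits from the high
    end; a station drops out when it sends 0 but sees a 1 (the channel
    ORs all bits together). The highest address always wins."""
    contenders = set(addresses)
    for bit in range(width - 1, -1, -1):
        channel = 0
        for a in contenders:
            channel |= (a >> bit) & 1        # wired-OR of this bit position
        if channel:                          # anyone with a 1 stays in;
            contenders = {a for a in contenders if (a >> bit) & 1}
    (winner,) = contenders                   # exactly one station remains
    return winner

# Stations 0010, 0100, 1001 and 1010 compete; 1010 wins, as in the text.
stations = [0b0010, 0b0100, 0b1001, 0b1010]
assert binary_countdown(stations, width=4) == 0b1010
assert binary_countdown(stations, width=4) == max(stations)
```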
Ethernet MAC sublayer protocol:

Fig: The most common kinds of Ethernet cabling.

In the Ethernet frame preamble, each byte contains the bit pattern 10101010. The Manchester
encoding of this pattern produces a square wave that allows the receiver's clock to synchronize
with the sender's clock. The size of the preamble is 8 bytes.

Fig: Cable topologies. (a) Linear. (b) Spine. (c) Tree. (d) Segmented.


Repeater: It is a physical layer device; it receives and amplifies signals in both directions.
ENCODING: There are two approaches:
1. Manchester encoding
2. Differential Manchester encoding
Manchester encoding:
Each bit period is divided into two equal intervals. A binary 1 is sent by having the
voltage high during the first interval and low in the second; a binary 0 is just the reverse,
first low and then high.
Advantage: simple to implement.
Disadvantage: requires twice as much bandwidth as plain binary encoding.
Differential Manchester encoding: In this method a 1 bit is indicated by the absence of a transition
at the start of the interval; a 0 bit is indicated by the presence of a transition at the start of the
interval. There is always a transition in the middle of the interval for clocking.
Disadvantage: needs more complex equipment.
Advantage: better noise immunity.
Fig: (a) Binary encoding. (b) Manchester encoding. (c) Differential Manchester encoding.
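Both encodings can be sketched as bit-to-signal-level mappings. The polarity conventions follow the description above (some references use the opposite polarity for plain Manchester); the function names are illustrative:

```python
def manchester(bits):
    """Plain Manchester: 1 -> high,low ; 0 -> low,high. Each bit becomes
    two signal halves, hence the doubled bandwidth requirement."""
    out = []
    for b in bits:
        out += [1, 0] if b else [0, 1]
    return out

def differential_manchester(bits, level=0):
    """Differential Manchester: a 0 bit has a transition at the START of
    the interval, a 1 bit does not; every bit also has a mid-interval
    transition for clocking."""
    out = []
    for b in bits:
        if b == 0:
            level ^= 1            # transition at the start of the interval
        out.append(level)
        level ^= 1                # mandatory mid-bit transition
        out.append(level)
    return out

assert manchester([1, 0]) == [1, 0, 0, 1]
assert len(differential_manchester([1, 1, 0, 0])) == 8  # 2 halves per bit
```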

Destination and source addresses:

Ethernet uses 6-byte addresses for the source and destination nodes. In the high-order bit of the
destination address, a 0 indicates an ordinary (unicast) address and a 1 indicates a multicast
address; an address of all 1s is the broadcast address.
Type field:
It tells the receiver what to do with this frame and also specifies which process on the receiver
will handle this frame. Its size is 2 bytes.
Data field:
Data is maximum of 1500 bytes and minimum of 0 bytes.
Padding field:

The minimum requirement for the frame length is 64 bytes, the pad field is used to fill out the
frame to the minimum size.
Checksum field:
The receiver recomputes the checksum over the received data; if it disagrees with this field, an
error has been detected.

Data link layer switching:

Multiple LANs can be connected by devices called bridges, which operate in the data link
layer.
Bridges examine the data link layer addresses to do routing.
Multiple LANs can also be connected by a backbone to handle a total load higher than the capacity
of a single LAN.

Uses of bridges:
Different departments in an organization have different goals, and the organization may be
geographically spread out. To accommodate the load it may be necessary to split a single LAN,
both for reliability and because of the physical distance between the LANs.
Bridges should be transparent, so that any bridge can be removed or inserted without changing
any hardware, software or configuration tables.
Fig:

Flooding:
Every incoming frame for an unknown destination is output on all the LANs to which the
bridge is connected, except the one it arrived on.
Bridges connect multiple LANs.
Each bridge maintains a hash table within it.
The hash table is used to forward an incoming frame toward the LAN of its
destination address.
Bridges operate plug-in and plug-out, without changing any software or hardware.
When a bridge is newly connected, it fills in its hash table by backward learning.
Addition or removal of any host or any LAN has little impact on bridge performance.
The routing procedure for an incoming frame depends on the LAN it arrives on and the LAN of its
destination, as follows:
(i) If destination and source LANs are the same, discard the frame.
(ii) If destination and source LANs are different, forward the frame.
(iii) If the destination LAN is unknown, use flooding.
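The three routing rules plus backward learning can be sketched as a single forwarding decision; the LAN names, frame representation and function name are made up for illustration:

```python
def bridge_forward(table, frame, arrival_lan, all_lans):
    """Transparent-bridge decision for one incoming frame, following the
    three rules above. `table` maps station address -> LAN (the hash
    table filled in by backward learning)."""
    src, dst = frame
    table[src] = arrival_lan                  # backward learning
    if dst not in table:                      # rule (iii): flood
        return [l for l in all_lans if l != arrival_lan]
    if table[dst] == arrival_lan:             # rule (i): same LAN, discard
        return []
    return [table[dst]]                       # rule (ii): forward

table = {}
lans = ["LAN1", "LAN2", "LAN3"]
# A on LAN1 sends to unknown B: flood everywhere except LAN1
assert bridge_forward(table, ("A", "B"), "LAN1", lans) == ["LAN2", "LAN3"]
# B on LAN2 replies to A, whose LAN is now known: forward to LAN1 only
assert bridge_forward(table, ("B", "A"), "LAN2", lans) == ["LAN1"]
# A sends to B again; B is now known to be on LAN2
assert bridge_forward(table, ("A", "B"), "LAN1", lans) == ["LAN2"]
```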
Repeater:
These are analogue devices connected to two cable segments. A signal appearing
on one of them is amplified and put out on the other cable. Repeaters understand only volts.
Hubs:
A hub has a number of input lines that it joins electrically. Frames arriving on any of the lines
are sent out on all the others. If two frames arrive at the same time, they will collide. The
entire hub therefore forms a single collision domain. Hubs do not amplify the incoming signal.
Fig:
Switches:
Switches are similar to bridges. A switch is often used to connect individual computers, switches
only forward frames.
Bridges:
A bridge connects two or more LANs. When a frame arrives, software in the bridge extracts
the destination address from the frame and looks it up in a hash table to see where to send the
frame in its domain.
Router:
When a packet comes into a router, the frame header and trailer are stripped off and the packet
contained in the frame's payload field is passed to the routing software. This software uses the
packet header to choose an output line.
Gateways:
A gateway understands the format and the content of the data and translates messages from one
format to another.

THE NETWORK LAYER


1.Store and forward packet switching :

Fig: The environment of the network layer protocols


A host with a packet to send transmits it to the nearest router, either on its own LAN or over a
point-to-point link to the carrier. The packet is stored there until it has fully arrived, so the
checksum can be verified. Then it is forwarded to the next router along the path until it reaches
the destination host, where it is delivered. This is called store-and-forward packet switching.
2. Implementation of connectionless service:
In connectionless service, the packets are called datagrams. Datagrams are injected into
the subnet independently of each other. No advance setup is required.

Fig: Routing within a datagram subnet (A's table initially and later, C's table, E's table).

Router table: Every router has an internal table telling it where to send packets for each
destination. Each table entry is a pair consisting of a destination and the outgoing line to use for
that destination. In case of a traffic jam or delay in transmission, router tables are updated
accordingly.
In this example, packets 1, 2 and 3 follow the route A, C, E, F; congestion is then observed in the
network and A's table is updated, so packet 4 takes a different route, starting A, B, and so on.
The algorithm that manages the tables and makes the routing decisions is called the
routing algorithm.
3.Implementation of connection oriented service :

Fig: Routing within a virtual circuit subnet (hosts H1 and H3; A's table, C's table, E's table).
For connection oriented service, we need a virtual circuit subnet. When a
connection is established, a route from the source machine to the destination machine is chosen
as part of the connection setup and stored in tables inside the routers. The route is used for all
traffic flowing over the connection. When a connection is released, the virtual circuit is also
terminated. With connection oriented service, each packet carries an identifier telling which
virtual circuit it belongs to.
Example: Host H1 establishes a connection to H2 with identifier 1. The first line in each router's
table is created with id = 1. When host H3 wants a connection to H2, H3 also chooses id = 1, but
router A knows id = 1 is already in use and assigns id = 2 to the H3-to-H2 traffic. The second
entries of the tables in A, C and B are for the H3-to-H2 connection.
4. Comparison of virtual-circuit and datagram subnets:

Issue                 | Datagram subnet                     | Virtual-circuit subnet
1. Circuit setup      | Not needed                          | Required
2. Addressing         | Each packet contains the full       | Each packet contains a short
                      | source and destination addresses    | virtual-circuit number
3. State information  | Routers do not hold state info      | Each virtual circuit requires
                      | about connections                   | router table space per connection
4. Routing            | Each packet is routed               | Route chosen when virtual circuit
                      | independently                       | is set up; all packets follow it
5. Effect of router   | None, except for packets lost       | All virtual circuits that passed
   failure            | during the crash                    | through the failed router are
                      |                                     | terminated
6. Quality of service | Difficult                           | Easy if enough resources can be
                      |                                     | allocated in advance for each
                      |                                     | virtual circuit
7. Congestion control | Difficult                           | Easy if enough resources can be
                      |                                     | allocated in advance for each
                      |                                     | virtual circuit

Routing algorithms :
1. Non adaptive / Static algorithms: They do not base their routing decisions on
measurements or estimates of the current traffic and topology.
2. Adaptive / Dynamic routing: They change their routing decisions to reflect changes in the
topology, and usually the current traffic as well.
Shortest path routing: (A static routing algorithm)
Metrics used: Distance, Bandwidth, average traffic, communication cost, mean queue
length, measured delay etc.
One of these metrics is used in the routing algorithm. Dijkstra (1959) developed this algorithm.
Each node is labelled with its distance from the source node along the best known path. Initially,
no paths are known, so all nodes are labelled with infinity. As the algorithm proceeds and paths
are found, the labels may change, reflecting better paths. A label may be tentative or permanent.
Initially all labels are tentative. When it is discovered that a label represents the shortest possible
path from the source to that node, it is made permanent and never changed thereafter.
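A sketch of Dijkstra's labelling procedure; the example graph and its weights are invented for illustration, not taken from a figure in the text:

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path labelling as described above: all labels start
    tentative (infinity); popping a node from the heap makes its label
    permanent, after which it is never changed."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    permanent = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in permanent:
            continue
        permanent.add(u)                   # label is now permanent
        for v, w in graph[u].items():
            if d + w < dist[v]:            # found a better tentative path
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# Edge weights are whatever metric was chosen, e.g. delay.
g = {"A": {"B": 2, "C": 5},
     "B": {"A": 2, "C": 1, "D": 4},
     "C": {"A": 5, "B": 1, "D": 1},
     "D": {"B": 4, "C": 1}}
assert dijkstra(g, "A") == {"A": 0, "B": 2, "C": 3, "D": 4}
```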

Flooding: (A static routing Algorithm)


In this algorithm, every incoming packet is sent out on every outgoing line except the one
it arrived on. Flooding generates vast numbers of duplicate packets unless some measures are
taken.
Measure 1: To have a hop counter contained in the header of each packet, which is
decremented at each hop, with the packet being discarded when the counter reaches zero.
Measure 2: The source router puts a sequence number in each packet. Each router then
keeps a list per source router telling which sequence numbers originating at that source have
already been seen. If an incoming packet is on the list, it is discarded rather than flooded again.
Selective flooding: The routers do not send every incoming packet out on every line, only
on those lines that are going approximately in the right direction.
Distance Vector Routing:
This algorithm operate by having each router maintain a table giving the best known
distance to each destination and which line to use to get these. These tables are updated by
exchanging information with the neighbors.
In distance vector routing each router maintains a routing table induced by, and
containing one entry for each router in the subnet. This entry contains two parts : the preferred
outgoing line to use for the destination and estimate of the time /distance to that destination.

TO

New
estimated
delay from
J

Line

A
B
C
D
E
F
G
H
I
J
K
L

0
12
25
40
14
23
18
17
21
9
24
29

24
36
18
27
7
20
31
20
0
11
22
33

20
31
19
8
30
19
6
0
14
7
22
9

21
28
36
24
22
40
31
19
22
10
0
9

JA
JI
JH
JK
Delay in 8 Delay in 10 Delay in 12 Delay in 6 sec
sec
sec
sec

8
20
28
20
17
30
18
12
16
0
6
15

A
A
I
H
I
I
H
H
I
K
K

New routing table for


J

Vectors received from js four neighbors


Each router is assumed to know the distance to each of its neighbours. If the metric is hops, the
distance is just one hop. If the metric is queue length, the router simply examines each queue. If
the metric is delay, the router can measure it directly with special ECHO packets that the receiver
just timestamps and sends back as fast as it can.
Example:
Consider how J computes its new route to router G. The neighbours of J are A, I, H and K. J
can reach G through any one of its neighbours:
via A: JA + AG = 8 + 18 = 26 msec
via I: JI + IG = 10 + 31 = 41 msec
via K: JK + KG = 6 + 31 = 37 msec
via H: JH + HG = 12 + 6 = 18 msec
Obviously going through H, i.e. JH then HG, is the best route.
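J's computation is one step of a distance-vector update; a sketch, using the measured delays JA = 8, JI = 10, JH = 12, JK = 6 and the neighbours' reported delays to G (the function name and data layout are illustrative):

```python
def dv_update(neighbor_delay, neighbor_vectors, destinations):
    """Distance-vector step for one router: for each destination, pick
    the minimum over neighbours of (delay to neighbour + neighbour's
    reported estimate to the destination)."""
    table = {}
    for dest in destinations:
        best = min((neighbor_delay[n] + vec[dest], n)
                   for n, vec in neighbor_vectors.items())
        table[dest] = best                 # (new estimated delay, line)
    return table

delay_to = {"A": 8, "I": 10, "H": 12, "K": 6}
vectors = {"A": {"G": 18}, "I": {"G": 31}, "H": {"G": 6}, "K": {"G": 31}}
table = dv_update(delay_to, vectors, destinations=["G"])
assert table["G"] == (18, "H")   # 12 + 6 via H beats 26, 41 and 37
```

Running this for every destination at once yields the "new routing table" column shown above.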
The count-to-infinity problem:

This algorithm reacts rapidly to good news but leisurely to bad news. After 4 exchanges, E
comes to know that it is 4 hops away from router A.

Bad news:

When the link between A and B fails, B looks at C and increments its value to 3; C, looking at B,
also increments its value to 4 and exchanges this with its neighbours. The process continues, the
values growing toward infinity, but no route to A is ever found. This is called the
count-to-infinity problem in DV routing.
LINK-STATE ROUTING
This is simple and can be done in five steps. Each router must:
1. Discover its neighbours and learn their network addresses.
2. Measure the delay or cost to each of its neighbours.
3. Construct a packet telling all it has just learned.
4. Send this packet to all other routers.
5. Compute the shortest path to every other router.

1. Learning about neighbours:
Every router can learn who its neighbours are by sending a HELLO packet on each point-to-point
line. In some topologies, as shown in fig a, several routers share a LAN; this can be modelled by
adding an artificial node N to which A, C and I are connected.
2. Measuring the line cost: Every router must know the line cost, or delay, to each of its
neighbours. This can be achieved by sending an ECHO packet to each neighbour. The other side
returns it immediately, so the round trip time (RTT) can be estimated, and half the RTT is taken
as the line cost between the two routers.
3. Building link state packets: Once the information is collected, it must be exchanged with the
other routers of the subnet. Each router builds a packet containing all this information, called a
link state packet.

The link state packets for the subnet. Each packet identifies its sender and carries a sequence
number (Seq), an age (Age), and a list of neighbours with the cost to each:

A: Seq, Age | B 4 | E 5
B: Seq, Age | A 4 | C 2 | F 6
C: Seq, Age | B 2 | D 3 | E 1
D: Seq, Age | C 3 | F 7
E: Seq, Age | A 5 | C 1 | F 8
F: Seq, Age | B 6 | D 7 | E 8

Each link state packet contains a sequence number to tell the routers
which is the latest information: a higher sequence number is more recent than a lower one. The
age field tells how long the packet should live before being discarded. Here it is treated as a
hop count: if the age is 60, the packet may traverse up to 60 routers.

4. Distributing the link state packets:
The algorithm uses flooding to distribute all the packets to all the routers in the subnet. During
this process the sequence number and age fields help in reducing the congestion and confusion.
5. Computing the new routes:
Once a router has accumulated a full set of link state packets, it can construct the entire subnet
graph. With the help of a shortest-path algorithm it then computes routes to all possible
destinations. The results can be installed in the routing tables and normal operation resumed.
Hierarchical routing:
As networks grow, it is not possible to keep information about every other router in every
routing table, so the routing has to be done hierarchically.
Where hierarchical routing is used, the routers are divided into regions, with each router
knowing all the details about how to route packets to destinations within its own region.
For huge networks, regions are divided into clusters, clusters into zones and zones into
groups, etc.

Fig: Hierarchical routing tables (the full table for router 1A versus its hierarchical table).

From the above subnet, the full table has 17 entries, equal to the total number of routers in the
subnet. In the hierarchical table there are entries for all local routers, but every other region has
been condensed into a single entry, so the number of entries is reduced from 17 to 7.

Broadcast Routing:
Sending packets to all destinations simultaneously is called broadcasting. There are various
methods:
1. The source has a complete list of all destinations, creates a distinct packet for each
destination and forwards them all.
2. Flooding is another method, but it is not well suited to point-to-point links.
3. Multidestination routing: each packet contains either a list of destinations or a bitmap
indicating the desired destinations. The destinations are partitioned into groups,
each packet carrying one group of addresses. When a packet arrives at a router, the router picks
the outgoing lines needed to reach only those destinations. Eventually all the destinations
are reached.
4. Spanning tree: when a router needs to broadcast, a spanning tree is constructed, taking
that router as the root node, which includes every router but contains no loops.
Packets are then forwarded along the tree to all destinations.

5. Reverse path forwarding:

As shown in the figure below, the encircled nodes form the spanning tree, giving the
preferred paths. In general, broadcast packets take routes along the spanning tree. A router
forwards a broadcast packet only if it arrived on the line that is on the router's own preferred
(shortest) path back to the source; if the packet arrives from any line other than the preferred
path, the router simply discards it. In the diagram the root node is router I.

Advantages:
1. Reasonably efficient.
2. Easy to implement.
3. It does not require routers to know about spanning trees.
4. No overhead of a destination list/bitmap.
MULTICAST ROUTING
Sending a message to a group is called multicasting & its routing algorithm is called multicast
routing.
Multicasting requires group management: it is needed to create and destroy groups, and routers
must be informed which of their hosts belong to which groups. Sometimes hosts supply this
information; in other cases routers query their hosts periodically for it.

To do multicast routing, each router computes a spanning tree covering all other routers.
When a process sends a multicast packet to a group, the first router examines its spanning tree
and prunes it, removing all lines that do not lead to hosts that are members of the group.
Multicast packets are forwarded only along the appropriate pruned spanning tree.
Advantages: So far this is the best algorithm for multicast routing.
Disadvantages: It scales poorly to large networks.
CONGESTION CONTROL ALGORITHMS
When too many packets are present in the subnet, performance degrades. This situation is called
congestion.
What should we do in Routers?
1. Within the router, the memory capacity can be increased. So that a bigger queue is
accommodated.
2. The processing speed of the router can be increased.
Overall, congestion control has to do with making sure the subnet is able to carry the offered
traffic.
I. General principles of congestion control:

There are open loop and closed loop solutions. In open loop solutions, the decisions are
made without regard to the current state of the network.
In closed loop solutions, the decisions are based on feedback.

It has 3 parts.
1. Monitor the systems to detect when and where congestion occurs.
2. Pass their information to places where actions can be taken.
3. Adjust system operation to correct the problem.
Metrics used for congestion control
1. % of all packets discarded for lack of buffer space.
2. The average queue lengths.
3. The number of packets that timeout and retransmitted.
4. The average packet delay & standard deviation of the packet delay.
II. Congestion control in virtual-circuit subnets:

The following approaches are used to control congestion:
1. Admission control: once congestion has been signalled, no more virtual circuits are set
up until the problem has gone away.
2. Careful routing of new virtual circuits: allow new virtual circuits, but carefully route all of
them around the problem areas.

3. Negotiation between host and subnet: when a virtual circuit is set up, the host can
negotiate an agreement with the subnet. This agreement normally specifies the volume and
shape of the traffic and the quality of service required. To keep its part of the agreement, the
subnet reserves resources along the path when the circuit is set up.
III. Congestion control in datagram subnets:

Warning bit: when a router is congested, it sets a warning bit in the packet header. The
destination sees this bit and reports it to the sender in its acknowledgement. As long as the
router remains congested, it continues to set the warning bit, and the source adjusts its
transmission rate accordingly. Since every router along the path can set the warning bit, traffic
increases only when no router is in trouble.

Choke Packet: In this procedure, the router sends a choke packet back to the source host.
When the source host gets a choke packet, it is required to reduce the traffic sent to the
specified destination.
Hop-By-Hop Choke Packet: At high speeds or over long distances, sending a choke packet
to the source hosts does not work well because the reaction is slow. An alternative approach
is to have the choke packet take effect at every hop it passes through. The net effect of this
hop-by-hop scheme is to provide quick relief at the point of congestion, at the price of using
up more buffer space upstream.

IV. Load Shedding
When no other method is able to control the congestion and routers are heavily flooded with
packets, they drop packets at random. This is called load shedding. There are, however,
better alternatives for deciding which packets to drop and which to keep.
Senders can mark their packets with priority classes to indicate how important they
are. If they do this, then when packets have to be discarded, routers can drop packets
from the lowest class first, then from the next lowest class, and so on.
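The priority-class dropping policy can be sketched in a few lines of Python (a sketch only; the per-packet `priority` field and the capacity threshold are hypothetical, not part of any real router API):

```python
def shed_load(queue, capacity):
    """Drop packets from the lowest priority class first until the queue
    fits within 'capacity' (higher priority value = more important)."""
    if len(queue) <= capacity:
        return queue
    # Keep the highest-priority packets, discard the rest.
    survivors = sorted(queue, key=lambda p: p["priority"], reverse=True)
    return survivors[:capacity]

priorities = [2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
packets = [{"id": i, "priority": p} for i, p in enumerate(priorities)]
kept = shed_load(packets, 6)
print([p["priority"] for p in kept])
```

With a capacity of 6, the four lowest-priority packets are shed first.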
V. JITTER CONTROL
For audio and video streaming it does not matter much whether packets take 20 msec or
30 msec to be delivered, as long as the transit time is constant. The variation in the
packet arrival time is called jitter.

To control Jitter
Method 1:
When a packet arrives at a router, the router checks to see how much the packet is behind
or ahead of its schedule. This information is stored in the packet and updated at each hop.
In this way, packets ahead of schedule get slowed down and packets behind schedule get
speeded up.
Method 2:
Jitter can be eliminated by buffering at the receiver and then fetching data for display
from the buffer instead of from the network.
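Method 2 can be sketched as a receiver-side playout schedule: each packet is held until a fixed playout point, which absorbs arrival-time variation. This is a sketch only; the 20 msec frame interval and the buffering delay are assumed values, not from the text above.

```python
def playout_times(arrival_times_ms, buffer_delay_ms):
    """Receiver-side jitter buffer sketch: packet i is displayed at its
    nominal send slot plus a fixed delay, smoothing variable arrivals.
    Packets are assumed to be generated every 20 ms."""
    FRAME_MS = 20
    schedule = []
    for i, arrival in enumerate(arrival_times_ms):
        target = i * FRAME_MS + buffer_delay_ms   # fixed playout point
        # A packet arriving after its playout point is late; drop it.
        schedule.append(target if arrival <= target else None)
    return schedule

# Arrivals jittered around the 20 ms interval still play out evenly.
print(playout_times([5, 28, 41, 70, 82], 30))
```

The jittered arrivals all map onto an evenly spaced playout schedule; the price is the fixed extra delay, which is why this works for stored streams but not live telephony.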
Quality of service:
A stream of packets from a source to a destination is called a flow. The needs of each flow can
be characterized by 1. Reliability 2. Delay 3. Jitter 4. Bandwidth. Together these determine the
QoS (Quality of Service) the flow requires.

Flows are classified into 4 categories


1. Constant Bit Rate (CBR)
e.g. Telephony (uniform bandwidth, delay)
2. Real-time variable bit rate (RT-VBR)
e.g. compressed video conferencing.
3. Non-real time variable bit rate (NRT VBR)
e.g. watching a movie on Internet.
4. Available Bit Rate (ABR)
e.g. file transfer.
Technique for achieving good quality of service
1. Over provisioning: Providing so much router capacity, buffer space and bandwidth that
packets fly through easily.
2. Buffering: Flows can be buffered on the receiving side before being delivered. Buffering
does not affect the reliability or bandwidth, and increases the delay, but it smooths out jitter.
3. Traffic shaping: Traffic shaping is about regulating the average rate of data transmission.
This is applied at sending side but not at receiving side. Monitoring a traffic flow is called
traffic policing. Two algorithms are popular in this scheme.
a) The Leaky Bucket Algorithm:

The leaky bucket consists of a finite queue. When a packet arrives, if there is room on
the queue it is appended to the queue. Otherwise it is discarded. At every clock tick, one packet
is transmitted.
The byte counting leaky bucket is implemented as, at each tick, counter is initialized to
n. If the first packet on the queue has fewer bytes than the current value of the counter, it is
transmitted, and the counter is decremented by that number of bytes. Additional packets may also
be sent, as long as the counter is high enough. When the counter drops below the length of the
next packet on the queue, transmission stops until the next tick, at which time the residual byte

count is reset and the flow can continue.


Example: data is produced by a computer at a rate of 25 Mb/sec for a duration of 40 msec,
the leaky bucket outflow rate is 2 Mb/sec, and the bucket capacity is 1 Mb.
The bucket takes 500 msec to send this data onto the network:
25 Mb/sec x 40 msec = 1 Mb
2 Mb/sec x 500 msec = 1 Mb
i.e. 25 Mb/sec x 40 msec = 2 Mb/sec x 500 msec.
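The byte-counting variant described above can be sketched as a small simulation (packet sizes and the per-tick byte budget below are arbitrary illustration values):

```python
from collections import deque

def leaky_bucket(packet_sizes, n_bytes_per_tick, ticks):
    """Byte-counting leaky bucket sketch: at each tick the counter is
    reset to n; queued packets are sent while they fit in the counter."""
    queue = deque(packet_sizes)
    sent_per_tick = []
    for _ in range(ticks):
        counter = n_bytes_per_tick
        sent = []
        while queue and queue[0] <= counter:
            pkt = queue.popleft()
            counter -= pkt          # charge the packet against the budget
            sent.append(pkt)
        sent_per_tick.append(sent)
    return sent_per_tick

# A 1000-byte budget per tick smooths a burst across several ticks.
print(leaky_bucket([600, 300, 500, 1000], 1000, 3))
```

A burst of four packets arriving at once leaves the bucket spread over three ticks, which is exactly the smoothing effect the algorithm is meant to produce.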
b) The token Bucket algorithm:
Leaky bucket has a rigid output pattern; it cannot handle sudden bursts of data arrivals. In
this algorithm, the bucket holds tokens generated by a clock at the rate of one token every
T sec.

The token bucket algorithm can save up tokens, up to the size of the bucket, n.
This property means that bursts of up to n packets can be sent at once, allowing some burstiness
in the output stream and giving faster response to sudden bursts of input.
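A token-per-tick sketch of the algorithm (here one token permits one packet, and the bucket size n caps the burst; the arrival pattern is made up for illustration):

```python
def token_bucket(arrivals, rate, bucket_size):
    """Token bucket sketch: 'rate' tokens are added per tick, capped at
    bucket_size; each packet consumes one token. 'arrivals' lists the
    number of packets arriving at each tick; returns packets sent per
    tick, showing bursts of up to bucket_size going out at once."""
    tokens = bucket_size            # start with a full bucket
    sent = []
    for arriving in arrivals:
        tokens = min(bucket_size, tokens + rate)   # save up idle tokens
        out = min(arriving, tokens)
        tokens -= out
        sent.append(out)
    return sent

# Idle ticks accumulate tokens, so most of a burst of 5 goes out at once.
print(token_bucket([0, 0, 5, 1], 1, 4))
```

Unlike the leaky bucket, idle periods bank tokens, so the burst is served immediately up to the bucket size n = 4 rather than being strictly metered out.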

TRANSPORT LAYER
-The hardware and software within the transport layer that does the work is called the
transport entity.

-The purpose of the transport layer is host-to-host communication.

-The data unit of the transport layer is the TPDU (Transport Protocol Data Unit).
-Transport service primitives:

Primitive        Packet sent              Meaning
i.)Listen        (none)                   Blocks until some process tries to connect.
ii.)Connect      CONNECTION REQUEST       Actively attempts to establish a connection.
iii.)Send        DATA                     Sends information.
iv.)Receive      (none)                   Blocks until a data packet arrives.
v.)Disconnect    DISCONNECTION REQUEST    Releases the connection.

Connection Management:

A state diagram for connection establishment and release for these simple primitives is as
shown; each transition is triggered by some event, either a primitive executed by the local
transport user or an incoming packet.

Berkeley Sockets:
i.)Socket: Creates a new communication end point; it contains the IP address and port number.
ii.)Bind: Attaches a local address to a socket.
iii.)Listen: Announces willingness to accept connections.
iv.)Accept: Blocks the caller until a connection attempt arrives.
v.)Connect: Actively attempts to establish a connection.
vi.)Send: Sends some data over the connection.
vii.)Receive: Receives some data from the connection.
viii.)Close: Releases the connection.
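The primitives above map directly onto Python's standard socket API. A minimal loopback echo sketch (the port is chosen by the OS and the message is arbitrary; a real server would loop over many connections):

```python
import socket
import threading

# socket / bind / listen on the server side
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # Socket
server.bind(("127.0.0.1", 0))                                # Bind (port 0 = any free port)
server.listen(1)                                             # Listen
port = server.getsockname()[1]

def serve():
    conn, _ = server.accept()            # Accept: blocks until a client connects
    conn.sendall(conn.recv(1024))        # Receive, then Send (echo back)
    conn.close()                         # Close

t = threading.Thread(target=serve)
t.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))      # Connect
client.sendall(b"hello")                 # Send
reply = client.recv(1024)                # Receive
client.close()                           # Close
t.join()
server.close()
print(reply)
```

Each commented line corresponds to one primitive from the list above, which is why the Berkeley interface is usually taught alongside them.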
Elements of transport protocols:
i.)Addressing.
ii.)connection establishment.
iii.) connection release.

-Ports are treated as transport service access points in the transport layer.

i.)Addressing:
-When an application process wishes to set up a connection to a remote
application process, it must specify which port it should connect to. These end points are
called transport addresses. In the Internet, these end points are called ports. In general,
they are transport service access points (TSAPs).
--Three way handshake:
a.)client sends a connection request to the destination.
b.)Destination accepts and sends an acknowledgment.
c.)client acknowledges this incoming acknowledgment again.
ii.)Connection establishment:
A transport entity sends a CONNECTION REQUEST TPDU to the destination and
waits for a CONNECTION ACCEPTED reply. This creates the transport connection.

A problem is an old duplicate connection request appearing out of nowhere: when such a
request reaches host 2, an acknowledgment is sent back to host 1, and after verification
host 1 rejects the connection.
iii.)Connection release:
Connection release means terminating a connection. It is of two types:
i.) asymmetric release
ii.) symmetric release

TCP uses symmetric release.

Symmetric release treats the connection as two separate unidirectional connections
and requires each one to be released separately.

In asymmetric release, data can be lost.

Buffering:
Since the network is unreliable, the sender must buffer all the TPDUs (Transport Protocol
Data Units) it sends until they are acknowledged. TPDUs are not all the same size, and
there are different schemes available for organizing the buffers:
a.)Chained fixed-size buffers:
If the buffer size is chosen equal to the largest possible TPDU, space will be wasted
whenever a short TPDU arrives.
b.)Chained variable-size buffers:
We can have variable-sized buffers, so that a short TPDU goes into a small buffer and a
long TPDU is accommodated in a large buffer.
Adv: better memory utilization.
Disadv: more complicated buffer management.
c.)Single large buffer:
The third possibility is to dedicate a single large buffer to each connection, which also
makes good use of memory.
Multiplexing:

Multiplexing is of two types:


a.)Upward multiplexing: four distinct transport connections use the same connection to the
remote host.
b.)Downward multiplexing: If a user needs more bandwidth than one virtual circuit provides,
a solution is to open multiple connections and distribute the traffic among them so that the
effective bandwidth is increased.
Crash Recovery:
All hosts and routers are subject to crashes. Recovery depends on the type of network layer:
i.) With a connectionless network service, if TPDUs are lost the transport layer has to
recover the lost data itself.
ii.) If the network layer is connection oriented and a virtual circuit is lost, the transport
entity should establish a new virtual circuit and ask the destination how much data it has
received so far.
-When a host crashes and recovers, it might send a broadcast message to the other hosts
announcing that it has just crashed and requesting the status of its connections. Each
connection is in one of two states:
1.) TPDU outstanding
2.) No TPDU outstanding
Based on this state information, the sender decides from where to retransmit.
Internet Transport protocols:
i.)UDP (User Datagram Protocol) - connectionless:
The Internet protocol suite supports a connectionless transport protocol, UDP. UDP provides
a way for applications to send encapsulated IP datagrams without having to establish a
connection. UDP transmits segments consisting of an 8-byte header followed by the payload.

a.)Source port: This is needed when a reply must be sent back to the source. In the reply
datagram, the source port becomes the destination port.
b.)UDP length: This includes the 8-byte header and the data.
c.)UDP checksum: This is optional and stored as 0 if not computed. UDP does not do flow
control, error control, or retransmission. UDP is useful in client-server applications such
as the Domain Name System (DNS).
ii.)TCP(Transmission Control Protocol):

It was designed to provide a reliable end-to-end byte stream over an unreliable network. It is
a connection-oriented protocol. A TCP entity accepts user data streams and breaks them into
pieces not exceeding 64 KB.
i.)Source port and Destination port: They identify the local end points of the connection.
ii.)sequence and acknowledgment number: This field is 32 bit long because every byte of
data is numbered in tcp and the acknowledgment number specifies the next byte expected.
iii.)TCP header length: It tells how many 32-bit words are contained in the TCP header.
URG: When URG is set to 1, the urgent pointer is in use; it indicates a byte offset
from the current sequence number at which urgent data are to be found.
ACK: The acknowledgment bit is set to 1 to indicate that the acknowledgment number is valid;
if ACK=0, the segment does not contain an acknowledgment.
PSH: This indicates pushed data; the receiver is requested to deliver the data to the
application upon arrival and not buffer it.
RST: This bit is used to reset a connection that has become confused due to a host crash or
some other reason.
SYN: The SYN bit is used to establish a connection. The connection request has SYN=1, ACK=0,
which indicates that the piggybacked acknowledgment field is not in use; the connection
accepted reply has SYN=1, ACK=1, and does use the piggybacked acknowledgment.
FIN: It is used to release a connection; it specifies that the sender has no more data to
transmit.
Window size: This field tells how many bytes may be sent, starting at the byte acknowledged.
Checksum: This is also provided for extra reliability; it checksums the header and the data.
Options: This field provides a way to add extra facilities not covered by the regular header.

TCP connection Management:


-3-way handshake:
-SYN (seq=x)
-SYN+ACK (seq=y, ack=x+1)
-ACK (ack=y+1)
Connections are established in TCP by means of a 3-way handshake. To establish a connection,
one side, say the server, waits for an incoming connection by executing the LISTEN and
ACCEPT primitives. The client executes CONNECT, specifying the IP address and port to which
it wants to connect and the maximum TCP segment size it is willing to accept.
TCP connection release:
A TCP connection is full duplex. To release a connection, a TCP segment has to set the FIN
bit; when the FIN is acknowledged, that direction is shut down for new data. When both
directions have been shut down, the connection is released.

TCP Transmission Policy:

This is also called as window management.


EX: Suppose a receiver has a 4096-byte buffer. If the sender transmits a 2048-byte segment
that is correctly received, the receiver will acknowledge the segment. However, it now has
only 2048 bytes of buffer space left, so it advertises a window of 2048 starting at the next
byte expected. The sender transmits another 2048 bytes, which are acknowledged, but the
advertised window is now zero. The sender must stop until the application process on the
receiving host has removed some data from the buffer.

Nagles Algorithm in Windows Management:


When data come into the sender one byte at a time, send the first byte and buffer all the
rest until the outstanding byte is acknowledged; then send all the buffered characters in
one TCP segment and start buffering again until they are all acknowledged.
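Nagle's algorithm is on by default for TCP sockets; interactive applications that cannot tolerate the extra buffering delay turn it off with the standard `TCP_NODELAY` socket option. A minimal sketch:

```python
import socket

# Create a TCP socket and disable Nagle's algorithm on it.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)   # 1 = disable Nagle

# Read the option back to confirm the setting took effect.
nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
s.close()
print(nodelay)
```

This trades more small segments on the wire for lower latency, which is why it is reserved for genuinely interactive traffic rather than bulk transfer.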
Silly Window Syndrome in Window Management:

This problem degrades TCP performance. It occurs when data are passed to the sending TCP
entity in large blocks, but the application on the receiving side reads only 1 byte at a
time. Initially the TCP buffer on the receiving side is full, and the sender knows this
(window of 0). Then the application reads one character from the TCP stream, and the
receiver sends a window update telling the sender it may send 1 byte. The sender sends 1
byte, the buffer is now full again, and the receiver acknowledges the 1-byte segment with a
window of 0. This behavior can go on forever.
Solution:
Instead of sending a window update for every single byte, the receiver is forced to wait
until it has a decent amount of buffer space available and only then advertises a window.
Congestion:
Happens when the offered load is close to the network capacity
Several factors can cause congestion
Sudden traffic bursts
Role of memory (buffering capacity)
Slow processors
Low bandwidth links
Congestion control vs. flow control
Congestion control involves the whole network
Flow control involves only a pair of nodes (the sender and receiver of a flow)
But both may require the same response from senders, i.e., reducing the sending rate

Congestion Control:
Virtual circuit subnets
Admission control
Rerouting new virtual circuits around congested areas
Resource reservation during virtual circuit setup
Jitter (delay variation) control for audio/video applications
Can be bounded by computing expected transit time
Prioritizing packets that are behind their schedule the most can help control jitter
Buffering at receiver can help for video on demand or stored audio/video streaming apps,
but not for live apps like Internet telephony or video conferencing.
Datagram subnets
Monitor resource (e.g., output line) utilization and enter a warning state when it exceeds
a threshold
On packet arrival for the resource in warning state, inform traffic source
using:
Warning bit
Choke packets
Hop-by-hop choke packets
Load shedding (random vs. application-aware)
- E.g., Random Early Detection (RED) with TCP.
TCP Congestion Control:
Congestion can be dealt with by employing a principle borrowed from physics: law of
conservation of packets, i.e., refrain from injecting a new packet until an old one leaves the
network
TCP achieves this goal by adjusting transmission rate via dynamic manipulation of sender
window = min (receiver window, congestion window)

a.) A fast network feeding a low-capacity receiver: flow control problem (receiver window)
b.) A slow network feeding a high-capacity receiver: congestion control problem (congestion
window)
Congestion is detected by monitoring retransmission timeouts, because in the wired Internet
losses are mainly due to congestion, and losses lead to timeouts

Internet Congestion Control Algorithm:


When connection established, congestion window = max segment size and threshold =
64KB
Congestion window is increased exponentially in response to acknowledgements until
either a timeout occurs or threshold is reached or receiver window is reached (slow start
phase)
When timeout, threshold is set to half the current congestion window and congestion
window reset to max segment size
After crossing threshold, congestion window increases linearly till a timeout occurs or
receiver window is reached (congestion avoidance phase)
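The slow start and congestion avoidance phases described above can be sketched as a simple simulation (one event per round-trip time; the 1024-byte MSS and the event sequence are assumed illustration values):

```python
def cwnd_evolution(mss, receiver_window, events):
    """Sketch of the algorithm above: below the threshold cwnd doubles
    per RTT (slow start); above it, cwnd grows by one MSS per RTT
    (congestion avoidance); a timeout halves the threshold and resets
    cwnd to one MSS. Returns the cwnd after each event."""
    cwnd, threshold = mss, 64 * 1024
    trace = []
    for ev in events:                   # 'ack' = successful RTT, or 'timeout'
        if ev == "timeout":
            threshold = cwnd // 2       # threshold = half the current window
            cwnd = mss                  # back to one max segment
        elif cwnd < threshold:
            cwnd = min(cwnd * 2, receiver_window)    # slow start
        else:
            cwnd = min(cwnd + mss, receiver_window)  # congestion avoidance
        trace.append(cwnd)
    return trace

print(cwnd_evolution(1024, 64 * 1024, ["ack"] * 6 + ["timeout"] + ["ack"] * 3))
```

The trace shows the exponential climb, the collapse to one segment at the timeout, and the restart of slow start toward the new, halved threshold.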

The Application Layer


DNS - The Domain Name System
The way DNS is used is as follows. To map a name onto an IP address, an application program calls a library
procedure called the resolver, passing it the name as a parameter. The resolver sends a UDP packet to a local DNS
server, which then looks up the name and returns the IP address to the resolver, which then returns it to the caller.
Armed with the IP address, the program can then establish a TCP connection with the destination or send it UDP
packets
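The resolver described above is reached through ordinary library calls. A minimal Python sketch (resolving `localhost`, which succeeds even without network access; real lookups would go through the local DNS server as described):

```python
import socket

# The resolver interface: map a name onto an IP address. For most
# applications this one call hides the entire DNS query machinery.
addr = socket.gethostbyname("localhost")
print(addr)
```

Armed with the returned address, the program can open a TCP connection or send UDP datagrams to the host, exactly as the text describes.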

The DNS Name Space

In principle, domains can be inserted into the tree in two different ways. For example,
cs.yale.edu could equally well be listed under the us country domain as cs.yale.ct.us. In
practice, however, most organizations in the United States are under a generic domain, and
most outside the United States are under the domain of their country. There is no rule
against registering under two top-level domains, but few organizations except multinationals
do it (e.g., sony.com and sony.nl).
Each domain controls how it allocates the domains under it. For example, Japan has
domains ac.jp and co.jp that mirror edu and com. The Netherlands does not make this
distinction and puts all organizations directly under nl. Thus, all three of the following are
university computer science departments:
1. cs.yale.edu (Yale University, in the United States)
2. cs.vu.nl (Vrije Universiteit, in The Netherlands)
3. cs.keio.ac.jp (Keio University, in Japan)

Resource Records
Every domain, whether it is a single host or a top-level domain, can have a set of resource
records associated with it. For a single host, the most common resource record is just its IP
address, but many other kinds of resource records also exist. When a resolver gives a
domain name to DNS, what it gets back are the resource records associated with that name.
Thus, the primary function of DNS is to map domain names onto resource records.
A resource record is a five-tuple. Although they are encoded in binary for efficiency, in most
expositions, resource records are presented as ASCII text, one line per resource record. The
format we will use is as follows:
Domain_name Time_to_live Class Type Value
The Domain_name tells the domain to which this record applies. Normally, many records exist
for each domain and each copy of the database holds information about multiple domains
The Time_to_live field gives an indication of how stable the record is. Information that is
highly stable is assigned a large value, such as 86400 (the number of seconds in 1 day).
Information that is highly volatile is assigned a small value, such as 60 (1 minute). We will
come back to this point later when we have discussed caching.
The third field of every resource record is the Class. For Internet information, it is always IN.
For non-Internet information, other codes can be used, but in practice, these are rarely
seen.
The Type field tells what kind of record this is.
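The five-tuple text format above can be parsed mechanically. A small sketch (the record values are illustrative, not real DNS data; real zone files also allow comments and omitted fields, which this ignores):

```python
def parse_rr(line):
    """Parse the ASCII resource-record format above into its five-tuple
    (Domain_name, Time_to_live, Class, Type, Value). The Value field may
    contain spaces, so only the first four fields are split off."""
    domain, ttl, rclass, rtype, value = line.split(None, 4)
    return domain, int(ttl), rclass, rtype, value

rr = parse_rr("cs.vu.nl 86400 IN A 130.37.16.112")
print(rr)
```

The TTL of 86400 marks this as stable information (one day), matching the caching discussion above.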

Name Servers
In theory at least, a single name server could contain the entire DNS database and respond
to all queries about it. In practice, this server would be so overloaded as to be useless.
Furthermore, if it ever went down, the entire Internet would be crippled

The next step is to start at the top of the name hierarchy by asking one of the root name
servers. These name servers have information about each top-level domain. To contact a root
server, each name server must have information about one or more root name servers. This
information is normally present in a system configuration file that is loaded into the DNS
cache when the DNS server is started. It is simply a list of NS records for the root and the
corresponding A records.
There are 13 root DNS servers, unimaginatively called a.root-servers.net through
m.root-servers.net. Each root server could logically be a single computer. However, since
the entire Internet depends on the root servers, they are powerful and heavily replicated
computers. Most of the servers are present in multiple geographical locations and reached
using anycast routing, in which a packet is delivered to the nearest instance of a
destination address; we described anycast in Chap. 5. The replication improves reliability
and performance.
The root name server is unlikely to know the address of a machine at UW,
and probably does not know the name server for UW either. But it must know
the name server for the edu domain, in which cs.washington.edu is located.
It returns the name and IP address for that part of the answer in step 3. The
local name server then continues its quest. It sends the entire query to the
edu name server (a.edu-servers.net). That name server returns the name
server for UW. This is shown in steps 4 and 5. Closer now, the local name
server sends the query to the UW name server (step 6). If the domain name
being sought was in the English department, the answer would be found, as
the UW zone includes the English department. But the Computer Science
department has chosen to run its own name server. The query returns the
name and IP address of the UW Computer Science name server (step 7).
Finally, the local name server queries the UW Computer Science name server
(step 8). This server is authoritative for the domain cs.washington.edu, so it
must have the answer. It returns the final answer (step 9), which the local
name server forwards as a response to flits.cs.vu.nl (step 10). The name has
been resolved.
The second point is caching. All of the answers, including all the partial answers returned,
are cached. In this way, if another cs.vu.nl host queries for robot.cs.washington.edu the
answer will already be known. Even better, if a host queries for a different host in the
same domain, say galah.cs.washington.edu, the query can be sent directly to the
authoritative name server. However, cached answers are not authoritative, since changes made
at cs.washington.edu will not be propagated to all the caches in the world that may know
about it.
The third issue is the transport protocol that is used for the queries and responses. It is
UDP. DNS messages are sent in UDP packets with a simple format for queries, answers, and
name servers that can be used to continue the resolution. We will not go into the details of
this format. If no response arrives within a short time, the DNS client repeats the query,
trying another server for the domain after a small number of retries. This process is
designed to handle the case of the server being down as well as the query or response packet
getting lost. A 16-bit identifier is included in each query and copied to the response so
that a name server can match answers to the corresponding query, even if multiple queries
are outstanding at the same time.
Electronic mail
Electronic mail, or e-mail, as it is known to its many fans, has been around for over two
decades. Before 1990, it was mostly used in academia. During the 1990s, it became known
to the public at large and grew exponentially to the point where the number of e-mails sent
per day now is vastly more than the number of snail mail (i.e., paper) letters.
E-mail is full of jargon such as BTW (By The Way), ROTFL (Rolling On The Floor Laughing),
and IMHO (In My Humble Opinion). Many people also use little ASCII symbols called
smileys or emoticons in their e-mail. A few of the more interesting ones are reproduced in
Fig. 7-6.

The first e-mail systems simply consisted of file transfer protocols, with the convention that
the first line of each message (i.e., file) contained the recipient's address. As time went on,
the limitations of this approach became more obvious.
Some of the complaints were as follows:
1. Sending a message to a group of people was inconvenient. Managers often need this
facility to send memos to all their subordinates.
2. Messages had no internal structure, making computer processing difficult. For
example, if a forwarded message was included in the body of another message,
extracting the forwarded part from the received message was difficult.
3. The originator (sender) never knew if a message arrived or not.
4. If someone was planning to be away on business for several weeks and wanted all
incoming e-mail to be handled by his secretary, this was not easy to arrange.
5. The user interface was poorly integrated with the transmission system requiring users
first to edit a file, then leave the editor and invoke the file transfer program.
6. It was not possible to create and send messages containing a mixture of text,
drawings, facsimile, and voice.

Architecture and Services


In this section we will provide an overview of what e-mail systems can do and how they are
organized. They normally consist of two subsystems: the user agents, which allow people
to read and send e-mail, and the message transfer agents, which move the messages
from the source to the destination. The user agents are local programs that provide a
command-based, menu-based, or graphical method for interacting with the e-mail system.
The message transfer agents are typically system daemons, that is, processes that run in
the background. Their job is to move e-mail through the system.
Typically, e-mail systems support five basic functions. Let us take a look at them.
Composition refers to the process of creating messages and answers. Although any text
editor can be used for the body of the message, the system itself can provide assistance
with addressing and the numerous header fields attached to each message. For example,
when answering a message, the e-mail system can extract the originator's address from the
incoming e-mail and automatically insert it into the proper place in the reply.
Transfer refers to moving messages from the originator to the recipient. In large part, this
requires establishing a connection to the destination or some intermediate machine,
outputting the message, and releasing the connection. The e-mail system should do this
automatically, without bothering the user.
Reporting has to do with telling the originator what happened to the message. Was it
delivered? Was it rejected? Was it lost? Numerous applications exist in which confirmation of
delivery is important and may even have legal significance (''Well, Your Honor, my e-mail
system is not very reliable, so I guess the electronic subpoena just got lost somewhere'').
Displaying incoming messages is needed so people can read their e-mail. Sometimes
conversion is required or a special viewer must be invoked, for example, if the message is a
PostScript file or digitized voice. Simple conversions and formatting are sometimes
attempted as well.
Disposition is the final step and concerns what the recipient does with the message after
receiving it. Possibilities include throwing it away before reading, throwing it away after
reading, saving it, and so on. It should also be possible to retrieve and reread saved
messages, forward them, or process them in other ways.

The User Agent


E-mail systems have two basic parts, as we have seen: the user agents and the message
transfer agents. In this section we will look at the user agents. A user agent is normally a
program (sometimes called a mail reader) that accepts a variety of commands for
composing, receiving, and replying to messages, as well as for manipulating mailboxes.
Some user agents have a fancy menu- or icon-driven interface that requires a mouse,
whereas others expect 1-character commands from the keyboard. Functionally, these are
the same

RFC 822
Messages consist of a primitive envelope (described in RFC 821), some number of header fields, a blank line, and
then the message body. Each header field (logically) consists of a single line of ASCII text containing the field
name, a colon, and, for most fields, a value. RFC 822 was designed decades ago and does not clearly distinguish the
envelope fields from the header fields.

MIME - The Multipurpose Internet Mail Extensions


In the early days of the ARPANET, e-mail consisted exclusively of text messages written in
English and expressed in ASCII. For this environment, RFC 822 did the job completely: it
specified the headers but left the content entirely up to the users. Nowadays, on the
worldwide Internet, this approach is no longer adequate. The problems include sending and
receiving
1. Messages in languages with accents (e.g., French and German).
2. Messages in non-Latin alphabets (e.g., Hebrew and Russian).
3. Messages in languages without alphabets (e.g., Chinese and Japanese).
4. Messages not containing text at all (e.g., audio or images).
A solution was proposed in RFC 1341 and updated in RFCs 2045-2049. This solution, called
MIME (Multipurpose Internet Mail Extensions), is now widely used.
The Content-Description: header is an ASCII string telling what is in the message. This
header is needed so the recipient will know whether it is worth decoding and reading the
message. If the string says: ''Photo of Barbara's hamster'' and the person getting the
message is not a big hamster fan, the message will probably be discarded rather than
decoded into a high-resolution color photograph.
The Content-Id: header identifies the content. It uses the same format as the standard
Message-Id: header.
The Content-Transfer-Encoding: tells how the body is wrapped for transmission through a
network that may object to most characters other than letters, numbers, and punctuation
marks. Five schemes (plus an escape to new schemes) are provided. The simplest scheme
is just ASCII text. ASCII characters use 7 bits and can be carried directly by the e-mail
protocol provided that no line exceeds 1000 characters.
The next simplest scheme is the same thing, but using 8-bit characters, that is, all values from 0 up to and including
255. This encoding scheme violates the (original) Internet e-mail protocol but is used by some parts of the Internet
that implement some extensions to the original protocol. While declaring the encoding does not make it legal,
having it explicit may at least explain things when something goes wrong. Messages using the 8-bit encoding must
still adhere to the standard maximum line length.

Message Transfer
The message transfer system is concerned with relaying messages from the originator to
the recipient. The simplest way to do this is to establish a transport connection from the
source machine to the destination machine and then just transfer the message. After
examining how this is normally done, we will examine some situations in which this does not
work and what can be done about them.

SMTP - The Simple Mail Transfer Protocol


Within the Internet, e-mail is delivered by having the source machine establish a TCP
connection to port 25 of the destination machine. Listening to this port is an e-mail daemon
that speaks SMTP (Simple Mail Transfer Protocol). This daemon accepts incoming
connections and copies messages from them into the appropriate mailboxes. If a message
cannot be delivered, an error report containing the first part of the undeliverable message is
returned to the sender.
SMTP is a simple ASCII protocol. After establishing the TCP connection to port 25, the
sending machine, operating as the client, waits for the receiving machine, operating as the
server, to talk first. The server starts by sending a line of text giving its identity and telling
whether it is prepared to receive mail
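Because SMTP is a simple ASCII protocol, the client's half of a minimal exchange can be sketched as the sequence of command lines it would send (nothing goes over the network here; the hostnames and addresses are made up for illustration):

```python
def smtp_dialogue(sender, recipient, body_lines):
    """Build the client side of a minimal SMTP session as ASCII lines:
    HELO, MAIL FROM, RCPT TO, DATA, the message body, a lone dot to
    terminate the message, and QUIT."""
    cmds = [
        "HELO client.example.com",     # identify ourselves (hostname assumed)
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
    ]
    cmds += body_lines                 # headers, blank line, then the body
    cmds += [".", "QUIT"]              # lone dot ends the message
    return cmds

for line in smtp_dialogue("alice@abc.com", "bob@xyz.com",
                          ["Subject: hello", "", "Hello Bob"]):
    print(line)
```

In a real session the server answers each command with a numeric reply code, and the client only proceeds on success; that response handling is omitted here.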

POP3
Unfortunately, this solution creates another problem: how does the user get the e-mail from
the ISP's message transfer agent? The solution to this problem is to create another protocol
that allows user transfer agents (on client PCs) to contact the message transfer agent (on
the ISP's machine) and allow e-mail to be copied from the ISP to the user. One such
protocol is POP3 (Post Office Protocol Version 3), which is described in RFC 1939.

POP3 begins when the user starts the mail reader. The mail reader calls up the ISP (unless
there is already a connection) and establishes a TCP connection with the message transfer
agent at port 110. Once the connection has been established, the POP3 protocol goes
through three states in sequence:
1. Authorization.
2. Transactions.
3. Update.
The authorization state deals with having the user log in. The transaction state deals with
the user collecting the e-mails and marking them for deletion from the mailbox. The update
state actually causes the e-mails to be deleted.

Delivery Features
Independently of whether POP3 or IMAP is used, many systems provide hooks for additional
processing of incoming e-mail. An especially valuable feature for many e-mail users is the
ability to set up filters. These are rules that are checked when e-mail comes in or when the
user agent is started. Each rule specifies a condition and an action. For example, a rule
could say that any message received from the boss goes to mailbox number 1, any message
from a select group of friends goes to mailbox number 2, and any message containing
certain objectionable words in the Subject line is discarded without comment.
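A filter rule of this kind is just a condition paired with an action, checked in order. The addresses, mailbox names, and the trigger word below are all hypothetical; real user agents offer much richer conditions.

```python
# A sketch of user-agent filter rules: each rule is a (condition, action)
# pair, tried in order when a message arrives. All names are made up.
def classify(msg, rules):
    for condition, action in rules:
        if condition(msg):
            return action
    return "inbox"                                        # default action

rules = [
    (lambda m: m["from"] == "boss@abcd.com",           "mailbox1"),
    (lambda m: m["from"] in {"carol@x.com"},           "mailbox2"),
    (lambda m: "spamword" in m["subject"].lower(),     "discard"),
]

print(classify({"from": "boss@abcd.com", "subject": "meeting"}, rules))
```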

The World Wide Web


The World Wide Web is an architectural framework for accessing linked documents spread
out over millions of machines all over the Internet. In 10 years, it went from being a way to
distribute high-energy physics data to the application that millions of people think of as
being ''The Internet.'' Its enormous popularity stems from the fact that it has a colorful
graphical interface that is easy for beginners to use.

Architectural Overview
From the users' point of view, the Web consists of a vast, worldwide collection of documents
or Web pages, often just called pages for short. Each page may contain links to other
pages anywhere in the world. Users can follow a link by clicking on it, which then takes
them to the page pointed to. This process can be repeated indefinitely. The idea of having
one page point to another, now called hypertext, was invented by a visionary M.I.T.
professor of electrical engineering, Vannevar Bush, in 1945, long before the Internet was
invented.
Pages are viewed with a program called a browser, of which Internet Explorer and
Netscape Navigator are two popular ones. The browser fetches the page requested,
interprets the text and formatting commands on it, and displays the page, properly
formatted, on the screen. An example is given in Fig. 7-18(a). Like many Web pages, this
one starts with a title, contains some information, and ends with the e-mail address of the
page's maintainer. Strings of text that are links to other pages, called hyperlinks, are often
highlighted, by underlining, displaying them in a special color, or both.

The Client Side

Let us now examine the client side of Fig. 7-19 in more detail. In essence, a browser is a
program that can display a Web page and catch mouse clicks to items on the displayed
page. When an item is selected, the browser follows the hyperlink and fetches the page
selected. Therefore, the embedded hyperlink needs a way to name any other page on the
Web. Pages are named using URLs (Uniform Resource Locators). A typical URL is
http://www.abcd.com/products.html
We will explain URLs later in this chapter. For the moment, it is sufficient to know that a URL
has three parts: the name of the protocol (http), the DNS name of the machine where the
page is located (www.abcd.com), and (usually) the name of the file containing the page
(products.html).
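The three parts can be pulled apart with the standard-library URL parser, using the same example URL:

```python
# Splitting the example URL into the three parts described above.
from urllib.parse import urlparse

parts = urlparse("http://www.abcd.com/products.html")
print(parts.scheme)   # the protocol: http
print(parts.netloc)   # the DNS name of the machine: www.abcd.com
print(parts.path)     # the name of the file: /products.html
```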
When a user clicks on a hyperlink, the browser carries out a series of steps in order to fetch
the page pointed to. Suppose that a user is browsing the Web and finds a link on Internet
telephony that points to ITU's home page, which is http://www.itu.org/home/index.html.
Let us trace the steps that occur when this link is selected.
1. The browser determines the URL (by seeing what was selected).
2. The browser asks DNS for the IP address of www.itu.org.
3. DNS replies with 156.106.192.32.
4. The browser makes a TCP connection to port 80 on 156.106.192.32.
5. It then sends over a request asking for file /home/index.html.
6. The www.itu.org server sends the file /home/index.html.
7. The TCP connection is released.
8. The browser displays all the text in /home/index.html.
9. The browser fetches and displays all images in this file.
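Steps 2 through 7 can be sketched with nothing but the standard socket library. This is a bare outline, not a browser: it does no parsing or display, ignores errors, and the IP address given in the text may well have changed, so the host name is resolved afresh. The commented-out call at the bottom would need network access to run.

```python
# A sketch of steps 2-7: DNS lookup, TCP connection to port 80,
# request, response, connection release.
import socket

def build_request(host, path):
    """Step 5: the ASCII request sent over the connection."""
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n\r\n")

def fetch(host, path, port=80):
    ip = socket.gethostbyname(host)                      # steps 2-3: DNS lookup
    with socket.create_connection((ip, port)) as sock:   # step 4: TCP to port 80
        sock.sendall(build_request(host, path).encode("ascii"))
        chunks = []
        while data := sock.recv(4096):                   # step 6: server sends the file
            chunks.append(data)
    return b"".join(chunks)                              # step 7: connection released

# fetch("www.itu.org", "/home/index.html")  # requires network access
```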

The Server Side


So much for the client side. Now let us take a look at the server side. As we saw above,
when the user types in a URL or clicks on a line of hypertext, the browser parses the URL
and interprets the part between http:// and the next slash as a DNS name to look up.
Armed with the IP address of the server, the browser establishes a TCP connection to port
80 on that server. Then it sends over a command containing the rest of the URL, which is
the name of a file on that server. The server then returns the file for the browser to display.
To a first approximation, a Web server is similar to the server of Fig. 6-6. That server, like a
real Web server, is given the name of a file to look up and return. In both cases, the steps
that the server performs in its main loop are:
1. Accept a TCP connection from a client (a browser).
2. Get the name of the file requested.
3. Get the file (from disk).
4. Return the file to the client.
5. Release the TCP connection.
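One iteration of this five-step loop can be sketched in a few lines of socket code. This is an illustrative toy, not a real Web server: it serves files from a directory of the caller's choosing, omits all error handling and headers beyond the status line, and the commented-out driver uses port 8080 rather than 80 so it can run without privileges.

```python
# A sketch of the five-step main loop of a bare-bones Web server.
import socket

def serve_one(listener, root="."):
    conn, _ = listener.accept()                    # 1. accept a TCP connection
    with conn:
        request = conn.recv(4096).decode("ascii")
        filename = request.split()[1].lstrip("/")  # 2. get the name of the file
        with open(f"{root}/{filename}", "rb") as f:
            body = f.read()                        # 3. get the file from disk
        conn.sendall(b"HTTP/1.1 200 OK\r\n\r\n" + body)  # 4. return it to the client
    # 5. the TCP connection is released when the with-block exits

# listener = socket.create_server(("", 8080))
# while True:
#     serve_one(listener)
```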

URLs (Uniform Resource Locators)


We have repeatedly said that Web pages may contain pointers to other Web pages. Now it is
time to see in a bit more detail how these pointers are implemented. When the Web was
first created, it was immediately apparent that having one page point to another Web page
required mechanisms for naming and locating pages. In particular, three questions had to be
answered before a selected page could be displayed:
1. What is the page called?
2. Where is the page located?
3. How can the page be accessed?
If every page were somehow assigned a unique name, there would not be any ambiguity in identifying pages.
Nevertheless, the problem would not be solved. Consider a parallel between people and pages. In the United States,
almost everyone has a social security number, which is a unique identifier, as no two people are supposed to have
the same one.

Statelessness and Cookies


As we have seen repeatedly, the Web is basically stateless. There is no concept of a login
session. The browser sends a request to a server and gets back a file. Then the server
forgets that it has ever seen that particular client.
At first, when the Web was just used for retrieving publicly available documents, this model was perfectly adequate.
But as the Web started to acquire other functions, it caused problems. For example, some Web sites require clients to
register (and possibly pay money) to use them. This raises the question of how servers can distinguish between
requests from registered users and everyone else. A second example is from e-commerce. If a user wanders around
an electronic store, tossing items into her shopping cart from time to time, how does the server keep track of the
contents of the cart?
To solve these problems, servers use cookies: short strings that a server sends to the browser and that the browser returns unchanged on subsequent requests to the same site.
Cookies can also be used for the server's own benefit. For example, suppose a server wants
to keep track of how many unique visitors it has had and how many pages each one looked
at before leaving the site. When the first request comes in, there will be no accompanying
cookie, so the server sends back a cookie containing Counter = 1. Subsequent clicks on that
site will send the cookie back to the server. Each time the counter is incremented and sent
back to the client. By keeping track of the counters, the server can see how many people
give up after seeing the first page, how many look at two pages, and so on.
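The server-side logic of this counter cookie is very simple. The sketch below uses the Counter=1 format from the text; the function stands in for whatever the server does per request, and is not part of any real server API.

```python
# A sketch of the counter cookie described above: the server issues
# Counter=1 on the first visit and increments it on each later request.
def handle_request(cookie):
    """Return the cookie to send back; cookie is None on the first visit."""
    if cookie is None:
        return "Counter=1"                 # first request: no cookie yet
    count = int(cookie.split("=")[1])
    return f"Counter={count + 1}"          # increment and send back

c = handle_request(None)                   # first page view
c = handle_request(c)                      # second page view
print(c)                                   # → Counter=2
```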
Cookies have also been misused. In theory, cookies are only supposed to go back to the originating site, but hackers
have exploited numerous bugs in the browsers to capture cookies not intended for them. Since some e-commerce
sites put credit card numbers in cookies, the potential for abuse is clear.

HTML (The HyperText Markup Language)


Web pages are currently written in a language called HTML (HyperText Markup
Language). HTML allows users to produce Web pages that include text, graphics, and
pointers to other Web pages. HTML is a markup language, a language for describing how
documents are to be formatted. The term ''markup'' comes from the old days when
copyeditors actually marked up documents to tell the printer (in those days, a human being)
which fonts to use, and so on. Markup languages thus contain explicit commands for
formatting. For example, in HTML, <b> means start boldface mode, and </b> means
leave boldface mode. The advantage of a markup language over one with no explicit markup
is that writing a browser for it is straightforward: the browser simply has to understand the
markup commands. TeX and troff are other well-known examples of markup languages.
By embedding all the markup commands within each HTML file and standardizing them, it
becomes possible for any Web browser to read and reformat any Web page.
A Web page consists of a head and a body, each enclosed by <html> and </html> tags
(formatting commands), although most browsers do not complain if these tags are missing.
As can be seen from Fig. 7-26(a), the head is bracketed by the <head> and </head> tags and
the body is bracketed by the <body> and </body> tags. The strings inside the tags are
called directives. Most HTML tags have this format, that is, they use <something> to mark
the beginning of something and </something> to mark its end. Most browsers have a menu
item VIEW SOURCE or something like that. Selecting this item displays the current page's
HTML source, instead of its formatted output.
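The claim that markup makes browsers straightforward can be illustrated with the standard-library HTML parser, which simply reports each command as it encounters it. The one-line page fed to it below is invented, in the spirit of Fig. 7-26(a).

```python
# A sketch of how a browser sees markup: the parser reports each
# start and end tag in turn, and the program reacts to each one.
from html.parser import HTMLParser

class TagLister(HTMLParser):
    def __init__(self):
        super().__init__()
        self.tags = []
    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)          # e.g. <b>: enter boldface mode
    def handle_endtag(self, tag):
        self.tags.append("/" + tag)    # e.g. </b>: leave boldface mode

p = TagLister()
p.feed("<html><body>Hello, <b>World</b></body></html>")
print(p.tags)   # → ['html', 'body', 'b', '/b', '/body', '/html']
```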

Figure 7-26. (a) The HTML for a sample Web page. (b) The formatted
page

XML and XSL


HTML, with or without forms, does not provide any structure to Web pages. It also mixes
the content with the formatting. As e-commerce and other applications become more
common, there is an increasing need for structuring Web pages and separating the content
from the formatting. For example, a program that searches the Web for the best price for
some book or CD needs to analyze many Web pages looking for the item's title and price.
With Web pages in HTML, it is very difficult for a program to figure out where the title is and
where the price is.
For this reason, the W3C has developed an enhancement to HTML to allow Web pages to be
structured for automated processing. Two new languages have been developed for this
purpose. First, XML (eXtensible Markup Language) describes Web content in a
structured way and second, XSL (eXtensible Style Language) describes the formatting
independently of the content. Both of these are large and complicated topics, so our brief
introduction below just scratches the surface, but it should give an idea of how they work.
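The price-comparison example above can be made concrete: with the content expressed in XML, a program can extract the title and price directly, without guessing at formatting. The document and its element names below are invented for illustration, not taken from any standard schema.

```python
# A sketch of why structured markup helps automated processing:
# the title and price are named elements, easy to locate.
import xml.etree.ElementTree as ET

doc = """
<book>
  <title>Computer Networks</title>
  <price currency="USD">59.95</price>
</book>
"""
root = ET.fromstring(doc)
print(root.findtext("title"))   # → Computer Networks
print(root.findtext("price"))   # → 59.95
```

Doing the same against an HTML page would mean scraping text out of formatting tags, which is exactly the difficulty the paragraph above describes.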
