Concepts of Networking

Protocols

A protocol is the set of rules that governs communication on a network. A protocol defines standardized formats for data packets, techniques for detecting and correcting errors, and so on.

To understand the concept of a communication protocol, let us assume that A and B need to talk to one another. They want to exchange their ideas. But it turns out that both A and B are egoists. They start talking simultaneously, then pause for breath simultaneously, and then start talking again. Now imagine the confusion and chaos. To avoid it, they must follow a set of rules while talking. For instance, A must talk first, and then he/she must give B a chance to put forward his/her ideas, and so on. This common set of rules would be known as the communication protocol for A and B.

Thus, to be used effectively, a network must follow a standardized protocol. Various protocols are used in various types of networks, for example TCP/IP, HTTP, and SMTP. We will discuss some of these protocols in the next section.

4.1 TCP/IP

TCP/IP (Transmission Control Protocol / Internet Protocol) is the main suite of protocols used for the Internet. This set of protocols includes TCP, IP, HTTP, FTP, PPP, and many others.

TCP/IP was designed as an open standard, to be capable of implementation on all types of hardware and software systems.

TCP adds a great deal of functionality to the IP service it is layered over:

· Streams. TCP data is organized as a stream of bytes, much like a file. The datagram nature of the network is concealed. A mechanism (the Urgent Pointer) exists to let out-of-band data be specially flagged.

· Reliable delivery. Sequence numbers are used to coordinate which data has been transmitted and received. TCP will arrange for retransmission if it determines that data has been lost.

· Network adaptation. TCP will dynamically learn the delay characteristics of a network and adjust its operation to maximize throughput without overloading the network.

· Flow control. TCP manages data buffers, and coordinates traffic so its buffers will never overflow. Fast senders will be stopped periodically so that slower receivers can keep up.

Full-duplex Operation

No matter what the particular application, TCP almost always operates full-duplex. The algorithms described below operate in both directions, in an almost completely independent manner. It's sometimes useful to think of a TCP session as two independent byte streams, traveling in opposite directions. No TCP mechanism exists to associate data in the forward and reverse byte streams. Only during connection start and close sequences can TCP exhibit asymmetric behavior (i.e. data transfer in the forward direction but not in the reverse, or vice versa).

Sequence Numbers

TCP uses a 32-bit sequence number that counts bytes in the data stream. Each TCP packet contains the starting sequence number of the data in that packet, and the sequence number (called the acknowledgment number) of the next byte expected from the remote peer, that is, one past the last byte received in order. With this information, a sliding-window protocol is implemented. Forward and reverse sequence numbers are completely independent, and each TCP peer must track both its own sequence numbering and the numbering being used by the remote peer.

TCP uses a number of control flags to manage the connection. Some of these flags pertain to a single packet, such as the URG flag indicating valid data in the Urgent Pointer field, but two flags (SYN and FIN) require reliable delivery, as they mark the beginning and end of the data stream. In order to ensure reliable delivery of these two flags, they are assigned spots in the sequence number space. Each flag occupies a single byte of that space.
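
The sequence-number bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example initial sequence number are assumptions made for this sketch, not part of any real TCP implementation.

```python
SEQ_MODULO = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


def next_seq(seq, payload_len, syn=False, fin=False):
    """Return the sequence number that follows a segment.

    Every data byte consumes one sequence number, and the SYN and FIN
    flags each consume one as well.
    """
    consumed = payload_len + (1 if syn else 0) + (1 if fin else 0)
    return (seq + consumed) % SEQ_MODULO


# Hypothetical initial sequence number, chosen close to the wrap-around point.
isn = 4294967290
after_syn = next_seq(isn, 0, syn=True)   # the SYN alone advances the counter by 1
after_data = next_seq(after_syn, 100)    # 100 data bytes wrap past 2**32
print(after_syn, after_data)
```

Because the counter is taken modulo 2^32, a long-lived connection simply wraps around to zero and keeps counting.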

Window Size and Buffering

Each endpoint of a TCP connection will have a buffer for storing data that is transmitted over the network before the application is ready to read the data. This lets network transfers take place while applications are busy with other processing, improving overall performance.

To avoid overflowing the buffer, TCP sets a Window Size field in each packet it transmits. This field contains the amount of data that may be transmitted into the buffer. If this number falls to zero, the remote TCP can send no more data. It must wait until buffer space becomes available and it receives a packet announcing a non-zero window size.

Sometimes, the buffer space is too small. This happens when the network's bandwidth-delay product exceeds the buffer size. The simplest solution is to increase the buffer, but for extreme cases the protocol itself becomes the bottleneck (because it doesn't support a large enough Window Size). Under these conditions, the network is termed an LFN (Long Fat Network - pronounced elephant).
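
As a rough illustration of the bandwidth-delay product, the sketch below computes how many bytes must be in flight to keep a link busy and compares that with 65,535 bytes, the largest value the 16-bit Window Size field can hold without extensions. The function names and the two example links are assumptions chosen for this sketch.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit Window Size field


def bandwidth_delay_product(bits_per_second, rtt_ms):
    """Bytes that must be in flight to keep the link full.

    Integer arithmetic is used so the result is exact:
    bits/s * ms / 1000 gives bits, and dividing by 8 gives bytes.
    """
    return bits_per_second * rtt_ms // (8 * 1000)


def window_is_bottleneck(bits_per_second, rtt_ms, window=MAX_UNSCALED_WINDOW):
    """True when the window is smaller than the bandwidth-delay product."""
    return bandwidth_delay_product(bits_per_second, rtt_ms) > window


# Hypothetical links: a 10 Mbit/s LAN with a 1 ms round trip, and a
# 100 Mbit/s long-distance path with a 100 ms round trip.
lan = bandwidth_delay_product(10_000_000, 1)
wan = bandwidth_delay_product(100_000_000, 100)
print(lan, window_is_bottleneck(10_000_000, 1))
print(wan, window_is_bottleneck(100_000_000, 100))
```

On the second link the product is far larger than the window, which is the situation the text calls an LFN.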

Round-Trip Time Estimation

When a host transmits a TCP packet to its peer, it must wait a period of time for an acknowledgment. If the reply does not come within the expected period, the packet is assumed to have been lost and the data is retransmitted. The obvious question - How long do we wait? - lacks a simple answer. Over an Ethernet, no more than a few microseconds should be needed for a reply. If the traffic must flow over the wide-area Internet, a second or two might be reasonable during peak utilization times. If we're talking to an instrument package on a satellite hurtling toward Mars, minutes might be required before a reply. There is no one answer to the question - How long?

All modern TCP implementations seek to answer this question by monitoring the normal exchange of data packets and developing an estimate of how long is "too long". This process is called Round-Trip Time (RTT) estimation. RTT estimates are one of the most important performance parameters in a TCP exchange, especially when you consider that on an indefinitely large transfer, all TCP implementations eventually drop packets and retransmit them, no matter how good the quality of the link. If the RTT estimate is too low, packets are retransmitted unnecessarily; if too high, the connection can sit idle while the host waits to timeout.
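
The text does not name a particular estimation algorithm. One widely used scheme is the smoothed estimator specified in RFC 6298, and the sketch below follows its formulas. The class and method names are assumptions made for illustration.

```python
class RttEstimator:
    """Smoothed RTT estimator following the formulas of RFC 6298.

    Using this particular algorithm is an assumption of the sketch;
    the surrounding text only says that an estimate is developed.
    """

    ALPHA = 1 / 8   # gain applied to the smoothed RTT
    BETA = 1 / 4    # gain applied to the RTT variation
    MIN_RTO = 1.0   # lower bound on the timeout, in seconds

    def __init__(self):
        self.srtt = None     # smoothed round-trip time
        self.rttvar = None   # round-trip time variation

    def update(self, sample):
        """Feed one RTT measurement (seconds); return the retransmission timeout."""
        if self.srtt is None:
            # First measurement initializes both values.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # The variation is updated first, using the previous smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return max(self.MIN_RTO, self.srtt + 4 * self.rttvar)


estimator = RttEstimator()
first_rto = estimator.update(0.5)
second_rto = estimator.update(0.5)
print(first_rto, second_rto)
```

With steady samples the variation term shrinks, so the timeout moves closer to the measured round-trip time (subject to the lower bound).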

Internet Protocol

· The IP protocol is responsible for routing the packets created by TCP.

· IP is a connectionless protocol. It is not concerned with whether or not the data actually reaches the recipient, just with moving that data to its designated destination.

· IP adds a header to the datagram created by TCP, resulting in a total of two different headers added to the original source data.

· The IP header includes the following information:

· A checksum covering the header, to provide a means of checking header integrity at each stopover point.

· A hop count or time to live, which determines the maximum number of hops a packet can make.

· Both source and destination addresses are also included in the IP header.

· The IP protocol is used to determine the route a data packet will take to its destination. If the destination IP address is not known by the local gateway, that gateway will pass the packet on to its default gateway. This process will continue until the desired destination is reached.

· Using IP, different datagrams from a single data source may take different routes to their destination, thus causing some packets to arrive out of order. To avoid this randomness, it is also possible to prescribe a set route for the data to take.
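
The header fields listed above can be illustrated with a short Python sketch. It computes the standard Internet checksum (the one's-complement sum of 16-bit words) over a 20-byte IPv4 header and reads back the time to live and the two addresses. The example header bytes and the variable names are assumptions chosen for illustration.

```python
import struct


def internet_checksum(data):
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:   # fold any carry bits back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Example 20-byte IPv4 header with the checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("45000073000040004011" "0000" "c0a80001" "c0a800c7")
checksum = internet_checksum(header)

# Store the checksum in the header, as a sender would before transmission.
filled = header[:10] + struct.pack("!H", checksum) + header[12:]

ttl = filled[8]                                        # time to live (hop count)
source = ".".join(str(b) for b in filled[12:16])       # source address
destination = ".".join(str(b) for b in filled[16:20])  # destination address
print(hex(checksum), ttl, source, destination)
```

A router or receiver can run the same function over the header as received: a result of zero means the header passed the check.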

IP Addressing

· In order to participate in a TCP/IP network, each computer (or host) must have a unique IP address. These addresses may be automatically assigned using DHCP (Dynamic Host Configuration Protocol) or manually entered into the host computer.

· An IP address is made up of a single 32-bit number (meaning it has 32 ones or zeros). This number is usually divided into four 8-bit segments separated by dots. Each 8-bit segment has a value between 0 and 255.
Example: 01111111.00111111.00011111.00000111 = 127.63.31.7.

· Dotted decimal notation refers to writing IP addresses using four decimal numbers (numbers between 0 and 255) separated by dots.

· The first portion of an IP address is usually used to identify the network, while the second portion identifies a particular machine within that network.

· An IP address composed of the network portion of the IP followed by all zeros identifies the network itself. Example: 192.168.0.0 refers to the 192.168 network.

· An IP address composed of the network portion of the IP followed by all 255s is called a broadcast address. Example: A packet addressed to 192.168.255.255 would be delivered to every machine on the 192.168 network.

· The address range 192.168.x.x is reserved for private networks, as are 10.x.x.x and 172.16.x.x through 172.31.x.x.

· The current version of IP addressing is IPv4 (version 4), which allows about 4.3 billion addresses (2^32) and is proving insufficient. A new version, called IPng (IP next generation) or IPv6, is currently being phased in and will provide far more IP addresses (2^128, about 3.4 × 10^38).

· IP addresses are divided into the following classes:

· Class A: Highest-order bit set to zero; IP address range from 1.x.x.x to 126.x.x.x; first octet makes up the network portion of the IP address. There may be 126 class A networks, each having up to 16,777,214 connected hosts. All Class A networks are currently taken.

· Special: The address 127.0.0.1 is reserved for loopback tests.

· Class B: Highest order bits set to 10; IP address range from 128.0.x.x to 191.255.x.x; first two octets make up the network portion of the IP address. There are no Class B addresses free.

· Class C: Highest order bits set to 110; IP address range from 192.0.0.x to 223.255.255.x; first three octets determine network portion of IP address.

· Class D: Highest order bits set to 1110; used exclusively for multicasting (delivery to a group of host computers).

· Class E: Highest order bits set to 1111; reserved for experimental use.

· A new addressing scheme called CIDR (Classless Inter-Domain Routing Scheme) breaks down IP addresses into segments smaller than class C to fit the needs of different companies.
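
The notation and class rules above can be checked with a small Python sketch. The two helper functions are hypothetical names introduced for illustration. The class is read from the first octet, which is equivalent to inspecting the highest-order bits.

```python
def binary_to_dotted(binary):
    """Convert four dot-separated 8-bit groups to dotted decimal notation."""
    return ".".join(str(int(octet, 2)) for octet in binary.split("."))


def address_class(address):
    """Return the class letter implied by the highest-order bits of the first octet."""
    first = int(address.split(".")[0])
    if first < 128:    # leading bit 0
        return "A"
    if first < 192:    # leading bits 10
        return "B"
    if first < 224:    # leading bits 110
        return "C"
    if first < 240:    # leading bits 1110
        return "D"
    return "E"         # leading bits 1111


print(binary_to_dotted("01111111.00111111.00011111.00000111"))
for example in ("10.1.2.3", "172.25.16.1", "192.168.0.1", "224.0.0.1"):
    print(example, address_class(example))
```

Note that the function reports the class by bit pattern only, so special first octets such as 0 and 127 also fall in the class A bit range even though they are not ordinary class A networks.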

Subnet Masks

· A subnet mask is a way of dividing a single network into multiple physical networks by reallocating the hosts portion of the IP addressing scheme. The new IP address scheme has a network portion, a subnet portion, and a host address that is shorter than under the original scheme.

· Subnets help reduce network traffic by keeping local traffic on one side of a router and isolating the information from the LAN on the other side of the router.

· A router must be used to implement a subnet scheme.

· To define a subnet mask, convert the network portion of the IP address into binary notation. Next, select the number of binary digits to use for the subnet mask. Finally, calculate the new dotted decimal ranges available under each subnet.
Example:

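· Key: In the binary rows below, the first three octets are the network portion, the first two bits of the last octet are the subnet portion, and the remaining six bits (x) are the host portion.

· IP Network Address:
172.25.16.x

· Binary IP Network Address:
10101100 00011001 00010000 xxxxxxxx

· Add Subnet Mask (two bits of the last octet are taken for the subnet, giving the mask 255.255.255.192):
10101100 00011001 00010000 11xxxxxx

· Four New Subnets Available:
A. 10101100 00011001 00010000 00xxxxxx
B. 10101100 00011001 00010000 01xxxxxx
C. 10101100 00011001 00010000 10xxxxxx
D. 10101100 00011001 00010000 11xxxxxx

· Dotted Decimals of New Subnets:
A. 172.25.16.0 to 172.25.16.63
B. 172.25.16.64 to 172.25.16.127
C. 172.25.16.128 to 172.25.16.191
D. 172.25.16.192 to 172.25.16.255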

· On a subnet, the first available address in the subnet class is the new network number and the last available address is the new broadcast number.
Example: In subnet A above, 172.25.16.0 is the network number and 172.25.16.63 is the subnet broadcast number.
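
The same four subnets can be reproduced with Python's standard ipaddress module. This is a sketch for checking the arithmetic: writing the example network as 172.25.16.0/24 is an assumption that matches the three-octet network portion used above.

```python
import ipaddress

# The example network, written in prefix notation (24 network bits).
network = ipaddress.ip_network("172.25.16.0/24")

# Borrow two host bits for the subnet portion, giving four subnets.
subnets = list(network.subnets(prefixlen_diff=2))

for label, subnet in zip("ABCD", subnets):
    print(label, subnet.network_address, "to", subnet.broadcast_address,
          "mask", subnet.netmask)
```

Each subnet's first address is its network number and its last address is its broadcast number, as described above.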

4.2 UDP

UDP provides users access to IP-like services. UDP packets are delivered just like IP packets - connection-less datagrams that may be discarded before reaching their targets. UDP is useful when TCP would be too complex, too slow, or just unnecessary.

UDP provides a few functions beyond that of IP:

· Port Numbers. UDP provides 16-bit port numbers to let multiple processes use UDP services on the same host. A UDP address is the combination of a 32-bit IP address and the 16-bit port number.

· Checksumming. Unlike IP, which checksums only its own header, UDP checksums its data, helping to detect corruption. A packet failing the checksum is simply discarded, with no further action taken.
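
To make the two additions concrete, the sketch below packs and unpacks the 8-byte UDP header, which holds the two 16-bit port numbers, a length, and the checksum. The helper names and port values are assumptions for illustration, and the checksum is left at zero (which in UDP over IPv4 means that no checksum was computed) to keep the example short.

```python
import struct

# Source port, destination port, length, checksum: four 16-bit fields.
UDP_HEADER = struct.Struct("!HHHH")


def build_datagram(src_port, dst_port, payload, checksum=0):
    """Prefix the payload with a UDP header; length covers header plus data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_datagram(datagram):
    """Return (source port, destination port, payload)."""
    src_port, dst_port, length, _checksum = UDP_HEADER.unpack_from(datagram)
    return src_port, dst_port, datagram[UDP_HEADER.size:length]


datagram = build_datagram(5000, 53, b"query")
print(len(datagram), parse_datagram(datagram))
```

The port numbers are what let the receiving host hand the payload to the right process.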

4.3 HTTP

Purpose

The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. HTTP has been in use by the World-Wide Web global information initiative since 1990. The first version of HTTP, referred to as HTTP/0.9, was a simple protocol for raw data transfer across the Internet. HTTP/1.0 improved the protocol by allowing messages to be in the format of MIME-like messages, containing meta-information about the data transferred and modifiers on the request/response semantics. However, HTTP/1.0 does not sufficiently take into consideration the effects of hierarchical proxies, caching, the need for persistent connections, or virtual hosts. In addition, the proliferation of incompletely implemented applications calling themselves "HTTP/1.0" has necessitated a protocol version change in order for two communicating applications to determine each other's true capabilities.

This specification defines the protocol referred to as "HTTP/1.1". This protocol includes more stringent requirements than HTTP/1.0 in order to ensure reliable implementation of its features.

Practical information systems require more functionality than simple retrieval, including search, front-end update, and annotation. HTTP allows an open-ended set of methods and headers that indicate the purpose of a request. It builds on the discipline of reference provided by the Uniform Resource Identifier (URI), as a location (URL) or name (URN), for indicating the resource to which a method is to be applied. Messages are passed in a format similar to that used by Internet mail [9] as defined by the Multipurpose Internet Mail Extensions (MIME).

HTTP is also used as a generic protocol for communication between user agents and proxies/gateways to other Internet systems, including those supported by the SMTP, NNTP, FTP, Gopher, and WAIS protocols. In this way, HTTP allows basic hypermedia access to resources available from diverse applications.

Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

An implementation is not compliant if it fails to satisfy one or more of the MUST or REQUIRED level requirements for the protocols that it implements. An implementation that satisfies all the MUST or REQUIRED level and all the SHOULD level requirements for its protocols is said to be "unconditionally compliant"; one that satisfies all the MUST level requirements but not all the SHOULD level requirements for its protocols is said to be "conditionally compliant."

Terminology

This specification uses a number of terms to refer to the roles played by participants in, and objects of, the HTTP communication.

connection

A transport layer virtual circuit established between two programs for the purpose of communication.

message

The basic unit of HTTP communication, consisting of a structured sequence of octets matching the syntax and transmitted via the connection.

request

An HTTP request message.

response

An HTTP response message.

resource

A network data object or service that can be identified by a URI. Resources may be available in multiple representations (e.g. multiple languages, data formats, size, and resolutions) or vary in other ways.

entity

The information transferred as the payload of a request or response. An entity consists of meta-information in the form of entity-header fields and content in the form of an entity-body.

representation

An entity included with a response that is subject to content negotiation. There may exist multiple representations associated with a particular response status.

content negotiation

The mechanism for selecting the appropriate representation when servicing a request. The representation of entities in any response can be negotiated (including error responses).

variant

A resource may have one, or more than one, representation(s) associated with it at any given instant. Each of these representations is termed a `variant'. Use of the term `variant' does not necessarily imply that the resource is subject to content negotiation.

client

A program that establishes connections for the purpose of sending requests.

user agent

The client which initiates a request. These are often browsers, editors, spiders (web-traversing robots), or other end user tools.

server

An application program that accepts connections in order to service requests by sending back responses. Any given program may be capable of being both a client and a server; our use of these terms refers only to the role being performed by the program for a particular connection, rather than to the program's capabilities in general. Likewise, any server may act as an origin server, proxy, gateway, or tunnel, switching behavior based on the nature of each request.

origin server

The server on which a given resource resides or is to be created.

proxy

An intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally or by passing them on, with possible translation, to other servers. A proxy MUST implement both the client and server requirements of this specification. A "transparent proxy" is a proxy that does not modify the request or response beyond what is required for proxy authentication and identification. A "non-transparent proxy" is a proxy that modifies the request or response in order to provide some added service to the user agent, such as group annotation services, media type transformation, protocol reduction, or anonymity filtering. Except where either transparent or non-transparent behavior is explicitly stated, the HTTP proxy requirements apply to both types of proxies.

gateway

A server which acts as an intermediary for some other server. Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource; the requesting client may not be aware that it is communicating with a gateway.

tunnel

An intermediary program which is acting as a blind relay between two connections. Once active, a tunnel is not considered a party to the HTTP communication, though the tunnel may have been initiated by an HTTP request. The tunnel ceases to exist when both ends of the relayed connections are closed.

cache

A program's local store of response messages and the subsystem that controls its message storage, retrieval, and deletion. A cache stores cacheable responses in order to reduce the response time and network bandwidth consumption on future, equivalent requests. Any client or server may include a cache, though a cache cannot be used by a server that is acting as a tunnel.

cacheable

A response is cacheable if a cache is allowed to store a copy of the response message for use in answering subsequent requests. The rules for determining the cacheability of HTTP responses are defined in section 13. Even if a resource is cacheable, there may be additional constraints on whether a cache can use the cached copy for a particular request.

first-hand

A response is first-hand if it comes directly and without unnecessary delay from the origin server, perhaps via one or more proxies. A response is also first-hand if its validity has just been checked directly with the origin server.

explicit expiration time

The time at which the origin server intends that an entity should no longer be returned by a cache without further validation.

heuristic expiration time

An expiration time assigned by a cache when no explicit expiration time is available.

age

The age of a response is the time since it was sent by, or successfully validated with, the origin server.

freshness lifetime

The length of time between the generation of a response and its expiration time.

fresh

A response is fresh if its age has not yet exceeded its freshness lifetime.

stale

A response is stale if its age has passed its freshness lifetime.

semantically transparent

A cache behaves in a "semantically transparent" manner, with respect to a particular response, when its use affects neither the requesting client nor the origin server, except to improve performance. When a cache is semantically transparent, the client receives exactly the same response (except for hop-by-hop headers) that it would have received had its request been handled directly by the origin server.

validator

A protocol element (e.g., an entity tag or a Last-Modified time) that is used to find out whether a cache entry is an equivalent copy of an entity.

upstream/downstream

Upstream and downstream describe the flow of a message: all messages flow from upstream to downstream.

inbound/outbound

Inbound and outbound refer to the request and response paths for messages: "inbound" means "traveling toward the origin server", and "outbound" means "traveling toward the user agent".
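
The caching terms age, freshness lifetime, fresh, and stale defined above fit together as in the following Python sketch. The function names and the timestamps are assumptions for illustration. Treating a response whose age exactly equals its freshness lifetime as still fresh follows the wording "has not yet exceeded" and is likewise an assumption.

```python
def freshness_lifetime(generated_at, expires_at):
    """Length of time between generation of a response and its expiration."""
    return expires_at - generated_at


def classify(age, lifetime):
    """Return "fresh" while the age has not exceeded the lifetime, else "stale"."""
    return "fresh" if age <= lifetime else "stale"


# Hypothetical timestamps in seconds: the response was generated at t=1000
# and the origin server gave an explicit expiration time of t=1600.
lifetime = freshness_lifetime(1000, 1600)
print(lifetime, classify(120, lifetime), classify(900, lifetime))
```

A cache holding a stale response would need to validate it with the origin server before returning it.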

Overall Operation

The HTTP protocol is a request/response protocol. A client sends a request to the server in the form of a request method, URI, and protocol version, followed by a MIME-like message containing request modifiers, client information, and possible body content over a connection with a server. The server responds with a status line, including the message's protocol version and a success or error code, followed by a MIME-like message containing server information, entity meta-information, and possible entity-body content.

Most HTTP communication is initiated by a user agent and consists of a request to be applied to a resource on some origin server. In the simplest case, this may be accomplished via a single connection (v) between the user agent (UA) and the origin server (O).

request chain ------------------------>

UA -------------------v------------------- O

<----------------------- response chain

A more complicated situation occurs when one or more intermediaries are present in the request/response chain. There are three common forms of intermediary: proxy, gateway, and tunnel. A proxy is a forwarding agent, receiving requests for a URI in its absolute form, rewriting all or part of the message, and forwarding the reformatted request toward the server identified by the URI. A gateway is a receiving agent, acting as a layer above some other server(s) and, if necessary, translating the requests to the underlying server's protocol. A tunnel acts as a relay point between two connections without changing the messages; tunnels are used when the communication needs to pass through an intermediary (such as a firewall) even when the intermediary cannot understand the contents of the messages.

request chain -------------------------------------->

UA -----v----- A -----v----- B -----v----- C -----v----- O

<------------------------------------- response chain

The figure above shows three intermediaries (A, B, and C) between the user agent and origin server. A request or response message that travels the whole chain will pass through four separate connections. This distinction is important because some HTTP communication options may apply only to the connection with the nearest, non-tunnel neighbor, only to the end-points of the chain, or to all connections along the chain. Although the diagram is linear, each participant may be engaged in multiple, simultaneous communications. For example, B may be receiving requests from many clients other than A, and/or forwarding requests to servers other than C, at the same time that it is handling A's request.

Any party to the communication which is not acting as a tunnel may employ an internal cache for handling requests. The effect of a cache is that the request/response chain is shortened if one of the participants along the chain has a cached response applicable to that request. The following illustrates the resulting chain if B has a cached copy of an earlier response from O (via C) for a request, which has not been cached by UA or A.

request chain ---------->

UA -----v----- A -----v----- B - - - - - - C - - - - - - O

<--------- response chain

Not all responses are usefully cacheable, and some requests may contain modifiers, which place special requirements on cache behavior.

In fact, there are a wide variety of architectures and configurations of caches and proxies currently being experimented with or deployed across the World Wide Web. These systems include national hierarchies of proxy caches to save transoceanic bandwidth, systems that broadcast or multicast cache entries, organizations that distribute subsets of cached data via CD-ROM, and so on. HTTP systems are used in corporate intranets over high-bandwidth links, and for access via PDAs with low-power radio links and intermittent connectivity. The goal of HTTP/1.1 is to support the wide diversity of configurations already deployed while introducing protocol constructs that meet the needs of those who build web applications that require high reliability and, failing that, at least reliable indications of failure.

HTTP communication usually takes place over TCP/IP connections. The default port is TCP 80, but other ports can be used. This does not preclude HTTP from being implemented on top of any other protocol on the Internet, or on other networks. HTTP only presumes a reliable transport; any protocol that provides such guarantees can be used; the mapping of the HTTP/1.1 request and response structures onto the transport data units of the protocol in question is outside the scope of this specification.

In HTTP/1.0, most implementations used a new connection for each request/response exchange. In HTTP/1.1, a connection may be used for one or more request/response exchanges, although connections may be closed for a variety of reasons.
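
The request and response formats described in this section can be sketched without opening a network connection. The two helper functions, the host name, and the sample response below are assumptions chosen for illustration.

```python
def build_request(method, uri, host, version="HTTP/1.1"):
    """Request line (method, URI, protocol version) followed by header fields.

    Lines end with CRLF, and an empty line marks the end of the headers.
    """
    lines = [
        f"{method} {uri} {version}",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Split the first line of a response into version, status code, and reason."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_request("GET", "/index.html", "www.example.com")
sample_response = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
print(request)
print(parse_status_line(sample_response))
```

Sending the request bytes over a TCP connection to port 80 of a server, and reading back the reply, would complete one request/response exchange.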

4.4 SMTP

Simple Mail Transfer Protocol (SMTP) is the Internet's standard host-to-host mail transport protocol and traditionally operates over TCP, port 25. In other words, a UNIX user can type telnet hostname 25 and connect with an SMTP server, if one is present.

SMTP uses a style of asymmetric request-response protocol popular in the early 1980s, and still seen occasionally, most often in mail protocols. The protocol is designed to be equally useful to either a computer or a human, though not too forgiving of the human. From the server's viewpoint, a clear set of commands is provided and well documented in the RFC. For the human, all the commands are clearly terminated by newlines and a HELP command lists all of them. From the sender's viewpoint, the command replies always take the form of text lines, each starting with a three-digit code identifying the result of the operation, followed by a continuation character indicating whether more lines follow, and then arbitrary text designed to be informative to a human.
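
The reply format just described can be parsed with a few lines of Python. In SMTP the continuation character is a hyphen when more lines follow and a space on the last line. The function name and the sample reply (including its host name) are assumptions for illustration.

```python
def parse_reply(lines):
    """Collect one SMTP reply: a three-digit code plus its text lines.

    A hyphen after the code means more lines follow; any other
    character (normally a space) marks the last line of the reply.
    """
    code = None
    text = []
    for line in lines:
        code = int(line[:3])
        text.append(line[4:])
        if line[3:4] != "-":
            break
    return code, text


# Hypothetical multi-line reply, as a server might send it.
reply = ["250-mail.example.com", "250-SIZE 10240000", "250 HELP"]
print(parse_reply(reply))
```

A program needs only the numeric code to decide what to do next, while the text is there for a human reading the session.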

If mail delivery fails, sendmail (the most important SMTP implementation) will queue mail messages and retry delivery later. However, a backoff algorithm is used, and no mechanism exists to poll all Internet hosts for mail, nor does SMTP provide any mailbox facility, or any special features beyond mail transport. For these reasons, SMTP isn't a good choice for hosts situated behind highly unpredictable lines (like modems). A better-connected host can be designated as a DNS mail exchanger, which then arranges for a relay scheme. Currently, there are two main configurations that can be used. One is to configure POP mailboxes and a POP server on the exchange host, and let all users use POP-enabled mail clients. The other possibility is to arrange for a periodic SMTP mail transfer from the exchange host to another, local SMTP exchange host which has been queuing all the outbound mail. Of course, since this solution does not allow full-time Internet access, it is generally not preferred.

Extended SMTP, or ESMTP, is by definition extensible, allowing new service extensions to be defined and registered with IANA. Probably the most important extension currently available is Delivery Status Notification (DSN).

4.5 FTP

Terminology

ASCII

The ASCII character set is as defined in the ARPA-Internet Protocol Handbook. In FTP, ASCII characters are defined to be the lower half of an eight-bit code set (i.e., the most significant bit is zero).

access controls

Access controls define users' access privileges to the use of a system, and to the files in that system. Access controls are necessary to prevent unauthorized or accidental use of files. It is the prerogative of a server-FTP process to invoke access controls.

byte size

There are two byte sizes of interest in FTP: the logical byte size of the file, and the transfer byte size used for the transmission of the data. The transfer byte size is always 8 bits. The transfer byte size is not necessarily the byte size in which data is to be stored in a system, nor the logical byte size for interpretation of the structure of the data.

control connection

The communication path between the USER-PI and SERVER-PI for the exchange of commands and replies. This connection follows the Telnet Protocol.

data connection

A full duplex connection over which data is transferred, in a specified mode and type. The data transferred may be a part of a file, an entire file or a number of files. The path may be between a server-DTP and a user-DTP, or between two server-DTPs.

data port

The passive data transfer process "listens" on the data port for a connection from the active transfer process in order to open the data connection.

DTP

The data transfer process establishes and manages the data connection. The DTP can be passive or active.

End-of-Line

The end-of-line sequence defines the separation of printing lines. The sequence is Carriage Return, followed by Line Feed.

EOF

The end-of-file condition that defines the end of a file being transferred.

EOR

The end-of-record condition that defines the end of a record being transferred.

error recovery

A procedure that allows a user to recover from certain errors such as failure of either host system or transfer process. In FTP, error recovery may involve restarting a file transfer at a given checkpoint.

FTP commands

A set of commands that comprise the control information flowing from the user-FTP to the server-FTP process.

file

An ordered set of computer data (including programs), of arbitrary length, uniquely identified by a pathname.

mode

The mode in which data is to be transferred via the data connection. The mode defines the data format during transfer including EOR and EOF. The transfer modes defined in FTP are described in the Section on Transmission Modes.

NVT

The Network Virtual Terminal as defined in the Telnet Protocol.

NVFS

The Network Virtual File System. A concept which defines a standard network file system with standard commands and pathname conventions.

page

A file may be structured as a set of independent parts called pages. FTP supports the transmission of discontinuous files as independent indexed pages.

pathname

Pathname is defined to be the character string which must be input to a file system by a user in order to identify a file. Pathname normally contains device and/or directory names, and file name specification. FTP does not yet specify a standard pathname convention. Each user must follow the file naming conventions of the file systems involved in the transfer.

PI

The protocol interpreter. The user and server sides of the protocol have distinct roles implemented in a user-PI and a server-PI.

record

A sequential file may be structured as a number of contiguous parts called records. Record structures are supported by FTP but a file need not have record structure.

reply

A reply is an acknowledgment (positive or negative) sent from server to user via the control connection in response to FTP commands. The general form of a reply is a completion code (including error codes) followed by a text string. The codes are for use by programs and the text is usually intended for human users.

server-DTP

The data transfer process, in its normal "active" state, establishes the data connection with the "listening" data port. It sets up parameters for transfer and storage, and transfers data on command from its PI. The DTP can be placed in a "passive" state to listen for, rather than initiate a connection on the data port.

server-FTP process

A process or set of processes which perform the function of file transfer in cooperation with a user-FTP process and, possibly, another server. The functions consist of a protocol interpreter (PI) and a data transfer process (DTP).

server-PI

The server protocol interpreter "listens" on Port L for a connection from a user-PI and establishes a control communication connection. It receives standard FTP commands from the user-PI, sends replies, and governs the server-DTP.

type

The data representation type used for data transfer and storage. Type implies certain transformations between the time of data storage and data transfer. The representation types defined in FTP are described in the Section on Establishing Data Connections.

user

A person or a process on behalf of a person wishing to obtain file transfer service. The human user may interact directly with a server-FTP process, but use of a user-FTP process is preferred since the protocol design is weighted towards automata.

user-DTP

The data transfer process "listens" on the data port for a connection from a server-FTP process. If two servers are transferring data between them, the user-DTP is inactive.

user-FTP process

A set of functions including a protocol interpreter, a data transfer process and a user interface which together perform the function of file transfer in cooperation with one or more server-FTP processes. The user interface allows a local language to be used in the command-reply dialogue with the user.

user-PI

The user protocol interpreter initiates the control connection from its port U to the server-FTP process, initiates FTP commands, and governs the user-DTP if that process is part of the file transfer.
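As a small illustration of the "reply" entry above (a completion code followed by a text string), the following Python sketch splits one reply line into the part meant for programs and the part meant for people. It is only a sketch under stated assumptions: the three-digit code format and the meaning of the first digit come from the FTP specification rather than from this text, multi-line replies are not handled, and the names FtpReply, parse_reply and is_positive are made up for this example.

```python
import re
from typing import NamedTuple


class FtpReply(NamedTuple):
    code: int  # three-digit completion code, meant for programs
    text: str  # free-form text, usually meant for human users


# Three digits, one space, then the text (single-line replies only).
_REPLY_RE = re.compile(r"^(\d{3}) (.*)$")


def parse_reply(line: str) -> FtpReply:
    """Split one single-line reply into its numeric code and its text."""
    match = _REPLY_RE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a single-line FTP reply: {line!r}")
    return FtpReply(int(match.group(1)), match.group(2))


def is_positive(reply: FtpReply) -> bool:
    """Assumption from the FTP spec: first digit 1-3 is positive, 4-5 negative."""
    return 100 <= reply.code < 400


reply = parse_reply("226 Closing data connection.\r\n")
print(reply.code, reply.text, is_positive(reply))
```

A program would branch on reply.code alone and show reply.text to the user unchanged, which matches the split of roles described in the definition.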

The FTP Model

With the above definitions in mind, the following model (shown in Fig. 4.1) may be diagrammed for an FTP service.

                                      -------------
                                      |/---------\|
                                      ||   User  ||    --------
                                      ||Interface|<--->| User |
                                      |\----^----/|    --------
            ----------                |     |     |
            |/------\|  FTP Commands  |/----V----\|
            ||Server|<---------------->|   User  ||
            ||  PI  ||   FTP Replies  ||    PI   ||
            |\--^---/|                |\----^----/|
            |   |    |                |     |     |
--------    |/--V---\|      Data      |/----V----\|    --------
| File |<--->|Server|<---------------->|  User   |<--->| File |
|System|    || DTP  ||   Connection   ||   DTP   ||    |System|
--------    |\------/|                |\---------/|    --------
            ----------                -------------

            Server-FTP                   USER-FTP

Fig. 4.1 Model for FTP Use

NOTES: 1. The data connection may be used in either direction.

2. The data connection need not exist all of the time.

In the model described in Fig. 4.1, the user-protocol interpreter initiates the control connection. The control connection follows the Telnet protocol. At the initiation of the user, standard FTP commands are generated by the user-PI and transmitted to the server process via the control connection. (The user may establish a direct control connection to the server-FTP, from a TAC terminal for example, and generate standard FTP commands independently, bypassing the user-FTP process.) Standard replies are sent from the server-PI to the user-PI over the control connection in response to the commands. The FTP commands specify the parameters for the data connection (data port, transfer mode, representation type, and structure) and the nature of file system operation (store, retrieve, append, delete, etc.). The user-DTP or its designate should "listen" on the specified data port, and the server should initiate the data connection and data transfer in accordance with the specified parameters. It should be noted that the data port need not be in the same host that initiates the FTP commands via the control connection, but the user or the user-FTP process must ensure a "listen" on the specified data port. It should also be noted that the data connection may be used for simultaneous sending and receiving.

In another situation a user might wish to transfer files between two hosts, neither of which is a local host. The user sets up control connections to the two servers and then arranges for a data connection between them. In this manner, control information is passed to the user-PI but data is transferred between the server data transfer processes. Following is a model of this server-server interaction.

        Control    ------------   Control
        ---------->| User-FTP |<-----------
        |          | User-PI  |           |
        |          |   "C"    |           |
        V          ------------           V
--------------                        --------------
| Server-FTP |   Data Connection      | Server-FTP |
|    "A"     |<---------------------->|    "B"     |
-------------- Port (A)      Port (B) --------------

Fig. 4.2 Server-Server Interaction

The protocol requires that the control connections be open while data transfer is in progress. It is the responsibility of the user to request the closing of the control connections when finished using the FTP service, while it is the server who takes the action. The server may abort data transfer if the control connections are closed without command.

The Relationship between FTP and Telnet

The FTP uses the Telnet protocol on the control connection. This can be achieved in two ways: first, the user-PI or the server-PI may implement the rules of the Telnet Protocol directly in their own procedures; or, second, the user-PI or the server-PI may make use of the existing Telnet module in the system.

Ease of implementation, sharing code, and modular programming argue for the second approach. Efficiency and independence argue for the first approach. In practice, FTP relies on very little of the Telnet Protocol, so the first approach does not necessarily involve a large amount of code.

4.6 CSMA/CD

The acronym CSMA/CD signifies Carrier Sense Multiple Access with Collision Detection and describes how the Ethernet protocol regulates communication among nodes. While the term may seem intimidating, if we break it apart into its component concepts we will see that it describes rules very similar to those people use in polite conversation. To help illustrate the operation of Ethernet, we will use an analogy of a dinner table conversation. Let’s represent our Ethernet segment as a dinner table, and let several people engaged in polite conversation at the table represent the nodes. The term Multiple Access covers what we already discussed above. When one Ethernet station transmits, all the stations on the medium hear the transmission, just as when one person at the table talks, everyone present is able to hear him or her.

Now let's imagine that you are at the table and you have something you would like to say. At the moment, however, I am talking. Since this is a polite conversation, rather than immediately speak up and interrupt, you would wait until I finished talking before making your statement. This is the same concept described in the Ethernet protocol as Carrier Sense. Before a station transmits, it "listens" to the medium to determine if another station is transmitting. If the medium is quiet, the station recognizes that this is an appropriate time to transmit.

Carrier Sense Multiple Access gives us a good start in regulating our conversation, but there is one scenario we still need to address. Let’s go back to our dinner table analogy and imagine that there is a momentary lull in the conversation. You and I both have something we would like to add and we both "sense the carrier" based on the silence, so we begin speaking at approximately the same time. In Ethernet terminology, a collision occurs when we both speak at once. In our conversation, we can handle this situation gracefully. We will both hear the other speak at the same time we are speaking. Then we can stop to give the other person a chance to go on. Ethernet nodes also listen to the medium while they transmit to ensure that they are the only stations transmitting at that time. If the stations hear their own transmission returning in a garbled form, as would happen if some other station had begun to transmit its own message at the same time, then they know that a collision occurred. A single Ethernet segment is sometimes called a collision domain because no two stations on the segment can transmit at the same time without causing a collision. When stations detect a collision, they cease transmission, wait a random amount of time, and attempt to transmit when they again detect silence on the medium.
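The dinner table rules can be written down as a toy model in Python. This is only a sketch: time is assumed to be cut into discrete slots, which is a simplification for illustration, and the function names and station names are invented here.

```python
from typing import List


def stations_that_transmit(ready: List[str], medium_busy: bool) -> List[str]:
    """Carrier sense: stations with data start sending only if the medium is quiet."""
    return [] if medium_busy else list(ready)


def slot_outcome(transmitting: List[str]) -> str:
    """Collision detection: classify one slot by how many stations sent in it."""
    if not transmitting:
        return "idle"
    if len(transmitting) == 1:
        return f"{transmitting[0]} transmits successfully"
    return "collision between " + " and ".join(sorted(transmitting))


# I am talking, so you politely wait.
print(stations_that_transmit(["you"], medium_busy=True))  # []

# A lull in the conversation: we both start speaking at once.
speakers = stations_that_transmit(["you", "me"], medium_busy=False)
print(slot_outcome(speakers))  # collision between me and you
```

Multiple access shows up in the fact that every station sees the same medium_busy value, and a collision is simply a slot in which more than one station decided the medium was free.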

The random pause and retry is an important part of the protocol. If two stations collide when transmitting once, then both will need to transmit again. At the next appropriate chance to transmit, both stations involved with the previous collision will have data ready to transmit. If they transmitted again at the first opportunity, they would most likely collide again and again indefinitely. Instead, the random delay makes it unlikely that any two stations will collide more than a few times in a row.
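The random delay can be sketched in Python as follows. The text above only says that stations wait "a random amount of time"; the sketch assumes the truncated binary exponential backoff scheme used by classic Ethernet (IEEE 802.3), in which the range of possible delays doubles after each collision. The constants and function names are assumptions made for this illustration.

```python
import random

MAX_BACKOFF_EXPONENT = 10  # assumed: the delay range stops doubling after 10 collisions
MAX_ATTEMPTS = 16          # assumed: the frame is given up after 16 failed attempts


def backoff_slots(collisions: int, rng: random.Random) -> int:
    """Pick a random wait, in slot times, after `collisions` collisions in a row."""
    if collisions < 1:
        raise ValueError("backoff only applies after at least one collision")
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("too many collisions, frame is dropped")
    exponent = min(collisions, MAX_BACKOFF_EXPONENT)
    return rng.randrange(2 ** exponent)  # uniform over 0 .. 2**exponent - 1


def repeat_collision_rate(collisions: int, trials: int, seed: int = 0) -> float:
    """Estimate how often two stations pick the same delay and so collide again."""
    rng = random.Random(seed)
    same = sum(
        backoff_slots(collisions, rng) == backoff_slots(collisions, rng)
        for _ in range(trials)
    )
    return same / trials


for n in (1, 2, 3, 4):
    print(n, "collision(s): chance of colliding again is about",
          round(repeat_collision_rate(n, trials=20000), 3))
```

With two stations, the chance of choosing the same delay is one in two after the first collision and halves with each further collision, which is why a long run of repeated collisions is unlikely.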

4.7 Other Network Protocols

· SLIP (Serial Line Internet Protocol) is a protocol used to transmit over serial lines, such as with a modem over phone lines. SLIP is a simple protocol with low overhead, but its lack of some desired features (such as password encryption and error checking) has caused it to be largely replaced by PPP.

· PPP (Point-to-Point Protocol) is commonly used to establish remote connections to Internet service providers or LANs. PPP can run over various types of connections, provides error detection, supports automatic TCP/IP configuration, and provides a number of other benefits over SLIP, although it does demand higher overhead.

· PPTP (Point-to-Point Tunneling Protocol) is a Microsoft-created protocol that uses PPP to create a Virtual Private Network (VPN). To use PPTP, a user establishes a PPP connection to the desired server, then launches a PPTP connection. In effect, the user is then connected to the server via PPP, but is able to transfer information securely from within the PPP connection thanks to the PPTP session. PPTP is currently not standard and not supported by all operating systems.

· POP3 (Post Office Protocol 3) is used to retrieve mail from a mail server. When used by an email client, POP3 downloads all of the client's messages available on the server. IMAP (Internet Message Access Protocol) is a different protocol for retrieving mail from an email server; however, IMAP supports downloading selected messages only and leaving the rest on the server.

· The NNTP (Network News Transfer Protocol) provides the facilities for transferring information on newsgroups (Usenet news). The protocol allows posting, distribution, and retrieval of the messages among both clients and servers.

· LDAP (Lightweight Directory Access Protocol) is an open protocol for accessing information directories, which supply such data as email addresses and names.

· Gopher was a method for organizing and displaying files on an Internet server before the advent of the World Wide Web. This system has largely been replaced by the web.

· TELNET is a protocol used mostly on Unix servers, which allows users to log onto a remote computer and use it as if they were sitting at the console themselves.

· LPR (Line Printer Remote) allows a user to send a print file to a remote server for printing.
