In recent years, the number of Web users has grown explosively, exposing design weaknesses in HTTP as network resources become saturated. The newest incarnation of HTTP, HTTP/1.1, aims to address some of these weaknesses while maintaining a high degree of compatibility with HTTP/1.0.
Compared to its predecessors, HTTP/1.1 improves overall performance with a suite of new features to facilitate more efficient use of proxy servers, caches, persistent connections, and partial document retrieval. Additional features include an enhanced authentication mechanism, a content encoding for dynamically generated entity-bodies, and provisions for protocol switching. While early protocol designs emphasized simplicity, HTTP/1.1's main emphasis is maturity: its advances in robustness, security, and efficiency point towards the future designs of HTTP-NG.
In this article, we give a brief history of HTTP and then discuss the most significant improvements in the HTTP/1.1 proposed standard.
    GET /weather/index.html

The server would then return the contents of /weather/index.html.
While this model seems astoundingly simple, it got the job done. The most basic need of Web applications was satisfied: to get remote data over the network.
In this simple implementation, one could easily write a Web server that passes the URI to an open system call and transmits the contents of the file over the network. This led to a hierarchical organization of resources, since most filesystems are organized as such, although some modern systems may have other organizations as Web servers are increasingly being used in conjunction with databases.
    1/macslip-scripts/touch-tone

The server would then transmit the contents of /macslip-scripts/touch-tone. The leading 1 in the URI indicates that the client is expecting a Gopher directory, which is analogous to HTML for HTTP. If there were a 0 instead of a 1, the client would expect a file, such as a plainly formatted text or graphic file. This leading number, as it turns out, indicates a media type. By specifying the type of data with the request, the client has a better understanding of the data received.
In the end, ten media types were defined in the Gopher specification.
Client request:
    GET /weather.html HTTP/1.0
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*

Server response:

    HTTP/1.0 200 OK
    Date: Saturday, 20-May-95 03:25:12 GMT
    Server: NCSA/1.3
    Content-type: text/html
    Last-modified: Wednesday, 14-Mar-95 18:15:23 GMT
    Content-length: 1029

HTTP/1.0 introduced the following features:
Byte ranges eliminate unnecessary communication between the client and server, thereby increasing performance. Byte ranges also provide more efficient downloading of incomplete entity-bodies in the cache. When a user interrupts a transfer and requests the entity-body at a later time, the client uses byte ranges to complete the interrupted transfer. Within HTTP/1.0, the client has no choice but to request the entire document from scratch.
To request a particular range, the client uses the Range header. For example, to request the first thousand bytes of a document:
    Range: bytes=0-999

If the first number in a byte range is omitted, the range is counted from the end of the document (for example, bytes=-500 requests the final 500 bytes). If the second number is omitted, the range extends from the specified byte to the end of the document.
When responding to a range request, the server uses the Content-Range header to specify what range is being transmitted. The client can use this information to determine where in the document the data should be inserted. The content range gives the starting byte, the ending byte, and the total number of bytes in the document. For example, when returning the first thousand bytes in a document, the server might send:
    Content-Range: bytes 0-999/21912
    Content-Length: 1000

A server can indicate whether it supports byte ranges using the Accept-Ranges header.
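The Range and Content-Range rules described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `resolve_range` and `content_range` are our own, and only a single range per request is handled.

```python
def resolve_range(spec, total):
    """Resolve one 'bytes=first-last' spec against a document of `total` bytes.

    Returns (start, end) as inclusive byte offsets, or None when the
    range cannot be satisfied.
    """
    unit, _, byte_range = spec.partition("=")
    if unit.strip() != "bytes" or "," in byte_range:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, _, last = byte_range.strip().partition("-")
    if first == "":
        # Suffix form "-N": the final N bytes of the document.
        length = int(last)
        if length == 0:
            return None
        start, end = max(total - length, 0), total - 1
    else:
        start = int(first)
        # Open-ended form "N-": from byte N to the end of the document.
        end = int(last) if last else total - 1
        end = min(end, total - 1)
    if start > end or start >= total:
        return None
    return start, end


def content_range(start, end, total):
    """Build the Content-Range value for a partial response."""
    return f"bytes {start}-{end}/{total}"
```

For the 21912-byte document used above, `resolve_range("bytes=0-999", 21912)` yields the offsets that `content_range` turns back into `bytes 0-999/21912`.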
In addition to PDF documents, byte ranges are useful for Web-based streaming multimedia. Clients, for example, could play subsets of MPEG, AVI, or QuickTime movies while the next subset is requested from a server.
    Accept-Language: da, en-gb;q=0.8, en;q=0.7

indicates that the client prefers a Danish entity-body (with a preference of 1), but will accept British English (with a preference of 0.8) as well. Finally, an entity-body in any other variety of English would also be accepted (with a preference of 0.7).
Quality factors, when applied to the Accept header, can indicate the client's preference for certain media types. These preferences may be inferred from the user's preferences. For example, a user may prefer to download a JPEG image instead of GIF, to take advantage of JPEG's compression and smaller download times.
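As a rough illustration of how a server might read quality factors, the following Python sketch parses a header value such as the Accept-Language example above and orders the entries by preference. The function name `parse_quality_list` is our own, and the sketch ignores any parameters other than q.

```python
def parse_quality_list(header_value):
    """Return (value, q) pairs sorted from most to least preferred."""
    entries = []
    for item in header_value.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                q = float(val)
        entries.append((parts[0], q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)
```

A server could then walk the sorted list and return the first variant it actually has available.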
Chunks are useful for the transmission of streaming multimedia, where one frame of the media may vary in size and composition from the next. One possible use of chunked data is streaming video, where an entire image is transmitted in the first chunk, and differences to the previous image are transmitted in the next chunk.
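To make the idea of chunks concrete, here is a minimal Python sketch of the chunked framing as we understand it from the HTTP/1.1 specification: each chunk is preceded by its size in hexadecimal, and a zero-size chunk ends the body. The function names are our own, and trailers after the final chunk are not handled.

```python
def encode_chunked(chunks):
    """Frame an iterable of byte strings as a chunked entity-body."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body early
        out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, followed by an empty trailer
    return bytes(out)


def decode_chunked(data):
    """Reassemble the original body from a chunked entity-body."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore any chunk extension that follows a ';' on the size line.
        size_field = data[pos:line_end].split(b";")[0].decode("ascii")
        size = int(size_field, 16)
        if size == 0:
            return bytes(body)
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
```

Because each chunk announces its own size, the sender does not need to know the total length of the entity-body before it starts transmitting.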
    HTTP/1.1 101 Switching Protocols
    Upgrade: HTTP/2.0
    .
    .   (other headers ...)
    .

Being able to switch protocols allows clients and servers to communicate at higher levels of HTTP than 1.1 or to use alternate protocols more suited to the purpose of the transaction. In addition, no extra TCP connections are established to switch to the new protocol. This eliminates TCP slow start delays and connection overhead.
Although the Content-MD5 header is generated only by origin servers, it provides a mechanism to detect entity-body corruption anywhere within the response chain. The header does not make the entity-body tamper-resistant, since anyone able to alter the body could also alter the header. Instead, it lets the recipient verify that the entity-body was transmitted without accidental error.
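A recipient's integrity check can be sketched in Python as follows. We assume here that the header value is the base64 encoding of the 128-bit MD5 digest of the entity-body, as in the Content-MD5 definition we are aware of; the function names are our own.

```python
import base64
import hashlib


def content_md5(body):
    """Compute a Content-MD5 value: base64 of the raw MD5 digest of the body."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def body_is_intact(body, header_value):
    """Return True when the received body matches the Content-MD5 header."""
    return content_md5(body) == header_value.strip()
```

A mismatch tells the recipient only that something changed in transit; it cannot say where in the response chain the corruption occurred.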
To address these problems, the digest authentication scheme [3] allows clients to send authorization information without sending the username and password in the clear. When the client requests a protected resource on the server, the server responds with a nonce value. Under ideal circumstances, this value is unique to the client and time based, so that it expires if a proper response is not issued within a predetermined time. The client's response is a 128-bit digest computed from the user's username and password, the nonce value, and the URI. This value, encoded as a 32-character hexadecimal string, is sent back to the server for authentication checks.
Through the use of the digest, the client and server never send the actual password over the network. Instead, a digest value is returned, and the server may refuse to honor it once the server-specific expiration time for the nonce has passed. This ensures that a replay attack will succeed for at most one resource, unlike the Basic scheme, where a captured password gives access to an entire realm.
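The digest computation can be sketched in Python as below. This follows the original form of the digest scheme as we understand it, without later extensions; note that the realm and the request method, which the description above does not list, also enter the calculation. The function names are our own.

```python
import hashlib


def _md5_hex(text):
    """Return the MD5 digest of `text` as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, nonce, method, uri):
    """Compute the client's response value for a digest challenge."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # identifies the user
    ha2 = _md5_hex(f"{method}:{uri}")                 # identifies the request
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")
```

The server performs the same calculation with its own copy of the password and compares the results. Because the URI is part of the digest, a captured response is of no use for any other resource.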
As with other password authorization schemes, digest does not address the issue of how to initially define a password on the server without sending it in the clear over the network.
To eliminate delays in establishing TCP connections, HTTP/1.1 clients and servers do not assume one transaction per connection. By default, the connection is maintained and multiple request/response transactions may take place. This new mode of connection, called a persistent connection, utilizes the network more efficiently by reducing the overhead of establishing new connections and maintaining connection information in the TCP stack. Furthermore, clients can pipeline requests by sending multiple requests at the beginning of the session without waiting for each response, and then reading the responses from the server in order.
When either the client or server wishes to disconnect, the request or response contains a Connection: close header. For example, if a client wishes to request three entity-bodies, the transaction would look like this:
Client request:
    GET /images/red-dot.jpg HTTP/1.1
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*

Server response:

    HTTP/1.1 200 OK
    Content-type: image/jpeg
    Content-length: 1029
    [entity-body]

Client request:

    GET /images/green-dot.jpg HTTP/1.1
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*

Server response:

    HTTP/1.1 200 OK
    Content-type: image/jpeg
    Content-length: 1029
    [entity-body]

Since the client wishes to close the connection after this transaction, the client issues its last request with the Connection header:

    GET /images/blue-dot.jpg HTTP/1.1
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
    Connection: close

Server response:

    HTTP/1.1 200 OK
    Content-type: image/jpeg
    Content-length: 1029
    [entity-body]
    [TCP connection closes]

In addition, HTTP/1.1 reduces separate TCP connections with the Upgrade header, as mentioned earlier. Instead of disconnecting the TCP connection to re-establish a connection under a different protocol, the Upgrade header allows the client and server to switch to a protocol more suited to the transaction or communication requirements.
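The three-request exchange over a persistent connection shown earlier can be sketched from the client's side in Python. The function name `build_requests` and the host name are hypothetical. We add the Host header to every request because HTTP/1.1 requires it, and we omit the Accept header for brevity.

```python
def build_requests(host, paths):
    """Serialize GET requests destined for one persistent connection.

    Only the final request carries 'Connection: close', which tells the
    server to close the connection after answering it.
    """
    messages = []
    for i, path in enumerate(paths):
        lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
        if i == len(paths) - 1:
            lines.append("Connection: close")
        # A blank line terminates the headers of each request.
        messages.append("\r\n".join(lines) + "\r\n\r\n")
    return "".join(messages).encode("ascii")
```

Writing the returned bytes to a single socket in one call amounts to pipelining: all three requests are sent before the first response is read.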
The 100 response code indicates to HTTP/1.1 clients that the initial request has not yet been rejected by the server, and the client may continue with the rest of the request. This allows the server to quickly reject requests for non-existent URIs, or requests that fail IP-based authentication, before the client transmits the rest of the request.
The 100 response code (or lack of it) also serves as an indication of a non-HTTP/1.0 server. When a client wishes to communicate using HTTP/1.1 or greater, it must first verify that the server understands 1.1, instead of 1.0. Since the 100 response code is a new introduction to HTTP in 1.1, the client may now assume that such a response is issued from a 1.1 (or greater) server.
Entity tags also simplify the cache management process, by reducing the set of URIs at a site into a set of entity tags. In HTTP/1.0, a cache would need to maintain a set of URIs and their corresponding modification times. A major drawback, however, is the problem of storing the same resource at multiple URIs on the server. Under the old system implemented by HTTP/1.0, it would not be possible to determine if two different URIs referred to the same resource. (For example, /homes/zxc/index.html could be the same resource as /~zxc/index.html.) Under the entity tag system, a caching proxy can now determine that different resource locations refer to the same resource and keep only one copy of the entity in the cache.
An entity tag is either strong or weak. A strong entity tag changes when any portion of the resource changes. If one or more bytes change in a given entity-body, a strong entity tag for the resource would also change. A weak entity tag changes only when the semantics of the entity-body change. Hence, multiple (but similar) resources could share the same weak entity tag even though the bytes within the entity are different. The distinction between strong and weak entity tags gives the client more options: if the entity-body changes but keeps its semantics, then it may be feasible to avoid reloading the entity over an expensive communications link.
The If-Match and If-None-Match headers make a request conditional on the value of the resource's entity tag. The If-Match header means that the request should be performed only when a supplied entity tag matches the entity tag of the resource on the server. The If-None-Match header, on the other hand, indicates that the request should be performed only when none of the entity tags supplied by the client matches that of the resource.
The If-Match and If-None-Match headers are useful for transactions using the PUT method, since they guarantee that the resource is updated from an entity that was previously known to the client. This facilitates the use of a revision control system at the protocol level. When If-None-Match is used with * in a PUT request, the server performs the request only if the resource does not already exist.
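The conditional logic just described can be sketched as a small Python function. The name `precondition_holds` is our own, entity tags are compared as plain strings, and the distinction between strong and weak tags is ignored for simplicity.

```python
def precondition_holds(current_etag, if_match=None, if_none_match=None):
    """Decide whether a conditional request should be performed.

    `current_etag` is the entity tag of the resource on the server, or
    None when the resource does not exist yet.
    """
    def tags(value):
        return [t.strip() for t in value.split(",")]

    if if_match is not None:
        if if_match.strip() == "*":
            # "*" matches any existing resource.
            if current_etag is None:
                return False
        elif current_etag not in tags(if_match):
            return False
    if if_none_match is not None:
        if if_none_match.strip() == "*":
            # "*" fails as soon as any version of the resource exists.
            if current_etag is not None:
                return False
        elif current_etag in tags(if_none_match):
            return False
    return True
```

A server handling a PUT would perform the update only when this function returns True, and would otherwise answer with 412 Precondition Failed.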
To facilitate more flexibility with proxy servers, HTTP/1.1 includes a new header to specify caching behavior and preferences. In HTTP/1.0, the only caching mechanisms were the Pragma header with a no-cache parameter and the If-Modified-Since header. In HTTP/1.1, the Cache-Control header indicates a client's caching request or a resource's caching attributes in the server's response.
With the Cache-Control header, a client now has greater flexibility to request a resource with a desired age, freshness, and staleness. The max-age option specifies that the client will accept a response only if its age is no greater than the given number of seconds. The min-fresh option requests a response that will remain fresh for at least the given number of seconds from the current time. The max-stale option indicates that an entity that has been expired for no more than the given number of seconds is acceptable. The no-cache option, formerly of the Pragma header, has migrated to the Cache-Control header.
In addition, the client can specify that the response from a server is not to be stored by a caching proxy with the no-store option, to prevent secure information from being stored on non-volatile storage. Finally, the only-if-cached option can be used to retrieve a cached copy of a resource when the origin server is down or when the proxy-to-origin-server network is unstable.
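The following Python sketch shows one way a cache might apply the client-side options above when deciding whether a stored response may be returned. The function names are our own; `age` and `freshness_lifetime` are assumed to be in seconds and to have been computed elsewhere by the cache.

```python
def parse_cache_control(value):
    """Parse a Cache-Control value into {directive: int-or-None}."""
    directives = {}
    for item in value.split(","):
        name, sep, arg = item.strip().partition("=")
        if name:
            directives[name.lower()] = int(arg) if sep and arg.isdigit() else None
    return directives


def cached_copy_acceptable(request_cc, age, freshness_lifetime):
    """Check a cached response against the client's Cache-Control options."""
    d = parse_cache_control(request_cc)
    if "no-cache" in d:
        return False
    if d.get("max-age") is not None and age > d["max-age"]:
        return False
    remaining = freshness_lifetime - age  # negative once the entry is stale
    if d.get("min-fresh") is not None and remaining < d["min-fresh"]:
        return False
    if remaining < 0:
        if "max-stale" not in d:
            return False
        limit = d["max-stale"]  # None means any staleness is acceptable
        if limit is not None and -remaining > limit:
            return False
    return True
```

For example, a response that expired 20 seconds ago is still acceptable to a client that sent max-stale=50, but not to one that sent max-stale=10.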
On the server side, the server can now specify to caching systems that information is public or private with the Cache-Control header. Public documents may be cached by shared caches, but private documents can only be stored in a particular user's private cache. The must-revalidate option specifies that a cache must check with the origin server before reusing a document that has become stale. In addition to the max-age, no-cache, and no-store options, servers can specify a no-transform option to prevent proxy servers from modifying the entity-body or headers of the origin server's response.
In HTTP/1.0, the server could have multiple DNS entries and different IP addresses, each corresponding to a different document tree on the server. To conserve IP addresses, software multihoming allows the server to have one IP address with multiple DNS entries pointing to it. The Host header, sent by the client, contains the hostname and port. By examining the Host header, the server can then determine the correct document tree to use when mapping the URI to a resource on the system.
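As an illustration of software multihoming, the Python sketch below selects a document tree from the Host header. The host names, directory paths, and function names are hypothetical, and the sketch ignores the port and performs no checks against paths that escape the document tree.

```python
# Hypothetical mapping from host name to document tree.
DOCUMENT_ROOTS = {
    "www.example.com": "/var/www/example",
    "weather.example.com": "/var/www/weather",
}


def document_root(host_header):
    """Pick the document tree for a request from its Host header value."""
    # The value may carry a port ("host:port"); only the name is used here.
    host, _, _port = host_header.strip().partition(":")
    return DOCUMENT_ROOTS.get(host.lower())


def resolve(host_header, uri_path):
    """Map a request URI to a file path, or None for an unknown host."""
    root = document_root(host_header)
    if root is None:
        return None
    return root + uri_path
```

Both names resolve to the same IP address, so without the Host header the server could not tell which of the two trees a request for /index.html refers to.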
HTTP-NG plans to transition today's Web with a heavy emphasis on proxy servers. To glue HTTP/1.0 and 1.1 based clients to HTTP-NG, older Web clients can communicate to proxy servers in native 1.0 or 1.1, while the proxy server communicates to the origin server using more powerful and efficient features in HTTP-NG.
Separate requests to a proxy server for a given origin server can be combined into one connection with multiple channels. The multiplexing of separate TCP connections into one connection offers a more efficient use of network resources. As a result, a proxy server's connections to commonly used sites tend to last longer, since new requests to the origin server keep the connection alive after current requests have finished.
In HTTP/1.1, a few new features have facilitated content negotiation. HTTP-NG plans to extend these features to allow a more efficient negotiation scheme for content, security, and payment options. In addition to features that are emerging in 1.1, NG addresses commercial needs with mechanisms to enforce copyright control and to display license agreements.