HTTP
# Hypertext Transfer Protocol
HTTP is the Web’s application layer protocol, sitting above the transport layer and an optional encryption layer. A Web page consists of many objects, each addressable by a Uniform Resource Locator (URL). HTTP uses TCP as its underlying transport protocol.
# A brief history rundown
- HTTP 0.9 began in 1991 with the goal of transferring HTML between client and server.
- HTTP 1.0 evolved to add more capabilities, such as header fields and support for file types beyond HTML, making the protocol’s name something of a misnomer: it had become a general hypermedia transport.
- HTTP 1.1 introduced critical performance optimisations such as keepalive connections, chunked transfer encoding and additional caching mechanisms.
- HTTP 2.0 aimed to improve transport performance for lower latency and higher throughput.
# HTTP message format
# Request
The GET, POST, PUT, DELETE and HEAD methods are available.
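For illustration, a minimal plaintext HTTP/1.1 request message: a request line, header lines, and a blank line marking the end of the headers (the host and path here are placeholders):

```http
GET /index.html HTTP/1.1
Host: www.example.com
Connection: keep-alive
Accept: text/html
```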
# Response
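For illustration, the corresponding plaintext HTTP/1.1 response message: a status line, header lines, a blank line, then the entity body (the values here are placeholders):

```http
HTTP/1.1 200 OK
Connection: keep-alive
Content-Type: text/html
Content-Length: 1024

(entity body: the requested HTML document)
```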
# User-Server State
HTTP is a stateless protocol and does not maintain information about its clients. This simplifies server design and allows for high-performance Web servers.
# Cookies
It is often desirable for a Web site to identify users, either to restrict user access or serve specific content. Cookie technology consists of 4 components:
- A Set-Cookie header line in the HTTP response message
- A Cookie header line in the HTTP request message
- Cookie file kept on the user’s end system and managed by the user’s browser
- Back-end database at the Web site
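A sketch of how the first two components fit together (the cookie value and site are illustrative). The server assigns an identifier in its response:

```http
HTTP/1.1 200 OK
Set-Cookie: session-id=1678
```

and the browser returns it on every subsequent request to the same site:

```http
GET /catalog HTTP/1.1
Host: www.example.com
Cookie: session-id=1678
```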
# Web Caching
A proxy server acts as a Web cache: it has its own disk storage and keeps copies of recently requested objects. Because the proxy server sits on the LAN, close to clients, it reduces the response time for a request.
# Optimisations in HTTP 1.1
# HTTP Keepalive
Reuse existing TCP connections across requests, saving one round trip of network latency (the TCP handshake) for each subsequent request.
# HTTP Pipelining
Persistent HTTP without pipelining implies a strict FIFO order on the client: dispatch a request, wait for the full response, then dispatch the next. Pipelining moves this FIFO queue to the server side: the client can send all requests at once, and the server reduces its idle time by processing each request as soon as it arrives.
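A minimal sketch of what pipelining looks like on the wire: the client serializes all request bytes up front for a single write, instead of waiting for each response before sending the next request (the host and paths are illustrative; no network I/O is performed here):

```python
def pipeline_requests(host: str, paths: list[str]) -> bytes:
    """Serialize several GET requests back-to-back for one TCP write."""
    messages = []
    for path in paths:
        messages.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: keep-alive\r\n"
            "\r\n"
        )
    # All requests are dispatched in one burst; the responses must
    # still come back in the same FIFO order.
    return "".join(messages).encode("ascii")

batch = pipeline_requests("www.example.com", ["/index.html", "/style.css"])
print(batch.count(b"GET "))  # 2: both requests sit in a single buffer
```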
# Why not do server processing in parallel?
HTTP 1.x requires strict serialisation of returned responses, a constraint similar to TCP head-of-line blocking, which arises from TCP’s strict in-order delivery requirement. Hence, even if the CSS response finishes first, the server must wait for the full HTML response to be delivered before it can send the CSS asset.
# Parallel TCP Connections
Rather than opening one TCP connection and sending each request one after another, the client can open multiple TCP connections in parallel. In practice, most browsers use a limit of 6 connections per host.
These connections are independent of one another, so responses on different connections do not suffer the head-of-line blocking that serialised server processing imposes.
# Domain Sharding
Although browsers can maintain a connection pool of up to 6 TCP streams per host, this may not be enough, considering that an average page needs 90+ individual resources; if they are all delivered by the same host, there will be queueing delays. Sharding artificially splits a single host, e.g. www.example.com, into {shard1,shard2}.example.com, achieving a higher degree of parallelism at the cost of additional network resources.
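Sharding can be sketched as a deterministic mapping from resource name to shard hostname, so the same asset always resolves to the same shard and stays cacheable; the two-shard layout mirrors the example above, and the hash choice is an assumption:

```python
import zlib

SHARDS = ["shard1.example.com", "shard2.example.com"]

def shard_host(resource_path: str) -> str:
    """Pick a shard deterministically, so that each asset's URL is
    stable across page loads (stable URLs keep caches effective)."""
    index = zlib.crc32(resource_path.encode("utf-8")) % len(SHARDS)
    return SHARDS[index]

# Each of the ~90 resources is spread over the shard hostnames,
# giving the browser up to 6 connections *per shard*.
host = shard_host("/img/logo.png")
```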
# Enhancements in HTTP 2.0
HTTP 2.0 extends the standards of previous versions and is designed so that all applications built on previous versions can carry on without modification.
# Binary Framing Layer
At the core of the performance enhancements is the binary framing layer, which dictates how HTTP messages are encapsulated and transferred. Rather than delimiting parts of the protocol with newlines as in HTTP 1.x, all communication is split into smaller frames and encoded in binary:
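Concretely, every HTTP 2.0 frame begins with a fixed 9-byte binary header: a 24-bit payload length, an 8-bit type, 8 bits of flags, and a 31-bit stream identifier (the top bit is reserved). A sketch of packing and unpacking that header (the helper names are illustrative):

```python
import struct

def pack_frame_header(length: int, ftype: int, flags: int, stream_id: int) -> bytes:
    """Encode the 9-byte frame header in network (big-endian) order."""
    return struct.pack(
        ">BHBBL",
        (length >> 16) & 0xFF, length & 0xFFFF,  # 24-bit length, split 8+16
        ftype, flags,
        stream_id & 0x7FFFFFFF,                  # clear the reserved top bit
    )

def unpack_frame_header(header: bytes) -> tuple:
    """Decode a 9-byte header back into (length, type, flags, stream_id)."""
    hi, lo, ftype, flags, stream_id = struct.unpack(">BHBBL", header)
    return (hi << 16) | lo, ftype, flags, stream_id & 0x7FFFFFFF

# Round-trip a DATA-style frame header on stream 3.
print(unpack_frame_header(pack_frame_header(16, 0x0, 0x1, 3)))  # (16, 0, 1, 3)
```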
# Request and Response Multiplexing
As discussed in “Why not do server processing in parallel?”, only one response can be delivered at a time per HTTP 1.x connection. HTTP 2.0 removes this limitation: requests and responses are multiplexed over a single connection, so workaround optimisations from HTTP 1.x such as domain sharding are no longer necessary.
# Request Prioritisation
The exact order in which frames are interleaved and delivered can be optimised further by assigning each stream a 31-bit priority value (0 represents the highest priority, $2^{31}-1$ the lowest). HTTP 2.0 merely provides the mechanism by which priority data can be exchanged; it does not mandate any specific prioritisation algorithm, which is left to the server to implement.
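One way a server might act on the exchanged priority values is a simple priority queue over pending frames, serving the lowest value first; this particular scheduling policy is an assumption, since HTTP 2.0 leaves the algorithm to the server:

```python
import heapq

class FrameScheduler:
    """Deliver pending frames lowest-priority-value-first (0 = highest)."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker: FIFO among equal priorities

    def enqueue(self, priority: int, frame: bytes) -> None:
        heapq.heappush(self._heap, (priority, self._seq, frame))
        self._seq += 1

    def next_frame(self) -> bytes:
        return heapq.heappop(self._heap)[2]

sched = FrameScheduler()
sched.enqueue(2**31 - 1, b"analytics")  # lowest priority
sched.enqueue(0, b"html")               # highest priority
sched.enqueue(4, b"css")
print(sched.next_frame())  # b'html': the highest-priority frame goes first
```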
# Server Push
A document contains dozens of resources, which the client would otherwise discover only after parsing it. To eliminate this extra latency, the server can work out which resources the client will require and push them ahead of time. Essentially, the server can send multiple replies for a single request, without the client having to explicitly request each resource.
# Header Compression
Each HTTP transfer carries a set of headers that describe the transferred resource and its properties. In HTTP 1.x, this metadata is always sent as plain text and adds anywhere from 500–800 bytes of overhead per request, and kilobytes more if HTTP cookies are required.
# Header table
A header table is used on both the client and the server to track and store previously sent key-value pairs. The tables persist for the entire connection and are incrementally updated by both sides: each new pair is either appended to the table or replaces a previous value. This allows a new set of headers to be encoded as a simple difference from the previous set:
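The differential update can be sketched as follows: each side keeps a table of previously sent headers and transmits only the pairs that are new or changed. This is a deliberate simplification of the real encoding, which also indexes names and values:

```python
def encode_delta(table: dict, headers: dict) -> dict:
    """Return only the header pairs that differ from the connection's
    header table, then bring the table up to date."""
    delta = {k: v for k, v in headers.items() if table.get(k) != v}
    table.update(headers)
    return delta

table = {}  # one table per connection, shared state with the peer
first = encode_delta(table, {":method": "GET", ":path": "/", "user-agent": "demo"})
second = encode_delta(table, {":method": "GET", ":path": "/style.css", "user-agent": "demo"})
print(second)  # {':path': '/style.css'} - only the changed pair is sent
```

On the second request only the changed `:path` pair crosses the wire; the repeated method and user-agent strings are recovered from the table.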