Analyzing HTTP/2 with Wireshark
HTTP/2 is a major revision of HTTP, the protocol that underpins web browsing. Thanks to several improvements, HTTP/2 is considerably faster than HTTP/1.x. We will review how HTTP/2 works and which features make it stronger than its predecessor.
For better understanding, we will use several tools to show concrete examples. Since this article consists mostly of comparisons, we need to install our own web server and activate some features on it to have some flexibility.
Have you ever wondered how browsers check whether a web server supports HTTP/2? Thanks to the TLS extension ALPN (Application-Layer Protocol Negotiation), it is easy to negotiate which protocol to use. Before capturing traffic with Wireshark, we need to configure the Apache web server that hosts our test web page. We will activate the HTTP/2 module on the Apache server by adding the directive to the httpd.conf file as below.
The next step is to declare which protocols the server supports by adding a directive to the httpd-ssl.conf file as below.
h2 is HTTP/2 over TLS (protocol negotiation via ALPN).
h2c is HTTP/2 over cleartext TCP (no TLS).
http/1.1 is self-explanatory.
Most modern web browsers require TLS for HTTP/2, so there is no way to test plaintext HTTP/2 with those browsers. curl is a good tool to work around this problem, since it supports h2c without TLS.
With that being said, it is time to get our hands dirty with pcaps. After the TCP three-way handshake completes, TLS negotiation starts as below. TLS uses the ALPN extension to negotiate which version of HTTP to use. The capture shows that our client offers h2 (HTTP/2) and http/1.1 in the Client Hello packet.
The Apache web server responds to the client by selecting h2 (HTTP/2) in the Server Hello packet, as below.
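To make the ALPN exchange more concrete, here is a minimal Python sketch of how the protocol list seen in the Client Hello is laid out on the wire (a 2-byte list length, then each name prefixed by a 1-byte length, per RFC 7301). The function names are hypothetical, and the selection rule (first server preference that the client also offers) is an assumption about server policy rather than something read from the capture.

```python
import struct


def encode_alpn(protocols):
    """Encode protocol names as an ALPN ProtocolNameList (RFC 7301)."""
    body = b""
    for name in protocols:
        raw = name.encode("ascii")
        # Each entry is a 1-byte length followed by the protocol name.
        body += bytes([len(raw)]) + raw
    # The whole list is prefixed with its 2-byte big-endian length.
    return struct.pack("!H", len(body)) + body


def decode_alpn(data):
    """Decode an ALPN ProtocolNameList back into a list of strings."""
    (total,) = struct.unpack("!H", data[:2])
    body = data[2:2 + total]
    names, pos = [], 0
    while pos < len(body):
        length = body[pos]
        names.append(body[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return names


def select_protocol(server_prefs, client_offer):
    """Assumed policy: pick the first server preference the client offered."""
    for proto in server_prefs:
        if proto in client_offer:
            return proto
    return None


client_hello_alpn = encode_alpn(["h2", "http/1.1"])
offered = decode_alpn(client_hello_alpn)
chosen = select_protocol(["h2", "http/1.1"], offered)
```

With the client offering h2 and http/1.1 and the server preferring h2, `chosen` ends up as `"h2"`, which mirrors what the Server Hello reports.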
After the TLS negotiation completes, HTTP/2 is ready to be used. Before analyzing HTTP/2 packets, we need to talk a little about our web page. The page references two files, “script.js” and “style.css”, as below.
Let’s assume we are using HTTP/1.1 and make a request like “GET /test/index.php HTTP/1.1”. Our browser will first download the index.php page and then render it. Next, the browser will establish two more TCP connections for the style.css and script.js files, resulting in three different connections for that page. The process looks briefly like below.
As seen above, three different connections are required to download all of the resources. Each connection consumes some resources, so the fewer connections we establish, the fewer resources are consumed. Most modern web browsers allow only a limited number of concurrent TCP connections per host. Domain sharding, which spreads resources across several hostnames, is one way to work around that limit.
With HTTP/1.1, there is a way to download all resources (index.php, style.css and script.js) over one TCP connection by setting the “Connection” header field to a specific value. Let’s dig a little deeper by configuring the Connection header field on the Apache web server. When the Connection header field is set to “close”, a separate TCP connection is created for each resource, as above. “keep-alive” is another value for the Connection header field; it lets the browser establish one persistent TCP connection over which all resources are downloaded.
The keep-alive setting can be configured in Apache’s httpd.conf file as above. After that, the new requests/responses will look like below.
It seems that we have managed to decrease the number of connections by setting the Connection header field to keep-alive, but there is still a caveat. The requests are sent in sequence. In other words, each request waits for the previous one to complete. HTTP/2 multiplexing (parallelizing) comes into play to remove that limitation.
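The effect of the Connection header field described above can be summarized in a few lines of Python. This is only a sketch of the general HTTP/1.x rule (HTTP/1.1 connections persist unless “close” is sent, while HTTP/1.0 connections persist only when “keep-alive” is requested). The helper name `is_persistent` is hypothetical and not part of Apache or any library.

```python
def is_persistent(http_version, headers):
    """Return True if the connection should stay open after this message."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    tokens = {
        token.strip().lower()
        for token in lowered.get("connection", "").split(",")
        if token.strip()
    }
    if "close" in tokens:
        return False
    if http_version == "HTTP/1.1":
        # HTTP/1.1 connections are persistent by default.
        return True
    # HTTP/1.0 only persists when keep-alive is requested explicitly.
    return "keep-alive" in tokens
```

Even when this returns `True` for every request on the page, the requests on that single connection are still answered one after another, which is the limitation that multiplexing addresses.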
Multiplexing
HTTP/2 introduces streams, which allow multiple request/response exchanges to run in parallel over one TCP connection while consuming fewer resources. Each stream carries one or more messages (a request or a response). Each message consists of its own frames (HEADERS, DATA and so on). The figure below represents how HTTP/2 multiplexing works.
Wireshark output for stream 1 is below.
With HTTP/2 multiplexing, we can download multiple files (resources) in parallel while consuming fewer hardware resources such as memory and processing power.
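As a rough illustration of multiplexing, the Python sketch below takes frames in the interleaved order they might arrive on one TCP connection and regroups them by stream identifier. The frame tuples and payloads are made up for the example, and the stream numbers 1, 3 and 5 simply follow the rule that client-initiated streams use odd identifiers.

```python
from collections import defaultdict


def demultiplex(frames):
    """Group interleaved (stream_id, frame_type, payload) tuples by stream."""
    streams = defaultdict(list)
    for stream_id, frame_type, payload in frames:
        streams[stream_id].append((frame_type, payload))
    return dict(streams)


# Hypothetical arrival order on a single TCP connection.
wire = [
    (1, "HEADERS", b""),
    (3, "HEADERS", b""),
    (5, "HEADERS", b""),
    (3, "DATA", b"body{margin:0}"),
    (1, "DATA", b"<html>"),
    (5, "DATA", b"alert(1);"),
    (1, "DATA", b"</html>"),
]

by_stream = demultiplex(wire)
# Reassemble the body carried on stream 1 from its DATA frames.
page = b"".join(payload for kind, payload in by_stream[1] if kind == "DATA")
```

Because every frame carries its stream identifier, the receiver can reassemble each response independently even though the frames of different responses are mixed on the wire.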
Server Push
Imagine that we request the “index.php” page, which references a CSS and a JS file. When the web server gets that request, it knows that the page includes two other files and sends these files to us before we ask for them. In other words, the server sends multiple responses to the client for a single request. This feature is called server push; it decreases latency so that web pages load faster. The figure below gives a basic picture of server push.
Let’s configure the server push feature for style.css and script.js on the Apache web server by adding the directives below.
After the TCP three-way handshake, the browser sends a “GET” request for the index.php page. The server responds to the web browser with a PUSH_PROMISE frame, announcing that it will also send the style.css and script.js files, which prevents the browser from making duplicate requests for these resources. The output from Wireshark is below.
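For readers who want to see what is inside a PUSH_PROMISE frame, here is a small Python sketch that decodes its payload as laid out in RFC 7540: an optional pad length byte, a 31-bit promised stream identifier, then the header block fragment and any padding. The sample bytes are invented for illustration and are not taken from the capture.

```python
import struct

FLAG_PADDED = 0x8  # PADDED flag for PUSH_PROMISE frames


def parse_push_promise(flags, payload):
    """Return (promised_stream_id, header_block_fragment)."""
    pos, pad_len = 0, 0
    if flags & FLAG_PADDED:
        # The first byte gives the number of trailing padding bytes.
        pad_len = payload[0]
        pos = 1
    (raw,) = struct.unpack("!I", payload[pos:pos + 4])
    promised_id = raw & 0x7FFFFFFF  # drop the reserved bit
    end = len(payload) - pad_len
    return promised_id, payload[pos + 4:end]


# Hypothetical payload: promised stream 2 plus a one-byte header fragment.
sample = struct.pack("!I", 2) + b"\x82"
promised_id, fragment = parse_push_promise(0x4, sample)
```

Server-initiated streams use even identifiers, so the promised stream in a push is expected to be even, as it is in this example.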
Binary Framing
Unlike HTTP/1.x, HTTP/2 uses binary framing. In HTTP/1.x, requests are made with simple text commands. For example, with a telnet client we can make an HTTP GET request without using a browser-like program. The only thing we need to do is prepare the HTTP header lines, each terminated by the delimiting characters CRLF (\r\n). The figure below shows a raw “GET” request with some other header fields. Each line ends with the delimiting characters. After receiving this request, the web server parses the text using these delimiters.
We can create the same request with a telnet client, delimiting the header fields as below.
HTTP/2 uses a binary framing format, which is completely different from parsing text commands with delimiters. A binary frame called HEADERS carries the request or response header fields, like below. As opposed to HTTP/1.x, a binary frame also states its own length. As a result, the server knows how much data to expect and can process received data much faster. Since HTTP/2 is a binary protocol, we cannot use telnet-like clients to craft a request.
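The same idea can be sketched in Python: build the request as plain text with CRLF after every line, then split it again on those delimiters the way a text-based parser would. The host name `example.test` and the helper names are placeholders chosen for this example.

```python
def build_request(method, path, headers):
    """Serialise an HTTP/1.1 request; every line ends with CRLF."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # An empty line marks the end of the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_request(raw):
    """Split a request on CRLF, the way a text-based parser would."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, version, headers


raw = build_request(
    "GET",
    "/test/index.php",
    {"Host": "example.test", "Connection": "close"},  # placeholder values
)
parsed = parse_request(raw)
```

Note that the parser has to scan for delimiters to find where each field ends, because nothing in the text announces a length in advance.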
HTTP/2 defines the following binary frame types.
DATA
HEADERS
PRIORITY
RST_STREAM
SETTINGS
PUSH_PROMISE
PING
GOAWAY
WINDOW_UPDATE
CONTINUATION
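Every one of these frames starts with the same 9-byte header, which is what lets the receiver know the length up front. The Python sketch below decodes that header following the layout in RFC 7540 (24-bit length, 8-bit type, 8-bit flags, one reserved bit and a 31-bit stream identifier). The type table uses the codes from the RFC, including WINDOW_UPDATE and CONTINUATION, and the sample bytes are invented for illustration.

```python
FRAME_TYPES = {
    0x0: "DATA",
    0x1: "HEADERS",
    0x2: "PRIORITY",
    0x3: "RST_STREAM",
    0x4: "SETTINGS",
    0x5: "PUSH_PROMISE",
    0x6: "PING",
    0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE",
    0x9: "CONTINUATION",
}


def parse_frame_header(header):
    """Decode the fixed 9-byte HTTP/2 frame header."""
    if len(header) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    return {
        "length": int.from_bytes(header[0:3], "big"),
        "type": FRAME_TYPES.get(header[3], "UNKNOWN"),
        "flags": header[4],
        # Mask off the reserved high bit of the stream identifier.
        "stream_id": int.from_bytes(header[5:9], "big") & 0x7FFFFFFF,
    }


# Hypothetical header: 13-byte HEADERS frame on stream 1,
# with END_STREAM (0x1) and END_HEADERS (0x4) set.
info = parse_frame_header(b"\x00\x00\x0d\x01\x05\x00\x00\x00\x01")
```

After reading these nine bytes, the receiver knows exactly how many payload bytes follow, with no delimiter scanning required.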
Header Compression
Header compression is another new feature, used for bandwidth optimization. It takes three requests to load the web page completely, as in the figure below. The requests are for index.php, style.css and script.js respectively. Every request has some header fields in common. All repeated header fields are marked in blue, and they make up a large proportion of all header fields. Imagine there were a way not to resend the duplicate header fields from the second request onward; would that not be great? For this purpose, HTTP/2 uses the HPACK algorithm to compress headers, which uses bandwidth efficiently and decreases latency to some extent.
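The core idea can be shown with a deliberately simplified Python model. It is not the real HPACK wire format (there is no static table, Huffman coding or table eviction here), and the class name, header values and byte-cost estimate are all assumptions made for illustration. It only shows how a table shared by both sides lets repeated header fields be replaced by small index references.

```python
class ToyHeaderCompressor:
    """Illustrative only: not the real HPACK wire format."""

    def __init__(self):
        self.table = []  # header fields already sent on this connection

    def encode(self, headers):
        out = []
        for field in headers:
            if field in self.table:
                # Already known to the peer: send just a small index.
                out.append(("index", self.table.index(field)))
            else:
                # First occurrence: send it in full and remember it.
                out.append(("literal", field))
                self.table.append(field)
        return out


def cost(encoded):
    """Rough byte cost: 1 per index, name plus value length per literal."""
    total = 0
    for kind, value in encoded:
        total += 1 if kind == "index" else len(value[0]) + len(value[1])
    return total


# Hypothetical header fields shared by every request for the page.
common = [
    ("user-agent", "demo-browser/1.0"),
    ("accept-encoding", "gzip, deflate"),
    (":authority", "example.test"),
]

compressor = ToyHeaderCompressor()
first = compressor.encode([(":path", "/test/index.php")] + common)
second = compressor.encode([(":path", "/test/style.css")] + common)
```

In this model the second request only sends its new `:path` value in full, while the three repeated fields shrink to index references, so `cost(second)` is much smaller than `cost(first)`.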
Stream Prioritization
With its multiplexing capability, HTTP/2 can send multiple requests through multiple streams. Our basic web page contains multiple resources: index.php, style.css and script.js. After loading the page, the browser needs to create two different streams for the CSS and script files, which have different priorities if the browser is to show the page properly. HTTP/2 stream prioritization lets the client tell the server which request is more important. Once informed, the server favors the specified stream when allocating bandwidth.
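As a sketch of how this information travels, the Python code below decodes the 5-byte PRIORITY frame payload defined in RFC 7540 (an exclusive bit, a 31-bit stream dependency and an 8-bit weight) and then splits bandwidth between sibling streams in proportion to their weights. The proportional split is a simplified reading of the RFC's recommendation, and the stream numbers and weights are made up for the example.

```python
import struct


def parse_priority(payload):
    """Decode the 5-byte PRIORITY frame payload (RFC 7540, section 6.3)."""
    if len(payload) != 5:
        raise ValueError("a PRIORITY payload is exactly 5 bytes")
    raw, weight = struct.unpack("!IB", payload)
    return {
        "exclusive": bool(raw >> 31),
        "depends_on": raw & 0x7FFFFFFF,
        "weight": weight + 1,  # wire value 0..255 maps to weight 1..256
    }


def bandwidth_shares(weights):
    """Split capacity between sibling streams in proportion to weight."""
    total = sum(weights.values())
    return {stream: weight / total for stream, weight in weights.items()}


# Hypothetical weights: the CSS stream (3) matters more than the script (5).
shares = bandwidth_shares({3: 192, 5: 64})
```

With these example weights, the stylesheet stream would receive three quarters of the available bandwidth and the script stream the remaining quarter.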