Tuesday, June 26, 2007

HTTP Client Handshake Characterization

Continuing in the "what's the latency on my DSL connection" theme (see outbound DNS and incoming SMTP posts), we finally get to looking at outbound HTTP connection latency.

I expected this sample to be the best of the lot for two reasons. First, these servers are self-selected by members of my household and therefore have some kind of inherent locality to me. Second, web hosting implies a certain amount of infrastructure and expenditure that the other samples would not necessarily exhibit.

Let's face it - there is more than one Internet delivery system, and if you pay more you get a higher class of service (whether that be uncongested links, content distribution, etc.). Web hosting correlates with folks paying that tariff in a way that SMTP clients do not.

My expectations held up. The numbers are actually even better than I expected.

The sample:

  • 13,282 handshakes
  • 708 unique servers
  • 750 MB of HTTP data
  • 12.5 days
Here is the data, ranging from a best of 25ms, through an impressive median of 48ms, to a worst case of 1.5 minutes. TCP's exponential backoff, which is based on hardcoded multi-second timers, very obviously kicks in around the 99th percentile, resulting in some extreme outliers.

percentile latency (ms)
best 25
10 35
20 38
30 40
40 43
50 48
60 52
70 61
80 101
90 116
worst 93,112 (1.5 minutes)

The mean is 143ms (thanks to some really big outliers caused by the exponential backoff of hardcoded 3 second timers), so here the median is much more representative. A full 79 percent of handshakes completed with an RTT of 100ms or less. More impressively, 70 percent took 61ms or less, which is certainly fast enough for most applications.
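
As a rough check on the backoff explanation, the sketch below adds up when each SYN would be sent, assuming a 3 second initial timer that doubles after every unanswered attempt. The function name and the choice of five retries are assumptions for illustration, not something read out of the trace.

```python
def syn_send_times(initial_timeout_s=3.0, retries=5):
    """Return the times (in seconds) at which each SYN is sent.

    Assumes the timer starts at initial_timeout_s and doubles after
    every unanswered SYN. Both defaults are illustrative assumptions.
    """
    times = [0.0]              # the original SYN goes out at t=0
    timeout = initial_timeout_s
    for _ in range(retries):
        times.append(times[-1] + timeout)
        timeout *= 2           # exponential backoff
    return times


if __name__ == "__main__":
    for attempt, t in enumerate(syn_send_times()):
        print(f"SYN #{attempt + 1} sent at {t:.0f}s")
```

Under those assumptions the sixth SYN leaves at 93 seconds, which would be consistent with the 93,112ms worst case if only that last attempt was answered.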

These positive results, combined with the slower DNS and SMTP client numbers, show us that the client/server model of the web is provisioned much more effectively than any given link in a real peer to peer setup. This certainly shouldn't be a surprise, but it does give the lie to any diagram of the 'net that uses unweighted edges.

This is my last post on latency from my little spot on the grid. I promise.

On a related note, Mark Nottingham has a great post dealing with support for various aspects of HTTP in network intermediaries. I like characterization studies so much because they provide real data about what to optimize for; Mark's post provides real data about what to worry about in implementations.

Wednesday, June 20, 2007

More Characterization - DNS Latency

A little while back I posted about the incoming TCP handshake latencies on my boutique broadband mail server. In short, they were awful - 184ms median and 77% of all handshakes took more than 100ms. A few hypotheses were drawn:
  • A lot of my mail is spam. Much of the spam comes from botnet owned hosts on consumer internet connections. Because of this I am not so much measuring latency from my edge to core Internet services, but instead to other edges. If this is true, it actually has some interesting peer to peer insights.
  • Because we are likely dealing with "owned" botnets sending the spam, those hosts are distributed more uniformly across the world than the services I actually choose to use day to day which exhibit greater locality to my part of the world. Therefore I am getting real data, but maybe not data that is especially insightful to my day to day network usage.
  • My results may not be reflective of general edge connectivity; I might just have lousy service.
  • The handshake latency should be dominated by the network, but the application at the other end plays a role too. Owned spam generators may not be running up to the commercial grade mail client standards generally expected of Internet infrastructure.
I posed the open question of whether the results would be the same for other protocols. DNS and HTTP were of particular interest. This post is about DNS client performance.

I have now built a packet trace that covers 6 days and about 130,000 DNS request/response pairs. I was astonished there were so many in less than a week from a home LAN. The trace was taken upstream from my home LAN caching recursive resolver, so the redundancy was removed from the data wherever TTL based caching could do so. The 130,000 transactions were spread across 8854 different servers, which is also an astonishing amount of diversity. It comes down to just 14 or so transactions per server on average; I would have expected a lot more server reuse.

The results are indeed better than the SMTP handshakes. We are now dealing with infrastructure class servers and it shows. But frankly, the latency numbers are still surprisingly high: 41% of all lookups take longer than a very noticeable 100ms. Remember, also, that starting many webpages requires at least two uncached lookups (one from the root name servers and one from the zone's name server itself), and that can add up to a really long lag.

Here are the numbers, ranging from a best of 24ms to a worst of 17 minutes. That latter number is certainly an outlier: 99% of all transactions were complete in 401ms or less.

percentile latency (ms)
best 24
10 41
20 43
30 55
40 71
50 77
60 101
70 114
80 128
90 180
99 401
worst 1,039,356 (17 minutes!)

Any lookup that did not complete was not included in the dataset. That 17 minute one is fascinating. It is a genuine reply to the original lookup request for kvack.org, not a reply to a client generated retransmission or anything like that. (The lookup was probably generated by the spam filtering software, which had received a legitimate message from someone @kvack.org and was checking whether the name resolved as part of its spam scoring.) That request must have been buried in quite a queue somewhere! It is hard to imagine that the DNS client hadn't timed out the transaction by the time the response arrived, but the packet trace does not give any insight into that. These very long transactions are exceptionally rare: only 46 of the 130,000 transactions took more than 3 seconds.

  • As you can see, the median is 77ms
  • The mean is 112ms (104ms with the 17 minute outlier removed)
  • 41 percent of all transactions took over 100ms
For the curious, I generated the latency numbers using this little ad-hoc piece of C, linked off this page of wonder. The stats were just done with command line awk scripts.
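
The same kind of summary can be sketched in a few lines of Python. This is not the awk used for the post; the nearest-rank percentile rule and the function names are assumptions, and the sample latencies are made up purely to show how one large outlier drags the mean but not the median.

```python
def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (pct is an int, 1..100)."""
    if not sorted_values:
        raise ValueError("need at least one sample")
    # Integer ceiling of pct * n / 100, which avoids float rounding surprises.
    rank = (pct * len(sorted_values) + 99) // 100
    return sorted_values[max(rank, 1) - 1]


def summarize(latencies_ms):
    """Return best, median, mean and worst for a list of latencies."""
    values = sorted(latencies_ms)
    return {
        "best": values[0],
        "median": percentile(values, 50),
        "mean": sum(values) / len(values),
        "worst": values[-1],
    }


if __name__ == "__main__":
    # Made-up sample: nine ordinary lookups and one huge outlier.
    sample = [24, 41, 43, 55, 71, 77, 101, 114, 128, 10000]
    print(summarize(sample))
```

With real data the input would be one latency per request/response pair, read from whatever the trace processing step writes out.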

What does it Mean?

I can draw some weak conclusions:
  • DNS latency is much better than TCP handshake latency on my mail server: the share of round trips over 100ms is almost halved (41% versus 77%), and the median drops from 184ms to 77ms. It seems likely that this is because DNS is dealing with infrastructure class servers (both in terms of location and function), whereas much of the email traffic was probably botnet generated spam out at the edges of the network - just like my host. So much for net neutrality, eh? There are already multiple tiers of service in full effect!
  • Latency still sucks. 100ms round trips are deadly and common.
It will be interesting to see how the HTTP client handshake latency numbers compare. The DNS numbers are skewed somewhat away from the common usage patterns of edge users because they include lookups of email domains from spam, mailing list contributors, etc. The HTTP numbers ought to be more pure in that respect.

Thursday, June 14, 2007

Disk Drive Failure Characterization

I've admitted previously that I have a passion for characterization. When you really understand something you can be sure you are targeting the right problem, and the only way to do that with any certainty is with data. Sometimes you've got to guess and make educated inferences, but way too many people guess when they should be measuring instead.

Val Henson highlights on lwn.net a couple of great hard drive failure rate characterization studies presented at the USENIX Conference on File and Storage Technologies (FAST). They cast doubt on a couple of pieces of conventional wisdom: hard drive infant mortality rates, and the effect of ambient temperature on drive lifetime. This isn't gospel, since every characterization study is about a particular frame of reference, but it is still very interesting. Val Henson, as usual, does a fabulous job interpreting and showing us the most interesting stuff going on in the storage and file systems world.

Tuesday, June 5, 2007

Fairness is good - TCP fairness misses the point

For the longest time, any proposed enhancement to TCP congestion control was measured against 2 criteria:
  • Is it more efficient in some sense than what we've got?
  • Is it TCP friendly? (Loosely defined as not harming the share of bandwidth legacy TCP flows would get if the new algorithm were deployed in a mixed legacy environment, like the Internet for example.)
The first criterion makes sense. The second just ties our hands.

The problem with TCP friendliness as a requirement is that it supposes TCP flows represent a 1:1 proxy for entitlement. That of course isn't true. A single person could be using multiple TCP flows at any moment, along with some UDP traffic and some other non TCP/IP traffic. The multiple flows do not coordinate, and these latter types of flows are often not congestion controlled at all - much less TCP friendly! It is the aggregate of all of this traffic that really drives the per user cost.

More and more applications are sensibly opening parallel TCP flows to feed the application. They are sometimes accused of being greedy - doing this to grab an unfair share of bandwidth. But that is only one reason an application might do so. A TCP flow overloads a number of different properties into a single connection. The connection provides reliable delivery, congestion control, and in-order delivery. In-order delivery is handy for many many things, but it can also cause head of line blocking problems - some application architectures create multiple flows so as to separate classes of messages and data by priority. This is a sensible thing to do, but the multiple flows don't coordinate their congestion control properties.

Applications get accused of greed in situations like this, regardless of their motivations. But in truth, it is often a net loss in performance for the application. From a bandwidth point of view, a single hot TCP stream with a fully open congestion window is much preferable to opening a new one. The new one needs to go through a high latency 3 way handshake and then through a slow start period to ramp up its congestion window, which is much less efficient than using a fully established flow from the beginning.
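
To put rough numbers on that, the sketch below counts round trips for a cold connection versus a warm one. It assumes an idealized slow start that doubles the congestion window every round trip with no loss, an initial window of 3 segments for the new flow, and one round trip for the handshake. All names and parameters are hypothetical, and applying the same doubling to the warm flow is a simplification.

```python
def round_trips_to_send(segments, cwnd, handshake=False):
    """Count round trips needed to deliver `segments` segments.

    Idealized model: each round trip sends up to `cwnd` segments and
    then doubles cwnd (slow start), with no loss and no receiver
    window limit. `handshake` adds one round trip for the 3 way
    handshake of a brand new connection.
    """
    if segments < 0 or cwnd < 1:
        raise ValueError("segments must be >= 0 and cwnd >= 1")
    trips = 1 if handshake else 0
    remaining = segments
    while remaining > 0:
        remaining -= cwnd      # one window's worth goes out this round trip
        cwnd *= 2              # window doubles for the next round trip
        trips += 1
    return trips


if __name__ == "__main__":
    cold = round_trips_to_send(100, cwnd=3, handshake=True)
    warm = round_trips_to_send(100, cwnd=64)
    print(f"new flow: {cold} round trips, established flow: {warm}")
```

At 100ms per round trip, every extra trip on the cold flow is time the established flow would already have spent delivering data.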

And that's just the applications that open extra flows for latency reasons. If the application is harmed by in-order or even reliable delivery guarantees (e.g. streaming multimedia), it will likely use something like UDP, which is not congestion controlled at all, or DCCP, which has "TCP friendly" congestion control that is nonetheless independent of any other existing traffic. Again - the realm of congestion control is just the single flow.

Don't get me wrong. I know lots of applications (p2p especially) open multiple flows in order to hog bandwidth. The math is easy: if there are two users, would you rather have 1 of 2 shares or (by virtue of opening 3 extra flows) 4 of 5? There are still two users, but now the bandwidth is distributed 80/20 instead of 50/50.
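
That arithmetic can be written down as a tiny sketch. It assumes the bottleneck is split equally per flow, which is the idealized outcome of flow rate fairness; the user names and the function are hypothetical.

```python
from fractions import Fraction


def bandwidth_shares(flows_per_user):
    """Split a bottleneck equally per flow and total it per user.

    flows_per_user maps a (hypothetical) user name to the number of
    flows that user has open. Assumes every flow gets an equal share.
    """
    total = sum(flows_per_user.values())
    if total == 0:
        raise ValueError("need at least one flow")
    return {user: Fraction(n, total) for user, n in flows_per_user.items()}


if __name__ == "__main__":
    print(bandwidth_shares({"me": 1, "you": 1}))   # one flow each
    print(bandwidth_shares({"me": 4, "you": 1}))   # I open 3 extra flows
```

The second call gives "me" 4/5 of the link and "you" 1/5, even though nothing about the two users has changed except the flow count.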

So the flow is clearly not the right spot to be thinking about fairness - which makes TCP friendliness kind of goofy. What is the right granularity - the application? the user? the computer? the LAN? the organization?

Is the right definition of fair "one person one packet" or is it somehow "pay per class of service"? Clearly this kind of thing cannot be policed by the end hosts, but can they pay more attention to the distributed policing instead of relying on implicit feedback such as inflated RTTs or forced drops? (ECN plays a role here.)

A sensible first step is to unravel some of the TCP overlaps. A number of years ago thoughts on shared congestion managers were popular - keeping multiple flows in one window. This way applications could take advantage of multiple flows for creating independent data streams without the rate implications such strategies currently exhibit. The idea should be expanded to cover multiple hosts and protocols (yes, I understand that isn't an easy proposition) so that in the end the fairness granularity can be defined by local policy.
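
As a toy illustration of the "multiple flows in one window" idea, the sketch below divides a single shared congestion window among flows according to weights set by local policy. It is not any real congestion manager's API; the function, the flow names and the integer weights are all assumptions.

```python
def split_window(cwnd_segments, weights):
    """Divide one shared congestion window among flows by weight.

    weights maps a (hypothetical) flow name to a non-negative integer
    weight chosen by local policy. Segments left over from rounding
    down go to the heaviest flows first.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    shares = {flow: cwnd_segments * w // total for flow, w in weights.items()}
    leftover = cwnd_segments - sum(shares.values())
    for flow in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        shares[flow] += 1
    return shares


if __name__ == "__main__":
    # Opening a second flow changes how the window is divided,
    # not how big the aggregate window is.
    print(split_window(40, {"bulk": 1}))
    print(split_window(40, {"bulk": 3, "control": 1}))
```

The point of the design is that the per-flow numbers always add up to the one shared window, so extra flows buy independent streams but no extra rate.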

I used to think about this kind of thing regularly, but now, as Veronica Mars would say, I haven't thought of you lately (c'mon now sugar!). But Bob Briscoe gets it absolutely right with a paper in the April 2007 ACM SIGCOMM CCR - Flow Rate Fairness: Dismantling a Religion.