<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-8147202175434463396</id><updated>2012-01-26T20:12:36.740-05:00</updated><category term='linux'/><category term='weather'/><category term='pipelines'/><category term='virtualization'/><category term='block'/><category term='smtp'/><category term='spdy'/><category term='javascript'/><category term='htp'/><category term='appliances'/><category term='lwn'/><category term='voip'/><category term='os x'/><category term='disk'/><category term='algorithms'/><category term='http'/><category term='latency'/><category term='vonage'/><category term='voipreocrder'/><category term='ip'/><category term='characterization'/><category term='firefox'/><category term='tcp'/><category term='caller-id'/><category term='dns'/><category term='amazon'/><category term='rss'/><category term='internet'/><category term='performance'/><category term='congestion control'/><category term='hardware'/><category term='google'/><category term='recommendations'/><category term='startups'/><title type='text'>Bits Up!</title><subtitle type='html'>Real data and musings on the performance of networks, servers, protocols, and their related folks.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://bitsup.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Patrick 
McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>65</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-4406255455217948015</id><published>2012-01-23T23:29:00.000-05:00</published><updated>2012-01-23T23:29:26.264-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='spdy'/><title type='text'>HTTP-WG Proposal to tackle HTTP/2.0</title><content type='html'>Huzzah to &lt;a href="https://twitter.com/#%21/mnot"&gt;Mark Nottingham&lt;/a&gt;, chair of the IETF HTTP Working Group. He &lt;a href="http://lists.w3.org/Archives/Public/ietf-http-wg/2012JanMar/0098.html"&gt;proposes&lt;/a&gt; rechartering the group to "specify (sic) HTTP/2.0 an improved binding of HTTP's semantics to the underlying transport."&lt;P&gt;That's welcome news - the scalability, efficiency, and robustness issues with HTTP/1 are severe problems that deserve attention in an open standards body forum. The HTTP-WG is the right place.&lt;P&gt;SPDY will certainly be offered as an input to that process and in my opinion it touches on the right answers. 
But whatever the outcome it is great to see momentum around open standardization of solutions to the transport problems HTTP/1 suffers from.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-4406255455217948015?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4406255455217948015'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4406255455217948015'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2012/01/http-wg-proposal-to-tackle-http20.html' title='HTTP-WG Proposal to tackle HTTP/2.0'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-759917441236466452</id><published>2012-01-07T21:01:00.000-05:00</published><updated>2012-01-07T21:01:10.005-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><category scheme='http://www.blogger.com/atom/ns#' term='spdy'/><title type='text'>Using SPDY for more responsive interfaces</title><content type='html'>RST_STREAM turns out to be a feature of spdy that I under appreciated for a long time. 
The concept is simple enough - either end of the connection can cancel an individual stream without impacting the other streams that are multiplexed on the same connection.&lt;br /&gt;&lt;br /&gt;This fills a conceptual gap left by HTTP/1.x: in HTTP/1.x, when you want to cancel a transaction, about all you can do is close the connection. &lt;br /&gt;&lt;br /&gt;Consider the case of quickly clicking through a webmail mailbox - just scanning the contents and rapidly clicking 'next'. Typically the pages will be only partly loaded before you move on to the next one. Assuming you have used up all your parallelism in HTTP/1, the new click will either have to wait for the old transactions to complete (wasting time and bandwidth) or cancel the old ones by closing those connections and then open fresh connections for the new requests. New connections add 2 or 3 round trip times to reopen the SSL connection (you are reading email over SSL, right?) before they can be used to send the new requests. Either way - that is not a good experience.&lt;br /&gt;&lt;br /&gt;An interactive map application has similar attributes - as you scan along the map, zooming in and out, lots of tiles are loaded and are often irrelevant before they are drawn. I'm sure you can think of other scenarios that have cancellations.&lt;br /&gt;&lt;br /&gt;SPDY solves this simply - with its inherently much greater parallelism the new requests can be sent immediately and at the same time cancel notifications can go out for the old ones. 
That saves the bandwidth and gets the new requests going as fast as possible without interfering with either the established connection or any other transactions also in progress.&lt;br /&gt; &lt;br /&gt;A page load time metric won't really show this to you but the increased responsiveness is very obvious when working with these kinds of use cases - especially under high latency conditions that make connection establishment slower.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-759917441236466452?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/759917441236466452'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/759917441236466452'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2012/01/using-spdy-for-more-responsive.html' title='Using SPDY for more responsive interfaces'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-3142873309625882029</id><published>2012-01-01T17:11:00.001-05:00</published><updated>2012-01-01T17:14:07.012-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='spdy'/><category scheme='http://www.blogger.com/atom/ns#' term='htp'/><title type='text'>A use case for SPDY header compression</title><content type='html'>A use case for SPDY header compression: 
http://pix04.revsci.net/F09828/a4/0/0/0.js&lt;br /&gt;&lt;br /&gt;380 bytes of gzipped javascript (550 uncompressed), sent with 8.8KB of request cookies and 5.5KB of response cookies. That overhead is bad enough to mess with TCP CWND defaults - which means you are taking multiple round trips on the network just to move half a KB of js. For HTTP, that's a performance killer! Those cookies are repeated almost identically on every transaction with that host.&lt;br /&gt;&lt;br /&gt;SPDY's dedicated header contexts and the repetitive nature of cookies means those cookies can be reduced ~98% for all but the first transaction of the session. Essentially the cookies remain stateless for app developers, along with the nice properties of that, but the transport leverages the connection state to move them much more efficiently.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-3142873309625882029?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3142873309625882029'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3142873309625882029'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2012/01/use-case-for-spdy-header-compression.html' title='A use case for SPDY header compression'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2913235221308481656</id><published>2011-12-08T11:02:00.001-05:00</published><updated>2011-12-08T11:18:18.074-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='voip'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><category scheme='http://www.blogger.com/atom/ns#' term='spdy'/><title type='text'>SPDY, Bufferbloat, HTTP, and Real-Time Networking</title><content type='html'>Long router queue sizes on the web continue to be a hot networking topic - &lt;a href="https://plus.google.com/110299325941327120246/posts"&gt;Jim Gettys&lt;/a&gt; has a long interview in &lt;a href="http://queue.acm.org/detail.cfm?id=2076798"&gt;ACM queue&lt;/a&gt;. Large unmanaged queues destroy low latency applications - just ask &lt;a href="https://plus.google.com/106186763111547737548/posts"&gt;Randell Jessup&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A paper like&amp;nbsp;&lt;a href="http://conferences.sigcomm.org/sigcomm/2011/papers/sigcomm/p134.pdf"&gt;this&lt;/a&gt; does a good job of showing just how bad the situation can be - experimentally driving router buffering delay from 10ms to ~1000ms on many common broadband cable and DSL modems. 
I wish the paper had been able to show me the range and frequency of that queue delay under normal conditions.&lt;br /&gt;&lt;br /&gt;I'm concerned&amp;nbsp; that decreasing the router buffer size, thereby increasing the drop rate, is detrimental to the current HTTP/1.x web. A classic HTTP/1.x flow is pretty short - giving it a signal to back off doesn't save you much - it has sent much of what it needs to send already anyhow. Unless you drop almost all of that flow from your buffers you haven't achieved much. Further, a loss event has a high chance of damaging the flow more seriously than you intended - a dropped SYN or a dropped last packet of the data train is recovered only by very slow retry timers, and short flows are made up of high percentages of these kinds of packets. Non-drop-based loss notifications like ConEx/ECN do less damage but are again ineffective because the short flow is more or less complete when the notification arrives so it cannot adapt the sending rate.&lt;br /&gt;&lt;br /&gt;The problem is all of those other parallel HTTP sessions going on that didn't get the message. It's the aggregate that causes the buffer build up. Many sites commonly use 60-90 separate uncoordinated TCP flows just to load one page.&lt;br /&gt;&lt;br /&gt;Making web transport more adaptable on this front is a big goal of my spdy work. When spdy consolidates resources onto the same tcp flow it means the remaining larger flows will be much more tcp friendly. Loss indicators will have a fighting chance of hitting the flow that can still back off, and we won't have windows growing independently of each other. (Do you like the sound of IW=10 times 90? That's what 90 uncorrelated flows mean. IW=10 of a small number of flows otoh is excellent.). 
That ought to keep router queue sizes down and give things like rtcweb a fighting chance.&lt;br /&gt;&lt;br /&gt;It also opens up the possibility of the browser identifying queue growth through delay-based analysis and possibly helping the situation out by managing inside the browser our bulk tcp download rate (and definitely the upload rate) by munging the rwin or something like that. If it goes right it really shouldn't hurt throughput while giving better latency to other applications. It's all very pie in the sky and down the road, but it's kind of hard to imagine in the current HTTP/1.x world.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2913235221308481656?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2913235221308481656'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2913235221308481656'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2011/12/spdy-bufferbloat-http-and-real-time.html' title='SPDY, Bufferbloat, HTTP, and Real-Time Networking'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-7990023160056017033</id><published>2011-11-11T18:38:00.001-05:00</published><updated>2011-11-11T19:06:15.124-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='spdy'/><title 
type='text'>Video of SPDY Talk at Codebits.eu</title><content type='html'>Yesterday, I was fortunate enough to be able to address the codebits.eu conference and share my thoughts on why SPDY is an important change for the web. They have made the video of my talk available on-line. (I guess that saves me the air-mozilla brownbag - just skip the 3 minute community-involvement video near the beginning assuming you've seen it already)&lt;br /&gt;&lt;br /&gt;codebits is full of vitality. Portugal is lucky to have it.&lt;br /&gt;&lt;br /&gt;&lt;embed allowfullscreen="true" height="350" src="http://rd3.videos.sapo.pt/play?file=http://rd3.videos.sapo.pt/NqnCRzygYueKjM3Ih7Ob/mov/1" type="application/x-shockwave-flash" width="410"&gt;&lt;/embed&gt; &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-7990023160056017033?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7990023160056017033'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7990023160056017033'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2011/11/video-of-spdy-talk-at-codebitseu.html' title='Video of SPDY Talk at Codebits.eu'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-8360204645057258529</id><published>2011-09-23T00:23:00.000-04:00</published><updated>2011-09-23T00:23:39.409-04:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><category scheme='http://www.blogger.com/atom/ns#' term='google'/><category scheme='http://www.blogger.com/atom/ns#' term='spdy'/><title type='text'>SPDY: What I Like About You.</title><content type='html'>I've been working on &lt;a href="http://www.ducksong.com/mozilla/spdy/"&gt;implementing SPDY &lt;/a&gt;as an experiment in Firefox lately. We'll have to see how it plays out, but so far I really like it.&lt;br /&gt;&lt;br /&gt;Development and benchmarking is still a work in progress, though interop seems to be complete. There are several significant to-do items left that have the potential to improve things even further. The couple of anecdotal benchmarks I have collected are broadly similar to the page load time based reports Google has shared at the IETF and velocity conf over the last few months.&lt;br /&gt;&lt;br /&gt;tl;dr; Faster is all well and good (and I mean that!) but I'm going to make a long argument that &lt;b&gt;SPDY is good for the Internet beyond faster page load times&lt;/b&gt;. Compared to HTTP, it is more scalable, plays nicer with other Internet traffic and brings web security forward.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;&lt;a href="http://www.chromium.org/spdy"&gt;SPDY&lt;/a&gt;: What I Like About You&lt;b&gt; &lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;#1: Infinite Parallelism with &lt;u&gt;Shared Congestion Control&lt;/u&gt;.&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;You probably know that SPDY allows multiplexing of multiple HTTP resources inside one TCP stream. 
Unlike the related HTTP mechanism of pipelining concurrent requests on one TCP stream, the SPDY resources can be returned in any order and even mixed together in small chunks so that head of line blocking is never an issue and you never need more than one connection to each real server. This is great for high latency environments because a resource never needs to be queued on either the client or the server for any reason other than network congestion limits.&lt;br /&gt;&lt;br /&gt;Normal HTTP achieves transaction parallelism through parallel TCP connections. Browsers limit you to 6 parallel connections per host. Servers achieve greater parallelism by sharding their resources across a variety of host names. Often these host names are aliases for the same host, implemented explicitly to bypass the 6 connection limitation. For example, lh3.googleusercontent.com and lh4.googleusercontent.com are actually DNS CNAMEs for the same server. It is not uncommon to see performance oriented sites, like the Google properties, shard things over as many as 6 host names in order to allow 36 parallel HTTP sessions.&lt;br /&gt;&lt;br /&gt;Parallelism is a must-have for performance. I'm looking at a trace right now that uses the aforementioned 36 parallel HTTP sessions and its page load completes in 16.5 seconds. If I restrict it to just 1 connection per host (i.e. 6 overall), the same page takes 27.7 seconds to load. If I restrict that even further to just 1 connection in total it takes a mind-numbing 94 seconds to load. And this is on 40ms RTT broadband - high latency environments such as mobile would suffer much, much worse! 
Keep this in mind when I start saying bad things about parallel connections below: they really do great things, and the web we have with them enables much more impressive applications than a single connection HTTP web ever could.&lt;br /&gt;&lt;br /&gt;Of course using multiple parallel HTTP connections is not perfect - if they were perfect we wouldn't try to limit them to 6 at a time. There are two main problems. The first is that each connection requires a TCP handshake which incurs an extra RTT (or maybe 3 if you are using SSL) before the connection can be used. The TCP handshake is also relatively computationally hard compared to moving data (servers easily move millions of packets per second, while connection termination is generally measured in the tens of thousands), and the SSL handshake is even harder. Reducing the number of connections reduces this burden. But in all honesty this is becoming less of a problem over time - the cost of maintaining persistent connections is going down (which amortizes the handshake cost) and servers are getting pretty good at executing the handshakes (both SSL and vanilla) sometimes by employing the help of multi-tiered architectures for busy deployments.&lt;br /&gt;&lt;br /&gt;The architectural problem lies in HTTP's interaction with TCP congestion control. HTTP flows are generally pretty short (a few packets per transaction), tend to stop and start a lot, and more or less play poorly with the congestion control model. The model works really well for long flows like an FTP download - that TCP stream will automatically adapt to the available bandwidth of the network and transfer at a fairly steady rate for its duration after a little bit of acclimation time. HTTP flows are generally too short to ever acclimate properly.&lt;br /&gt;&lt;br /&gt;A SPDY flow, being the aggregation of all the parallel HTTP connections, looks to be a lot longer, busier, and more consistent than any of the individual parallel HTTP flows would be. 
Simply put - that makes it work better because all of that TCP congestion logic is applied to one flow instead of being repeated independently across all the parallel HTTP mini flows.&lt;br /&gt;&lt;br /&gt;Less simply, when an idle HTTP session begins to send a response it has to guess at how much data should be put onto the wire. It does this without awareness of all the other flows. Let's say it guesses "4 packets" but there are no other active flows. In this case 4 packets is way too few and the network is underutilized and the page loads poorly. But what if 35 other flows are activated at the same time - this means 140 packets get injected into the network at the same time which is way too many. Under that scenario one of two things happens - both of them are bad:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Packet Loss. TCP reacts poorly to packet loss, especially on short flows. While 140 packets in aggregate is a pretty decent flow, remember that total transmission is made up of 35 different congestion control blocks - each one covering a packet flow of only 4 packets. A loss is devastating to performance because most of the TCP recovery strategies don't work well in that environment.&lt;/li&gt;&lt;li&gt; Over Buffering. This is what &lt;a href="http://lwn.net/Articles/458625/"&gt;Jim Gettys calls bufferbloat&lt;/a&gt;. The giant fast moving 140 packet burst arrives at your cable modem head where the bandwidth is stepped down and most of those packets get in a long buffer to wait for their turn on your LAN. That works OK, certainly better than packet loss recovery does in practice, but the deep queue creates a giant problem for any interactive traffic that is sharing that link. Packets for those other applications (such as VOIP, gaming, video chat, etc..) now have to sit in this long queue resulting in interactive lag. 
&lt;b&gt;Lag sucks for the Internet.&lt;/b&gt; The HTTP streams themselves also become non-responsive to cancel events because the only way to clear those queues is to wait them out - so clicking on a link to a new page is significantly delayed while the old page that you have already abandoned continues to consume your bandwidth.&lt;/li&gt;&lt;/ol&gt;This describes a real dilemma - if you guess more aggressive send windows then you will have a better chance of filling the network but you will also have a better chance of packet loss or over buffering. If you guess more conservative windows then loss and buffering happens less often but nothing ever runs very quickly. In the face of all those flows with independent congestion control blocks, there just isn't enough information available. (This of course is related to the famous Initial Window 10 proposal, which I support, but that's another non SPDY story.)&lt;br /&gt;&lt;br /&gt;I'm sure you can see where this is going now. SPDY's parallelism, by virtue of being on a single TCP stream, leverages one busy shared congestion control block instead of dealing with 36 independent tiny ones. Because the stream is much busier it rarely has to guess at how much to send (you only need to guess when you're idle, SPDY is more likely to be getting active feedback), if it should drop a packet it reacts to that loss much better via the various fast recovery mechanisms of TCP, and when it is competing for bandwidth at a choke point it is much more responsive to the signals of other streams - reducing the over buffering problem.&lt;br /&gt;&lt;br /&gt;It is for these reasons that SPDY is really exciting to me. 
Parallel connections work great - they work so great that it is hard to have SPDY significantly improve on the page load time of a highly sharded site unless there is a very high amount of latency present.&amp;nbsp; But the structural advantages of SPDY enable important efforts like &lt;a href="http://www.webrtc.org/"&gt;RTCWeb&lt;/a&gt; as well as provide better network utilization and help servers scale when compared to HTTP. Even if page load times only stay at par, those other &lt;b&gt;good for the Internet attributes&lt;/b&gt; make it worth deploying.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;#2: SPDY is over SSL every time.&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I greatly lament that I am late to the school of SSL-all-the-time. I spent many years trying to eke out the greatest amount of server responses per watt that was possible. I looked at SSL and saw impediments. That stayed with me.&lt;br /&gt;&lt;br /&gt;I was right about the impediments, and I've learned a lot about dealing with them, but what I didn't get was that it is simply worth the cost. As we have all seen lately, SSL isn't perfect - but having a layer of protection against an entire class of eavesdropping attacks is a property that should be able to be relied upon in every protocol as generic as HTTP. HTTP does not provide that guarantee - but SPDY does.&lt;br /&gt;&lt;br /&gt;huzzah.&lt;br /&gt;&lt;br /&gt;As an incentive to make the transition to SSL all the time, this makes it worth deploying by itself.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;#3:Header compression.&lt;/b&gt;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;SPDY compresses all the HTTP-equivalent headers using a specialized dictionary and a compression context that is reserved only for the headers so it does not get diluted with non-header references. 
This specialized scheme performs very well.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;At first I really didn't think that would matter very much - but it is really a significant savings. HTTP's statelessness had its upsides, but the resulting on-the-wire redundancy was really over the top.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;As a sample, I am looking right now at a trace of 1900 resources (that's about 40 pages). 760KB of total downstream plain text header bytes were received as 88KB compressed bytes, and upstream 949KB of plain text headers were compressed as just 65KB. I'll take 1.56MB (91%) in total overhead savings!&amp;nbsp; I even have a todo item that will make this slightly better.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-8360204645057258529?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8360204645057258529'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8360204645057258529'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2011/09/spdy-what-i-like-about-you.html' title='SPDY: What I Like About You.'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-8002035494352432938</id><published>2011-02-15T12:05:00.000-05:00</published><updated>2011-02-15T12:05:31.018-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Note to Self: The Web is Slow</title><content type='html'>I've made a living dealing with fast networks and servers that run at really impressive transaction rates using all manner of nifty interconnects and parallelism. Sometimes I forget that the day to day web isn't all that fast in comparison.&lt;br /&gt;&lt;br /&gt;My local copy of Firefox is annotated to dump a bunch of network stats when shutting down. One of them is a CDF of HTTP handshake times. This is from my desktop, which is connected to a premium cable broadband consumer Internet service. It's not as awesome as FIOS, but it's still at the top portion of what a home consumer will have in the US, which in turn has certain geographic advantages when connecting to many hosting companies. It's fair to say my performance is going to be at least a bit better than the average Internet user. And it is still slow. (We of course need to &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=585196"&gt;work to be able to better characterize&lt;/a&gt; what the real spectrum of experience is.)&lt;br /&gt;&lt;br /&gt;This isn't scientific. 
It is where I happen to browse, and it's just one datapoint, although I can tell you my gut says it is pretty typical output - gathered over 15,000 connections.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-0Y5mVqlqaTk/TVqwuv6LexI/AAAAAAAACXs/p2jCBlF93MA/s1600/graph.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="307" src="http://1.bp.blogspot.com/-0Y5mVqlqaTk/TVqwuv6LexI/AAAAAAAACXs/p2jCBlF93MA/s400/graph.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Only about half of my handshakes are where I want them to be: &amp;lt; 100ms. Most of the rest fall in the next 300ms. To be fair there is a little skew in here because the code doesn't separate HTTPS from HTTP, and SSL has an extra RTT in there. But SSL is a small fraction of the overall sample.&lt;br /&gt;&lt;br /&gt;And this is the desktop. Think mobile and wireless.&lt;br /&gt;&lt;br /&gt;Latency matters.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-8002035494352432938?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8002035494352432938'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8002035494352432938'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2011/02/note-to-self-web-is-slow.html' title='Note to Self: The Web is Slow'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-0Y5mVqlqaTk/TVqwuv6LexI/AAAAAAAACXs/p2jCBlF93MA/s72-c/graph.png' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2304168387342569467</id><published>2011-02-14T13:44:00.000-05:00</published><updated>2011-02-14T13:44:09.655-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>The Apex of Pipelines</title><content type='html'>Every once in a while I'm still surprised at the potential &lt;a href="http://bitsup.blogspot.com/2010/11/performance-of-pipelining-in-http.html"&gt;upside of pipelines&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I stumbled across a great example recently: &lt;a href="http://www.witi.com/"&gt;Women In Technology International&lt;/a&gt;. That home page is set up in a pretty typical newsletter format. It has 159 resources, 145 of which are images along with about a half dozen pieces of JS and CSS. Most of the images are small, with over 2/3 of them loading in less than 20ms of transfer time (time to first byte removed).&lt;br /&gt;&lt;br /&gt;What is striking about this page is how large an advantage pipelining can give even on a well-connected broadband desktop with a 100ms RTT to the witi hosting facility. 
&lt;b&gt;The average latency to receive the first byte of a resource dropped from 1697ms to 626ms, and the average elapsed time per transaction overall dropped from 1719ms to 652ms.&lt;/b&gt; Aggregate that over 159 different resources and you have some serious gains!&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-UdePDAtdFjQ/TVl1V44phNI/AAAAAAAACXk/tu44-g7ADRU/s1600/witi-latency.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="307" src="http://2.bp.blogspot.com/-UdePDAtdFjQ/TVl1V44phNI/AAAAAAAACXk/tu44-g7ADRU/s400/witi-latency.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;But why stop there? The pipeline sweet spot is in high latency situations such as mobile, or trans continental data transfer. This is what happens when we add 200 ms of latency to the connection:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-0NBtY5aCjfc/TVl1vgcFK1I/AAAAAAAACXo/taDZXK7yZDY/s1600/witi-latency-plus200.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="307" src="http://2.bp.blogspot.com/-0NBtY5aCjfc/TVl1vgcFK1I/AAAAAAAACXo/taDZXK7yZDY/s400/witi-latency-plus200.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;That's right - 3300ms of improvement on each transaction! That seems absurdly good if we only added 200ms of latency, but what you're seeing is the aggregate queueing effect - Firefox wants 150 resources more or less simultaneously and can only parallelize it on 6 connections. If you are 25 positions deep on that queue you will have to wait &lt;b&gt;at least&lt;/b&gt; 7500ms just for the back and forth of each transaction in front of you to complete.. 
obviously not everyone is queued that deeply so the average effect is somewhat less, but still overwhelming.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2304168387342569467?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2304168387342569467'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2304168387342569467'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2011/02/apex-of-pipelines.html' title='The Apex of Pipelines'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-UdePDAtdFjQ/TVl1V44phNI/AAAAAAAACXk/tu44-g7ADRU/s72-c/witi-latency.png' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-7977684545114564643</id><published>2011-02-02T15:54:00.000-05:00</published><updated>2011-02-02T15:54:32.038-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><title type='text'>HTTP Parallel Connections (Firefox edition!)</title><content type='html'>&lt;b&gt;Parallelism helps when&lt;/b&gt;&lt;br /&gt;&lt;ul&gt;&lt;ul&gt;&lt;li&gt;It hides network idleness during TCP Handshakes though persistent connections help with this too.&lt;/li&gt;&lt;li&gt;It 
hides network idleness during the first byte phase of transactions, though pipelining can address this too.&lt;/li&gt;&lt;li&gt;It hides network idleness during TCP slow start wait-for-ack periods. This is a big one.&lt;/li&gt;&lt;li&gt;It provides a mechanism to prioritize and avoid head of line blocking problems.&amp;nbsp;&lt;/li&gt;&lt;li&gt;It steals bandwidth from competing "tcp friendly" flows by simply increasing the number of flows in one application. That's an arms race that most people think should be avoided. &lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;br /&gt;&lt;b&gt;&amp;nbsp;Parallelism hurts when&lt;/b&gt;&lt;br /&gt;&lt;ul&gt;&lt;ul&gt;&lt;li&gt;It increases the number of TCP Handshakes which are both slow and CPU intensive (at least compared to regular data packets) to execute - this assumes persistent connections are an alternative.&lt;/li&gt;&lt;li&gt;It increases the overhead of normal data processing because more flows have to be considered, typically via longer hash chains.&lt;/li&gt;&lt;li&gt;It increases the impact of memory overhead and processor cache pollution by increasing the number of simultaneous TCP control blocks that have to be managed on both the client and the server.&lt;/li&gt;&lt;li&gt;The resulting reduced amount of data per flow makes it harder to fully open sender congestion windows.&lt;/li&gt;&lt;li&gt;Packet loss is increased due to the uncorrelated fluctuations of data to be sent between the parallel connections. Two competing flows that are both sending from infinite data sources will quickly adapt to share the bandwidth, but two flows that have a fluctuating demand (e.g. parallel persistent HTTP connections that periodically go idle and alive) will inherently have patterns of underutilizing and overutilizing the path. 
Overutilization results in either packet loss or excess buffering in the network, which leads to poor interactive response times.&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;b&gt;When should we open a new parallel connection?&lt;/b&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;when I don't have an idle connection and I need the answer with minimum latency&lt;/li&gt;&lt;li&gt;when I expect existing connections are experiencing idleness and therefore not using all of the available bandwidth&lt;/li&gt;&lt;/ol&gt;&amp;nbsp;The approach HTTP implementations, including Firefox, take to solving this quandary? They crudely enforce a constant number of connections per host and open them until they hit that limit. Variously across time that limit has commonly been 2, 4, and 6.&amp;nbsp; As server technology evolved, many years ago, to the point where the impact of idleness was a bigger deal than the CPU overhead on the server, we saw servers actually publish their resources under several virtual host names, even though it was all the same server, for the exclusive purpose of circumventing that per-host limit in the client.&lt;br /&gt;&lt;br /&gt;I wonder if we can't do better in Firefox. First, let's deal with the case of a low latency request. Right now all we do with them is to put them at the top of the waiting queue if the request cannot be dispatched immediately (because the limit of 6 has already been reached). But there are really two cases to consider:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;What to do when the network is not already saturated&lt;/li&gt;&lt;li&gt;What to do when the network is saturated&lt;/li&gt;&lt;/ol&gt;&amp;nbsp;In both cases the first step for a truly low latency request is the same - open a new connection assuming there isn't an idle one available. 
However, note that establishing that connection is going to take at least 1 RTT for normal HTTP and 2 RTT for HTTPS - so we should actually watch for any existing HTTP transactions to complete on a different reusable persistent connection in between the time we start opening the new connection and the time the handshake is complete. If that happens the persistent connection should be used instead - that will&lt;i&gt; require a change in the current logic where nsHttpConnection opens the sockets after it has been assigned a transaction. Instead nsHttpConnectionMgr should be opening the sockets as well as receiving the returned persistent connections and then should dispatch to them as they become available. &lt;/i&gt;&lt;br /&gt;&lt;br /&gt;In the case of a saturated network &lt;i&gt;some of the existing parallel connections should be stalled while the low latency request is satisfied in order to provide the most bandwidth for that important transaction&lt;/i&gt;. We can do this by temporarily slamming their recv windows to something close to 1 packet of data which will slow them down to a trickle. This can be done concurrently with the transmission of the prioritized request as it should take 1/2 RTT for the window change to reach the sender.&lt;br /&gt;&lt;br /&gt;But what about the more common case where all transactions are of equal priority - how do we make the decision then about opening a new connection vs queueing a new transaction? Assuming we aren't concerned about head of line blocking issues (which we should be able to wrap up in a definition of priority somehow), then we want to do this &lt;i&gt;only when there is network idleness that can be covered up by parallelism. &lt;/i&gt;This approach is radically different than "open up to N" connections.&lt;br /&gt;&lt;br /&gt;It isn't obvious exactly how to determine that in Necko. But then again, you are looking for data bursts followed by idleness - and it's pretty obvious when you see it graphed out. 
This is the transfer pattern of a single http response I looked at a couple of weeks ago - it could happily overlap with another flow in order to more effectively utilize the whole pipe. (of course, if the server used a larger initial CWND, the problem would be massively reduced.)&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_wPMwI2SStyw/TSx6Lfsz9CI/AAAAAAAACTI/7Bgo6jMrKiU/s1600/vanilla-xfer.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="300" src="http://2.bp.blogspot.com/_wPMwI2SStyw/TSx6Lfsz9CI/AAAAAAAACTI/7Bgo6jMrKiU/s400/vanilla-xfer.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-7977684545114564643?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7977684545114564643'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7977684545114564643'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2011/02/http-parallel-connections-firefox.html' title='HTTP Parallel Connections (Firefox edition!)'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_wPMwI2SStyw/TSx6Lfsz9CI/AAAAAAAACTI/7Bgo6jMrKiU/s72-c/vanilla-xfer.png' height='72' 
width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-8522541479322899108</id><published>2011-02-02T13:46:00.000-05:00</published><updated>2011-02-02T13:46:12.822-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><title type='text'>Separating HTTP Connections from TCP Connections</title><content type='html'>The firefox http connection implementation, and most others I have seen, binds the http connection and the tcp connection together 1 to 1 something like this:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Check pconn pool for idle http connection&lt;/li&gt;&lt;li&gt;if that succeeds, dispatch&lt;/li&gt;&lt;li&gt;if limits allow, open a tcp connection and when that completes dispatch on it&lt;/li&gt;&lt;li&gt;otherwise queue it and goto 1 when an existing transaction completes &lt;/li&gt;&lt;/ol&gt;As part of the refactoring of &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=623948"&gt;bugzilla 623948&lt;/a&gt; I have separated this logic out a little bit down to its tcp roots:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt; Check pconn pool for idle http connection&lt;/li&gt;&lt;li&gt;if that succeeds, dispatch&lt;/li&gt;&lt;li&gt;queue transaction&lt;/li&gt;&lt;li&gt;if limits allow, start a tcp connection which will be added to the pconn pool when it completes&lt;/li&gt;&lt;li&gt;whenever a pconn is available or a transaction completes check queue for dispatch&lt;/li&gt;&lt;/ol&gt;The major difference is that when a transaction in the first scenario decides to open an http connection it &lt;b&gt;always&lt;/b&gt; waits for that connection to complete and then dispatches on it. 
In the second scenario the two actions are taken independently, if another connection frees up before the newly demanded TCP connection is ready we can use that instead (and then cache the connection when it does complete in the pconn pool).&lt;br /&gt;&lt;br /&gt;It turns out this happens, anecdotally, &lt;b&gt;a whole freaking lot.&lt;/b&gt; My test network has a delay of about 250ms between Firefox and each server.&lt;br /&gt;&lt;br /&gt;Loading the NY Times&lt;b&gt; &lt;/b&gt;required 219 TCP connections of which 141 (64%) were able to be served on a different persistent connection that became available between the time we started to open the connection and the time the handshake actually completed.&lt;br /&gt;&lt;br /&gt;Loading the wall of one of my facebook pals saw this behavior on 9 of 27 connections (33%), and the cnn home page performed similarly to NYT - 76/117 (65%).&lt;br /&gt;&lt;br /&gt;This makes some intuitive sense - if you have a piece of HTML that includes an img it is entirely likely that we will start the request for the img before we have finished transferring the HTML, but the HTML will finish transferring before the new handshake (which will take a full RTT) completes. This algorithm just moves the request for the img over to the HTML persistent connection which can proceed as soon as the HTML is done.&lt;br /&gt;&lt;br /&gt;The amount of latency saved is variable - indeed it is probably at least somewhat uniformly random with 0 and RTT as its bounds. I was seeing a mean of around 80ms on each one. 
Of course this doesn't apply to all transactions - just ones that are opening TCP connections to servers that also do persistent connections.&lt;br /&gt;&lt;br /&gt;But it's still cool.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-8522541479322899108?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8522541479322899108'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8522541479322899108'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2011/02/separating-http-connections-from-tcp.html' title='Separating HTTP Connections from TCP Connections'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-4803918604454379424</id><published>2011-01-18T11:17:00.000-05:00</published><updated>2011-01-18T11:17:55.762-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><title type='text'>HTTP PSA: beware unpadded content-md5</title><content type='html'>You don't see a lot of HTTP Content-MD5 response headers, but I just discovered some piece of code that generates unpadded base 64 versions.. i.e. a 22 character value:&lt;br /&gt;&lt;br /&gt;Content-MD5: 6Cxy6QbruJs0hrT/P8exaA&lt;br /&gt;&lt;br /&gt;I figured HTTP followed MIME rules and required a multiple of 4 characters.. 
i.e:&lt;br /&gt;&lt;br /&gt;Content-MD5: 6Cxy6QbruJs0hrT/P8exaA==&lt;br /&gt;&lt;br /&gt;Weirdly, after checking the relevant specs it isn't actually clear to me if the = pad is required. I'm probably missing something obvious. But as this topic generates absolutely 0 google juice, this post is a public service announcement - expect both versions in your clients.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-4803918604454379424?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4803918604454379424'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4803918604454379424'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2011/01/http-psa-beware-unpadded-content-md5.html' title='HTTP PSA: beware unpadded content-md5'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-5618547557751490259</id><published>2011-01-17T16:32:00.000-05:00</published><updated>2011-01-17T16:32:47.889-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='pipelines'/><title type='text'>Firefox Pipeline Patches - Try them out</title><content type='html'>As you may know, I've been on a little odyssey to make pipelining safe,&lt;a href="http://bitsup.blogspot.com/2010/11/performance-of-pipelining-in-http.html"&gt; 
fast&lt;/a&gt;, and aggressive. I've got a &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=603503"&gt;dozen patches for Firefox 4 &lt;/a&gt;that do that.&lt;br /&gt;&lt;br /&gt;This post is your chance to try them out. Download a build for your favorite OS and try them:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.ducksong.com/misc/pipeline-builds/based-4.0-beta9-1/"&gt;http://www.ducksong.com/misc/pipeline-builds/based-4.0-beta9-1/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;You will still need to set network.http.pipelining to true in about:config in order to enable the code. You can also set network.http.pipelining.aggressive to true if you want to disable any of the "gently test the waters" code. I do that when I want to measure the pipelines, but leaving aggressive mode off is probably a good idea if you want to ensure the smoothest experience possible, but I gotta say that the various recovery mechanisms are working well enough (and are used rarely enough) that I am considering making normal mode considerably more like aggressive mode.&lt;br /&gt;&lt;br /&gt;Consider this a beta level pre-test. 
I want your feedback and any bug reports.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-5618547557751490259?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5618547557751490259'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5618547557751490259'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2011/01/firefox-pipeline-patches-try-them-out.html' title='Firefox Pipeline Patches - Try them out'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-7747401861971575111</id><published>2011-01-11T11:06:00.001-05:00</published><updated>2011-01-11T11:40:47.477-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Firefox Idle Connection Selection via CWND</title><content type='html'>When choosing between more than 1 idle persistent connection FF &amp;lt;= 4.0 goes with a FIFO approach. 
I was thinking about ways to tune this.&lt;br /&gt;&lt;br /&gt;I was going to experiment with making that a LIFO. A LIFO after all should have better cache behavior and would allow the cache size to shrink to a more natural size as the older members remain untouched and time out. The FIFO will basically keep the size pinned at its maximum while it cycles through all the connections, which wastes system RAM with both the client and server maintaining extra idle TCP sessions. It is also the worst possible processor cache behavior. The possible argument in favor of FIFO is actually that connections are expensive to open and we've already opened these so maybe we want to keep it pinned to its maximum size just in case we need it again - it isn't obvious what to do or if it matters much.&lt;br /&gt;&lt;br /&gt;Thinking a little further I realized that the major differentiator between these sockets is not a timestamp at all - it is the CWND associated with them on the server. Many web servers at least partially ignore the TCP suggestion to return to slow start after an RTO of idle activity so it is reasonable to assume that some of the connections have larger CWNDs than others as long as we aren't in an environment where the previous transfers have been actually bottlenecked by bandwidth - and on the web that almost never happens; RTT and document size bottleneck most transfers. Even if available bandwidth is the real bottleneck, that just moots our metric; it doesn't provide any information that steers us the wrong way.&lt;br /&gt;&lt;br /&gt;When choosing which connection to use we want to choose the one that has the largest CWND on the server. 
Unfortunately we cannot see directly into the congestion information on the server, but assuming that previous transfers have been bottlenecked by RTT (approximately a constant for all connections to the same server) and transfer size then we can just use the one with the history of moving the largest amount of data in one burst as a proxy for it because that burst is what opens the congestion window.&lt;br /&gt;&lt;br /&gt;I say one burst instead of one document because we want to include pipelines of multiple HTTP transactions as long as there wasn't an RTT between them. This is another reason to want pipelines - they will more effectively open up the CWND for us. We also want to use the maximum burst seen on the connection, not necessarily the last burst - the servers we are interested in don't shrink the CWND just because it isn't all being used for each burst.&lt;br /&gt;&lt;br /&gt;The implementation is easy - the idleconnection list is changed from being sorted as a FIFO to being sorted according to this "maxburst" metric. The&lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=624739"&gt; code and bugzilla are here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Using an experiment designed to show the best case, the results are better than I expected for such a minor tweak. This was my process:&lt;br /&gt;&lt;ul&gt;&lt;li&gt; construct a base page mixed with several small and several large images plus a link to a 25KB image. There are 6 objects on the &lt;a href="http://www.ducksong.com/sortbycwnd/index.html"&gt;base page&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;load the base page - FF4 will use six parallel connections to do it&lt;/li&gt;&lt;li&gt;click on the link to the 25KB image - this will use an idle persistent connection. 
Measure the performance of this load.&lt;/li&gt;&lt;/ul&gt;There is 250ms of RTT and several megabits of bandwidth between the client and server in my test.&lt;br /&gt;&lt;br /&gt;As expected, vanilla FF 4.0 (beta 9)&amp;nbsp; loads the target image on an idle persistent connection that was used to transfer one of the smallest images on the base page. It is FIFO after all, and the small image load was going to finish first and be put into the persistent connection pool first.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_wPMwI2SStyw/TSx6Lfsz9CI/AAAAAAAACTI/7Bgo6jMrKiU/s1600/vanilla-xfer.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="300" src="http://2.bp.blogspot.com/_wPMwI2SStyw/TSx6Lfsz9CI/AAAAAAAACTI/7Bgo6jMrKiU/s400/vanilla-xfer.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;My patched browser, using the sort by CWND algorithm, loads the  target image on the idle persistent connection that was used to transfer  the largest image (2MB+) from the base page.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Their history is the only difference between the two connections -  at the time of requesting the 25KB image they are both connected and  idle. There is a 250ms RTT between my host and the server.&lt;br /&gt;&amp;nbsp; &lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_wPMwI2SStyw/TSx6XeouiFI/AAAAAAAACTM/XLUeEENLvbo/s1600/sortbycwnd.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="300" src="http://2.bp.blogspot.com/_wPMwI2SStyw/TSx6XeouiFI/AAAAAAAACTM/XLUeEENLvbo/s400/sortbycwnd.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The difference is huge. The default algorithm takes 3 round trips to transfer the document as it works its way through growing the congestion window. That adds up to 793ms. 
The sort-by-cwnd algorithm inherits a congestion window large enough for the task at hand and moves it all in one stream - just 363ms.&lt;br /&gt;&lt;br /&gt;This is a nice tweak, but it has its limitations - by definition you cannot meaningfully multiply the gain across a large number of transactions. If you have a large number of transactions then you probably are using all your idle connections in a burst and there is no point in discriminating between them if you are just going to use them all.&lt;br /&gt;&lt;br /&gt;I would argue if you have that much pressure on the connection pool then you are probably queueing requests and should be using pipelining. If you don't have that much pressure, then this probably helps you.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-7747401861971575111?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7747401861971575111'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7747401861971575111'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2011/01/firefox-idle-connection-selection-via.html' title='Firefox Idle Connection Selection via CWND'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_wPMwI2SStyw/TSx6Lfsz9CI/AAAAAAAACTI/7Bgo6jMrKiU/s72-c/vanilla-xfer.png' height='72' 
width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2897591853525643592</id><published>2010-12-16T16:29:00.000-05:00</published><updated>2010-12-16T16:29:03.783-05:00</updated><title type='text'>Accelerated Connection Retry for HTTP and Firefox</title><content type='html'>Not all packet loss is created equal. In particular, losing a SYN can really ruin your day - or at least the next 3 seconds, which can feel like all day. Most operating systems wait 3 seconds before retrying the SYN. Most other timeouts are dynamically scaled to the network conditions, but not the SYN. It is generally hardcoded. And on most of today's networks 3 seconds is an eternity.&lt;br /&gt;&lt;br /&gt;So, in FF we took a page from Chrome's book and said if Firefox has been waiting for 250ms (configurable via network.http.connection-retry-timeout) then start a second connection in parallel with that first one. Assuming you've got 0.25% packet loss and general independence between packets spaced that far apart, the approach turns the 3 second pause from a 1 in 400 event into a 1 in 160,000 event (both SYNs have to be lost: 1/400 x 1/400). That is pretty much the difference between "kinda annoying" and "didn't notice". It's a good idea - if you hate it for whatever reason just set the pref to 0 to disable it.&lt;br /&gt;&lt;br /&gt;Taking the idea one step further, if we create two connections because of this timer and they both end up completing, obviously only one can be used immediately. But we cache the other one as if it were a persistent connection - and then when you need it (which you probably will) you don't have to wait for the handshake at all. It is essentially a prefetched TCP connection. On my desktop, I run with an especially low timer so that any site with a &amp;gt; 100ms RTT benefits from this and it's great!&lt;br /&gt;&lt;br /&gt;You can see this effect below, using mnot's cool &lt;a href="https://github.com/mnot/htracr"&gt;htracr&lt;/a&gt;, on the second connection. 
Note how there is no request placed on it as soon as it is live (the request is the red dot at the top of the grey rectangle - the rectangle represents the connection), but one follows shortly thereafter without having to do a handshake. That's an RTT saved!&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_wPMwI2SStyw/TQK1-txiU5I/AAAAAAAACSY/m_iHZZ2VuhM/s1600/apache.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="190" src="http://1.bp.blogspot.com/_wPMwI2SStyw/TQK1-txiU5I/AAAAAAAACSY/m_iHZZ2VuhM/s640/apache.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You will be able to enjoy this feature in FF 4.0 Beta 9. A buggy version of it is actually included in Beta 8 but disabled behind the pref mentioned above. Feel free to enable and play with it before Beta 9 if you don't mind a connection freeze once in a while.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2897591853525643592?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2897591853525643592'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2897591853525643592'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2010/12/accelerated-connection-retry-for-http.html' title='Accelerated Connection Retry for HTTP and Firefox'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_wPMwI2SStyw/TQK1-txiU5I/AAAAAAAACSY/m_iHZZ2VuhM/s72-c/apache.png' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-1983661560121173662</id><published>2010-12-01T14:55:00.000-05:00</published><updated>2010-12-01T14:55:40.651-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Performance of Pipelining in HTTP Firefox</title><content type='html'>This post provides some performance measurements of my &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=603503"&gt;HTTP pipeline patches&lt;/a&gt; for Firefox.&lt;br /&gt;&lt;br /&gt;The key benefits of pipelining are reduced transaction latency and potentially the use of fewer connections, so those are generally the metrics I will focus on here. For each of my 5 test cases we will look at the following statistics:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The percentage of requests that are delayed in the request queue inside Firefox waiting for a connection to be available.&lt;/li&gt;&lt;li&gt;The average amount of queue latency for each transaction. This is measured as the time the first byte of the request is given to the kernel minus the time at which the request was presented to Necko/Gecko. It is generally very low but greater than 0 even if the transaction can be placed directly on a pipeline because it takes a moment to construct the request and perhaps schedule the socket thread if other things are on the CPU. But a high value is an opportunity lost - that is time the request could be in flight and the server could be processing it. 
It includes the time needed to create a new connection when one is required - that covers the three-way TCP handshake but not a DNS lookup (which is cached in the test).&lt;/li&gt;&lt;li&gt;The type of connection used for each transaction (new connection, an idle persistent connection, or a pipelined connection)&lt;/li&gt;&lt;li&gt;The average amount of transaction latency for each transaction. This is measured as the time of the first byte of the response being received by Firefox minus the time at which the request was presented to Necko/Gecko. It is possible for the average improvement to be greater than 1 RTT because of cumulative queueing delays in the non-pipelined case.&lt;/li&gt;&lt;li&gt;The cumulative fraction of transactions completed before 3 different elapsed times in order to show improved execution time for the test case. The benchmark times are sized appropriately for each test case.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;There are 4 data points for each criterion in each test case. Because pipelining is aimed at environments with significant latency, and my broadband test connectivity has lower latency than much of the world and every mobile environment, the first two data points have 200ms of latency added through a traffic shaper. Those two data points compare pipelining on vs pipelining off. The other two data points measure the same things but without the induced latency. &lt;br /&gt;&lt;br /&gt;All tests are run with both a disk and memory cache enabled but empty at the beginning of the run. In order to measure the effectiveness of the pipelining, each of these sites has been put in the "green - pipelining ok" state which is normally auto-discovered.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Facebook&lt;/h3&gt;&lt;br /&gt;The first test is on Facebook. 
It starts by logging in and then selecting a particular user profile and navigating from that page via lists of friends, occasionally pulling up individual profiles, generating lists of recent updates, and pressing the More link on busy Facebook walls. There are approximately 1400 HTTP transactions in each test run.&lt;br /&gt;&lt;br /&gt;The first thing to consider is the percent of requests that are delayed (i.e. queued) within Firefox. I think queuing is particularly bad because if the request hasn't been passed to the network there is no way for any advances in server technology to ever operate on it. For instance, servers are prevented from returning responses out of order, but nothing would prevent them from processing requests out of order to overlap latencies in DB queries and disk I/O &lt;span style="font-weight: bold;"&gt;if&lt;/span&gt; the requests were not queued on the browser side.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Percent of Requests Queued&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;79.1&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;78.6&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;That is a stark contrast. 
It is possible by the way to see a request being queued with pipelining enabled - a default configuration limit of 32 governs the maximum depth of the pipeline and not all request types are pipeline eligible.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Average Queue Latency (ms)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;29&lt;/th&gt;&lt;th&gt;1630&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;6&lt;/th&gt;&lt;th&gt;285&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;You might expect to see 0ms for the pipelining case as we just illustrated above that no requests were delayed. But the queue latency covers the time from request submission to the time of putting the first byte of the request on the wire, so that includes any connection setup time when establishing a new connection. 
That is the primary source of the latency seen here for the pipeline-enabled case.&lt;br /&gt;&lt;br /&gt;That raises the question: when pipelining is enabled, how many of the requests are pipelined?&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="5" align="center"&gt;Connection Type Pct&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;Moderate Latency w/Pipeline&lt;/th&gt;&lt;th&gt;Moderate Latency wo/Pipeline&lt;/th&gt; &lt;th&gt;Low Latency w/Pipeline&lt;/th&gt;&lt;th&gt;Low Latency wo/Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;New&lt;/th&gt;&lt;th&gt;2&lt;/th&gt;&lt;th&gt;4&lt;/th&gt;&lt;th&gt;2&lt;/th&gt;&lt;th&gt;3&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Reused Idle&lt;/th&gt;&lt;th&gt;13&lt;/th&gt;&lt;th&gt;96&lt;/th&gt;&lt;th&gt;13&lt;/th&gt;&lt;th&gt;97&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Pipeline&lt;/th&gt;&lt;th&gt;85&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;84&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;We see here a moderate reduction in the number of connections used when pipelining, but most of the effect is a transfer from idle persistent connections over to pipelines. While the percentage of new connections as a portion of the overall request stream has gone down just a tick with pipelining, the impact on the actual number of raw connections is significant - going from roughly 60 without pipelining to 30 with it enabled. 
That boils down to a 50% reduction in the number of connections created, which provides a very busy site like Facebook a significant scalability boost.&lt;br /&gt;&lt;br /&gt;The final criteria deal with transaction latency.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Average Transaction Latency (ms)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;702&lt;/th&gt;&lt;th&gt;1906&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;341&lt;/th&gt;&lt;th&gt;346&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Yowza!&lt;/span&gt; Now there is a result. Under conditions with moderate latency the average transaction waits about 1200ms less from the time the request is submitted to Necko to the time the first byte of the response header is received. The net effect is so much more than the roughly 250ms RTT because of aggregating queueing delays - without pipelining enabled you are placed in a deep queue which has to be totally cleared, with a 1 RTT overhead on each request ahead of you, before you are executed. 
The impact under low latency conditions is probably close to being noise.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="4" align="center"&gt;Pct of Responses Rcvd in &amp;lt; Xms&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;x=1500&lt;/th&gt;&lt;th&gt;x=1200&lt;/th&gt;&lt;th&gt;x=900&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency w/Pipeline&lt;/th&gt;&lt;th&gt;97&lt;/th&gt;&lt;th&gt;91&lt;/th&gt;&lt;th&gt;75&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency wo/Pipeline&lt;/th&gt;&lt;th&gt;45&lt;/th&gt;&lt;th&gt;38&lt;/th&gt;&lt;th&gt;32&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="4" align="center"&gt;Pct of Responses Rcvd in &amp;lt; Xms&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;x=1000&lt;/th&gt;&lt;th&gt;x=700&lt;/th&gt;&lt;th&gt;x=400&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency w/Pipeline&lt;/th&gt;&lt;th&gt;99&lt;/th&gt;&lt;th&gt;92&lt;/th&gt;&lt;th&gt;61&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency wo/Pipeline&lt;/th&gt;&lt;th&gt;99&lt;/th&gt;&lt;th&gt;91&lt;/th&gt;&lt;th&gt;61&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;Facebook is a big success - probably the biggest success of any of the tests. Performance improves significantly in 200+ms latency situations, and low latency scenarios perform similarly while using fewer TCP connections.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Amazon.com&lt;/h3&gt;&lt;br /&gt;The Amazon.com test walks through a basic window shopping experience at Amazon.com. The home page is loaded, the Kindle link is clicked, a few more categories are clicked and the lists of products are generally browsed and sorted by "hot and new" and other similar things. 
This boils down to about 800 HTTP transactions.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Percent of Requests Queued&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;54.4&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;39.8&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;Right away you can see that fewer requests are queued for Amazon than for Facebook, so the potential improvement is smaller.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Average Queue Latency (ms)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;116&lt;/th&gt;&lt;th&gt;791&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;12&lt;/th&gt;&lt;th&gt;136&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="5" align="center"&gt;Connection Type Pct&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;Moderate Latency w/Pipeline&lt;/th&gt;&lt;th&gt;Moderate Latency wo/Pipeline&lt;/th&gt; &lt;th&gt;Low Latency w/Pipeline&lt;/th&gt;&lt;th&gt;Low Latency wo/Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;New&lt;/th&gt;&lt;th&gt;12&lt;/th&gt;&lt;th&gt;13&lt;/th&gt;&lt;th&gt;6&lt;/th&gt;&lt;th&gt;13&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Reused 
Idle&lt;/th&gt;&lt;th&gt;20&lt;/th&gt;&lt;th&gt;96&lt;/th&gt;&lt;th&gt;28&lt;/th&gt;&lt;th&gt;87&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Pipeline&lt;/th&gt;&lt;th&gt;68&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;66&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;The first thing to note is that less pipelining is going on than with Facebook, so again there is less potential for improvement. How the pages are constructed has a lot to do with this (perhaps fewer images, etc.). But almost as interesting is the fact that the number of TCP connections (i.e. new connections) is halved in the low latency case. If the page can be transferred in the same amount of time using fewer connections, that is still a win for the web overall.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Average Transaction Latency (ms)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;635&lt;/th&gt;&lt;th&gt;1083&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;266&lt;/th&gt;&lt;th&gt;204&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;An interesting result - more than 400ms off the average transaction in the ~250ms RTT environment, but a notable loss in the low latency scenario. All of the numbers here are averages across two test runs, but just inspecting the Amazon test case in particular on some other ad-hoc runs showed quite a bit of variability. My suspicion is that server load occasionally results in a single resource taking a long time to return. 
I have seen this disable pipelining for HTML pages, but leave it enabled for images, in the past.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="4" align="center"&gt;Pct of Responses Rcvd in &amp;lt; Xms&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;x=1200&lt;/th&gt;&lt;th&gt;x=900&lt;/th&gt;&lt;th&gt;x=600&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency w/Pipeline&lt;/th&gt;&lt;th&gt;91&lt;/th&gt;&lt;th&gt;79&lt;/th&gt;&lt;th&gt;62&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency wo/Pipeline&lt;/th&gt;&lt;th&gt;77&lt;/th&gt;&lt;th&gt;69&lt;/th&gt;&lt;th&gt;61&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="4" align="center"&gt;Pct of Responses Rcvd in &amp;lt; Xms&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;x=1000&lt;/th&gt;&lt;th&gt;x=700&lt;/th&gt;&lt;th&gt;x=400&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency w/Pipeline&lt;/th&gt;&lt;th&gt;93&lt;/th&gt;&lt;th&gt;90&lt;/th&gt;&lt;th&gt;84&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency wo/Pipeline&lt;/th&gt;&lt;th&gt;99&lt;/th&gt;&lt;th&gt;94&lt;/th&gt;&lt;th&gt;85&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;h3&gt;Flickr&lt;/h3&gt;&lt;br /&gt;The flickr test is probably the simplest of the cases. It simply loads several galleries based on set names and tags. There are roughly 350 HTTP transactions in the test. 
Under normal conditions Flickr has a high variability in server response time.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Percent of Requests Queued&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;57&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;57&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Average Queue Latency (ms)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;43&lt;/th&gt;&lt;th&gt;814&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;7&lt;/th&gt;&lt;th&gt;211&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="5" align="center"&gt;Connection Type Pct&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;Moderate Latency w/Pipeline&lt;/th&gt;&lt;th&gt;Moderate Latency wo/Pipeline&lt;/th&gt; &lt;th&gt;Low Latency w/Pipeline&lt;/th&gt;&lt;th&gt;Low Latency wo/Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;New&lt;/th&gt;&lt;th&gt;10&lt;/th&gt;&lt;th&gt;27&lt;/th&gt;&lt;th&gt;12&lt;/th&gt;&lt;th&gt;27&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Reused 
Idle&lt;/th&gt;&lt;th&gt;19&lt;/th&gt;&lt;th&gt;73&lt;/th&gt;&lt;th&gt;19&lt;/th&gt;&lt;th&gt;73&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Pipeline&lt;/th&gt;&lt;th&gt;71&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;69&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;As with the other tests, more than half of the new connections have been replaced when pipelining is enabled.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Average Transaction Latency (ms)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;859&lt;/th&gt;&lt;th&gt;1091&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;291&lt;/th&gt;&lt;th&gt;366&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;This result is more modest, but still positive, when compared to our other tests. 
&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="4" align="center"&gt;Pct of Responses Rcvd in &amp;lt; Xms&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;x=2000&lt;/th&gt;&lt;th&gt;x=1500&lt;/th&gt;&lt;th&gt;x=1000&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency w/Pipeline&lt;/th&gt;&lt;th&gt;95&lt;/th&gt;&lt;th&gt;87&lt;/th&gt;&lt;th&gt;67&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency wo/Pipeline&lt;/th&gt;&lt;th&gt;88&lt;/th&gt;&lt;th&gt;74&lt;/th&gt;&lt;th&gt;60&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="4" align="center"&gt;Pct of Responses Rcvd in &amp;lt; Xms&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;x=1000&lt;/th&gt;&lt;th&gt;x=700&lt;/th&gt;&lt;th&gt;x=400&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency w/Pipeline&lt;/th&gt;&lt;th&gt;99&lt;/th&gt;&lt;th&gt;97&lt;/th&gt;&lt;th&gt;72&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency wo/Pipeline&lt;/th&gt;&lt;th&gt;98&lt;/th&gt;&lt;th&gt;93&lt;/th&gt;&lt;th&gt;71&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;h3&gt;www.AsiaNewsPhoto.com&lt;/h3&gt;&lt;br /&gt;This test uses a photojournalism clearinghouse site located overseas, so the broadband low latency case has a starting RTT closer to 100ms, while the moderate delay case adds 200ms to that. 
This is the smallest test case - just 175 transactions in each run.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Percent of Requests Queued&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;44&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;38&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Average Queue Latency (ms)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;21&lt;/th&gt;&lt;th&gt;726&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;31&lt;/th&gt;&lt;th&gt;400&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;I am not yet certain how to explain the very modest rise in queue time for the pipeline case when the added 200ms delay is removed. 
It must involve an aberrant TCP connection, as that is really the only component of queue time when the requests themselves are not delayed due to connection limits.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="5" align="center"&gt;Connection Type Pct&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;Moderate Latency w/Pipeline&lt;/th&gt;&lt;th&gt;Moderate Latency wo/Pipeline&lt;/th&gt; &lt;th&gt;Low Latency w/Pipeline&lt;/th&gt;&lt;th&gt;Low Latency wo/Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;New&lt;/th&gt;&lt;th&gt;16&lt;/th&gt;&lt;th&gt;7&lt;/th&gt;&lt;th&gt;11&lt;/th&gt;&lt;th&gt;8&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Reused Idle&lt;/th&gt;&lt;th&gt;33&lt;/th&gt;&lt;th&gt;93&lt;/th&gt;&lt;th&gt;33&lt;/th&gt;&lt;th&gt;92&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Pipeline&lt;/th&gt;&lt;th&gt;51&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;56&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;This is the first time we actually see the new connection numbers moving in the wrong direction. In this case I believe the type scheduling restrictions placed on the connection manager are generating new connections that may have been unnecessary in the non-pipelining scenario. 
I'm curious if the effect would fade in a test with more transactions.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Average Transaction Latency (ms)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;1310&lt;/th&gt;&lt;th&gt;1248&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;752&lt;/th&gt;&lt;th&gt;689&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;This seems to track the changes in connection types and maybe the test bears further examination to see if an adjustment can be made. The scheduling algorithm seems to be getting in the way of itself and has made performance just a tick worse than before, though not by very much. And certainly not by enough to discount the gains made in some other scenarios.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="4" align="center"&gt;Pct of Responses Rcvd in &amp;lt; Xms&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;x=2000&lt;/th&gt;&lt;th&gt;x=1500&lt;/th&gt;&lt;th&gt;x=1000&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency w/Pipeline&lt;/th&gt;&lt;th&gt;84&lt;/th&gt;&lt;th&gt;82&lt;/th&gt;&lt;th&gt;52&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency wo/Pipeline&lt;/th&gt;&lt;th&gt;82&lt;/th&gt;&lt;th&gt;76&lt;/th&gt;&lt;th&gt;54&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="4" align="center"&gt;Pct of Responses Rcvd in &amp;lt; 
Xms&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;x=1500&lt;/th&gt;&lt;th&gt;x=1000&lt;/th&gt;&lt;th&gt;x=500&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency w/Pipeline&lt;/th&gt;&lt;th&gt;88&lt;/th&gt;&lt;th&gt;82&lt;/th&gt;&lt;th&gt;46&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency wo/Pipeline&lt;/th&gt;&lt;th&gt;84&lt;/th&gt;&lt;th&gt;82&lt;/th&gt;&lt;th&gt;57&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;h3&gt;MapQuest&lt;/h3&gt;&lt;br /&gt;This test is different in that it is driven almost exclusively through JS and XMLHttpRequest. Those elements are present in the Facebook and Amazon tests as well, but they dominate the MapQuest test. In this scenario a map is brought up on the screen and it is manipulated in the usual ways - panning in 4 directions, zooming in and out, and toggling between satellite and map mode. By the time it is done, 711 HTTP transactions have been made.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Percent of Requests Queued&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;42&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;44&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Average Queue Latency (ms)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;52&lt;/th&gt;&lt;th&gt;381&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low 
Latency&lt;/th&gt;&lt;th&gt;10&lt;/th&gt;&lt;th&gt;198&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;For all cases, the queue latency is pretty low for this test. That means the number of documents requested in one burst is relatively modest.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="5" align="center"&gt;Connection Type Pct&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;Moderate Latency w/Pipeline&lt;/th&gt;&lt;th&gt;Moderate Latency wo/Pipeline&lt;/th&gt; &lt;th&gt;Low Latency w/Pipeline&lt;/th&gt;&lt;th&gt;Low Latency wo/Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;New&lt;/th&gt;&lt;th&gt;12&lt;/th&gt;&lt;th&gt;16&lt;/th&gt;&lt;th&gt;11&lt;/th&gt;&lt;th&gt;14&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Reused Idle&lt;/th&gt;&lt;th&gt;30&lt;/th&gt;&lt;th&gt;84&lt;/th&gt;&lt;th&gt;32&lt;/th&gt;&lt;th&gt;86&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Pipeline&lt;/th&gt;&lt;th&gt;58&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;th&gt;57&lt;/th&gt;&lt;th&gt;0&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;Marginally fewer new connections are used with pipelining. 
Hurrah.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="3" align="center"&gt;Average Transaction Latency (ms)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;With Pipeline&lt;/th&gt;&lt;th&gt;Without Pipeline&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency&lt;/th&gt;&lt;th&gt;677&lt;/th&gt;&lt;th&gt;732&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency&lt;/th&gt;&lt;th&gt;260&lt;/th&gt;&lt;th&gt;500&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="4" align="center"&gt;Pct of Responses Rcvd in &amp;lt; Xms&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;x=1500&lt;/th&gt;&lt;th&gt;x=1200&lt;/th&gt;&lt;th&gt;x=900&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency w/Pipeline&lt;/th&gt;&lt;th&gt;96&lt;/th&gt;&lt;th&gt;87&lt;/th&gt;&lt;th&gt;76&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Moderate Latency wo/Pipeline&lt;/th&gt;&lt;th&gt;96&lt;/th&gt;&lt;th&gt;83&lt;/th&gt;&lt;th&gt;70&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;table style="border-width: 3px; border-spacing: 1px; border-style: outset; border-collapse: separate;" width="100%"&gt;&lt;tr&gt;&lt;th colspan="4" align="center"&gt;Pct of Responses Rcvd in &amp;lt; Xms&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;x=900&lt;/th&gt;&lt;th&gt;x=600&lt;/th&gt;&lt;th&gt;x=300&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency w/Pipeline&lt;/th&gt;&lt;th&gt;99&lt;/th&gt;&lt;th&gt;95&lt;/th&gt;&lt;th&gt;65&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Low Latency wo/Pipeline&lt;/th&gt;&lt;th&gt;87&lt;/th&gt;&lt;th&gt;75&lt;/th&gt;&lt;th&gt;40&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/8147202175434463396-1983661560121173662?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/1983661560121173662'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/1983661560121173662'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2010/11/performance-of-pipelining-in-http.html' title='Performance of Pipelining in HTTP Firefox'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2716921432735521544</id><published>2010-11-30T13:55:00.007-05:00</published><updated>2010-12-01T14:54:35.586-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Implementing a Pipelined Client for Firefox</title><content type='html'>I have been working on a &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=603503"&gt;set of patches&lt;/a&gt; to revamp the Firefox HTTP pipeline implementation. The objective is to provide a fast and robust implementation of pipelines. 
&lt;a href="http://bitsup.blogspot.com/2010/11/value-of-http-pipelines.html"&gt;There are lots of reasons to think this is a good thing.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The lowest level of the current implementation, which is disabled by default, receives a few minor tweaks. The primary change is to &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=447866"&gt;implement a continually fillable pipeline&lt;/a&gt; instead of the existing one where a pipeline is loaded up to whatever depth the request queue supports and then is not refilled until it has been emptied.&lt;br /&gt;&lt;br /&gt;There are much more serious changes on top of that. The most significant is probably captured in the &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=599164"&gt;"Select Connection Based on Type and State"&lt;/a&gt; bug and patch. This code does four things: 1) it classifies each transaction into a particular type, 2) it keeps track of the history of each server with respect to pipelining, 3) it determines whether or not pipelining is appropriate based on the type of the transaction and the history of the server with that type in the past, and 4) if it decides to pipeline, it places the transaction on a pipeline filled only with transactions of the same classification.&lt;br /&gt;&lt;br /&gt;That's a lot to think about. But the basic idea is to sort requests into control traffic (i.e. js and css), images, revalidations, htmlish things, and things known to be pipeline inappropriate such as video, most XMLHttpRequest, non-idempotent transactions, etc. 
I call the latter classification "solo".&lt;br /&gt;&lt;br /&gt;Classes of data such as CSS and revalidations that generally have very short responses, and thus benefit the most from pipelining, favor pipelines over parallel connections even when we have less than the maximum number of connections already open.&lt;br /&gt;&lt;br /&gt;A variety of things can plague a successful pipelining session - for instance a site may have some documents with a large processing time on the server and that "think time" can really gum up the pipeline. By segregating the types of documents we can turn off pipelining for whatever is DB driven (e.g. XHTML) while still chasing down deep pipelines for images. Facebook is a terrific example of this.. the contents of your wall or your friend list probably have to be scooped out of a database computation and composed in real time, but the dozens of icons they reference are just whipped quickly right out when you have their direct key.&lt;br /&gt;&lt;br /&gt;This concept of negative feedback (in the example above it would be a read of the HTML page with a latency well in excess of the RTT to the server) is what drives the server state. &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=603512"&gt;Very large responses&lt;/a&gt;, sub HTTP/1.1 responses, cancelled pipelines, closed connections, and &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=593386"&gt;server headers known to be associated with broken pipeline implementations&lt;/a&gt; all trigger negative feedback too.&lt;br /&gt;&lt;br /&gt;Most feedback is tied to the classification of the particular request, but some is applied to all classes on the server. 
Corrupted data is one such case - if the response contains an invalid &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=232030"&gt;MD5 sum&lt;/a&gt;, or uses the proposed &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=597684"&gt;Assoc-Req&lt;/a&gt; header but fails to provide the correct information, or appears to have &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=597706"&gt;nonsensical framing&lt;/a&gt; information, all requests to that server are prevented from using pipelines for a long time. In the past such corruptions have happened due to buggy or compromised servers and intermediaries.&lt;br /&gt;&lt;br /&gt;Much like congestion control, the determination of whether or not to proceed with a pipeline is based on both negative and positive feedback. Initially pipelines are not sent. We must first see an HTTP/1.1 response header from the server. After that has been accomplished we try to send a pipeline of only depth 2 and only on a single connection. If both of those responses are received OK then we transition from that tentative state (known internally as yellow) to a position where each connection can send pipelines of up to depth 4 instead of opening all the way. Even after the depth-of-2 test succeeds we still cannot be certain the topology supports pipelines - what appears as a short pipeline at the client may not appear that way at the server due to race conditions inherent in network transfer, but the extension to a depth of 4 for every connection still represents significant additional capacity over a single connection being allowed a depth of 2. During this phase concurrent connections are of course also used, so nothing is slowed down over the historical pipelining disabled setting.&lt;br /&gt;&lt;br /&gt;Once a transaction that was sent at a depth of at least 3 is successfully received, the depth limits are removed from all connections to that host. 
The new maximum depth is only governed by a configuration preference with the default of 32. Should one of the negative events mentioned earlier occur, that class of transaction is placed in the red state (i.e. no pipelining is allowed for a period of time to that server) generally without interfering with the other transactions.&lt;br /&gt;&lt;br /&gt;As mentioned, an unexpected delay of a few hundred milliseconds is considered to be think time and feedback is applied to keep things running smoothly with only a barely perceptible bump in the road. But a longer delay not only applies feedback for future use but it &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=603514"&gt;also cancels any transactions pipelined after the currently delayed one and moves them to new connections&lt;/a&gt;. This helps mitigate the head of line blocking problem if it is really severe and as a side effect no more pipelines will happen in the near future with that server. In that sense Firefox is self correcting in a hostile environment.&lt;br /&gt;&lt;br /&gt;All of this negative information expires over time but while it is still valid Firefox will keep it persistently between restarts - its value was hard earned.&lt;br /&gt;&lt;br /&gt;XMLHttpRequests are a special source of pain in traditional pipeline implementations. They often implement long polling patterns where a request is sent to the server and the server intentionally hangs until some external event occurs - it then shares that information with the client by forming a response. It is essentially server push from a data point of view implemented with a long polling request. 
For that reason, XHR is by default classified as solo, &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=602518"&gt;but metadata can be supplied&lt;/a&gt; in the form of an HTTP request header to allow the application to indicate the request is expected to be fulfilled quickly.&lt;br /&gt;&lt;br /&gt;As they say on late night TV, &lt;span style="font-style:italic;"&gt;"But Wait! There's more!"&lt;/span&gt; Sometimes the pipelining infrastructure problem isn't on the server side, sometimes it is a proxy that is part of the client's topology. To test for that there is a &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=603505"&gt;new startup test&lt;/a&gt; which initiates a deep pipeline to a known pipeline friendly resource on the Internet. This server intentionally defers its responses until a deep pipeline has been received there and only then leaks a small response. Both sides continue a pipeline on the same connection until it has been confirmed that an entire window of data can be supported by the HTTP devices on that path. The results of this test are cached for a very long time. This transfer also takes the opportunity to share with the client a &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=603506"&gt;list of host names that should be blacklisted&lt;/a&gt; with respect to pipelining because of known operability problems. (You can still use them - just not pipeline with them.) The location (and enablement) of the test and hostname blacklist server are configurable.&lt;br /&gt;&lt;br /&gt;All of this feedback and history tracking sounds intimidating, but the truth is that most of the web works just fine. 
There are enough problems that a feedback driven self correcting implementation helps smooth out the bumps, but most of the web opens right up without any problems.&lt;br /&gt;&lt;br /&gt;&lt;a href="mailto:mcmanus@ducksong.com"&gt;Mail your comments to me&lt;/a&gt; or provide a talk back link.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2716921432735521544?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2716921432735521544'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2716921432735521544'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2010/11/implementing-pipelined-client-for.html' title='Implementing a Pipelined Client for Firefox'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-8038244064936614546</id><published>2010-11-30T11:43:00.008-05:00</published><updated>2010-12-01T14:53:03.381-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><category 
scheme='http://www.blogger.com/atom/ns#' term='google'/><title type='text'>The Value of HTTP Pipelines</title><content type='html'>For the past few months I've been on a personal quest to implement a safe and effective HTTP pipelining strategy for Mozilla Firefox. The pipeline concept is part of HTTP/1.1 but for various reasons has not been widely deployed on the web.&lt;br /&gt;&lt;br /&gt;I'm going to make a series of three posts. This one will be basic background information - what pipelines are in the abstract and why they are more relevant to today's web than they have been in years.&lt;a href="http://bitsup.blogspot.com/2010/11/implementing-pipelined-client-for.html"&gt; The second post will be about the details of my Firefox implementation&lt;/a&gt; and its mechanisms for dealing with the realities of the occasional pipeline-hostile piece of infrastructure. &lt;a href="http://bitsup.blogspot.com/2010/11/performance-of-pipelining-in-http.html"&gt;The last post will share some performance metrics&lt;/a&gt; of those patches in common use cases.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;HTTP Pipelines - 101&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A client forms a pipeline simply by sending a second HTTP request down an HTTP connection before it has received the first response. A pipeline may be more than two transactions deep. HTTP responses are still required to be returned in the order their requests arrived.&lt;br /&gt;&lt;br /&gt;This is a simple concept, but the benefits are profound.&lt;br /&gt;&lt;br /&gt;Chiefly, the pipelined transactions benefit by eliminating the latency involved in sending their request. The greater the latency, the greater the benefit.&lt;br /&gt;&lt;br /&gt;This crude diagram shows 2 normal HTTP transactions without pipelining. In it each request is shown with a rightward arrow and the response is shown with a leftward arrow. The empty space represents the network latency. 
When the first response is received, the second is sent.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_wPMwI2SStyw/TPUwDeE-NFI/AAAAAAAACQE/01YYmiCFR3Q/s1600/2xactno.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 125px;" src="http://3.bp.blogspot.com/_wPMwI2SStyw/TPUwDeE-NFI/AAAAAAAACQE/01YYmiCFR3Q/s400/2xactno.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5545391352348292178" /&gt;&lt;/a&gt;&lt;br /&gt;When pipelining is used, the picture becomes different. Even though the request and response sizes are the same, and the round trip time between the hosts is the same, the overall transaction completes sooner. It is obvious that the round trip latency between the client and the server has been mitigated as a problem.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_wPMwI2SStyw/TPUwkHpoTKI/AAAAAAAACQM/TU73qjtP538/s1600/2xactyes.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 125px;" src="http://4.bp.blogspot.com/_wPMwI2SStyw/TPUwkHpoTKI/AAAAAAAACQM/TU73qjtP538/s400/2xactyes.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5545391913263713442" /&gt;&lt;/a&gt;&lt;br /&gt;The greater the latency and the deeper the pipeline the more benefit is captured. &lt;br /&gt;&lt;br /&gt;While bandwidth continues to improve, latency is not keeping up. The trend is toward mobile networks and those have latencies 5x to 20x worse than broadband WAN connections. This increased latency is an opportunity for pipelining.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;HTTP Pipelines - 201&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Conceptually, parallel HTTP connections can garner similar benefits. 
After all, independent HTTP connections do not need to wait for each other to finish before beginning their work.&lt;br /&gt;&lt;br /&gt;Up to a point, that is completely true. Parallelism has been the mainstay for many years and served us well. But it has its limits. &lt;br /&gt;&lt;br /&gt;The chief limit is actually specified. No more than 6 (or 4, or 2 depending on the point in history) simultaneous connections are supposed to be allowed from the user agent to the server. This is a serious handicap - Firefox may easily have a request queue of over 100 images, stylesheets, and javascript items to fetch upon parsing a Facebook HTML page full of photos, emoticons, ads, and widgets. A single digit number of connections is far too small to effectively parallelize that download.&lt;br /&gt;&lt;br /&gt;This brings us to the reason there is a limit on the number of connections. Making a new TCP connection sucks. Before it can be used at all it requires a high latency three way handshake and if it is over TLS an expensive RSA operation on the server (and yet another round trip). Making the new connection requires access to shared data structures on the server in a way that using an existing connection does not and harms scalability. Servers that can pump multiple gigabits a second of TCP data in their sleep through a few connections can still only initiate on the order of tens of thousands of connections a second.&lt;br /&gt;&lt;br /&gt;Maintaining all those connections is expensive for the server too. Each one takes state (i.e. RAM in the kernel, perhaps a thread stack in the application), and the server now has to deal with sorting through more control blocks and application states every time a packet comes in, in order to match it with the right one. 
Because of the increased diversity of TCP control blocks in use, the L2/L3 cache is severely polluted, and cache efficiency is the number one factor in high performance TCP.&lt;br /&gt;&lt;br /&gt;Even if pipelines only mean browser performance equal to parallel connections but accomplished using fewer connections then that is a win for Web scalability.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;HTTP Pipelines - 301&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The details of TCP start to weigh in heavily in favor of pipelining when parallelism and pipelining are compared. &lt;br /&gt;&lt;br /&gt;Most significantly, TCP slow-start performs poorly in the parallelized environment. To recap what you already know: Slow start requires the sender to send a conservative amount of data at first (typically 3 or 4 packets) and then wait a full round trip time to receive positive feedback in the form of an ACK from the recipient. At that time it can grow the window a little bit and send another burst and then wait again. After enough iterations of this the sender is generating enough traffic in each burst to "fill the window" and it no longer has to wait. &lt;br /&gt;&lt;br /&gt;Pipelining, of course, does not eliminate slow start. But it does use fewer connections to move the same amount of data and the slow-start penalty scales with the number of connections, not the amount of data. Fewer connections mean less data is transferred under the artificially slow conditions of slow start.&lt;br /&gt;&lt;br /&gt;There is another wrinkle that applies to even parallel connections that have paid their start-up dues and progressed past slow start. TCP connections that have not sent data within an RTO (think of an RTO as a round trip time plus a grace period) are supposed to go back to slow-start! From a TCP point of view this makes some sense - the inactivity means the TCP stack has lost the ack-clock that it needs to use the connection aggressively. 
But even for persistent use of a parallel HTTP connection this is a disaster. Effectively each transaction response must go through slow start because it takes more than a round trip for the last packet of the previous response to travel to the client, the client to form the next request, the request to travel to the server and the server to form the response. Pipelined connections do not have that problem - one response follows immediately on the heels of the previous response which ensures optimal TCP use of the available bandwidth. Huzzah.&lt;br /&gt;&lt;br /&gt;Lastly, it is worth talking about &lt;a href="http://bitsup.blogspot.com/2010/02/while-back-i-mentioned-that-google.html"&gt;Google's push for larger initial windows&lt;/a&gt; during slow start. They support a value of 10 instead of the current 3 or 4. Basically I think this is a good thing - the Internet can support larger bursts than we are currently sending. Unfortunately, this has a terrible potential interaction with parallel connections. 6 connections would essentially mean 60 congestion control packet credits available to be sent in a burst at any time. Just as 3 is too few for a single connection environment, 60 is too many for a multiple connection environment. Because pipelines reduce the reliance on parallel connections, they bring down the total number of packet credits outstanding at any time while still allowing for more effective slow start capacity probing. 
That's a win win in my book.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-8038244064936614546?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8038244064936614546'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8038244064936614546'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2010/11/value-of-http-pipelines.html' title='The Value of HTTP Pipelines'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_wPMwI2SStyw/TPUwDeE-NFI/AAAAAAAACQE/01YYmiCFR3Q/s72-c/2xactno.png' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-3200033613193797016</id><published>2010-09-01T15:58:00.008-04:00</published><updated>2010-09-01T16:40:12.166-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Long Out Of Order Queueing Delays</title><content type='html'>Last week I &lt;a href="http://bitsup.blogspot.com/2010/08/characterizing-delays-caused-by-tcp-in.html"&gt;posted&lt;/a&gt; about a &lt;a 
href="http://www.ducksong.com/misc/out-of-order.patch"&gt;kernel patch&lt;/a&gt; that records how long out of order TCP packets are kept hidden from userspace while the kernel tries to fill in the necessary holes to create an in-order stream.&lt;br /&gt;&lt;br /&gt;These packets are especially frustrating - they have arrived at the host but the application does not have access to them until the kernel can create an in-order stream. Some applications that are really doing messaging over TCP (which might be sensible if you're looking for congestion control, maybe layered security, maybe multiplexing different semantic streams onto one TCP stream and the loss is localized to one of them, etc..) might be capable of moving on with their lives (and their data) more quickly if they had access to the missing sequence numbers. So the question is, how long are these kinds of applications waiting for in-order data when out-of-order data is already at their host?&lt;br /&gt;&lt;br /&gt;I ran that code for about 10 days on my desktop which runs typical American broadband with normal RTTs anywhere from 40 to 150ms. The host is even SACK enabled. Here is what I found:&lt;br /&gt;&lt;br /&gt;* 38,881 web flows (port 80 or 443)&lt;br /&gt;&lt;br /&gt;* 164 flows that contained reordered packets. That is 4.1 per thousand.&lt;br /&gt;&lt;br /&gt;* 915K total packets. 18,169 of them reordered. That is 19.8 per thousand.&lt;br /&gt;&lt;br /&gt;* The flows with reordering account for just .4% of the flows, but 40% of the total packets. Obviously, the bigger you are the more likely you are to experience a reordering event.. but furthermore small flows sometimes don't reorder at all because any loss that impacts them is more likely to be repaired with an empty window and a timeout.&lt;br /&gt;&lt;br /&gt;* If you select for just the .41% of flows that experience reordering a whopping 4.6% of the packets in that flow are reordered on average. 
Indeed this average flow is pretty large - 2418 packets and 110 of them are delayed due to being out of order. The average size of a flow that did not experience any reordering was just 24 data packets long. The fact that reordering events are such large clusters is probably good news - it likely means that we were seeing big windows of data on the wire and just a small number of the early packets in that window were lost.. the rest of the window is counted as out-of-order. We want to see big windows in flight - so I'm good with that.&lt;br /&gt;&lt;br /&gt;* The average length of a reordered packet is 1424 bytes. Over 97% of these reordered packets are at least 1300 bytes long. This isn't that interesting, but I wrote it down - so here it is.&lt;br /&gt;&lt;br /&gt;When I talk about "N packets long" I mean my host received N packets with data in them.. bare acks and control packets are not counted.&lt;br /&gt;&lt;br /&gt;So far, that's not too bad. Big flows have this happen all the time. Reordering is basically a prerequisite for doing any kind of TCP fast recovery in the face of packet loss so it looks good. If we assume that the reordering is due to small packet losses which can be repaired with fast-retransmit algorithms, which seems to make sense, then the out of order problem should be repaired in a little over 1 RTT, right?&lt;br /&gt;&lt;br /&gt;Unfortunately, when I plotted the delays incurred they look a lot bigger than the distribution of RTTs I see from my desktop. 
A lot bigger.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_wPMwI2SStyw/TH61jvRoNxI/AAAAAAAACJk/_YcKCWrvVfg/s1600/reorder-delay-97.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 369px;" src="http://3.bp.blogspot.com/_wPMwI2SStyw/TH61jvRoNxI/AAAAAAAACJk/_YcKCWrvVfg/s400/reorder-delay-97.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5512042619538519826" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;There is a really long tail on that graph - and it only captures the best 97% of the data points. The longest I saw any packet wait in the reorder queue (and make it out again) was a full 2.5 minutes. &lt;br /&gt;&lt;br /&gt;Even though 2.5 minutes is an aberration, the normal cases are still pretty awful. &lt;span style="font-weight:bold;"&gt;The median time out of order packets spend queued in the kernel while waiting for an in-order stream is 293 milliseconds!&lt;/span&gt; Ouch.&lt;br /&gt;&lt;br /&gt;Let's zoom in on that graph - this shows the 90% of the packets that waited the shortest amount of time:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_wPMwI2SStyw/TH620JUu1OI/AAAAAAAACJs/mcFXuGSQDvY/s1600/reorder-delay-90.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 369px;" src="http://3.bp.blogspot.com/_wPMwI2SStyw/TH620JUu1OI/AAAAAAAACJs/mcFXuGSQDvY/s400/reorder-delay-90.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5512044000920392930" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;That's pretty ugly, you need to budget a full second wait to get 80% of the reordered packets out of their kernel limbo.&lt;br /&gt;&lt;br /&gt;It is much much uglier than I expected.&lt;br /&gt;&lt;br /&gt;It's not the reordering that bothers me - big reordering runs are to be perfectly 
expected in the face of a little packet loss and it is good to use that bandwidth while the loss is repaired. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;But why is it taking so long to repair?&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-3200033613193797016?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3200033613193797016'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3200033613193797016'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2010/09/last-week-i-posted-about-kernel-patch.html' title='Long Out Of Order Queueing Delays'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_wPMwI2SStyw/TH61jvRoNxI/AAAAAAAACJk/_YcKCWrvVfg/s72-c/reorder-delay-97.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-1634837018061745396</id><published>2010-08-20T20:29:00.003-04:00</published><updated>2010-08-20T20:49:56.614-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' 
term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Characterizing Delays Caused by TCP in-order</title><content type='html'>If packet N+1 of a TCP flow arrives before packet N, the receiving application does not see any data until packet N gets there. That's what we mean when we say TCP guarantees in-order delivery. That is also true if N+1 through N+100 get there before N - nobody gets through until they can all be delivered in-order.&lt;br /&gt;&lt;br /&gt;At least using the BSD socket API.&lt;br /&gt;&lt;br /&gt;I got to thinking about the impact of this when discussing a multiplexing implementation of various logical streams on top of TCP. For instance, SPDY and BEEP do things along those lines in order to create efficiencies in terms of more accurate congestion control data. But as someone objected, that creates a certain amount of fate sharing between the different streams that wouldn't exist if they were on separate TCP channels. A packet loss in one of them creates a delay for all of them even though throughput might very well be maintained using some variation of fast-retransmit and large windows. &lt;br /&gt;&lt;br /&gt;So the question: how often are packets received only to have their data delayed by the kernel because the stream isn't yet in order? And how long are those delays?&lt;br /&gt;&lt;br /&gt;I don't know yet, but I wrote some crude &lt;a href="http://www.ducksong.com/misc/out-of-order.patch"&gt;linux kernel patches&lt;/a&gt; to find out. When an skb is moved out of the out-of-order queue, a structure with 2 timestamps (in queue, out of queue) is passed to userspace through the netlink connector mechanism. It also reports the total number of received data packets on each TCP stream. 
That way we can find out, how often and how long.&lt;br /&gt;&lt;br /&gt;I'm running the hack on my desktop now.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-1634837018061745396?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/1634837018061745396'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/1634837018061745396'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2010/08/characterizing-delays-caused-by-tcp-in.html' title='Characterizing Delays Caused by TCP in-order'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-5071220527016201221</id><published>2010-02-24T10:38:00.003-05:00</published><updated>2010-02-24T11:17:04.703-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='ip'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='hardware'/><title type='text'>Forwarding Decisions with Bloom Filters</title><content type='html'>Really neat paper: "Hash, Don't Cache: fast packet forwarding for enterprise edge routers". 
The &lt;a href="http://www.cs.princeton.edu/~minlanyu/writeup/wren09.pdf"&gt;paper&lt;/a&gt; and &lt;a href="http://portal.acm.org/citation.cfm?id=1592688"&gt;abstract&lt;/a&gt; are both available online. The authors are Minlan Yu and Jennifer Rexford - both of Princeton.&lt;br /&gt;&lt;br /&gt;The authors share a lament of mine - caching forwarding decisions is attractive but no longer realistic as an implementation. The diversity of addresses (including those generated randomly by attackers) pollutes the cache and sends way too much traffic to slow fallback processing paths. So we end up with routers with one class of memory and no caches. Generally their memory is made totally from the really expensive fast stuff.&lt;br /&gt;&lt;br /&gt;But the suggestion in this paper of using hashed bloom filters instead of caches is a really cool one. Essentially, you maintain one filter per interface in fast memory (sram probably) and evaluate those at forwarding time, in parallel if you can. You probably just get a single hit and can forward the packet onwards. You need only enough fast ram for the hash filters (i.e. not much), whereas all the traditional trie data and routing update code can live in slow and cheap dram. &lt;br /&gt;&lt;br /&gt;Of course, due to the false positive nature of bloomies it is possible that you'll get more than one match, which is going to require some kind of (probably slower) fallback plan for that packet. But the problem is nowhere near as dire as it is with caches. With caches the misses force all the valid entries out of the cache and then all traffic is slowed down as the cache hit rate drops - with bloomies only the packets with the false positives are impacted; almost all of the traffic goes through the fast path unimpacted. The paper puts the false positive rate somewhere on the order of 1 in tens of thousands (depending on a bunch of factors - but that gives a feel for it.) 
Furthermore a cache can be intentionally attacked with a diversity of addresses in order to flood the cache and impact service for everyone - where under bloom filters such an attack is no more or less likely to impact service than any other kind of packet. &lt;br /&gt;&lt;br /&gt;What can't you apply a bloom filter to? It's all pretty cool. The paper has a number of other details on how to minimize false positives and efficiently process route updates.&lt;br /&gt;&lt;br /&gt;There is a related work from the same authors on a "buffalo" architecture which, after just a quick glance, appears to apply similar principles to LAN switching.&lt;br /&gt;&lt;br /&gt;J Rexford is also an author of &lt;a href="http://www.amazon.com/gp/product/0201710889?ie=UTF8&amp;tag=wwwducksongco-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0201710889"&gt;Web Protocols and Practice: HTTP/1.1, Networking Protocols, Caching, and Traffic Measurement&lt;/a&gt;&lt;img src="http://www.assoc-amazon.com/e/ir?t=wwwducksongco-20&amp;l=as2&amp;o=1&amp;a=0201710889" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /&gt; which is probably the most practical description of the major web protocol I've ever seen.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-5071220527016201221?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5071220527016201221'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5071220527016201221'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2010/02/forwarding-decisions-with-bloom-filters.html' title='Forwarding Decisions with Bloom Filters'/><author><name>Patrick 
McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-7810800999005929070</id><published>2010-02-16T13:32:00.003-05:00</published><updated>2010-02-16T14:19:40.343-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='ip'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><category scheme='http://www.blogger.com/atom/ns#' term='google'/><title type='text'>Googling Harder</title><content type='html'>A while back I mentioned that Google thinks &lt;a href="http://bitsup.blogspot.com/2009/07/google-thinks-tcp-should-be-more.html"&gt;TCP ought to be more aggressive&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I must admit, this matches my own bias. I can barely count the number of applications I have watched wait for network I/O when there was plenty of CPU and idle bandwidth available. It's maddening. Sometimes it's slow start or another aspect of congestion control, sometimes it is outdated things like the nagle algorithm.&lt;br /&gt;&lt;br /&gt;Well, Google is back at it with this &lt;a href="http://sites.google.com/a/chromium.org/dev/spdy/An_Argument_For_Changing_TCP_Slow_Start.pdf"&gt;slide set&lt;/a&gt;. 
(PDF)&lt;br /&gt;&lt;br /&gt;They make the argument for increasing the initial cwnd. More provocatively, they argue that the Web has already done so in a de facto way by going to aggressive numbers of independent parallel HTTP connections (where you essentially get new cwnd credits just for opening a new TCP stream). Clever argument. Maybe you want to pace the data after 3 or 4 packets based on the RTT of the handshake - so you don't overrun any buffers unnecessarily.&lt;br /&gt;&lt;br /&gt;Frankly, this kind of thing can be implemented on the server side without ever telling the peer. It would make some sense for Google to just do this for a few different values of cwnd on a tiny fraction of their traffic, see whether the packet loss rates change, and then publish the results.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-7810800999005929070?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7810800999005929070'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7810800999005929070'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2010/02/while-back-i-mentioned-that-google.html' title='Googling Harder'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2343319991120421619</id><published>2010-02-13T20:48:00.004-05:00</published><updated>2010-02-13T20:55:28.102-05:00</updated><title type='text'>Channel 2000</title><content type='html'>Posts about 
finding embedded linux in strange places are pretty common these days - but I couldn't resist this one.&lt;br /&gt;&lt;br /&gt;Flipping channels tonight put me on channel 2000 of my Time-Warner Digital Cable, and this was the signal:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_wPMwI2SStyw/S3dW6tfbquI/AAAAAAAAB1o/CRfH9JAGGJg/s1600-h/DSCN0980.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="http://3.bp.blogspot.com/_wPMwI2SStyw/S3dW6tfbquI/AAAAAAAAB1o/CRfH9JAGGJg/s400/DSCN0980.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5437910641716996834" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Forgive the snapshot.&lt;br /&gt;&lt;br /&gt;But is that Red Hat 9 (not FC 9), running 2.4.24? &lt;br /&gt;&lt;br /&gt;Is that my cable box? It's a standard cable company Motorola DVR, which is a lot newer than 2.4.24, but it wouldn't be surprising to see MOT basing an embedded project off an old kernel. Of course, the TV is inherently networked - if not necessarily IP networked - and this kind of abandoned legacy code could be entertaining.&lt;br /&gt;&lt;br /&gt;Of course, it might not be the cable box at all - it might just be some kind of generating equipment gone bad - and the video signal ended up on channel 2000. 
Not sure what is supposed to be there - I wasn't aware there were any 4 digit channels on my system.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2343319991120421619?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2343319991120421619'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2343319991120421619'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2010/02/channel-2000.html' title='Channel 2000'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_wPMwI2SStyw/S3dW6tfbquI/AAAAAAAAB1o/CRfH9JAGGJg/s72-c/DSCN0980.JPG' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-4144266924352827612</id><published>2009-12-05T10:23:00.004-05:00</published><updated>2009-12-05T10:28:06.216-05:00</updated><title type='text'>Transforming Parenting</title><content type='html'>Rocker ran out of batteries way too often.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_wPMwI2SStyw/Sxp7nrjMI1I/AAAAAAAABr4/hdRNRw19u5I/s1600-h/DSCN0732.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 300px; height: 400px;" src="http://3.bp.blogspot.com/_wPMwI2SStyw/Sxp7nrjMI1I/AAAAAAAABr4/hdRNRw19u5I/s400/DSCN0732.JPG" border="0" 
alt=""id="BLOGGER_PHOTO_ID_5411773823874507602" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt; I knew I kept those old DC transformers for a reason.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-4144266924352827612?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4144266924352827612'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4144266924352827612'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/12/transforming-parenting.html' title='Transforming Parenting'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_wPMwI2SStyw/Sxp7nrjMI1I/AAAAAAAABr4/hdRNRw19u5I/s72-c/DSCN0732.JPG' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-869827290966229959</id><published>2009-08-10T15:00:00.002-04:00</published><updated>2009-08-10T15:04:46.937-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='ip'/><category scheme='http://www.blogger.com/atom/ns#' term='virtualization'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><title type='text'>Making the switch</title><content type='html'>Well that's funny. 
&lt;br /&gt;&lt;br /&gt;Just a few hours after my last post, which &lt;a href="http://bitsup.blogspot.com/2009/08/not-switching-contexts.html"&gt;suggested that virtio based networking might be getting bested by the not-in-userspace v-bus,&lt;/a&gt; Michael Tsirkin posts an &lt;a href="https://lists.linux-foundation.org/pipermail/virtualization/2009-August/013525.html"&gt;in-kernel backend to virtio&lt;/a&gt;. Which puts the two on more or less the same procedural footing.&lt;br /&gt;&lt;br /&gt;Fire up the benchmarks?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-869827290966229959?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/869827290966229959'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/869827290966229959'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/08/making-switch.html' title='Making the switch'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-3821849435214390287</id><published>2009-08-10T12:01:00.004-04:00</published><updated>2009-08-10T12:10:06.642-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='ip'/><category scheme='http://www.blogger.com/atom/ns#' term='virtualization'/><category scheme='http://www.blogger.com/atom/ns#' 
term='characterization'/><title type='text'>(not) switching contexts.</title><content type='html'>A lot of bits have been spilled over virtual network performance in v-bus vs virtio-net/virtio-pci (aka alacrity vm vs traditional kvm/qemu). This includes some pretty sensational(ist?) performance graphs: &lt;a href="http://developer.novell.com/wiki/index.php/AlacrityVM#Results"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;There are lots of details (and details do matter) but the first-order issue can probably be summed up thusly, from Avi Kivity on lkml:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;The current conjecture is that ioq outperforms virtio because the host side of ioq is implemented in the host kernel, while the host side of virtio is implemented in userspace.&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Perhaps context switching isn't such a minor detail after all.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-3821849435214390287?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3821849435214390287'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3821849435214390287'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/08/not-switching-contexts.html' title='(not) switching contexts.'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-7746497923896218942</id><published>2009-07-21T17:12:00.004-04:00</published><updated>2009-07-21T17:35:44.277-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='ip'/><category scheme='http://www.blogger.com/atom/ns#' term='virtualization'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Caution - (S)Low Bridge Ahead</title><content type='html'>This post will not be satisfying. Someone has posted some great datapoints about virtualized packet forwarding, which is great. But they don't make a lot of sense. Which is not great. Nor is it satisfying.&lt;br /&gt;&lt;br /&gt;Oh well, I'm sure there will be a followup sometime in the future.&lt;br /&gt;&lt;br /&gt;In &lt;a href="http://marc.info/?l=qemu-devel&amp;m=124809923311179&amp;w=2"&gt;this thread&lt;/a&gt;, Or Gerlitz posts a new networking type for qemu (and by extension kvm), which are of course popular linux host virtualization packages. The networking type is "raw" and the driver couldn't be simpler - a (v)lan interface on the host is opened with an AF_PACKET socket and all of the packets that appear there are shoved through to the guest interface, and vice versa. &lt;br /&gt;&lt;br /&gt;This is a pretty direct way of doing things, but it has the unfortunate side effect that all of the guests and the host itself are aggregated onto one upstream switch port without any kind of bridge, switch, or router in between. 
This means that unless the upstream switch can do a u-turn when forwarding (and most of them will not), all of the guests and the host are isolated from each other. The normal way of doing things is to attach the guests and host together with a tun/tap socket and run a bridge on the host. This bridge will do all the necessary forwarding so that everybody has full connectivity, and it lets you run iptables and ebtables on the host to boot.&lt;br /&gt;&lt;br /&gt;That's all well and good, but the really interesting part was the motivation for running around tun/tap/bridge anyhow: the poster runs a test with short udp transmissions over gige. Running it between two real (non-vm) hosts, he sees 450K packets per second. The post doesn't mention what hardware is involved, so we'll just take it as a black box baseline. With the sender switched to a qemu guest with traditional tap/bridge networking, it plummets to just 195K. The "raw" interface gets that back up to 240K - which is still a far cry from 450K, eh? &lt;br /&gt;&lt;br /&gt;Tap mode has 3 times as many context switches as the raw version. I don't think I saw a number for the non-vm test. 
Other than that nothing, including the profiles, really jumps out.&lt;br /&gt;&lt;br /&gt;The whole thread is worth reading - but the main data points are &lt;a href="http://marc.info/?l=qemu-devel&amp;m=124809923311179&amp;w=2"&gt;here&lt;/a&gt; and &lt;a href="http://marc.info/?l=qemu-devel&amp;m=124815983232649&amp;w=2"&gt;here&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-7746497923896218942?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7746497923896218942'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7746497923896218942'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/07/caution-slow-bridge-ahead.html' title='Caution - (S)Low Bridge Ahead'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2775323495184954884</id><published>2009-07-20T11:51:00.002-04:00</published><updated>2009-07-20T11:59:34.182-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='ip'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category 
scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='google'/><title type='text'>What is eating those Google SYN-ACKs?</title><content type='html'>In &lt;a href="http://bitsup.blogspot.com/2009/07/google-thinks-tcp-should-be-more.html"&gt;this post&lt;/a&gt;, I mentioned google was seeing huge packet loss on syn-acks from their servers. At the time it looked like 2%. That sounded nuts.&lt;br /&gt;&lt;br /&gt;It still sounds nuts.&lt;br /&gt;&lt;br /&gt;Someone else on the mailing list posted about that, and Jerry Chu of Google confirmed it:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Our overall pkt retransmission rate often goes over 1%. I was&lt;br /&gt;wondering if SYN/SYN-ACK pkts are less likely to be dropped&lt;br /&gt;by some routers due to their smaller size so we collected traces&lt;br /&gt;and computed SYN-ACK retransmissions rate on some servers.&lt;br /&gt;We confirmed it to be consistent with the overall pkt drop rate,&lt;br /&gt;i.e., &gt; 1% often.&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;You could imagine why the overall retransmission rate might be higher than the real drop rate due to jitter and various fast retransmit algorithms that might retransmit things that just hadn't been acknowledged quite yet. Even SYNs might be dropped at the host (instead of the network) due to queue overflows and such.. but we're talking about SYN-ACKs from busy servers towards what one would expect would be pretty idle google-searching clients. And these SYN-ACKs have giant timeouts (3 seconds - which is why Jerry was writing in the first place) so it certainly isn't a matter of over-aggressive retransmit. The only explanation seems to be packet loss. At greater than 1%&lt;br /&gt;&lt;br /&gt;wow.&lt;br /&gt;&lt;br /&gt;This probably has more to do with the global nature of google's audience than anything else. But still, TCP can really suck at loss rates that high. 
It must be very different than the desktop Internet I know (which is a fair-to-middlin cable service, not a fancy Fiber-To-The-Home setup which is becoming more common.)&lt;br /&gt;&lt;br /&gt;I wonder exactly where those losses happen.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2775323495184954884?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2775323495184954884'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2775323495184954884'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/07/what-is-eating-those-google-syn-acks.html' title='What is eating those Google SYN-ACKs?'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-4081949920668991580</id><published>2009-07-14T16:45:00.002-04:00</published><updated>2009-07-14T17:00:18.680-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='ip'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Google Thinks 
TCP should be more Aggressive by Default</title><content type='html'>&lt;a href="http://www.ietf.org/mail-archive/web/tcpm/current/msg04707.html"&gt;Really interesting post from Jerry Chu of Google&lt;/a&gt;. He says Google has data which shows that we ought to lower the initial RTO, increase the initial CWND, drop the min RTO, and reduce the delayed ack timeout in TCP.&lt;br /&gt;&lt;br /&gt;Based on my own anecdotal data, I've done stuff like that in products I've worked on. Let's face it - 3 seconds is a freaking eternity. Processors, networks, and busses have all scaled but these constants remain the same. Jerry says Google has data that shows this is important. As the google data set is no doubt much more extensive than any I worked with, that's a really welcome post.&lt;br /&gt;&lt;br /&gt;Probably the most important data point Jerry shares is that "up to a few percentage points" in his data set exhibit a SYN-ACK retransmission from the google servers. Wow. (At least) 1 in 50 syn-acks needs to be retransmitted? That's not my experience at all, and if true on google scale it is absolutely fascinating. Are they generally seeing 2% packet loss on google tx? There's no way that they are seeing that.. google would appear to suck! So what's going on...? Why is syn-ack rexmitted more than anything else? 
(and I'm assuming they are indeed lost, because otherwise lowering the timeout wouldn't be the right remedy..)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-4081949920668991580?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4081949920668991580'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4081949920668991580'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/07/google-thinks-tcp-should-be-more.html' title='Google Thinks TCP should be more Aggressive by Default'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-4570401618857352787</id><published>2009-06-28T11:37:00.003-04:00</published><updated>2009-06-28T11:43:50.270-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='ip'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><title type='text'>'Violation' is so prejorative</title><content type='html'>&lt;span style="font-style:italic;"&gt;Ben Hutchings says:&lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;&gt;[...] 
we also have architectural issues in violating layered&lt;br /&gt;&gt; software design&lt;br /&gt;&lt;br /&gt;Meanwhile, in the real world, we want to avoid copying data, so an skb doesn't belong to any specific protocol layer.&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Thank You Ben!&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Abstractions aren't inherently good. They are great if they help you build or maintain things that are otherwise too complex to understand or too tedious to work on - but we have to vigilantly remember that losing those details sometimes also restricts the quality of what we can build, as some things are just inherently complex.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-4570401618857352787?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4570401618857352787'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4570401618857352787'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/06/ben-hutchings-says-software-design.html' title='&apos;Violation&apos; is so prejorative'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-7047423786525923776</id><published>2009-06-01T11:04:00.005-04:00</published><updated>2009-06-01T11:44:49.838-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='startups'/><title type='text'>Software - Creative Economy, Blue Collar, or 
Just Rearranging Bits?</title><content type='html'>Via Ezra Klein, here is a really interesting bit from this Sunday's New York Times magazine:&lt;br /&gt;&lt;a href="http://www.nytimes.com/2009/05/24/magazine/24labor-t.html"&gt;The Case for Working with Your Hands&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I have often felt that as a software engineer the work I do is not substantively different than the work done by carpenters, architects, doctors, or in the case of the article above mechanics. For all of us, the most challenging work we do on a regular basis is that of troubleshooting. Sure, on a handful of occasions I have had an opportunity to contribute a very high value insight to the construction of something new. On a couple of occasions I have even come up with an idea that enabled something that hadn't been done before. But generally, being a software "architect" involves making design choices from a bunch of well understood techniques, making measurements so you understand the problem space, and weaving the two together. Much of the job involves broadening your understanding of those choices and keeping up with the state of the art. The better you are at that, the better an "architect" you are. Writing code is a similar deal. It requires a whole lot of background, and a lot of diligence, but it is no more or less insightful than any of the other trades I mentioned. The quality of the code tends to correlate directly with the background and the diligence of its author. It is a very skillful occupation, but I question just how creative it is. &lt;br /&gt;&lt;br /&gt;But solving a good bug - well that's really the litmus test of engineering skill for me. The article I cite above is really about the mental satisfaction of working on motorcycle engine bugs. The details are different, but the process is not. Each is proof of creativity, knowledge, and critical thinking. 
Of my favorite 10 software engineering experiences, at least 7 involve the resolution of an inscrutable bug. It can bring together the most unexpected sets of facts and insights and leave you at the end of the day (or week, or month) with a sense of satisfaction that little else can professionally do.&lt;br /&gt;&lt;br /&gt;A grand design is indeed grand. But most powerpoint architectures are not worth much more than the bits that hold them together. The value is in a robust working implementation. I think we vastly undervalue that as a society - and that's true of software, carpentry, architecture, and medicine alike. Every once in a while a truly unique thought is illustrated in a powerpoint or an academic setting and I don't mean to undervalue the importance of a breakthrough idea. I am just saying that far too often we give the benefit of the doubt to design expressions and at the same time we don't value nearly enough the insight it takes to align all the details and make something run (be it software, an engine, your body, or a building).&lt;br /&gt;&lt;br /&gt;To borrow from the article, where the author is discussing his job creating magazine article abstracts:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;You might wonder: Wasn’t there any quality control? My supervisor would periodically read a few of my abstracts, and I was sometimes corrected and told not to begin an abstract with a dependent clause. But I was never confronted with an abstract I had written and told that it did not adequately reflect the article. The quality standards were the generic ones of grammar, which could be applied without my supervisor having to read the article at hand. Rather, my supervisor and I both were held to a metric that was conjured by someone remote from the work process — an absentee decision maker armed with a (putatively) profit-maximizing calculus, one that took no account of the intrinsic nature of the job. 
I wonder whether the resulting perversity really made for maximum profits in the long term. Corporate managers are not, after all, the owners of the businesses they run.&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;You see it time and time again when a business tries to scale up by "throwing resources" at a problem. As the real work becomes more abstract to the operators of the business, the quality of the work invariably declines. I would suggest the value of powerpoint architecture in such organizations rises at the same time.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-7047423786525923776?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7047423786525923776'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7047423786525923776'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/06/software-creative-economy-blue-collar.html' title='Software - Creative Economy, Blue Collar, or Just Rearranging Bits?'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-5006052961167052785</id><published>2009-05-26T11:59:00.000-04:00</published><updated>2009-05-27T13:21:50.998-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='vonage'/><category scheme='http://www.blogger.com/atom/ns#' term='voip'/><category scheme='http://www.blogger.com/atom/ns#' term='voipreocrder'/><category scheme='http://www.blogger.com/atom/ns#' 
term='caller-id'/><title type='text'>VOIP Recorder: Phonebook.. aka the "Mom is calling" feature</title><content type='html'>I am continuing to add little features to VOIP Recorder that help round out the overall functionality.&lt;br /&gt;&lt;br /&gt;The newest feature to join the party is a phonebook database. The entries in this database are automatically populated from Caller-ID information. They are designed to be easily edited in order to personalize the names associated with particular numbers.&lt;br /&gt;&lt;br /&gt;After you personalize a number, the new name is used for the pop-ups and logs anytime that number calls (or is called). The obvious use for this is to rename "Jane Smith" to be "Mom" so that when Mom does call, it is noted immediately!&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_wPMwI2SStyw/ShQp2SogKCI/AAAAAAAAAzY/eKdWsBmYMas/s1600-h/phonebook.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 107px;" src="http://4.bp.blogspot.com/_wPMwI2SStyw/ShQp2SogKCI/AAAAAAAAAzY/eKdWsBmYMas/s320/phonebook.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5337937471032272930" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The phonebook feature is in revision "o" of the VOIP Recorder Preview. It is accessed through the Caller-ID tab of VR's web console.&lt;br /&gt;&lt;br /&gt;VOIP Recorder lets you record, block, and manage calls made with the Vonage &amp;trade; service. 
Check it out at &lt;a href="http://www.penbaynetworks.com/"&gt;www.penbaynetworks.com&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-5006052961167052785?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5006052961167052785'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5006052961167052785'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/05/voip-recorder-phonebook-aka-mom-is.html' title='VOIP Recorder: Phonebook.. aka the &quot;Mom is calling&quot; feature'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_wPMwI2SStyw/ShQp2SogKCI/AAAAAAAAAzY/eKdWsBmYMas/s72-c/phonebook.png' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2793048163945169315</id><published>2009-05-11T09:45:00.006-04:00</published><updated>2009-05-12T14:47:43.145-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='vonage'/><category scheme='http://www.blogger.com/atom/ns#' term='voip'/><category scheme='http://www.blogger.com/atom/ns#' term='block'/><category scheme='http://www.blogger.com/atom/ns#' term='voipreocrder'/><category scheme='http://www.blogger.com/atom/ns#' term='caller-id'/><title type='text'>VOIP Recorder: Filter Anonymous Calls</title><content type='html'>I released a fun new feature for VOIP Recorder today: filters based on anonymous calls. 
Just set the calling number to be "anonymous" and you can block anonymous calls without ever ringing the phone. They will go to voice mail instead. You can of course use the filter to toggle the default record/do-not-record status as well.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_wPMwI2SStyw/SggtXZn2L1I/AAAAAAAAAyw/UsbW8pOZRBE/s1600-h/anon-block-filter.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 54px;" src="http://2.bp.blogspot.com/_wPMwI2SStyw/SggtXZn2L1I/AAAAAAAAAyw/UsbW8pOZRBE/s400/anon-block-filter.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5334563638658608978" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Filters have always worked on any Caller-ID based name or number, and now they essentially work on the absence of a number as well.&lt;br /&gt;&lt;br /&gt;Anonymous call blocking is in revision N of the VOIP Recorder preview. VR makes more out of your Vonage&amp;trade; service. 
Check it out at http://www.penbaynetworks.com/&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2793048163945169315?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2793048163945169315'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2793048163945169315'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/05/voip-recorder-filter-anonymous-calls.html' title='VOIP Recorder: Filter Anonymous Calls'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_wPMwI2SStyw/SggtXZn2L1I/AAAAAAAAAyw/UsbW8pOZRBE/s72-c/anon-block-filter.png' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-4697386706651730267</id><published>2009-05-01T23:46:00.007-04:00</published><updated>2009-05-04T09:54:42.461-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='vonage'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='voip'/><category scheme='http://www.blogger.com/atom/ns#' term='voipreocrder'/><title type='text'>VOIP Recorder: Listen Live</title><content type='html'>I've had the opportunity to add a few new features to my Vonage call recording application, &lt;a href="http://www.penbaynetworks.com/"&gt;VOIP Recorder&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The most entertaining feature is "Listen Live". 
That will stream the audio from any active phone call to your desktop in more or less real time. That's neat.&lt;br /&gt;&lt;br /&gt;I have also added easy buttons on the "at-a-glance" screen to toggle the recording of an individual call on or off. These buttons complement the touch tone sequences or Caller-ID based programmable filters that provide similar functionality.&lt;br /&gt;&lt;br /&gt;Feedback on the first preview release has started to come in. Generally, it has been quite positive. A few people had trouble with the auto-discover portion of the program. I have made some updates to those algorithms to deal with more topologies and it seems more robust now. If you tried out VOIP Recorder earlier, and had problems auto-discovering your ATA, try downloading the new release (revision 1-M or greater) and see if that helps. All accounts have been updated with the new release. If you have a problem please be sure to write me so we can make VOIP Recorder even better.&lt;br /&gt;&lt;br /&gt;Also, thanks to an idea from Steve, I have added optional courtesy beeps. These are short beeps played periodically to remind everyone about the call recording. You can configure whether they are played and, if so, how often they are played. They are off by default. I like the way they sound - they make a nice alternative to the full "recording" announcement insertion.&lt;br /&gt;&lt;br /&gt;Last in the new feature department is the addition of a simple "*" filter which matches everything. This lets you write filters that, for instance, whitelist some specific phone numbers but block everyone else. Thanks to Chad for pointing out that omission.&lt;br /&gt;&lt;br /&gt;So there is lots going on in the world of VOIP Recorder. 
You should check out the new release at http://www.penbaynetworks.com/ - Linux, Macs, and Windows are all supported for recording calls made with Vonage(tm), as well as orchestrating pop-up notifications and call blocking based on Caller-ID information.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-4697386706651730267?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4697386706651730267'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4697386706651730267'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/05/voip-recorder-listen-live.html' title='VOIP Recorder: Listen Live'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-7230334334457643833</id><published>2009-04-19T09:58:00.003-04:00</published><updated>2009-04-19T11:46:24.291-04:00</updated><title type='text'>device_create() and the linux shifting API</title><content type='html'>The kernel API for device_create() in 2.6.26 and previous versions was:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;extern struct device *device_create(struct class *cls, struct device *parent,&lt;br /&gt;                                   dev_t devt, const char *fmt, ...)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;and starting in 2.6.27 it changed to:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;struct device *device_create(struct class *cls, struct device *parent,&lt;br /&gt;                
             dev_t devt, void *drvdata,&lt;br /&gt;                             const char *fmt, ...)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Note the insertion of a new argument, which brings the named argument count to five. In this case it is a void * at the fourth position in a function that takes a ... argument list.&lt;br /&gt;&lt;br /&gt;This is more dangerous than the usual unstable evolutions in kernel APIs in that legacy code may continue to compile &lt;b&gt;without warning&lt;/b&gt; on newer kernels, but it will of course crash: the old formatting string is silently accepted as the void * drvdata, and the first argument that was intended for the formatting string is now treated as the formatting string itself.&lt;br /&gt;&lt;br /&gt;Some code is going to live out of the tree and trip over this. And some code is always going to live out of the tree - if for no other reason than the folks who control the commits have to (and should!) make judgments on what is appropriate, but of course other folks will disagree and carry on with their work. TCP Offload Engines are a good example of that kind of diversity. &lt;br /&gt;&lt;br /&gt;Given that, I wonder what the reason was for reusing the device_create() name across two incompatible versions of that function. There actually was an interim version of the new function called device_create_drvdata() that was used to migrate all of the in-tree uses over to the new style. 
At the end, all the drvdata() versions were renamed back to device_create(), where a safer path would seem to have been to simply remove device_create() altogether to avoid confusion.&lt;br /&gt;&lt;br /&gt;Oh well, it's not a big deal - but maybe this post will serve as google bait to help someone else resolve the issue more quickly than I could.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-7230334334457643833?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7230334334457643833'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7230334334457643833'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/04/devicecreate-and-linux-shifting-api.html' title='device_create() and the linux shifting API'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2350119042532340134</id><published>2009-04-16T15:52:00.005-04:00</published><updated>2009-04-16T16:22:47.103-04:00</updated><title type='text'>Recording calls made with Vonage</title><content type='html'>I am looking for early adopters (isn't that a nice euphemism for tester?) for a new project I have been working on: VOIP Recorder.&lt;br /&gt;&lt;br /&gt;VOIP Recorder is desktop software (available for Windows, Mac, and Linux) that records normal Vonage calls without any special configuration. 
Just run it on the same LAN as your Vonage ATA and VR will redirect the VOIP calls through your desktop where it can make a copy. Playback and archive management are through an embedded web interface.&lt;br /&gt;&lt;br /&gt;Read all about it and register for a free download at &lt;a href="http://www.penbaynetworks.com/"&gt;http://www.penbaynetworks.com/&lt;br /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;VR has other features too: pop-ups with Caller-ID info, optional insertion of announcements, touch-tone based triggers, Caller-ID based call blocking, voicemail tracking, and more.&lt;br /&gt;&lt;br /&gt;&lt;div align=center&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_wPMwI2SStyw/SeeQA_PdDpI/AAAAAAAAAqc/N7uWV_gCxgc/s1600-h/popup.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 106px;" src="http://2.bp.blogspot.com/_wPMwI2SStyw/SeeQA_PdDpI/AAAAAAAAAqc/N7uWV_gCxgc/s400/popup.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5325383431039553170" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;small&gt;&lt;b&gt;An Example Caller-ID Popup&lt;/b&gt;&lt;/small&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2350119042532340134?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2350119042532340134'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2350119042532340134'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/04/recording-calls-made-with-vonage.html' title='Recording calls made with Vonage'/><author><name>Patrick 
McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_wPMwI2SStyw/SeeQA_PdDpI/AAAAAAAAAqc/N7uWV_gCxgc/s72-c/popup.png' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-4128585251156088716</id><published>2009-02-06T19:20:00.005-05:00</published><updated>2009-02-10T10:12:43.384-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Increasing Upload Speed from Firefox on Windows</title><content type='html'>Sometimes bugs are more interesting to work on than features - they have that mysterious quality about them and give a satisfying feeling when you figure it out.&lt;br /&gt;&lt;br /&gt;&lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=454990"&gt;This one&lt;/a&gt; was brought to my attention by &lt;a href="http://starkravingfinkle.org/blog/"&gt;Mark Finkle&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;It basically boiled down to HTTP POSTs from Firefox on Windows being slower than they are in Internet Explorer, and also slower than they are in Firefox on OS X or Linux. 
(IE on Windows and the non-Windows platforms all perform about the same, with FF on Windows lagging behind).&lt;br /&gt;&lt;br /&gt;The culprit turned out to be the TCP send window. Firefox never had more than 8KB of un-acked data outstanding. If you have a network path with a high bandwidth-delay product, that isn't going to cut it.&lt;br /&gt;&lt;br /&gt;Windows (&lt;a href="http://technet.microsoft.com/en-us/library/cc781532.aspx"&gt;up to and including Vista&lt;/a&gt;) has an 8KB default sending window. Or so I found out thanks to Google. &lt;br /&gt;&lt;br /&gt;Autotuning that buffer size is standard practice on OS X and Linux and has been for a long while. Vista autotunes the receive buffer (but not XP according to what I read), but the send buffer is a small fixed value. IE, realizing that it's a web 2.0 kinda world out there full of User Generated Content, must increase that value from its default - because I can look at the IE tcpdump traces and see &gt;80KB of un-acknowledged data (there might be more, but at whatever value they have it set to, the max window size is no longer the limit) in the same way I do with a trace of Firefox on Linux. &lt;br /&gt;&lt;br /&gt;The Linux default is 128KB for any reasonably modern machine.&lt;br /&gt;&lt;br /&gt;Fortunately that can be controlled on a system wide basis through a registry preference, or on a per socket basis by setting SO_SNDBUF. I &lt;a href="https://bug454990.bugzilla.mozilla.org/attachment.cgi?id=360988"&gt;submitted a patch&lt;/a&gt; that does the latter if the network.tcp.sendbuffer preference is set - the patch also sets the pref for Windows.&lt;br /&gt;&lt;br /&gt;If you suffer from this, I see three options:&lt;br /&gt;1] Wait until my patch (or a later rev of it) ends up in an official build&lt;br /&gt;2] Set the registry property to change it for your whole Windows install - &lt;a href="http://support.microsoft.com/kb/950326"&gt;KB 950326&lt;/a&gt;. 
&lt;br /&gt;3] you might be able to build a k3wl binary add-on that does the same thing as my patch in a crazy way. Fame, fortune, and faster flickr and picasa uploads await you.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-4128585251156088716?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4128585251156088716'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4128585251156088716'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/02/increasing-upload-speed-from-firefox-on.html' title='Increasing Upload Speed from Firefox on Windows'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-594459878520008906</id><published>2009-01-31T17:16:00.003-05:00</published><updated>2009-01-31T17:31:40.151-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='weather'/><title type='text'>Winter Antagonists: A Short Story in Pictures</title><content type='html'>&lt;table style="width:auto;"&gt;&lt;tr&gt;&lt;td&gt;&lt;a href="http://picasaweb.google.com/lh/photo/sLSBOltekw6agEsrwohIcg?feat=embedwebsite"&gt;&lt;img src="http://lh6.ggpht.com/_wPMwI2SStyw/SYTNhTwQ73I/AAAAAAAAAl0/pPYLqya18zQ/s400/DSCN0090.JPG" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align=center&gt;&lt;h3&gt;Snow&lt;/h3&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;&lt;table style="width:auto;"&gt;&lt;tr&gt;&lt;td&gt;&lt;a 
href="http://picasaweb.google.com/lh/photo/qlVwrHZMh0purA1B7fnxsQ?feat=embedwebsite"&gt;&lt;img src="http://lh3.ggpht.com/_wPMwI2SStyw/SYTNgRRFqdI/AAAAAAAAAls/v8oLjEuB0Eo/s400/DSCN0087.JPG" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align=center&gt;&lt;h3&gt;Cold&lt;/h3&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-594459878520008906?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/594459878520008906'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/594459878520008906'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/01/winter-antagonists-short-story-in.html' title='Winter Antagonists: A Short Story in Pictures'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh6.ggpht.com/_wPMwI2SStyw/SYTNhTwQ73I/AAAAAAAAAl0/pPYLqya18zQ/s72-c/DSCN0090.JPG' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-5536737303904805360</id><published>2009-01-30T14:11:00.011-05:00</published><updated>2009-04-19T09:58:30.349-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='os x'/><category scheme='http://www.blogger.com/atom/ns#' term='voip'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><title type='text'>Getting Vonage Caller-ID display 
notifications on Linux &amp; Mac without a soft phone</title><content type='html'>(Update - April 2009: See also http://bitsup.blogspot.com/2009/04/recording-calls-made-with-vonage.html and http://www.penbaynetworks.com/ for a one-stop answer to this problem on windows, mac, and linux) &lt;br /&gt;&lt;br /&gt;I use vonage. What they really sell you is a POTS&lt;-VOIP-&gt;POTS tunnel where they provide you one of the POTS/VOIP bridges that you install in your house in order to bring your old traditional phones on line. They also sell a soft-phone that does not include this bridge, but that isn't what I use.&lt;br /&gt;&lt;br /&gt;It's a good service - unmetered calling for the places I call, and it comes with a bunch of phone features for a flat $28/month. The VOIP bits are done with SIP the usual way.&lt;br /&gt;&lt;br /&gt;So that's lovely, but by default it doesn't provide any access to the SIP data beyond the POTS bridge and that presents a challenge to unlocking your data.&lt;br /&gt;&lt;br /&gt;What I would appreciate would be desktop display notifications of the caller id data when the phone rings. 
This is pretty standard stuff when dealing with soft phones, but it seems to be a bit trickier in the vonage case.&lt;br /&gt;&lt;br /&gt;So I rolled my own for KDE4 and OS X, which are the screens I spend my time staring at.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_wPMwI2SStyw/SYNjK60WhOI/AAAAAAAAAk8/4UhbaQdNdoU/s1600-h/cid-kde-ss1.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 326px; height: 105px;" src="http://3.bp.blogspot.com/_wPMwI2SStyw/SYNjK60WhOI/AAAAAAAAAk8/4UhbaQdNdoU/s400/cid-kde-ss1.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5297186625956512994" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Step 1: Find the SIP invitations.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The SIP traffic is UDP unicast to the vonage "router". If you install the router (in my case a Motorola VT2142) doing double duty as your broadband gateway router, then it will consume those packets without ever sending them onto your LAN. If they're not on the LAN, then you can't really capture them and display the precious info inside, so a different arrangement is required.&lt;br /&gt;&lt;br /&gt;I put the vonage box behind a Linux bridge. The bridge is just a linux box (in this case my file, email, and print server) with 2 interfaces. Those interfaces don't have IP addresses; instead they are brought together into a logical interface commonly called br0. Do this as: "brctl addbr br0; brctl addif br0 eth0; brctl addif br0 eth1". Once you have done that, the machine will act like an ethernet switch, forwarding packets between interfaces as necessary. You could set it up as an IP router instead, but then you would need different subnets and all manner of other duplicated architecture. The bridge is fine. 
The server doesn't need an IP address to be a bridge, but it does need one in order to keep doing those file/print things, so I just ran dhcp as normal on the new br0 interface. Now if you run tcpdump on eth1 (or more specifically the interface "behind" the bridge with the vonage device) you will see the vonage traffic crossing the bridge. Reading that data it is easy to see my SIP control runs on UDP port 10000. I hear other routers typically use port 5061.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Step 2: Capture those invitations&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now that you've got access to the SIP data, let's do something with it. I used the NFQUEUE iptables interface. NFQUEUE lets you shunt packets to userspace for filtering while they are still in the network stack. I wrote a simple iptables rule that matches data coming into port 10000 and places those packets into queue number 5061 for consumption by a userspace program: "/sbin/iptables -A FORWARD --protocol udp --dport 10000 -j NFQUEUE --queue-num 5061 -d 192.168.16.0/24"&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Step 3: Process the invitations and generate network notifications&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I wrote a little C program that runs on the bridge and consumes the packets in the NFQUEUE. For each packet it tries to figure out whether this is a SIP invitation and, if it is, what the caller id info is. All packets are acknowledged back to netfilter/iptables so they are passed onto the vonage router (which is what makes the phone ring!). 
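&lt;br /&gt;&lt;br /&gt;The parsing step described above can be sketched roughly as follows. This is only an illustration in Python, not the C program itself: the function name caller_id_from_sip and the regular expressions are hypothetical, and it assumes the UDP payload has already been pulled out of the queue and that the caller id lives in the From header of the INVITE.

```python
import re

# Hypothetical helper names; the real tool is a C program reading from NFQUEUE.
# "f" is the compact form of the From header name.
_FROM_HEADER = re.compile(r'^(?:From|f)[ \t]*:[ \t]*(.*)$', re.IGNORECASE | re.MULTILINE)
_DISPLAY_NAME = re.compile(r'"([^"]*)"')
_SIP_USER = re.compile(r'sips?:([^@;\s]+)@')


def caller_id_from_sip(payload):
    """Return (name, number) for a SIP INVITE payload, or None for anything else."""
    text = payload.decode("utf-8", errors="replace")
    # An invitation is a request whose first line starts with the INVITE method.
    if not text.startswith("INVITE "):
        return None
    header = _FROM_HEADER.search(text)
    if header is None:
        return None
    value = header.group(1).strip()
    name = _DISPLAY_NAME.search(value)   # quoted display name, if present
    user = _SIP_USER.search(value)       # user part of the sip: URI
    return (name.group(1) if name else "", user.group(1) if user else "")
```

Anything that is not an INVITE yields None, so a caller could simply accept such packets untouched.&lt;br /&gt;&lt;br /&gt;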
If you wanted to do some automatic call blocking, this would be a good place to just drop the invite on the floor and then the phone would never ring.&lt;br /&gt;&lt;br /&gt;The producer-nfqueue program is &lt;a href="http://www.ducksong.com/sip-caller-id-display-for-linux-and-mac.php"&gt;available here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;If a piece of caller-id info is found it is broadcast to the local LAN in two different formats. The first format contains just a magic number to identify the format and the caller id strings. It is sent on UDP port 7651. The second broadcast is in Growl network format. &lt;a href="http://growl.info/"&gt;Growl&lt;/a&gt; is a daemon commonly used on mac OS X to display system notifications. Anybody running growl with "listen for incoming notifications" and "allow remote application registration" enabled will see a popup as soon as this broadcast takes place.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_wPMwI2SStyw/SYNkSzftFPI/AAAAAAAAAlE/t3-gy6vzGAw/s1600-h/cid-osx-ss1.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 114px;" src="http://2.bp.blogspot.com/_wPMwI2SStyw/SYNkSzftFPI/AAAAAAAAAlE/t3-gy6vzGAw/s400/cid-osx-ss1.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5297187860941444338" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Step 4: KDE applet.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;On my linux KDE4 environment, I wrote a kapplet that used a QSystemTrayIcon overload to listen for the port 7651 broadcasts. The effect is nice, but I would have rather had something gnome/kde cross platform. From doing some reading it appears I can inject something into dbus and knotify4 will pop it up as will gnotify, but I couldn't get that to work easily. 
It would also be a potential signal to things like pulseaudio to turn down the volume. oh well, maybe next version. The applet is &lt;a href="http://www.ducksong.com/sip-caller-id-display-for-linux-and-mac.php"&gt;available here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;and now I can be lazy and find out that the ringing phone isn't one I want to answer without having to break my train of thought. mission accomplished?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-5536737303904805360?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5536737303904805360'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5536737303904805360'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2009/01/getting-vonage-caller-id-display.html' title='Getting Vonage Caller-ID display notifications on Linux &amp;amp; Mac without a soft phone'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_wPMwI2SStyw/SYNjK60WhOI/AAAAAAAAAk8/4UhbaQdNdoU/s72-c/cid-kde-ss1.png' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-8032062093738410734</id><published>2008-11-24T09:35:00.004-05:00</published><updated>2008-11-24T09:48:21.032-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='hardware'/><title type='text'>Dallas 
One-Wire Temperature Network - Followup</title><content type='html'>I managed to get the little 5-volt 1 wire network I &lt;a href="http://bitsup.blogspot.com/2008/11/one-wire-home-temperature-network-with.html"&gt;mentioned earlier&lt;/a&gt;, built up and running. Even fishing the wires wasn't too hard, "thanks" to the distinct lack of insulation in some of my walls. I did have to patch a section of the main run (I used a 100 ft run with short stubs to hold the sensors) when I put a staple through the cable while attaching it to the rafters. Doh!&lt;br /&gt;&lt;br /&gt;The graphs make it look colder inside than it really is as I purposely put the sensors in all the cold spots. The kitchen has a zoned radiator that we can use if we are hanging out there, and the wild swings in my office are just the result of me closing the door when I'm not in there. The dining room is the warmest and I will eventually add a sensor there to put an upper bound on the data.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.ducksong.com/misc/tempmon/weekly.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://www.ducksong.com/misc/tempmon/weekly.png" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I didn't really like any of the pre-canned software options for it, so I rolled my own using xmgrace, digitemp, rsync, and cron. 
This is pretty crude, but it is a decent placeholder.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-8032062093738410734?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8032062093738410734'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8032062093738410734'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/11/dallas-one-wire-temperature-network.html' title='Dallas One-Wire Temperature Network - Followup'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-3788488029318090188</id><published>2008-11-10T19:45:00.012-05:00</published><updated>2008-11-10T23:39:21.230-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hardware'/><title type='text'>One-Wire Home Temperature Network with Linux - a prototype</title><content type='html'>So, I live in Maine. The locals say, quite correctly, that it's the way life oughta be. The way we keep massive hordes away from our paradise is to make it kinda chilly 6 months of the year. 
The fact that I live in a circa 1830 farmhouse is a constant reminder of that.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_wPMwI2SStyw/SRjXiA_CfuI/AAAAAAAAAdg/zQkib6asCqg/s1600-h/DSCN2357.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_wPMwI2SStyw/SRjXiA_CfuI/AAAAAAAAAdg/zQkib6asCqg/s320/DSCN2357.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5267196743589723874" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Last winter I froze a kitchen water pipe in the basement. It was due to a gap in the foundation. I patched up the hole and insulated the pipe. (Oddly I inherited 90% of the water pipes insulated, but not this stretch.) The night it froze was not the coldest of the year (it was probably 20 above the coldest mark), but there was a strong wind and no snow on the ground so the cold air just swept through the gap onto the pipe. The rest of the winter passed without further incident.&lt;br /&gt;&lt;br /&gt;I put a thermometer in the vicinity, but frankly the basement is kinda dark and creepy and you don't go venturing into the back in the winter without a reason. So the thermometer wasn't all that useful. The sensible thing to do would be to put a $20 wireless thermometer (e.g. &lt;a href="http://www.amazon.com/gp/product/B0013BNJV2?ie=UTF8&amp;tag=wwwducksongco-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=B0013BNJV2"&gt;AcuRite Digital Wireless Weather Thermometer Indoor Outdoor&lt;/a&gt;&lt;img src="http://www.assoc-amazon.com/e/ir?t=wwwducksongco-20&amp;l=as2&amp;o=1&amp;a=B0013BNJV2" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /&gt;) down there with the readout in the kitchen. 
&lt;br /&gt;&lt;br /&gt;But why do that, when we could go for geek overkill?&lt;br /&gt;&lt;br /&gt;The wireless thermometer has some downsides: the receiver is clutter for something rarely used, I still have to remember to look at it, it has no log and the most interesting data is when I am sleeping, it only measures one place per piece of clutter, it lacks an alarm facility, and it needs batteries that inevitably will die in situ.&lt;br /&gt;&lt;br /&gt;Clearly this is a job for a $500 computer instead, right?&lt;br /&gt;&lt;br /&gt;It appears this is generally done with a "one-wire" network. 1-wire is a Dallas Semiconductor standard for simple devices that can be powered and read with one wire. You daisy chain them together, typically using one wire from a piece of cat-5. You really need 2 wires; the second is for ground.&lt;br /&gt;&lt;br /&gt;This is really neat stuff. There are sensors for temp, humidity, pressure, and even ground water for your garden. A bunch of places sell this stuff in kits, where the sensors come in a little box with an rj-45 "in" and an "out" if you want to chain another sensor off of it. Or you can just buy the sensors for a lot less and solder the leads onto the ground and data wires wherever you want a reading. Each one has a serial number it reports as part of the 1-wire protocol so you can tell them apart, and it's pretty easy to auto explore the chain and then power each sensor individually to get a reading (don't read them simultaneously). &lt;br /&gt;&lt;br /&gt;The temperature gadgets, completely assembled with rj-45 connectors and little cases, can run $25 each. But the sensors themselves &lt;a href="http://www.hobby-boards.com/catalog/product_info.php?cPath=26&amp;products_id=93"&gt;are just $4 or less&lt;/a&gt; in single quantities. (Buy them by the thousand and you can get them for a buck.) 
I bought just the sensors.&lt;br /&gt;&lt;br /&gt;In addition to the sensors, you also need a driver circuit to drive the power, do the polling, etc. There are several &lt;a href="http://www.maxim-ic.com/appnotes.cfm/an_pk/244"&gt;schematics for building them&lt;/a&gt;, but I gave into my &lt;span style="font-weight:bold;"&gt;software&lt;/span&gt; engineering side and bought a &lt;a href="http://www.hobby-boards.com/catalog/product_info.php?cPath=23&amp;products_id=1503"&gt;prebuilt one wire usb interface&lt;/a&gt;, for $28.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_wPMwI2SStyw/SRkHn9FAgPI/AAAAAAAAAdo/FHjcqZsswSg/s1600-h/DSCN0022.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://3.bp.blogspot.com/_wPMwI2SStyw/SRkHn9FAgPI/AAAAAAAAAdo/FHjcqZsswSg/s320/DSCN0022.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5267249622178365682" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;My primary interest is in the basement, but as long as I'm stringing a cable along the rafters I might as well measure a few different points. So I'll grab my office, the basement, the dining room, and the kitchen. I might stick the final sensor outside and cover it in shrinkwrap so there is a "control" number to compare the others to. We are heating this season with a pellet stove instead of central heat, so this information ought to help determine the effectiveness of the various fan placements I'm considering.&lt;br /&gt;&lt;br /&gt;Once everything is in place the data can be captured a myriad of ways. The most common are the one wire filesystem, or by using digitemp. (google bait: if digitemp returns CRC errors, clear the configuration file it saves - this cost me hours of resoldering connections that were just fine.) 
After that &lt;a href="http://digitemp.com/"&gt;normal linux graphing software can go to town&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I need to order a couple more sensors to lay out the final network - I wanted to build a prototype first. It was easy enough.&lt;br /&gt;&lt;br /&gt;To do it you'll need: an RJ-11 crimping tool (not rj-45 for ethernet), an rj-11 end, some cat-5 or cat-3 that is long enough to run your network, and a soldering tool.&lt;br /&gt;&lt;br /&gt;The way I wired it, only the middle two wires matter for the crimp. I have blue/white on the left and blue on the right as viewed with the clip down and the contacts away from you. We're going to use white for the signal and blue for the ground. Attach one of the sensors to the end of the cable. The signal pin is the middle one, and the ground pin is on the left. (Left is defined with the flat side of the sensor facing up and the leads pointed down.) The right lead is not used; you can trim it off (I've just bent it out of place in this picture).&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_wPMwI2SStyw/SRkIa8fdurI/AAAAAAAAAdw/5JLBsPd_pIg/s1600-h/DSCN0023.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://4.bp.blogspot.com/_wPMwI2SStyw/SRkIa8fdurI/AAAAAAAAAdw/5JLBsPd_pIg/s320/DSCN0023.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5267250498194225842" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;With the end sensor in place, you can add as many more as you like along the line by just stripping the insulation in place and soldering the leads right onto the cable wherever they need to go. 
This 1-wire stuff is extremely forgiving of my attempts to pretend that I know what I'm doing with electronics hardware.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_wPMwI2SStyw/SRkIog5IpbI/AAAAAAAAAd4/QqMTuKJOgfk/s1600-h/DSCN0024.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://2.bp.blogspot.com/_wPMwI2SStyw/SRkIog5IpbI/AAAAAAAAAd4/QqMTuKJOgfk/s320/DSCN0024.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5267250731303871922" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Obviously you need to tape up or shrink wrap all the joints; I left them open on the prototype for photos. With this in place, &lt;span style="font-style:italic;"&gt;digitemp -a -r 800&lt;/span&gt; happily reports two different sensors within half a degree of each other. Huzzah!&lt;br /&gt;&lt;br /&gt;Now it's off to grab the sensors I need for the real thing, installing the cable in the basement along with the other probe points, and getting a graph and alarm server going on the IP network. 
Such fun!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-3788488029318090188?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3788488029318090188'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3788488029318090188'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/11/one-wire-home-temperature-network-with.html' title='One-Wire Home Temperature Network with Linux - a prototype'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_wPMwI2SStyw/SRjXiA_CfuI/AAAAAAAAAdg/zQkib6asCqg/s72-c/DSCN2357.JPG' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-5776093031489324400</id><published>2008-11-08T16:05:00.007-05:00</published><updated>2008-11-10T14:56:15.371-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='dns'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>DNS Prefetching for Firefox</title><content type='html'>Recently I implemented &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=453403"&gt;patches&lt;/a&gt; to implement DNS 
prefetching in Firefox. I am primarily interested in their impact on Fennec (aka Mobile Firefox), but it looks like they will land first in Firefox 3.1 beta2. The, hopefully final, glitches are being shaken out of the patch now.&lt;br /&gt;&lt;br /&gt;Google Chrome has a &lt;a href="http://blog.chromium.org/2008/09/dns-prefetching-or-pre-resolving.html"&gt;feature like this&lt;/a&gt; too.&lt;br /&gt;&lt;br /&gt;DNS resolutions are always dominated by latency instead of bandwidth. Particularly on mobile networks the latencies are very high. That makes them perfect candidates for speculative pre-fetching. The advantage is in the latency improvement - instead of waiting for a hostname lookup when you click a link, do that lookup while you're reading the page the link is embedded in. Because the lookups are so small (generally one runt packet in and out) the cost of any wasted over-optimistic lookups really doesn't impact the performance of browsing. Good payoff at low cost, the best of both worlds.&lt;br /&gt;&lt;br /&gt;The basic benefit is simple: if you click on a link using a new hostname, you save a round trip time. On some networks this can be a substantial improvement (800ms or more) in responsiveness. &lt;a href="http://chrome-hacks.net/2008/09/09/google-chrome-does-dns-prefetching-for-faster-browsing/"&gt;Some describe this simply &lt;/a&gt;as "figuring out the IP address of every link before you click on it". &lt;br /&gt;&lt;br /&gt;The Firefox implementation takes this approach one step further than just pre-resolving anchor href hostnames. It uses the prefetch logic on URLs that are being included in the current document. By this I mean that it uses the prefetch logic on things like images, css, and JavaScript that are being loaded right away, in addition to anchor links which might be clicked on at a slightly later time. &lt;br /&gt;&lt;br /&gt;At first that seems nonsensical. How can you pre-fetch the DNS for something you are fetching right now? 
Where does the "pre" come in? The answer is not so much in the definition of "pre" as it is in the definition of "right now". Most HTTP User Agents, Firefox being no exception, limit their number of simultaneous connections and hosts. Typical pages embed quite a few objects and it is easy to run into these limits. When this happens the browser queues some of the requests. The Firefox pre-fetch DNS implementation allows those queued requests to overlap the high latency host resolution with whatever transfers might be going on without creating an excess level of parallelization.&lt;br /&gt;&lt;br /&gt;While this is just a secondary benefit, it can be meaningful. For example, on the day I grabbed a snapshot of http://planet.mozilla.org/ it required 23 unique DNS resolutions in order to render the base page. Most of these were in img URLs. When loading the page with the prefetch patches, even with a cold cache, 16 of them were either fully completed when needed for the first connection or at least already in progress. The result, measured on an EDGE network, was a 4% overall improvement in page load time. Not bad for something that does not reduce bandwidth consumption in any way. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Configuration&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Basically, it just works. You don't need to do anything. But there are a few configurables out there for both browsers and content providers.&lt;br /&gt;&lt;br /&gt;First, as a browser user you might want no part of this. Fair enough - it's your browser. If you set the preference network.dns.disablePrefetch to true the prefetch code will never take effect, no matter what any other configuration is set to.&lt;br /&gt;&lt;br /&gt;Furthermore, as a security measure, prefetching of embedded link hostnames is not done from documents loaded over https. 
If you want to allow it in that context too, just set the preference network.dns.disablePrefetchFromHTTPS to false.&lt;br /&gt;&lt;br /&gt;Content providers have a couple of neat tricks available too. These are meant to be &lt;a href="http://dev.chromium.org/developers/design-documents/dns-prefetching"&gt;compatible with Chrome.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;For content to opt out of DNS pre-resolution it can be served with the x-dns-prefetch-control HTTP header set to off. The equivalent meta http-equiv element can be used instead of a response header too:&lt;br /&gt;&amp;lt;meta http-equiv="x-dns-prefetch-control" content="off"&amp;gt;&lt;br /&gt;&lt;br /&gt;Setting content to on will reverse the effect. You can never turn pre-fetching on in a browser that has it disabled by preference, but you can undo the impact of a previous x-dns-prefetch-control command. In this way, different content provider policies can apply to different portions of the document.&lt;br /&gt;&lt;br /&gt;The last configuration possibility allows the content provider to force the lookup of a particular hostname without providing an anchor using that name. This is done with the link tag:&lt;br /&gt;&amp;lt;link rel="dns-prefetch" href="http://www.spreadfirefox.com/"&amp;gt;&lt;br /&gt;&lt;br /&gt;The href attribute can contain a full URL, or just a hostname. 
Hostname-only attributes should precede the hostname with two slashes:&lt;br /&gt;&amp;lt;link rel="dns-prefetch" href="//www.spreadfirefox.com"&amp;gt;&lt;br /&gt;&lt;br /&gt;Content providers might use the link notation in a site-wide home page in order to preload hostnames that are widely used throughout the site but perhaps not on the home page.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-5776093031489324400?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5776093031489324400'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5776093031489324400'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/11/dns-prefetching-for-firefox.html' title='DNS Prefetching for Firefox'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-8166041434868934540</id><published>2008-09-28T11:00:00.003-04:00</published><updated>2010-09-02T07:47:32.003-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='dns'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Asynchronous DNS lookups on Linux with getaddrinfo_a()</title><content type='html'>A little while back I posted about a &lt;a 
href="http://bitsup.blogspot.com/2008/09/minutia-of-getaddrinfo-and-64-bits.html"&gt;bug in the glibc getaddrinfo()&lt;/a&gt; implementation which resulted in many CNAME lookups having to be repeated. At that time I teased a future post on the topic of the little known getaddrinfo_a() function - here it is.&lt;br /&gt;&lt;br /&gt;Type "man getaddrinfo_a". I expect you will get nothing. That was the case for me. Linux is full of non-portable, under-documented, but very powerful interfaces and tools. The upside of these tools is great - I recently referred to &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=364315"&gt;netem and ifb &lt;/a&gt;which can be used for easy network shaping, and interfaces like tee(), splice() and epoll() are also hugely powerful, but woefully underutilized. I always get a thrill when I stumble across one of these.&lt;br /&gt;&lt;br /&gt;Part of the reason for their low profile is portability. And there are times when that matters - though I think it is cited as a bedrock principle more than is really necessary. I think the larger reason is that some of these techniques lack the documentation, web pages, and references in programmer pop-culture necessary to be ubiquitously useful.&lt;br /&gt;&lt;br /&gt;Maybe this post will help getaddrinfo_a find its mojo.&lt;br /&gt;&lt;br /&gt;This little jewel is a standard part of libc, and has been for many years - you can be assured that it will be present in the runtime of any distribution of the last several years. &lt;br /&gt;&lt;br /&gt;getaddrinfo_a() is an asynchronous interface to the DNS resolution routine - getaddrinfo(). Instead of sitting there blocked while getaddrinfo() does its thing, control is returned to your code immediately and your code is interrupted at a later time with the result when it is complete.&lt;br /&gt;&lt;br /&gt;Most folks will realize that this is a common need when dealing with DNS resolution. 
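&lt;br /&gt;&lt;br /&gt;To make the idea concrete, here is a minimal sketch of overlapping many lookups. It is written in Python and does not call getaddrinfo_a() at all; it simply assumes a thread pool in which each worker blocks in an ordinary getaddrinfo() call. The function name resolve_all, the workers parameter, and the injectable resolver argument are made up for illustration.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve_all(hostnames, resolver=socket.getaddrinfo, workers=8):
    """Resolve many hostnames in parallel; returns {hostname: addresses or None}."""

    def lookup(name):
        try:
            # Each worker blocks inside the resolver, so the network waits overlap.
            infos = resolver(name, None)
        except OSError:
            # socket.gaierror is a subclass of OSError; record the failure as None.
            return name, None
        # Each entry is (family, type, proto, canonname, sockaddr); keep the address.
        return name, sorted({info[4][0] for info in infos})

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lookup, hostnames))
```

Something like resolve_all(["example.com", "example.org"]) would then return a dictionary mapping each name to its list of addresses, with the lookups waiting on the network at the same time rather than one after another.&lt;br /&gt;&lt;br /&gt;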
It is a high latency operation and when processing log files, etc, you often have a need to do a lot of them at a time. The asynchronous interface lets you do them in parallel - other than the waiting-for-the-network time, there is very little CPU or even bandwidth overhead involved in a DNS lookup. As such, it is a perfect thing to do in parallel. You really do get linear scaling.&lt;br /&gt;&lt;br /&gt;The best documentation seems to be in the &lt;a href="http://people.redhat.com/drepper/asynchnl.pdf"&gt;design document&lt;/a&gt; from Ulrich Drepper. This closely reflects the reality of what was implemented. Adam Langley also has an &lt;a href="http://www.imperialviolet.org/2005/06/01/asynchronous-dns-lookups-with-glibc.html"&gt;excellent blog post&lt;/a&gt; with an illustration on how to use it. Actually, the header files are more or less enough info too, if you know that getaddrinfo_a() even exists in the first place.&lt;br /&gt;&lt;br /&gt;The good news about the API is that you can submit addresses in big batches with one call. &lt;br /&gt;&lt;br /&gt;The bad news about the API is that it offers callback either via POSIX signal handling, or by spawning a new thread and running a caller supplied function on it. My attitude is generally to avoid making signal handling a core part of any application, so that's right out. Having libraries spawn threads is also a little disconcerting, but the fact that that mechanism is used here for the callback is really minor compared to how many threads getaddrinfo_a() spawns internally.&lt;br /&gt;&lt;br /&gt;I had assumed that the invocation thread would send a few dns packets out onto the wire and then spawn a single thread listening for and multiplexing the responses.. or maybe the listening thread would send out the requests as well and then multiplex the responses. 
But reading the code shows it actually creates a pretty sizable thread pool wherein each thread calls and blocks on getaddrinfo().&lt;br /&gt;&lt;br /&gt;This is more or less the technique most folks roll together by hand, and it works ok - so it is certainly nice to have it predone and ubiquitously available in libc. And it is ridiculous to code it yourself when you are already linking to a library that does it that way. But it seems to have some room for improvement internally in the future.. if that happens, it's nice to know that at least the API for it is settled and upgrades should be seamless.&lt;br /&gt;&lt;br /&gt;One last giant caveat - in libc 2.7 on 64 bit builds,&lt;a href="https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/268195"&gt; getaddrinfo_a() appears to overflow the stack and crash immediately&lt;/a&gt; on just about any input. This is because the thread spawned internally is created with a 16KB stack which is not enough to initialize the name resolver when using 64 bit data types. Oy! 
The fix is easy, but be aware that some users may bump into this until fixed libcs are deployed.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-8166041434868934540?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8166041434868934540'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8166041434868934540'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/09/asynchronous-dns-lookups-on-linux-with.html' title='Asynchronous DNS lookups on Linux with getaddrinfo_a()'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-6308897274116318541</id><published>2008-09-23T11:18:00.004-04:00</published><updated>2008-09-23T11:48:12.176-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Simulating Wireless Networks With Linux</title><content type='html'>I have been working on enhancing network performance for the upcoming &lt;a href="https://wiki.mozilla.org/Mobile"&gt;Firefox on Mobile&lt;/a&gt;. I jokingly refer to this work as "finding useful things to do with latency" as the browser is targeted at cell phone networks. 
These networks have latencies of hundreds, sometimes over a thousand, milliseconds. &lt;br /&gt;&lt;br /&gt;From time to time I hope to talk on this blog about interesting things I have found or done while looking into this.&lt;br /&gt;&lt;br /&gt;One of the cool things I consed up in the effort is a python script to emulate one of these networks over localhost. Just run the script, along with an XML file that describes the network you're looking to simulate, and then you can run any networking application you want across localhost to measure the impact of any potential changes you want to make.&lt;br /&gt;&lt;br /&gt;The script relies on &lt;a href="http://www.linuxfoundation.org/en/Net:Netem"&gt;netem&lt;/a&gt; and &lt;a href="http://www.linuxfoundation.org/en/Net:IFB"&gt;ifb&lt;/a&gt;. In that sense, it doesn't really add anything fundamental by itself. Those are outstanding, but poorly understood tools. &lt;br /&gt;&lt;br /&gt;By rolling that together in a script, and providing XML profiles for 3G, edge, bluetooth, evdo, hspd, and gprs wireless networks I was able to provide a meaningful testbed for evaluating default preferences for concurrency and pipeline depth, as well as the impact of changes to DNS pre-fetching and the &lt;a href="https://bugzilla.mozilla.org/attachment.cgi?id=334617"&gt;pipelining implementation&lt;/a&gt;. All good stuff. Some of them need their own posts.&lt;br /&gt;&lt;br /&gt;If you're interested in the tool - &lt;a href="http://groups.google.com/group/mozilla.dev.platforms.mobile/msg/157496ce9030859f"&gt;this is the release announcement&lt;/a&gt;. 
It is bundled as part of my local copy of the firefox development tree, but the tool is easily separable from that for use on something else.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-6308897274116318541?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/6308897274116318541'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/6308897274116318541'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/09/simulating-wireless-networks-with-linux.html' title='Simulating Wireless Networks With Linux'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-3438638631773107705</id><published>2008-09-11T16:02:00.020-04:00</published><updated>2008-09-13T15:04:41.620-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='dns'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>The minutia of getaddrinfo() and 64 bits</title><content type='html'>I have been spending some time recently improving the &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=437953"&gt;network behavior of Firefox in mobile&lt;/a&gt; (i.e. 
really high latency, sort of low bandwidth) environments.&lt;br /&gt;&lt;br /&gt;The manifestations du jour of that are some improvements to the &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=453403"&gt;mozilla DNS system&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;In that pursuit, I was staring at some packet traces of the DNS traffic generated from my 64 bit linux build (using my LAN instead of the slow wireless net) and I saw this gem:&lt;br /&gt;&lt;br /&gt;&lt;textarea readonly="readonly" id="scrollingtext" name="scrollingtext" rows="5" cols="55" wrap="off"&gt;16:13:56.748926 IP 192.168.16.214.35935 &gt; 192.168.16.218.53: 3172+ A? bitsup.blogspot.com. (37)&lt;br /&gt;16:13:56.749239 IP 192.168.16.218.53 &gt; 192.168.16.214.35935: 3172 2/0/0 CNAME[|domain]&lt;br /&gt;16:13:56.749388 IP 192.168.16.214.47514 &gt; 192.168.16.218.53: 40044+ A? bitsup.blogspot.com. (37)&lt;br /&gt;16:13:56.749542 IP 192.168.16.218.53 &gt; 192.168.16.214.47514: 40044 2/0/0 CNAME[|domain]&lt;br /&gt;&lt;/textarea&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;That is the same request (and response) duplicated. Doing it an extra time appears to cost all of 0.3ms on my LAN, but on a cell phone that could delay resolution (and therefore page load) by a full second - very noticeable lag.&lt;br /&gt;&lt;br /&gt;I started by combing through the firefox DNS code looking for the bug I assumed I had accidentally put in the caching layer. But I confirmed there was just one call to libc's getaddrinfo() being made for that name.&lt;br /&gt;&lt;br /&gt;Then I figured it was some kind of large truncated DNS record from blogspot which necessitated a refetch. Looking further into it, the record was really quite normal. The response was just 127 bytes and exactly the same each time - it contained 2 response records: one A record and one CNAME record. 
Both had reasonably sized names.&lt;br /&gt;&lt;br /&gt;I found the same pattern with another CNAME too: 2 out of 6.&lt;br /&gt;&lt;br /&gt;And so the debugging began in earnest. Cliff Stoll was looking for his 75 cents, and I was out to find my extra round trip time.&lt;br /&gt;&lt;br /&gt;I did not find an international conspiracy, but after whipping together a debuggable libc build I determined that the DNS answer parser placed a "struct host" and some scratch space for parsing into a buffer passed into it from lower on the stack. If the answer parser couldn't parse the response in that space, an "ERANGE" error was returned and the caller would increase the buffer size and try again. But the retry involved the whole network dance again instead of just the parsing.&lt;br /&gt;&lt;br /&gt;So what I was seeing was that the original buffer of 512 bytes was too small, but a second try with 1024 worked fine. Fair enough, it just seems like an undersized default to fail at such a common case.&lt;br /&gt;&lt;br /&gt;And then it made sense. For most of its life, it hasn't been undersized - while the DNS response hasn't changed, the "struct host" did when I went to 64 bit libraries. struct host is comprised of 48 pointers and a 16 byte buffer. On a 32 bit arch that's 208 bytes, but with 8 byte pointers it is 400. With a 512 byte ceiling, 400 is a lot to give up.&lt;br /&gt;&lt;br /&gt;64 bit has a variety of advantages and disadvantages, but an extra RTT was a silent penalty I hadn't seen before.&lt;br /&gt;&lt;br /&gt;This &lt;a href="http://www.ducksong.com/misc/patch-libc-resolver-buffersize.txt"&gt;patch&lt;/a&gt; fixes things up nicely.&lt;br /&gt;&lt;br /&gt;This is a good concrete opportunity to praise the pragmatism of developing on open source. It is not about the bug (everyone has them, if this even is one - it is borderline), it is about the transparency. 
If this was a closed OS, the few avenues available to me would have been incredibly onerous and possibly expensive. Instead I was able to resolve it in an afternoon by myself (and mail off the patch to the outstanding glibc development team).&lt;br /&gt;&lt;br /&gt;At some future time, I'll have a similarly thrilling story about the little known but widely deployed getaddrinfo_a().&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-3438638631773107705?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3438638631773107705'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3438638631773107705'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/09/minutia-of-getaddrinfo-and-64-bits.html' title='The minutia of getaddrinfo() and 64 bits'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-5968857948943988267</id><published>2008-05-30T17:44:00.004-04:00</published><updated>2008-05-30T18:24:08.531-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><title type='text'>Firefox Add-Ons - webfolder webDAV add-on updated for Firefox 3</title><content type='html'>I have had reason to do a little server side &lt;a 
href="http://www.webdav.org/specs/rfc2518.html"&gt;WebDAV&lt;/a&gt; work these days.&lt;br /&gt;&lt;br /&gt;WebDAV clients for Linux aren't all that common. There is a filesystem gateway (davfs), and cadaver. Cadaver is a decent command line app, like ncftp.&lt;br /&gt;&lt;br /&gt;And more recently, I was introduced to &lt;a href="http://webfolder.mozdev.org/"&gt;webfolders&lt;/a&gt;. This is an add-on for firefox that does a perfectly good job of creating a file manager interface for DAV sites.&lt;br /&gt;&lt;br /&gt;Point one of this post: webfolders is cool - download it yourself.&lt;br /&gt;&lt;br /&gt;Point two of this post: webfolders as listed on the website does not support firefox 3. And of course you are using firefox 3.&lt;br /&gt;&lt;br /&gt;Point three of this post: you really ought to be using firefox 3 - it is much faster than firefox 2.&lt;br /&gt;&lt;br /&gt;Point four of this post: I want it all - so I have done the trivial work of updating webfolders for firefox 3. Just changed a few constants and paths, and life is good. 
It is &lt;a href="http://www.ducksong.com/misc/webfolder-1.1.xpi"&gt;here for download &lt;/a&gt;(any OS!)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-5968857948943988267?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5968857948943988267'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5968857948943988267'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/05/firefox-add-ons-webfolder-webdav-add-on.html' title='Firefox Add-Ons - webfolder webDAV add-on updated for Firefox 3'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-4481411013534628288</id><published>2008-04-18T16:23:00.005-04:00</published><updated>2008-09-13T15:22:10.924-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='hardware'/><title type='text'>Measuring performance of  Linux Kernel likely() and unlikely()</title><content type='html'>&lt;span style="font-size:100%;"&gt;A little while back&lt;a href="http://bitsup.blogspot.com/2008/04/linux-kernel-likely-not-measured.html"&gt; I wrote about how prominent likely() and unlikely() are in the Linux kernel, and yet 
I could not find any performance measurements linked to them&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Today I made some measurements myself.&lt;br /&gt;&lt;br /&gt;But first a quick review - likely and unlikely are just macros for gcc's __builtin_expect(), which in turn allows the compiler to generate code compatible with the target architecture's branch prediction scheme. The GCC documentation  &lt;a href="http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html"&gt;really warns &lt;/a&gt;against using this manually too often:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;blockquote&gt;&lt;span style="font-size:100%;"&gt;You may use &lt;code&gt;__builtin_expect&lt;/code&gt; to provide the compiler with branch prediction information.  In general, you should prefer to use actual profile feedback for this (&lt;samp&gt;&lt;span class="option"&gt;-fprofile-arcs&lt;/span&gt;&lt;/samp&gt;), as programmers are notoriously bad at predicting how their programs actually perform.  However, there are applications in which this data is hard to collect.        &lt;/span&gt;&lt;/blockquote&gt;&lt;span style="font-size:100%;"&gt;The kernel certainly makes liberal use of it. According to LXR, 2.6.24 had 1608 uses of likely and 2075 uses of unlikely in the code. 
LXR didn't have an index of the just released 2.6.25 yet - but I'd bet it is &lt;/span&gt;&lt;span style="font-weight: bold;font-size:100%;" &gt;likely &lt;/span&gt;&lt;span style="font-size:100%;"&gt;to be more now.&lt;br /&gt;&lt;br /&gt;My methodology was simple: I chose several benchmarks commonly used in kernel land and I ran them against vanilla 2.6.25 and also against a copy I called "notlikely" which simply had the macros nullified using this piece of programming genius:&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;textarea readonly="readonly" id="scrollingtext" name="scrollingtext" rows="15" cols="55" wrap="off"&gt;&lt;br /&gt;diff --git a/include/linux/compiler.h b/include/linux/compiler.h&lt;br /&gt;index dcae0c8..f08b535 1006&lt;br /&gt;--- a/include/linux/compiler.h&lt;br /&gt;+++ b/include/linux/compiler.h&lt;br /&gt;@@ -57,8 +57,14 @@ extern void __chk_io_ptr(const volatile void __iomem *);&lt;br /&gt;* specific implementations come from the above header files&lt;br /&gt;*/&lt;br /&gt;&lt;br /&gt;+#if 0&lt;br /&gt;#define likely(x)      __builtin_expect(!!(x), 1)&lt;br /&gt;#define unlikely(x)    __builtin_expect(!!(x), 0)&lt;br /&gt;+#else&lt;br /&gt;+#define likely(x)      (x)&lt;br /&gt;+#define unlikely(x)    (x)&lt;br /&gt;+#endif&lt;br /&gt;+&lt;br /&gt;&lt;br /&gt;/* Optimization barrier */&lt;br /&gt;#ifndef barrier&lt;br /&gt;&lt;/textarea&gt;&lt;br /&gt;&lt;br /&gt;The tests I ran were lmbench, netperf, bonnie++, and the famous "how fast can I compile the kernel?" test.&lt;br /&gt;&lt;br /&gt;The test hardware was an all 64 bit setup on a 2.6GHz Core 2 Duo with 2GB of RAM and a SATA disk. Pretty standard desktop hardware.&lt;br /&gt;&lt;br /&gt;The core 2 architecture has a pretty fine internal branch prediction engine without the help of these external hints. 
But with such extensive use of the macros (3500+ times!), I expected to see some difference shown by the numbers.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;But I didn't see any measurable difference. Not at all.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Not a single one of those tests showed anything that I wouldn't consider overlapping noise. I had 3 data points for each test on each kernel (6 points per test) and each test had several different facets. Out of dozens of different facets, there wasn't a single criterion where the measurement was always better or worse on one kernel.&lt;br /&gt;&lt;br /&gt;And this disappoints me. Because I like micro optimizations damn it!  And &lt;span style="font-style: italic;"&gt;in general &lt;/span&gt;this one seems to be a waste of time other than the nice self documenting code it produces. Perhaps the gcc advice is correct. Perhaps the Core-2 is so good that this doesn't matter. Perhaps there is a really compelling benchmark that I'm just not running.&lt;br /&gt;&lt;br /&gt;I say &lt;span style="font-style: italic;"&gt;it is a waste in general&lt;/span&gt; because I am sure there are specific circumstances and code paths where this makes a measurable difference. There certainly must be a benchmark that can show it - but none of these broad based benchmarks were able to show anything useful. That doesn't mean the macro is overused, and it seems harmless enough too, but it probably isn't worth thinking too hard about it either.&lt;br /&gt;&lt;br /&gt;hmm. 
&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-4481411013534628288?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4481411013534628288'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/4481411013534628288'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/04/measuring-performance-of-linux-kernel.html' title='Measuring performance of  Linux Kernel likely() and unlikely()'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2370870918271795162</id><published>2008-04-14T09:37:00.007-04:00</published><updated>2008-09-13T15:15:05.652-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><title type='text'>Monitoring IP changes with NETLINK ROUTE sockets</title><content type='html'>Yesterday I read a &lt;a href="http://article.gmane.org/gmane.linux.kernel.kernelnewbies/25498"&gt;mailing list query&lt;/a&gt; asking how to get event driven Linux IP address changes (i.e. without having to poll for them).&lt;br /&gt;&lt;br /&gt;I agreed with the attitude of the post. 
The most important thing about scaling is to make sure the work your code does is proportional to the real event stream. Seems obvious enough, but lots of algorithms screw up that basic premise.&lt;br /&gt;&lt;br /&gt;Any time based polling algorithm betrays this scaling philosophy because work is done every tick independent of the events to be processed. You're always either doing unnecessary work or adding latency to real work by waiting for the next tick. The select() and poll() APIs also betray it as these are proportional to the amount of potential work (number of file descriptors) instead of the amount of real work (number of active descriptors) - epoll() is a better answer there.&lt;br /&gt;&lt;br /&gt;Event driven is the way to go.&lt;br /&gt;&lt;br /&gt;Anyhow, back to the original poster. I knew netlink route sockets could do this - and I had used them in the past for similar purposes. I had to get "man 7 netlink" and Google going to cobble together an example and only then did I realize how difficult it is to get started with netlink - it just has not been very widely used and documented.&lt;br /&gt;&lt;br /&gt;So the point of this post is to provide a little google juice documentation for event driven monitoring of new IPv4 addresses using netlink. At least this post has a full example - the code is below.&lt;br /&gt;&lt;br /&gt;If you need to use this functionality - I recommend man 3 and 7 of both netlink and rtnetlink.. and then go read the included header files, and use my sample as a guide. In this basic way you can get address adds, removals, link state changes, route changes, interface changes, etc.. lots of good stuff. It is at the heart of the iproute tools (ip, ss, etc..) 
as well as most of the userspace routing software (zebra, xorp, vyatta, etc..).&lt;br /&gt;&lt;br /&gt;&lt;textarea readonly="readonly" id="scrollingtext" name="scrollingtext" rows="15" cols="55" wrap="off"&gt;&lt;br /&gt;/* Copyright mcmanus@ducksong.com 2009, Under terms of GPLv2 */&lt;br /&gt;&lt;br /&gt;#include &lt;stdio.h&gt;&lt;br /&gt;#include &lt;string.h&gt;&lt;br /&gt;#include &lt;netinet/in.h&gt;&lt;br /&gt;#include &lt;linux/netlink.h&gt;&lt;br /&gt;#include &lt;linux/rtnetlink.h&gt;&lt;br /&gt;#include &lt;net/if.h&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;int main()&lt;br /&gt;{&lt;br /&gt; struct sockaddr_nl addr;&lt;br /&gt; int nls,len,rtl;&lt;br /&gt; char buffer[4096];&lt;br /&gt; struct nlmsghdr *nlh;&lt;br /&gt; struct ifaddrmsg *ifa;&lt;br /&gt; struct rtattr *rth;&lt;br /&gt;&lt;br /&gt; if ((nls = socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE)) == -1)   perror ("socket failure\n");&lt;br /&gt;&lt;br /&gt; memset (&amp;amp;addr,0,sizeof(addr));&lt;br /&gt; addr.nl_family = AF_NETLINK;&lt;br /&gt; addr.nl_groups = RTMGRP_IPV4_IFADDR;&lt;br /&gt;&lt;br /&gt; if (bind(nls, (struct sockaddr *)&amp;amp;addr, sizeof(addr)) == -1)    perror ("bind failure\n");&lt;br /&gt;&lt;br /&gt; while ((len = recv (nls,buffer,sizeof(buffer),0)) &gt; 0)&lt;br /&gt; {&lt;br /&gt;     /* restart at the head of the buffer for every datagram received */&lt;br /&gt;     nlh = (struct nlmsghdr *)buffer;&lt;br /&gt;     for (;(NLMSG_OK (nlh, len)) &amp;amp;&amp;amp; (nlh-&gt;nlmsg_type != NLMSG_DONE); nlh = NLMSG_NEXT(nlh, len))&lt;br /&gt;     {&lt;br /&gt;         if (nlh-&gt;nlmsg_type != RTM_NEWADDR) continue; /* some other kind of announcement */&lt;br /&gt;&lt;br /&gt;         ifa = (struct ifaddrmsg *) NLMSG_DATA (nlh);&lt;br /&gt;&lt;br /&gt;         rth = IFA_RTA (ifa);&lt;br /&gt;         rtl = IFA_PAYLOAD (nlh);&lt;br /&gt;         for (;rtl &amp;amp;&amp;amp; RTA_OK (rth, rtl); rth = RTA_NEXT (rth,rtl))&lt;br /&gt;         {&lt;br /&gt;             char name[IFNAMSIZ];&lt;br /&gt;             uint32_t ipaddr;&lt;br /&gt;&lt;br /&gt;             if (rth-&gt;rta_type 
!= IFA_LOCAL) continue;&lt;br /&gt;&lt;br /&gt;             ipaddr = * ((uint32_t *)RTA_DATA(rth));&lt;br /&gt;             ipaddr = htonl(ipaddr);&lt;br /&gt;&lt;br /&gt;             fprintf (stdout,"%s is now %X\n",if_indextoname(ifa-&gt;ifa_index,name),ipaddr);&lt;br /&gt;         }&lt;br /&gt;     }&lt;br /&gt; }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;/textarea&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2370870918271795162?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2370870918271795162'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2370870918271795162'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/04/monitoring-ip-changes-with-netlink.html' title='Monitoring IP changes with NETLINK ROUTE sockets'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-715076859464653182</id><published>2008-04-08T11:08:00.003-04:00</published><updated>2008-04-08T11:20:56.702-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='internet'/><title type='text'>IP Georeferencing</title><content type='html'>IP &lt;a href="http://en.wikipedia.org/wiki/Georeference"&gt;Georeferencing&lt;/a&gt; is a pretty cool toolbox item on today's web. 
Simply put, it's the process of taking an IP address and converting it to some geographical (city, lat, long, whatever) information.&lt;br /&gt;&lt;br /&gt;It gets commonly used for security filters, log analysis, ad targeting, community building, etc..&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.maxmind.com/"&gt;Maxmind&lt;/a&gt; deserves a shout out for making available databases to do this. There is nothing inherent about an IP number that gives you this information, so they just need to build out of band databases to get the job done. They have two copies of the databases, one for free and one a bit more accurate that is priced very reasonably.&lt;br /&gt;&lt;br /&gt;The databases come with a number of different libraries for using them: C, Java, PHP, etc..  The libraries are released under the LGPL.&lt;br /&gt;&lt;br /&gt;Recently I was doing a project that needed to look up scads and scads of addresses, so I put a little muscle into improving the lookup routines in the C code.&lt;br /&gt;&lt;br /&gt;I'm happy to say&lt;a href="http://sourceforge.net/mailarchive/forum.php?thread_name=1206389972.13044.148.camel%40tng&amp;amp;forum_name=geoip-c-discuss"&gt; I was able to improve things anywhere from 2x to 8x &lt;/a&gt;in terms of overall lookups, depending on what exactly was being looked up. There were a bunch of changes, but the primary one was the addition of a variable length radix lookup mechanism that changed the average number of comparisons from 27 to 4 - not rocket science, but the right tool for the job.&lt;br /&gt;&lt;br /&gt;I'm even more happy to say I sent those back as LGPL contributions of my own. 
The code is on my &lt;a href="http://www.ducksong.com/Misc_Code.php"&gt;open source contribution&lt;/a&gt; web page.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-715076859464653182?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/715076859464653182'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/715076859464653182'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/04/ip-georeferencing.html' title='IP Georeferencing'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-6595755674200449701</id><published>2008-04-05T18:19:00.004-04:00</published><updated>2008-04-05T18:29:03.435-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><title type='text'>Linux Selective Acknowledgment (SACK) CPU Overhead</title><content type='html'>Last year I tossed an e-mail from the Linux kernel networking list in my "projtodo" folder.&lt;br /&gt;&lt;br /&gt;The mail talked about how the Linux TCP stack in particular, and likely all TCP stacks in general, likely had an excessive-CPU attack exposure when confronted with malicious SACK 
options. I found the mail intriguing but unsatisfying. It was well informed speculation but didn't have any hard data, nor was there any simple way to gather some. Readers of the other posts on this blog will know I really dig measurements. The issue at hand was pretty obviously a problem - but how much of one?&lt;br /&gt;&lt;br /&gt;A few weeks ago I had the chance to develop some testing code and find out for myself  - and &lt;a href="http://www.ibm.com/developerworks/linux/library/l-tcp-sack/index.html"&gt;IBM DeveloperWorks has published the summary&lt;/a&gt; of my little project. The executive summary is "it's kinda bad, but not a disaster, and hope is on the way". There is hard data and some pretty pictures in the article itself.&lt;br /&gt;&lt;br /&gt;The coolest part of the whole endeavor, other than scratching the "I wonder" itch, was getting to conjure up a userspace TCP stack from raw sockets.  It was, of course, woefully incomplete  as it was just meant to trigger a certain behavior in its peer instead of being generally useful or reliable - but nonetheless entertaining.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-6595755674200449701?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/6595755674200449701'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/6595755674200449701'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/04/linux-selective-acknowledgment-sack-cpu.html' title='Linux Selective Acknowledgment (SACK) CPU Overhead'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2150657448796727745</id><published>2008-04-03T23:05:00.003-04:00</published><updated>2008-04-03T23:18:52.911-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><title type='text'>Linux Kernel - likely() not measured?</title><content type='html'>The other day on &lt;a href="http://www.kernelnewbies.org/"&gt;kernelnewbies&lt;/a&gt;, the able &lt;a href="http://mail.nl.linux.org/kernelnewbies/2008-03/msg00371.html"&gt;Robert Day wondered whether or not anyone had quantified the effects of the likely() and unlikely() macros scattered all over the Linux kernel&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;He received no fewer than 5 replies telling him how the macros worked (if you are curious, &lt;a href="http://kerneltrap.org/node/4705"&gt;this is the best explanation&lt;/a&gt;) - but not a single piece of measurement or other characterization.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lxr.linux.no/linux/+search"&gt;LXR&lt;/a&gt; shows over 3500 uses of those macros in the kernel, and nobody has any data for any scenario? Wowzers.&lt;br /&gt;&lt;br /&gt;Doing before/after benchmarks with those macros changed to nops would be an interesting project. 
Could use the usual suspects of linux kernel performance enhancements to test (lmbench, compile test, some kind of network load generator, etc..)&lt;br /&gt;&lt;br /&gt;Comments with pointers to data would be cool.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2150657448796727745?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2150657448796727745'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2150657448796727745'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/04/linux-kernel-likely-not-measured.html' title='Linux Kernel - likely() not measured?'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-3190629720263280768</id><published>2008-03-10T09:48:00.002-04:00</published><updated>2008-03-10T10:19:27.067-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtualization'/><category scheme='http://www.blogger.com/atom/ns#' term='startups'/><title type='text'>The Everything Old Is New Again Meme</title><content type='html'>I've been struck lately by the dismissive comments in technology reviews along the lines of "of course this is really just a new skin on a really old idea". They are particularly striking, because the reviews are essentially overall positive about the technology they just dismissed as a rehash! 
If they really were just recycled implementations of old worn ideas, then why are we so excited now - why waste the bits talking about them?&lt;br /&gt;&lt;br /&gt;I'm left thinking that there just isn't enough respect for a quality implementation that takes into account the real needs of the current market. To some, it is all about the idea. I'm the last to diss a great idea, but let's face it - folks who think the idea is everything tend to overlook various little bits of reality that don't mesh well with their idea (we call those implementation details) and in general just commoditize the implementation. Ideas like this make great conversations but crappy products.&lt;br /&gt;&lt;br /&gt;The truth is these next generation technologies are usually quality implementations with some of the substantial "implementation details" overcome. To trivialize those details is a mistake - the difficulty of a quality implementation is often overlooked by folks who think the main idea is everything. Often they require some real innovative thinking on their own. Anybody who has taken a tour of duty in one (or two or three) startups will tell you that neither idea nor execution is to be taken for granted - this is hard stuff when you're blazing new trails.&lt;br /&gt;&lt;br /&gt;A couple of common examples:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;modern virtualization (vmware, xen, kvm, etc..). Tom Killalea of Amazon.Com writes "&lt;a href="http://www.acmqueue.com/modules.php?name=Content&amp;amp;pa=showpage&amp;amp;pid=522"&gt;Virtualization has been around for more than 30 years - ... 
- yet in 2007 it tipped.&lt;/a&gt;"&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;PBT (ethernet provider backbone transport) - Light Reading writes "&lt;span&gt;&lt;span&gt;&lt;a href="http://www.lightreading.com/document.asp?doc_id=115612"&gt;Provider Backbone Transport is a new idea in carrier transport networking – or, perhaps more accurately, an old idea in a new guise"&lt;/a&gt; - PBT is presented as an ethernet corollary to IP/MPLS.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;Thinking about just these two examples, it isn't hard to see why they are breaking through now instead of the earlier incarnations made reference to. While market conditions and external factors have changed, it isn't simply that their train has arrived and they were dusted off for the occasion - real work has made them better.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-3190629720263280768?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3190629720263280768'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3190629720263280768'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/03/everything-old-is-new-again-meme.html' title='The Everything Old Is New Again Meme'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2577197541192701542</id><published>2008-03-04T11:03:00.001-05:00</published><updated>2008-03-04T11:03:58.868-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='recommendations'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='hardware'/><title type='text'>Calgary IOMMU - At What Price?</title><content type='html'>The Calgary IOMMU is a feature of most IBM X-Series (i.e. X86_64) blades and motherboards. If you aren't familiar with an IOMMU, it is strongly analogous to a regular MMU but applied to a DMA context. Their original primary use was for integrating 32 bit hardware into 64 bit systems. But another promising use for them is enforcing safe access in the same way an MMU can.&lt;br /&gt;&lt;br /&gt;In normal userspace, if a program goes off the rails and accesses some memory it does not have permissions for, a simple exception can be raised. This keeps the carnage restricted to the application that made the mistake. But if a hardware device does the same thing through DMA, whole system disaster is sure to follow as nothing prevents the accesses from happening. The IOMMU can provide that safety.&lt;br /&gt;&lt;br /&gt;An IOMMU unit lets the kernel set up mappings much like normal memory page tables. Normal RAM mappings are cached with TLB entries, and IOMMU maps are cached with TCE entries that play largely the same role.&lt;br /&gt;&lt;br /&gt;By now, I've probably succeeded in rehashing what you already knew. 
At least it was just three paragraphs (well, now four).&lt;br /&gt;&lt;br /&gt;The pertinent bit from a characterization standpoint is a paper from the 2007 Ottawa Linux Symposium. In &lt;a href="http://ols.108.redhat.com/2007/Reprints/ben-yehuda-Reprint.pdf"&gt;The Price of Safety: Evaluating IOMMU Performance&lt;/a&gt; Muli Ben-Yehuda of IBM and some co-authors from Intel and AMD do some measurements using the Calgary IOMMU, as well as the DART (which generally comes on Power based planars).&lt;br /&gt;&lt;br /&gt;I love measurements! And it takes guts to post measurements like this - in its current incarnation on Linux the cost of safety from the IOMMU is a 30% to 60% increase in CPU! Gah!&lt;br /&gt;&lt;br /&gt;Some drill down is required, and it turns out this is among the worst cases to measure. But still - 30 to 60%! The paper is short and well written, you should read it for yourself - but I will summarize the test more or less as "measure the CPU utilization while doing 1 Gbps of netperf network traffic - measure with and without iommu". The tests are also done with and without Xen, as IOMMU techniques are especially interesting to virtualization, but the basic takeaways are the same in virtualized or bare metal environments.&lt;br /&gt;&lt;br /&gt;The "Why so Bad" conclusion is management of the TCE. The IOMMU, unlike the TLB cache of an MMU, only allows software to remove entries via a "flush it all" instruction. I have certainly measured that clearing TLBs during process switching can be a very measurable event for overall system performance - it is one reason why an application broken into N threads runs faster than the same application broken into N processes.&lt;br /&gt;&lt;br /&gt;But overall, this is actually an encouraging conclusion - hardware may certainly evolve to give more granular access to the TCE tables. 
And there are games that can be played on the management side in software that can reduce the number of flushes in return for giving up some of the safety guarantees.&lt;br /&gt;&lt;br /&gt;Something to be watched.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2577197541192701542?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2577197541192701542'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2577197541192701542'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/03/calgary-iommu-at-what-price.html' title='Calgary IOMMU - At What Price?'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-6952640777883228303</id><published>2008-02-28T15:15:00.003-05:00</published><updated>2008-04-03T23:20:16.141-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='recommendations'/><category scheme='http://www.blogger.com/atom/ns#' term='appliances'/><title type='text'>Appliances and Hybrids</title><content type='html'>In my &lt;a href="http://bitsup.blogspot.com/2008/02/network-algorithmics.html"&gt;last post&lt;/a&gt;, which was about the fab book &lt;a href="http://www.amazon.com/gp/product/0120884771?ie=UTF8&amp;amp;tag=wwwducksongco-20&amp;amp;linkCode=as2&amp;amp;camp=1789&amp;amp;creative=9325&amp;amp;creativeASIN=0120884771"&gt;Network 
Algorithmics,&lt;/a&gt; I mentioned intelligent in-network appliances. I also mentioned that "the world doesn't need to be all hardware or all software".&lt;br /&gt;&lt;br /&gt;I believe this to be an essential point - blending rapidly improving commodity hardware with just a touch of custom ASIC/FPGA components, glued together by a low-level, architecture-aware systems programming approach makes unbelievably good products that can be made and produced relatively inexpensively.&lt;br /&gt;&lt;br /&gt;These appliances and hardware are rapidly showing up everywhere - Tony Bourke of Load Balancing Digest makes a &lt;a href="http://lbdigest.com/2008/02/28/asic-debate-redux/"&gt;similar (why choose between ASIC or Software?) post&lt;/a&gt; today. Read it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-6952640777883228303?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/6952640777883228303'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/6952640777883228303'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/02/appliances-and-hybrids.html' title='Appliances and Hybrids'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-354282784686150071</id><published>2008-02-27T17:51:00.002-05:00</published><updated>2008-02-27T18:12:37.398-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category 
scheme='http://www.blogger.com/atom/ns#' term='recommendations'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><title type='text'>Network Algorithmics</title><content type='html'>In late 2004 George Varghese published an amazing book on the design and implementation of networking projects: &lt;a href="http://www.amazon.com/gp/product/0120884771?ie=UTF8&amp;amp;tag=wwwducksongco-20&amp;amp;linkCode=as2&amp;amp;camp=1789&amp;amp;creative=9325&amp;amp;creativeASIN=0120884771"&gt;Network Algorithmics&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;You won't find the boilerplate "this is IP, this is TCP, this is layer blah" crapola here. What makes this special is that it describes the way the industry really builds high end switches, routers, high end server software and an ever growing array of intelligent in-network appliances. The way that is really done is very different from what is presented in a classic networking textbook.&lt;br /&gt;&lt;br /&gt;The material is not algorithms, rather it is a way of looking at things and breaking down abstraction barriers through techniques like zero copy, tag hinting, lazy evaluation, etc.. It makes the point that layers are how you describe protocols - not how you want to implement them. The world doesn't have to be all hardware or all software; this book helps train your mind on how to write systems that harmonize them both by taking the time to really understand the architecture.&lt;br /&gt;&lt;br /&gt;To me, this book is a fundamental bookshelf item. You don't hear it often mentioned with the likes of Stevens, Tanenbaum, and Cormen - but amongst folks in the know, it is always there. 
More folks ought to know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-354282784686150071?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/354282784686150071'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/354282784686150071'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/02/network-algorithmics.html' title='Network Algorithmics'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-7918154705042944332</id><published>2008-02-15T17:14:00.004-05:00</published><updated>2008-02-15T18:48:27.121-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='disk'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='amazon'/><title type='text'>SLA as medians, percentiles, or averages - as told by Amazon's Dynamo</title><content type='html'>I am going to recommend another published paper because of how it talks about characterizing its own performance. 
That's not the point of the paper, but I found it really interesting anyhow.&lt;br /&gt;&lt;br /&gt;The paper is &lt;a href="http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf"&gt;Dynamo: Amazon’s Highly Available Key-value Store&lt;/a&gt;, by a number of good folks over at Amazon.com (including the CTO &lt;a href="http://www.allthingsdistributed.com/"&gt;Werner Vogels&lt;/a&gt;). It was published in the peer reviewed SOSP '07.&lt;br /&gt;&lt;br /&gt;The bulk of the paper is about Dynamo, which apparently is a home grown key-value storage system that has a lot of inherent replication, scalability, and reliability. A good read.&lt;br /&gt;&lt;br /&gt;Instead of the big picture I wanted to focus on a detail. The paper rejects the notion of defining SLAs with median expectations. Instead, Dynamo uses 99.9 percentiles. (It also rejects using means, but that is pretty nonsensical anyhow, isn't it?). The central idea is that the SLA defines acceptable usage for all users - not just for half of them (the median).&lt;br /&gt;&lt;br /&gt;This matters in real life in a very real way. There is normally one version of an operation and then an a la carte menu of choices they might layer on that. Lots of folks use the vanilla version, but if 10, 20, 30, or even 40 percent of users are using some features that require extra processing - they are totally left out of the SLA. The canonical Amazon case is a user with a very long purchasing history - an important case but not the average case. 99.9 installs a threshold for a reasonable definition of "everybody" while still leaving room for the occasional pathological case. 
The authors point out that for more money you can have more nines, but the law of diminishing returns certainly applies.&lt;br /&gt;&lt;br /&gt;You have to wonder if after &lt;a href="http://www.infoworld.com/article/08/02/15/Amazons-S3-down-for-several-hours_1.html"&gt;today's S3 outage&lt;/a&gt; they wish they had bought another nine or two ;) (I have no reason to think that had anything to do with Dynamo - I'm just poking fun - what Amazon has built for S3/EC2 is very impressive)&lt;br /&gt;&lt;br /&gt;I was happily nodding along as I read the paper when this came up:&lt;br /&gt;&lt;blockquote&gt;Dynamo is built for latency sensitive applications that require at least 99.9% of read and write operations to be performed within a few hundred milliseconds&lt;/blockquote&gt;My initial reaction was not charitable: Whoa - that's not really setting the bar very high, is it guys? Hundreds of millis to read and write a key/value pair? You say yourself in the introductory pages that these services are often layered on top of each other! Sure, it is more latency sensitive than an overnight data warehouse operation, but that's hardly an impressive responsiveness threshold.&lt;br /&gt;&lt;br /&gt;But, I was too harsh. I hadn't internalized what shifting from the 50th percentile to the 99.9th really meant. The value isn't representative of what the typical user will see - it is representative of the worst you can stomach. In effect, it loses its marketing value - which is how a lot of SLAs are used in the real world.&lt;br /&gt;&lt;br /&gt;The Dynamo paper backs this up. Figure 4 shows both the average and 99.9 percentiles for both reads and writes. Average reads ran around 15ms, writes around 25. 99.9 percentiles were respectively in the 150 and 250 neighborhoods. This all makes much more sense, especially when dealing with disk drives.&lt;br /&gt;&lt;br /&gt;In the end, I think you need more than one data point in order to make an effective characterization. 
Amazon clearly wouldn't be happy if everybody was seeing 150ms latencies - though they can stomach it if literally it is a one in a thousand kind of occurrence.&lt;br /&gt;&lt;br /&gt;Maybe SLAs should be expressed as 4 tuples of 25/50/75/99.9 .. I've developed benchmarks that way and felt that helped keep the subsequent optimizations honest.&lt;br /&gt;&lt;br /&gt;Even 90/99.9 is mostly what you need to keep a cap on the outliers while still getting a feel for what somebody is likely to see.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-7918154705042944332?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7918154705042944332'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/7918154705042944332'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/02/sla-as-medians-percentiles-or-averages.html' title='SLA as medians, percentiles, or averages - as told by Amazon&apos;s Dynamo'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-396091945471649455</id><published>2008-01-31T13:19:00.000-05:00</published><updated>2008-01-31T13:33:01.295-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='javascript'/><category scheme='http://www.blogger.com/atom/ns#' term='recommendations'/><category scheme='http://www.blogger.com/atom/ns#' 
term='characterization'/><title type='text'>AjaxScope Paper Provides Javascript Characterization</title><content type='html'>Emre Kıcıman and Benjamin Livshits from Microsoft Research present some interesting data in their &lt;a href="http://research.microsoft.com/%7Eemrek/pubs/ajaxscope-sosp.pdf"&gt;SOSP paper - AjaxScope: A Platform for Remotely Monitoring the Client-Side Behavior of Web 2.0 Applications&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The paper is mostly about AjaxScope, a neat instrumentation and profiling tool for JavaScript.&lt;br /&gt;&lt;br /&gt;What I want to highlight, though, are some of the measurements they present for both IE (6 and 7) as well as Firefox (2). This is real data in a refereed journal of the ACM, not a Gartner-style whitepaper.&lt;br /&gt;&lt;br /&gt;Among the interesting nuggets are IE's 35x slower performance in String cat operations, and Firefox's 4x slower Array join execution time. The authors also put the intrinsics into context by measuring the performance of common portal pages - IE beats Firefox on msn.com, but Firefox turns the tables on Yahoo!.&lt;br /&gt;&lt;br /&gt;Lots more interesting data, and a useful tool, in the paper. 
Read it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-396091945471649455?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/396091945471649455'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/396091945471649455'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/01/ajaxscope-paper-provides-javascript.html' title='AjaxScope Paper Provides Javascript Characterization'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-3665560665018742918</id><published>2008-01-19T23:12:00.000-05:00</published><updated>2008-01-20T00:18:19.348-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='google'/><category scheme='http://www.blogger.com/atom/ns#' term='rss'/><title type='text'>Web Syndication Format Market Share</title><content type='html'>For quite a while my todo list has had an instruction to find and characterize the popularity of RSS vs Atom. Which syndication format is more popular?&lt;br /&gt;&lt;br /&gt;Atom seems on the face of it to be a better format than RSS, but some of what it addresses are not really widespread problems for operations. Market share will tell if it was a solution looking for a real problem or not. 
Atom is about 2 years old - and it is pretty common to see atom feeds available around the net now.&lt;br /&gt;&lt;br /&gt;Measuring the breakdown among my own set of feeds that I read isn't terribly useful. I have a bias in my selections - it isn't like measuring my connectivity or transport properties where I am representative as a sample.&lt;br /&gt;&lt;br /&gt;For the record: I have 112 feeds, 54 of them in atom and 58 in some kind of rss.&lt;todo&gt;&lt;br /&gt;&lt;br /&gt;The best information I could find was from &lt;a href="http://syndic8.com/"&gt;syndic8.com&lt;/a&gt;.  But frankly, it wasn't very satisfying. The site didn't feel very complete, and in the end only showed essentially the ratio between RSS and Atom offerings. They listed about 1/2 a million feeds - 82% of which were some flavor of RSS.&lt;br /&gt;&lt;br /&gt;What I want to know is the ratio between active usages (i.e. fetches) of the two formats. Lots of sites offer both formats - but which do users actually consume?&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.feedburner.com/"&gt;Feedburner&lt;/a&gt; clearly has this info - but I couldn't find it published anywhere.&lt;br /&gt;&lt;br /&gt;Does anybody have more information?&lt;br /&gt;&lt;/todo&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-3665560665018742918?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3665560665018742918'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3665560665018742918'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/01/web-syndication-format-market-share.html' title='Web Syndication Format Market Share'/><author><name>Patrick 
McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-123639321677961591</id><published>2008-01-17T14:23:00.000-05:00</published><updated>2008-01-20T00:19:11.874-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><title type='text'>Characterization Zealots Unite!</title><content type='html'>The eponymous Kode Vicious over at ACM's Queue magazine has an &lt;a href="http://acmqueue.com/modules.php?name=Content&amp;amp;pa=showpage&amp;amp;pid=516"&gt;excellent rant&lt;/a&gt; on the value of measuring instead of assuming. I read it in print a ways back, now that it is in digital form it deserved a blog shoutout.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-123639321677961591?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/123639321677961591'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/123639321677961591'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2008/01/characterization-zealots-unite.html' title='Characterization Zealots Unite!'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-705288029699236282</id><published>2007-08-04T13:39:00.000-04:00</published><updated>2007-08-04T14:28:59.334-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><title type='text'>Lamenting ECN deployments</title><content type='html'>Explicit Congestion Notification (&lt;a href="ftp://ftp.isi.edu/in-notes/rfc3168.txt"&gt;ECN&lt;/a&gt;) has had &lt;a href="http://citeseer.ist.psu.edu/cis?q=ecn&amp;amp;cs=1"&gt;lots of papers&lt;/a&gt; written about it. It has also been in various stages of deployment for almost 10 years now. I generally agree with those that feel it is a good thing. Beyond the scientific data, always important of course, at a gut level separating data loss from congestion notification is an obviously good thing to do - they are simply different things and the implicit overload TCP currently uses results in creating unnecessary overflows and over-conservative backoffs.&lt;br /&gt;&lt;br /&gt;But the sad fact is that ECN just isn't relevant to the real world Internet. If you can't become relevant in 8 years or so, it is probably time to try something else. For a long time the issue was getting over interop problems with NATs and &lt;a href="http://seclists.org/firewall-wizards/2000/Sep/0203.html"&gt;Firewalls.&lt;/a&gt; Then there was the matter of getting widespread client and router deployments. 
Linux has had client support for a long time, other Unixes more recently, and Microsoft Vista is the first MS OS to include ECN support at all - but it ships disabled by default.&lt;br /&gt;&lt;br /&gt;However, it seems that even as clients are catching up, the &lt;a href="http://tools.ietf.org/html/draft-briscoe-tsvwg-re-ecn-tcp-04#section-7"&gt;routing infrastructure still isn't playing along&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I took a sample from my home network to see if ECN was relevant at all to my computing life. The summary is that ECN is irrelevant in my data set.&lt;br /&gt;&lt;br /&gt;My network runs a lot of Linux with ECN enabled, so if anything ECN will be over-represented on my network compared to the Internet at large. The sample covered&lt;br /&gt;&lt;ul&gt;&lt;li&gt;8.5 days&lt;/li&gt;&lt;li&gt;1,227,473 IP packets&lt;/li&gt;&lt;li&gt;40,914 TCP flows&lt;/li&gt;&lt;/ul&gt;Of those 41K flows, almost 8 percent (3268) negotiated ECN between peers. As I already mentioned, I suspect this pretty meager number is higher than the Internet at large.&lt;br /&gt;&lt;br /&gt;8% - that's great and might make a real contribution. But without router support, those eligible flows won't actually use ECN in any meaningful sense. There are two signs of ECN actually taking root with an intermediate router:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The TCP ECN ECHO flag bit is set on an incoming packet. This tells the receiver that an earlier packet it sent was marked by a router on its way to the destination. The peer is setting this flag in order to tell the original sender to slow down so that doesn't keep happening.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The IP header ECN codepoint is set to 3 on an incoming packet. 
This indicates to the receiver that the packet hit some congestion on the way and a router set this bit to mark that fact.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;There were no ECN ECHO flags in the entire trace, except those set on SYN packets where it is used in order to negotiate endpoint knowledge of ECN. When appearing on a SYN it is not a sign of router support.&lt;br /&gt;&lt;br /&gt;There were 87 packets (over 8 different TCP flows) with the ECN codepoint of 3 - normally indicating congestion marks added by a router. However, that is a bogus conclusion in this case and it does not appear that in 1.2 million packets I have a single one that was marked as congestion-influenced.&lt;br /&gt;&lt;br /&gt;I can say conclusively that the 87 codepoint 3 packets are false positives because none of the 8 flows containing those packets were included in the 8% of flows that had successfully negotiated the use of ECN. The peer must have been using the codepoint to indicate something different entirely. Each flow was with a different host, though they were all SMTP MTAs based in Europe (7 of them in Germany).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.icir.org/floyd/ecn.html"&gt;Sally Floyd&lt;/a&gt; says on her webpage:&lt;br /&gt;&lt;blockquote&gt;"David Moore from CAIDA reports that in measurements at one link, 0.1% of the packets had the CE codepoint set. 
Either this codepoint was being used for some other purpose, or there is some deployment of ECN capability in routers as well as in TCP stacks."&lt;br /&gt;&lt;/blockquote&gt;My data at least hints that the hope that there is some deployment of ECN capability in routers is over-optimistic.&lt;br /&gt;&lt;br /&gt;The final wrapup - my traces are characterized by meager (&lt; 10%) client support and no indication of router support at all.&lt;br /&gt;&lt;br /&gt;I was inspired to take a look at this by the work of Bob Briscoe, who is championing an &lt;a href="http://www.cs.ucl.ac.uk/staff/B.Briscoe/projects/refb/#Presentations"&gt;IETF BoF on Re-ECN&lt;/a&gt;, which is an attempt to make lemonade from lemons and re-invigorate the basic underlying technology.&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-705288029699236282?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bitsup.blogspot.com/feeds/705288029699236282/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=8147202175434463396&amp;postID=705288029699236282' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/705288029699236282'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/705288029699236282'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2007/08/lamenting-ecn-deployments.html' title='Lamenting ECN deployments'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-5718986961259695682</id><published>2007-07-06T19:31:00.000-04:00</published><updated>2007-07-06T18:31:22.644-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='congestion control'/><title type='text'>In Support of Math Based Computer Science</title><content type='html'>Earlier in my day, I ran across a &lt;a href="http://www.itwire.com.au/content/view/13339/53/"&gt;book review&lt;/a&gt; for &lt;a href="http://www.amazon.com/Computer-Science-Reconsidered-Invocation-Expression/dp/0471798142/ref=sr_1_1/105-9314019-7574818?ie=UTF8&amp;s=books&amp;amp;qid=1183742822&amp;sr=8-1"&gt;Computer Science Reconsidered: The Invocation Model of Process Expression&lt;/a&gt;. The premise of the book, at least garnered second hand from the review:&lt;br /&gt;&lt;blockquote&gt;Mathematicians and computer scientists are pursuing fundamentally different aims, and the mathematician's tools are not as appropriate as was once supposed to the questions of the computer scientist&lt;br /&gt;&lt;/blockquote&gt;I had a number of reactions to that immediately. Most of them were, frankly, emotional. Certainly most of the code that gets churned out these days has very little conscious basis in Mathematics. But I would argue it doesn't have much of a conscious basis in Computer Science, Algorithms, or Finite Automata either which are all definitely critical some of the time. That is largely because so much of it is so well understood, and so abstracted away from first principles, that the underlying rigor isn't required to get the day to day bits churned out. 
The more important the code is, the more it moves down that spectrum of rigor.&lt;br /&gt;&lt;br /&gt;But when we're really reaching for something new, something interesting, and something that isn't just an incremental change from the conventional wisdom, then my instincts say that Math provides a very valuable framework for describing something out of nothingness.&lt;br /&gt;&lt;br /&gt;My gut was validated just hours later when reading an &lt;a href="http://netlab.caltech.edu/pub/papers/fast-network05.pdf"&gt;IEEE journal article&lt;/a&gt; on the so-called &lt;span style="font-style: italic;"&gt;Fast&lt;span style="font-style: italic;"&gt;TCP&lt;/span&gt;&lt;/span&gt; active queue management TCP congestion control algorithm. The approach looks at congestion control as "a distributed algorithm over the Internet to solve a global optimization problem [.. to ..] determine the equilibrium and performance of the network"&lt;br /&gt;&lt;blockquote style="font-style: italic;"&gt;Moreover, the underlying optimization problem has a simple structure that allows us to efficiently compute these equilibrium properties numerically, even for a large network that is hard to simulate.&lt;br /&gt;&lt;br /&gt;Specifically, we can regard each source as having a utility function that measures its “happiness” as a function of its data rate. Consider the problem of maximizing the sum of all source utility functions over their rates, subject to link capacity constraints. This is a standard constrained optimization problem for which many iterative solutions exist. The challenge in our context is to solve for the optimal source rates in a distributed manner using only local information. A key feature we exploit is the duality theory. It says that associated with our (primal) utility maximization problem is a dual minimization problem. 
Whereas the primal variables over which utility is to be maximized are source rates, the dual variables for the dual problem are congestion measures at the links.  Moreover, solving the dual problem is equivalent to solving the primal problem. There is a class of optimization algorithms that iteratively solve for both the primal and dual problems at once.&lt;br /&gt;&lt;br /&gt;TCP/AQM can be interpreted as such a primal-dual algorithm that is distributed and decentralized, and solves both the primal and dual problems. TCP iterates on the source rates (a source increases or decreases its window in response to congestion in its path), and AQM iterates on the congestion measures (e.g., loss probability at a link increases or decreases as sources traversing that link increase or decrease their rates). They cooperate to determine iteratively the network operating point that maximizes aggregate utility. When this iterative process converges, the equilibrium source rates are optimal solutions of the primal problem and the equilibrium congestion measures are optimal solutions of the dual problem. The throughput and fairness of the network are thus determined by the TCP algorithm and the associated utility function, whereas utilization, loss, and delay are determined by the AQM algorithm.&lt;br /&gt;&lt;/blockquote&gt;It seems clear here that math provides a very strong underpinning for what the article needs to describe and achieve. To be fair to the author of the original book, he was trying to promote another basis for expressing key Computer Science thoughts: "the invocation model of process expression". From a casual glance that looks interesting; I just don't get why you have to tear down something old (e.g. "The Problem: Why the underlying theory of contemporary computer science is not helpful") in order to build up something new.&lt;br /&gt;&lt;br /&gt;Maybe being shocking is good for selling books. 
Though, I'm not sure the labeling of math as not helpful is all that shocking to the general book-buying population.&lt;br /&gt;&lt;br /&gt;Check out &lt;a href="http://www.fastsoft.com/"&gt;www.fastsoft.com&lt;/a&gt;, where some of the authors of that paper have created a clever hardware bridge to seamlessly migrate a legacy TCP data center into one that sends with the FastTCP congestion control algorithm.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-5718986961259695682?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bitsup.blogspot.com/feeds/5718986961259695682/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=8147202175434463396&amp;postID=5718986961259695682' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5718986961259695682'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/5718986961259695682'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2007/07/in-support-of-math-based-computer.html' title='In Support of Math Based Computer Science'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-3368045129259008406</id><published>2007-06-26T20:11:00.000-04:00</published><updated>2007-06-26T20:39:37.102-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category 
scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><title type='text'>HTTP Client Handshake Characterization</title><content type='html'>Continuing in the "what's the latency on my DSL connection" theme (see &lt;a href="http://bitsup.blogspot.com/2007/06/more-characterization-dns-latency.html"&gt;outbound DNS&lt;/a&gt; and &lt;a href="http://bitsup.blogspot.com/2007/03/characterizing-latency-of-my-mail.html"&gt;incoming SMTP&lt;/a&gt; posts), we finally get to looking at outbound HTTP connection latency.&lt;br /&gt;&lt;br /&gt;I expected this sample to be the best of the lot for two reasons. First, these servers are self selected by members of my household and therefore have some kind of inherent locality to me. Second, web hosting implies a certain amount of infrastructure and expenditure that the other samples would not necessarily exhibit.&lt;br /&gt;&lt;br /&gt;Let's face it - there is more than one Internet delivery system and if you will pay more you get a higher class of service (whether that be uncongested links, content distribution, etc.) and webhosting correlates with folks paying that tariff in a way that SMTP clients do not.&lt;br /&gt;&lt;br /&gt;My expectations held up. The numbers are actually even better than expected.&lt;br /&gt;&lt;br /&gt;The sample:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;13,282 handshakes&lt;/li&gt;&lt;li&gt;708 unique servers&lt;/li&gt;&lt;li&gt;750 MB of HTTP data&lt;/li&gt;&lt;li&gt;12.5 days&lt;/li&gt;&lt;/ul&gt;Here is the data, ranging from 25ms at best, through an impressive median of 48ms, to a worst case of 1.5 minutes. 
TCP's exponential backoff, driven by hardcoded multi-second timers, kicks in quite visibly around the 99th percentile, resulting in some extreme outliers.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;percentile - latency (ms)&lt;br /&gt;best - 25&lt;br /&gt;10 - 35&lt;br /&gt;20 - 38&lt;br /&gt;30 - 40&lt;br /&gt;40 - 43&lt;br /&gt;50 - 48&lt;br /&gt;60 - 52&lt;br /&gt;70 - 61&lt;br /&gt;80 - 101&lt;br /&gt;90 - 116&lt;br /&gt;worst - 93112 (1.5 mins)&lt;br /&gt;&lt;/pre&gt;The mean is 143ms (thanks to some really big outliers due to exponential backoff of hardcoded 3 second timers), so here the median is much more representative. A full 79 percent of handshakes are completed in an RTT of 100ms or less. More impressively, 70 percent were 61 ms or faster - which is certainly fast enough for most applications.&lt;br /&gt;&lt;br /&gt;These positive results, combined with the slower DNS and SMTP client numbers, show us that the client/server model of the web is provisioned much more effectively than any given link in a real peer to peer setup. This certainly shouldn't be a surprise, but it does give the lie to any diagram of the 'net that uses unweighted edges.&lt;br /&gt;&lt;br /&gt;This is my last post on latency from my little spot on the grid. I promise.&lt;br /&gt;&lt;br /&gt;On a related thought, &lt;a href="http://www.mnot.net/blog/2007/06/20/proxy_caching"&gt;Mark Nottingham has a great post&lt;/a&gt; dealing with support for various aspects of HTTP in network intermediaries. 
I like characterization studies so much because they provide real data about what to optimize for; Mark's post provides real data about what to worry about in implementations.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-3368045129259008406?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3368045129259008406'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/3368045129259008406'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2007/06/http-client-handshake-characterization.html' title='HTTP Client Handshake Characterization'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-8014084939346978306</id><published>2007-06-20T19:39:00.000-04:00</published><updated>2007-06-20T19:00:57.767-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='dns'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>More Characterization - DNS Latency</title><content type='html'>A little while back &lt;a href="http://bitsup.blogspot.com/2007/03/characterizing-latency-of-my-mail.html"&gt;I posted about the incoming TCP handshake latencies on my boutique broadband mail server.&lt;/a&gt; In short, they 
were awful - 184ms median and 77% of all handshakes took more than 100ms. A few hypotheses were drawn:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;A lot of my mail is spam. Much of the spam comes from botnet-owned hosts on consumer internet connections. Because of this I am not so much measuring latency from my edge to core Internet services, but instead to other edges. If this is true, it actually has some interesting peer to peer insights.&lt;/li&gt;&lt;li&gt;Because we are likely dealing with "owned" botnets sending the spam, those hosts are distributed more uniformly across the world than the services I actually choose to use day to day, which exhibit greater locality to my part of the world. Therefore I am getting real data, but maybe not data that is especially insightful to my day to day network usage.&lt;/li&gt;&lt;li&gt;My results may not be reflective of general edge connectivity; I might just have lousy service.&lt;/li&gt;&lt;li&gt;The handshake latency should be dominated by the network, but the application at the other end plays a role too. Owned spam generators may not be running up to the commercial-grade mail client standards generally expected of Internet infrastructure.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;I posed the open question of whether the results would be the same for other protocols. DNS and HTTP were of particular interest. This post is about DNS client performance.&lt;br /&gt;&lt;br /&gt;I have now built a packet trace that covers 6 days and about 130,000 DNS request/response pairs. I was astonished there were so many in less than a week from a home LAN. The trace was taken upstream from my home LAN caching recursive resolver - so the redundancy was removed from the data where TTL based caching could do so. The 130,000 transactions were done across 8854 different servers - this was also an astonishing amount of diversity. 
It comes down to just 14 per server on average; I would have expected a lot more server reuse.&lt;br /&gt;&lt;br /&gt;The results are indeed better than the SMTP handshakes. We are now dealing with infrastructure class servers and it shows. But frankly, the latency numbers are still surprisingly high - 41% of all lookups still take more than a very noticeable 100ms. Remember, also, that starting many webpages requires at least two uncached lookups (one from the root name servers and one from the zone's name server itself) - that can be a really long lag.&lt;br /&gt;&lt;br /&gt;Here are the numbers, ranging from a best of 24ms, to a worst of 17 minutes. That latter number is certainly an outlier. 99+% of all transactions were complete in 401ms or less.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;percentile    latency (ms)&lt;br /&gt;best                  24&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;10                    41&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;20                    43&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;30                    55&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;40                    71&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;50                    77&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;60                   101&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;70                   114&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;80                   128&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;90                   180&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;99                   401&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;worst          1,039,356 (17 minutes!)&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br 
/&gt;Any lookup that did not complete was not included in the dataset. That 17 minute one is fascinating - it is a genuine reply to the original lookup request of kvack.org (probably generated by the spam filtering software, which had received a legitimate message from someone @kvack.org and was checking whether the name resolved as part of its spam scoring system) - it was not a reply to a client generated retransmission or anything like that. That request must have been buried in quite a queue somewhere! It is hard to imagine that the DNS client hadn't timed out the transaction by the time the response arrived, but the packet trace does not give any insight into that. These &lt;span style="font-weight: bold;"&gt;very long&lt;/span&gt; transactions are exceptionally rare - only 46 of the 130,000 transactions took more than 3 seconds.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;As you see, the median is 77ms&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The mean is 112ms (104 removing the 17 minute outlier from the data)&lt;/li&gt;&lt;li&gt;41 percent of all transactions took over 100ms&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;For the curious, I generated the latency numbers using this little &lt;a href="https://www.ducksong.com/misc/dns-rtt.cpp"&gt;ad-hoc piece of C&lt;/a&gt;, linked off this &lt;a href="https://www.ducksong.com/"&gt;page of wonder&lt;/a&gt;. The stats were just done with command line awk scripts.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;span style="font-size:130%;"&gt;What does it Mean?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;I can draw some weak conclusions:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;DNS latency is much better than TCP handshake latency on my mail server - indeed almost twice as good. It seems likely that is because DNS is dealing with infrastructure class servers (both in terms of location and function), whereas much of the email traffic was probably botnet generated spam out at the edges of the network - just like my host. 
So much for net neutrality, eh? There are already multiple tiers of service in full effect!&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Latency still sucks. 100ms round trips are deadly and common.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;It will be interesting to see how the HTTP client handshake latency numbers compare. The DNS numbers suffer some skew away from the common usage patterns of the edge users because they are looking up email domains from spam and mailing list contributors, etc. The HTTP numbers ought to be purer in that respect.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-8014084939346978306?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bitsup.blogspot.com/feeds/8014084939346978306/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=8147202175434463396&amp;postID=8014084939346978306' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8014084939346978306'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8014084939346978306'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2007/06/more-characterization-dns-latency.html' title='More Characterization - DNS Latency'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-8164446663507095346</id><published>2007-06-14T09:39:00.000-04:00</published><updated>2007-06-14T09:53:46.499-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='disk'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='lwn'/><category scheme='http://www.blogger.com/atom/ns#' term='google'/><title type='text'>Disk Drive Failure Characterization</title><content type='html'>I've &lt;a href="http://bitsup.blogspot.com/2007/03/characterizing-latency-of-my-mail.html"&gt;admitted previously&lt;/a&gt; that I have a passion for characterization. When you really understand something you can be sure you are targeting the right problem, and the only way to do that with any certainty is data. Sometimes you've got to guess and make educated inferences, but way too many people guess when they should be measuring instead.&lt;br /&gt;&lt;br /&gt;Val Henson &lt;a href="http://lwn.net/Articles/237924/"&gt;highlights&lt;/a&gt; on &lt;a href="http://www.lwn.net/"&gt;lwn.net&lt;/a&gt; a couple of great hard drive failure rate characterization studies presented at the &lt;a href="http://usenix.org/events/fast07/"&gt;USENIX File Systems and Storage Technology Conference&lt;/a&gt;. They cast doubt on a couple of pieces of conventional wisdom: hard drive infant mortality rates, and the effect of ambient temperature on drive lifetime. This isn't gospel: every characterization study is about a particular frame of reference, but it is still very interesting. 
Val Henson, as usual, does a fabulous job interpreting and showing us the most interesting stuff going on in the storage and file systems world.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-8164446663507095346?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8164446663507095346'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/8164446663507095346'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2007/06/disk-drive-failure-characterization.html' title='Disk Drive Failure Characterization'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-2043249188314495385</id><published>2007-06-05T11:20:00.000-04:00</published><updated>2007-07-06T14:27:17.408-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><title type='text'>Fairness is good - TCP fairness misses the point</title><content type='html'>For the longest time, any proposed enhancements to TCP congestion control were measured against 2 criteria&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Is it more efficient in some sense than what we've got&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Is it TCP friendly (loosely defined as not harming the share of bandwidth legacy TCP flows would get if the new algorithm were deployed in a mixed legacy environment - like the Internet for example)&lt;/li&gt;&lt;/ul&gt;The first 
criterion makes sense; the second just ties our hands.&lt;br /&gt;&lt;br /&gt;The problem with TCP friendliness as a requirement is that it supposes TCP flows represent a 1:1 proxy for entitlement. That of course isn't true: a single person could be using multiple TCP flows at any moment, along with some UDP traffic and some other non TCP/IP traffic. The multiple flows do not coordinate, and these latter types of flows are often not congestion controlled at all - much less TCP friendly! It is the aggregate of all of this traffic that really drives the per user cost.&lt;br /&gt;&lt;br /&gt;More and more applications are sensibly opening parallel TCP flows to feed the application. They are sometimes accused of being greedy - doing this to grab an unfair share of bandwidth. But that is only one reason an application might do so. A TCP flow overloads a number of different properties into a single connection. The connection provides reliable delivery, congestion control, and in-order delivery. In-order delivery is handy for many things, but it can also cause head of line blocking problems - some application architectures create multiple flows so as to separate classes of messages and data by priority. This is a sensible thing to do, but the multiple flows don't coordinate their congestion control properties.&lt;br /&gt;&lt;br /&gt;Applications get accused of greed in situations like this, regardless of their motivations. But in truth, oftentimes it is a net loss in performance for the application. A single &lt;font style="font-weight: bold;"&gt;hot&lt;/font&gt; TCP stream with a fully open congestion window is much preferable from a bandwidth point of view to opening a new one. The new one needs to go through a high latency three-way handshake and then through a slow start period to ramp up its congestion window - much less efficient than using a fully established flow in the beginning.&lt;br /&gt;&lt;br /&gt;That's just for latency reasons. 
If the application is harmed by in-order or even reliable delivery guarantees (e.g. streaming multimedia), the app will likely use something like UDP, which is not congestion controlled at all, or DCCP, which is "TCP friendly" congestion controlled but is congestion independent of any other existing traffic that is going on. Again - the realm of congestion control is just for the single flow.&lt;br /&gt;&lt;br /&gt;Don't get me wrong. I know lots of applications (p2p especially) open multiple flows in order to hog bandwidth. The math is easy - if there are two users would you rather have 1 of 2 shares or (by virtue of opening 3 extra flows) have 4 of 5? Still two users, but now that bandwidth is distributed 80/20 instead of 50/50.&lt;br /&gt;&lt;br /&gt;So the flow is clearly not the right spot to be thinking about fairness - which makes TCP friendliness kind of goofy. What is the right granularity - the application? the user? the computer? the lan? the organization?&lt;br /&gt;&lt;br /&gt;Is the right definition of fair "one person one packet" or is it somehow "pay per class of service"?  Clearly this kind of thing cannot be policed by the end hosts, but can they pay more attention to the distributed policing instead of relying on implicit feedback such as inflated RTTs or forced drops? (ECN plays a role here.)&lt;br /&gt;&lt;br /&gt;A sensible first step is to unravel some of the TCP overlaps. A number of years ago thoughts on shared congestion managers were popular - keeping multiple flows in one window. This way applications could take advantage of multiple flows for creating independent data streams without the rate implications such strategies currently exhibit. The idea should be expanded to cover multiple hosts and protocols (yes, I understand that isn't an easy proposition) so that in the end the fairness granularity can be defined by local policy.&lt;br /&gt;&lt;br /&gt;I used to think about this kind of thing regularly.. 
but now as Veronica Mars would say I haven't thought of you lately (c'mon now sugar!). But Bob Briscoe gets it absolutely right with a paper in the April 2007 ACM SIGCOMM CCR - &lt;a href="http://www.sigcomm.org/ccr/drupal/?q=node/172"&gt;Flow Rate Fairness: Dismantling a Religion&lt;/a&gt;.&lt;br /&gt;&lt;h1 class="title"&gt;&lt;br /&gt;&lt;/h1&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-2043249188314495385?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2043249188314495385'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/2043249188314495385'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2007/06/fairness-is-good-tcp-fairness-misses.html' title='Fairness is good - TCP fairness misses the point'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-8147202175434463396.post-6664669695481343655</id><published>2007-03-03T18:12:00.001-05:00</published><updated>2007-07-06T14:27:59.219-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='smtp'/><category scheme='http://www.blogger.com/atom/ns#' term='characterization'/><category scheme='http://www.blogger.com/atom/ns#' term='latency'/><title type='text'>Characterizing the Latency of my Mail Server</title><content type='html'>Characterization fascinates 
me. Knowing what you are trying to solve in detail lets you focus on what is important. Sometimes what is important is everything that is possible and sometimes it is everything that is likely. Either way, both what is possible and what is likely for any given problem tend to be a lot smaller than an unbounded "anything at all". This is true for computers, business, planning your vacation, whatever. Characterization is key.&lt;br /&gt;&lt;br /&gt;One of my favorite books of all time, &lt;a href="http://www.amazon.com/Web-Protocols-Practice-Networking-Measurement/dp/0201710889/ref=pd_bbs_sr_1/102-8054558-2724915?ie=UTF8&amp;amp;s=books&amp;amp;qid=1172964971&amp;amp;sr=8-1"&gt;Web Protocols and Practice&lt;/a&gt;, does a really great job of this for the web circa 2002. The web has changed in some ways since then and an update would be welcome, but many of the fundamentals still apply.&lt;br /&gt;&lt;br /&gt;In 2002 I started working on XML-aware networking. That space was changing so fast it was very hard to characterize the workloads we were seeing. That meant it was harder to build really great products: when an average workload is 50 bytes one day and 50 megabytes the next, you want to focus on different things. The XML space still shows lots of variation, but it is maturing now in a way that makes it ripe for a treatment like WPaP.&lt;br /&gt;&lt;br /&gt;Anyhow, I was thinking about this the other day when I was reading a paper about &lt;a href="http://wil.cs.caltech.edu/pfldnet2007/paper/YeAH_TCP.pdf"&gt;YeAH-TCP&lt;/a&gt;. It is popular nowadays to attack the high bandwidth-delay problems TCP is well known for. This used to be a research problem, but now it is thought to impact common desktop stacks too. That got me wondering a bit. I spend a lot of time in the datacenter working to fill high-speed, low-latency links.. 
a few years back when I was in the ISP world at AppliedTheory I saw an awful lot of low bandwidth and low latency links (it was 1999 - the Internet core was great, but the last miles were still comparatively slow - 30Mbps was big bucks to your door). Now home users are seeing big bandwidths (FiOS, U-verse, etc.), but I have no idea what has happened to a typical desktop RTT in the past few years.&lt;br /&gt;&lt;br /&gt;Being a do-it-yourself kind of guy, I run a mail server for a vanity domain over standard copper DSL. In order of frequency it receives: spam, linux-kernel mail, other mailing list mail, and an occasional note someone actually wrote with me in mind. I figured it would be easy enough to do a tcpdump capture of incoming SMTP connections and post-process that to figure out what RTTs looked like these days.&lt;br /&gt;&lt;br /&gt;It turns out that figuring out the elapsed time of a TCP handshake from a packet trace is not particularly easy. I can usually cobble something together with tcpdump, or tcpflow, or maybe Wireshark, but I couldn't figure out how to say "show me only the syn-ack and the ack to that" for every stream. I ended up writing some very hackish &lt;a href="https://www.ducksong.com/misc/calc-rtt.cpp"&gt;C code&lt;/a&gt;. Anybody who knows me also knows I enjoyed doing that, but it was a chunk of very unportable work that should have been more scriptable.&lt;br /&gt;&lt;br /&gt;Anyhow - on to the results. The capture covered about 24 hours. The server is not very busy. It received 1536 incoming connections over that time, and managed to complete the handshake on 99.1% (1523) of them. I have divided the characteristics into "all connections", "all non-lkml connections", and "just lkml connections". 
All latencies are in milliseconds.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;                       ALL       NO-LKML        ALL-LKML&lt;br /&gt;samples                1523      1257           266&lt;br /&gt;--&lt;br /&gt;100th pct              61021     61021          161&lt;br /&gt;90th  pct              795       978            113&lt;br /&gt;50th  pct              184       233            100&lt;br /&gt;10th  pct              90        81             99&lt;br /&gt;0th   pct              27        27             98&lt;br /&gt;--&lt;br /&gt;mean                   585       687            102&lt;br /&gt;--&lt;br /&gt;pct &gt; 100ms            77        86             50&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;So there you go - if you believe this is representative then more than 3/4 of connections out there are floating around with &gt;= 100ms RTTs. Even communications with a significant high-volume server in my own timezone (vger.kernel.org) are likely to be in that neighborhood. The days of high bandwidth-delay do anecdotally seem to have arrived on the desktop.&lt;br /&gt;&lt;br /&gt;Interesting questions:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Spam comes from bots - is that different than 'legit' traffic? If so, does that matter?&lt;/li&gt;&lt;li&gt;This is server side. Would it look different if I were measuring handshakes I initiated?&lt;/li&gt;&lt;li&gt;Is SMTP just like HTTP? 
just like video?&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8147202175434463396-6664669695481343655?l=bitsup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/6664669695481343655'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8147202175434463396/posts/default/6664669695481343655'/><link rel='alternate' type='text/html' href='http://bitsup.blogspot.com/2007/03/characterizing-latency-of-my-mail.html' title='Characterizing the Latency of my Mail Server'/><author><name>Patrick McManus</name><uri>https://profiles.google.com/100166083286297802191</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-dlwo2_tBwmU/AAAAAAAAAAI/AAAAAAAACnM/sfHDzgT9wGg/s512-c/photo.jpg'/></author></entry></feed>
