Digg's New DUI.Stream and MXHR

aston · on April 22, 2009

Interesting idea, attempting to crunch all of the requests for a page snippet into a single request. Why do it in Javascript, though?

It seems like you'd be better off with a pre-processor server-side to inline include CSS, Javascript and images, then send the resulting page back for rendering. Javascript isn't nearly as fast as my C++-based templating engine...

edit to add: Also, don't discount the big savings you can get from "302 Not Modified" return statuses, especially on big chunks of content like images. The DUI.Stream image demo they had would get smoked by the 'dumb' version if they allowed the browser to cache the digg dude.

thamer · on April 22, 2009

I suspect there's not much to gain by caching images in the browser: If you're looking at a page full of comments with hundreds of avatars, chances are the next page is going to make you load hundreds more that you haven't seen already: There is a very large number of commenters, and you probably won't see the same people on every page.

The demo is right: they ask for 300 different images (as in, different URLs), and their packager makes it faster.

Also, how would you inline include images? Using base 64 in the image tag?

aston · on April 22, 2009

The demo is pretty unfair. If you want to package lots of the same image, you just use it and let cache control do its job. If you want to package lots of different images, you should use a big sprite (again, server-side packaging without any js). If you want to make image loading look slow, do what they did in the demo.

cakeface · on April 22, 2009

There are a bunch of images that can get repeated across pages for a web site that are completely vital to cache. Think rounded corner images or background images or icons.

ncrovatti · on April 23, 2009

Lazy loading seems more efficient to me for this particular use.

ojbyrne · on April 22, 2009

They do plenty to cache images (including the digg dude). This is about the first load. Yahoo has some report showing that data is uncached quite a bit of the time, even when sending the appropriate headers.

tjpick · on April 22, 2009

what advantage does this have over using something like keep-alive? http://httpd.apache.org/docs/1.3/keepalive.html

Keep alive, to me, seems better as you don't have to modify the frontend architecture. Turn it on at the server and let the server and browser do the hard work.

zemaj · on April 22, 2009

Keep alive maintains a persistent connection, you are still limited by the number of concurrent connections to the server and the latency that each one experiences.

MHXR bundles all your requests into one. This leads to savings the latency overhead on each request (& probably some server resources not having to deal with multiple connections too).

I tell you what; piping binary data into objects? Genius! I had no idea that was possible.

tjpick · on April 23, 2009

yeah, my understanding was that the overhead of a request is significantly less than the overhead of establishing a connection. So by using keepalive/persistent connections, you get the majority of the savings. All the headers and data for each file still have to get sent over the wire in the digg solution anyway, and the concurrent connections limit is the same in both cases.

But thanks for explaining, I see the latency particularly is an issue.

Glide · on April 23, 2009

I found this interesting tidbit on the HAProxy page:

Keep-alive was invented to reduce CPU usage on servers when CPUs were 100 times slower. But what is not said is that persistent connections consume a lot of memory while not being usable by anybody except the client who openned them. Today in 2009, CPUs are very cheap and memory is still limited to a few gigabytes by the architecture or the price. If a site needs keep-alive, there is a real problem. Highly loaded sites often disable keep-alive to support the maximum number of simultaneous clients. The real downside of not having keep-alive is a slightly increased latency to fetch objects. Browsers double the number of concurrent connections on non-keepalive sites to compensate for this.

Don't know if it's true or not, but it doesn't take much thought to realize if someone wanted to DDOS a server they would use persistent connections.

timdorr · on April 22, 2009

Even better: HTTP pipelining - http://en.wikipedia.org/wiki/HTTP_pipelining

Of course, that's controlled client-side and often doesn't request all that many resources at once. This is a server-side pipelining that lets them determine their own capacity (which makes sense).

Of course, I'm not really sure how they're going to do this in IE, since it doesn't support decoding base64 content to image data on the page. Something tells me this is going to hit a bit of a brick wall.

zemaj · on April 22, 2009

Pipelining does not work in most browsers (& is disabled in FF by default). When it does work it can be flaky. A JS solution works on current technologies.

mr_justin · on April 23, 2009

Unless I'm reading their demo wrong, their technique makes the page load slower (for me anyway): http://demos.digg.com/stream/streamDemo.html

Normal is consistently under 100ms and MXHR is around 400ms (using Safari 4 beta, OS X)

mitchellh · on April 23, 2009

I get the same thing with the text version. I refreshed 20 times and only got 1 that was faster (and only 7ms to boot).

But if you try this in a non-IE browser: http://demos.digg.com/stream/imageDemo.html

The image demo performs amazingly well! I wish there was a timer for that one but its extremely fast.

mitchellh · on April 23, 2009

CORRECTION TO THIS BY AUTHOR: There is a timer, it just waits until all the normal images load (and I was too lazy to wait). Upon refreshing, the MXHR stream was 10.3x faster on my FF.

mr_justin · on April 23, 2009

Hmm, I tried 10-20 times and MXHR was always slower. Sometimes a full half-second slower. I had some times approaching 1-second load time, meanwhile the "normal" load was consistently between 350-400ms.

I know it's just a proof of concept and yes, YMMV, but couldn't they come up with a demo that clearly showed this new technology they invented was worth using?

mikeytown2 · on April 23, 2009

Use this trick instead of CSS sprites.

thorax · on April 23, 2009

In all of the browsers I tried (FF3, IE8, Chrome), the timer gave me 4x slower using the MXHR stream on first try. But when I refreshed it was always wildly faster for the MXHR side (and still substantially faster on cache refresh).

schtono · on April 22, 2009

Did you figure out how they're transforming the serialized image data into real images? I remember there was something like a data-attribute in the img-tag back in the days, but i thought it was deprecated years ago...?

Any hints?

a-priori · on April 22, 2009

It looks like they do it by replacing the image tags in the DOM with an object tag pointing to a "data:" uri containing the Base64-encoded image data. Here's two pieces of relevant code, first where they fire off the listeners for the DOM objects, and second where they replace the DOM objects:

http://github.com/digg/stream/blob/66dcce320c73b109cd5836e44...

http://github.com/digg/stream/blob/23a411eb0cc00048bbb090889...

Glide · on April 22, 2009

I don't think they're replacing any objects.

Every time a Content-Type of image/gif is encountered it appends a new object to #stream with the image data inlined. I was really confused at the object tags at first, so I went ahead and copy pasted one of the object tags with image data into a blank html and loaded it into firefox. The image showed up fine.

This is an extremely clever solution. I know jQuery UI's icons are in one giant image file and background offsets are used to display the correct icon, but this is probably better for a dynamic solution because the amount of CPU it would take to stitch upwards of 100 images together.

smhinsey · on April 22, 2009

that's called a sprite, that giant jquery ui file with multiple images, in case you want to find more info.

Glide · on April 22, 2009

They have an object tag with base64 encoded images.

vdm · on April 22, 2009

Hmm, this goes some way to making up for the diggbar.