Finding things quickly

There are many ways of finding things in the DOM - sometimes, many ways of evaluating apparently the same expression. For example, CSS and JQuery allow you to find elements by id, using an expression of the form

  var myElement=$("#my-element-id");

as well as the apparently equivalent expression

  var myElement=$("[id=my-element-id]");

However, these expressions are different in many ways. One advantage of the latter is that it is more resistant to escaping issues - for example, the colon character : is actually a valid part of an XML id, but is not correctly handled in many escaping situations. JQuery will not accept it as part of a # expression, but is fine in an id specification. This was the basis of our adoption of the latter for the fluid.jById utility.

However, these types of selectors are treated very differently within the nuts-and-bolds of the jQuery selector engine. # selectors have a very special status, which enables numerous shortcuts including the judicious use of document.getElementById where possible. id selectors are just standard attribute value filters, and will perform a full scan of the DOM.

document.getElementById("my-id") 8.5µs
$(document.getElementById("my-id")) 24µs
$("[id=my-id]“) 12ms

These tests were done on Firefox 2 in a rather busy document of about 40K of HTML.

Note that the cost of simply wrapping a DOM element in a jQuery object is small, but still noticeable.

Cost of guarding

This immediately suggests a policy of falling back on document.getElementById for quick lookups - however, jQuery, being a mature framework, is doing some work for us here in insulating us from an awkward IE bug where document.getElementById is sometimes returning an element which has a match on *name* rather than a match on *id*. This is a rather awkward tradeoff - putting in a “guard branch” to check that the required id is actually present as an attribute slows down access for everyone to 22µs, with the option to default back to the jQuery scan. However, do we really want a library with the contract “give me the element with this id, unless there is something screwy about your document, in which case give me the element 1000 times slower”. Probably the user will not thank you in this case, and would prefer to be given the chance to fix their markup.

Quick access to DOM nodes

This brings us back to why any of these issues are particularly interesting in the first place. I am currently working on reforming the Fluid Reorderer component, not only from a structural but also a performance point of view. The behaviour of the jQuery droppable library is not quite what we would want, both from a performance as well as a behavioural point of view (might be the subject of a separate blog posting), but in fixing it we are faced with some crucial design decisions about how to arrange data structures such that they can be used for very quick access - ordinarily this is not so important, but a DnD library is going to be queried on every mouse move event, and so even timings in microseconds can begin to mount up.

There are two main options -

i) store a “reverse lookup” cache of jQuery.data id numbers onto DOM nodes, and work with data structures storing jQuery.data ids, or
ii) synthesise short unique ids onto DOM nodes that do not have them, and work with data structures storing DOM ids.

Route i) runs the risk of keeping these caches up to date, and in particular avoiding leaks due to “expired” DOM nodes continuing to hang around in the cache - but it would probably be faster at runtime. Route ii) will be a little slower (incurs the costs we have been talking about so far, in invoking a document.getElementById equivalent on going to the DOM) but is definitely leak-safe, and might also help others navigate around the DOM.

What to do

These are hard kinds of calls to make, and involve a consideration and weighting of different kinds of issues. Currently a reasonable tradeoff for fluid.jById seems to be to go towards document.getElementById, but to put in a guard branch with a hard failure if a “name” attribute is picked up by mistake. For “live” storage during drag-and-drop, the jQuery data id cache option appears preferable, since we will already be storing various other data on drop targets in addition to what can be discovered in the DOM.

jQuery.data is really pretty fast, I will leave you with a few more timings -

FF2 FF3
$(”[id=my-id]“) 12ms 5.1ms
bare document.getElementById() 8.5µs 3.2µs
document.getElementById() with guard 22µs 6.5µs
jQuery.data cache hit 2.9µs 1.9µs

2 Responses to “Finding things quickly”

  1. Bethany Storager Says:

    At the moment I use JQuery, for the reason as you say that is more resistant to problems, but I shall try a few of your recommendations, thanks for the tips

  2. electBlake Says:

    Great article!
    I never knew about the speed comparisons between the various methods.

    I’m not moving from JQuery to be honest, as in my experience (and possibly skill level), things being resistant to error is a MUST.

    A great read though, thanks!

Leave a Reply