Using XSL to arbitrarily order strings - lessons from Professor Layton

2010-07-02 on Development, Sorting, XSLT

At the Day Job, I usually work on a middleware component that contains a component that monitors the state of the system. A "health check", if you will. The component can be monitored automatically so notifications can be triggered on error conditions, or it can be used by a human. In the latter case, the user sees a list of tests performed on the system, sorted first by test outcome and then by test name. To help the user identify problems, any errors encountered are pushed to the top of the results page. Here's a sample:

Result	Test	Notes
Error	Nacelle Polarization	unpolarized
OK	Dilithium Crystals
OK	Jefferies tube
OK	Warp Coils

The actual report is an HTML page built from XML using an XSL transform - the main health check page queries various subcomponents that provide XML document sections. The sections are gathered and the XSLT sorts the results according to severity.

The XSLT sorts the entries alphabetically by result string, using this XSL:

<xsl:apply-templates select="//Operation">
    <xsl:sort order="ascending" select="Result" />
    <xsl:sort order="ascending" select="Test" />
</xsl:apply-templates>

Up 'til now, that worked great, but recently we had a need to add a third status - "Warning". The new report looked like this:

Result	Test	Notes
Error	Nacelle Polarization	unpolarized
OK	Dilithium Crystals
OK	Warp Coils
Warning	Jefferies tube	partly blocked

It would be better for Warning to be grouped between Error and OK. Unfortunately, it wasn't obvious how to do this. A few Google searches later, I'd found a post by Nick Fitzsimons that described his solution to the problem. After trying his approach, I was struck by a feeling of deja vu: I'd seen this, and recently, but where?

Professor Layton to the Rescue

Then it hit me. It's a classic puzzle. I'm sure it's appeared in many places, but I recently saw it in the game Professor Layton and the Diabolical Box.

The Fake Coins puzzle asks

There are 10 coins in each of the five bags below. One of these bags is filled with fake coins that are lighter than the real ones. A real coin weighs 10 units, but a false coin is one unit lighter. If you're using a scale that can register up to 200 units, what is the fewest number of times you could use the scale to find the one bag filled with fake coins?

I'm going to spoil the puzzle, so if you want to figure it out yourself, stop reading now.

The answer is "one". The interesting part is the approach: take 1 coin from bag 1, 2 coins from bag 2, and so on. Weigh them. There's a total of 15 coins, so if they were all genuine, the weight would be 150 units, but we know that each counterfeit coin is one unit less. So,

if bag 1 contains the fakes, the total weight will be 150 - 1 = 149
if bag 2 contains the fakes, the total weight will be 150 - 2 = 148
if bag 3 contains the fakes, the total weight will be 150 - 3 = 147
if bag 4 contains the fakes, the total weight will be 150 - 4 = 146
if bag 5 contains the fakes, the total weight will be 150 - 5 = 145

It's a nice trick - coins from each bag contribute either 10 or 9 units - the weight difference between a good and a bad coin is 1, so we magnify that constant difference by different amounts to produce a single value that identifies which group the fake(s) come from.

From coins to result severity

The puzzle's fun, but what's the connection with the string ordering? The XSLT sort function operates on a single sort key generated from the input nodes, kind of like the single value (the weight) generated from a set of coins in the puzzle.

It's still not clear how to generate a "weight" for the strings. Like in the coin puzzle, we want to sum up a series of values that are mostly the same, but that differ for a single result severity. We're helped by the fact that the number function converts Boolean true values to 1 and false to 0. If we compare each result severity in the source XML to "Error", "Warning", and "OK" in turn, exactly one of these will give a true (1) response, and the rest will be false (0).

So, like the coin puzzle, where all weights are the same except for the counterfeits, we have a situation where all comparisons give the same value except for the true one. If we treat the sorting groups—Error, Warning, and OK—like the bags of coins, we can see how to rank the results. Multiplying the 0s and 1s by a factor that gives the preferred sort order produces a sum that acts as the perfect sort key:

<xsl:apply-templates select="//Operation">
    <xsl:sort data-type="number" order="ascending"
        select="(number(Result='Error') * 1)
              + (number(Result='Warning') * 2)
              + (number(Result='OK') * 3)" />
    <xsl:sort order="ascending" select="Result" />
</xsl:apply-templates>

a result severity of Error maps to 1 × 1 + 0 × 2 + 0 × 3 = 1
a result severity of Warning maps to 0 × 1 + 1 × 2 + 0 × 3 = 2
a result severity of OK maps to 0 × 1 + 0 × 2 + 1 × 3 = 3

The select code is a little long, and not obvious when starting from an empty slate, but it has some nice features:

extending the sort for new result severities is straightforward - just add a term with the appropriate multiplier
if we introduce a new severity without adding it to the sort, it sorts to the top - probably the best possible default action
most importantly, it works. We now get a good health check result:

Result	Test	Notes
Error	Nacelle Polarization	unpolarized
Warning	Jefferies tube	partly blocked
OK	Dilithium Crystals
OK	Warp Coils