Use and Reuse

VERSION 2 Published

Created on:Sep 27, 2007 10:33 AM by curl - Last Modified:  Sep 27, 2007 4:34 PM by curl

An hour in the life of a curl design reviewer

Consider the task of filling a tree control from a data set of entries with categories (like orders by status, date and customer). Of course, you’d like to be a one-liner, with the underlying components responsible for dealing with whatever adapters and bindings are needed.

But let’s assume your framework doesn’t do that (yet), and you need to do it yourself, now. The needs of any substantial application will extend beyond what’s built in to its substrate.

This came up in a recent training project design review. Its a canonical case, and worthy of some attention. The tree population processing is straightforward, but there are some design decisions to be made, which affect efficiency and reusability. The choices embody tradeoffs between present and future.

An agile approach (or any deadline driven one, for that matter) might start with “the simplest thing that could possibly work”, with nested selection loops over the dataset.


    let n:TreeNode = {node-for "Index"}
    {for c in customers do
        let nc:TreeNode = {node-for c}
        {n.append nc}
        {for s in status-codes do
            let ns:TreeNode = {node-for s}
            {nc.append ns}
            let records:{Array-of Record} =
                {orders.select filter =
                    {RecordData
                        CustomerName = c,
                        StatusCode = s}}
            {for r in records do
                let no:TreeNode = {node-for r["OrderID"]}
                {ns.append no}
            }
        }
    }


What about reuse?

This idiom is applicable for any three-deep hierarchy, and could generalize to any depth. Its relatively concise, and readable. So the most popular form of reuse (copy/paste) will get the next few category trees built, and the project milestone passed.

But that does not feel beautiful, and it incurs some TechnicalDebt for addressing the inevitable changing requirements (different categories, different data, different hierarchy display affordances, more data requirements). For example, a likely request is to suppress empty nodes. Any such change necessitates code changes everywhere the idiom is used.

Martin Fowler nicely illuminates this tradeoffs in his DesignStaminaHypothesis article. He’s talking about large scale projects, but the debt accumulates one line of code at a time.

What about efficiency?

The efficiency depends on that of the select operation. With a well indexed underlying database it will be fine. With linear search through a sequence, it will be ON^2 and not scale well for large data volumes. That probably does not apply now, and may never apply. After all, it is widely quoted that premature optimization is the root of all evil.

The “DontRepeatYourself” principle should alert us to the opportunity for abstraction here. The repetition can be avoided by factoring specific assumptions (like category values and field names) out into parameters.


    let n:TreeNode =
        {order-tree-by-status
            orders, customers, status-codes,
            order-key = "OrderID",
            customer-key = "CustomerName",
            status-key = "StatusCode",
            root-label = "Index"}


This simple change has great benefit. The usage is declarative: the parameters specify what is to be done, not how to do it. This decouples the implementation from its contract.


{define-proc public {order-tree-by-status
                        rs:RecordSet,
                        customers:StringArray,
                        status-codes:StringArray,
                        customer-key:String = "CustomerName",
                        order-key:String = "OrderID",
                        status-key:String = "StatusCode",
                        root-label:String = "Index"
                    }:TreeNode


This naturally focuses attention on that contract, which suggests possibilities for further generalization.

First, to arbitrary categories …


{define-proc public {simple-category-tree
                        rs:RecordSet,
                        primary-vals:StringArray,
                        secondary-vals:StringArray,
                        key:String = "OrderID",
                        primary-key:String = "CustomerName",
                        secondary-key:String = "StatusCode",
                        root-label:String = "Index"
                    }:TreeNode

… and then, to arbitrary depth …

{define-proc {category-tree
                 rs:RecordSet,
                 category-keys:StringArray,
                 root-label:String = "Index"
             }:TreeNode

… yielding an even more concise, declarative form.

        {category-tree orders,
            root-label = "Orders",
            {StringArray
                "OrderStatus",
                "CustomerName",
                "OrderDate",
                "OrderID"}}


Abstraction can be a slippery slope. Dick Gabriel, an early proponent of the pattern languages movement, notes in Patterns of Software that

When taken to extremes, abstraction can decrease habitability, and can result in premature compression. Beware of over abstracting, or abstracting where a common pattern will do.

Of course, the depth generalization would necessitate a more substantial change to the implementation. Fortunately, the implementation has already been decoupled.

Handling variable depth stretches the nested loops idiom, and suggests rethinking the algorithm, which may already have raised efficiency concerns.

The classic wisdom for search and select operations is to rely on hashing or sorting, as described in Tim Bray’s contribution to Beautiful Code.

So here’s a solution using hashing, to derive values while processing source records. Note that The fact that the references to each category must be present in the data records allows the enumeration of category values to be inferred.


{define-proc {category-tree
                 rs:RecordSet,
                 categories:StringArray,
                 root-label:String = "Index"
             }:TreeNode
    let root:TreeNode = {TreeNode node-data = root-label}
    || index of created nodes
    let nodes:{HashTable-of String, TreeNode} =
        {{HashTable-of String, TreeNode}}
    || process records
    {for r in rs do
        || current node, and key
        let p:TreeNode = root
        let k:String = ""
        || process categories
        {for category in categories do
            || data for category
            let v:any = r[category]
            || concatenate data for unique key
            set k = {format "%s`%s", k, v}
            || check if subnode already exists
            let (n:TreeNode, n?:bool) = {nodes.get-if-exists k}
            || descend to subnode
            set p =
                {if n? then n
                 else
                    || new subnode: create, attach, index
                    let n:TreeNode = {TreeNode node-data = v}
                    {p.append n}
                    set nodes[k] = n
                    n}
        }}
    {return n}
}


You’ll find some further refactoring opportunities, of course, in this example. (See if you can spot them!)

But that’s the way living software evolves. Get used to it!

Not to mention the importance of using objects, and the MVC pattern, and …

But that will be another design review.
Average User Rating
(0 ratings)




There are no comments on this document