How many times have you had to wait forever for a system to
display a long list of things that you then had to scroll through
looking for the particular item you want? For example, the current
list of unpaid invoices. While displaying a long list a page at
a time, internet-search style, may solve the performance problem from
a system's perspective, it does not make it easier for a user to
locate the item they want. Below we look at some strategies for
better handling of large lists.
Lists that users or the system can extend, and large fixed-size
lists, can seriously affect the performance of a software system.
Loading the entries of a large list from disk or over a network can
be a time-consuming operation. Keeping the contents of large lists in
memory can increase the amount of memory required to run an
application or cause the operating system to swap pages of memory to
disk (again slowing performance). If the entries of a list are
objects more complex than simple character strings, the problem is
worse.
You might think that such performance problems would be
identified early in a system's development and appropriate
solutions found. However, most developers use small sets of data
for unit and integration testing. This means that a performance
problem with a large list might not be noticed until a formal
system performance testing stage. By then project deadlines may
mean there is not enough time to correct the problem. When a list's
contents can grow slowly over time or be extended by users of
the system, a performance problem may not be noticed until the
system has been installed and running for months.
Therefore, when it comes to working with lists, it is definitely
a case of thinking first before coding the simplest solution that
comes to mind. In this situation the simplest solution may actually
be too simple and affect the ability of the software to scale to
truly large sets of users and large projects. For software product
vendors, problems like this can seriously affect the ability to
sell a software product into large companies.
The following strategies can be used to provide better designs
for manipulating larger and growing lists of things. None of the
strategies are new or earth-shattering; the point is to consider
them when first designing the manipulation of a large list and pick
one or a combination.
List Strategy #1: Categorize list entries and present as a tree
structure
Place the entries of a large list into a set of mutually
exclusive categories. Instead of presenting the user with a single
long list of items, ask the user to select a category first and
then display only those entries within that category.
Notes:
- Most graphical user interface (GUI) toolkits provide a tree
control that can be used to do this concisely; categories are
parent nodes in the tree and the items are leaf nodes.
- Most GUI tree controls provide a means of showing and hiding
the children of a node. Only the entries under nodes that a user
'expands' need be loaded, which often reduces the amount of memory
used significantly.
- Enabling the user to redefine the categories and move items
from one category to another allows a user to organize the items to
better meet their own specific needs.
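The lazy loading described in the notes above can be sketched as follows. This is a minimal illustration rather than the API of any particular GUI toolkit: the CategoryNode class, the load_invoices callback and the invoice data are all hypothetical stand-ins for a real tree control and data source.

```python
from typing import Callable, Dict, List, Optional


class CategoryNode:
    """A tree node whose child items are loaded only when first expanded."""

    def __init__(self, name: str, loader: Callable[[str], List[str]]):
        self.name = name
        self._loader = loader                     # hypothetical fetch callback
        self._items: Optional[List[str]] = None   # None means "not loaded yet"

    @property
    def is_loaded(self) -> bool:
        return self._items is not None

    def expand(self) -> List[str]:
        """Return the items in this category, loading them on first use."""
        if self._items is None:
            self._items = self._loader(self.name)
        return self._items

    def collapse(self, release: bool = False) -> None:
        """Hide the children; optionally drop them to free memory."""
        if release:
            self._items = None


# Stand-in for a database or network call (assumed data, for illustration).
INVOICES: Dict[str, List[str]] = {
    "Overdue": ["INV-1001", "INV-1007"],
    "Due this month": ["INV-1010", "INV-1011", "INV-1012"],
    "Disputed": ["INV-0990"],
}
load_calls: List[str] = []   # records which categories were actually fetched


def load_invoices(category: str) -> List[str]:
    load_calls.append(category)
    return list(INVOICES[category])


tree = [CategoryNode(name, load_invoices) for name in INVOICES]
overdue = tree[0].expand()   # only the "Overdue" category is fetched
tree[0].expand()             # a second expand reuses the cached items
```

In this sketch the categories that the user never expands cost nothing beyond their names, which is where the memory saving comes from.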
Examples:
- The filesystems on many computers (e.g. Unix, Mac and Windows)
use directories or folders to present large numbers of individual
files within a tree structure.
- Many email clients use folders to organize a large number of
messages as a tree.
- TogetherSoft's Together ControlCenter presents its list of
automated design patterns as a tree control categorized by the
pattern's applicability or origin (e.g. J2EE patterns, Gang of Four
Design patterns, user interface patterns, Coad class archetypes,
etc.).
Advantages:
- This strategy usually offers a reasonably straightforward
refactoring by replacing a list control with a tree control.
However, it is definitely more efficient to decide to use a tree
first than to code a list and then replace it later.
- Provides an easy way for a user to locate a particular item if
the category of the item is known.
Disadvantages:
- Makes it much harder to find a particular item if the category
is not known.
- As more and more items are added, the categories need to be
reorganized and extra levels added.
- A tree structure only provides a single categorization scheme;
an item belongs to one and only one category. Extensions such as
links and shortcuts to other categories can be used to relieve this
constraint but at the cost of significant additional
complexity.
This strategy does not really solve the underlying problem of a
growing list; it only helps ease the pain for a while. Imagine a
system that listed all employees. In a rapidly growing company a
simple list might be sufficient for the first couple of years. When
the company has a few hundred employees a tree control might
provide a good enough mechanism for locating a particular employee.
However, for a multinational company neither a simple list nor a
tree is going to suffice.
List Strategy #2: Replace a simple list with a search
operation
Instead of presenting all the items in one long, alphabetically
sorted list, provide the user with the ability to search for a
small subset of items using a set of criteria.
Notes:
- In business systems it is important to select a useful set of
search criteria; talk to the user representatives and, if possible,
watch how they currently locate these items in their daily
work.
- If search criteria can be saved, a user can build up a set of
useful searches so that the criteria do not have to be remembered and
reentered each time.
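One possible shape for saved search criteria is sketched below. The Customer and Criteria fields (name prefix, city, minimum balance) are assumptions chosen for illustration; in a real system the criteria would come from talking to the user representatives.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Customer:
    name: str
    city: str
    balance: float


@dataclass(frozen=True)
class Criteria:
    """Search criteria; a field left as None is simply not applied."""
    name_prefix: Optional[str] = None
    city: Optional[str] = None
    min_balance: Optional[float] = None

    def matches(self, c: Customer) -> bool:
        if (self.name_prefix is not None
                and not c.name.lower().startswith(self.name_prefix.lower())):
            return False
        if self.city is not None and c.city != self.city:
            return False
        if self.min_balance is not None and c.balance < self.min_balance:
            return False
        return True


class SavedSearches:
    """Keeps named criteria so a user does not have to re-enter them."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Criteria] = {}

    def save(self, name: str, criteria: Criteria) -> None:
        self._by_name[name] = criteria

    def run(self, name: str, customers: List[Customer]) -> List[Customer]:
        criteria = self._by_name[name]
        return [c for c in customers if criteria.matches(c)]


customers = [
    Customer("Adams", "Leeds", 1200.0),
    Customer("Baker", "York", 50.0),
    Customer("Allen", "Leeds", 300.0),
]
searches = SavedSearches()
searches.save("big Leeds accounts", Criteria(city="Leeds", min_balance=500.0))
result = searches.run("big Leeds accounts", customers)
```

Because only the criteria are saved and not the results, rerunning a saved search picks up any customers added since it was defined.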
Examples:
- Filesystems on many computers (e.g. Unix, Mac and Windows)
provide a file search capability in addition to the hierarchical
categorization of directories or folders.
- Many business systems provide a search facility for locating
customer details.
- Many e-commerce sites provide a search facility for quickly
locating a particular product.
Advantages:
- Users can use multiple criteria to try to locate a particular
item.
- When backed by indexing techniques a search can be a much
faster way to locate a particular item.
- Searching is one of the most established branches of computer
science so an efficient algorithm is usually readily
available.
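As a rough illustration of a search backed by an index, the sketch below builds a simple inverted index (word to item identifiers) once and then answers queries by intersecting sets instead of scanning every item. The product data and the build_index and search functions are hypothetical; a production system would more likely rely on a database index or a search library.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(products: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lower-cased word to the ids of the products containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for product_id, description in products.items():
        for word in description.lower().split():
            index[word].add(product_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> List[int]:
    """Return the ids of products that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


# Assumed sample data, for illustration only.
products = {
    1: "red cotton shirt",
    2: "blue cotton shirt",
    3: "red wool scarf",
}
index = build_index(products)
red_shirts = search(index, "red shirt")
cotton = search(index, "cotton")
```

The cost of building the index is paid once, when items are added, rather than on every search.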
Disadvantages:
- Poor selection of search criteria can still result in a large
list of items being presented to the user or the particular item
not being found.
- Poor implementation can make searching a very expensive and
time-consuming operation.
List Strategy #3: Enable a user to define, name, save and load
subsets
Enable a user to specify a subset of the whole list, give that
subset a name, save it and then load it as and when desired.
Notes:
- In a distributed system, designers need to choose between
storing the subsets on the client or on the server. Storing on the
server usually requires more work but subsets can be shared between
users.
- Storing on the client often means faster retrieval but useful
subsets cannot be shared between users as easily.
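A minimal sketch of the client-side option follows, assuming subsets are stored as lists of item identifiers in a single JSON file. The SubsetStore class, the file name and the identifiers are hypothetical; a server-side variant would replace the file with a shared store.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Iterable, List


class SubsetStore:
    """Saves and loads named subsets of item ids in a local JSON file."""

    def __init__(self, path: Path):
        self._path = path

    def _read_all(self) -> Dict[str, List[str]]:
        if not self._path.exists():
            return {}
        return json.loads(self._path.read_text(encoding="utf-8"))

    def save(self, name: str, item_ids: Iterable[str]) -> None:
        subsets = self._read_all()
        subsets[name] = list(item_ids)
        self._path.write_text(json.dumps(subsets, indent=2), encoding="utf-8")

    def load(self, name: str) -> List[str]:
        return self._read_all()[name]

    def names(self) -> List[str]:
        return sorted(self._read_all())


# A temporary directory keeps the example self-contained.
with tempfile.TemporaryDirectory() as tmp:
    store = SubsetStore(Path(tmp) / "subsets.json")
    store.save("key accounts", ["ACC-17", "ACC-42"])
    store.save("release tasks", ["build", "test", "package"])
    loaded = store.load("key accounts")
    names = store.names()
```

Only identifiers are stored here, so the full items still have to be fetched when a subset is loaded, but only for the handful of entries the user asked for.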
Examples:
- The list of possible stereotypes in UML is growing daily. For a
tool vendor to present in a list all the possible stereotypes an
element can take is rapidly becoming impractical. UML Profiles
provide a mechanism where a subset of stereotypes can be named,
saved and loaded so that a user only loads the subset of
stereotypes relevant to the task on which they are working.
- A bank manager might save a subset of his most important
account holders.
- A planner might save a subset of possible tasks representing a
particular process.
Advantages:
- Users only load the items they need to work with, reducing the
number of items that need to be loaded from disk or across a
network into memory.
- Unlike saving a set of search criteria, there is no potentially
expensive operation to be performed before the set of items is
presented to the user.
- If there are many lists to which this strategy can be applied,
then these sets can themselves be collected together into named
themes or profiles or project templates.
Disadvantages:
- A newly added item may not be noticed by someone working with
statically defined subsets.
Combining Strategies
We have already mentioned that computer filesystems tend to
combine a tree structure and a search facility to help users locate
files. Other combinations can be very useful too. A search could
return its results in tree form, helping the user learn the
categories used. The results of a search could be used to form a
named subset of items. Larger named subsets could be presented as a
small tree structure. And so on.
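As one example of such a combination, the sketch below returns search results grouped by category, so that they could be shown in a tree. The file list and the search_as_tree function are hypothetical and use a simple substring match purely for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed sample data: (category, item name) pairs.
FILES: List[Tuple[str, str]] = [
    ("reports", "invoice_summary.txt"),
    ("archive", "invoice_2001.txt"),
    ("reports", "staff_list.txt"),
    ("archive", "old_invoice_notes.txt"),
]


def search_as_tree(items: List[Tuple[str, str]], term: str) -> Dict[str, List[str]]:
    """Return matching item names grouped under their category."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for category, name in items:
        if term.lower() in name.lower():
            grouped[category].append(name)
    # Sort categories and names so the tree is presented consistently.
    return {category: sorted(names) for category, names in sorted(grouped.items())}


invoice_tree = search_as_tree(FILES, "invoice")
```

Categories with no matching items are left out of the result, which keeps the resulting tree small.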
Summary
Working with a list? Think about how large it is or could
become. If it could grow to hundreds or thousands of entries,
consider each of the above strategies (preferably with a user
representative). Pick one or a combination and provide your system
with the ability to scale well to use by large organizations.
This article is an update of an article first published as
CoadLetter #88.