Complete Traversals and their Implementation Using the Standard Template Library

A complete traversal of a container S (such as a set) is defined by the iteration scheme "for all x ∈ S: F(x, S)", where F is a function that might possibly modify S by inserting new elements. We assume that the order in which the elements are treated is not relevant, as long as the iteration continues until F has been applied to all elements currently in S, including those F has inserted. Standard iteration mechanisms, such as the iterators provided in the C++ Standard Template Library (STL), do not directly support complete traversals. In this paper we present two approaches to complete traversals, both extending the STL framework, one by means of generic algorithms and the other by means of a container adaptor.


Introduction
Consider the following problem: A manager wants to arrange a meeting of a certain set of people in her company. For each person in the original set she also wants to invite that person's boss, that boss's boss, and so on. (She has a database from which she can tell who a person's boss is.) The manager can solve this problem fairly simply by writing down the initial set of people's names in a list and iterating through the list from beginning to end, inserting new persons at the end where they become part of the iteration. To avoid duplicating names on the list, she should append a person's name to the list if and only if the name is not already present. In the computerized version of this problem, with a large list, an inefficient linear search is required, rather than the binary search that would be possible if the set of names could be kept in, say, alphabetic order. In that case, however, new names should be inserted in their proper place to maintain the order. But this in turn makes it difficult to tell when the iteration should stop, since names might have been inserted before the current iteration point. The kind of iteration required to gracefully solve this problem is called a complete traversal; we give a formal definition in Section 2.
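The list-based procedure just described can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the function name invite_with_bosses and the parameter boss_of are hypothetical, and the boss database is assumed to be a std::map from a person's name to the name of that person's boss.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <map>
#include <string>

// Hypothetical helper (not from the paper): extends `invitees` with every
// boss reachable through `boss_of`, an assumed name -> boss lookup table.
inline void invite_with_bosses(std::list<std::string>& invitees,
                               const std::map<std::string, std::string>& boss_of) {
    // push_back on a std::list never invalidates iterators, and end() is
    // re-evaluated on every pass, so appended names are visited later.
    for (std::list<std::string>::iterator i = invitees.begin();
         i != invitees.end(); ++i) {
        std::map<std::string, std::string>::const_iterator b = boss_of.find(*i);
        if (b == boss_of.end())
            continue;  // no boss recorded for this person
        // Linear search for duplicates: the inefficiency noted in the text.
        if (std::find(invitees.begin(), invitees.end(), b->second) == invitees.end())
            invitees.push_back(b->second);
    }
}
```

Because new names are only ever appended, the loop terminates exactly when every name on the list has been processed; the price is the linear duplicate check on each insertion.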
Problems requiring complete traversals are fairly common (we give another example in Section 2), and while there are various ad hoc ways of solving them, programmers should ideally have at their command an efficient packaged solution. In this paper we describe two such approaches to complete traversal, both of which fit into the framework defined by the Standard Template Library, STL (part of the ANSI/ISO draft standard for C++ [2]). STL [8,5,1] provides a set of easily configurable software components of six major kinds: generic algorithms, containers, iterators, function objects, adaptors, and allocators. In each of these component categories, STL provides a relatively small set of fundamental components; it is through uniformity of interfaces and orthogonality of component structure that STL provides functionality far beyond the actual number of components included. But STL is not intended as a closed system; its structure is designed with extension in mind. The complete traversal components described in this paper may be of interest not only for the functionality they provide, but also as examples of, and measures of, how well the existing STL components support extensions.
In Section 3, we give two distinct ways of solving the complete traversal problem: a generic algorithms approach and a container adaptor approach. In both approaches, the complete traversal components are designed to work with the category of STL components called associative containers, which support fast retrieval of objects based on keys. The generic algorithms are restricted to sorted associative containers, in which keys are maintained according to a given ordering function, but the container adaptor we provide can also be used with hashed associative containers, which give up order properties in favor of faster retrieval. Hashed associative containers are not part of the draft C++ standard but are now provided as an STL extension by at least one compiler vendor [1]. All of the associative containers have essentially the same interface; e.g., each provides insert and erase member functions for inserting and deleting objects, several kinds of search member functions, and several kinds of iterators for traversing through the current contents. None, however, provides for complete traversals in the sense discussed here. The specifics of these interfaces, and how our components are used for complete traversals, are illustrated at the end of Section 3, in terms of solving the manager's problem stated at the beginning of the paper.
We give more than one approach to the complete traversal problem because no single solution seems best in all cases. The presentation in Section 3 includes complexity analyses and discussion of other factors such as naturalness of interfaces. We are exploring still other approaches, which are discussed briefly in the concluding section.

Complete Traversals
We begin this section by giving a precise definition of complete traversals. We then describe another example application of the concept.
The definition is dependent on whether the given container S is unique (i.e., repeated elements are not allowed, as in an STL set or map) or multiple (i.e., repeated elements are allowed, as in an STL multiset or multimap). In the unique case, for a given function F, let $I(x, S)$ be the set of elements inserted in S by F(x, S).

S.
In case that S is unique, it follows from the definition that the traversal is finite if and only if there is some k such that $S_{k+1} \subseteq \bigcup_{i=0}^{k} S_i$. In the multiple case, we have the stronger condition that $S_k$ is empty for all sufficiently large k. In what follows we assume traversals are finite. Although a finite complete traversal could be computed by actually constructing the sets or multisets $S_i$, we seek solutions that are more space and time efficient.

As described in Section 1, the manager's invitation list is one problem that might be solved by instantiating a complete traversal scheme. We show in detail such a solution using our complete traversal components in Section 3.4.

As another example, the one which motivated this study of complete traversals, suppose we are given a specification of types written in a certain formalism CTS.¹ CTS types are classified into atomic types (which have no structure), concrete types (which are used to define data structures), and abstract types (which are used to define classes). The representations associated with classes are, in turn, concrete CTS types. With each CTS expression we can associate a syntax graph which captures the relationships among the involved types. We also assume that each CTS expression defining a type is associated with a name. Given a map S of elements (n, g), where n is the name of an abstract CTS type and g is the syntax graph associated with its representation, we want to iterate over S in a way that, at each iteration, we take an element (n, g) and traverse g in a certain order, making sure that we insert into S elements (n', g') for all type names n' we come across, where g' is the syntax graph associated with n'. Iteration stops when all elements in S have been processed. This instance of the problem is summarized in Fig. 1, where S is shown with its initial value.

¹ CTS stands for Common Type System. The application problem and the definition of CTS were taken from [7].
S = { (n, g) | n is the name of an abstract CTS type and g is the syntax graph of its representation }
F(x, S) = traverse x.g and insert in S any type name found with its corresponding syntax graph

In this section we present two different approaches to complete traversals, one using generic algorithms and the other a container adaptor. We also compare the complexity of these components, and show how they can be used in a simple application.

Generic Algorithms
A generic algorithm is an algorithm designed to work with a variety of data structures, the specialization to a particular structure being realized by the programming language processor (compiler, interpreter, or run-time system) rather than by manual editing of the source text.
This function can be applied to any of the STL containers, because they all provide iterator types (Container::iterator) and member functions begin and end that return iterators defining the range of positions of elements currently within the container. To use for_each with a list of integers, for example, one could write

    list<int> list1;
    // ... code to insert some elements in list1
    for_each(list1, f);

where f is some function object that does not modify the list.
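A container-level for_each of the kind referred to here might look like the following sketch. The definition used in the paper is not reproduced in this text, so the name for_each_element, its signature, and the Summer function object are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <list>

// Assumed shape of a container-level for_each: applies f to every element
// between c.begin() and c.end(), and returns the function object.
template <class Container, class Function>
Function for_each_element(Container& c, Function f) {
    for (typename Container::iterator i = c.begin(); i != c.end(); ++i)
        f(*i);
    return f;
}

// Hypothetical function object that does not modify the container:
// it only accumulates the sum of the elements it is shown.
struct Summer {
    int total;
    Summer() : total(0) {}
    void operator()(int x) { total += x; }
};
```

The only requirements placed on Container are the nested iterator type and the begin and end member functions, which is why one definition serves every STL container.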
The main algorithmic idea behind our first approach is to set up an iteration through the container with the ordinary iterators provided, applying a function f that may generate new elements for insertion into the container. But instead of allowing f to do the insertions, we require it to enter the new elements in a queue. After each call of f, we take elements from the queue and insert them into the container, checking whether they are inserted before or after the current iteration position. If an element is inserted after the current position, it will be taken care of in the remaining iterations, but if it is inserted before the current position, we apply f to it immediately (which in turn may generate new elements and add them to the queue).

STL defines several different categories of iterators according to the set of operations they support, the most powerful being random access iterators. While random access [...]

    // v has been inserted in container (it wasn't already there)
    // and it occurs before the current traversal position, i, so process it now with f:

For the multiple container case, the generic algorithm complete_multiple_traversal is defined similarly,² but the implementation is complicated by the fact that, when an element taken from the queue is equivalent to the one at the current position, deciding whether it was inserted before or after that position is no longer simply a matter of comparing the two elements. The draft C++ standard specification of the insert operation on multiple containers does not say where within a range of equivalent elements a new element is placed, so we must conduct a linear search within the range. Using the original Hewlett-Packard implementation of STL, we could omit the linear search since an element is always inserted at the end of the range of elements equivalent to it, but we cannot assume this to be the case in all implementations since the standard does not require it. This situation is a simple illustration of the tension faced by the library (or language) specifier, between the goal of allowing implementors as much freedom as possible by leaving some details unspecified, and the goal of enabling programmers to optimize their code while retaining portability.
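The queue-based idea for unique containers can be illustrated with the sketch below. It is not the authors' code: the name complete_unique_traversal_sketch, the convention that f receives the queue as a second argument, and the QueueBoss function object are all assumptions introduced for this example. The sketch relies on two standard properties of sorted unique containers such as std::set: insert returns a pair whose second member says whether the element was new, and insertion does not invalidate existing iterators.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>

// Sketch only. f(x, pending) must not touch the container; it pushes any
// elements it generates onto the queue `pending`.
template <class Container, class Function>
void complete_unique_traversal_sketch(Container& c, Function f) {
    typedef typename Container::value_type value_type;
    std::queue<value_type> pending;
    for (typename Container::iterator i = c.begin(); i != c.end(); ++i) {
        f(*i, pending);
        while (!pending.empty()) {
            value_type v = pending.front();
            pending.pop();
            bool inserted = c.insert(v).second;  // false if v was already there
            // If v is new and sorts before the current position i, the
            // ordinary iteration would never reach it, so process it now.
            if (inserted && c.value_comp()(v, *i))
                f(v, pending);
        }
    }
}

// Hypothetical function object for the manager's problem: queues the boss
// of `name`, if one is recorded in the assumed lookup table.
struct QueueBoss {
    const std::map<std::string, std::string>* boss_of;
    explicit QueueBoss(const std::map<std::string, std::string>& b) : boss_of(&b) {}
    void operator()(const std::string& name, std::queue<std::string>& pending) const {
        std::map<std::string, std::string>::const_iterator b = boss_of->find(name);
        if (b != boss_of->end())
            pending.push(b->second);
    }
};
```

Elements that land after the current position need no special handling, since the enclosing for loop reaches them in due course.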

A Container Adaptor
The implementation based on generic algorithms, shown in Section 3.1, requires function f (implemented by the programmer) to put the elements generated in each activation in a queue, which is less natural than having it insert the elements directly into the container. In order to relax this requirement, we propose another approach based on a container adaptor, whose usage for implementing complete traversals is depicted in Fig. 2. From a given container, the programmer builds a complete_container, which provides the following:
• A constructor that takes a Container as a parameter and stores a reference to its argument, and also creates an iteration list (see below).
• Types iterator and const_iterator, which implement complete traversal iterators on nonconstant or constant containers, respectively.
• Member function size, which returns the size of the Container.
• Member function insert, which takes a value_type value and inserts it into the Container.
The representation of a complete_container consists of a container reference and an iteration list implemented as an STL list<value_type>. The implementation maintains the following invariant: at each iteration with an iterator, the element associated with the iterator on the iteration list is the one to be processed, the elements to its right are those that will be processed, and the elements to its left are those which were already processed. This invariant implies that, initially, the iteration list should be a copy of the Container, and that the insert member function should insert the element into the container and onto the end of the iteration list.
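The representation and invariant described above can be illustrated with a much-simplified sketch. Everything named here (complete_container_sketch, insert_boss_sketch, the members container_ and todo_) is hypothetical; the sketch assumes a unique associative container whose insert returns a pair, reuses the iteration list's own iterators in place of dedicated complete traversal iterator types, and omits const_iterator.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <set>
#include <string>

// Simplified adaptor sketch: a reference to the container plus an
// iteration list that starts out as a copy of the container's elements.
template <class Container>
class complete_container_sketch {
public:
    typedef typename Container::value_type value_type;
    typedef typename std::list<value_type>::iterator iterator;

    explicit complete_container_sketch(Container& c)
        : container_(c), todo_(c.begin(), c.end()) {}

    iterator begin() { return todo_.begin(); }
    iterator end() { return todo_.end(); }
    typename Container::size_type size() const { return container_.size(); }

    // Insert into the container and, if the element is new, append it to
    // the iteration list so that a traversal in progress will reach it.
    void insert(const value_type& v) {
        if (container_.insert(v).second)
            todo_.push_back(v);
    }

private:
    Container& container_;
    std::list<value_type> todo_;
};

// Hypothetical function applied to each person: inserts that person's
// boss directly through the adaptor, using an assumed lookup table.
inline void insert_boss_sketch(const std::string& name,
                               const std::map<std::string, std::string>& boss_of,
                               complete_container_sketch<std::set<std::string> >& cc) {
    std::map<std::string, std::string>::const_iterator b = boss_of.find(name);
    if (b != boss_of.end())
        cc.insert(b->second);
}
```

Appending to a std::list keeps existing list iterators valid, so the traversal loop can compare against end() on every step and naturally picks up elements added while it runs. The cost is the extra copy of every element held in the iteration list.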
To compare the time complexity of our two approaches, let c be a sorted associative container, let n be the number of elements initially in c, and let m be the total number of insertions done by f. There might be fewer than $N = n + m$ elements in the final container, since some of the elements inserted might already have been in c. But N is a bound on the final size of c, so $O(\log N)$ bounds the time for any one insertion, and $O(m \log N)$ bounds the time for all insertions. Let $T(f, j, k)$ be a bound on the total amount of time for j evaluations of f on a container of maximum size k, where we exclude (because we have already counted it) any time f spends doing insertions. So the total time for evaluating f is $T(f, N, N)$.
1. In complete_unique_traversal, the time for all of the queue processing is $O(m)$, so the total time is

    queue processing time + insertion time + function evaluation time = $O(m) + O(m \log N) + T(f, N, N)$.

The extra linear searches required by complete_multiple_traversal add to these times an extra $O(mN)$ term in the worst case, but in practice the extra time is likely to be negligible.

2. For the complete_container adaptor, the time for all of the list processing is $O(N)$, so the total time is

    list processing time + insertion time + function evaluation time = $O(N) + O(m \log N) + T(f, N, N)$.

Since $T(f, N, N)$ is $\Omega(N)$, the bound in 1 cannot be asymptotically better than the bound in 2.
It is clear, however, that the list processing time associated with complete containers is more than the queue processing time associated with the complete traversal algorithms. It is also clear that the complete traversal algorithms require less extra space than the complete containers. On the other hand, complete containers offer a more natural interface and can be used with hashed containers, while the same cannot be said of our complete traversal algorithms. In summary, these two approaches offer a good spectrum of possibilities to tackle the complete traversal of containers.

A solution to the manager's problem using the generic algorithm complete_unique_traversal of Section 3.1 includes the following code:

    get_names(ifsi, invitees);
    cout << "Original set of invitees:" << endl;
    for (name_set::iterator i = invitees.begin(); i != invitees.end(); ++i)
        cout << *i << endl;
    cout << "Output during complete traversal:" << endl;
    name_function get_boss(bosses);
    complete_unique_traversal(invitees, get_boss);
    cout << "Final set of invitees:" << endl;
    for (name_set::iterator i = invitees.begin(); i != invitees.end(); ++i)
        cout << *i << endl;

A solution using the complete_container adaptor of Section 3.2 needs a different definition of the function to be applied to each person, one that does the insertions. Calling this function insert_boss, the adaptor could be used as

    typedef complete_container<name_set> cc_type;
    cc_type cc(invitees);
    for (cc_type::iterator k = cc.begin(); k != cc.end(); ++k)
        insert_boss(*k, cc);

Related Work
[3] is one of the earliest contributions offering language support for defining iterators as operations on programmer-defined container types. Since the programmer has total control over how iteration is defined, supporting complete traversal would be possible, perhaps by adapting one of the approaches discussed here. In [3], the authors mention the potential usefulness of such iterators but develop neither a formal definition nor any examples.
More recently, the work reported in [4] on list iterators in C++ covers issues associated with iterator integrity; i.e., problems which may arise when the object to which an iterator is pointing is deleted. Even though this work does not deal with complete traversals, the iterator integrity problem would come into play if we tried to do complete traversals on STL sequence containers (e.g., vectors or deques), because insertion in vectors and deques might require memory reallocation which invalidates all iterators pointing to the container in question. Except for the case of such iterator invalidation, complete traversals of STL sequence containers can be trivially programmed.
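One way to see why the sequence-container case is simple, and how the invalidation issue can be sidestepped, is to traverse a vector by index rather than by iterator. The sketch below swaps in that index-based technique, which the text itself does not spell out; the names complete_vector_traversal and AppendHalf are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index-based complete traversal of a vector. f(x, v) may append to v.
template <class T, class Function>
void complete_vector_traversal(std::vector<T>& v, Function f) {
    // v.size() is re-read on every pass, so appended elements are visited.
    for (std::size_t i = 0; i < v.size(); ++i) {
        T x = v[i];  // copy first: push_back inside f may reallocate storage
        f(x, v);
    }
}

// Hypothetical example function: appends x / 2 while x is greater than 1.
struct AppendHalf {
    void operator()(int x, std::vector<int>& v) const {
        if (x > 1)
            v.push_back(x / 2);
    }
};
```

Indices remain meaningful across reallocation, whereas an iterator or reference obtained before a push_back may not, which is why the element is copied before f is called.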
Another recent related work is the Java Generic Library (JGL) [6], which is strongly based on the STL design. For instance, JGL supports the concept of containers and iterators. However, it does not appear that complete traversals are directly supported.

A complete traversal of a container S applies a function F to every element of S, where F may modify S by inserting new elements into it. The iteration stops when all elements currently in S have been processed.
In order to offer packaged solutions to programmers who need to use this class of iteration schemes, we have presented two approaches to perform complete traversals implemented on the platform provided by STL. Our first approach is in terms of generic algorithms that require that the iterated function create a queue to hold the generated elements. One algorithm handles containers with unique keys, while the other handles containers with multiple keys. By taking precautions in the case of multiple keys, we remain independent of the details of particular implementations of sorted associative containers.
Our second approach is based on a complete container adaptor. The main features of this adaptor are the special iterators and insert operation it provides, by which the programmer can obtain complete traversals using a function that directly inserts new elements in the container.
The time complexities of both our approaches are asymptotically equivalent. However, the approach based on generic algorithms stores just the elements which are generated at each iteration in the function's queue, while the container adaptor stores all container elements in its list. On the other hand, our complete container can be used with any STL associative container (including existing extensions such as hashed containers and any future extensions meeting the requirements of associative containers), while the generic algorithms can be used with sorted associative containers only.
There are still other approaches we are exploring, such as the result of melding the complete container idea with the approach of keeping just the generated elements.
We are also interested in trying iteration schemes which do deletions as well as insertions. In this direction we have conjectured the nonexistence of functions F which do insertions and deletions and are such that the order of traversal is irrelevant. The connection between complete traversals and iterator integrity is also on our list of future work. We are also exploring the relationship between complete traversals and what we call iterator trajectory functions; i.e., function objects which describe a specific way of traversing a set. Lastly, we plan to use the components presented in this paper to solve real-life applications, like the CTS application mentioned in Section 2, and to measure the performance of both approaches with randomly-generated containers.

Figure 1: An instance of the complete traversal problem

Figure 2: Implementing complete traversals by using a container adaptor approach

Another classification of associative containers is unique, in which objects in a container cannot have equivalent keys, versus multiple, in which they can. Still another classification is simple containers, in which only the keys are stored, versus pair containers, in which pairs of keys and associated values are kept. The sorted associative containers provided in STL are shown in the following table: