Calibration, verification and stepwise analysis for numerical phene

Abstract Calibration and verification techniques are discussed in the context of numerical phenetic analysis. Calibration is introduced to evaluate the character set, decide on the type of phenetic algorithm to be used, and determine the level at which to recognize taxonomic entities. Clusters are verified by analyzing sub-samples of specimens. This determines whether the groups obtained are dependent on the variation represented by particular specimens or on variation between taxa to which the specimens belong. A stepwise procedure was used to improve resolution on the ordination axes and thus to visualize differences between phenetically similar taxa. The application of these techniques in Olinia Thunb. supports the recognition of six clearly defined clusters which correspond to O. emarginata Burtt Davy, O. micrantha Decne, O. ventosa (L.) Cufod., O. capensis (Jacq.) Klotzsch, O. radiata Hofmeyr & Phill. and O. vanguerioides Bak. The analyses further revealed one highly variable group, referred to as the O. rochetiana complex, which includes O. aequipetala (Del.) Cufod, O. usambarensis Gilg, O. volkensii Engl., O. macrophylla Gilg, O. ruandensis Gilg, O. discolor Mildbraed and O. huillensis Welw. ex A.R. Fernandes.


Introduction
Numerical phenetic methods of analysis continue to be used in studies on patterns of population variation and species delimitation (Balfour & Linder 1990; Vincent & Wilson 1997; Chandler & Crisp 1998; Hodalova & Marhold 1998; Leht & Paal 1998; Naczi et al. 1998; Van den Berg et al. 1998; Verboom & Linder 1998; Barker 1999; Casas et al. 1999; Ortiz et al. 1999; Van de Wouw et al. 2003; Cron et al. 2007; Spooner et al. 2007). A range of tests and measures, such as ANOVA, Manhattan distance, correlation coefficients, Mahalanobis distance and the Mantel t-test, are available for use as a basis to delimit taxa or groups of taxa. However, there are very few morphometric studies, particularly among those employing Cluster Analysis (Barker 1990; Hodalova & Marhold 1998; Ortiz et al. 1999; Wilkin 1999), in which the groups obtained are subjected to any form of verification with respect to the number and composition of groups of specimens (OTU's).
Methods which can determine cluster homogeneity and verify the consistency of clusters are necessary for the interpretation of variation in groups that have not been extensively studied. Leht and Paal (1998) used what they call a coefficient of indistinctness (CI) to test for the distinctness of clusters in their analyses of variation in Potentilla Sect. Aureae. In Cluster Analysis, the groups obtained are often defined using the levels of dissimilarity (Clifford & Williams 1973), optimal splitting levels (McNeill 1984), chosen levels (Sneath 1988) and the concept of a phenon line (Williams 1971; McNeill 1984; Gower 1988; Sneath 1988; Barker 1990), in which arbitrary levels of similarity are used to delimit groups of particular taxonomic rank. Although ordination gives a useful representation of OTU's in multiple dimensions, the groups and/or phenons are often circumscribed by eye, a step that is regarded as unacceptably subjective (Sneath 1976).
The lack of predictability of where to place the line without prior knowledge of the taxonomy of the OTU's, coupled with the observation that cluster size tends to affect the placement of the line (Clifford & Williams 1973), has led to criticism of the usefulness of the phenon line concept in biological studies (Duncan & Baum 1981). It has been shown that clusters can be easily delimited without the use of phenon lines (Hill 1980). An alternative approach to the utility and placement of the phenon line in cluster analysis has been proposed by Sebola and Balkwill (2006). It involves the sampling and analysis of variation at the population level, and the use of the information on intra- and inter-population variation to determine the levels of similarity at which to delimit taxa in samples where individual herbarium specimens are used as terminal units. The current study uses a standard taxon to aid in the placement of the phenon line.
Calibration of the data set. A common approach in phenetic analysis has been the inclusion of a well known taxon in the analyses to establish phenetic relationships (Hodalova & Marhold 1998;Ortiz et al. 1999;Wilkin 1999). However, Barker (1990) used what he called the 'Unanimous Inclusion Principle' in his studies on the taxonomy of Pentameris Beauv. and Pseudopentameris Conert. to test the heuristic values of phenograms, and test the species concepts. He also used this principle as an aid to delimit new taxa, and the selection of ranks for the various clusters elucidated. This approach, however, does not incorporate any means of calibrating the character set, and assumes the appropriateness of the character set for the study group.
No established procedure is yet available for selecting an appropriate character set for systematic evaluation in little-known study groups, except to select as many characters as is practicable according to one of the principles of numerical taxonomy: of considering adequate coverage of the phenotype (Sneath & Sokal 1973). Often the analysis of variance (ANOVA) or the correlations between characters are used to screen for reliable characters (Thorpe 1976). The evaluation of the character set in numerical analyses ensures that only meaningful characters are used, and that redundant characters are avoided. In this study character evaluation is pursued with the search for a reliable set of characters to delimit the standard taxon, which is then used to guide the placement of a phenon line and avoid the subjectiveness associated with delimitation of taxa in numerical analyses.
A calibration method is proposed to overcome the lack of predictability about where to place the phenon line in biological studies. The method involves analysing a data matrix (OTU's x Characters) that includes a known or standard taxon among the sample of OTU's and a data set with characters obtained from all possible sources. The initial data matrix should include a hundred or fewer representatives of the taxa under study to allow easy visualization of the OTU's on the ordination axes. Clustering procedures are followed and the dendrograms checked for clustering of OTU's, in particular those belonging to the standard taxon. Further clustering analyses can be conducted until all the OTU's of a known standard taxon form a unit distinct from other groups. If the standard taxon does not form a distinct cluster, then the definitions of characters and the scoring of character states are evaluated and the appropriateness of the algorithms used is questioned. Alternatively, the standard taxon chosen may not be a good one, in which case the currently accepted concept of the standard taxon must be reassessed. The process continues until a data matrix is obtained in which the standard taxon is recognized. The sample of OTU's can then be increased, as a robust character set will have been established. The method, therefore, requires the identification and recognition of a good standard taxon and applies to situations where OTU's in a study represent individual specimens. The logic of the technique is that the OTU's of the accepted taxon should group first with each other before joining other OTU's or clusters at higher levels of phenetic dissimilarity. The phenon line should be placed at a level of dissimilarity between that at which the last member of the standard taxon groups with the others and that at which the standard taxon joins other groups.
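The placement rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cophenetic (fusion) levels are supplied as a dictionary keyed by OTU pairs, the OTU labels are hypothetical, and the midpoint between the two levels is an arbitrary choice, since the text only requires the line to fall between them.

```python
from itertools import combinations

def phenon_line(coph, standard):
    """Return (within, between, line), or None if the standard taxon is not distinct.

    coph     : dict mapping frozenset({a, b}) -> level at which OTUs a and b join
    standard : set of OTU labels assigned to the standard taxon
    """
    otus = set().union(*coph.keys())
    others = otus - set(standard)
    # Level at which the last member of the standard taxon joins the rest of it.
    within = max(coph[frozenset(p)] for p in combinations(sorted(standard), 2))
    # Level at which the standard taxon first joins any other OTU or cluster.
    between = min(coph[frozenset((s, o))] for s in standard for o in others)
    if within >= between:
        return None  # not a distinct cluster: re-examine characters and coding
    # Assumption: use the midpoint; any level between the two would satisfy the rule.
    return within, between, (within + between) / 2.0

# Hypothetical fusion levels: e1-e3 represent the standard taxon, v1-v2 another group.
coph = {
    frozenset(("e1", "e2")): 0.25, frozenset(("e1", "e3")): 0.5,
    frozenset(("e2", "e3")): 0.5, frozenset(("v1", "v2")): 0.5,
    frozenset(("e1", "v1")): 1.0, frozenset(("e1", "v2")): 1.0,
    frozenset(("e2", "v1")): 1.0, frozenset(("e2", "v2")): 1.0,
    frozenset(("e3", "v1")): 1.0, frozenset(("e3", "v2")): 1.0,
}
result = phenon_line(coph, {"e1", "e2", "e3"})
```

A return value of None corresponds to the situation in which calibration has to continue, because some OTU of the standard taxon joins an outsider no later than it joins its own group.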
This approach ensures that the calibration method is repeatable and verifiable, and avoids the perception that calibration of the character set is merely a case of juggling the data set to obtain an a priori desired result for one cluster (in this case, a standard taxon), which will have a similar effect on the other clusters.
Verification. It is important to verify the robustness of the groups obtained in numerical analysis. Verification approaches include using data sets with different types of characters, for example using data from Light Microscopy versus data from Scanning Electron Microscopy (Vincent & Wilson 1997), using quantitative characters versus all other kinds of characters (Ortiz et al. 1999) and using data from male flowers versus data from 3- or 5-foliolate leaves (Wilkin 1999).
Comparisons should not be made between the method of verification proposed here for phenetics and the methods used in phylogenetic analyses for evaluating support for clades. One of the major concerns in cladistic analyses is to determine the robustness of clades (i.e. how well the clades are supported by the character set), and there are several indices or measures on offer in this regard. A few examples of these include the clade stability index (Davis 1993), the character jackknife (Penny & Hendy 1986; Farris et al. 1996), the data set removal index (Gatesy et al. 1999) and the character bootstrap (Felsenstein 1985). The focus here is on the bootstrap and jackknife techniques as they provide analogous approaches. These techniques are, however, not regarded as the same method of verification as employed in numerical phenetic analysis because the assumptions and context (phylogenetic versus phenetic) are different, as outlined below (Wiley & Liebermann 2011). The jackknife index measures the stability of nodes with the removal of characters, while the bootstrap assesses the stability of nodes or clades with re-sampling of characters from the original matrix. In jackknifing, a fixed percentage of characters is removed from the original data matrix without replacement and the derivative data sets constructed. The replicate data sets are analyzed phylogenetically, and the percentage of times that a particular clade is supported in the different analyses is noted (Gatesy 2000). With bootstrapping, characters from the original data matrix are re-sampled with replacement, and many data sets of equal size to the original data matrix are assembled. Each of the replicate data sets is analyzed phylogenetically, and the percentage of times that a particular clade is supported in the various analyses is noted. In both bootstrap and jackknife analyses, a high percentage of replicate analyses in which a particular clade is supported indicates high clade stability (the threshold for acceptable clade support is often set at ≥ 50% occurrence). It is customary to do at least 100 replicate analyses (Simpson & Miao 1997; Bayer & Starr 1998; Chatrou et al. 2000), but up to 1000 and more replicate analyses are also common (Bradford 1998, 2002; Compton et al. 1998; McDowell & Bremer 1998; Buckley et al. 2001; Soltis et al. 2001; Vargas 2001; Meerow et al. 2002; Wen et al. 2002; Breitwieser & Ward 2003; Wen et al. 2003). Phylogenetic computer programs such as PAUP are designed and automated to perform multiple re-sampling of large replicate analyses (Emerson et al. 1999), with the only limit being the computer memory.
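The difference between the two character re-sampling schemes described above can be illustrated with a minimal Python sketch. The function names, the 5% deletion fraction in the usage lines and the representation of each replicate as a set of recovered clades are all assumptions made for illustration; the phylogenetic analysis of each replicate is not shown.

```python
import random

def jackknife_characters(n_chars, fraction, rng):
    """Indices of characters kept after deleting a fixed fraction without replacement."""
    n_drop = int(round(n_chars * fraction))
    dropped = set(rng.sample(range(n_chars), n_drop))
    return [j for j in range(n_chars) if j not in dropped]

def bootstrap_characters(n_chars, rng):
    """Indices of characters drawn with replacement; same size as the original matrix."""
    return [rng.randrange(n_chars) for _ in range(n_chars)]

def clade_support(replicate_clades, clade):
    """Percentage of replicate analyses in which `clade` was recovered."""
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)

rng = random.Random(0)                         # fixed seed so the sketch is repeatable
kept = jackknife_characters(60, 0.05, rng)     # 60 characters, 5% deleted
resampled = bootstrap_characters(60, rng)      # 60 draws, duplicates allowed
```

In both schemes it is the columns (characters) of the matrix that are re-sampled, whereas the verification and stepwise procedures discussed in this paper re-sample the rows (OTU's).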
Compared with the verification method in phenetics, wherein up to 50% of the groups or taxa can be re-sampled, parsimony jackknifing and bootstrapping usually omit only a few data points (≤ 5% of the original data) at a time, thus making at least 100 re-samplings necessary in order to obtain a statistically meaningful basis for supporting various clades on cladograms. During verification 50% or more of the OTU's are re-sampled at a time and the replicate matrices analysed separately to assess whether similar groups are recovered in the separate analyses.
Traditionally, phylogenetic analyses employ only characters which are polarized (characters in which ancestral character state or direction of character state evolution is pre-specified), sometimes weighted but all deemed to be phylogenetically informative compared to phenetic analyses in which the main objective is to make groups based on overall similarity of as many characters as is possible (Sneath & Sokal 1973). In phenetic analysis the requirement for large numbers of characters justifies limited sub-sampling of OTU's and groups of OTU's during the verification process. In cladistic analyses the autapomorphies (derived character states that are found in only one evolutionary line) are excluded from the analyses because they are regarded as cladistically uninformative (Stuessy 1990;Bryant 1995), and yet these are character states most useful in phenetic analyses as they aid in the recognition of taxa. It has been established that jackknifing frequencies or values for clades are lower in data matrices that contain irrelevant characters or autapomorphies (Carpenter 1996). Thus, the characters regarded as autapomorphies (in cladistic terms) would be included and treated as informative in phenetic analyses since cladistics and phenetics can be applied at the same hierarchical levels but for different outcomes.
Stepwise analysis. The application of a stepwise approach in phenetic analyses rests on the premise that some OTU's, or a group of OTU's, can be excluded at once from the data matrix, and the remaining sub-matrix re-analyzed to evaluate resolution of the remaining OTU's or groups of OTU's. Therefore, the number of re-samplings and re-analyses done following a stepwise procedure in a phenetic context is not as important as is the case for parsimony jackknifing in a cladistic sense. The aim of stepwise analysis using phenetic analysis (in particular, ordination methods) is solely to allow the axes to become longer, and thus to spread out, and possibly resolve, the unresolved groups. If this can be achieved in a few re-sampling steps, then there is no need to re-sample up to 100 times because the analyses are based on overall similarity (phenetics) rather than on the influence of individual polarized data points (as in cladistics). In stepwise analysis, the idea is to sample the units (OTU's) rather than to sample characters as is the case in bootstrapping and jackknifing procedures. The assumptions underlying the use of parsimony bootstrapping and jackknifing are specific to the cladistic methodology and philosophy, and differ completely from the numerical phenetic approach upon which the stepwise approach is based.
Thus, application of bootstrapping and jackknifing in cladistic analyses is to achieve a totally different purpose (to evaluate or determine comparative support within the data set for the clades or nodes retrieved by the parsimony analysis) from that achieved through the use of verification and stepwise analysis in phenetic analyses (verifying the consistency of groups formed, and assessing whether such groups are dependent on the inclusion in the analysis of specific individual OTU's, or on the interpretation of variation among the studied taxa represented by the OTU's).
In this paper, a verification method is proposed as a means of establishing whether groups obtained in numerical analysis are dependent on the inclusion of particular specimens (OTU's), or on the pattern of variation among the taxa as represented by the OTU's studied. Thus, sub-sampling OTU's to test the effect of changing the individual OTU's and the numbers of OTU's in recovering the same groups will provide a means of assessing the reliability of clusters.
Rationale for choosing Olinia (Oliniaceae) in this study. During a monographic study (using numerical phenetic methods) of the Oliniaceae it became possible to explore the applicability of these techniques (calibration, verification, and stepwise analysis) in order to understand the morphological variation in Olinia Thunb. The Oliniaceae form a monogeneric, relatively small family that presents a number of taxonomic problems. The family is endemic to the forests of the African continent, comprises mainly shrubs and trees, and is characterized by the following features: branchlets are reddish when young, turning pale with age, and 4-angled; leaves are simple, opposite and decussate; stipules are minute and appear as ridges at the base of petioles; the inflorescence axes are pink to red; the flowers are regular, bisexual and epigynous with a narrow hypanthium tube; there are four or five petal lobes at the throat of the hypanthium alternating with an equal number of incurved scales; the ovary is inferior with four or five locules; ovules are up to three per locule, campylotropous, bitegmic and crassinucellate (Tobe & Raven 1984); fruits are pink to red with a scar remaining after the hypanthium has fallen.
Within the family there are some species groups with clearly defined limits and others with uncertain limits needing clarification. Olinia is an ideal genus in which to address the methodological issues of calibration and verification in numerical analysis for the following reasons: Firstly, there are relatively few (thirteen) described taxa in the genus, and it is thus practically feasible to include many representatives covering the known geographic range of all the taxa in the analyses. Secondly, the availability of a large number of herbarium specimens of Olinia covering the entire range of distribution makes it possible to study and analyze the morphological variation, review calibration and sub-sampling techniques in numerical analysis and provide an empirical basis for recognition of taxonomic entities in Olinia. Thirdly, the clearly circumscribed taxa on the basis of morphological criteria (Sebola & Balkwill 1999) can be used to assess the effectiveness of the methods in retrieving clearly defined taxa.
Lastly, the resolution of any of the taxonomic groups with unclear limits (the O. rochetiana complex) will add new knowledge to the taxonomy of Oliniaceae.

Current species limits in Olinia.
Species limits in Olinia have never been satisfactorily resolved, and other than the monograph by Cufodontis (1960), all other studies are regional (Sonder 1862; Hofmeyr & Phillips 1922; Burtt Davy 1926; Fernandes & Fernandes 1962; Verdcourt 1975, 1978; Verdcourt & Fernandes 1986), with the consequence that species limits and synonymy become doubtful, especially for a highly variable and geographically widespread species such as O. rochetiana A. Juss. The confusion about the taxonomy within Oliniaceae was mentioned by Mujica and Cutler (1974)  Species in the latter group were later found to exhibit a considerable overlap in morphological variation (Verdcourt 1975, 1978; Verdcourt & Fernandes 1986). Examples of the morphological features which are unreliable as diagnostic features among the taxa from tropical and tropical East Africa, Mpumalanga and Limpopo provinces (South Africa) include the dimensions of leaves and floral parts, and the degree of pubescence on vegetative and floral parts. The geographic areas of greatest morphological diversity within Olinia appear to be southern Africa and tropical East Africa, judging by the similar numbers of species names proposed for the regions, fourteen and twelve, respectively. Tobe and Raven (1984) recognized only five species in Oliniaceae, all occurring in southern Africa and St. Helena, and none in tropical East Africa. They examined the embryology of two species, O. emarginata and O. ventosa, but did not mention the other three species they recognised. Sebola and Balkwill (1999) distinguished all taxa occurring north of the Limpopo River (referred to as the O. rochetiana complex) from the South African species on the basis of leaf venation patterns (basically the same character used by Mujica and Cutler (1974)) and recognized five species
Against this background, the aims of this study were: firstly, to investigate the applicability of calibration techniques (using a standard taxon) in evaluating the character set; secondly, to use the standard taxon as a guide for specific and infra-specific delimitation in Olinia; thirdly, to investigate the utility of the verification technique as a test of the robustness of clusters in Cluster Analysis; fourthly, to apply a stepwise approach in the circumscription of taxa with unclear limits; and lastly, to determine the number of taxa in Olinia.

Materials and methods
Material and measurements. A comprehensive collection of herbarium specimens (on loan from B, BM, BOL, J, K, NBG, PRE and SAM; acronyms as per Holmgren et al. 1990) covering the entire known range of distribution of Olinia was studied and sorted a priori into hypothetical groups based on the similarity of a few macro- and micro-morphological characters. These groups were merely intuitive and served as hypotheses of taxonomic groups within Olinia that were to be tested using phenetic methodology. In total, 200 fertile (either flowering or fruiting) specimens were measured and means for each of the characters investigated were obtained for each specimen or OTU. A total of 60 characters, 11 of which are quantitative continuous (obtained by measurements), 2 quantitative discontinuous (obtained by counting) and 47 qualitative discontinuous (obtained by scoring each specimen into states), were recorded per specimen (Table 1). A minimum of five measurements was made for all the quantitative continuous characters per specimen and averaged.
Measurements of larger parts such as lengths and widths of leaves and lengths of inflorescence units were made to the nearest 0.5 mm. An ocular micrometer was used to measure smaller structures such as the lengths of hypanthia, and lengths and widths of sepal lobes at 6x to 31x magnifications to the nearest 0.1 mm. The full data matrix contained the mean values per individual specimen for each of the quantitative characters and the character states for the qualitative characters.

10. PLL: length of petal lobe measured from the hypanthium rim to the tip of the petal lobe.
11. FRTL: length of fruit measured from the point of attachment to the pedicel to the tip of the fruit.

Features of the indumentum
The indumentum was coded for absence (1) or presence (2); the degree of pubescence was coded as either slightly pubescent (1) if there were fewer than ten hairs in an area of 2 mm², or markedly pubescent (2) if there were ten or more hairs in an area of 2 mm².

Methods of analysis.
The numerical methods of analysis were carried out using NTSYS-PC version 2.0 (Rohlf 1998). The full data matrix was standardized using the STAND option to render the characters dimensionless and to reduce all characters to a scale of comparable range with a mean of zero and a standard deviation of unity. Both ordination and cluster analyses were performed on the standardized data matrix.
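As a rough illustration of what standardization by characters does, the following Python sketch converts each column of a small OTU x character matrix to zero mean and unit standard deviation. It is not the NTSYS-PC STAND procedure itself; the use of the sample standard deviation (divisor n - 1) and the example values are assumptions.

```python
from statistics import mean, stdev

def standardize(matrix):
    """Z-score each character (column) of an OTU x character matrix.

    Assumes no character is constant across OTUs (its standard deviation
    would be zero) and uses the sample standard deviation (n - 1).
    """
    stats = [(mean(col), stdev(col)) for col in zip(*matrix)]
    return [[(x - m) / s for x, (m, s) in zip(row, stats)] for row in matrix]

# Hypothetical matrix: three OTUs scored for two characters on different scales.
raw = [[1.0, 10.0],
       [2.0, 20.0],
       [3.0, 60.0]]
z = standardize(raw)
```

After this step a character measured in tenths of a millimetre and one measured in centimetres contribute on a comparable scale to any distance computed between OTU's.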
Two methods of ordination analysis were performed, namely principal components analysis (PCA) and principal coordinate analysis (PCoA), since the data set contained a mixture of quantitative and qualitative characters. Ordination techniques are concerned with approximating the entries of a dissimilarity matrix by the distances (usually Euclidean) generated by a set of points plotted in a few dimensions (Gower 1988). An ordination analysis aims to represent phenetic relationships of objects (e.g. populations or individuals) by the scattering of points in reduced dimensional space (Chandler & Crisp 1998). The advantages of ordination over clustering are that it makes few assumptions regarding the nature of the relationships in the data set and that it does not impose a hierarchical structure on the data. Ordination can also identify multiple overlapping patterns (Faith & Norris 1989). In practice the OTU's are represented in the first 2 or 3 dimensions, which often explain most of the variation present in the data (Baum 1986). The results are considered more reliable when there is a higher percentage of variance explained in the first two or three axes (Sneath & Sokal 1973). However, as Baum (1977) has demonstrated, the proportion of variance explained in the first three axes can be altered by simply subjecting the data matrix to some form of transformation. Sneath and Sokal (1973) warn that the use of ordination techniques may not always yield simple, low dimensional results that are easy to interpret, and suggest that ordination methods be used in conjunction with clustering techniques. This approach was followed in this study.
Principal Components Analysis makes no assumptions of group membership of OTU's, but attempts to portray multidimensional variation in the data set in the fewest possible dimensions, while maximizing the variation (Van den Berg et al. 1998). According to Austin (1985) the advantage of PCA is that it makes use of all the information contained in the similarity matrix to determine the component axes, and that it is accurate for between-group distances. It is a general trend in taxonomic studies employing PCA (Vincent & Wilson 1997; Naczi et al. 1998; Hodalova & Marhold 1998; Van den Berg et al. 1998; Casas et al. 1999; Ortiz et al. 1999) to consider only two or three component axes because, in practice, the first two or three principal components usually explain most of the useful taxonomic variation in the data. The fourth and subsequent principal components are often ignored, as these do not provide any meaningful information not yet explained by the first three components (Hodalova & Marhold 1998; Semple et al. 1990). Marcus (1990) warns that PCA should be recognized for what it is: a data projection and rotation technique summarizing most of the variability in the data, where one may search for patterns and clusters in displays and get some idea of influential and associated variables giving rise to the displays. In this study, PCA was applied strictly to the quantitative continuous characters as it is not suitable for discrete qualitative characters (Sneath & Sokal 1973; Schilling & Haiser 1976; Kent & Coker 1992). PCA was therefore performed from the correlation matrix on the standardised data (Rohlf 1998) using the procedures STAND to standardise the data matrix by variables, EIGEN to compute a matrix of correlations among the OTU's and to extract eigenvectors from the correlation matrix, PROJ to project the standardised data onto these eigenvectors, and MOD3DG to generate a 3-dimensional plot of the OTU's.
Principal Coordinate Analysis can be applied to data sets containing both quantitative continuous and qualitative discontinuous characters (Small & Brookes 1990; Small et al. 1999). It is also the preferred ordination method for association data, DNA (RAPD) or immunological data (Marcus 1990). The method uses inter-OTU distances (OTU by OTU matrix) rather than the raw character state data. While PCA is based on the character-by-character sums of squares and cross-products matrix (Rohlf 1998), principal coordinate analysis is regarded as 'dual' to PCA because it is based on the individual-by-individual distance squared matrix, which can also be transformed to a sums of squares and cross-products matrix as in PCA (Marcus 1990). Unlike PCA, this method, together with multi-dimensional scaling, is not constrained by the nature of the data set, i.e. it can be used to analyze a data set made up of both continuous and discontinuous data (Sanfilippo & Riedel 1990; Tardif & Hardy 1995).
Otherwise, PCoA and PCA give identical results. The distances among the objects are maximally summarized by the first, and then the second, down to the last principal coordinate as in PCA. In this study PCoA was applied on the full data set containing both quantitative continuous and qualitative discrete characters as the method does not have the same constraints on the data set nor the same assumptions as the principal components analysis (Austin 1985). Principal coordinate analysis was performed from the correlation matrix on standardised data using the procedure SIMINT to compute a matrix of distances between OTU's, DCENTER to double-center the distance matrix, EIGEN to factor the double-centered matrix, and MOD3D to use eigenvectors to project the OTU's in 2D or 3D space. All these options are available in the NTSYS-PC package (Rohlf 1998).
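The double-centring and factoring steps of principal coordinate analysis can be sketched in plain Python as follows. This is an illustration only and not the NTSYS-PC routines: a simple power iteration stands in for the eigen-decomposition, only the first axis is extracted, and the example distances are hypothetical. The sketch assumes that the dominant eigenvalue is positive, which holds for Euclidean distances.

```python
import math

def pcoa_first_axis(dist, iterations=200):
    """First principal coordinate from a symmetric OTU x OTU distance matrix."""
    n = len(dist)
    # Gower's transformation: a = -0.5 * d^2, then centre rows and columns.
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in a]          # row means (equal to column means)
    grand = sum(row) / n
    b = [[a[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]
    # Power iteration for the dominant eigenvector (stand-in for a full solver).
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    # Scale the eigenvector so that its squared length equals the eigenvalue.
    return [x * math.sqrt(eigenvalue) for x in v], eigenvalue

# Hypothetical distances among three OTUs that lie on a single line (0, 1 and 3).
dist = [[0.0, 1.0, 3.0],
        [1.0, 0.0, 2.0],
        [3.0, 2.0, 0.0]]
coords, eigenvalue = pcoa_first_axis(dist)
```

Because the three example OTU's lie on one line, the first axis alone reproduces all the original distances; with real data several axes are needed and each summarizes a decreasing share of the distances.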
For cluster analysis, only those characters that were effective in discriminating between a priori groups (i.e. judged by high eigenvector scores) in the first three axes of ordination analyses were used. This approach was followed since cluster analysis is known to impose a hierarchical structure on any data (Thorpe 1983), and often shows clusters that may not be recoverable in ordination analyses (Chandler & Crisp 1998). Cluster analysis was performed by calculating the distance matrix between OTU's using the average taxonomic distance coefficient from the standardised matrix, clustering the OTU's by using the Unweighted Pair-Group Method of Arithmetic Averages (UPGMA), and computing the co-phenetic values and the co-phenetic correlation using COPH and MXCOMP, respectively, in order to measure the distortion between the original distance matrix and the resultant phenogram (Crisci et al. 1979; McDade 1997). A co-phenetic correlation value of one indicates a perfect match and lower values indicate that placing the taxa in a phenogram distorts the original distance matrix to a greater or lesser extent (Rohlf 1998; McDade 1997). The UPGMA was used because it produces better phenograms than either the single linkage or the complete linkage method (Crisci et al. 1979), it has become the most widely used clustering method in taxonomic investigations (Crisci et al. 1979; Hill 1980; Bartish et al. 1999; Small et al. 1999; Marcussen & Borgen 2000), and it is deemed to be more space-conservative and shows the highest co-phenetic correlation coefficient (Chandler & Crisp 1998; Duncan & Baum 1981). Cluster analysis was therefore used to test whether similar groups to those obtained in ordination analyses could be recovered, and also to visualise the level of morphological similarity/dissimilarity using appropriate coefficients between and within the a priori groups.
narrower hypanthia and five, rather than four, white petal lobes.
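The clustering steps named above (average taxonomic distance, UPGMA and the co-phenetic correlation) can be sketched in plain Python as follows. This is a naive illustration that is suitable only for small matrices and is not the NTSYS-PC implementation; the function names and the four example OTU's are assumptions.

```python
import math
from itertools import combinations

def average_taxonomic_distance(x, y):
    """Square root of the mean squared difference over characters."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def upgma_cophenetic(labels, dist):
    """Cluster by UPGMA and return {frozenset({a, b}): fusion level}."""
    def link(c1, c2):
        # Average linkage: mean of all distances between members of c1 and c2.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(label,) for label in labels]
    coph = {}
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: link(*pair))
        level = link(c1, c2)
        for a in c1:
            for b in c2:
                coph[frozenset((a, b))] = level  # level at which a and b first join
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return coph

def cophenetic_correlation(dist, coph):
    """Pearson correlation between original and co-phenetic distances."""
    pairs = list(dist)
    xs = [dist[p] for p in pairs]
    ys = [coph[p] for p in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical standardized scores for four OTUs on a single character.
scores = {"a": [0.0], "b": [1.0], "c": [4.0], "d": [6.0]}
dist = {frozenset(p): average_taxonomic_distance(scores[p[0]], scores[p[1]])
        for p in combinations(scores, 2)}
coph = upgma_cophenetic(list(scores), dist)
r = cophenetic_correlation(dist, coph)
```

In this small example the OTU's a and b join at level 1, c and d at level 2, and the two pairs at level 4.5; the correlation r of about 0.85 reflects the distortion introduced by representing the six original distances with only three fusion levels.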
Calibration. Analyses were performed first on an initial data matrix containing 68 OTU's and 60 characters (Table 1), and then further analyses were done following the evaluation of characters regarded as aspects of the same feature. Thus, leaf dimensions (leaf lengths, leaf widths and leaf length:width ratios) were not included simultaneously in any analysis to avoid over-weighting of characters. Univariate analyses of variance (ANOVA) for each of the quantitative characters were made to allow an objective assessment of any significant differences between the means of characters among the a priori groups. The characters for which there were missing data for most OTU's were excluded from this analysis, thus leaving only eighteen characters for univariate analysis. This process allowed for the selection of characters most likely to discriminate the standard taxon from other a priori groups on the one hand and possibly to discriminate among other a priori groups on the other, given that the primary emphasis was on the standard taxon.
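The univariate screening of a single quantitative character can be illustrated with a one-way ANOVA F statistic. The sketch below is a minimal Python version under stated assumptions: the group values are hypothetical, and no p-value is computed because that would require the F distribution.

```python
def one_way_anova_f(groups):
    """F statistic for one character measured in several a priori groups.

    groups : list of lists, one list of specimen means per a priori group
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Variation of specimens around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical leaf lengths (mm) for three a priori groups.
f_value = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

A large F value indicates that the group means differ by more than the within-group scatter would suggest, which is the kind of evidence used here to retain a character for discriminating the standard taxon.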
The OTU's of the standard taxon included for analysis should represent the entire range of its geographic distribution. During the analyses, if OTU's of the standard taxon did not cluster together, it would be necessary to examine the data set for any errors in coding of characters and character states before questioning the validity of the standard taxon (Barker 1990). The level on the phenogram at which the last member of the standard taxon joins other OTU's of the standard taxon was used to position a phenon line. The total number of OTU's was then increased to 200 with more material belonging to other a priori groups.
Verification. The full data matrix with 200 OTU's was subdivided to create two derivative matrices, each with a total of 100 OTU's. The two data matrices were created such that each had the same number of OTU's of the standard taxon (seven), but varying numbers of OTU's in other a priori groups. This was done to ensure there were sufficient OTU's of the standard taxon in the derivative matrices, because there was a limited number of well-documented OTU's of this taxon compared with the number of OTU's belonging to other a priori groups. Each of the data matrices was analysed separately and the results compared with those from the full data matrix to check for the formation of similar groups.
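The subdivision and comparison steps can be sketched as follows. This is a hypothetical illustration: the text does not state how specimens were allocated to the two derivative matrices, so the random allocation, the even division of the standard-taxon OTU's, the group names and the function names are all assumptions.

```python
import random

def split_for_verification(groups, standard, rng):
    """Split OTUs into two samples with equal numbers of standard-taxon OTUs.

    groups   : dict mapping a priori group name -> list of OTU labels
    standard : name of the standard taxon in `groups`
    """
    std = groups[standard][:]
    rng.shuffle(std)
    half = len(std) // 2                    # an odd leftover OTU is not used
    first, second = std[:half], std[half:2 * half]
    others = [o for name, members in groups.items() if name != standard
              for o in members]
    rng.shuffle(others)                     # other groups end up in varying numbers
    mid = len(others) // 2
    return first + others[:mid], second + others[mid:]

def same_groups(clusters_a, clusters_b):
    """True if two clusterings agree on the OTUs they share (labels ignored)."""
    shared = set().union(*clusters_a) & set().union(*clusters_b)
    def restrict(clusters):
        return {frozenset(c & shared) for c in clusters if c & shared}
    return restrict(clusters_a) == restrict(clusters_b)

# Hypothetical a priori groups: 14 standard-taxon OTUs and 186 others.
groups = {
    "standard": ["s%d" % i for i in range(14)],
    "group1": ["g%d" % i for i in range(100)],
    "group2": ["h%d" % i for i in range(86)],
}
sample_a, sample_b = split_for_verification(groups, "standard", random.Random(0))
```

Each derivative sample would then be clustered separately, and same_groups can be used to check whether the clusters recovered from a sub-sample match those of the full analysis for the OTU's the two analyses have in common.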
Stepwise approach. Stepwise analysis as advocated in this study (i.e. numerical phenetic analysis involving both ordination and cluster analysis) refers to the systematic assessment of phenetic relationships and clustering among dissimilar OTU's when the OTU's representing clearly recognisable taxa are removed from the data matrix before further analyses. In phenetic analysis, it is known that the presence of phenetically dissimilar OTU's representing clearly recognisable taxa can cause the remaining OTU's to cluster together even when these are not phenetically similar to each other (Sneath & Sokal 1973; Kent & Coker 1992). In this and similar situations, a stepwise procedure can be followed.
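The effect described above can be reproduced on a toy data set. The sketch below is an assumption-laden illustration, not the software used in the study: it uses Euclidean distances, extracts only the first principal coordinate by power iteration, and all names and data are invented. One distinct group ("a") differs on the first character, while two sub-groups of "x" differ only on the second.

```python
import math


def distance_matrix(rows):
    """Euclidean distances between all pairs of OTU's (an assumption)."""
    return [[math.dist(r, s) for s in rows] for r in rows]


def first_pcoa_axis(dist, iterations=200):
    """Scores of the OTU's on the first principal coordinate."""
    n = len(dist)
    sq = [[x * x for x in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    # Gower double-centring of the squared distances.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]
    # Power iteration; a non-constant start vector is needed because the
    # centred matrix maps any constant vector to zero.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return [x * math.sqrt(max(eigenvalue, 0.0)) for x in v]


def stepwise_first_axis(rows, labels, exclude_order):
    """Re-run the ordination after excluding groups one after the other."""
    results, excluded = {}, set()
    for step in [None] + list(exclude_order):
        if step is not None:
            excluded.add(step)
        keep = [i for i, g in enumerate(labels) if g not in excluded]
        axis = first_pcoa_axis(distance_matrix([rows[i] for i in keep]))
        results[tuple(sorted(excluded))] = dict(zip(keep, axis))
    return results


def subgroup_gap(axis, first, second):
    """Distance between the mean axis scores of two sets of OTU's."""
    mean = lambda idx: sum(axis[i] for i in idx) / len(idx)
    return abs(mean(first) - mean(second))


# Invented data: two characters, group "a" far away on character 1.
rows = [(10.0, 0.0), (10.0, 1.0), (0.0, 0.0), (0.1, 0.0), (0.0, 1.0), (0.1, 1.0)]
labels = ["a", "a", "x", "x", "x", "x"]
steps = stepwise_first_axis(rows, labels, ["a"])
gap_full = subgroup_gap(steps[()], [2, 3], [4, 5])
gap_reduced = subgroup_gap(steps[("a",)], [2, 3], [4, 5])
```

With group "a" present, the first axis is taken up by the contrast between "a" and "x", and the two sub-groups of "x" coincide on it; once "a" is excluded, the first axis separates them.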
An alternative approach has been used to identify and eliminate redundant characters (i.e. those not contributing significant information to the discrimination of natural populations) in the Eucalyptus risdonii - E. tenuiramis complex (Wiltshire et al. 1991). However, it should be noted that removing characters from the data matrix may not necessarily lead to the same outcome in the analyses as removing OTU's. As for the material of Olinia from tropical East Africa, little is known of the causes of the high level of morphological variation within and between OTU's, and hence the stepwise approach was adopted in the ordination analysis of the pattern of morphological variation. A stepwise approach was applied in an ecological study (Stalmans et al. 1999

Results
Calibration. In the analysis (UPGMA clustering) of the initial data set, containing a total of 68 OTU's, the OTU's belonging to the standard taxon (O. emarginata) did not form a separate cluster (results not shown) but were scattered among other a priori groups. The ranges of quantitative characters used (Table 2) indicate that it is not possible to distinguish individuals from the given a priori groups using single characters, but that it is a combination of characters that can be used to distinguish between individuals of the a priori groups. Univariate analysis of variance (Table 2) showed that for each of the quantitative characters the means of at least three a priori groups (including the standard taxon) differed significantly from the means of at least two other a priori groups. It is also obvious from  (Sebola & Balkwill 1999).
The lack of separation of the standard taxon and any other a priori groups led to an evaluation of the data set for any errors in the coding of characters and character states. Characters regarded as logically coding for the same feature were not included in the analysis simultaneously, but one at a time, and the cluster analysis was re-run. Qualitative discrete characters such as shape of leaves and petal lobes, as well as density of indumentum on floral parts, were also excluded from the analyses, but their exclusion was found not to affect the separation of a priori groups (results not shown). It was the exclusion of leaf length from the data matrix that produced a phenogram (Figure 2) in which all OTU's of the standard taxon and other a priori groups formed distinct clusters at a taxonomic distance of 1.10.
According to Rohlf (1998) [...]
In the analyses of the sub-samples of the full data matrix, clusters similar to those obtained from the full data matrix were recovered (Table 3), except for the misplacement of two OTU's belonging to a priori group e and the splitting of a priori group x into four sub-groups (Figures 4a & 4b). Upon examination of the data matrix, the two misplaced OTU's of a priori group e were found to be coded for fruit characters in addition to vegetative and floral characters.
Table 3. Number of OTU's misplaced from their a priori groups during the sub-sampling procedure. * = OTU's formed four sub-clusters, all occupying the same phenetic space in ordination (PCoA) analyses. A priori groups as in Table 2.
Stepwise analysis. The apparent splitting into four sub-groups of the OTU's belonging to a priori group x was investigated using a stepwise ordination analysis, in which the a priori groups forming distinct clusters were excluded from the data matrix, one after the other, and the principal coordinate analysis re-run. Most of the a priori groups (a - d) forming distinct clusters appear to the right side along the first PCoA axis (Figure 3), except for the a priori groups e & f found together with the sub-groups of a priori group x to the right along the first PCoA axis. The a priori groups a, b, c, d, e and f were sequentially excluded from the full data set, and the data matrix was reanalysed using principal coordinate analysis. Only the results in which a priori groups a to f were excluded are presented (Figure 5). In this analysis, involving only OTU's of a priori group x (i.e. the O. rochetiana complex), the OTU's were found to split into four sub-groups/clusters along the first axis (as in Figure 4b), and seven clusters along the third axis (Figure 5).
Characters most strongly correlated with the first axis were mainly quantitative (hypanthium length, petal length, fruit length, petiole length and inflorescence unit length), together with only three qualitative characters (petal shape, presence/absence of indumentum on petal lobes and on styles). The results showed that as more of the a priori groups of unquestionable phenetic distinctness were removed from the analysis, more ordination space became available for the remaining a priori groups to spread beyond their original positions, thus allowing characters that correlated with other axes to become dominant. Different suites of characters changed roles in contributing to the separation of the remaining a priori groups during the stepwise analysis (Table 4). There were statistically significant differences in the numbers of characters with eigenvector scores or loadings (either positive or negative) > 0.5 during and after the stepwise analysis, indicating that different suites of characters had become active in separating the remaining a priori groups during stepwise analyses (Appendix 1). For the categories of characters b - e in Table 4 (eigenvector scores that had been > 0.5 and had increased; scores that had been > 0.5 but had decreased; scores that had been < 0.5 but increased to ≥ 0.5; and scores that had been > 0.5 but decreased to < 0.5), a comparison was only possible between the analyses in which clearly defined clusters were excluded and the analysis of the full data matrix, in which all clusters representing the a priori groups were included, because no comparison of the changing roles of different sets of characters could be made before stepwise analysis was undertaken. Thus, Table 4 [...]
[...] characters, excluding leaf lengths following calibration (using UPGMA clustering on a distance matrix); cophenetic correlation coefficient (r) = 0.92. The a priori groups as in Figure 1.
[...] PCoA plot of the first two coordinate axes obtained from analysing the first sub-sample of 100 OTU's of the full data set of Olinia used in Figure 3. The a priori groups as in Figure 3.
[...] (a) with eigenvector scores of > 0.5, (b) in which eigenvector scores had been more than 0.5 but had increased, (c) in which eigenvector scores had been more than 0.5 but had decreased although still above 0.5, (d) with eigenvector scores that had been < 0.5, but had now increased to ≥ 0.5 and (e) in which eigenvector scores had been > 0.5 but now decreased to < 0.5. One-way analysis of variance (F-values) of the mean number of categories of characters (a - e); ns = not significantly different, * = significantly different (p < 0.05), dash indicates categories not applicable.
[...] vanguerioides (Sebola & Balkwill 1999) and x = the O. rochetiana complex (Verdcourt & Fernandes 1986).

Discussion
It is therefore important that several analyses be conducted on the data during phenetic investigations: firstly to calibrate the data set based on the unity of members of a known taxon, and secondly to analyse sub-samples of the data matrix to check for consistent retrieval of the same clusters, including that of the known taxon. This approach will be particularly useful in studies of taxa on a monographic scale, with the benefit of analysing variation within taxa over their full known range of distribution. Calibrating the data set with the standard taxon can better inform decisions on where to delimit taxa on phenograms in Cluster Analysis: the criterion for delimitation is the level of phenetic dissimilarity at which members of the standard taxon join each other before they join other clusters. More than one standard taxon can be included in the analyses, as in Barker (1990), to ensure that calibration of the data set is not influenced by a single concept of a standard taxon. The concern of Clifford & Williams (1973) that cluster size tends to affect the placement of the phenon line can be addressed by keeping the total number of OTU's of the standard taxon more or less the same as the total number of members of the study group, thereby avoiding the influence of different cluster sizes on the level at which the phenon line is placed. The use of a standard taxon is therefore more objective than the traditional approach of deciding arbitrarily where to delimit taxa in Cluster Analysis, an approach that was discredited by Clifford & Williams (1973). The similarity or dissimilarity coefficients in phenograms are used for choosing the levels at which to recognise and delimit taxonomic groups (Sneath & Sokal 1973), and the scales are influenced by the types of characters used (continuous quantitative versus discrete qualitative) and the type of coefficient used (i.e. distance or correlation).
In addition, the level of variance represented by the OTU's within a cluster can also influence the similarity/dissimilarity level at which to recognise taxonomic groups (Thorpe 1983 [...] rochetiana complex (Figure 5) did not form a coherent group, and this is consistent with Verdcourt's (1975) [...] were positively or negatively correlated) in the PCoA have been used in the key to distinguish between species of Olinia in South Africa (Sebola & Balkwill 1999). There was a significant difference in the number of characters in which eigenvector scores had been > 0.5 before exclusion of clearly defined clusters in the analysis, but had decreased to below 0.5 when clearly defined clusters were excluded from the analysis. This was particularly obvious in the second and third axes (Table 4). Thus, as more clearly defined clusters were excluded from the analyses, more of the characters which had eigenvector scores of < 0.5 became active (i.e. eigenvector scores of > 0.5) in separating the remaining groups. The stepwise approach cannot, however, provide an overall spatial picture of relationships among all clusters of the study group (Parnell 1999), and is only helpful in situations where there is difficulty in interpreting phenetic similarities of some clusters in ordination analysis. The use of a similar approach, stepwise discriminant analysis, to distinguish between groups in the study of Eugenia and Syzygium in Thailand (Parnell 1999) established that the exclusion of some OTU's (i.e. those belonging to segregate genera) affected the eigenvector values, but did not alter significantly the relative importance of the characters for each axis. In stepwise analysis (using PCoA) of Olinia specimens in which four and six clearly defined clusters were excluded, the mean number of characters in which eigenvector scores had been > 0.5 did not differ significantly (p = 0.05) in the third PCoA axis, but differed significantly in the first and second axes.
Similarly, there were also no significant differences in the mean number of characters in which eigenvector scores had been > 0.5 but increased in the second and third PCoA axes when all clearly defined clusters (represented by a priori groups a - f) were excluded from the analysis. [...] rochetiana complex (Verdcourt & Fernandes 1986). Our study also supports Verdcourt's (1978) and Verdcourt & Fernandes' (1986) [...] Verdcourt (1975 & 1978). This complex is geographically widespread and occupies various habitats with varying climatic conditions that possibly contribute to its overall variability. Most of the morphological characters used to delimit species of Olinia occurring in southern Africa overlap considerably among groups within this complex. As a follow-up to this study, a comprehensive investigation and analysis of the morphological variation within the O. rochetiana complex was undertaken at the population level (Sebola & Balkwill 2006). A stepwise approach to PCoA has been described and applied in ecological studies (Stalmans et al. 1999), but never applied in systematic studies. This approach was applied to the taxonomy of Olinia in this study. However, a similar approach, stepwise discriminant analysis, has been applied in a study of Eugenia and Syzygium in Thailand (Parnell 1999). Cooley and Lohnes (1971) cautioned against the use of stepwise regression analysis, which involves adding or subtracting one predictor at a time to the regression equation. The difference between Cooley and Lohnes' (1971) approach and the stepwise analysis advocated in this paper is that the latter focuses on sampling groups or clusters, which is different from subtracting or adding predictors (i.e. characters). The calibration technique was applied in the analysis of Pentameris and Pseudopentameris (Barker 1990 [...] Table 1.