QV calculator: Quasi-variances in Xlisp-Stat and on the web

The most common summary of a fitted statistical model, a list of parameter estimates and standard errors, does not give the precision of estimated combinations of the parameters, such as differences or ratios. For this, covariances also are needed; but space constraints typically mean that the full covariance matrix cannot routinely be reported. In the important case of parameters associated with the discrete levels of an experimental factor or with a categorical classifying variable, the identifiable parameter combinations are linear contrasts. The QV Calculator computes ‘quasi-variances’ which may be used as an alternative summary of the precision of the estimated parameters. The summary based on quasi-variances is simple and permits good approximation of the standard error of any desired contrast. The idea of such a summary has been suggested by Ridout (1989) and, under the name ‘floating absolute risk’, by Easton, Peto & Babiker (1991). It applies to a wide variety of statistical models, including linear and nonlinear regressions, generalized-linear and GEE models, Cox proportional-hazard models for survival data, generalized additive models, etc. The QV Calculator is written in Xlisp-Stat (Tierney, 1990) and can be used either directly by users who have access to Xlisp-Stat or through a web interface by those who do not. The user either supplies the covariance matrix for the effect parameters of interest, or, if using Xlisp-Stat directly, can generate that matrix by interaction with a model object.

A standard presentational device, both in computer output and in published work, is a list of model parameter estimates and associated standard errors.Such a summary is not sufficient to provide standard errors for estimated combinations of the parameters, such as differences or ratios, which may often be of interest: for this, estimated covariances are also needed.Space constraints, however, typically mean that the full covariance matrix cannot routinely be reported.
In the important case of parameters β 1 , . . ., β p associated with the p levels of an experimental factor or with a p-category classifying variable, the identifiable linear combinations are contrasts of the form p j=1 c j β j , with p j=1 c j = 0, including the simple between-level or between-group comparisons β i − β j .From the usual list of estimates and standard errors, with for example the constraint β 1 = 0 imposed, only the p − 1 comparisons {β j = β j − β 1 , j = 2, . . ., p} can be made; standard errors for other contrasts are not available in the absence of the relevant estimated covariances { cov( βi , βj ) : i, j = 2, . . ., p}.The solution implemented in the QV Calculator is as follows: from the full covariance matrix, determine p 'quasi-variances', represented by q 1 , . . ., q p , such that for all contrast vectors c = (c 1 , . . ., c p ). Then if q 1 , . . ., q p are reported along with the p estimates β1 , . . ., βp , an approximate standard error is available for any contrast c i βi in the intuitively appealing form ( c 2 i q i ) 1/2 ; the variance of a simple contrast βi − βj is approximated simply by q i + q j .This basic idea appears to have been suggested first by Ridout (1989) and then, independently and under the name 'floating absolute risk', by Easton, Peto & Babiker (1991).

Approximation method
In the QV Calculator the quasi-variances {q i } are chosen to minimize the mean square of errors on the log scale, log(q i + q j ) − log v ij , over the set of variances v ij for simple contrasts βi − βj .Thus relative error is controlled: small contrast variances are deliberately approximated more accurately than large ones, but the relative errors {(q i + q j )/v ij − 1} are of similar magnitude.
Ridout (1989) also suggests controlling relative error, by minimizing the mean of In practice Ridout's method gives very similar results to those obtained from the QV Calculator, and indeed it can be shown (Menezes, 1999) that they are equivalent to first order.Easton, Peto and Babiker (1991), on the other hand, suggest choosing the quasi-variances to minimize the mean square of (q i + q j ) − v ij .This controls absolute rather than relative error, and can lead to poor approximation of small contrast variances (in terms of relative error) in order to achieve good approximations to larger ones, which is undesirable.McCullagh & Nelder (1989, pp204-8) analyse the rates of occurrence of damage incidents to cargo ships, using a Poisson-type loglinear model with allowance for overdispersion.The main-effects model reported by McCullagh & Nelder is summarized as follows by GLIM (Francis et al., 1993): ?$display v$ !prints the variance-covariance matrix of estimates 0.0799 -0.0530 0.0533 -0.0458 0.0428 0.1824 -0.0396 0.0390 0.0384 0.1428 -0.0408 0.0404 0.0412 0.0387 0.0941 -0.0266 0.0038 0.0030 0.0020 -0.0002 0.0379 -0.0343 0.0138 0.0043 0.0024 -0.0025 0.0272 0.0487 -0.0344 0.0160 0.0126 -0.0111 0.0049 0.0281 0.0367 0.0919 -0.0094 0.0009 -0.0002 -0.0003 0.0013 -0.0036 -0.0089 -0.0147 0.0237 Suppose that we are interested in the 5-level factor TYPE (the type of ship).The QV Calculator computes the quasi-variances as follows: The 'quasi-s.e.' column simply presents square roots of the quasi-variances {q i }, which are then quasi standard errors on the same scale as the estimated effects.

An example
The QV Calculator also gives an indication of the accuracy achieved in the approximation of contrast variances.Typically this is rather good; for example, relative errors for the standard errors of the ten simple contrasts in this case are all less than 1%.The quasi-variance approximation of contrast variances is exact in certain special situations, notably when p = 3: in that case the three quasi-variances q 1 , q 2 , q 3 summarize perfectly the two variances var( β2 ), var( β3 ) and single covariance cov( β2 , β3 ).When p > 3 the quasi-variance approximation is exact only in special models, such as normal-linear or Poisson-loglinear models with a balanced design matrix; Menezes (1999) explores such 'exact' situations, and a fairly wide range of others where the approximation is nearly exact, in detail.
For a two-level factor, i.e., p = 2, the notion of quasi-variances is redundant: there is only one contrast, and no covariance to take into account.Thus, for example, it would be pointless to compute quasi-variances for the two-level factor PERIOD in the above example: an infinite number of exact quasi-variance pairs (q 1 , q 2 ) is available, and the choice among them arbitrary.If asked to compute quasi-variances for the effect of such a binary variable, the QV Calculator reports an error.

Suggested use of quasi-variances in presenting results
The quasi-variances may be used either as a supplement to, or in place of, 'conventional' standard errors which relate to contrasts with a reference category.Their interpretation is the same as that of the variances of independent estimated means μ1 , . . ., μp in a one-way layout; hence the terminology, 'quasi-variances'.
The presentation of p quasi-variances (or their square roots, 'quasi standard errors') has two main advantages over a conventional set of p − 1 standard errors of contrasts relative to a reference category.First, by construction the quasi-variances allow fairly accurate approximation of the variance of any desired contrast by any subsequent reader, without needing the full set of covariances.Second, the presentation of quasi-variances facilitates the combination of results from different studies, where conventional standard errors might be problematic if different parametrizations (different reference categories, for example) were used in the reports of the different studies.

Using the QV Calculator with Xlisp-Stat
The QV Calculator is called in one of two ways: the function qvcalc works with a user-supplied variance-covariance matrix, while function get-qvs operates on a Lisp-Stat model object.(For general information on Lisp-Stat and object-oriented statistical modelling, see Tierney, 1990.)In either case the result is a quasi-variance object, an instance of the object prototype qv-proto.Methods are available for displaying such an object, for computing the exact and approximated variance of any contrast, and for calculating the worst error of approximation in the set of all possible contrasts.

Function qvcalc
Suppose that we are interested in the 5-level factor TYPE (the type of ship) in the example of Section 3. The relevant part of the covariance matrix is 0 0 0.0533 0 0.0428 0.1824 0 0.0390 0.0384 0.1428 0 0.0404 0.0412 0.0387 0.0941 Notice the column of zeros here, which reflects the fact that GLIM has set the parameter for TYPE(1) to zero and accordingly has omitted the corresponding column from the displayed variance-covariance matrix.Other model-fitting packages have different conventions, and it does not matter which one is used as long as the covariance matrix corresponding to all levels of the factor (i.e, all 5 levels here rather than just 4) is provided.Since the QV Calculator operates only on contrast variances, its results are independent of any constraints used in fitting the model.
The quasi-variances are computed, and stored with name ship-type-qvs, by > (def ship-type-qvs (qvcalc 5 'VCL (list 0 0 0.0533 0 0.0428 0.1824 0 0.0390 0.0384 0.1428 0 0.0404 0.0412 0.0387 0.0941))) The resultant object can now be displayed, or operated upon in various ways.Information can be obtained by sending ship-type-qvs the :help message: The largest error incurred in this example is thus in the region of just 2%.It is computed from a formula derived by Menezes (1999), based on an eigenvalue calculation.
Information about the model and the effect of interest may be stored, as a reminder to the user, in the slots :model-name and :factor-name.For example, > (send ship-type-qvs :model-name "main effects model") "main effects model" > (send ship-type-qvs :factor-name "ship type") "ship type" A nicely-formatted summary of the quasi-variances and associated information about the quality of approximation is obtained by  ---------------------------- For the standard error of a simple contrast, i.e., of a difference between two levels, the error incurred by the quasi-variance approximation is between -.7% and 0.9%.
In the set of *all* possible contrasts the approximation error is between -2.1% and 1.6%.
The three mandatory arguments to qvcalc are: (i) the number of levels, (ii) a quoted symbol indicating the style in which variance-covariance information will be entered, and (iii) the variance-covariance information itself.An optional fourth argument can be set to t if the result required is the formatted summary rather than the quasi-variance object itself; this is used in communicating with the web interface (see Section 6).
> (help 'qvcalc) QVCALC [function-doc] Args: (levels style data &optional (web nil)) LEVELS -the number of levels of the factor of interest (including any reference level) STYLE -the way in which the variance-covariance information will be supplied.This must be one of 'VC --the whole variance-covariance matrix, either as a list or as a Lisp-Stat matrix representation 'VCL --the lower triangle of the variance-covariance matrix, as a list 'CORR --coefficient standard errors (including a zero for any reference level) and lower triangle of the correlation matrix (excluding the diagonal), as a single list.(Only styles 'VCL and 'CORR are available from the web input form.)DATA -a list of the variance-covariance information in the style chosen WEB -indicates whether this function is being called from the web page.If so, the function returns a string summarizing the quasi-variance approximation, rather than a quasi-variance object.Example of use: (qvcalc 5 'VCL (list 0 0 0.0533 0 0.0428 0.1830 0 0.0390 0.0384 0.1427 0 0.0404 0.0411 0.0387 0.0940)) for the ship-type example given on the QV Calculator web page (http://www.stats.ox.ac.uk/~firth/qvcalc/); see also file 'example.lsp'.The result is a quasi-variance object, unless WEB is non-nil in which case it is a nicely-formatted summary string.The summary string can be obtained by sending the :SUMMARY message to the quasi-variance object.

Application of get-qvs to a model object
If the model of interest has itself been fitted in Lisp-Stat, the relevant variancecovariance information can be extracted directly from the model object by get-qvs, whose result is then the required quasi-variance object.
The file example.lspwhen loaded into Xlisp-Stat makes an object, ship-model, corresponding to the main-effects model for the ship damage data.The contents of the file are reproduced in Appendix A.
The estimated parameters and standard errors as reported by poissonreg-model are To make the quasi-variance object for TYPE as before, we can use either > (def ship-type-qvs (get-qvs ship-model (list 1 2 3 4) :model-name "main effects model" :factor-name "ship type"))

Constant
or > (def ship-type-qvs (get-qvs ship-model "type" :model-name "main effects model" :factor-name "ship type")) The first argument of get-qvs is the model object of interest.The second argument identifies which are the particular model coefficients of interest, either by indicating their positions (list 1 2 3 4) in the list of parameter estimates (starting at 0), or using a string (here "type") which unambiguously matches the beginning of the variable name as printed out with the estimates and standard errors.The full description of get-qvs is > (help 'get-qvs) GET-QVS [function-doc] Args: (model-object subset &key (reference-level 1) (factor-name nil) (model-name nil)) MODEL-OBJECT -a model object (which inherits from REGRESSION-MODEL-PROTO, or is an instance of GEE-PROTO or COX-REGRESSION-PROTO, for example)

SUBSET
-either a predictor-name prefix string or a list of indices corresponding to the coefficients of the factor of interest REFERENCE-LEVEL -the level (if any) whose coefficient has been set to zero to identify the parameters.This is level 1 by default.Set this to NIL for a parametrization where no coefficient has been constrained to zero and the variance-covariance matrix of all p coefficients is available, or to another value if a different coefficient has been set to zero.FACTOR-NAME -an optional string as a reminder of which factor the quasi-variance structure refers to.MODEL-NAME -an optional string as a reminder of which model the quasi-variance structure refers to.Computes quasi-variances from the covariance matrix of a subset of estimated coefficients in a model.The coefficients concerned should correspond to the levels of a categorical variable (factor).Result is a quasi-variance object, from prototype QV-PROTO.
A corresponding quasi-variance object for year-built can be made by > (def ship-built-qvs (get-qvs ship-model "built" :model-name "main effects model" :factor-name "year ship built")) and a printed summary obtained as before: > (send ship-built-qvs :summary) QUASI-VARIANCE SUMMARY Model name: main effects model for ship damage rates Response variable: DAMAGE-INCIDENT-COUNT Factor name: year built Covariance matrix: Quasi-variances (and corresponding quasi-se's) computed from the above matrix by the QV Calculator: For the standard error of a simple contrast, i.e., of a difference between two levels, the error incurred by the quasi-variance approximation is between -3.6% and 6.5%.
In the set of *all* possible contrasts the approximation error is between -9.5% and 8.2%.
Some of the relative errors of approximation for BUILT are of the order of 8 or 9 percent in magnitude.This is quite a bit larger than the maximum error found above for TYPE, but is still accurate enough for most purposes.It is worth bearing in mind that the covariance matrix itself was estimated, and is based on large-sample distribution theory.

To what models can get-qvs be applied?
A model object to be operated upon by get-qvs must have methods defined for the following four messages: :response-variable, :predictor-names, :intercept and :coef-covariance-matrix.Many Lisp-Stat model objects have methods for the first three of these messages, including all instances of regression-model-proto and its descendants (nonlinear regressions, generalized linear models) and instances of gee-proto (Lumley, 1996) and cox-regression-proto (Lumley, 1998).The fourth message, :coef-covariance-matrix, requires a new method to be defined.This is done automatically by the QV Calculator for instances and descendants of regression-model-proto (including nonlinear and generalized linear regressions) using the following definition: (defmeth regression-model-proto :coef-covariance-matrix () "Method args: () Returns the estimated variance-covariance matrix of the regression parameter estimates."(* (send self :xtxinv) (^(send self :sigma-hat) 2))) For other model objects, the method for :coef-covariance-matrix needs to be supplied by the user.For generalized estimating equation models and Cox proportional hazards models, for example, the required definitions are and (defmeth cox-regression-proto :coef-covariance-matrix () "Method args: () Returns the estimated variance-covariance matrix of the regression parameter estimates."(send self :covariance-matrix)) In every case the method for :coef-covariance-matrix should return the matrix of estimated variances and covariances of all estimated regression coefficients, including any intercept term present in the model.

Further notes on quasi-variance objects
We have seen already the use of the methods :summary, :worst-errors and :help associated with a quasi-variance object.Other available methods are listed in response to the :help message.One other method worth mentioning specifically is :contrast-variance, which computes the variance of a specified contrast and its quasi-variance approximation.As an example, consider the contrast between ships built before 1970 with those built later: > (send ship-built-qvs :contrast-variance (list 1 1 -1 -1)) Contrast coefficients: (1 1 -1 -1) Contrast variance: .141QV approximation: .117(0.141399933681981 0.116743532464883) This particular contrast is in fact among those whose precision is least well approximated in this example, with a relative error of −9.1% on the scale of the standard deviation.
A list of all quasi-variance objects currently defined can be obtained by > (qv-objects) (SHIP-BUILT-QVS SHIP-TYPE-QVS) An optional argument to qv-objects is a string with which the names of objects are compared: only those objects whose name begins with the given string are included in the list.Thus > (qv-objects "ship-t") (SHIP-TYPE-QVS)

Files in the QV Calculator distribution
There are just two files: the main file qvcalc.lspcontains all the code, and the other file example.lspmakes the main-effects model object ship-model for the ship damage data.

Using the web-based QV calculator
The web-based QV Calculator calls qvcalc to process the user-supplied variancecovariance information, and returns the formatted summary of the resulting quasivariance object.Only the input styles 'VCL and 'CORR, in which the user supplies respectively the lower triangle of either the covariance matrix of the correlation matrix, are supported by the web input form.The URL is http://www.stats.ox.ac.uk/~firth/qvcalc/ 7 Some implementation details

Computing the quasi-variances
The criterion minimized by the quasi-variances where v ij is the variance of simple contrast βi − βj .Formally, the optimization problem is equivalent to maximum likelihood estimation of the {q i } in the statistical model with exp(µ ij ) = q i + q j , which is a generalized linear model with normal errors and exponential link.The QV Calculator exploits this fact to carry out the optimization using the glim-proto modelfitting routines of Xlisp-Stat; a new model prototype, normal-exp-proto, is defined in order to achieve this.This approach is similar to the one taken by Ridout (1989), whose approximation criterion is formally equivalent to maximum likelihood for a statistical model in which the contrast standard errors √ v ij are gamma-distributed with means √ (q i + q j ).

CGI for the web-based calculator
Communication between the web server and Xlisp-Stat is done on a Macintosh through a CGI script written in AppleScript.The CGI script used is based on a template by Jon Wiederspan, which may be found at http://www.comvista.com/.The part of the script specific to the QV Calculator is the passing of information supplied on the web form (stored as variables NumberOfLevels, DataStyle and UserData by the CGI script) to Xlisp-Stat, and the return by Xlisp-Stat of a string containing either the quasi-variance summary or an error message as appropriate (all error checking is carried out by Xlisp-Stat).The AppleScript code for this part is as follows:

Please cite use
The QV Calculator is provided free, but its author would appreciate acknowledgement in the form of citation in any published work that uses it.The present paper can be cited as:
NAME NEW OWN-METHODS OWN-SLOTS PARENTS PRECEDENCE-LIST PRINT PROTO QUASI-VARIANCES REPARENT RESPONSE-NAME RETYPE SHOW SIMPLE-CONTRAST-LABELS SIMPLE-CONTRAST-VARIANCES SLOT-NAMES SLOT-VALUE SUMMARY WORST-ERRORSHelp can be obtained on a specific topic: ADD-METHOD ADD-SLOT APPROXIMATED-SIMPLE-CONTRAST-VARIANCES CONTRAST-VARIANCE COVARIANCE-MATRIX DELETE-DOCUMENTATION DELETE-METHOD DELETE-SLOT DOC-TOPICS DOCUMENTATION FACTOR-NAME GET-METHOD HAS-METHOD HAS-SLOT HELP INTERNAL-DOC ISNEW LEVEL-NAMES LEVELS METHOD-SELECTORS MODEL-