The R Commander: A Basic-Statistics Graphical User Interface to R

Unlike S-PLUS, R does not incorporate a statistical graphical user interface (GUI), but it does include tools for building GUIs. Based on the tcltk package (which furnishes an interface to the Tcl/Tk GUI builder), the Rcmdr package provides a basic-statistics graphical user interface to R called the "R Commander." The design objectives of the R Commander were as follows: to support, through an easy-to-use, extensible, cross-platform GUI, the statistical functionality required for a basic-statistics course (though its current functionality has grown to include support for linear and generalized-linear models, and other more advanced features); to make it relatively difficult to do unreasonable things; and to render visible the relationship between choices made in the GUI and the R commands that they generate. The R Commander uses a simple and familiar menu/dialog-box interface. Top-level menus include File, Edit, Data, Statistics, Graphs, Models, Distributions, Tools, and Help, with the complete menu tree given in the paper. Each dialog box includes a Help button, which leads to a relevant help page. Menu and dialog-box selections generate R commands, which are recorded in a script window and are echoed, along with output, to an output window. The script window also provides the ability to edit, enter, and re-execute commands. Error messages, warnings, and some other information appear in a separate messages window. Data sets in the R Commander are simply R data frames, and can be read from attached packages or imported from files. Although several data frames may reside in memory, only one is "active" at any given time. There may also be an active statistical model (e.g., an R lm or glm object). The purpose of this paper is to introduce and describe the use of the R Commander GUI; to describe the design and development of the R Commander; and to explain how the R Commander GUI can be extended.
The second part of the paper (following a brief introduction) can serve as an introductory guide for students who will use the R Commander.


Background and Motivation
R (Ihaka and Gentleman, 1996; R Development Core Team, 2004) is a free, open-source implementation of the S statistical computing language and programming environment. R is a command-driven system: One normally specifies a statistical analysis in R by typing commands, that is, statements in the S language that are executed by the R interpreter. S-PLUS (a commercial implementation of the S language) also incorporates a graphical user interface (a "GUI") to much of the statistical functionality of S.
In my opinion, a GUI for statistical software is a mixed blessing: On the one hand, a GUI does not require that the user remember the names and arguments of commands, and decreases the chances of syntax and typing errors. These characteristics make GUIs particularly attractive for introductory, casual, or infrequent use of software.
On the other hand, having to drill one's way through successive layers of menus and dialog boxes can be tedious and can make it difficult to reproduce a statistical analysis, perhaps with variations. Moreover, providing a GUI for a statistical system that includes hundreds (or even thousands) of commands, many incorporating extensive options, can produce a labyrinth. The R Commander GUI described in this paper is not immune to these problems, but I have tried to keep things relatively simple, and to render visible, in a reusable form, the R commands that the GUI generates.
Unlike S-PLUS, R does not include a statistical GUI, but it does furnish tools for building GUIs. The Rcmdr package provides a basic-statistics GUI for R, which I call the "R Commander." The design objectives of the R Commander were as follows: Most important, to provide, through an easy-to-use, cross-platform, extensible GUI, the statistical functionality required for a basic-statistics course. The original target text was David Moore's The Basic Practice of Statistics, Second Edition (Freeman, 2000). With the help of a research assistant (Tony Christensen), I have since examined several other texts (including the third edition of Moore, 2004), collected suggestions from a number of individuals, and slightly expanded the horizons of the R Commander, for example, to include linear and generalized-linear models.
To make it relatively difficult to do unreasonable things (such as calculating the mean of a categorical variable).
To render visible the relationship between choices made in the GUI and the R commands that they generate. Commands are both pasted into a script window in the R Commander and echoed to an output window (see below). The script window is editable, commands in the window can be executed or re-executed, and new commands can be entered by typing directly in the window. Scripts can also be saved to and loaded from files.
One purpose of this paper is to introduce and describe the basic use of the R Commander GUI. In particular, Section 2 of the paper can serve as an introductory guide for students who will use the R Commander. Section 3 describes the design and development of the R Commander; informally assesses the extent to which it has met its goals; and suggests future directions for the project. Section 4 explains how the R Commander can be extended. The final section provides some information for instructors. The help files for the Rcmdr package appear as an appendix to the paper.

Starting the R Commander
Once R is running, simply loading the Rcmdr package by typing the command library(Rcmdr) into the R Console starts the R Commander GUI. To function properly under Windows, the R Commander requires the single-document interface (SDI) to R. After loading the package, R Console and R Commander windows should appear more or less as in Figures 1 and 2. These and other screen images in this document were created under Windows XP; if you use another version of Windows (or, of course, another computing platform), then the appearance of the screen may differ. The R Commander and R Console windows float freely on the desktop. You will normally use the menus and dialog boxes of the R Commander to read, manipulate, and analyze data. R commands generated by the R Commander GUI appear in the upper text window (labelled Script Window) within the main R Commander window. You can also type R commands directly into the script window or at the > (greater-than) prompt in the R Console; the main purpose of the R Commander, however, is to avoid having to type commands. The lower, gray window (labelled Messages) displays error messages, warnings, and some other information ("notes"), such as the start-up message in Figure 2.
When you create graphs, these will appear in a separate Graphics Device window.
There are several menus along the top of the R Commander window:
File: Menu items for loading and saving script files; for saving output and the R workspace; and for exiting.
Edit: Menu items (Cut, Copy, Paste, etc.) for editing the contents of the script and output windows. Right clicking in the script or output window also brings up an edit "context" menu.
Data: Submenus containing menu items for reading and manipulating data.
Statistics: Submenus containing menu items for a variety of basic statistical analyses.
Graphs: Menu items for creating simple statistical graphs.
Models: Menu items and submenus for obtaining numerical summaries, confidence intervals, hypothesis tests, diagnostics, and graphs for a statistical model, and for adding diagnostic quantities, such as residuals, to the data set.
Distributions: Probabilities, quantiles, and graphs of standard statistical distributions (to be used, for example, as a substitute for statistical tables).
Tools: Menu items for loading R packages unrelated to the Rcmdr package (e.g., to access data saved in another package), and for setting some options.
Help: Menu items to obtain information about the R Commander (including this paper). As well, each R Commander dialog box has a Help button (see below).
The complete menu "tree" for the R Commander (version 1.0-0) is shown below. Most menu items lead to dialog boxes, as illustrated later in this paper. Menu items are inactive ("grayed out") if they are inapplicable to the current context. The R Commander interface includes a few elements in addition to the menus and dialogs: Below the menus is a "toolbar" with a row of buttons.
- The left-most (flat) button shows the name of the active data set. Initially there is no active data set. If you press this button, you will be able to choose among data sets currently in memory (if there is more than one).
Immediately below the toolbar is the script window (so labelled), a large scrollable text window. As mentioned, commands generated by the GUI are copied into this window.
You can edit the text in the script window or even type your own R commands into the window. Pressing the Submit button, which is at the right below the script window (or, alternatively, the key combination Ctrl-r, for "run"), causes the line containing the cursor to be submitted (or resubmitted) for execution. If several lines are selected (e.g., by left-clicking and dragging the mouse over them), then pressing Submit will cause all of them to be executed. Commands entered into the script window can extend over more than one line, but if they do, lines after the first must be indented with one or more spaces or tabs.
Below the script window is a large scrollable and editable text window for output. Commands echoed to this window appear in red, output in dark blue (as in the R Console).
At the bottom is a small gray text window for messages. Error messages are displayed in red text, warnings in green, and other messages in dark blue. Errors and warnings also provide an audible cue by ringing a bell. Messages are cleared at the next operation, but a "note" does not clear an error message or a warning.
Once you have loaded the Rcmdr package, you can minimize the R Console. The R Commander window can also be resized or maximized in the normal manner. If you resize the R Commander, the width of subsequent R output is automatically adjusted to fit the output window.
The R Commander is highly configurable: I have described the default configuration here.
Changes to the configuration can be made via the Tools → Options... menu, or, much more extensively, by setting options in R. See the Rcmdr help files for details.

Data Input
Most of the procedures in the R Commander assume that there is an active data set.
The first line of the file contains variable names: TFR (the total fertility rate, expressed as number of children per woman), contraception (the rate of contraceptive use among married women, in percent), infant.mortality (the infant-mortality rate per 1000 live births), GDP (gross domestic product per capita, in U.S. dollars), and region.
Subsequent lines contain the data values themselves, one line per country. The data values are separated by "white space": one or more blanks or tabs. Although it is helpful to make the data values line up vertically, it is not necessary to do so. Notice that the data lines begin with the country names. Because we want these to be the "row names" for the data set, there is no corresponding variable name: That is, there are five variable names but six data values on each line. When this happens, R will interpret the first value on each line as the row name. Some of the data values are missing. In R, it is most convenient to use NA (representing "not available") to encode missing data, as I have done here.
The variables TFR, contraception, infant.mortality, and GDP are numeric (quantitative) variables; in contrast, region contains region names. When the data are read, R will treat region as a "factor", that is, as a categorical variable. In most contexts, the R Commander distinguishes between numerical variables and factors.
To read the data file into R, select Data → Import data → from text file... from the R Commander menus. This operation brings up a Read Data From Text File dialog, as shown in Figure 3. The default name of the data set is Dataset. I have changed the name to Nations.
Valid R names begin with an upper- or lower-case letter (or a period, .) and consist entirely of letters, periods, underscores (_), and numerals (i.e., 0-9); in particular, do not include any embedded blanks in a data-set name. You should also know that R is case-sensitive.
I clicked the View data set button to bring up the data viewer window, also shown in Figure 5. Notice that the commands to read and view the Nations data set (the R read.table and showData commands) appear, partially obscured by the display of the data set, in the script and output windows. When the data set is read and becomes the active data set, a note appears in the messages window (and this is erased when the subsequent showData command is executed).
The read.table command creates an R "data frame," which is an object containing a rectangular cases-by-variables data set: The rows of the data set represent cases or observations and the columns represent variables. Data sets in the R Commander are R data frames.
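As a minimal sketch of how read.table treats a file laid out as described above, the following reads a few lines of whitespace-separated text. The country labels and values are invented purely for illustration (they are not the Nations data), and the text= argument stands in for a file name; the exact command that the dialog generates may differ.

```r
# Made-up rows laid out like the file described above: five variable names
# in the header, but six values (row name + five variables) on each data line.
txt <- "TFR contraception infant.mortality GDP region
CountryA 6.9 NA 154 2848 Asia
CountryB 2.6 NA 32 863 Europe
CountryC 3.8 52 44 1531 Africa"

# header = TRUE: the first line holds the variable names. Because the header
# has one fewer field than the data lines, read.table uses the first column
# as row names. stringsAsFactors = TRUE makes region a factor (this was the
# default in older versions of R, but must be requested in current ones).
Nations <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

str(Nations)                   # inspect the variable types
rownames(Nations)              # the country labels
is.na(Nations$contraception)   # NA entries are read as missing values
```

Replacing text = txt with a file name would read the same layout from disk.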

Entering Data Directly
To enter data directly into the R spreadsheet-like data editor, you can proceed as follows. As an example, I use a very small data set from Problem 2.44 in Moore (2000): Select Data → New data set... from the R Commander menus. Optionally enter a name for the data set, such as Problem2.44, in the resulting dialog box, and click the OK button. (Remember that R names cannot include intervening blanks.) This will bring up a Data Editor window with an empty data set.
Enter the data from the problem into the first two columns of the data editor.
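The naming rules mentioned above can be checked from the R Console. The sketch below relies on the base function make.names, which converts an arbitrary string into a syntactically valid name; the helper is_valid_name is hypothetical (it is not part of R or of the Rcmdr package) and simply tests whether make.names leaves a candidate name unchanged.

```r
# Hypothetical helper: treat a name as valid if make.names() would not
# need to alter it.
is_valid_name <- function(x) make.names(x) == x

is_valid_name("Nations")        # a plain name
is_valid_name("Problem2.44")    # letters, numerals, and a period
is_valid_name("Problem 2.44")   # contains an embedded blank
is_valid_name("2ndDataset")     # begins with a numeral

# make.names() shows how an invalid name would be repaired
make.names("Problem 2.44")
```

Note that make.names also alters R reserved words (such as if or TRUE), so this check is slightly stricter than the character rules alone.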

Creating Numerical Summaries and Graphs
Once there is an active data set, you can use the R Commander menus to produce a variety of numerical summaries and graphs. I will describe just a few basic examples here. A good GUI should be largely self-explanatory: I hope that once you see how the R Commander works, you will have little trouble using it, assisted perhaps by the on-line help files.
In the examples below, I assume that the active data set is the Nations data set, read from a text file in the previous section. If you typed in the five-observation data set from Moore (2000), or read in the Prestige data set from the car package, as were also described in the previous section, then one of these is the active data set. Recall that you can change the active data set by clicking on the flat button with the active data set's name near the top left of the R Commander window, selecting from among a list of data sets currently resident in memory.
Selecting Statistics → Summaries → Active data set produces the results shown in Figure 10. For each numerical variable in the data set (TFR, contraception, infant.mortality, and GDP), R reports the minimum and maximum values, the first and third quartiles, the median, and the mean, along with the number of missing values. For the categorical variable region, we get the number of observations at each "level" of the factor. Had the data set included more than ten variables, the R Commander would have asked us whether we really want to proceed, potentially protecting us from producing unwanted voluminous output.
Similarly, selecting Statistics → Summaries → Numerical summaries... brings up the dialog box shown in Figure 11. Only numerical variables are shown in the variable list in this dialog; the factor region is missing, because it is not sensible to compute numerical summaries for a factor. Clicking on infant.mortality, and then clicking OK, produces the following output (in the output window): By default, the R commands that are executed print out the mean and standard deviation of the variable, along with quantiles (percentiles) corresponding to the minimum, the first quartile, the median, the third quartile, and the maximum.
As is typical of R Commander dialogs, the Numerical Summaries dialog box in Figure 11 includes OK, Cancel, and Help buttons. The Help button leads to a help page either for the dialog itself or (as here) for an R function that the dialog invokes.
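The summaries described above correspond to ordinary R function calls. The following sketch shows plausible equivalents using base R; the data frame is a small made-up stand-in for the active data set, and the commands actually generated by the dialogs may differ in detail.

```r
# Small made-up data frame standing in for the active data set
Dataset <- data.frame(
  infant.mortality = c(154, 32, 44, NA, 7),
  region = factor(c("Asia", "Europe", "Africa", "Africa", "Europe"))
)

# Summary of the whole data set: five-number summary, mean, and NA count
# for the numeric variable; counts per level for the factor
summary(Dataset)

# Numerical summaries for one variable, ignoring missing values
x <- Dataset$infant.mortality
mean(x, na.rm = TRUE)
sd(x, na.rm = TRUE)
quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)

# One way to offer only numeric variables in a variable list, so that
# numerical summaries cannot be requested for a factor
numeric_vars <- names(Dataset)[sapply(Dataset, is.numeric)]
numeric_vars
```

The last step illustrates the design goal of making unreasonable choices difficult: the factor region never appears among the candidates.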

Statistical Models
Several kinds of statistical models can be fit in the R Commander using menu items under Statistics → Fit models: linear models (by both Linear regression and Linear model), generalized linear models, multinomial logit models, and proportional-odds models [the latter two from Venables and Ripley's (2002) nnet and MASS packages, respectively]. Although the resulting dialog boxes differ in certain details (for example, the generalized linear model dialog makes provision for selecting a distributional family and corresponding link function), they share a common general structure, as illustrated in the Linear Model dialog in Figure 16.[14]
Double-clicking on a variable in the variable-list box copies it to the model formula: to the left-hand side of the formula, if it is empty, otherwise to the right-hand side (with a preceding + sign if the context requires it). Note that factors (categorical variables) are parenthetically labelled as such in the variable list.
The row of buttons above the formula can be used to enter operators and parentheses into the right-hand side of the formula.
[13] At start-up, the R Commander turns on the graph history mechanism; this feature is available only in Windows systems. Dynamic three-dimensional scatterplots created by Graphs → 3D scatterplot... appear in a special RGL device window; likewise, effect displays created for statistical models (Fox, 2003) via Models → Graphs → Effect plots appear in individual graphics-device windows.
[14] An exception is the Linear Regression dialog in which the response variable and explanatory variables are simply selected by name from list boxes containing the numeric variables in the current data set. The explanation below assumes familiarity with R model formulas; see, for example, the Introduction to R manual that comes with R, which may be accessed from the Help menu in the R Console.
You can also type directly into the formula fields, and indeed have to do so, for example, to put a term such as log(income) into the formula.
The name of the model, here LinearModel.1, is automatically generated, but you can substitute any valid R name.
You can type an R expression into the box labelled Subset expression; if supplied, this is passed to the subset argument of the lm function, and is used to fit the model to a subset of the observations in the data set. One form of subset expression is a logical expression that evaluates to TRUE or FALSE for each observation, such as type != "prof" (which would select all non-professional occupations from the Prestige data set).
Clicking the OK button produces the following output (in the output window), and makes LinearModel.1 the active model, with its name displayed in the Model button:
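As an illustration of how a model formula and a subset expression reach lm, the sketch below fits a linear model to simulated data. The data frame Occupations and its values are invented for this example (it is not the Prestige data set), and the command is only an approximation of what the Linear Model dialog would generate.

```r
set.seed(1)  # simulated, purely illustrative data
n <- 30
Occupations <- data.frame(
  education = runif(n, 6, 16),
  income    = runif(n, 1000, 25000),
  type      = factor(sample(c("bc", "wc", "prof"), n, replace = TRUE))
)
Occupations$prestige <- 5 * Occupations$education +
  3 * log(Occupations$income) + rnorm(n, sd = 5)

# The formula includes a typed-in term, log(income); the subset expression
# is a logical condition evaluated for each observation
LinearModel.1 <- lm(prestige ~ education + log(income),
                    data = Occupations,
                    subset = type != "prof")

summary(LinearModel.1)
nobs(LinearModel.1)   # number of observations retained by the subset
```

Only the rows for which the subset expression is TRUE enter the fit, which nobs confirms.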

Saving and Printing Output
You can save text output directly from the File menu in the R Commander; likewise you can save or print a graph from the File menu in an R Graphics Device window. It is generally more convenient, however, to collect the text output and graphs that you want to keep in a word-processor document. In this manner, you can intersperse R output with your typed notes and explanations.
Open a word processor such as Word, or even Windows WordPad. To copy text from the output window, block the text with the mouse, select Copy from the Edit menu (or press the key combination Ctrl-c, or right-click in the window and select Copy from the context menu), and then paste the text into the word-processor window via Edit → Paste (or Ctrl-v), as you would for any Windows application. One point worth mentioning is that you should use a mono-spaced ("typewriter") font, such as Courier New, for text output from R; otherwise the output will not line up neatly.
Likewise to copy a graph, select File → Copy to the clipboard → as a Metafile from the R Graphics Device menus; then paste the graph into the word-processor document via Edit → Paste (or Ctrl-v). Alternatively, you can use Ctrl-w to copy the graph from the R Graphics Device, or right-click on the graph to bring up a context menu, from which you can select Copy as metafile. At the end of your R session, you can save or print the document that you have created, providing an annotated record of your work.
Alternative routes to saving text and graphical output may be found respectively under the R Commander File and Graphs → Save graph to file menus.

Terminating the R Session
There are several ways to terminate your session. For example, you can select File → Exit → From Commander and R from the R Commander menus. You will be asked to confirm, and then asked whether you want to save the contents of the script and output windows. Likewise, you can select File → Exit from the R Console; in this case, you will be asked whether you want to save the R workspace (i.e., the data that R keeps in memory); you would normally answer No.

Entering Commands in the Script Window
The script window provides a simple facility for editing, entering, and executing commands. Commands generated by the R Commander appear in the script window, and you can type and edit commands in the window more or less as in any editor. The R Commander does not provide a true "console" for R, however, and the script window has some limitations: Commands that extend over more than one line should have the second and subsequent lines indented by one or more spaces or tabs; all lines of a multiline command must be submitted simultaneously for execution.
Commands that include an assignment arrow (<-) will not generate printed output, even if such output would normally appear had the command been entered in the R Console [the command print(x <- 10), for example]. On the other hand, assignments made with the equals sign (=) produce printed output even when they normally would not (e.g., x = 10).
Commands that produce normally invisible output will occasionally cause output to be printed in the output window. This behaviour can be modified by editing the entries of the log-exceptions.txt file in the R Commander's etc directory.
Blocks of commands enclosed by braces, i.e., {}, are not handled properly unless each command is terminated with a semicolon (;). This is poor R style, and implies that the script window is of limited use as a programming editor. For serious R programming, it would be preferable to use the script editor provided by the Windows version of R itself, or, even better, a programming editor.
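For comparison with the limitations listed above, the sketch below shows how the standard R Console decides whether a value is printed, using the base function withVisible. It illustrates ordinary R behaviour only; it does not reproduce the script window's own handling of commands.

```r
# withVisible() reports whether evaluating an expression would auto-print
withVisible(x <- 10)$visible     # plain assignment: value is invisible
withVisible((x <- 10))$visible   # parentheses make the value visible
print(x <- 10)                   # print() displays the value explicitly

# A multi-line command written with continuation lines indented,
# as the script window expects
total <- sum(1,
  2,
  3)

# A block in the style the script window requires: each command inside
# the braces terminated by a semicolon
{ y <- x + 1; z <- y * 2; }
z
```

In the R Console the semicolons in the last block are optional; they are shown here only because the script window needs them.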

Design and Development of the R Commander
Prior to developing the R Commander, I had for several years wanted to use R in teaching basic statistics to social-science undergraduates, but from past experience I felt that the command-line interface to R would present an obstacle to many students. The software that I used in this course over the previous decade or so (first Minitab and then SPSS) was not software that I used in my own work. Moreover I did not feel that I could ask my students to purchase software for the class, which already requires them to buy a relatively expensive textbook and some other materials. Consequently statistical computing in the course was relegated to university computer labs. I expect that this is not an uncommon scenario, at least at universities that do not offer attractive site-licensing of statistical software to students. I expected someone else with more experience in GUI development to produce a suitable GUI for R, but when nothing that I could use in my course materialized by the Spring of 2003, I decided to explore creating one myself. I looked initially at the facilities provided by the Windows version of R (for example, the winMenu* and winDialog functions) but quickly determined that these were inadequate for developing a broadly useful statistical GUI. I experimented next with Visual Basic, and although this route to a statistical GUI for R appeared to be feasible, I decided against it for several reasons, the most important of which were the proprietary nature of Visual Basic and my desire to produce a cross-platform solution.
I quickly gravitated towards Peter Dalgaard's tcltk package: The package is available for all of the major R platforms; it provides a serviceable, if not rich, set of widgets; and most importantly, the standard Windows version of R installs a basic Tcl/Tk system. The last point was key, in my view, because the principal target audience for a basic-statistics GUI consists in large majority of Windows users, many of whom have difficulty installing and configuring software. By using Tcl/Tk through the tcltk package, I was also able to provide a GUI as a standard R package, which developed into the Rcmdr. Installing the Rcmdr (and its dependencies) is simple, especially on Windows systems, and loading the package starts up the GUI.
Other, arguably more capable, GUI builders, such as GTK via the RGtk package (see http://www.omegahat.org/RGtk/index.html), appeared to create obstacles for Windows users. I believe that this situation is essentially unchanged, though I am also aware of several other R GUI projects in addition to the R Commander. I look forward to these producing a better statistical GUI than the R Commander that is usable by relatively naive Windows users.
Using Tcl/Tk entailed several compromises, however: The standard widget set is limited; in particular, I was unable to employ drop-down lists, tabbed dialogs, and table widgets, which I would have preferred to use in certain contexts. For example, the data set viewer in the Rcmdr package (the showData function from the relimp package) would have been more naturally programmed using a table widget, as would the Rcmdr Enter Two-Way Table and Test Linear Hypothesis dialogs. Similarly, providing options on an Options tab would produce cleaner and more uniform dialog boxes. There are extended widget sets available for Tcl/Tk, but because these are not part of the standard installation of R for Windows, I reluctantly ruled out their use. Another limitation of Tcl/Tk is that while it is available on all of the major platforms that run R, its look and feel is non-standard on all of these platforms. Nevertheless, I have been able to tune the behaviour of the R Commander GUI to be very similar to that of a standard Windows application.
Some not entirely nontrivial problems remain: On the Macintosh (as mentioned), applications such as the R Commander that use the tcltk package must run under X-Windows and require software that is not installed on out-of-the-box OS/X systems; the appearance of the R Commander GUI is not as attractive on Linux systems as it is on Windows systems, although the cosmetics can be improved by carefully selecting fonts and font sizes (as supported by R Commander options); and there are some (if now greatly reduced) stability problems on Windows systems, stemming from the integration of the Tcl/Tk and R event loops.
The initial version of the Rcmdr package (numbered 0.5-0), with perhaps half the content of the current version, was completed in about a month, and somewhat later, in the Summer of 2003, was contributed to CRAN. The range of features supported by the R Commander grew gradually over the following two years, but a number of conventions established in this early version of the package persist: The interface uses standard menus, most of which lead to simple dialog boxes. As mentioned, the limited range of R-Commander dialog-box elements is the product of the restricted standard Tk widget set, but the simplicity and familiarity of the interface is deliberate. The object was to produce an interface that students would be able to learn and negotiate with little trouble. Though it is less extensive and less polished, the R Commander GUI is similar in many respects to other GUIs to command-oriented statistical software, such as SPSS (http://www.spss.com/) and Minitab (http://www.minitab.com/): The basic model of workflow is procedural. This contrasts with statistical packages [such as JMP (http://www.jmp.com/) or Vista (http://www.visualstats.org/)] that are meant to be pedagogically innovative.
The set of top-level menus in Version 0.5-0 was the same as the current one, except that a Tools menu was introduced much later. The R Commander menus were initially "hard-wired" in the package code, but were later made configurable via a text file. In other instances as well, features in the package were made more flexible and configurable. For example, the Rcmdr originally supported only linear and generalized linear models; now, the range of supported models has expanded and can be augmented by the user.
Typical R Commander dialog boxes have one or more scrollable variable-list boxes at the top; check boxes and radio buttons for selecting options below that; and OK, Cancel, and Help buttons at the bottom. Some dialog boxes have buttons that produce subdialogs displayed over the main dialog. I have tried to use this arrangement sparingly, and could have avoided it altogether were tabbed dialogs available in the Tk widget set supported under Windows by the tcltk package.
Menus and dialog boxes generate R commands (whence the name, "R Commander") that are saved in a script window (originally called a "log"). These commands call basic R functions, functions in the "recommended" packages that are part of the standard R distribution, and, as necessary, functions in contributed packages available from CRAN. Although I tried to avoid it, in a few instances, I introduced additional statistical functionality to the Rcmdr package: for example, functions to compute alpha-reliability for composite scales and to compute partial-correlation matrices. These functions, summarized in Table 5 in the next section of the paper, are usable independently of the Rcmdr GUI. Generating commands to be executed was not the only route to go: Statistical computations could have been, at least partly, subsumed in the code for the Rcmdr package, and the details of the computations hidden from the user. To do so, however, would have wasted some of the effort put into developing the statistical capabilities of R, and would also have contradicted one of the goals of the R Commander project: to draw a visible connection between choices made in the GUI and R commands.
Statistical analyses are performed on an active data set, which is a standard R data frame. An alternative would be to allow the user to select a data set in each dialog, with the selection defaulting to the previous one. This seems to me to offer no advantage over the current scheme. Another possibility would be to permit multiple data frames to be attached to the search path. This approach provides more flexibility in handling data, but I find that even more advanced students than those in introductory statistics classes have difficulty dealing with issues, such as objects masking each other, that arise from managing the search path. For similar reasons, all variable creation (for example, by the Recode and Compute dialogs, and the computation of residuals or other "case statistics" for statistical models) takes place in the active data set; an alternative would have been to allow variables to be created in the global environment, but such an approach risks doing damage, creating conflicts, and generating potentially cryptic errors.
Similarly, operations on statistical models via the Models menu are performed primarily on an active statistical model, which is kept synchronized with the active data set: when the active data set is changed, there is initially no active model, and when an active model is selected from among recognized model objects in memory, the active data set is changed to the data frame on which that model was fit. This procedure is a bit constraining for advanced users (who will, I believe, in any event prefer to specify commands directly), but it helps novices to keep things straight.
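The masking problem mentioned above in connection with the search path can be reproduced in a few lines. The two data frames below are made up for the purpose, and the sketch assumes that no object named income already exists in the global environment.

```r
# Two made-up data frames that happen to share a variable name
A <- data.frame(income = c(1, 2, 3))
B <- data.frame(income = c(100, 200, 300))

attach(A)
attach(B)   # R reports that 'income' from A is now masked

# An unqualified reference finds the most recently attached copy (B's),
# which may not be the one the user intended
first_income <- income[1]

detach(B)
detach(A)

# Referring to a single, explicitly named data frame avoids the ambiguity
A$income[1]
```

Working within one active data frame, as the R Commander does, sidesteps this class of error.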
Menus and dialog boxes produce R commands as text strings. The R Commander causes these commands to be parsed and evaluated in the global R environment. Having the commands available as text is convenient for entry into the script and output windows, but I am not entirely satisfied with this approach: In particular, building text commands can be awkward, and the code to do so hard to read. My early efforts to proceed with tools such as evaluate, substitute, and expression were not successful, however.
Likewise, although it has successively been improved, the script window is much less than a true R console, something that I have been unable to provide in a platform-independent manner.
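The text-command approach described above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the Rcmdr code: the helper `run_command` is hypothetical, and the built-in `mtcars` data frame stands in for an active data set.

```r
# Echo a command given as text, then parse and evaluate it in the
# global environment. (Hypothetical helper, not an Rcmdr function.)
run_command <- function(command) {
  cat("> ", command, "\n", sep = "")
  eval(parse(text = command), envir = globalenv())
}

# Selections that a dialog box might collect from the user.
response <- "mpg"
predictors <- c("wt", "hp")

# Assemble the command as a text string.
command <- paste0("lm(", response, " ~ ",
                  paste(predictors, collapse = " + "),
                  ", data = mtcars)")

model <- run_command(command)
```

Even in this small case the string assembly is harder to read than the command it produces, which is the awkwardness noted in the text.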
The original R Commander had a toolbar below the menu bar with information fields displaying the names of the active data set and active statistical model; buttons for editing and viewing the active data set; and a check box for determining whether commands were echoed to the script window. Somewhat later, the data-set and statistical-model information fields morphed into buttons that could be used to select the active data set and model, the log window became the current script window, and the check box was removed. A button was provided to submit lines in the script window for re-execution.
Initially, output was directed to the R console. Although this arrangement is retained as an option, an output window was introduced, which receives printed output by default.
Error messages and warnings were initially printed in the R console. Later, such messages were intercepted and presented to the user in pop-up message windows. Currently, error messages and warnings (along with other messages) are directed to a messages window. The main R Commander window therefore has evolved from one, to two, and then to three text sub-windows. The script and output windows are editable.
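One way to intercept errors and warnings so that they can be redirected, sketched here with base R condition handling. This is an assumption about the general technique rather than the Rcmdr code; `message_log`, `log_message`, and `run_logged` are hypothetical names, and a character vector stands in for the messages window.

```r
# A hypothetical message log standing in for the messages window.
message_log <- new.env()
assign("entries", character(0), envir = message_log)

log_message <- function(type, text) {
  entries <- get("entries", envir = message_log)
  assign("entries", c(entries, paste0(type, ": ", text)), envir = message_log)
}

# Evaluate an expression, recording warnings and errors in the log
# instead of letting them print in the console.
run_logged <- function(expr) {
  withCallingHandlers(
    tryCatch(expr, error = function(e) {
      log_message("ERROR", conditionMessage(e))
      NULL                              # the result after an error is NULL
    }),
    warning = function(w) {
      log_message("WARNING", conditionMessage(w))
      invokeRestart("muffleWarning")    # carry on after logging the warning
    }
  )
}

value <- run_logged(as.numeric("abc"))  # produces a coercion warning
failed <- run_logged(stop("there is no active data set"))
entries <- get("entries", envir = message_log)
```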
Along the way, many changes were made "beneath the hood" to improve the performance and maintainability of the Rcmdr package. At one point, for example, the size of the Rcmdr code was reduced by nearly 40 percent by modularizing repetitive elements, primarily in dialog-box-generating functions. Some of this required macro-like functions (adapted from Lumley, 2001). At present, functions that create Rcmdr dialog boxes consist mostly of calls to utility functions to initialize and close a dialog, and to construct common elements such as variable lists, sets of radio buttons and check boxes, and the OK, Cancel, and Help buttons at the bottom of the dialog box. This process is illustrated in the next section.
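The idea of a "macro-like" function, one whose code runs in the frame of its caller, can be illustrated with a simplified sketch. This is not Lumley's defmacro and not the Rcmdr code; `initialize_counter` and `count_positive` are hypothetical names chosen for the example.

```r
# A function that behaves somewhat like a macro: it evaluates code in
# the frame of the function that called it.
initialize_counter <- function() {
  caller <- parent.frame()
  eval(quote(counter <- 0), envir = caller)
  invisible(NULL)
}

count_positive <- function(x) {
  initialize_counter()        # creates 'counter' in this function's frame
  for (value in x) {
    if (value > 0) counter <- counter + 1
  }
  counter
}

n_positive <- count_positive(c(-1, 2, 5, 0))
```

Because the helper writes into its caller's frame, repetitive set-up code can be factored out of many functions without passing results back explicitly.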
Similarly, the original Rcmdr saved a great deal of state information in global variables, such as the name of the active data set, the names of variables within the active data set, and various options. Currently, all of this state information is saved instead in a special environment: a much neater and less problematic solution.
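A minimal sketch of keeping state in a dedicated environment rather than in global variables follows. The names `state`, `put_state`, and `get_state` are hypothetical, chosen for illustration; the sketch does not reproduce the package's own storage functions.

```r
# A private environment that holds state information.
state <- new.env(parent = emptyenv())

put_state <- function(name, value) assign(name, value, envir = state)
get_state <- function(name) get(name, envir = state, inherits = FALSE)

# Store and retrieve items without creating global variables for them.
put_state("active_data_set", "Prestige")
put_state("echo_commands", TRUE)

current <- get_state("active_data_set")
```

Objects stored this way cannot be masked or overwritten accidentally by variables that the user creates in the global environment.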

How well has the R Commander met its goals?
Ease of use. Over the years, I have used a variety of statistical software in introductory-statistics courses, more, indeed, than I would care to enumerate. Although I do not have formal evidence about the relative usability of the R Commander in this context, I can report that in the two years that I have been using it, students appear to have virtually no trouble in completing course assignments requiring the software. I have also had positive feedback from other individuals who have used the Rcmdr package for statistical instruction. This experience compares favourably with the other statistical software that I have used in teaching.
Coverage. The R Commander now is much more extensive than required for the basic statistics texts that I have examined, and can reasonably support most of a low-level course in applied regression analysis.
Cross-platform functionality. My own experience with the Rcmdr package is primarily under Windows, where the software works quite well. As mentioned, I and others also have used it successfully under Linux. Installation and use under Macintosh OS/X is possible but more challenging at present. I have occasionally received reports of particular aspects of the software proving problematic on non-Windows systems, but these have been isolated: for example, to the 3D scatterplots dialog, which depends upon the rgl package.
Extensibility. As described in the next section, extension of the Rcmdr package requires some programming and editing of configuration files, though not necessarily rebuilding the package itself. This process is facilitated by utility functions for the construction of dialog boxes that the package exports, and by the ability to add to and modify the Rcmdr menu-definition file, but it does presuppose some familiarity with R, the tcltk package, and Tcl/Tk itself.
Protecting the novice from errors. Where possible, I have tried to limit users' choices to those that are reasonable within the current context. For example, the dialog box for an independent-samples t-test presents only two-level factors in the variable-list box for defining groups and only numeric variables in the list box for the response variable. Likewise, if there are no two-level factors or no numeric variables in the active data set (or, indeed, if there is no active data set), then the menu item for an independent-samples t-test is grayed out. Errors and warnings are intercepted, and where it has been possible to anticipate certain kinds of errors, an effort has been made to report understandable error messages.
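The filtering described above can be sketched in base R as follows. The function names and the `students` data frame are hypothetical, introduced for illustration; the sketch shows the logic of restricting choices, not the Rcmdr implementation.

```r
# Names of numeric variables in a data frame.
numeric_variables <- function(data) {
  names(data)[vapply(data, is.numeric, logical(1))]
}

# Names of factors with exactly two levels.
two_level_factors <- function(data) {
  names(data)[vapply(data,
                     function(v) is.factor(v) && nlevels(v) == 2,
                     logical(1))]
}

# An independent-samples t-test needs a data set with at least one
# numeric variable and at least one two-level factor.
t_test_available <- function(data) {
  !is.null(data) &&
    length(numeric_variables(data)) > 0 &&
    length(two_level_factors(data)) > 0
}

students <- data.frame(
  score   = c(71, 85, 90, 64),
  section = factor(c("a", "b", "a", "b")),
  year    = factor(c("1", "2", "3", "1"))
)

response_choices <- numeric_variables(students)   # offered as responses
group_choices <- two_level_factors(students)      # offered as groups
```

With this data frame only `score` would be offered as a response and only `section` as a grouping factor; a data set lacking either kind of variable would leave the menu item inactive.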
To expose users to R commands. The script window displays the R commands that the R Commander GUI generates, but it is my impression that most students ignore these commands. This response probably partly reflects my emphasis on generating and interpreting the output of statistical procedures, and at least the commands are there for examination and experimentation. As well, as explained, the R Commander script window has some deficiencies as a simulated R console.

What is the future of the R Commander?
If the past is prologue, then I have only limited ability to foresee where the R Commander is headed. Nevertheless, several potential directions for future development seem clear:
Additional statistical functionality. It is safe to predict modest extension of the statistical capabilities of the R Commander in response to users' requests and contributions. More ambitiously, I would like to add high-interaction statistical graphics, such as scatterplots that support dynamic variable transformations and possibly linkage between different plots [in the manner of Cook and Weisberg's (1999) Lisp-Stat based Arc software].
Improvements to the code and to usability. As I have explained, I have worked over the code for the Rcmdr package more than once, but there is certainly still room for improvement, in particular further elimination of redundancy in the code. At present, R Commander dialogs are used in Philippe Grosjean's SciViews GUI for R (http://www.sciviews.org/SciViews-R/), and it should not be difficult to make these dialogs more generally available outside of the R Commander GUI itself. Moreover, with the exception of the statistical-modelling dialogs, R Commander dialog boxes do not "remember" user selections from one invocation of a dialog to the next; it would not be difficult, though it might be tedious, to provide this feature. Similarly, if an extended set of Tk widgets becomes conveniently available to R users of Windows, I could rework the basic layout of R Commander dialog boxes by incorporating elements such as tabs and drop-down lists.
Internationalization. I am aware of the translation of the Rcmdr package from English into two other languages. At present, translation is accomplished by editing the code for the package, a process that is tedious and that has to be redone with each new version of the Rcmdr. It should not be difficult to provide the ability to translate messages and other text in the package, probably through the new internationalization facilities in version 2.1.0 of R.

Extending the R Commander
As is the case for any R package, a user can modify the source code for the Rcmdr package and rebuild the package. Two features make it possible, however, to modify or add to the Rcmdr package without rebuilding it:
1. The R Commander menus are defined in the plain-text (ASCII) file Rcmdr-menus.txt, which resides in the package's etc directory. Modifying this file changes the menus. The format of the file is described below.
2. Files with extension (file type) .R in the etc directory are "sourced" (read into memory) when the R Commander starts up. Consequently, functions and variables defined in .R files are available in the global environment.
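The second mechanism amounts to sourcing every .R file found in a directory. A minimal sketch of that idea in base R follows; the helper `source_directory`, the temporary directory, and the file names are all hypothetical, and the sketch does not reproduce the package's own start-up code.

```r
# Source every file with extension .R in a directory.
source_directory <- function(dir, envir = globalenv()) {
  files <- list.files(dir, pattern = "\\.R$", full.names = TRUE)
  for (file in files) {
    source(file, local = envir)
  }
  basename(files)
}

# Set up a temporary directory standing in for the etc directory.
plugin_dir <- file.path(tempdir(), "etc-demo")
dir.create(plugin_dir, showWarnings = FALSE)
writeLines("hello_plugin <- function() 'plugin loaded'",
           file.path(plugin_dir, "hello.R"))
writeLines("this file is not sourced", file.path(plugin_dir, "notes.txt"))

# Only hello.R is read; its function becomes available afterwards.
loaded <- source_directory(plugin_dir)
greeting <- hello_plugin()
```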
The following example assumes some familiarity with Tcl/Tk (e.g., Welch, 2000) and the tcltk package (Dalgaard, 2001, 2002): Suppose that we want to provide a menu item and dialog box for multivariate Box-Cox transformations to normality. The car package, which is one of the packages that Rcmdr loads at startup, contains a function to perform the necessary computations, box.cox.powers (see Fox, 2002, pp. 111-112). Because none of the existing R Commander menus seems appropriate, I will add a Transform menu under Statistics, with the single item Multivariate Box-Cox transformations.... This item will lead to a dialog box to select the variables to be transformed. Finally, I will write a function, named BoxCox, to construct the dialog box and invoke box.cox.powers.
The modified Rcmdr-menus.txt is as follows, eliding most of the lines in the file (the elisions are marked by . . .). I have also "wrapped" each line in the file to fit on the page, and inserted a blank line between each menu definition.

. . .

Each line in the file contains six entries (fields) and defines either a menu or a menu item.
Each menu has a "parent" menu; top-level menus, such as File and Statistics, have topMenu as their parent. Menu definition requires two lines: One to create the menu and another to place it under its parent.
The "operation/parent" field in each line contains the parent menu (for menu creation), cascade (for placing a menu under its parent), or command (for a menu item that invokes a command).
The "label" field contains the text that labels a menu or menu item. By convention, menu items leading to dialog boxes have labels ending in ellipses (...). The "command/menu" field contains the name of a function to be invoked by a menu item, or the name of a menu to be installed.

Table 1: Functions exported by the Rcmdr package for setting and retrieving information.

Function         Purpose
activeDataSet    Returns or sets the name of the active data set.
ActiveDataSet    Returns the name of the active data set.
activeModel      Returns or sets the name of the active model.
ActiveModel      Returns the name of the active model.
Factors          Names of factors in the active data set.
getRcmdr         Retrieve an object from the Rcmdr environment.
GrabFocus        Returns (or sets) the grab-focus status.
Variables        Names of variables in the active data set.
The "activation" field contains a quoted R expression that, when evaluated, indicates whether a menu item is to be active, if the expression is TRUE, or inactive ("grayed out"), if it is FALSE. The Rcmdr package exports a number of functions (see the discussion below and Table 2) to test the current state of the R Commander: for example, numericP (a "predicate" to test for the presence, and possibly sufficient number, of numeric variables in the active data set), factorsP (to test for the presence and number of factors), and packageLoaded (to test whether a specific R package has been loaded). The status of menus is assessed at R Commander start-up; it is reassessed when the active data set or active statistical model changes, and whenever the function activateMenus is invoked. If the activation condition is empty (i.e., if the field contains ""), then the corresponding menu item is always active.
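The evaluation of an activation field can be sketched as follows. This is an illustration of the idea under stated assumptions, not the package's code: `item_is_active` and the predicate `has_numeric` are hypothetical, and the built-in `mtcars` data frame stands in for the active data set.

```r
# Decide whether a menu item is active. An empty field means "always
# active"; otherwise the text is parsed and evaluated, and the item is
# active only if the result is TRUE.
item_is_active <- function(activation, envir = parent.frame()) {
  if (identical(activation, "")) return(TRUE)
  isTRUE(eval(parse(text = activation), envir = envir))
}

# Hypothetical predicate: are there at least n numeric variables in
# the stand-in active data set?
has_numeric <- function(n = 1) {
  sum(vapply(mtcars, is.numeric, logical(1))) >= n
}

active_always   <- item_is_active("")
active_numeric  <- item_is_active("has_numeric(2)")
active_too_many <- item_is_active("has_numeric(50)")
```

Storing the condition as text lets the menu-definition file remain plain text while the test itself is deferred until the menus are assessed.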
The last three fields are empty ("") for menu (as opposed to item) lines. (These lines are indented two additional spaces in the file listing.) Note the line in the modified Rcmdr-menus.txt file creating transformMenu as a child of statisticsMenu; the line creating the Box-Cox item under transformMenu; and the line cascading transformMenu under statisticsMenu.
The remaining task is to write the BoxCox function. The Rcmdr package exports a number of functions to assist in writing dialogs and performing computations; these are shown in Tables 1 through 5.

Table 3: Functions exported by the Rcmdr package that build elements of dialog boxes. * Functions marked with an asterisk are "macro-like" in their behaviour, in that they execute in the environment of the calling function. These functions were created with a slightly modified version of Thomas Lumley's defmacro function (described in Lumley, 2001).

. . . Rename the file to BoxCox.R to activate it. Likewise, the Rcmdr-menus.txt file distributed with the package contains commented-out lines for the example; remove the comment characters (#) from the beginnings of these lines to activate them.

Some Suggestions for Instructors
At the beginning of my introductory-statistics course, I distribute a manual for the R Commander based on the second section of this paper. When the software is required during the course, I begin by demonstrating its use for a particular kind of task, such as constructing a contingency table or performing a regression analysis, that is similar to the work that the students will do. Assignments that entail the use of the software are accompanied by directions that point the students towards the menus and dialogs that they will need. Students are given the opportunity to do these assignments in a supervised computer lab, but after the initial assignment, almost all work independently. With the exception of independence from the lab, this is essentially the same strategy that I previously employed with other statistical software.
Some of the social-science students whom I encounter in introductory statistics classes have difficulty installing and configuring software. I imagine that this situation varies with discipline and locale, but I also expect that it is reasonably common. I assume here that students will be using R and the R Commander under Windows, but it should not be hard to transpose these suggestions to other operating systems.

You do have to install some of the tools for building R, however, including Perl and the Inno Setup software for building Windows installers. Inno Setup should be installed at c:\packages\inno4 (not in the default location under Program Files); alternatively, you can edit the MkRules file in the R source distribution to reflect the location of Inno Setup. See http://www.murdoch-sutherland.com/Rtools/ for further information.
The binary installation that you use as the "target" for the installer should be a complete installation of R, e.g., including all manuals, HTML help pages, etc.

Note: Depending upon how your version of Windows is configured, you may not see the file types ".bat" and ".exe" referred to here.
R is free software. Most of it is distributed under the GNU General Public License; see the files rw2001pat\COPYING and rw2001pat\COPYRIGHTS for details. Individual R packages have various licenses; license information is given in the DESCRIPTION file of each package.

Figure 1: The R Console window after loading the Rcmdr package.

Figure 2: The R Commander window at start-up.

File
. . .
Data in packages
    List data sets in packages
    Read data set from attached package
Active data set
    Select active data set
    Help on active data set (if available)
    Variables in active data set
    Set case names
    Subset active data set
    Remove cases with missing data
    Export active data set
Manage variables in active data set
    Recode variable
    . . .
Cluster analysis
    k-means cluster analysis
    Hierarchical cluster analysis
    Summarize hierarchical clustering
    Add hierarchical clustering to data set
Fit models
    Linear regression
    . . .
Bonferroni outlier test
Graphs
    Basic diagnostic plots
    Residual quantile-comparison plot
    . . .
Introduction to the R Commander
Help on active data set (if available)

Figure 3: Reading data from a text file.

Figure 4: Open-file dialog for reading a text data file.

Figure 5: Displaying the active data set.

Figure 6: Data editor after the data are entered.

Figure 7: Dialog box for changing the name of a variable in the data editor.

Figure 8: The Data Editor window after both variable names have been changed.

Figure 9: Reading data from an attached package.

Figure 10: Getting variable summaries for the active data set.

Figure 12: Selecting a grouping variable in the Groups dialog box.

Figure 13: The Numerical Summaries dialog box after a grouping variable has been selected.

Figure 15: A graphics window containing the histogram for infant mortality.

Figure 16: The Linear Model dialog box.

listDataSets                  Lists names of data frames, by default in the global environment.
listFactors                   Lists names of factors in a data set.
listGeneralizedLinearModels   Lists names of glm objects, by default in the global environment.
listLinearModels              Lists names of lm objects, by default in the global environment.
listNumeric                   Lists names of numeric variables in a data set.
listTwoLevelFactors           Lists names of two-level factors in a data set.
listVariables                 Lists names of variables in a data set.
Numeric                       Returns names of numeric variables in the active data set.
putRcmdr                      Store an object in the Rcmdr environment.
twoLevelFactors               Names of two-level factors in the active data set.
updateModelNumber             Increment (or otherwise change) the model number.

. . .                Reports an error and (optionally) restarts the dialog.
getFrame             Returns the frame of a listbox object.
getSelection         Returns the currently selected elements of a listbox object.
groupsBox*           Constructs a button and sub-dialog box for selecting a grouping factor.
groupsLabel*         Constructs a text field that shows the currently selected groups.
initializeDialog*    Initial housekeeping for a Tk dialog box.
modelFormula*        Constructs a dialog component for entering a model formula.
OKCancelHelp*        Constructs OK, Cancel, and Help buttons.
radioButtons*        Constructs a set of related radio buttons.
subOKCancelHelp*     Constructs OK, Cancel, and Help buttons for a sub-dialog.
subsetBox*           Constructs a text box for entering a subsetting expression.
variableListBox      Constructs an object containing a scrollable list box.

Figure 17: An illustrative dialog box produced by the BoxCox function.

I distribute to students a CD/ROM with a live, installed version of R, including all necessary packages, and configured to open R in SDI mode, to load the Rcmdr package at startup, and to use compiled HTML help in R. Students can simply double-click on the file Run-R.bat in the root directory of the CD to start R. This batch file contains a single line:

start rw2001pat\bin\Rgui.exe

Starting with R version 2.0.1 "patched," it is possible to create a custom installer with packages additional to the "recommended" R packages and modified configuration files. Details are in the file src\gnuwin32\installer\INSTALL of the R source distribution. A few tips: Although you have to download and unpack the R source distribution, you do not have to compile your own R Windows binary.

I include a ReadMe.txt file in the root directory of the CD with the following contents:

Installing the R Software and Data Files From the CD/ROM

This CD/ROM is intended for Windows 9x, ME, NT, 2000, and XP systems. The CD/ROM contains the following files and directories:

o The file rw2001pat.exe will install the R software on your computer and configure it for use in the course. Double-click on the file in the Windows Explorer to initiate the installation process. You can take all of the defaults in the R installer.

o The file AdbeRdr60_enu_full.exe will install the Adobe Reader version 6.0 on your computer. This is a viewer for PDF files; you do not have to install the Adobe Reader if you already have it or another PDF file viewer installed on your computer. You need a PDF file viewer to read the R Commander manual and the R manuals. Double-click on the file to initiate installation.

o The directory rw2001pat\ contains a pre-installed copy of R that can be run directly from the CD/ROM. Double-click on the file Run-R.bat in the Windows Explorer to run R from the CD/ROM.

o The directory R-Packages\ contains zip files for all of the packages on CRAN (the Comprehensive R Archive Network).
Two buttons allow you to open the R data editor to modify the active data set or a viewer to examine it. The data-set viewer can remain open while other operations are performed.

Most of the menus and dialogs in the R Commander reference the active data set. (The File, Edit, and Distributions menus are exceptions.)

lm(prestige ~ (education + income)*type, data=Prestige)

Operations on the active model may be selected from the Models menu. For example, Models → Hypothesis tests → Anova table produces the following output: . . .

. . .       TRUE if there are hclust objects in memory.
lmP         TRUE if the active model is an lm object.
numericP    TRUE if there are (sufficient) numeric variables in the active data set.
. . .       TRUE if there are (sufficient) two-level factors in the active data set.

Table 2: "Predicate" functions exported by the Rcmdr package. These functions are used to determine menu-item activation.