Canadian Journal of Educational Communication, Volume 19

All articles are copyright by AMTEC and may be reproduced for non-profit use without permission provided credit is given to CJEC. Back issues of CJEC are $15 Canadian and may be obtained by contacting the Editor. CJEC is indexed in the Canadian Education Index and ERIC.


INTRODUCTION
In 1983 Richard E. Clark stunned many people in the media and technology field by declaring that instructional media have no more effect on student learning and achievement than a delivery truck has on the quality of goods it transports to market. Both, he argued, are essentially neutral carriers of their respective contents. His claim extends from televised instruction on through to more recent applications of computer-based learning.
Clark's characterization of television as a neutral medium did not come as a particular shock to most, because findings of no significant difference among TV treatments had abounded in experiments in the 1950s and 1960s (Saettler, 1968). But to challenge the literature of computers in education (see Clark, 1985a, 1985b) was to contradict both intuition and the prevailing research evidence. A flurry of comments and counter-comments in the literature (e.g., Petkovitch & Tennyson, 1984) and at conferences followed Clark's article for several years.
Clark's claim was based in part on an evaluation of several meta-analyses that have appeared in recent years on the effectiveness of computer-based instruction. In particular he argued that these quantitative summaries (see Table 2 at the end of this article for references) were fundamentally flawed, because a variety of experimental artifacts, among them the novelty effect associated with the treatment itself, had not been factored out of the results.
This article is about meta-analysis and its usefulness to practitioners for planning and predicting the outcomes of instructional treatments and to researchers for conceptualizing future research efforts. Meta-analysis, also referred to as quantitative synthesis, is a general set of procedures for combining the results of many individual research studies addressing a single question (Glass, 1976, 1978). The technique has grown out of a need in the social sciences to capture the essence of ever expanding research literatures and to provide definitive answers, in terms of the magnitude of effectiveness, to the bigger questions posed by theoreticians and practitioners. In addition, meta-analysis attempts to circumvent the subjectivism commonly associated with narrative forms of literature review and the limitations ascribed to the box score or vote count technique (Kavale, 1984). However, meta-analysis is not without its critics. Researchers disagree about both the underlying premises of the technique and the procedural issues relating to its implementation.
This article examines meta-analysis as a technique for reviewing literature with a particular focus on the literature of instructional techniques, methods and strategies. Some of the main issues on both sides of the "meta-analytic debate" are examined, for the purpose of judging its usefulness to educational technology and specifically its potential as a tool for designing instruction.
A case where great controversy has arisen over meta-analytic findings will then be reviewed in detail: the debate for and against the use of mastery learning. Finally, some guidance for reading and interpreting meta-analyses will be provided. An appendix to this article includes references to meta-analytic studies of instructional variables and strategies that are likely to be of interest to the educational technologist.

Why Integrate Research Studies?
It has long been recognized that the result of a single research study by itself is far from conclusive, even when the finding supports the hypotheses under consideration. Therefore, it has been common practice for researchers to review the literature of all such studies, whenever enough are available. It is not uncommon, in fact, to see the same question asked and answered in reviews every couple of years, as new studies add to the weight of evidence that can be brought to bear on a particular question.
The value of integrative reviews stems from limitations that are inherent in the research process itself. Since few studies of educational phenomena and even fewer studies of instructional methods actually draw subjects at random from a population, integrative reviews of many similar studies serve to provide greater coverage of the population. The need for wider coverage is increased when one realizes that individual samples suffer from the same problems of error that are involved in testing the null hypothesis within a study. Even when a treatment has no real effect, about five findings of significant differences out of every one hundred studies run will be expected (i.e., when α equals .05), simply as a result of chance. Integrative reviews, therefore, provide a means of overcoming the effects of chance fluctuation within samples, leading to a more generalizable conclusion concerning an effect.
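The chance-findings argument can be illustrated with a small simulation. The following is a minimal sketch, not part of the original article: it assumes normally distributed scores with a known standard deviation of 1, equal group sizes, and a two-tailed z test, and the function name and parameters are hypothetical.

```python
import random
from statistics import NormalDist


def count_chance_findings(n_studies=100, n_per_group=30, alpha=0.05, seed=0):
    """Count 'significant' studies when the treatment truly has no effect.

    Assumption: scores are N(0, 1) in both groups, so any significant
    difference is a chance (Type I) finding.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cutoff
    se = (2 / n_per_group) ** 0.5  # SE of a mean difference when sd = 1
    hits = 0
    for _ in range(n_studies):
        treat = [rng.gauss(0, 1) for _ in range(n_per_group)]
        ctrl = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (sum(treat) / n_per_group - sum(ctrl) / n_per_group) / se
        if abs(z) > z_crit:
            hits += 1
    return hits


if __name__ == "__main__":
    # The count varies with the seed but should hover around 5 per 100.
    print(count_chance_findings())
```

Over many runs the proportion of significant studies settles near α, which is why a single significant study is weak evidence on its own.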

Methods of Integrating Findings
Light and Smith (1971) provide a typology for categorizing most reviews of research in the social sciences. The first type of review involves listing or describing factors which have produced significant differences in at least one study. The style of this type of review is primarily narrative. In the second type only studies that support a particular point of view are presented. Most of the brief reviews of literature at the beginning of research articles are of either the first or the second type. A third type involves summarizing the findings of many studies using what has come to be called the vote count or box score technique. A simple count of studies reporting positive, negative or no significant results is conducted and a verdict is reached when a plurality of votes exists. The last type, reviews in which effect sizes are aggregated across many studies, is the category in which meta-analysis resides. The third and the fourth types of reviews are both quantitative in nature. The box score or vote count technique, however, has been criticized because it fails to take into account the effects of differential sample size on the sensitivity of the null hypothesis test. Larger samples require smaller mean differences to establish significance than smaller samples, although they are given equal credence in this technique. Box score analysis also does not take into account the magnitude of differences/relationships or the quality of the study. Meta-analysis, the subject of this article, was developed by Glass (1976, 1978) to overcome the difficulties inherent in descriptive reviews and the problems associated with using statistical indices (e.g., r) to reflect differential treatment effects or relationships among variables.
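The sample-size criticism of the vote count can be made concrete. The sketch below is illustrative only: it assumes equal group sizes and a large-sample normal approximation to the significance test, and the function names are hypothetical. It shows the same standardized difference earning a different "vote" depending solely on sample size.

```python
from statistics import NormalDist


def z_for_difference(d, n_per_group):
    """Large-sample z statistic for a standardized mean difference d.

    With equal groups of size n, z = d / sqrt(2 / n) = d * sqrt(n / 2).
    """
    return d * (n_per_group / 2) ** 0.5


def vote(d, n_per_group, alpha=0.05):
    """Return the box-score category a study would be assigned."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z = z_for_difference(d, n_per_group)
    if z > z_crit:
        return "positive"
    if z < -z_crit:
        return "negative"
    return "no significant difference"


if __name__ == "__main__":
    # Identical effect (d = 0.3), different sample sizes, different votes.
    print(vote(0.3, 20))   # no significant difference
    print(vote(0.3, 200))  # positive
```

Aggregating the effect sizes themselves, rather than the votes, avoids penalizing small studies for their lack of statistical power.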

Objections to Quantitative Synthesis
Objections in principle. Complaints about quantitative synthesis range from the purely philosophical to the purely methodological. On the one hand, there are the arguments raised by advocates of qualitative and naturalistic approaches to enquiry (e.g., see Guba, 1979). Guba's objections in principle to "reduction by numbers" apply doubly to quantitative synthesis, since the distillation of many studies removes the researcher one step further from the "texture" of the original setting. This, of course, is an objection that cannot be overcome by improving the theory or practice of quantitative synthesis. Rather, one must accept or reject this argument based on other criteria which have been laid out and vigorously defended by both sides. The second major objection applies not to quantitative analysis in general, but to quantitative synthesis in particular. Eysenck (1984) argues that what is lost when narrative review is transformed to quantitative review is the exercise of scientific judgment over what is, in nearly all areas of research, a complex set of interacting variables. He states that "No simple addition of diverse and incommensurate studies can serve the purpose of drawing meaningful conclusions from heterogeneous and complex data. That requires experience, knowledge and the intangible quality we call good judgment" (p. 47).
Evidence that paints a rather different picture of actual review practice is supplied by Jackson (1980). He examined a random sampling of narrative reviews from the social sciences and found that decision rules were so often unstated in these reviews that it was difficult to describe them, much less evaluate their quality. Meta-analysis is often touted as the antidote to the subjectivism that appears to be endemic to the process of describing research outcomes verbally.

Objections in practice.
Hardly anyone from within the quantitative research community argues about the need to synthesize the results of large bodies of research literature. It is readily acknowledged that when the literature exceeds even a dozen studies, the ability of reviewers to capture its essence in narrative form is diminished. Quantitative synthesis, then, as a principle for dealing with substantial literature bases is not challenged. In addition, it is generally acknowledged that there is nothing objectionable in the statistical underpinnings of meta-analysis, given that they derive from the wealth of statistical experience that has developed over many decades. It appears, then, that the objections arising from the research community derive more from the practice of meta-analyzing than from the principle. Slavin (1984) and others have argued that one of the chief problems inherent in much of the meta-analytic literature that has appeared since its introduction by Glass is the uncritical combining of studies that have little more in common than the underlying question: is the treatment condition better than the control condition? It is not surprising that many practitioners have adopted this strategy and that the literature reflects this tendency, since Glass had originally suggested subjecting all available studies to meta-analysis.
In large measure, this point is at the heart of Clark's objections to the meta-analyses on computer-based learning. In this particular case, according to Clark, the unconsidered effects of treatment artifacts produced an overestimation of effect size for this medium of instruction, although it could have just as easily gone the other way. In any case it is generally agreed that lumping all possible forms of treatment and methodological variations into one analysis probably leads to more confusion than clarity.
It has been argued, moreover, that one of the strengths of meta-analysis, its tendency towards summary conclusions, is also one of its weaknesses (Guskin, 1984). Since the research question being raised in a meta-analysis is often dichotomous or at best one of simple relationship, the variety of more complex findings that may have appeared in the original articles is reduced. The fear has been expressed that consumers of meta-analyses may come away with nothing more than unqualified statements such as, "computer-based instruction is better than traditional lecture-based methods" or "the correlation between prior achievement and instructional support is moderately high and positive". It is certainly arguable, however, that for consumers who would not bother to digest more subtle forms of summaries, a simplistic view of the state of a research literature is better than no impression at all.
In the following sections, procedures for conducting a meta-analysis are described. Issues related to each procedure will be discussed to highlight both the potentials and the problems associated with the technique.

Defining the Scope of the Analysis
A first important decision to be made after a general area of research has been identified is how extensively the search will be conducted and what descriptors will be used in reviewing the literature. While this sounds like a relatively straightforward process, it is usually not. Often this step involves making literally dozens of a priori decisions about what will be included (and not included) in the meta-analysis.
Each decision will narrow the field of search, as well as the number of studies identified and the population to which generalizations can be made.
For example, in a synthesis of mastery learning studies one might consider features such as: a) how far back in time the review will go; b) the grade level of subjects; c) the subject matter tested; d) the duration of treatment; e) whether self-paced or group-based treatments are used; f) the type and quality of the dependent variable; and g) a host of experimental design characteristics (e.g., internal and external validity). Carlberg and Walberg (1984) point out trade-offs in: a) narrowly focusing the synthesis to exclude relevant variations in treatments (high fidelity/limited conclusions); and b) making the scope of inclusion so broad that marginally relevant and/or bad research is analyzed (low fidelity/more robust conclusions).
Advice on both sides of this issue has been offered in the literature of meta-analysis. Glass, McGaw and Smith (1981) argue for the widest inclusion criteria possible in order to reduce the effects of reviewer bias in the selection process. Eysenck (1978) has criticized this approach as "garbage in, garbage out".
As a way of accommodating this criticism, Slavin (1986, 1987) has proposed an approach called "best-evidence synthesis". This approach is based on the legal notion that ". . . the same evidence that would be essential in one case might be disregarded in another because in the second case there is better evidence available" (1986, p. 6). In the case of research review this means that only the best quality literature should be used in judging the general state of a research question: those studies which are high in methodological rigor and best manifest the characteristics under study. In the absence of studies of better quality, this could involve having to use less well designed studies, which would nonetheless comprise the best evidence available. Objections to the use of this approach have been raised by Guskey (1987), who counters that the "best" in best-evidence synthesis is itself subjective and does not necessarily eliminate bias from the review. Abrami, Cohen and d'Apollonia (1988) take a middle approach, between that advocated by Glass and that advocated by Slavin:
. . . we urge greater care in describing the inclusion criteria and in detailing the reasons for excluding individual studies. But we also consider that reviews sometimes go beyond describing the substance of the literature to consider the methodological problems and generalizability concerns that distinguish the best evidence from other evidence. Reviews may thus contribute to knowledge in an area through the analysis of study weaknesses as well as strengths. Such a contribution cannot be made through only the analysis of best evidence (p. 164).

Reviewing the Literature
Once inclusion criteria have been established, the approach to locating studies for review is not substantially different from that used in other forms of integrative review. Primary studies may be located from a variety of sources, some of which are accessible through computer-generated searches. Most meta-analyses include the literature from relevant journals in the field. Others include theses and dissertations, conference presentations, technical reports and in-house manuscripts, chapters in books and monographs and other documents referred to categorically as "fugitive material".
Even when inclusion and exclusion criteria have been soundly determined, there remains the thorny problem of actually sorting studies by the established criteria. This process is by no means straightforward, as Abrami, Cohen and d'Apollonia (1988) have demonstrated using data from the literature on the validity of student ratings of instructors. They found that even when inclusion criteria were very clearly specified, seven expert raters had an average comprehensiveness index of only .58 (i.e., ratio of correctly included studies to incorrectly included or excluded studies) with individual indices ranging from .13 to .88. They make recommendations for enhancing the agreement among raters, along with suggestions for improving meta-analysis methodology at four other stages in the process.

Identifying Variables for Study
Unless the researcher is very familiar with the primary literature under study, it is advisable to select a sample of studies for the purpose of determining variables that will be subsequently coded for analysis. The purpose of this exercise is to determine which variables, in addition to the primary distinction under study, have been most commonly reported in the literature. These additional variables may serve to aid in generalization or may actually form the basis for tests of significance in their own right. In the following sections these variables, under commonly encountered headings, are discussed.
Demographic variables. Among other things, these include variables related to the nature of the experimental sample under study (e.g., sex, grade level, SES).
Treatment variables. Included in this category are characteristics of the treatment condition, for instance, type of treatment, duration of treatment, location of treatment and experimenter characteristics.
Design variables. Variables falling into this category are those associated with the nature and quality of the experimental manipulation. Examples include presence of experimental control, randomization and selection, presence of pretest, nature of dependent measures, and specific threats to internal and external validity.
Once these variables have been identified, they are coded for each study using a scheme that is similar to that shown in Figure 1 (see page 178).
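As an illustration of what a coded record might look like, the following is a minimal sketch. It is not the coding scheme of Figure 1: the class name, field names and value types are all hypothetical, chosen only to mirror the three categories of variables described above.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class CodedStudy:
    """One row of a hypothetical coding sheet; None means 'not reported'."""
    study_id: str
    # Demographic variables
    sex: Optional[str] = None
    grade_level: Optional[str] = None
    ses: Optional[str] = None
    # Treatment variables
    treatment_type: Optional[str] = None
    duration_weeks: Optional[float] = None
    location: Optional[str] = None
    # Design variables
    random_assignment: Optional[bool] = None
    has_pretest: Optional[bool] = None
    dependent_measure: Optional[str] = None
    validity_threats: List[str] = field(default_factory=list)


def uncoded_fields(study):
    """List the variables a primary study did not report."""
    return [f.name for f in fields(study) if getattr(study, f.name) is None]


if __name__ == "__main__":
    study = CodedStudy(study_id="S01", grade_level="5", has_pretest=True)
    print(uncoded_fields(study))
```

Recording unreported values explicitly, rather than guessing them, makes it easier to see later which subgroup analyses the coded sample can actually support.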

Calculating Effect Size
The estimate of the strength of a treatment, called an effect size, is calculated using a relatively simple procedure. For difference questions, the means of treatment and control groups are ascertained, and the control mean is subtracted from the treatment mean. Naturally, if this difference has a positive sign, it indicates that treatment subjects have outperformed control subjects, while a negative sign indicates the reverse.
This raw difference is not enough, however. It must be standardized so that other studies investigating the same variable may be averaged with it. The meta-analytic researcher accomplishes this by dividing the raw difference by an estimator of σ, the standard deviation of the control group: $ES = (\bar{X}_T - \bar{X}_C) / s_C$, where $\bar{X}_T$ and $\bar{X}_C$ are the treatment and control group means and $s_C$ is the control group standard deviation. The result is a z-score of sorts*, a standardized metric which represents the number of standard deviations by which the treatment condition has outperformed the control condition (or underperformed if the sign is negative). All of the effect sizes in the study are then averaged (to produce a mean effect size) or the median of the distribution is reported. Figure 2 (see page 179) shows how the difference between the two theoretical distributions of control and treatment may be shown graphically, and then represented as an actual distribution of effect sizes.
Other formulae for deriving effect sizes in studies that do not contain some of the elements listed above have been presented by McGaw and Glass (1980). Formulae are also available for obtaining effect sizes when transformed scales are used (e.g., gain scores), when factorial designs are used or when dependent measures have been adjusted by a covariate.
*Z-scores are calculated within a distribution of raw scores using the following formula: $z = (\text{score in the distribution} - \text{distribution mean}) / \text{distribution standard deviation}$. Since the distribution of z-scores is in unsquared deviation units, its mean is always 0 and its standard deviation is always 1.0. This is not the case with an effect size distribution, since the mean and standard deviation for each study included is different. The mean of the distribution may be either positive or negative and represents the average standardized difference between sample means.
Note (Figure 2): Effect size distribution from Guskey and Pigott (1988).
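The calculation described in this section can be sketched in a few lines. This is a minimal illustration rather than a full implementation: the function names are hypothetical, the three studies are made-up numbers, and only the basic case (two means and a control group standard deviation) is handled.

```python
from statistics import mean, median


def effect_size(mean_treatment, mean_control, sd_control):
    """Standardized mean difference using the control group SD."""
    if sd_control <= 0:
        raise ValueError("control group standard deviation must be positive")
    return (mean_treatment - mean_control) / sd_control


def summarize(effect_sizes):
    """Return the mean and median of a set of effect sizes."""
    return mean(effect_sizes), median(effect_sizes)


if __name__ == "__main__":
    # Hypothetical studies: (treatment mean, control mean, control SD)
    studies = [(78, 72, 12), (64, 66, 8), (55, 45, 10)]
    sizes = [effect_size(t, c, sd) for t, c, sd in studies]
    print(sizes)             # [0.5, -0.25, 1.0]
    print(summarize(sizes))  # mean about 0.42, median 0.5
```

The second study has a negative effect size because its control group outperformed its treatment group, as the sign convention in the text requires.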
At this stage, the homogeneity of the effect size distribution is considered.
The finding of an effect size of .94 with a standard deviation of 1.91 in a mastery learning meta-analysis (Lysakowski & Walberg, 1982) is probably an example of too much variability, considering the magnitude of the mean. Similarly, Guskey and Pigott (1988) reported a homogeneity of variance violation for mastery learning, χ² = 759.5 (df = 77), p < .001, and avoided calculating a measure of central tendency for the set of mastery learning studies.
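A homogeneity check of this kind can be sketched as follows. The text does not give the formula, so this sketch assumes the inverse-variance weighted Q statistic associated with Hedges and Olkin (1985), together with the usual large-sample variance of a standardized mean difference. The function names and inputs are hypothetical.

```python
def d_variance(d, n_treatment, n_control):
    """Large-sample variance of a standardized mean difference (assumed form)."""
    n_sum = n_treatment + n_control
    return n_sum / (n_treatment * n_control) + d ** 2 / (2 * n_sum)


def homogeneity_q(studies):
    """Compute Q and its degrees of freedom.

    studies: list of (effect_size, n_treatment, n_control) tuples.
    Under homogeneity, Q is referred to a chi-square distribution
    with k - 1 degrees of freedom, where k is the number of studies.
    """
    weights = [1.0 / d_variance(d, nt, nc) for d, nt, nc in studies]
    pooled = sum(w * s[0] for w, s in zip(weights, studies)) / sum(weights)
    q = sum(w * (s[0] - pooled) ** 2 for w, s in zip(weights, studies))
    return q, len(studies) - 1


if __name__ == "__main__":
    # Hypothetical effect sizes with 30 subjects per group
    q, df = homogeneity_q([(0.1, 30, 30), (1.5, 30, 30), (0.6, 30, 30)])
    print(round(q, 2), df)
```

A Q value that is large relative to its degrees of freedom, as in the Guskey and Pigott result quoted above, indicates that a single average effect size would misrepresent the set of studies. Obtaining a p value requires a chi-square table or a statistics library, which this sketch deliberately leaves out.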

Using Inferential Statistics
If homogeneity of effect size is violated, it is recommended (Hedges & Olkin, 1985) that effect sizes should be separated into subsets by coded characteristics until homogeneity is achieved. While similar to the statistical procedure just described, this test is equivalent to the F-test among groups in a one-way experimental design. Guskey and Pigott followed this procedure for all studies selected for inclusion on subject area, grade level of students and duration of study. On those studies which reported them, program characteristics, gender, initial ability level of the students and extent of teacher training were investigated in an attempt to isolate models of study characteristics that would explain the lack of homogeneous findings.
Even when the homogeneity assumption is met, most meta-analyses report tests of significance across coded variables to enhance the findings and explore other dependencies that may exist in the data. Let's say, hypothetically, that the average effect size for a study is .60, but when the sample is categorized by sex, women (ES = .80) improve more than men (ES = .60). This suggests that women may be affected by the treatment more than men. In a sense it is an interaction term relating sex, as an independent variable, to the average difference between treatment and control. The test of significance is analogous to ANOVA in that total variation is partitioned into between-class and within-class components for the purposes of comparison (Hedges & Olkin, 1985). One should always resist a causal interpretation of comparisons like this, however, since no random assignment to treatments is involved.
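The between-class and within-class partition can be sketched as follows. This is an illustrative sketch only: it assumes each effect size comes with a known sampling variance and uses inverse-variance weights, the function names are hypothetical, and the example values are made up.

```python
def weighted_mean(pairs):
    """Inverse-variance weighted mean of (effect_size, variance) pairs."""
    weights = [1.0 / v for _, v in pairs]
    return sum(w * d for w, (d, _) in zip(weights, pairs)) / sum(weights)


def q_about(pairs, center):
    """Weighted sum of squared deviations of effect sizes about a center."""
    return sum((d - center) ** 2 / v for d, v in pairs)


def partition_q(groups):
    """Split total heterogeneity into between- and within-class parts.

    groups: dict mapping a class label to a list of (effect_size, variance).
    """
    all_pairs = [p for pairs in groups.values() for p in pairs]
    grand = weighted_mean(all_pairs)
    q_total = q_about(all_pairs, grand)
    q_within = sum(q_about(p, weighted_mean(p)) for p in groups.values())
    q_between = sum(
        sum(1.0 / v for _, v in p) * (weighted_mean(p) - grand) ** 2
        for p in groups.values()
    )
    return {
        "q_total": q_total,
        "q_between": q_between,
        "q_within": q_within,
        "df_between": len(groups) - 1,
        "df_within": len(all_pairs) - len(groups),
    }


if __name__ == "__main__":
    # Hypothetical (effect size, variance) pairs coded by sex of sample
    coded = {
        "women": [(0.9, 0.05), (0.7, 0.04)],
        "men": [(0.5, 0.05), (0.7, 0.06)],
    }
    print(partition_q(coded))
```

As in a one-way ANOVA, the total statistic equals the sum of its between-class and within-class parts, and each part is compared against a chi-square distribution with the degrees of freedom shown.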
An actual example of this comes from a study of teacher feedback on homework assignments (Paschal, Weinstein & Walberg, 1984). Homework was found to be more effective in the fourth and fifth grades for improving achievement than in upper elementary or high school. Also, when graded versus ungraded homework was compared, a substantial difference emerged. Graded homework produced an effect size of .80, while ungraded homework influenced achievement by only .36 standard deviations. Both of these characteristics of the sample produced significant differences when tested using the procedures outlined above.
In addition, researchers would be interested in whether there is a difference among a variety of methodological and demographic aspects of the studies under consideration. This amounts to searching for bias in the variables coded under threats to internal and external validity, in publication sources (such as articles, book chapters, dissertations and ERIC documents) and in other variables that may reside concomitantly with treatments. Not surprisingly, higher effect sizes are often found for published over non-published works, since journals usually accept studies that apply more rigorous methods and often reject studies reporting no significant differences.
Where a quantitative scale is involved, regression analysis can be used to test the relationship between increasing or decreasing levels of some continuous independent variable and its accompanying dependent variable. A good example of this comes from Glass and Smith (1979). They investigated the effects of differing class sizes (i.e., number of students being taught at a time) on the cognitive achievement associated with them. Eighty studies were gathered and achievement (a quantitative scale) was regressed on class size (also a quantitative scale). Results indicated that achievement changed by .50 standard deviations as class size changed from 1 (i.e., individualized tutorial instruction) to 40. However, the relationship was not completely linear. The greatest change occurred between class sizes of 1 and 20, beyond which it flattened into almost a straight line. This suggests that with class sizes over 20, individual achievement changes little as class size changes further. This study was one of the first large-scale meta-analyses and its results have been widely discussed as an example of both good and bad (e.g., see Slavin, 1984) meta-analytic practice.

Interpretation and Reporting
When they are completed, meta-analyses, unlike the individual samples summarized within them, are thought to approximate the population of subjects from which the original studies were drawn. In fact, meta-analyses often include literally tens of thousands of subjects, assumed to have been originally drawn from the same population before random assignment. When treatment effects are present, two populations are actually involved, one treated and one untreated. The effect size estimates the standardized difference between these populations. Figure 2 shows this comparison in graphic form. An effect size of 1.0 means that the treatment population has outperformed the control population by one population standard deviation. Often the effect size is converted to a percentile rank to enhance interpretability. An effect size of 1 is equivalent to the 84th percentile in a normal distribution, meaning that the average treatment condition subject is above 84% of subjects in the control condition.
Since one of the purposes of meta-analytic studies is to allow for comparison among potentially useful instructional treatments, some additional form of interpretation of average effect size is desirable. A non-technical interpretation of low, medium and high effect sizes has been suggested by Cohen (1969). Small effect sizes (e.g., .20 or the 58th percentile) are similar to those associated with comparisons among the heights of 15 and 16 year old girls. Medium effect sizes (e.g., .50 or the 69th percentile) would be similar to differences between 14 and 18 year old girls. Large effect sizes (.80 or the 79th percentile) are of the order of magnitude of differences in IQ between holders of Ph.D. degrees and the average college freshman.
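The percentile conversions quoted in this section can be reproduced with the standard normal cumulative distribution. This is a minimal sketch that assumes normally distributed scores in the control population. The function name is hypothetical.

```python
from statistics import NormalDist


def effect_size_to_percentile(effect_size):
    """Percentile of the control distribution reached by the average
    treatment subject, assuming normally distributed scores."""
    return 100 * NormalDist().cdf(effect_size)


if __name__ == "__main__":
    for es in (0.20, 0.50, 0.80, 1.00):
        print(es, round(effect_size_to_percentile(es)))
    # 0.2 -> 58, 0.5 -> 69, 0.8 -> 79, 1.0 -> 84
```

An effect size of zero maps to the 50th percentile, which is a useful reminder that these percentiles describe the average treated subject relative to the control distribution, not the proportion of subjects who benefit.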

Description of Mastery Learning
Although the concept of mastery learning has existed since the 1920s, it became a mainstream instructional strategy primarily as a result of work by Bloom (1968) and Block and Anderson (1975). In its simplest form mastery learning is ". . . a test about what the student was supposed to learn; a test not for grading or judging, but rather to see what the student has learned and what he or she needs to learn. The students are then given some help" (Bloom quoted in Koerner, 1986, p. 60). There are two primary forms that have grown out of this basic notion: a) group-based mastery learning; and b) personalized system of instruction (PSI/Keller Plan). PSI is an individualized form of mastery learning.
Supporters of mastery learning claim that the method will produce significantly higher achievement results, given the same objectives, the same materials, and the same amount of time allocation as standard instructional models. In group-based mastery instruction, teachers determine the pace of instruction, while in PSI the student controls the pace. In addition, it is argued that learning achievement will be dramatic: 90% of the learners will be able to achieve at a learning level of 85% or higher. Effectively, this would change the normal distribution of learning outcomes produced by standard instruction into one that is highly negatively skewed (Bloom, 1984). Guidance is individualized and focused on what has not been achieved.
Over the years many studies have been conducted to test these claims in both the contexts of group-based settings and PSI. The first review of literature (Block & Burns, 1976), conducted on both group-based and PSI studies, concluded that the mastery approaches described result in higher achievement and positive affective outcomes. However, the cognitive results were not as dramatic as the supporters of mastery learning had claimed. In the 1980s three meta-analytic studies of group-based mastery learning were conducted, each reaching dramatically different results as to the state of research that underlies this instructional technique and the magnitude of treatment effects. These studies are summarized in Table 1 (see pages 183 and 184), and their features and issues related to them are discussed.

Three Meta-Analyses on Mastery Learning
The first major review of research (Lysakowski and Walberg, 1982) was a quantitative summary of three of the four fundamental ingredients of quality instruction: cues, participation and corrective feedback. Reinforcement, the fourth element, had been reviewed previously. The reviewers concluded that the average effect for all three components was .97, and that the effect for feedback and correction, the element most commonly associated with mastery learning, was .94. Clearly, this was dramatic evidence that mastery learning had achieved the potential that had all along been claimed for it.
Five years later, Slavin (1987) published a meta-analytic study of mastery learning that all but refuted the major claims made by Bloom and others. Using a technique he developed, called "best-evidence synthesis," he was able to show that the results of mastery learning are considerably smaller in subsets of studies embodying: a) the "strong claim," that mastery will outperform the control group when they have the same objectives, the same materials and the same amount of time and when learning is measured with standardized instruments; b) the "curricular focus claim," that mastery learning focuses teachers on particular curricula and students on the attainment of particular objectives; and c) the "extra time claim," that mastery learning is an effective use of additional time and instructional resources to bring all students to an acceptable level of achievement. In addition, Slavin only used studies that he considered methodologically rigorous and those where the mastery learning treatment lasted for four weeks or longer. Evidence for the "strong claim" produced a median effect size of .04 (essentially 0). Studies representing the "curricular focus claim" were found to have a median effect size of .26, and the median for those representing the "extra time claim" was .31.
The most recent meta-analysis of mastery learning studies was conducted by Guskey and Pigott (1988). A subset of articles related only to elementary and secondary classrooms had been published previously by Guskey and Gates (1986). The Guskey and Pigott meta-analysis included a larger number of studies (n = 46) and arrived at a somewhat surprising conclusion: that the variability of effect sizes for methodologically sound studies of group-based mastery learning was too great to compute an average effect size estimate. Attempts to derive models of measured variables which explained this heterogeneity were generally unsuccessful, although some trends are noted (e.g., higher in some subject matters).

Some Reasons for the Differences
There are several explanations for the discrepancies among these meta-analyses which help to demonstrate some of the characteristics of meta-analyses in general. First, it is obvious that meta-analyses conducted at different points in time, especially when there is high research productivity in the area, are bound to produce different results. More refined methods of study, better research designs, sensitivity to criticisms of previous research studies and a host of other considerations can affect the results achieved through meta-analysis from one era to another.
Second, the selection of studies for inclusion, even when the same study pool is available, can dramatically affect the results that are achieved by different researchers. Smith and Glass (1977) and others argue for the inclusion of all available studies that meet the minimum criterion of a metric of comparison, while others have claimed that mixing apples and oranges clouds the issue under study considerably, rather than elucidating it (Slavin, 1984). In the meta-analyses under consideration, the earliest effort set wide inclusion criteria that admitted many studies that were not included in the subsequent articles. Note in Table 1 that the percentage of overlap between Lysakowski/Walberg and Slavin and Guskey/Pigott is 0% and 10%, respectively. Between Slavin and Guskey/Pigott the overlap is considerably higher. In this latter case, however, it is far less than it could be because of the restrictive conditions set by the "best-evidence synthesis".
One finding by Guskey and Pigott (and Lysakowski & Walberg, although it was not discussed) reveals something of interest about the outcomes of meta-analytic studies. You may notice in Table 1 that for an effect size of .94, Lysakowski & Walberg report a standard deviation of 1.91. This describes a very platykurtic distribution (i.e., flat) which cannot be thought of as homogeneously summarized by a single effect size. In sampling an essentially different literature of mastery learning studies, Guskey and Pigott found the same thing. The standard deviation of the distribution of effect sizes should be considered an important piece of information, and effect sizes with high standard deviations or standard errors of the mean should not be taken at face value, because they may not be statistically significant.
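To make the point about dispersion concrete, the following minimal sketch treats the Lysakowski and Walberg figures (mean effect size .94, standard deviation 1.91) as if the effect sizes were normally distributed. The normality assumption is ours, made purely for illustration, and the function name is hypothetical. Under that assumption roughly three in ten effect sizes would fall below zero, which suggests why a single mean is a poor summary of such a distribution.

```python
from statistics import NormalDist


def share_below_zero(mean_es: float, sd_es: float) -> float:
    """Proportion of effect sizes expected to fall below zero,
    ASSUMING (for illustration only) that they are normally distributed."""
    return NormalDist(mu=mean_es, sigma=sd_es).cdf(0.0)


# Values reported for Lysakowski & Walberg in Table 1 of the article.
share = share_below_zero(0.94, 1.91)
print(f"Expected share of negative effect sizes: {share:.2f}")
```

The same function can be applied to any reported mean and standard deviation when a histogram of effect sizes is not available.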
The variability, both within and among mastery learning studies, probably says little about the pedagogical soundness of its best applications, but instead bespeaks the implementation and methodological problems that continue to plague it. Variations in practice abound and there exist thorny research design problems which have not yet been fully addressed. These include dealing with time to mastery, equilibrating mastery and non-mastery treatments, dealing with the skewed distributions that invariably result from good mastery applications and establishing a sound rationale for using either standardized or well constructed locally produced instruments. As the technology of mastery learning improves and researchers become sensitive to methodological problems that are peculiar to mastery investigation, the variability in mastery studies will undoubtedly subside.

Comments for Practitioners and Researchers
Conscientious practitioners are always searching for support for the design of quality instructional programs. This might come from previous successes, from the analysis of cost/benefits, or from the literature of research studies. Meta-analysis seems a reasonable tool for achieving the latter goal.
Table 2 (see pages 187 through 191) lists 26 meta-analyses of instructional variables divided for convenience into categories: instructional media, text design features, classroom processes, feedback and correction and social aspects of learning. The references to these studies are provided in the appendix. These studies represent a potentially valuable resource for the practitioner and researcher alike.
In spite of the apparent flaws in the practice of meta-analysis, it remains the single most powerful tool for summarizing studies in an era of rapidly expanding scientific literature. This process is necessary even when flawed because researchers and practitioners alike cannot keep up with even a fraction of the literature.
For the researcher, meta-analysis represents a means for focusing thought on the large questions and a heuristic for designing future studies taking into account the smaller questions. For the practitioner in the media and technology field, meta-analysis is a means for making broad decisions about the implementation of new programs and the design of instructional products. Effect size is the metric for predicting what might happen in a new circumstance if a particular instructional variable were implemented. It tells roughly how many standard deviations of additional achievement would be expected over groups that do not receive the variable. However, it behooves practitioners and researchers alike to heed the warnings of the critics of meta-analysis practice. The following suggestions may aid readers in using the information contained in meta-analyses to support their instructional decisions.
1. Achieving a common definition -While seemingly self-evident, consumers of meta-analyses should make certain that their definition and that of the author are in agreement and that the studies reported in the meta-analysis are examples of the conceptual definition under consideration. For instance, a designer searching for pre-instructional activities for textbook design should realize that the studies reported under the rubric of advance organizers will not include other design features that might be commonly associated with advance organization, such as outlines, abstracts, introductions and overviews. The technical definition created by Ausubel and tested in meta-analyses does not include the above.
2. Achieving a common circumstance -Meta-analyses often summarize studies across a wide variety of instructional or educational circumstances (i.e., grade levels, SES levels, geographical boundaries). Consumers of meta-analyses should be aware of these circumstances and if necessary base their conclusions on subsets within the meta-analysis that fit their own needs. There is a danger in this, however. When the studies in a meta-analysis are subdivided, the resulting number of studies per subset is often quite small, frequently fewer than 5 studies. It is more difficult to base a firm judgment on smaller data sets than on larger ones, because the smaller number runs a greater risk of a Type II error (accepting a null hypothesis when it should be rejected). Naturally, the variability among studies within a subset should be of concern, as well as the mean.
3. Achieving an overall understanding -Of note in Table 2 is the fact that some areas have been investigated several times. This is partly because the state of evidence is always advancing. More recent meta-analyses supplant older ones in characterizing the field more fully. However, in some cases, a meta-analysis may be repeated to reconsider an earlier finding or to incorporate a new methodological or conceptual application into the state of the art. The meta-analysis of the mastery learning literature by Slavin (1987) is a good example of the latter case. The "best-evidence synthesis" represents a new conception of how inclusion and exclusion criteria should be set and as a result a dramatically different impression of the field emerged.
It is surprising that follow-ups of the Kulik et al. media studies of the early 1980s have not been attempted, particularly after Clark's 1983 attack on their validity. One would have expected a response to determine if Clark's assertion accurately represented the overall literature, given that his findings were based on only a partial selection of these studies. We can only speculate that calls by Salomon and Clark (1979) and Bernard (1986) and others to stop asking gross media questions, comparing a media treatment to a control group, have been heeded. Unfortunately, literature concerning the nuances of within-media comparisons does not abound, reducing the likelihood of additional meta-analyses in the media area.
Since new meta-analyses can appear for either of the reasons mentioned above, to achieve a complete understanding of a field of inquiry, it is important to become familiar with all of the meta-analyses that have been conducted, not just the most recent ones.
Another point of importance here concerns the limitations of meta-analysis for drawing specific conclusions about when or under what exact circumstances a particular technique or medium should be applied. Meta-analysis is far too global to aid in the fine-grained analysis of instructional problems. In addition, it has seldom been used to address instructional treatments that are continuous or incrementally applied (e.g., varying degrees of feedback) or where variations in type of common strategy (e.g., type of questioning) are examined. Therefore, meta-analysis is most useful as a tool for making the larger instructional decisions. The designer must look to more specific studies of instructional treatments and/or conduct local evaluation studies on prototype materials in order to gain insight into particular aspects of developing instruction.
4. Achieving a statistical understanding -There are several important points here. One, the mean effect size that is reported may not accurately reflect the underlying population parameter. If a test of homogeneity of effect size is not provided, look carefully at the magnitude of the standard deviation if it is given. Interpretation of this statistic may be supplemented by a histogram, similar to the one pictured at the bottom of Figure 1, of the kind that is often included (stem-and-leaf diagrams are also common). This will provide a visual sense of the distribution of effect sizes and the variability among them. Two, while tests of significance within the distribution of effect sizes and between subsets of demographics are important, they can be misleading. When sample size is small, the power of the test is low, reducing the probability that differences will be detected, even when they are present in the population. When sample size is large, even a relatively small effect size may exceed the critical value necessary to reject the null hypothesis. Three, in interpreting the effect size, reference to the percentile rank and to non-technical descriptions of the meaning of effect sizes is invaluable.
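The second and third points above can be illustrated with a short sketch. It converts an effect size to a percentile rank under the assumption that achievement scores are normally distributed, and it shows how the same effect size can fail or pass a significance test depending only on sample size. The function names, the equal-group-size design and the large-sample z approximation are our assumptions for illustration, applied here to a single two-group comparison rather than to a full meta-analysis. The effect sizes are the medians reported by Slavin (1987).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def percentile_rank(effect_size: float) -> float:
    """Percentile of the control distribution reached by the average
    treated student, ASSUMING normally distributed scores."""
    return 100.0 * STD_NORMAL.cdf(effect_size)


def approx_two_sided_p(effect_size: float, n_per_group: int) -> float:
    """Rough two-sided p-value for a standardized mean difference between
    two equal-sized groups (large-sample z approximation; the standard
    error is taken as sqrt(2 / n), ignoring small-sample corrections)."""
    standard_error = sqrt(2.0 / n_per_group)
    z = effect_size / standard_error
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


# Median effect sizes reported for Slavin's three claims.
for es in (0.04, 0.26, 0.31):
    print(f"effect size {es:.2f} -> percentile rank {percentile_rank(es):.0f}")

# The same effect size tested with a small and a large sample.
for n in (10, 500):
    p = approx_two_sided_p(0.26, n)
    print(f"effect size 0.26, n = {n} per group -> p = {p:.4f}")
```

Under these assumptions an effect size of .31 places the average treated student at about the 62nd percentile of the control group, while an effect size of .26 is far from significant with 10 students per group and clearly significant with 500.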

Comments for Researchers
Meta-analysis represents both a means for estimating the effects of instructional treatments in practice and a heuristic for designing future research studies. This latter function may be accomplished in two ways. First, meta-analyses can sensitize researchers to issues of design and methodology that will admit or not admit their studies to scrutiny in future attempts to synthesize the literature. Second, the ancillary analyses contained in most meta-analyses can aid researchers in identifying the sources of data and variables that are likely to interact with the major question that is being addressed. If these suggestions sound like prescriptions for conformity, that is exactly how they are intended. Progress in the science of instruction, to some degree, is predicated on the presence of high quality replications in order for the larger questions to be answered.
However, it should be recognized that there exist limitations to meta-analysis as a heuristic for research. Meta-analysis is a retrospective approach which derives its strength from the weight of past efforts. It is therefore unlikely that new developments -those that will qualitatively extend beyond present practice -will emerge from this technique. Meta-analysis will never be a substitute for insight and creativity in the conduct of primary research or the development of new instructional methods. In short, as a technology of quantitative synthesis, meta-analysis should never substitute for the kind of in-depth exploration and complex thinking that characterize productive scientific enquiry.

CONCLUSION
In this article we have sketched a broad picture of the nature of meta-analysis, its potential for informing researchers about the overall effectiveness of variables in a given field and for aiding media and technology practitioners in making decisions concerning larger instructional development issues. We have discussed both the philosophical and practical objections to meta-analysis and have described the process of doing a meta-analysis in some detail. Clearly, all of the many issues that have arisen over the last 15 years cannot be catalogued here. However, the core issues that have been represented, and the references, provide ample fodder for further consideration.

ACKNOWLEDGEMENT: The authors wish to thank Dr. Phil C. Abrami and two anonymous reviewers for their careful reading and thoughtful comments.

Abstract: The purposes of this study were to determine a) the effectiveness with which different [...] both the method of instruction and testing mode they received. Students received their respective instructional presentations and criterion tests in one session. Analyses indicated that: a) students possessing different prior knowledge levels profit differentially from different treatments; b) differences in achievement potential among students with varying prior knowledge levels can be reduced by providing instruction with elaborate rehearsal activity; c) identical treatments are not equally effective in promoting student achievement of different types of educational objectives; and d) visual testing is a viable strategy for retrieving information acquired by students receiving visualized instruction.

Rehearsal
Considerable research (Paivio, 1971; Tulving, 1976; Dwyer, 1978) has indicated that merely using visual materials to complement oral or verbal instruction does not always optimize student learning. Gagne (1977) indicated that learning is a highly idiosyncratic event, and depends very much on the nature of the learner, particularly on his prior learning. In order for learning to be effective at any level, the learners must be active (Bork, 1979; Bransford, 1979; Travers, 1970). In this regard, Murray and Mosberg (1982) have indicated that the longer an individual can be involved in rehearsal-type activities (taking notes, summarizing, responding to questions, etc.) where he/she is actively processing information related to the content material, the greater the possibility that this information will be moved from short-term into long-term memory and the greater the possibility that increased learning will occur and be retained. Mental activity on the part of learners is essential for learning to occur. This activity includes the selection and perception of stimuli, encoding of new stimuli, and the retrieval of prior knowledge for use in combination with the new stimuli for imaging, comparison, analysis, synthesis, and problem-solving. Varied forms of rehearsal, while focusing attention, allow time for incoming information to remain in short-term memory long enough to be elaborated upon and encoded for long-term memory (Anderson, 1980; Dushkin, 1970; Lindsay & Norman, 1972; Murray & Mosberg, 1982). Lindsay and Norman (1972) have argued that the longer an item is maintained in short-term memory by rehearsal, the greater the probability that it will be transmitted into long-term memory and be retained.
In general, rehearsal can be considered to be any mathemagenic activity that can serve the learner in several ways including motivating and promoting appropriate mental activity. Additionally, different rehearsal strategies differ in intensity of learner involvement and, thus, may have differential effects on specific learning outcomes. If this is the case, then it may also follow that optimum intensity of rehearsal activity (actual overt interaction with the content material) may be directly related to the level of learning to be achieved -the more complex learning requiring the most intense or involved type of rehearsal activity. For example, covert rehearsal generally requires minimal information processing activity on the part of the learner. Examples of covert rehearsal include reading prose passages, reading summary statements, reading questions and answering them mentally before checking with a given answer, and following mentally the completed solution of a problem. Researchers have found that reading correct statements or correct answers does not always provide for a level of mental processing that results in increased understanding (Anderson, Goldberg, & Hiddle, 1971;Bransford, 1979). Overt rehearsal, by providing physical activity in which the learner is required to interact with the content material, ensures that he/she attends to the information and spends more time interacting with and encoding the information. Using visualization to complement oral or verbal instruction is presumed to be a form of overt rehearsal since it provides the learner with the opportunity to observe the structure of the constructs being illustrated and also their relationship to other constructs in the illustration. 
For example, when using visualization to complement instruction on the structure of the human heart, the student can quickly see what the structure of the mitral valve looks like and also, by further inspection (interaction), its location between the left auricle and left ventricle.

Visual Testing
The transmission, acquisition and the subsequent retrieval of information are primary concerns in any instructional/training environment. In considering factors that might enhance this process, Tulving and Thomson (1973) have proposed the encoding specificity principle -that recognition memory is better if the cues used in the original instruction/acquisition environment are used in the testing/retrieval environment. Similarly, Battig (1979) and Nitsch (1977), in adhering to the encoding specificity principle, have indicated that any change in the retrieval environment from that which occurred in the original learning environment produces marked decrements in learner performance.
Support for the use of visual test items that employ visuals of the same type as those employed in the instruction has surfaced regularly in the research literature in the form of hypotheses and theories; for example, the sign-similarity hypothesis (Carpenter, 1953), cue summation theory (Severin, 1967), and the transfer-appropriate processing principle (Morris, Bransford, & Franks, 1977). Lindsay and Norman (1977, p. 337) have stated that in the teaching-learning environment, "the problem in learning new information is not getting the information into memory; it is making sure that it will be found later when it is needed." Bransford (1979), Tulving (1979), and Tulving and Osler (1968) have indicated that the accuracy with which information is retrieved is related to the degree of elaborateness of the encoding which occurred during the rehearsal activity.
Consequently, information retention level is assumed to be a direct function of the encoding occurring at the presentation stage and the degree to which the retrieval environment recapitulates this encoding (Battig, 1979; Tulving, 1979). This position implies that in instructional situations where visualization was utilized in the encoding process and was not used in the retrieval (decoding) process, learner performance measures would yield gross underestimates, if not distortions, with respect to what and how much information had been originally acquired. This conceptualization suggests that information retrieval is a very specific process, easily disrupted. Since the features of the original learning cues have to be processed during a test, any reduction in the individual distinctiveness of the cues themselves should produce concomitant reductions in recall (Nelson, 1979).

PROBLEM STATEMENT
The purpose of this study was to investigate the instructional effectiveness of integrating rehearsal activity into visually complemented prose instruction. Within this context the instructional effectiveness of both overt and covert rehearsal activity was examined. Additionally, the study examined the effect that students' level of prior knowledge had on learning and the effect that visual/verbal testing had on information retrieval. Specifically, the purpose of this study was to determine: a) whether students possessing different prior knowledge levels profit differentially from different instructional treatments; and b) whether visual testing is a viable strategy for retrieving information acquired by students receiving visualized instruction.

VERBAL AND VISUAL TESTS (CRITERION MEASURES)
Students in each instructional group participated in their respective treatments followed immediately by four criterion tests (drawing, identification, terminology, and comprehension). The identification, terminology and comprehension criterion tests were in multiple-choice formats in both the verbal and visual versions. Scores on these three tests were combined into a 60-item composite test score (each test will be described below).
The three multiple-choice tests (verbal format) used in this investigation were developed by Dwyer (1972). Additional revisions were made to selected multiple-choice questions to further eliminate the ambiguity of specific distracters and to attempt to prevent a specific question and its distracters from clueing another answer (Dwyer, 1985-1986). The format of each of the 60 multiple-choice items consisted of a typical verbal stem and verbal response options. The following description of the criterion tests, adapted from Dwyer (1978, pp. 45-47), illustrates the kinds of educational objectives assessed in this study.
Drawing Test. The objective of the drawing test was to evaluate student ability to construct and/or reproduce items in their appropriate context. The drawing test (20 items) provided the students with a numbered list of terms corresponding to the parts of the heart discussed in the instructional presentation. The students were required to draw a representative diagram of the heart and place the numbers of the listed parts in their respective positions. For this test the emphasis was on the correct positioning of the verbal symbols with respect to one another and with respect to their concrete referents.
Identification Test. The objective of the identification test was to evaluate student ability to identify parts or positions of an object. This multiple-choice test (20 items) required students to identify the numbered parts on a detailed drawing of a heart. Each part of the heart that had been discussed in the presentation was numbered on a drawing. The objective of this test was to measure the ability of the student to use visual cues to discriminate one structure of the heart from another and to associate specific parts of the heart with their proper names.
Terminology Test. This test consisted of 20 multiple-choice items designed to measure knowledge of specific facts, terms, and definitions. The objectives measured by this type of test are appropriate to all content areas that have an understanding of the basic elements as a prerequisite to the learning of concepts, rules, and principles.
Comprehension Test. The comprehension test consisted of 20 multiple-choice items. Given the location of certain parts of the heart at a particular moment of its functioning, the student was asked to determine the position of other specific parts of the heart at the same time. This test required that the students have a thorough understanding of the heart, its parts, its internal functioning, and the simultaneous processes occurring during the systolic and diastolic phases. The comprehension test was designed to measure a type of understanding in which the individual can use the information being received to explain some other phenomenon.
Composite Test Score. The items contained in three of the four individual criterion tests (identification, terminology, and comprehension) were combined into a 60-item composite test score. The purpose was to measure the total achievement of the varied levels of objectives presented in the instructional unit.
Visual Criterion Tests. In designing the visual test formats for the identification, terminology and comprehension tests, the visual tests developed by De Melo (1980) were used as a guide. The revised version of the visual form of the criterion tests utilized only one drawing with four or five letter labels in all items in which it was possible to do so while maintaining clarity and correspondence to the verbal test items (See Figure 1 on page 204). However, two items in the terminology test and all items in the comprehension test required four drawings. The item stems of both the verbal and visual test questions were verbal and asked the same question. In addition, the visual distracters in the visual tests corresponded to the verbal distracters in the verbal tests as closely as was reasonable. The description of the verbal tests given previously also describes the visual tests.

INSTRUCTIONAL TREATMENTS
Each of the four instructional treatments in this study contained the same instructional script, visuals, terminology labels, and arrows. The treatments differed only in the degree of rehearsal employed and the type of testing employed (verbal or visual). Figure 2, Plate 1 (page 205) illustrates a sample frame received by students in Treatment 1 (Reading Summaries -Verbal Test). Students receiving Treatment 2 received the same instruction as did students in Treatment 1 (Figure 2, Plate 1); however, instead of receiving the verbal tests the students in Treatment 2 received the multiple-choice criterion tests in the visual test format. The summary statements at the end of each page required a minimal amount of covert rehearsal on the part of the students and did not review all the information in the instructional script. The summary statements were designed to provide the students with the opportunity for mental review of the instructional content. Figure 2 (Plate 2) illustrates a sample frame received by students in Treatment 3. This treatment required that students shade with colored pencils the specified parts and functions of the heart. Students receiving Treatment 4 received the same instruction as did students in Treatment 3 (Figure 2, Plate 2); however, instead of receiving the verbal tests the students in Treatment 4 received the multiple-choice criterion tests in the visual test format. The parts and functions to be shaded corresponded, as closely as possible, to the information summarized in Treatments 1 and 2. The colors used were selected to avoid a sense of color coding: black, green, yellow and red. Red was used in two circumstances in which four colors were required on a page; association of the red color with oxygen-rich blood was avoided.
[Figure: Sample Questions from the Tests in the Visual Format. Plate 1: Sample questions from the identification test (visual format). Plate 2: Sample questions from the terminology test (visual format). Plate 3: Sample questions from the comprehension test (visual format). Item stems recoverable from Plates 1 and 2: "The chamber of the heart that pumps oxygenated blood to all parts of the body"; "The part(s) of the heart that control(s) its contraction and relaxation". Item stem from Plate 3: "The parts of the heart through which blood is being forced during the second contraction of the systolic phase".]
Instruction was presented to students in booklet format and students were permitted to spend as much time as they needed to interact with the instructional content and to complete the criterion tests.

DESIGN AND ANALYSES
One hundred twenty undergraduate students enrolled at The Pennsylvania State University participated in this study. A pretest consisting of 36 items on general content in physiology (Dwyer, 1972) was utilized in this study. The pretest was used to determine students' prior knowledge level regarding human physiology. Scores on the pretest were arranged in descending order from highest to lowest. The top 40 scores represented high prior knowledge (M = 27.31), the next 40 medium prior knowledge (M = 21.8) and the bottom 40 low prior knowledge (M = 17.1). The Kuder-Richardson Formula 21 (KR-21) reliability of the physiology pretest was .84 and its correlation with the total composite test was .56. Students in each of the three prior knowledge levels were then randomly assigned to one of the four treatment groups. The independent variables manipulated were level of prior knowledge, level of rehearsal strategy (covert/overt) and test mode (verbal/visual). The dependent variables were: a) performance on the visual and verbal versions of the individual criterion tests -terminology, identification and comprehension; b) performance on the composite test; and c) performance on the drawing test.
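The blocking and assignment procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' actual procedure: the pretest scores are randomly generated stand-ins, the function names are hypothetical, and equal cell sizes (10 students per treatment within each prior knowledge level) are assumed because the article does not report them. A helper for the Kuder-Richardson Formula 21 reliability is included, using the standard formula based on the number of items and on the mean and variance of total scores.

```python
import random
from collections import Counter
from statistics import mean, pvariance


def kr21(total_scores, n_items):
    """Kuder-Richardson Formula 21 reliability estimate from total scores.
    Uses the population variance; some texts use the sample variance."""
    m = mean(total_scores)
    var = pvariance(total_scores)
    return (n_items / (n_items - 1)) * (1 - m * (n_items - m) / (n_items * var))


def assign_by_prior_knowledge(pretest, treatments=(1, 2, 3, 4), seed=0):
    """Rank students by pretest score, split them into thirds
    (high / medium / low), then randomly assign treatments within each third.
    Equal cell sizes are an ASSUMPTION of this sketch."""
    rng = random.Random(seed)
    ranked = sorted(pretest, key=pretest.get, reverse=True)
    third = len(ranked) // 3
    levels = {
        "high": ranked[:third],
        "medium": ranked[third:2 * third],
        "low": ranked[2 * third:],
    }
    assignment = {}
    for level, students in levels.items():
        shuffled = students[:]
        rng.shuffle(shuffled)  # random order within the prior knowledge level
        for i, student in enumerate(shuffled):
            assignment[student] = (level, treatments[i % len(treatments)])
    return assignment


# Hypothetical pretest scores for 120 students on a 36-item test.
score_rng = random.Random(42)
pretest = {f"S{i:03d}": score_rng.randint(5, 36) for i in range(120)}

assignment = assign_by_prior_knowledge(pretest)
cell_sizes = Counter(assignment.values())
print(sorted(cell_sizes.items()))
print(f"KR-21 of the hypothetical pretest: {kr21(list(pretest.values()), 36):.2f}")
```

Blocking on prior knowledge before randomizing keeps the three knowledge levels balanced across treatments, which is what allows prior knowledge to be analyzed as an independent variable.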
Alpha was set at the .05 level for each analysis of variance. Where significance was found to exist, Tukey's Wholly Significant Difference (WSD) for comparison among the means was utilized. The comparisons generally indicate that Treatment 4 (shading on pictures, overt rehearsal, with visual testing) was the most effective in facilitating information acquisition. Table 2 (see page 208) shows where significant differences occurred in achievement among students possessing different prior knowledge levels (High, Medium, Low) on the five criterion measures. The blank areas which exist where you would expect to find 2>3 on the different criterion tests indicate that significant differences in achievement did not occur between Treatments 2 and 3. These results would seem to indicate that the low prior knowledge group is the group which is most positively influenced by the different instructional strategies. Additionally, Treatment 4 would seem to be the instructional format which was most instrumental in reducing the effect of differences among students possessing different prior knowledge levels.
The equal sign (=) does not mean that the means are equal, but that they fall in close approximation to one another "under the normal curve" so that achievement differences may be considered insignificant.
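For readers unfamiliar with the analysis of variance step named above, the following sketch computes a one-way F ratio for four treatment groups. It is a simplification offered for illustration: the study used a factorial design with follow-up comparisons, neither of which is reproduced here, and the scores and the function name are hypothetical. The resulting F would be compared against a tabled critical value at the .05 level for the reported degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical composite test scores (out of 60) for Treatments 1 to 4.
hypothetical_scores = [
    [31, 35, 29, 33, 30],  # Treatment 1
    [32, 34, 30, 31, 33],  # Treatment 2
    [36, 38, 35, 37, 34],  # Treatment 3
    [42, 45, 41, 44, 43],  # Treatment 4
]

f_ratio, df_b, df_w = one_way_anova_f(hypothetical_scores)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

A significant F only indicates that at least one group mean differs; a follow-up procedure such as the Tukey comparison mentioned in the text is still needed to locate which treatments differ.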

RESULTS
Among students who received the reading summaries treatment, differences between those evaluated by visual tests and those evaluated by verbal tests were not significant. However, in evaluating the differences between Treatments 3 and 4, Treatment 4, the Shading on Drawing (Overt Rehearsal) Visual Test, was found to be significantly more effective than Treatment 3 on all the criterion measures.

DISCUSSION
Considerable research on the design and use of visuals has shown that visualization can significantly improve students' learning from prose instruction (De Melo, 1980; Dwyer, 1978; Dwyer & Parkhurst, 1982; Levie & Levie, 1975, 1971). The review of literature for this study indicated that rehearsal strategies and the visual test mode are important instructional variables. The present study went one step beyond merely establishing the importance of visualization by attempting to determine: a) the effect of different rehearsal strategies used to complement visualized prose instruction; b) the effect of visual testing in retrieving information; and c) the effect that different instructional strategies have on students possessing different prior knowledge levels.
Results of this study indicate that not all types of rehearsal strategies are equally effective in facilitating student achievement of different educational objectives (Figure 3, page 211). The general trend also seems to indicate that overt rehearsal is more effective than covert rehearsal in facilitating student achievement. Additionally, it was found that within the overt rehearsal treatments, when significant differences were found to exist, students who received the visual test mode achieved significantly higher scores than did students who received the verbal tests. The higher scores on the visual tests by students in Treatment 4 may be explained by the encoding specificity principle (Battig, 1979; Nitsch, 1977; Tulving, 1979) since the visual test situation in this study closely matched the learning situation; the visuals employed in the test situation provided the critical cues needed by students to retrieve the encoded information (Bransford, 1979). These results are also congruent with the sign-similarity hypothesis (Carpenter, 1953), the stimulus-generalization hypothesis (Hartman, 1961), the cue summation theory (Severin, 1967), and the transfer-appropriate processing principle (Morris, Bransford, & Franks, 1977).
In comparing the results of the verbal-visual mode of testing on the different criterion measures (Table 3) insignificant differences were found to exist between students in Treatments 1 and 2. Two possible explanations may be proposed for this finding: a) the reading summaries treatments (covert rehearsal) apparently did not provide enough maintenance rehearsal to allow for additional elaborative rehearsal that would encode more information from short-term memory into long-term memory; and b) performance on the visual test may have been influenced by the fact that visual tests are rather unfamiliar to most students and their performance on them suffered accordi n g l y .

Comparison of Mean Achievement on the Criterion Tests.
In examining the effect of the verbal-visual testing mode (Treatment 1 vs. Treatment 2 and Treatment 3 vs. Treatment 4: Table 3), no significant differences were found to exist between Treatments 1 and 2 on any criterion measure; however, Treatment 4, the visual test format, was found to be significantly more effective than its counterpart, the verbal testing format, on all criterion measures. Apparently, the visual rehearsal, which required students to interact overtly with the instructional content by coloring and drawing on the visuals, functioned to provide appropriate conditions for encoding; thus, it was possible for students to retrieve more information on the visual criterion tests. This finding seems to support the encoding specificity principle (Tulving & Thomas, 1973), which contends that recognition memory is better if the cues used in the original instruction/acquisition environment are used in the testing/retrieval environment. Additional related research tends to support this position (Nitsch, 1977; Battig, 1979; Morris, Bransford & Franks, 1977; Dwyer & De Melo, 1984; Dwyer & Dwyer, 1985).
In examining the effect of the different treatments on students possessing different levels of prior knowledge (Table 2), the results indicated that Treatments 1 and 2 had moderate effects in reducing prior knowledge differences on the criterion tests; when such reductions did occur, differences between the medium and low prior knowledge levels were affected. For example, for students in Treatments 1 and 2 significant differences between students in the medium and low prior knowledge levels were reduced on both the drawing and terminology tests, and for treatment on the composite test and for Treatment 2 on the comprehension test. Treatment 3 was effective in reducing all differences between the low and medium prior knowledge levels. On three criterion measures differences between the low and medium prior knowledge levels were reduced. Treatment 4 (shading on pictures, visual test) influenced student performance on all the criterion tests dramatically by reducing performance differences attributed to levels of prior knowledge. The success of the treatment may be an indication of the dual coding which occurred (Paivio, 1971): verbal encoding from the interaction with the verbal test and visual encoding from the repeated interaction with the visualized content. This repeated interaction resulted in a greater amount of information being encoded relative to the functions of the heart and the relationships between the parts of the heart. It may be that students, by their sustained and repeated interaction with the visuals in Treatment 4, maintained more information in short-term memory for longer periods of time, which allowed for more elaboration (by students of all levels) and storage in long-term memory (Craik & Watkins, 1973). This is consistent with Travers' (1982) statement that short-term memory is where information is organized and prepared for long-term memory.
The achievement of higher scores did require a higher level of processing, more elaborations, more "links," more reflection, etc. In this regard, the data indicated that more information was encoded in the visual rehearsal situation as tested by the visual criterion tests.

SUMMARY
The results of this study reveal that not all types of rehearsal strategies used to complement visualized instruction are equally effective in facilitating student achievement of different educational objectives and that visual testing is an important instructional variable for facilitating the retrieval of optimum amounts of acquired information. The finding that it is possible to develop an instructional-evaluation strategy (i.e., Treatment 4) which can reduce differences attributed to prior knowledge levels is significant for future instructional design and development activities.
On the practical level the results seem to indicate that providing students with visualized instruction and a covert activity (reading summaries: Treatments 1 and 2) has only a "mild" effect in reducing differences among students possessing different prior knowledge levels, regardless of the type of testing format (visual/verbal) they receive. Significant differences between students in the low and medium prior knowledge groups were eliminated in Treatment 3, where students shaded pictures and were evaluated by means of the verbal test format. The most significant results were realized in Treatment 4, where virtually all significant differences among students in the low, medium, and high prior knowledge levels were reduced when students shaded pictures and were evaluated by means of the visual test format. This finding has significant implications for the designers of instructional software to be delivered on computers and interactive video technologies, where visual/graphic packages are readily available to generate visually complemented instruction and test formats. However, it is important that the findings of this study be replicated with larger and more varied audiences before they are generalized too broadly.
DeMelo, H.T. (1980). Visual self-paced instruction and visual testing in biological science at the secondary level. Unpublished 108-117.
Dwyer, F.M., & DeMelo, H. (1984). Effects of mode of instruction, testing, order of testing, and cued recall on student achievement. Journal of Experimental Education, 52, 86-94.
Dwyer, F.M. (1972)

CALL FOR PRESENTATIONS
The theme of the Ottawa-Hull conference is Challenging the Technology.
By exploring the limits of existing technology and through a more intimate knowledge of the capabilities and future directions of modern communications we get the most out of educational applications and challenge technology and ourselves.
The Program Committee is looking for workshops and presentations that address the following:
- Interactive Technologies - for example, innovative use of videodisc and tape
- Instructional Development - innovative approaches in instructional media
- Computer Applications - probably focusing on computer-managed instruction, etc.
- Tele-education - with examples of video, audio augmented, and multi-media approaches
- Media Production - film, video and audio production techniques and applications
Ideally, within each of these areas the schedule would allow for sessions on design, development, delivery and current applications and would try to address the interests of members in K-8, 9-12, college, university and business training, etc.
If you are interested in contributing a workshop or presentation or are interested in a demonstration of a specific application or sitting on a panel in any of these areas or in any other area of your own personal interest, please call or write your Program Committee before January 31st, 1991: Sylvie Vachon - (613)

INTRODUCTION
An Educational Expert-System can be described as a software programme including specific domain knowledge and a tutor capable of solving a learner's task. The ultimate goal of such a system can be defined as rendering the computer "capable of entirely autonomous pedagogical reasoning", that is, claiming domain as well as instructional expertise (Wenger, 1987, p. 5). THERMEX is an attempt towards the realization of this goal, where the domain expertise is Thermodynamics and the instructional expertise follows a Socratic method. In this method the tutor leads a student through a sequence of questions intended to make the student formulate correct general principles by examining the validity of hypotheses, by discovering contradictions (diagnostic phase) and by extracting correct inferences from known facts (correcting phase) (Collins, 1977; Wenger, 1987). It can be seen as a rule-based decision-making procedure in which the individual learner is confronted with and allowed to correct misconceptions. In THERMEX, the diagnostic phase is guided by the explicit task questions and the correction phase refers to the heuristics following the identification of errors. The justification for applying a Socratic approach lies in the nature of Thermodynamics, which requires an explicit conceptualization or at least formulation of general principles, here labelled qualitative reasoning, before tackling the quantitative procedures of a task. Smith (1987), in his meta-analysis of current instructional strategies for engineering education, states that learning effectiveness can be facilitated by providing the student with learning strategies which stress the use of simple heuristics closely related to the studied subject-matter, visual and verbal mapping, computer programming and reasoning sessions with peers and domain experts.
In much engineering education, these aspects have been neglected and it is suggested that engineering departments change their approach from stand-up lectures to active learning environments (Smith, 1987).
THERMEX is a software program written in Turbo-Prolog II, which lends itself well to questioning and answering processes. It runs on IBM-PC compatible computers. The following features can be considered as particular to THERMEX:
- the structuring of the subject matter is based upon principles and axioms, including specific heuristics leading to an appropriate choice of hypotheses;
- student errors are classified as either procedural or conceptual;
- the expert model and the student model represent knowledge in the same manner;
- the student model uses a combined approach, applying theories from both the "buggy" and the "overlay" model;
- the tutor model is built according to teaching strategies used by professors in thermodynamics. It forms the basis for both the diagnosis and any attempt to provide the student with an adequate problem-solving strategy of learning.
This article describes the different steps, problems, and findings involved in the development of THERMEX. A short review of supporting literature is presented in order to illustrate the underlying instructional and modeling methods applied in THERMEX. The main components of THERMEX and their interrelationships are described. Finally, the formative validation procedure and outcomes are discussed. These will constitute our foundation for future research and development.

AN OVERVIEW OF EDUCATIONAL EXPERT SYSTEMS
Recent studies in Artificial Intelligence have advanced knowledge about how people learn and how experts solve problems. It is widely accepted that intelligence is the capability of formulating and solving problems and that solving problems is best attained through a heuristically guided search among alternatives (Lenat, 1988;Haugeland, 1985). Expert systems, considered as a branch of artificial intelligence, are domain specific problem-solving systems containing a knowledge-base from which correct decisions within the specified field can be made. Intelligent Tutoring Systems (ITS) can be seen as an educational domain specific tutor expert and must therefore include an instructional knowledge base as well as a domain specific knowledge base (Dede & Swigger, 1988). Dede and Swigger (1988) argue that an ITS should be able to adapt itself to different student learning styles and that this can best be attained by using a flexible student model built on "on-task" and continuous diagnosis of the student's misconceptions.
The past ten years show an increasing number of articles dealing with the development and implementation of educational expert systems in science teaching:

1) Brown, Burton and de Kleer (1982) developed an interactive learning environment, SOPHIE, in an advanced electronic trouble-shooting course. They convincingly argue the benefits of first employing qualitative reasoning about general principles of the domain before trying a quantitative solution of the task to be resolved.
2) Bottino, Forcheri, and Molfino (1986) constructed ESCORT, which teaches group theory, a subject that demands not only knowledge of modern algebra but also the ability to reason abstractly towards an acceptable solution. They contend that abstract or qualitative reasoning will help a student to a better conceptual understanding of the subject matter.
3) Slater and Ahuja (1987) produced MACAVITY, an expert tutor for rigid-body mechanics, focussing on the expert's knowledge representation and explanatory facilities for the student. MACAVITY is competent in answering questions through an automatic generation of a code system, which includes the required action. They argue the importance of including the option for the student to get help in the form of, for example, definitions of expressions, concepts, principles, laws, etc.
In summary, expert systems have a structure in which four necessary components can be distinguished (Kearsley, 1987; Becker, 1988): an Expert Model; a Student Model; a Tutor Model; and an Interface. The following sections treat the context for the intended use of THERMEX and describe the Expert Model, the Student Model, the Tutor Model and the Interface.

Architecture of THERMEX
At Sherbrooke University, Thermodynamics is an obligatory undergraduate course for engineering students. About 350 students enroll per year and the course consists of a conceptual part (39 hrs), an applied part (12 two-hour exercise sessions) held by T.A.'s, and three exams. Classical Thermodynamics deals with the relation of heat and work in different states of a dynamic system and is defined as the "Science of Energy and Entropy" (Van Wylen, 1978). The general objective of the course is "to acquire and to apply thermodynamic concepts relative to systems and substances". The subject-matter appears to be difficult to grasp and therefore emphasis has always been put on providing the students with adequate heuristic strategies to facilitate their conceptual and procedural understanding. However, the current oversized classes and the insufficient time allotted put unreasonable pressure on the teachers and the T.A.'s; consequently, provision of individualized instruction is inadequate.
THERMEX is designed to assist students in their attempts to learn the basic concepts and to use appropriate procedures to solve thermodynamics problems. It can thus be likened to a teaching assistant. THERMEX is based on exercises from the French version of the course book "Fundamentals of Classical Thermodynamics" by G.J. Van Wylen (1978), widely used in North America.
A second source of thermodynamic problems, used in THERMEX, is selected final exam problems from the past ten years.
THERMEX provides a learning environment in which the locations of the students' errors are diagnosed through heuristic techniques; that is, the learner has to answer sequential questions pertinent to the chosen exercises. THERMEX assumes that the student has previously attempted a solution and failed. The goal of the diagnostic procedure is to lead the student to an appropriate method of solving a thermodynamic task. When the student fails, THERMEX assists by giving hints in the form of pertinent questions. Figure 1 (a & b) (see page 221) shows the context of THERMEX and the relationships between the learner and the software.

Expert Module
As a first step in the construction of the expert model, an analysis of the subject-matter in thermodynamics was carried out, using an approach proposed by Clancey (1986), which includes the representation of a formal domain knowledge (e.g. algebraic and/or geometrical expressions) and a natural domain knowledge (diagnostic and/or strategic). Formal domain knowledge can be considered to be of algorithmic nature, whereas the natural domain knowledge is seen as heuristic: "the expert's rule of thumb". Several content specialists were involved to ensure a more accurate knowledge representation. Once the expert model was formally planned, the same "experts" verified and commented on the rule-like representation that was suggested.
The undergraduate course in engineering thermodynamics is of the formal type, where the classical axioms are taught and applied. Although the exercises to be solved are already highly formalized, stress is put on learning a problem-solving strategy, for example formulating correct hypotheses, defining which states are to be looked at, what information is to be discarded and what is to be included. When these steps, which constitute the domain knowledge, are mastered, the student is ready to apply the formal knowledge, that is, the axioms pertinent to thermodynamics. Thus, it was concluded that both formal and natural knowledge representations were needed.

Description of Knowledge Representation
As stated above, thermodynamics is dependent upon axioms and theorems that can easily be described by rules. On the other hand, thermodynamics manipulates entities that entail properties which in turn define the entity itself. These entities are represented by objects where the common properties form the attributes. Therefore, the expert model represents the knowledge in a composite manner, regrouping objects and rules into classes and subclasses. In this manner, the hypotheses of the real task form the attributes of the system object, and the domain knowledge is represented by a hybrid of regrouped objects and rules. Figure 2 (see page 223) shows a schematic view of this type of representation.
More precisely, the software will then manipulate these classes, here transformations, states, and procedures. The first class is related to the thermodynamic system in question.
This system has a set of sub-systems which are defined in the problem statement of the exercise. The attributes of the class system include the nature of the thermodynamic system, which can, in this case, take on three possible values: a closed system; a steady-state, steady-flow process; and a uniform-state, uniform-flow process. Other classes will define the thermodynamic states and transformations comprising their attributes (see Fig. 2). Each class is linked with specific procedures in accordance with what was defined in the other related classes, building up the necessary conditions for the thermodynamic system under treatment. To explain the class procedure more explicitly, let us consider the following examples:
EX. 1: IF the system is closed AND the transformation is reversible AND the transformation is adiabatic AND the system contains ideal gas THEN the relation P1*V1**K = P2*V2**K holds.
EX. 2: IF two independent properties are known for a specified state THEN all other properties can be calculated.
EX. 4: IF a relation includes n variables AND (n - 1) variables are known in the knowledge base THEN the nth variable can be computed.
The knowledge base presented up to this point is general for all the exercises in THERMEX. The correct hypotheses will be provided for each specific exercise. There are qualitative hypotheses, such as the nature of the system or the type of transformation, and quantitative hypotheses, like numerical values of the properties.
Thus, the knowledge base defines a set of axioms, rules, and relationships between objects, and forms the logic program. Computation of a logic program is the deduction of all possible consequences of the program. The inference engine obtains a set of consequences and among them an appropriate solution can be deduced.
However, these formally obtained conclusions are not necessarily useful for optimization of a diagnostic and misconception-based tutoring system. In fact, using this type of axiomatic strategy in a "trial and error" way would make the system tedious and difficult to handle. Therefore, an expert-related strategy is imposed by the system, which both implicitly and explicitly urges the low-achieving student to use an adequate method for solving the given problem. The solution comprises the definitions, the inventory of hypotheses, the necessary relationships and the sequential application rules, that is, the natural reasoning procedures.

Student Model
One of the main concerns for current researchers and developers of educational expert systems lies in how to "model the student". Brown and Burton (1978) developed the misconception-based system, Sleeman (1981) the rule-based diagnostic system, Goldstein (1979) the "overlay model with importance weights", and Becker (1988) a misconception-based system with a decision tree. Inspired by these models, THERMEX combines the "buggy" and the "overlay" model in order to refine the diagnostic procedure, which leads to a stepwise guide adapted to the student. It is believed that this method will promote student reflection, which can be seen as a desired higher order learning function.
To ensure a more accurate model of the student, a group of students were individually videotaped during four two-hour sessions. The students were asked to do their weekly thermodynamic exercises and to verbalize every step they took to solve the task. These videotapes were analyzed and resulted in a list of errors. These were classified into strategic, conceptual and computational procedural errors. In this study, emphasis was put on how students went about solving their problem and why they would block. This analysis resulted in valuable information for the creation of both the student and tutor model.
Another source of useful information for the construction of the student model came from the analysis of 50 student exams where errors were categorized in the same manner.
The student model was developed on these observations. Thus, THERMEX views the student according to the following statements:
- selection and/or omission of hypotheses concerning the system, the states, and the transformations;
- the choice of mathematical expression(s) and their relationships with the chosen hypotheses;
- numerical values, such as units used and logic signs; and
- definitions of the properties of the system as a whole.
Within each of these groups, different categories of errors can be found and were classified with the aid of rules and "mal-rules", adding to a more complex design.
To further illustrate how THERMEX perceives the learner, the model could be depicted by its mathematical expressions, the hypotheses and the conclusions the student proposes.
The student model uses the same formalism for knowledge representation as is applied in the expert model. It is stressed here that this model is a combined approach, that is it uses both the "overlay" and the "buggy" model.
The "overlay" model can be identified as the verification of the student's knowledge compared to the expert's. For example, if the student proposes the application of mass balance, or the given hypotheses pertinent to the exercise, then these are the elements of an "overlay" model.
On the other hand, the utilization of a bad relationship (mal-rule) can be detected by verifying the hypotheses in connection with the conclusion and therefore constitutes the application of the "buggy" model. An example of a mal-rule is:
IF the transformation is adiabatic THEN the temperature stays constant.
It appears that the construction of a pre-determined error bank, including known mal-rules, does accelerate the diagnostic procedure. In THERMEX a certain number of mal-rules are defined and it would be interesting to find a way of progressively increasing this bank whenever new mal-rules are detected. Becker (1988) proposes to make this error bank individual in order to create a student "on-task" history, thus increasing the individualistic capacities of the ITS and thereby rendering the system more adaptive.
Smith (1987), as earlier mentioned, argues that low-achievers especially benefit from learning a qualitative reasoning strategy before attempts are made to do quantitative solutions. The target learners for THERMEX are low-achievers; that is, they received a grade lower than 45% on the first midterm exam. The main goal is to provide the student with efficient learning strategies appropriate to the subject-matter (in the present case thermodynamics). Thus, the instructional strategy adopted in THERMEX includes an individualized diagnosis in the form of questions tailored to each problem and depending on the errors committed by the learner. Once this initial diagnosis is carried out, the "tutor" selects the appropriate remedial strategy. The selected strategy then leads the student through the different steps of the solution, again by questions. This approach attempts to fulfill the assumptions of Socratic tutoring (Wenger, 1987, p. 39).
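The combination of the "overlay" and "buggy" models in the student model can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the THERMEX code: the exercise hypotheses in `EXPERT_HYPOTHESES`, the `diagnose` function and the data layout are hypothetical, and only the adiabatic mal-rule is taken from the text.

```python
# Hypothetical expert hypotheses for one exercise (not from the article).
EXPERT_HYPOTHESES = {"closed system", "adiabatic", "ideal gas"}

# Error bank of known mal-rules: (premises, wrong conclusion).
# The single entry is the example given in the text.
MAL_RULES = [
    ({"adiabatic"}, "temperature stays constant"),
]

def diagnose(student_hypotheses, student_conclusions):
    """Return (overlay, bugs) for one student proposal.

    overlay: comparison of the student's hypotheses with the expert's.
    bugs: mal-rules whose premises and conclusion both appear in the proposal.
    """
    proposed = set(student_hypotheses)
    overlay = {
        "mastered": sorted(proposed & EXPERT_HYPOTHESES),
        "missing": sorted(EXPERT_HYPOTHESES - proposed),
        "superfluous": sorted(proposed - EXPERT_HYPOTHESES),
    }
    bugs = [
        (sorted(premises), conclusion)
        for premises, conclusion in MAL_RULES
        if premises <= proposed and conclusion in student_conclusions
    ]
    return overlay, bugs
```

Keeping the error bank as plain data is one way to let it grow when new mal-rules are observed, which is the extension the text suggests.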

Tutor Model
The diagnosis is divided into three parts. First, THERMEX asks the student which parts of the exercise have been attempted; secondly, it asks the student to give numerical values of given and computed data; thirdly, it asks the student to define the physical properties (the hypotheses) of the thermodynamic system. This information forms the outer limits within which a stepwise heuristic guide takes over. The purpose of this guide is to point out to the student where he goes wrong. It tells him to verify the problem statement, given data, his computations, the hypotheses, etc. (a built-in dictionary of definitions and explanations of expressions is available on command to assist the student in verifying his solution). By doing all this in a carefully structured manner, it is hoped that the student will identify and correct the errors. If the student fails more than twice, then correct answers are provided stepwise until the exercise is fully solved. This method was adopted in order to increase the learning efficiency of THERMEX. However, a chance is left for the student to continue whenever the misconception(s) seem(s) to be cleared up, as far as the "tutor" can judge.
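The escalation policy described above (hints first, the correct answer only after more than two failures) can be sketched as a small function. This is a hedged Python illustration; `run_question`, `MAX_FAILURES` and the outcome labels are hypothetical names, and answers are compared as plain strings for simplicity.

```python
MAX_FAILURES = 2  # the text says answers are given if the student "fails more than twice"

def run_question(correct_answer, attempts):
    """Return (outcome, failures) for one diagnostic question.

    outcome is "solved" when an attempt matches the correct answer,
    "answer_given" once the student has failed more than MAX_FAILURES times,
    and "hint" when the attempts run out before either happens.
    """
    failures = 0
    for attempt in attempts:
        if attempt == correct_answer:
            return "solved", failures
        failures += 1
        if failures > MAX_FAILURES:
            return "answer_given", failures
    return "hint", failures
```

In an interactive program the attempts would arrive one at a time from the menu interface; passing them as a list keeps the sketch easy to check.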

Description and Examples
Once the exercise is chosen, THERMEX determines, by questioning the student, on which part of the exercise the learner needs help. Each exercise is divided into 3-6 main questions, which are displayed in a menu (Fig. 3), where the student can easily mark which questions the learner has attempted. In this way, the task of finding out precisely where aid is desired is also facilitated. These exercises are displayed using the same indications as in the course book, so that the student can immediately recognize which exercise is assigned for whatever week the learner is in. The next step is to compare the main numerical values, both those given in the problem statement and those computed by the student, as well as to identify which formulas he has proposed. The formulas are numbered in the same way as in the course book; for example, if "3.4" is displayed in the menu the student knows that it means "PV=nRT" or "PV=mRT", which are alternative ways of finding "PV". Further, a comparison of the proposed hypotheses and the physical properties pertinent to the exercise takes place. These types of errors are classified as conceptual and stem from the course objectives, the experience of the professors and the analyses of the student exams. Hence, if the results are correct, the next question is considered until the blockage point is found. This technique provides the diagnosis; then the tutoring takes over. The information gathered from the diagnosis serves to brief the student on what type of errors the learner has committed.
If none of the important concepts are absent or omitted, the programme lets the student continue, but stores whatever mistakes are committed. These mistakes are, for example, superfluous hypotheses, unnecessarily proposed relationships, small computational errors, etc. If a numerical error is detected, the expert-system highlights the wrong number and comments on how the error appears to be classified (for example as a copy/typing error, as a wrong unit, or as a miscalculation), and prompts the student to verify these, but lets the learner go on. THERMEX does not consider these types of errors important enough to force a stop. However, a summary of them is given at the end to further make the student aware of diagnosed errors. The program does not furnish the exact numerical value(s), but rather leaves it up to the student to calculate these outside the program.
Qualitative reasoning, that is, knowing the concept of the thermodynamic system in terms of characteristics such as open-closed, adiabatic, etc. (see Figure 3 on page 228), is to be understood before attempting a quantitative solution. This strategy is supported by the analyses of the videotapes, where students erroneously tried to put numerical values "into any old formula" before defining the thermodynamic system and thus missed out on understanding the problem altogether.
Since THERMEX is directed towards students having difficulties, stress is put on learning an adequate strategy to solve thermodynamic tasks. To help the student in this task, the "tutor" analyzes steps taken by the student to solve the tasks. Thus, if, for example, the student proposes the correct hypotheses, but does not know how to use them, the expert-system first reminds the learner that the hypotheses are correct and then points out what relationships are compatible with these hypotheses. If the learner omits information, then the THERMEX "tutor" suggests: "read the problem statement again". If this procedure does not clarify the concepts to be used, then THERMEX indicates a correct procedure.
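The choice of remedial feedback described in the last paragraph can be summarised as a small decision function. The sketch below is in Python and rests on assumptions: the boolean inputs, the `select_feedback` name and the returned message labels are hypothetical, and the branching only paraphrases the three cases named in the text.

```python
REREAD = "read the problem statement again"
SHOW_PROCEDURE = "indicate a correct procedure"
POINT_TO_RELATIONS = "confirm hypotheses and point out compatible relationships"
CONTINUE = "continue"

def select_feedback(hypotheses_correct, relations_known, info_omitted, reread_done):
    """Pick the tutor's next move from the diagnosis of the student's steps."""
    if info_omitted:
        # First ask the student to reread; escalate only if that did not help.
        return SHOW_PROCEDURE if reread_done else REREAD
    if hypotheses_correct and not relations_known:
        return POINT_TO_RELATIONS
    return CONTINUE
```

The ordering of the checks (omitted information before unused hypotheses) is a design choice of this sketch, not something the article specifies.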
In the case where the student blocks from the start, the "tutor" suggests a convenient content-related strategy for solving thermodynamic problems. This strategy is used by professors and teaching assistants and is also fundamental to the tutoring system, but not explicit until the blockage point is found. This strategy can be outlined in a few statements:
- to define the thermodynamic system(s) of the problem;
- to identify the principal hypotheses concerning this system;
- to name the essential relationships according to the chosen hypotheses which are appropriate to the exercise; and
- to correctly apply these relations.
Here again, as indicated above, the assumed instructional strategy stresses that answers are not given directly; instead, THERMEX tries to help the student find them through heuristically formulated feedback.

Interface
All computer assisted systems include an interface allowing communication between the system and the user. Constructing an interface based on natural language is a very difficult process. The biggest problem appears to lie in foreseeing and dealing with the individual learner's way of thinking and phraseology. In order to reduce these types of problems, a menu driven interface is adopted in THERMEX.
For example, when defining the physical properties of a thermodynamic system, the learner can choose from a menu of keywords, including all possible hypotheses, by moving down or up with the help of the arrow keys. Different function keys are assigned to either get help, a definition of a certain concept, or to get the problem statement on screen. The return key is used to confirm whatever the user proposes. These features are consistently applied throughout the program and shown in a status line at the bottom of the screen.
Numerical values are verified through a process whereby the answer to a specific question is compared to the exact value within a 10% miscalculation limit.
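The numerical verification step can be illustrated with a short tolerance check. This is a minimal Python sketch; the article does not say how the 10% limit is computed, so measuring it relative to the exact value is an assumption here, as are the `within_limit` name and the handling of an exact value of zero.

```python
def within_limit(student_value, exact_value, limit=0.10):
    """Accept the answer if it deviates from the exact value by at most `limit`.

    Assumption: the deviation is taken relative to the exact value.
    When the exact value is zero, only an answer of zero is accepted.
    """
    if exact_value == 0:
        return student_value == 0
    return abs(student_value - exact_value) <= limit * abs(exact_value)
```

For example, an answer of 105 against an exact value of 100 would be accepted, while 120 would be flagged for the student to verify.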
Dialogues and comments are always shown in the bottom area of the screen, in a window with a different color (see Figures 4 and 5 on page 230). Dialogues and comments are continued by a "yes", "no" or <RETURN> statement.

FORMATIVE EVALUATION
The goal of this formative evaluation was to obtain initial reactions to the instructional strategy used in THERMEX directly from the target learners.
One third of the low-achievers (midterm grade < 45%) volunteered for this formative evaluation, which lasted for 8 consecutive weeks, 3 hours at a time. Like the rest of the class, these students were assigned a certain number of thermodynamic exercises each week. They were told to attempt a solution on paper and to use THERMEX as a "teaching assistant" who could help answer questions and verify steps. An average of two exercises per session was solved this way. The students were directly observed using a checklist concerning the technical, instructional and conceptual qualities of THERMEX.

Findings
Observations brought to light the following points:
-The students tended to use concepts at random, without complete understanding. Since THERMEX forced the learner to explicitly state the concepts to be used, the learners were able to identify and correct their omitted or misunderstood concepts. For example, when an open system was assumed, confusion was observed when the student had to distinguish between input and exit states versus initial and final states. THERMEX benefitted from these findings since new explanations could be added to the software.
-The students also tended to use mathematical expressions (thermodynamic formulas) at random without verifying the specific conditions under which these formulas could be applied. THERMEX detected these types of errors through the use of the "mal-rules" in the student model. In this case, THERMEX forced a justification procedure, whereby the student had to identify, step by step, all the necessary operational conditions for the suggested relationships. Most of these "mal-rules" were represented in the error bank, which in turn was used to help the student clarify the misconception(s) that were employed. Through this formative evaluation it was possible to identify more of the common "mal-rules", and the error bank was expanded.
-It was encouraging to note that students did indeed take care in choosing between the options of the different menus. Most of them read carefully and reflected on what would be the most appropriate choice. This type of continuous reflection was perceived to implicitly reinforce strategic steps as well as the subject-matter, because it involved them in verifying definitions and meanings of the concepts presented.
-The fact that THERMEX was capable of indicating numerical errors related to signs or magnitude showed that students often do not question the results obtained; e.g., 1.234 instead of 12.34. The students appreciated this feature, since, when the error was pointed out to them, they could usually distinguish and correct it immediately. This was seen as a time-saving feature, and they thought THERMEX was more effective in this sense than a human T.A. They also believed that the use of THERMEX increased their efficiency of learning, that is, they perceived it as a time-saving aid. These points raise questions that can hopefully be answered by the summative evaluation. The fact that THERMEX is computer-mediated did not inspire any fear at all in these students. It should be noted here that the thermodynamics course is preceded by a course in computer programming, which possibly explains this fact.
-The students consistently attempted to solve their exercises completely, and appreciated the comments and encouragements displayed by THERMEX. Even when it was a question of a simple calculation error, they returned to the beginning until obtaining the correct answer(s).
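The sign and magnitude checks mentioned in the findings above could be sketched as follows. This is an illustrative assumption about how such a check might work, not a description of the THERMEX code: the function name `classify_numeric_error`, the category labels, the range of power-of-ten shifts tested, and the reuse of the 10% limit from the interface description are all choices made for this sketch.

```python
def classify_numeric_error(answer: float, exact: float, limit: float = 0.10) -> str:
    """Classify a numerical answer as 'correct', 'sign', 'magnitude' or 'other'."""
    if exact == 0:
        raise ValueError("this sketch assumes a non-zero exact value")

    def close(a: float, b: float) -> bool:
        # Relative comparison within the assumed 10% limit.
        return abs(a - b) <= limit * abs(b)

    if close(answer, exact):
        return "correct"
    if close(-answer, exact):
        return "sign"
    # A misplaced decimal point shows up as a power-of-ten factor.
    for shift in (-3, -2, -1, 1, 2, 3):
        if close(answer * 10 ** shift, exact):
            return "magnitude"
    return "other"


# The example from the text: 1.234 entered instead of 12.34.
print(classify_numeric_error(1.234, 12.34))
```

Separating the categories in this way would let the feedback name the kind of slip (a sign or a decimal point) rather than only reporting that the value is wrong.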
In summary, this formative evaluation supported the hope that students appreciated the instructional strategy applied in THERMEX. They perceived it as a time-saving tool which provided them with adequate information about the subject-matter and related methods to solve problems when compared to help given by a teaching assistant or a copy of a solution of the thermodynamic exercise.
The students also valued the capacity of THERMEX to give specific feedback to each class of errors. In this sense, it appears that THERMEX could provide an individualized learning environment where a problem-solving strategy might be developed.
The observations provided valuable information on where more dialogues, extensions of the error bank, and the mal-rules are needed to refine the diagnosis and to increase the effectiveness, efficiency and adaptability of the expert-system. With these modifications it is believed that THERMEX can be submitted to a true experimental situation, where changes in student performance can be quantitatively as well as qualitatively compared and measured.
All through the construction of THERMEX, the main concern was to find out whether the proposed student model would be sufficiently precise to display a helpful diagnosis of the learners' misconceptions and to provide an appropriate remedial strategy. The validation procedures confirmed that most of the students' erroneous behaviors were, in fact, correctly identified by the student model. One of the recognition difficulties encountered is the case where a student suggests a resolution that will actually lead to a correct answer, but goes about it in a slightly different way than the expert, that is, than the way it is represented in the knowledge base. These differences refer especially to unexpected intermediary expressions utilized by the student.
It was observed several times that a learner understood some of the important relationships precisely but did not declare one or two intermediary equations, although he employed them, hence confusing the "tutor" into believing that the learner did not know the intermediary equations. This problem was overcome by delaying the error comments on important steps until the whole question was treated. Therefore, if the final computational results are correct, the "tutor" will assume that the student did understand and correctly use these omitted intermediary expressions. This technique permits alternative strategies in obtaining results and focuses on the important conceptual and procedural steps.
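The delayed-comment technique described above can be illustrated with a short sketch. It is a simplified assumption about the logic, not the THERMEX rule base: the function `deferred_feedback`, its parameters, and the wording of the comments are hypothetical.

```python
def deferred_feedback(declared_steps, expected_steps, final_result_ok):
    """Return the comments to display once the whole question has been treated.

    Comments about undeclared intermediary equations are held back; they are
    shown only if the final computational result turns out to be wrong.
    """
    missing = [step for step in expected_steps if step not in declared_steps]
    if final_result_ok:
        # Correct final result: assume the omitted steps were understood and used.
        return []
    return [f"Check the step: {step}" for step in missing]


expected = ["mass balance", "energy balance"]
print(deferred_feedback(["energy balance"], expected, final_result_ok=True))
print(deferred_feedback(["energy balance"], expected, final_result_ok=False))
```

Holding the comments until the end is what allows a student to reach a correct result by a route that differs from the one stored in the knowledge base without being interrupted.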
However, it is difficult to foresee and categorize all of these different types of student models; for example, when a student "invents" given numerical information, THERMEX has difficulties understanding the behavior of the student. An example of this type of "invention" occurred when two fluids with different temperatures were mixed together and the sum of the temperatures was entered as the temperature of the mixture. These error models are not random, since they correspond to a mental representation of the student which is a conceptual error. Another error was the creation of new equations or formulas. Since these expressions are entered only through menus, the system cannot detect what the misconception is, because every menu entry is a correct expression. If the numerical value entered by the student is wrong, it is pointed out to him that a calculation error has been committed, but in reality it is a conceptual error, which is not detectable.

CONCLUSION AND FUTURE RESEARCH PARADIGMS
This article has presented the different steps that were taken to develop a particular educational expert-system in which efforts have been put on diagnostic and remedial strategies related to the learning of fundamental concepts and problem-solving methods of classical thermodynamics. It appears that the methodology used in THERMEX could easily be transferred to other academic subject-matters that display approximately the same characteristics as thermodynamics.
THERMEX is conceived for students who have difficulties in conceptualizing thermodynamics, and it was exciting to observe that, after using THERMEX, these students tried to employ the "expert's" strategy of solving a thermodynamic task. The utility of THERMEX will be twofold: acquiring a transferable method of solving scientific problems and filling the void concerning the concepts and objectives of the subject-matter.
However encouraging these first trials with THERMEX were, further research and development are needed, especially in the area of tutor decision-making and student modelling. It is believed that the knowledge representation problem is adequately solved by using rules and objects. The formative evaluation also appears to confirm the adequacy of the basic instructional strategy, although efforts will be put on finding out what, where, and when the student will benefit more from further interventions of the system. As mentioned earlier, it sometimes appeared to be more adequate to delay comments and, in other instances, it seemed better to display corrective comments immediately. These features need to be further researched.
For the time being, the student models consist of a mixed approach including features from both the "overlay" (Goldstein, 1979) and the "buggy" model. It is planned to investigate the possibility of incorporating a "decision tree model" (Becker, 1988) in order to increase and refine the diagnostic capabilities of THERMEX. A "decision tree model" would expand the error bank and restructure related errors in a way that might overcome problems with the student's "invented" information.
Our next step is to carry out a formal summative evaluation where student performance will be quantitatively, as well as qualitatively, measured. It is planned for the winter term of 1990, using the low-achieving students of two groups of about 80 students each, taking the obligatory course in thermodynamics at the University of Sherbrooke.

ACKNOWLEDGEMENTS
Special thanks are extended to IBM CANADA LTD., and the Faculty of Applied Sciences at Sherbrooke University for their financial support of this project. Thanks are also extended to all the participating students, who agreed to test and carry out the validation procedure with THERMEX. We are also indebted to Dr. G. Boyd at Concordia University for offering numerous comments and suggestions on the entire manuscript.

Catch the Wave: The Future is Now
In conjunction with AMTEC'90, the conference organizers put together a publication entitled Catch the Wave: The Future is Now. The spiral bound book contains selected papers from the conference, a set of selected quotes from the keynote addresses and the past writing of keynote speakers, and introductory insights for each section. It is a "must read" for all of us who work in the field of educational technology. Some copies are still available at $25.00 per copy, and cheques should be made payable to AMTEC'90.

Be sure to get yours!
Copies may be ordered from:

The paradigmatic status of educational communications and technology is reviewed in this paper from a cultural anthropology perspective. Kuhn's metaphorical book on paradigms, The Structure of Scientific Revolutions (1970), has itself become a metaphor since its first publication in 1962. Kuhn used citation analysis, a historical method, for building his paradigm model of how empirical thought develops in physics and astronomy. Empiricists from other disciplines borrow his model but they do not apply his methodology. Using rational tools they would not normally allow in their work due to lack of empirical rigor, they describe their own fields as paradigmatic. The Kuhnian metaphorical structure has diffused beyond describing scientific research in physics and astronomy. It is applied to almost any perceived state or desired change, from improvements in computer operating systems to research and theory in educational communications and technology.

KUHN'S MODEL
Kuhn's The Structure of Scientific Revolutions (1970) opposed Popper's principle of falsifiability (Popper, 1934/1968). Described in The Logic of Scientific Discovery, falsification had been accepted for three decades. Popper asserted the superiority of empirical observation over scientific theories accepted on the basis of agreement between authorities. In the Popperian tradition, competition between research strategies is thought advantageous to science. The credibility of steady-state knowledge rests not on dogma but on refinement and replacement by more powerful theories and closer approximations of truth. Kuhn questioned what had become the textbook explanation of continual, logical progress and created his model to explain the results of bibliographic research on the history of science.
Kuhn's model contains three central metaphors: paradigm, anomaly and revolution (1970). The paradigm is an accepted set of rules for knowing about and conducting normal science. The anomaly is the exception that stimulates new explanations that cannot be ignored. The revolution is the emergence of a new paradigm.
Kuhn identified three normal foci of factual scientific investigation (1970, pp. 25-30):
Determination of significant fact - Paradigmatic facts are developed for solving paradigmatic problems as a result of applying research strategies. Measurement occurs with increasingly refined apparatus and methods. Some researchers receive more recognition for developing research tools than for what they find.
Matching facts with theory - This comes from addressing research issues. Increasing the match between theory and nature comes from arguments and/or fact-finding demonstrations rooted in the real world. Kuhn stated: "The existence of the paradigm sets the problem to be solved; often the paradigm theory is implicated directly in the design of apparatus able to solve the problem" (p. 27).
Articulating theory - A paradigm is articulated by looking for universal constraints, quantitative laws and experiments, in a more qualitative sense, that elucidate a phenomenon and relieve ambiguous interpretations.

Kuhn's Model In Action
This model with its triple foci is not Popperian falsification (Kuhn, 1970, pp. 77-80). The historical study of scientific development shows that new basic theories appear in the short revolutionary period of extraordinary science and new paradigms are incommensurable with previous paradigms. In the long periods of normal science, the reigning paradigm restricts researchers to puzzle-solving science in which assumptions are accepted and not questioned. Kuhn (1970) further challenged modern rationality by doubting the neutrality of investigators and suggested the importance of considering sociological and individual psychological aspects. He situated the long-term search for objective truths within the reality of the everyday world: "Lifelong resistance, particularly from those whose productive careers have committed them to an older tradition of normal science, is not a violation of scientific standards but an index to the nature of scientific research itself" (p. 161).
Kuhn's relativistic approach led to debates with other philosophers and historians of science including Popper, who defended the neutrality of scientists, Lakatos, who described rules for sequential theories in scientific research programs, and Feyerabend, who welcomed subjectivity and accepted mysticism (Lakatos & Musgrave, 1970). The choice of empiricism as a way of knowing drew doubts because it is empirically unprovable. Dependency on gathering data through observation became understood as a bias. Science in action claims objectivity but fails any test of neutrality. Debate with Lakatos chased empiricism into a corner as a sociopolitical business with sociopolitical aims when Kuhn demonstrated the importance of consensus in the scientific community for determining facts.
Kuhn's bibliographic analyses increased in particularity over time because each revolution requires a unique explanation. His original metaphors did not fit all situations. He discovered differing paradigms for chemistry and physics that described helium either as an atom or as a molecule (1970, pp. 50-51). Kuhn drew parallels from science to art, from theories to painting styles (1977, pp. 340-351). With the refinement of specificity, Kuhn's model was reduced in scale from paradigms in conflict to theories in conflict. This admitted lack of generality, partly in response to close examinations of the issues, resulted in severe attack. Stegmüller's book The Structure and Dynamics of Theories finds "Kuhnianism" not only relativistic but irrational (1976).
Kuhn had constructed a usable metaphor. The model became popular as scholars in many areas borrowed his structure regardless of its relativistic base (belief in social perspective as reality). It has been applied uncritically by realists (believers in objective reality) and instrumentalists (believers in measurement as reality). Kuhn's model has become dominant. Casti wrote: "With Kuhn we have come to the end of the line as far as contemporary views on the way science operates both to form and to validate its view of the world" (1989, p. 45).
Paradigm has lost its revolutionary fire. It may mean no more than Weltanschauung as a metaphysical, epistemological and methodological perspective of the times. As the author of a classic work, Kuhn has endured the fate of classic authors because his model has been more often cited than read (Adams & Searle, 1986, p. 381).

Kuhn's Model And Social Science
In examining the relationship of Kuhn to the field of educational communications and technology, it is first necessary to review Kuhn's influence on the social sciences, the source of education theories and research methods. The acknowledgement of a paradigm is socially desirable in any discipline as a sign of intellectual adulthood. Just as early psychological researchers had "physics envy" (Gould, 1981, pp. 262-263), some social sciences have been the subject of debates over whether they are really scientific or not (Kuhn, 1970, p. 160). Psychologists claim that understanding paradigm shifts in their field is central to understanding cognitive psychology (Lachman, Lachman & Butterfield, 1979). There was a Kuhnian diffusion of ideas in linguistics:

In accordance with Thomas Kuhn's (1970) description of paradigm changes in the sciences, the Chomsky point of view took over, not by convincing the previous generation it had been in error, but by winning the allegiance of the most gifted students of the succeeding generation. (Gardner, 1986, p. 209)

Similarly, parapsychologists are attracted by the orthodoxy of Kuhnian metaphors (Barnes, 1983, pp. 90-93; Radner & Radner, 198, pp. 62-672). However, Kuhn had ventured:

...it remains an open question what parts of social science have yet acquired such paradigms at all. History suggests that the road to a firm research consensus is extraordinarily arduous. (1970, p. 15)

Despite Kuhn avoiding extrapolation of his ideas to the social sciences, the central argument that more is happening in science than an academic competition of ideas caused Barnes, a sociologist, to write on T.S. Kuhn and the Social Sciences (1983). Barnes describes the dangers of Whig history, of viewing the past as a reflection of the present, and of writing textbooks that present only facts that support current understandings. He writes that Kuhn's ideas have become progressively more conformist, conservative and supportive of the scientific establishment.
Barnes also uses the phrase "intellectual laziness" (p. 120) to show that he does not endorse the dogmatic acceptance of Kuhn's model as an after-the-fact explanation in sociology, economics or psychology. Kuhn (1970) had cautioned:

The members of all scientific communities, including the schools of the "pre-paradigm" period, share the sorts of elements which I have collectively labeled 'a paradigm.' What changes with the transition to maturity is not the presence of a paradigm but rather its nature. Only after the change is normal puzzle-solving research possible. (p. 179)

Barnes' (1983) reflections on Kuhn (1970) illustrate the seductiveness of Kuhn's model. No area wants to be regarded as preparadigmatic when having a paradigm appears to be a measure of social standing. The widespread use of Kuhn's model causes a subtle dislocation. There is a self-contradiction in employing it to assert professional status. Mitchell, an English professor, identified the essence of this interdisciplinary borrowing:

We can always tell which of two crafts outranks the other by looking at its lexicon. Jargon only runs downhill. You will notice that although educators have borrowed "input" from the computer people, the computer people have felt no need to borrow "behavioral objectives" or "preassessment" from the educators. (Mitchell, 1979, p. 106)

CULTURAL EVOLUTION
Kuhn, the historian of science, describes the scientific way of knowing but does not provide a sufficient explanation of the sociopolitical forces driving that way of knowing. Cultural anthropologists, however, specialize in that type of problem. They describe what people do and what they say they do and construct explanations for beliefs and how they change. Harris, in particular, has proposed a theory of cultural evolution that accounts for how beliefs and behaviors are formed in response to environmental pressures (1968, 1974, 1977, 1980, 1989). This theory is known as cultural materialism. Harris' theory explains why beliefs and behaviors are shaped by fundamental issues such as food supply and population growth (1985). He has also extrapolated this theory to hyperindustrial life (1987).
Harris gives the basic principle of cultural materialism with these words:

The etic behavioral modes of production and reproduction probabilistically determine the etic behavioral domestic and political economy, which in turn probabilistically determine the behavioral and mental emic superstructures. (1980, pp. 55-56)

Etic operations are independently verifiable: "The test of the adequacy of etic accounts is simply their ability to generate scientifically productive theories about the causes of sociocultural differences and similarities" (Harris, 1980, p. 32).
In contrast, emic operations give native informants absolute status in determining the reality, meaningfulness or appropriateness of analyses. These etic and emic distinctions are not mere synonyms for behavioral and mental. They combine into "four objective operationally definable domains in the sociocultural field of inquiry" (Harris, 1980, p. 38).
An example comes from Harris' fieldwork with farmers in Kerala, on the western side of the Indian peninsula (1980, pp. 32-40). From the etic view the feeding of male calves is restricted so the gender ratios of the cattle are adjusted through starvation. This suits the local ecological and economic conditions for farming. From the emic view, no farmer would violate the Hindu prohibition against slaughter. The Kerala farmers say that male cattle are weaker, sicker and inherently eat less than female cattle.
The paradoxical relationship between etic and emic views is testable by a cultural comparison. Hindu farmers in parts of India with different local ecological and economic conditions, such as the inland states of the north, value the traction capabilities of cattle. In Uttar Pradesh, the seat of Hindu religion and culture, the mortality rate of cows is significantly higher than that of oxen.
In addition to the slow starvation of female calves, unwanted animals are sold to Moslem traders. Again, the death of the animals because of gender appears intentional (Harris, 1980; Harris, 1985).
This knowledge can be applied to the culture of educational communications and technology research and theory. From the cultural materialist viewpoint the behavioral and mental emics of a culture are determined by the etic forces. The emic projections or reconstructions become beliefs. Harris lists the mental and emic components as conscious and unconscious cognitive goals, categories, rules, plans, values, philosophies and beliefs about behavior (1980, p. 54). Scholarly beliefs are also emic representations and there is a tendency to favor low grade emic stories over high grade etic information (Price, 1980). Travers describes similar myth building behavior about research in education (1987). The next section of this paper looks at the emic superstructure of educational communications and technology.

The Emic Functions Of A Cognitive Paradigm
The claim to a cognitive paradigm in educational communications and technology (Clark & Salomon, 1986; Clark & Sugrue, 1988; Heinich, 1970; Winn, 1989) can be read as a social text. Harris' theory suggests that belief in the cognitive paradigm performs a social function. Like the boost in agricultural production from a Kwakiutl chief redistributing wealth at a potlatch (Harris, 1974), it encourages cooperation in the joint productive effort. Having a paradigm is an indication of being established. The transition from preparadigm state to paradigm state is widely perceived as the passage of puberty for any discipline and, from the viewpoint of cultural evolution, claiming a paradigm has adaptive value. It helps people obtain and maintain employment. When prospective colleagues say they are believers, they increase their chances of survival in the job market. Publishing manuscripts that look outside of cognition in examining what the field does and why it is done causes schisms, and these reduce the centralized power. Conflict is discouraged because it decreases material growth.
To claim a cognitive paradigm impresses other big men* such as granting agencies. It reassures school district superintendents and corporate directors of instructional systems that learning is knowable and predictable. Everything appears under control and the scholars who support a cognitive paradigm promise to bend their research efforts to everyone's benefit.
Belief in the cognitive paradigm in educational communications and technology may exist without paradigmatic consensus. Instead of a paradigm, Kuhn's structure may offer another explanation which fits the field better. In Kuhn's model, preparadigmatic research is characterized by the atheoretical factfinding characteristic of prescientific times (1970). Explanations of phenomena are inadequate. Data are too dense for decoding. Details are missed which are later considered important. In preparadigmatic research, technology is the name given to solving practical problems systematically. Technology parents science.
*This is an anthropological term denoting leaders who work extremely hard at motivating their followers to be productive. See, for example, Harris (1989, p. 359).

Understanding The Transmission Of Culture
From the cultural materialist point of view, individuals and groups in the field can still use their reason to choose what they want to believe and what they want to do. Although lacking evidence for this occurring earlier, Harris suggests that deliberate choices may be the only hope for the planet in the face of the ecological emergency (1987, pp. 181-183). Researchers in the field could seek to make the questions they ask more meaningful, to make their results more useful and to find new methodologies and theories. From this viewpoint, sociocultural research represents a strength of educational communications and technology's position as an applied field. Scholars investigating this dimension would recognize more is at stake in educational communications and technology than achievement. Their work would be closer to practice. Driven by sociocultural research issues, their investigations would draw from the theories and methods of the social sciences and the humanities. Some would write in what Husen (1988) identifies as humanism, education's other way of knowing, as opposed to neopositivism/logical empiricism. These scholars would be concerned about critically understanding cultural reproduction.
Nichols, for example, believes the field might turn from the mechanical study of achievement and select the direction of Habermasian morality (Habermas, 1984; Habermas, 1987; Wells, 1986):

Education should function, via communicative action, to help us competently reach understanding with one another (the cultural function), fulfill appropriate societal norms (the social function), and develop our personalities (the socialization function), and in the process, learners become involved with objective, practical and emancipatory forms of knowledge. (Nichols, 1989, p. 351)

THE SCHISM BETWEEN RESEARCHERS AND PRACTITIONERS
The beliefs and behaviors of educational communications and technology are rooted in the practice of using and producing educational media, but researchers focus on one set of activities and practitioners focus on another. Researchers investigate to write scholarly research reports but educational cinematographers investigate to create films and videos. Classroom teachers, school media coordinators and instructional systems developers select images to convey the world to learners. They know the field is effective because educational media employ rhetoric and the way of saying something changes what is said. Their productions are lyric, dramatic and epic. Even computer screens are alive with metaphors.
Adams' Philosophy of the Literary Symbolic (1983) confronts the tension between the literary and the scientific ways of knowing: "The war between poetry and philosophy has extended from before Plato's time into our own" (p. 389). These words also apply to educational communications and technology. More than a paradigm, debates between researchers and between researchers and practitioners ensure conscious decisions. These are necessary to defeat the material pressures on beliefs. More than a paradigm, the field needs this conflict between the philosophy of research and the poetry of practice.
The Kuhnian paradigm of revolutionary process does not fit educational communications and technology as an applied field, and neither does the falsificationist myth of orderly progress. The description of the preparadigmatic state fits best. There are material pressures for claiming a cognitive paradigm. That claim is an emic fact, whether it is empirically true or not.

Microware Review
Authorware Academic and Authorware Professional, Part 2
Earl R. Misanchuk
Part 1 of this review, published in the last issue of this journal, dealt with Authorware at a conceptual level; this part deals with the "nitty-gritty" details of how it works.
The task of creating CBI has recently become so much simplified and eased that one of the "grand old men" who pioneered much of the work done in Canadian CBI admitted that he felt like he had wasted much of 25 years working with the cumbersome CBI environments and authoring languages that were the state of the art before Authorware.
Authorware Professional (formerly called Best Course of Action) and Authorware Academic (formerly Course of Action) have so much authoring power in such a small package that with either one of them, a Macintosh Plus could run rings around most mainframe CBI systems of only a few years ago. During the eighties, a powerful (but expensive) CDC mainframe CBI system called PLATO more or less set the standard for CBI in terms of ease of authoring and sophistication of presentation to the learner. The people who created Authorware are former PLATOites, now spun off from CDC, and the Authorware course design software runs on the Apple Macintosh line. Once the course is designed, it can be ported down to other environments (e.g., DOS) for delivery to learners. A version of Authorware Professional for Windows 3.0 is scheduled to become available, probably by the time you read this. Although the Windows 3.0 version can be expected to be similar, the remainder of this review deals with the Macintosh version.
The general procedure for producing CBI with Authorware is to create and debug an instructional sequence, then "package" it (i.e., create a stand-alone run-time version of the sequence which can be used by learners). It is not necessary for each learner to have a copy of Authorware in order to use the packaged modules.
Both Authorware Academic and Authorware Professional are icon-based, object-oriented authoring environments: An author merely drags the appropriate icon (representing a desired action) from a storage spot onto a flow chart depicting the course being developed, and the software does all the necessary code creation completely unobtrusively. The eight icons in Authorware Academic, and their corresponding effects, are:
display: puts text and/or object-oriented or bit-mapped graphics onto the screen
animation: permits simple motion of screen elements
erase: clears screen
wait: causes flow of course to pause
decision: causes selection from among a set of attached icons
question: presents a question, and provides feedback based on learner's response
calculation: permits arithmetic or logical control; also permits jumping to other files or other application programs
map: groups individual icons to organize and modularize course
Authorware Professional has three additional icons:
advanced animation: permits more complicated animations, including bit-mapped "movies"
sound: plays a digitized sound file (digitizing equipment comes with the package)
video: executes commands to display selected segments of a videodisc
For example, if you want to display something on the screen, then have a pause, then clear the screen and ask the learner a question, you would drag onto the flowchart, in turn: a display icon, a wait icon, and a question icon. Having done those simple actions, you would proceed to execute (run) the program. Authorware pauses whenever it hits an "empty" icon (one that has no information attached to it), so when the first display icon comes up, you are shown a blank screen, to which you can add text and/or graphics, using familiar Macintosh tools and techniques.
The program then executes the wait icon, allowing you to specify a variety of conditions (e.g., wait a certain length of time, or until either the mouse is clicked or a key is pressed, optionally displaying a prompt for a response). When the wait conditions are met, the execution resumes, the screen is cleared, and the question display is put on the screen. You type in the question, choose the type of answer you want (text, click or touch area, move object, pulldown menu, keypress, pushbutton, or conditional), and construct as many feedback paths and displays as are appropriate, using the same "let-it-run-until-it-stops, then-fill-in-the-missing-information" approach. Authorware "knows" that some sequences of activities are "normal", and defaults to them unless told otherwise. For example, in the scenario above, it would not be necessary for you to use an erase icon before the question icon; Authorware anticipates that you would want the screen cleared first, and defaults to that condition. Indeed, one of the nicest things about using Authorware is the degree to which it anticipates what you want to do next and sets itself accordingly. Not long after you begin authoring, you suddenly begin to realize how many things you didn't have to tell it to do, and it still did them right!
For displays, the full range of Macintosh fonts, sizes, and styles of text is available, in black and white or color, depending upon which Mac platform is being used. While screen displays can be any size, the run-time machine's characteristics obviously have to match the authoring machine's. In typical Mac manner, copying, cutting, and pasting text and graphics are possible, making importation via the Clipboard or the Scrapbook quick and easy. A simple, built-in graphics toolbox provides object-oriented lines, rectangles and ovals, with various fill attributes and display modes.
(Because the toolbox is missing a few useful features, such as object alignment, and because an imported bit-mapped graphic becomes an object in Authorware, it is useful to have another, more powerful, graphics program available to use with Authorware.) Display effects include zooming and fading. In addition to simply placing text and objects where you want them on the screen, the option exists to place them according to specified coordinates, or to have coordinates calculated by the software.
The simple animation icon is of the fixed destination type: it can be used to cause an object to move from point A to point B, but not much more. The speed of the animation, or the time taken to execute it, however, can be controlled. (The advanced animation capability that comes with Authorware Professional is described below.) Screen erasure is quite straightforward. The effects of zooming either to a point or to a line, or of fading out, can be combined with the erase icon.
As noted earlier, the wait icon may have conditions attached to it. If a certain amount of time is allocated during which the learner is expected to answer, it is optionally possible to display a small graphic representation of an alarm clock, indicating how much time is remaining.
The decision icon provides for branching of several types: sequential (go through each of the attached icons in turn); random without replacement (choose one or more of the attached icons, as specified, but never repeating the choices); random with replacement (same as the last choice, but permitting repetitions); or pick "nth" path (path chosen is based on calculations). The decision icon is one of the most important of Authorware's structures, permitting the author to have the machine emulate human-to-human interaction, and providing for different machine reactions to different learner actions.
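For readers who think more easily in code than in flowcharts, the four branching types just described can be approximated in a few lines of Python. This is only an illustrative sketch, not Authorware's own implementation: the function name choose_paths, the mode labels, and the count and n parameters are all hypothetical, and the assumption that the "nth" path is numbered from 1 is mine.

```python
import random


def choose_paths(paths, mode, count=1, n=None, rng=None):
    """Return, in order, the attached paths a decision point would follow.

    All names here are hypothetical; they only mirror the four branching
    types described in the review.
    """
    rng = rng or random.Random()
    if mode == "sequential":
        # Go through each attached path in turn.
        return list(paths)
    if mode == "random_without_replacement":
        # Pick `count` distinct paths; no path is repeated.
        return rng.sample(list(paths), count)
    if mode == "random_with_replacement":
        # Pick `count` paths; the same path may come up more than once.
        return [rng.choice(list(paths)) for _ in range(count)]
    if mode == "nth":
        # Pick the path whose (assumed 1-based) position was calculated elsewhere.
        return [paths[n - 1]]
    raise ValueError(f"unknown branching mode: {mode!r}")


if __name__ == "__main__":
    attached = ["review", "drill", "quiz"]
    print(choose_paths(attached, "sequential"))
    print(choose_paths(attached, "random_without_replacement", count=2))
    print(choose_paths(attached, "nth", n=3))
```

The only real difference between the two random modes is whether a path already chosen remains eligible, which is why one uses sampling and the other repeated single choices.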
Using one of the seven types of questions (listed earlier) available with the question icon makes it easy to require interactivity with the learner. Text answer evaluation is very flexible, allowing capitalization, punctuation, spaces, extra words, and word order to be selectively evaluated or ignored. Alternative answers are easy to specify. Wild card characters are, of course, supported. A simple form of parsing is available which requires the learner to match a specified number of words. A nice feature is incremental matching, a scenario in which the system remembers partial answers provided by the learner, and combines them to evaluate the whole answer. Clicking on an area is a familiar Mac activity, so "pointing" to a correct answer is easy for learners. In addition, learners can be instructed to drag screen objects to certain locations. Answers can also be provided by selecting from a pulldown menu, or by striking a key or combination of keys. Pushbuttons (Mac-like "buttons" on the screen activated by clicking on them) can also be used for answers. Conditional answers are typically logical or mathematical computations that direct the learner along a certain path, based on previous performance or other criteria. Feedback can be controlled with respect to amount of time taken to provide the answer, or with respect to the number of attempts made to answer the question correctly.
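To give a feel for what "selectively evaluated or ignored" means in practice, here is a minimal Python sketch of text answer judging. It is an assumption-laden illustration rather than a description of how Authorware works internally: the function names normalize and matches and all of the keyword options are hypothetical, extra spaces are always ignored, and wild cards and incremental matching are left out for brevity.

```python
import string


def normalize(text, ignore_case=True, ignore_punctuation=True):
    """Split an answer into words, optionally folding case and dropping punctuation."""
    if ignore_case:
        text = text.lower()
    if ignore_punctuation:
        text = text.translate(str.maketrans("", "", string.punctuation))
    # split() with no argument also collapses runs of spaces.
    return text.split()


def matches(response, expected, *, ignore_case=True, ignore_punctuation=True,
            ignore_extra_words=False, ignore_word_order=False):
    """Judge a learner's typed response against one expected answer."""
    got = normalize(response, ignore_case, ignore_punctuation)
    want = normalize(expected, ignore_case, ignore_punctuation)
    if ignore_extra_words:
        # Keep only the words that belong to the expected answer.
        got = [word for word in got if word in want]
    if ignore_word_order:
        return sorted(got) == sorted(want)
    return got == want


if __name__ == "__main__":
    print(matches("The  Treaty of Paris.", "treaty of paris", ignore_extra_words=True))
    print(matches("Paris treaty of", "treaty of paris", ignore_word_order=True))
```

Each option loosens exactly one aspect of the comparison, so an author could, for instance, insist on word order while forgiving capitalization.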
The calculation icon is an extremely powerful tool, providing access to more than 100 system variables and functions. Want to know how many times the learner got the answer right (or wrong) on the first try? Or what the learner's last answer was? Whether the Caps Lock key is currently depressed? How many days it has been since the learner last worked on the course? The third number in the learner's last answer? The cumulative number of questions the learner has been asked? Authorware remembers all those things, and many more.
The map icon represents a group of one or more icons, and is primarily a way of keeping the desktop from becoming cluttered. As you develop and extend sequences of icons and de-bug them, you can collapse them into maps: collections of fragments of instruction that work as you want them to. Indeed, as noted in Part 1 of this review, Authorware demands this "bottom-up" approach to planning a sequence (as opposed to a "top-down" approach typical of programming): You make the smallest, most central part of the instruction work properly, then you add a layer of instruction around it. When that all works, you add another layer, and so on. It takes very little time to get used to this approach, and once you've used it, it becomes second nature. The fact that you can edit anytime, anywhere, makes it easy. You can change the sequence of displays with impunity; Authorware keeps track. You can even change the name of a variable, making the change in only one place, and Authorware will make all the other necessary changes for you.
The advanced animation option expands the number of choices available to five kinds of animation: fixed destination (movement from point A to point B along a straight path, the same as in Authorware Academic); fixed path (movement along a curved or jagged path); scaled path (movement along a specified fixed path, with pauses permitted at points in response to values of specified variables); linear scale (movement from a starting position to a calculated position along an imaginary straight line); and scaled X-Y (movement to a location, on- or off-screen, calculated from values of variables specified). Paths, once defined, are editable. By assigning different animations to different layers (thereby defining what will happen when the animated objects overlap) and by using the concurrency option, you can have several animations happening on the screen simultaneously.
The movies icon that comes with the advanced animation option permits the playing back of bit-mapped sequences (up to 1575 frames long) that resemble movies. After inserting a movie icon into a course, you can adjust the beginning and/or ending frame of the movie you want displayed, the playback speed (frames/second), the size of the image (1:1 to 3:1), and how long it should play (repeatedly, a given number of times, or until a stated condition is true). Concurrency with other activity (e.g., sound) is possible with movies, as well. The movies themselves are created in a stand-alone program, Movie Editor, that comes with the Authorware Professional package. You create a movie by drawing each frame pixel by pixel, using MacPaint-like tools, or by importing frames via the Clipboard. The frame size can be 32, 48, or 64 pixels square.
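The "linear scale" animation described above amounts to placing an object somewhere along a straight line according to the value of a variable. A rough Python sketch of that idea follows. It is my own illustration, not Authorware's method: the function name linear_scale_position, the default value range of 0 to 100, and the decision to clamp out-of-range values to the ends of the line are all assumptions.

```python
def linear_scale_position(start, end, value, lo=0.0, hi=100.0):
    """Map `value` in the range [lo, hi] to a point on the line from start to end.

    `start` and `end` are (x, y) screen coordinates. Values outside the range
    are clamped to the nearer end of the line (an assumption of this sketch).
    """
    if hi == lo:
        raise ValueError("lo and hi must differ")
    t = (value - lo) / (hi - lo)      # fraction of the way along the line
    t = max(0.0, min(1.0, t))         # clamp to the ends of the line
    x = start[0] + t * (end[0] - start[0])
    y = start[1] + t * (end[1] - start[1])
    return (x, y)


if __name__ == "__main__":
    # A score of 50 out of 100 puts the object halfway along the line.
    print(linear_scale_position((0, 0), (200, 100), 50))  # (100.0, 50.0)
```

A scaled X-Y animation could be thought of as two such calculations, one per axis, each driven by its own variable.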
The sound icon works in a manner similar to the movies icon: A sound icon inserted into the course flow diagram calls up and plays a sound file. The sound file can be created either with such commercial packages as Studio Session (not provided) or with SoundWave (included with Authorware Professional). SoundWave is used in conjunction with the hardware provided to digitize sound from either a microphone (supplied) or other electronic source. Sounds can be recorded at 5.5, 7.5, 11 or 22 kHz. A waveform monitor option is available to set optimum gain. The sound can be edited and manipulated (e.g., speed, volume, delay, equalizing filters) before it is saved as a file which can be called up by the sound icon. External "hooks" exist for custom programming to be added.
The video icon permits program control of one of eight models of videodisc players. The functions available are those on a standard remote control unit: play, step, slow, and fast, all either forward or reverse; pause; starting and ending frame numbers; and freeze frame. Playback can be set to one of five speeds. As with movies and sounds, video can be concurrent with other activities, and the number of times it is to be played can be specified. Various degrees of control can be assigned to a student-operated controller.
As noted in Part 1, Authorware permits the use of models, which are fragments of instruction devoid of content (shells, really) that can be called up in their entirety and pasted into place. Thus if you have a CBI sequence which will use many different four-answer multiple-choice questions (to use a mundane example), where each answer has different feedback, you could make up one sequence of the appropriate icons (without putting in the content), and save it as a model. Then, when you need a four-answer multiple-choice segment, you choose that model from a menu, and paste it into place on the flowchart. Run the segment, and place the content into each icon as it executes. Saves hours and hours of repetitive coding!
The "jump out" feature allows the author to permit the learner to use another application program (a word processor or spreadsheet, for example) at specified locations in the course, then resume the instruction when done using the other application. Again at the author's option, learners who quit an instructional sequence in the middle may be allowed to pick up where they left off or may be required to begin the sequence again, the next time they access the course.
Authorware comes with a tutorial manual and a reference manual with separate manuals for each of the sound, video interface, and advanced animation icons. By following the tutorial, you actually create a quite sophisticated instructional sequence designed to illustrate most of Authorware's features. The manuals are logically organized and well-written. A novice to CBI authoring (who knows how to use a Macintosh) can begin to create simple CBI in a matter of a couple of hours. Of course, mastering completely such a powerful program will take much longer.
As noted in Part 1 of this review, Authorware Professional makes it possible for the educator-author to begin creating computer-based instruction with a relatively small learning curve, yet provides sufficient power and flexibility to satisfy even the most experienced professional CBI author.
The inevitable question arises: How does Authorware compare to another much-touted piece of software, HyperCard? In an earlier review of HyperCard in this journal, I waxed enthusiastic about the potential of HyperCard for developing CBI. Now, at the risk of being accused of being overly enthusiastic, I find myself wanting to do the same (only more so) for Authorware. Yet it's not just an either/or proposition. It's a little like comparing Apples and oranges (pun intended; forgiveness begged). Authorware is designed for the express purpose of producing CBI, while HyperCard is a somewhat more general-purpose construction tool which, with some effort, can be made to do many of the things Authorware can. On the other hand, HyperCard makes easy some things that Authorware would have trouble with.
Student tracking is an Authorware strong point: recording and evaluating answers is quick and easy. While it could be done with HyperCard, it would require considerable sophisticated HyperTalk code to accomplish. Authorware's animation is much more powerful than HyperCard's (especially Authorware Professional's). Authorware supports color, while HyperCard at this writing does not, and Authorware products can be ported to other platforms while HyperCard products cannot. Both allow for quick and easy creation and importation of text and graphics, and both permit the relatively easy use of sound, simple animation, and videodisc (although the sound tools provided with Authorware Professional are significantly more powerful than HyperCard's). Both have powerful internal programming features, and both permit user code to be "hooked" into them. HyperCard's powerful search features are not available on Authorware. Neither is the ready capability of creating hypertext links, although Authorware's "leaping" feature comes close. HyperCard's price (free) has to figure into the comparison. But I repeat: It's not an either/or question. The serious author of CBI will probably want to use both, jumping

Reviewed by Earl R. Misanchuk
Four chapters comprise the book: Communication; Perception, Learning and Memory; Literacy; and Designing Visuals for Information. Each chapter is followed by its own list of references; the last chapter is followed by an extended reference list, which is sub-divided into a number of categories (e.g., content, structure, realism, degree of detail, objects, time, statistics, motion, sound, etc.). In Chapter 1, Communication, the first section, Media and Representations, consists of a very cursory review of a few communication models, a post-McLuhan analysis of the medium and the message, and a sub-section entitled "Production of need-oriented information". The second section, Media Consumption, discusses media market size and media-industry mapping (a classification scheme relating live media, sound media, film media, broadcast media, video media, models and exhibitions, graphical media, and telecommunications media). The third section, New Media, deals with electronic publishing, video, teletext, videotex, cable TV, databases, and mediateques. Both the latter two sections provide a distinctly European flavor (indeed there is considerable reference to Sweden throughout the book), but also show good awareness of North American activity. The fourth section of Chapter 1, The Information Society, discusses how humans evolved over the years from writers to readers, some of the consequences of electronic publishing, changes in media consumption, and the effects of the introduction of new media. The final section of the chapter, Screen Communication, focuses on the increasing prevalence of computers in information-provision, and includes discussions of visual displays, color description systems, the message on the screen, and computer print-outs.
Chapter 2, Perception, Learning and Memory, is more research-based than the first chapter, and has sections on Our Senses (but limits discussion to hearing and vision), Listening and Looking (including perception "laws", choice of information, the brain, picture perception, and a cognitive model), and Learning and Memory (a very cursory introduction to learning models, a more complete description of the current information-processing model of memory, an examination of the effects of human development, and a quite spurious section on illusions).
Chapter 3, which is nearly twice the length of the other three chapters, is entitled Literacy, and consists of sections on Language, Verbal Languages, Visual Languages, Linguistic Combinations, and a review of Current Research. The section on verbal languages, after a very quick review of the history of both spoken and written language and an equally quick cross-cultural comparison of languages, contains a demonstration whose point initially may be lost on the reader (as it was on me), but which begins to make some sense later, in the section on visual languages: all the characters in a paragraph of text, then all the words (first without, then with, punctuation) are sorted in differing orders. Then a sentence is depicted in several different fonts (type styles) and sizes, and another is shown upside down and in mirror image. The demonstrations are easy to accomplish on a microcomputer, but aside from being somewhat dramatic, do not really seem to lead anywhere (at least until later). More foreshadowing of the purpose of these demonstrations might have helped the reader make more sense of this section. The section Visual Languages discusses functions, levels of meaning, structure, properties, picture readability, classification of visuals, picture dimensions, and characteristics of visual languages. This section is much too long, and contains some material which, if sacrificed, would not be missed: measuring picture properties, which pedantically discusses properties of pictures in general and vague terms, but does not provide much useful information to either the practitioner or the researcher; the discussion of the picture circle, which borders on the pedantic; and the several pages of description of picture archives and databases (whose logical relationship to visual language is nebulous in any event).
Chapter 4, Designing Visuals for Information, has sections on Content, Execution, Context, and Format. Under Content are discussed such factors as structure (including reference to degree of realism and of detail); factual content (the influence of characteristics of the objects used to depict something, time, place, and statistics); events (motion, sound, humour and satire, and relationships); credibility; and viewer completion. Execution deals with graphical elements, types of visuals, subjects, light, shape, size, color, contrast, emphasis, composition, perspective, technical quality, symbols and explanatory words, mixing and zoom, picture editing, and copyright. Context looks at the interplay of words and visuals, the interplay of visuals, and layout considerations. Format discusses image morphology, analogue and digital coding, perception of pixels, and image format categories.
I approached this book with the expectation and hope that it would provide useful advice to a practising instructional designer with an active interest in research on matters relating to instructional design. I was disappointed.
In the first place, there appears to have been a mismatch between my expectations and the author's intentions; after having read the book, I concluded that it was probably less likely to have been aimed at a professional readership (e.g., practising instructional designers or message designers) than it was to have been intended as a text-book for a basic course in visual literacy (an idiosyncratic one, at that). In my own defense, although I have long since learned not to judge either the proverbial or the actual book by its cover, both the publisher's notes on the flyleaf and the author's preface appeared to promise something other than what was delivered. Indeed, neither the flyleaf nor the preface suggests that the book might be approached as an undergraduate text for a visual literacy course (a judgment of its worth for that function I will have to leave to someone teaching such a course), but the preface explicitly states "This book is useful to practitioners as well as to researchers" (p. vi).
Secondly, the quality of the book, both in terms of the language used to express ideas (particularly in the first chapter) and in terms of the production values of the text itself, was (at best) quite uneven. Whether the responsibility for this should be borne by the author or the publisher is moot; it should not be inflicted upon a reader. A handful of examples will illustrate why I found it very difficult to immerse myself in the book: "Information processing is a scientific discipline comprising e.g. mathematical and numerical analysis plus methods and technics [sic] for administrative data processing." (p. viii) "A representation, e.g. a visual, which is to be used to convey certain information, has a sender, one or more receivers and even a content, of course, a structure, a context and a format." (p. 4) "... my view of the interrelationships of various media in a twodimen-sional [sic] representation [sic] of a multidimensional reality" (p. 9) [Although the effect may have been lost in the transition to this page in the journal, the second hyphenated word was in the middle of a line in the book.] Typographic errors in a published work are, of course, not unheard of, but the frequency with which they occur in the first chapter suggests a very rushed job which is sorely in need of editorial attention. Pettersson also has a penchant for over-using quotation marks, which adds to the difficulty of reading.
Another point relating to the technical aspects of writing is that some sections are written in quite a scholarly manner, with full documentation of sources and citations, while other sections (only pages away) contain provocative statements or ideas which the reader may wish to follow up on, but which have no attribution or amplification whatever. Examples of the latter are: "Very small children view the world as being up-side-down. After a time, however, the brain somehow learns to process retinal images so that they are perceived to be right-side-up." (p. 63) "When we look at a person who is walking or running, the eye records a series of stills which ultimately blend into one another and form a moving image." (p. 64) "Our Western society is dominated by the written word and extremely quadrangular. It is a society in which bureaucrats occupy quadrangular cells in such a way that creative and intellectually lively people are perceived as disturbing and disruptive features of the prevailing order. New ideas are effectively stifled. This leads to stagnation, industrial crises and a breakdown of the social fabric." (p. 78) "If people like the content in a visual, they like it even more when the visual is presented in color and vice versa." (p. 233) "Substantial research has clearly shown that learning efficiency is much enhanced when words and visuals interact and supply redundant information. The improvement sometimes exceeds sixty percent and averages thirty percent." (p. 268) How much more useful Pettersson's summary statements would have been if they had had supporting references to the research from which the statements were abstracted! A notable inconsistency appears within little more than the space of a page: "A visual should usually be in color but not in unrealistic colors" (p. 250; emphasis mine); but "[s]ometimes color enhances learning but in many cases black and white would be better" (pp. 251-252).
My reading of the literature certainly supports the latter statement, but not the former. The former statement, taken out of context and attributed to a widely published scholar, may be decidedly misleading.
Another feature of the book that needs to be improved considerably is the use of graphics. While they are numerous, and well spaced throughout the four chapters, they are also of uneven quality, particularly with respect to complexity and interpretability. The only uniform features of the illustrations are that they are neither numbered for ease of reference, nor provided with adequate captions, nor (in the overwhelming majority of cases) referred to in the accompanying text. Pettersson, in summarizing research, notes that "[i]t was also concluded that when illustrations are not relevant to the prose content no prose-learning facilitation is to be expected, on the contrary there can be a negative effect [sic]" (p. 106), and that "[m]any illustration [sic] (often without legends) in contemporary textbooks appear to serve no useful purpose whatever" (p. 145). However, for many illustrations in his book, it is difficult to discern what the relationship is between an illustration and the text surrounding it, and sometimes even why the illustration was included at all. Exceedingly complex diagrams are left to stand on their own, with little or no accompanying explanation. One could imagine some of them being effective as overhead transparencies, bolstered by considerable verbiage from an instructor, but few of them seem capable of delivering a message on their own. Some of the illustrations are less than effective in other ways. In one, Pettersson attempts to illustrate that, in his words, "...image design can be changed a great deal without any major changes in the perception of image content" (p. 159; emphasis his). He illustrates his point with reference to three computer-generated graphics which differ, he states, by virtue of having changed 100 pixels. He neglects to mention that those 100 pixels represent something in the order of 1.2% of the pixels comprising the drawing (at least by my admittedly crude measurements, made from the printed page).
Whether 1.2% constitutes "a great deal" might be arguable; elsewhere, he notes that "[t]he use of misleading illustrations in comparisons and statistics reduces the credibility of the message itself" (p. 233).
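The arithmetic behind that figure is simple enough to check. Taking the two numbers given above at face value (100 changed pixels, amounting to roughly 1.2% of the drawing), the drawing would contain on the order of 8,300 pixels. The short Python sketch below merely restates that division; the function name implied_total_pixels is hypothetical, and the result is only as good as the crude 1.2% estimate it starts from.

```python
def implied_total_pixels(changed_pixels, fraction_changed):
    """Total pixel count implied by a changed-pixel count and its share of the whole."""
    return changed_pixels / fraction_changed


if __name__ == "__main__":
    total = implied_total_pixels(100, 0.012)  # 100 pixels said to be about 1.2%
    print(round(total))  # 8333
```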
Pettersson does provide considerable technical detail in a number of places, which may be of use to those interested in making comparisons between different technologies (e.g., between the efficiency of storage of print vs. computerized text) or between different standards within similar technologies (e.g., NTSC and PAL television standards). Because of his relatively international perspective, he provides fodder for comparisons of other kinds, as well (e.g., copyright laws; picture database access). There are other bits and pieces scattered throughout the book that will likely interest those who examine media from a cross-cultural perspective.

REVIEWER
Earl R. Misanchuk is a Professor of Extension, University of Saskatchewan, Saskatoon, Saskatchewan S7N 0W0.

Reviewed by Jonathon Marsh
It is made abundantly clear, from the opening quotation to the final summary, that Dr. Haynes' primary intention in this book is to sell the idea of interactive videodisc (IV) technology as a means to educational revolution. While the book is a cleverly crafted, informative, and up-to-date overview of developments in the IV world, the force of the argument presented is not sufficient to support such a grandiose concept. It may well be that the impressive and disturbing set of figures provided by Rockley L. Miller (editor of the Videodisc Monitor) in the foreword are good indicators of future developments and trends in training and education. It is also possible that much of Haynes' vision of the future of education may be accurate. However, significant change in a social organization as complex as an education system involves a huge number of variables, only some of which are relevant to developments in media. The obvious difficulty involved in assessing the impact of such novel technology on the full spectrum of educational variables should suggest a modicum of caution when it comes to predictions. Strangely enough, this point is tacitly borne out by Haynes himself during the course of his excellent and comprehensive discussion of the historical factors leading up to the current state of the art. He repeatedly emphasizes the educational limitations and marketing difficulties generated by basic issues such as the lack of standardization both in video formats and disk mastering processes. Perhaps his intention is to demonstrate that his predilection for prediction is well tempered by a comprehensive knowledge of developments in the field. Unfortunately the net effect for the reader is confusing and one is left with a nagging sense of contradiction.
While, due to this confusion, it is difficult to fully share his enthusiasm for interactive video, it is not hard to appreciate the importance placed on interactive video in general. Haynes, like many before him, is quick to point out that the key issue is interactivity. For him our educational system is in dire need of change if it is to function; not just a surface change in areas concerned with the "whos" and "whats" of teaching and learning, but more critical change in the "whys" and "hows". Haynes advises us to broaden our vision of what "getting an education" means if the students of today are to be made ready to cope with the complex demands of modern society. We must move from an elitist, restrictive, and fact-oriented concept of education to a more freely accessed sharing of information oriented towards problem solving. Haynes suggests that interactive technology in general and interactive video in particular is the "change agent" required to promote just such an evolution. Not only does it "through innovative classroom use have the potential to augment standard pedagogy and ... advance individualized and mastery learning" (p. 104), it is in the words of Karen Block a "symbolic technology" which can "qualitatively change the structure and function of mental activities such as problem solving or memory" (p. 96). Like most agents of change it is destined to be viewed with suspicion and mistrust until such time as it has proven its worth.
Such esoteric claims are surprisingly common and often poorly supported in the field of educational media. However in the case of this book they are well documented with references to case studies and such research as is available. The major points are clearly presented and situated squarely within a set of well defined historical constructs (if a better and more entertaining history of the development of interactive media exists it would be an interesting read indeed). The book includes a finely documented chapter on "the Standards Dilemma" which not only clarifies many of the issues surrounding software development and compatibility but examines them with specific reference to lessons learned from development projects within governmental, corporate, and to a lesser degree educational institutions. As for providing the reader with coverage of actual training implementations, Haynes outdoes himself. Instead of broadly outlining the results of various research projects, he provides a short synopsis of ten different studies, each of which reflects a different attempt to assess the worth of the technology within a particular context. Broad insights are derivable from these studies which Haynes comments on in an attempt to provoke the reader into thinking more deeply about the types of educational change he proposes.
Focal to Haynes' concept of educational change is the need for teacher competence in the use of new technology. Too often has new wine been forced into old wineskins. He suggests that due to the significant increase in communication capabilities surrounding the new technology, we are faced with the need for a new form of curricular integration. Teacher, student, parent, professional, school, institution, and corporation must all be incorporated into the process of educating the individual if the full potential of the technology is to be realized. It is only reasonable to assume that in the early stages of such a development the teacher will be at the helm. However, it is entirely unreasonable to assume that teachers with only limited understanding and competence in the use of new technology can meet the challenge. It is also futile to imagine that we will attract individuals to the teaching profession who can meet this challenge unless society as a whole undergoes some reassessment of teachers' status as professionals.
As astute as Haynes is with respect to the educational implications of interactive technology, it is rather disappointing to have him refer on numerous occasions to advances and developments in educational technology as apparently equivalent to advances and developments in instructional media (i.e., newer and more powerful machine configurations). It is particularly disturbing as he does so after pointedly quoting Everett Rogers' model of a technology as "a design for instrumental action that reduces the uncertainty in the cause-effect relationships involved in achieving a desired outcome" (Preface X). The fact that he clarifies his use of the term technology should indicate an understanding of educational technology as being concerned with the analysis and design of educational systems and processes (which usually includes knowledge of media use) and not with specific hardware configurations. It is unfortunate that a thinker who expresses such an obvious concern for systemic thinking in educational development, and who so adamantly emphasizes the primacy of good design principles in the application of media to the process of instruction, should so blatantly appear to misuse such a critical term.
While it is necessary to criticise this book on the above-mentioned issues, it is also appropriate to laud it for its strengths. There is currently available a plethora of books, monographs, and articles concerned with the design and development of interactive media. Very little has been done, however, to provide us with a comprehensive look at the historical developments and educational implications of this technology. Access to such an overview is a necessity for anyone required to make well-informed media-based training decisions. If one is inclined to consider Haynes' more esoteric claims as food for thought, then this book is ideal for meeting the need. Haynes does not sacrifice a very readable style for the sake of academic appearances. Although the book is not overly long (150 pages), the treatment of the subject is substantial and illuminating. These factors, combined with the inclusion of a reasonably comprehensive glossary, make the work extremely suitable as an introductory text for students interested in instructional media.
Karen Block's paper entitled "The Information Age in Education: Computer Assisted Learning" is included in the text as a subsection of Chapter 4.

REVIEWER Joanathon Marsh is Senior Educational Technologist, Educational Technology Centre, City Polytechnic of Hong Kong.