Transforming the sensing and numerical prediction of high-impact local weather through dynamic adaptation

Mesoscale weather, such as convective systems, intense local rainfall resulting in flash floods and lake effect snows, frequently is characterized by unpredictable rapid onset and evolution, heterogeneity and spatial and temporal intermittency. Ironically, most of the technologies used to observe the atmosphere, predict its evolution and compute, transmit or store information about it, operate in a static pre-scheduled framework that is fundamentally inconsistent with, and does not accommodate, the dynamic behaviour of mesoscale weather. As a result, today's weather technology is highly constrained and far from optimal when applied to any particular situation. This paper describes a new cyberinfrastructure framework, in which remote and in situ atmospheric sensors, data acquisition and storage systems, assimilation and prediction codes, data mining and visualization engines, and the information technology frameworks within which they operate, can change configuration automatically, in response to evolving weather. Such dynamic adaptation is designed to allow system components to achieve greater overall effectiveness, relative to their static counterparts, for any given situation. The associated service-oriented architecture, known as Linked Environments for Atmospheric Discovery (LEAD), makes advanced meteorological and cyber tools as easy to use as ordering a book on the web. LEAD has been applied in a variety of settings, including experimental forecasting by the US National Weather Service, and allows users to focus much more attention on the problem at hand and less on the nuances of data formats, communication protocols and job execution environments.


Introduction
Suppose that, based upon a numerical model forecast available at 06.00 hours, thunderstorms are expected to impact Chicago's O'Hare International Airport at 16.00 hours local time, or 2 hours prior to the next forecast cycle. Airlines releasing dozens of coastal flights bound for O'Hare must make a go/no-go decision by noon, and loading additional fuel to accommodate potential diversions or in-flight holds will incur significant cost. If computer forecast technology could recognize the potential existence of thunderstorms over O'Hare several hours before their occurrence and automatically create additional forecasts at intervening times, perhaps in a rapid cycle, or if it were able to generate very fine grid nested ensembles in recognition of the expected large forecast error for this particular event, then the potential for economic loss and disruption to passengers might be reduced.
This, and numerous other scenarios of high-impact local weather, occur daily across a breadth of application areas and are characterized by one common factor: the inability to apply weather technologies when, where and how needed in order to meet the specific challenge at hand. For example, current operational models run largely on fixed schedules in fixed geographical domains, independent of anticipated or extant weather. (The exceptions in the USA are hurricane models, run in domains that follow hurricanes with time; the high-resolution window of the National Weather Service Weather Research and Forecast (WRF) model (Michalakes et al. 2000; Skamarock et al. 2007); and the UK Meteorological Office model, windowed domains of which can be configured by forecasters to run on demand.) Additionally, most observing systems are not designed to adjust their data collection strategies to focus on specific events in response to a given user need or weather scenario. In the USA, WSR-88D (NEXRAD) Doppler radars (Crum & Alberty 1993) use a variety of volume coverage patterns based upon the general type of weather present (e.g. clear air, deep convection; Brown et al. 2000, 2005). However, they do not adaptively scan particular sectors of the atmosphere or focus on a specific weather event in lieu of the others at a given time.
Even if forecasters had the ability to launch specialized numerical predictions over O'Hare Airport in the case described above, or if forecasts could be triggered automatically based upon developing conditions, doing so would require immediate access to high-performance computers, networks and storage systems. Presumably, these facilities would be in use for other purposes, but would have to be rapidly reconfigured to deliver the needed quality of service to guarantee that forecast output would be available well in advance of the actual weather.
In an effort to meet the sorts of challenges described above, namely, a cyberinfrastructure in mesoscale meteorology that facilitates dynamically adaptive, on-demand resources and applications, the US National Science Foundation in October 2003 funded a 5-year Large Information Technology Research (ITR) grant known as Linked Environments for Atmospheric Discovery (LEAD; e.g. Droegemeier et al. 2005; Plale et al. 2006a; Ramakrishnan et al. 2007). A multidisciplinary effort, involving nine institutions and more than 100 scientists, students and technical staff, LEAD has created an integrated, scalable, web services framework in which meteorological analysis tools, forecast models and data repositories can operate as dynamically adaptive, on-demand, grid-enabled systems that (i) change configuration rapidly and automatically in response to weather, (ii) respond dynamically to inputs from users, (iii) initiate other processes automatically, and (iv) steer remote observing technologies to optimize the data collection for the problem at hand. Section 2 of this paper describes the principal objectives of the LEAD programme, while §3 provides an overview of the service-oriented architecture (SOA). Section 4 provides a use case of dynamic adaptation, while §5 outlines the steps required to create a computer forecast using LEAD. Section 6 shows real examples in which LEAD has been applied to severe weather prediction and §7 discusses LEAD education activities. A summary and overview of future activities are presented in §8.

Objectives
As a research and education project at the intersection of meteorology and computer science, LEAD has two major objectives (e.g. Droegemeier et al. 2007). The first objective is to lower the entry barrier for using, and increase the sophistication of problems that can be addressed by, complex end-to-end weather analysis and forecasting/simulation tools. Existing weather tools, such as data ingest, quality control and analysis/assimilation systems, as well as simulation/forecast models and post-processing environments, are enormously complex even if used individually. They consist of highly sophisticated software developed over long periods of time, contain numerous adjustable parameters and inputs, require one to deal with complex formats across a broad array of data types and sources, and often have limited transportability across computing architectures. When linked together and used with real data, the logistical complexity increases dramatically, as does the ability for the system as a whole to provide high-quality results. Indeed, the control infrastructures that orchestrate interoperability among multiple tools, which notably are available only in a few institutions in highly customized settings, can be as complex as the tools themselves, involving thousands of lines of code and requiring months to understand, apply and modify. Not surprisingly, students applying such models in graduate-level research often spend the majority of their time on technical, rather than scientific, issues.
Although many universities now run experimental forecasts on a daily basis using public domain software such as the WRF model (http://www.wrf-model.org/plots/wrfrealtime.php), many do so in very simple configurations using mostly local computing facilities and pre-generated analyses to which no new data have been added. LEAD seeks to greatly broaden the availability of advanced weather technologies for research and education, lowering the barrier to entry, empowering application in a grid context, increasing the realism of how technologies are applied, and facilitating rapid understanding, experiment design and execution.
The second objective for LEAD involves improving our understanding of and ability to detect, analyse and predict mesoscale atmospheric phenomena by dynamically adapting with or to it. The limitations of today's static environments were highlighted in §1, and Droegemeier et al. (2005) showed real examples in which dynamic adaptation can be of considerable practical benefit. A more complete discussion of dynamic adaptation, including key scientific questions, is presented in §4.
Achieving these goals requires more than translating existing capabilities into new information technology frameworks. Rather, it requires fundamental changes in how experiments are conceived and performed, in the structure of user application tools and middleware, and in methodologies used to observe the atmosphere. It has been the vision of LEAD to usher in these changes through a programme of directed basic research, underpinned by an actual web services infrastructure that both facilitates research and is itself a research project.

The LEAD service-oriented architecture
In the language of computer science, a service is an entity that carries out a specific operation, or a set of operations, based upon requests from clients, e.g. booking airline flights online. Web services are networked services that conform to a family of standards that specify most aspects of a service's behaviour and have been developed by a number of organizations. The LEAD architecture is a SOA, which refers to a design pattern based upon organizing all of the key functions of an enterprise or system as a set of services. The work of the enterprise or system is carried out by so-called workflows that orchestrate collections of service invocations and responses to accomplish a specific task.
The services in LEAD comprise a complex array of functions, applications, interfaces and local and remote computing, networking and storage resources (the so-called environments) that can be used in a stand-alone fashion or linked together in workflows to study mesoscale weather; thus, the name LEAD. Because individual services can be concatenated in many different ways (see Droegemeier et al. 2005), the LEAD framework provides users with an almost endless set of capabilities ranging from simply accessing the data and perhaps visualizing it to running highly complex and linked data ingest, assimilation and forecast processes in real time, and in a manner that adjusts dynamically to inputs as well as outputs. Further details can be found in Droegemeier et al. (2005).
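The SOA pattern described above can be made concrete with a small sketch. This is an illustrative outline only, not the actual LEAD service interfaces: the service names (ingest, assimilate, forecast) and the uniform `invoke()` interface are assumptions made for exposition, showing how a workflow is simply an orchestrated chain of service invocations.

```python
# Minimal sketch of a service-oriented workflow: each capability is a
# named service with a uniform invoke() interface, and a workflow is an
# ordered chain of invocations. All service names here are hypothetical.
from typing import Any, Callable, List

class Service:
    """A named operation invoked with the output of the previous step."""
    def __init__(self, name: str, operation: Callable[[Any], Any]):
        self.name = name
        self.operation = operation

    def invoke(self, request: Any) -> Any:
        return self.operation(request)

def run_workflow(services: List[Service], initial_input: Any) -> Any:
    """Orchestrate a linear workflow: each service consumes the
    previous service's response."""
    result = initial_input
    for service in services:
        result = service.invoke(result)
    return result

# Example: chain data ingest -> assimilation -> forecast.
ingest = Service("ingest", lambda obs: {"observations": obs})
assimilate = Service("assimilate", lambda d: {**d, "analysis": "3D gridded analysis"})
forecast = Service("forecast", lambda d: {**d, "output": "WRF forecast fields"})

product = run_workflow([ingest, assimilate, forecast], ["radar", "surface"])
```

In a real web services deployment each `invoke()` would of course be a network call against a published service description rather than an in-process function, but the orchestration logic has the same shape.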
Owing to the complexity of LEAD, particularly in the eyes of meteorologists who are not familiar with the technical aspects of a SOA, it is useful to illustrate schematically the logical structure of the LEAD environments. At the fundamental level of functionality, as shown by the top horizontal grey box in figure 1, LEAD enables users to accomplish the following:
- Query for and acquire a wide variety of information including, but not limited to, observational data (including real-time streams) and gridded model output stored on local and remote servers, definitions of and interrelationships among meteorological quantities, the status of an IT resource or workflow, and education modules at a variety of grade levels that are designed specifically for LEAD.
- Simulate and predict using numerical atmospheric models, particularly the WRF model system now being developed by a number of organizations. The WRF model can be run in a variety of modes ranging from basic (e.g. single vertical profiles of temperature, wind and humidity in a horizontally homogeneous domain) to very complex (full physics, terrain and inhomogeneous initial conditions in single forecast or ensemble mode). Other models (e.g. ocean) can be included, but are not fundamentally part of the LEAD system now being created.
- Assimilate data by combining observations, under imposed dynamical constraints, with background information to create a three-dimensional atmospheric gridded analysis. As noted in the tools description below, LEAD supports the Advanced Regional Prediction System (ARPS; Xue et al. 2000, 2001, 2003) data assimilation system and soon will incorporate the WRF three-dimensional variational (3DVAR) data assimilation system (Barker et al. 2005).
- Analyse and mine observational data and model output to obtain quantitative information about the spatio-temporal relationships among fields, processes and features.
- Visualize and quantitatively evaluate the observational data and model output in one-, two- and three-dimensional frameworks using batch and interactive tools.
To achieve these fundamental capabilities, LEAD supports a large number of tools (second grey region from the top in figure 1) ranging from simple services to highly sophisticated meteorological, data mining and visualization packages. Within this array is a subset of foundational application or productivity tools that include:
- the LEAD portal (http://portal.leadproject.org; Gannon et al. 2007a), which serves as the primary, though not exclusive, user entry point into the LEAD environments;
- the ARPS data assimilation system (ADAS; Brewster 1996), a sophisticated tool for data quality control and assimilation, including preparation of model initial conditions;
- MYLEAD (Plale et al. 2004), a flexible personalized data management tool that, at its core, is a metadata catalogue. MYLEAD stores metadata associated with the data products generated and used in the course of scientific investigations and education activities;
- the WRF model (Michalakes et al. 2000; Skamarock et al. 2007), a next-generation atmospheric prediction and simulation model that runs on single or multiple processors at grid spacings ranging from metres to hundreds of kilometres;
- Algorithm Development and Mining (ADaM; Rushing et al. 2005), a powerful suite of tools for mining observational data, assimilated datasets and model output; and
- the Integrated Data Viewer (IDV; Murray et al. 2003), a widely used desktop application for visualizing, in an integrated manner, a broad array of multidimensional geophysical data in one, two and three dimensions.
The power of LEAD lies not only in the capabilities of its various tools individually, but also more importantly in the manner in which they can be linked together in workflows to solve a broad array of problems (e.g. Droegemeier et al. 2005). The tangible outcomes (bottom bar in figure 1) include datasets, model output, gridded analyses, animations, static images and a wide variety of relationships and other information that lead to new knowledge, understanding and ideas. The fabric in figure 1 that links the top set of requirements with the bottom set of outcomes-namely, enabling functions and extensive middleware and service capabilities-is the cyberinfrastructure that makes LEAD functional.

Dynamic adaptation
The ability of LEAD to dynamically adapt to changing weather represents a challenge of considerable nonlinearity that can best be illustrated through a simple example (see Droegemeier et al. (2005) for real cases). The far left side of figure 2 depicts observations transmitted by a variety of observing systems, including the NEXRAD Doppler radar network, as well as forecast model output. These data can be processed, separately or in parallel, by a data mining system (marker 1), within which exists a persistent agent that searches for user-defined atmospheric conditions associated with the development of deep convection (e.g. instability, precipitation on radar of a certain intensity or vertical extent). Alternatively, the observations and model output can be assimilated using algorithms, such as ensemble Kalman filtering (e.g. Tong & Xue 2005) or four-dimensional variational methods (e.g. Rihan et al. 2005), to produce gridded fields of all relevant atmospheric quantities. The mining agent can then be applied to the resulting quantities to search for specified values or patterns indicative of convection.
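The persistent mining agent described above can be sketched as a simple threshold check applied to each incoming data granule. This is an illustrative sketch only, not LEAD's actual mining code: the field names, thresholds and criteria (reflectivity plus CAPE as an instability measure) are assumptions chosen for exposition.

```python
# Hypothetical sketch of a persistent agent that scans incoming fields
# for user-defined convective criteria, e.g. radar reflectivity above a
# threshold together with sufficient instability (CAPE).

def convection_detected(max_reflectivity_dbz, cape_j_per_kg,
                        dbz_threshold=40.0, cape_threshold=1000.0):
    """Return True when both the radar-intensity and instability
    criteria indicating possible deep convection are met."""
    return (max_reflectivity_dbz >= dbz_threshold
            and cape_j_per_kg >= cape_threshold)

def scan_granules(granules):
    """Poll each new data granule and collect regions that should
    raise a trigger event for the downstream workflow."""
    triggers = []
    for g in granules:
        if convection_detected(g["max_dbz"], g["cape"]):
            triggers.append(g["region"])
    return triggers

regions = scan_granules([
    {"region": "O'Hare", "max_dbz": 52.0, "cape": 1800.0},
    {"region": "Denver", "max_dbz": 25.0, "cape": 600.0},
])
# regions -> ["O'Hare"]: only the first granule meets both criteria
```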
If such conditions or features are found, LEAD automatically triggers a WRF numerical forecast (marker 2), the specific grid spacing, domain size, forecast duration and allowable wall clock time of which are communicated by a brokering agent to the TERAGRID (marker 3; www.teragrid.org). If resources available within the TERAGRID at that time are insufficient, LEAD adjusts model parameters based upon user-assigned priorities until the job can be run and output returned sufficiently quickly. The resulting output (marker 4) is analysed by the same data mining engine used previously in an attempt to identify regions in which targeted observations might improve forecast quality (this operation could be performed on sensitivity fields as well; e.g. Errico & Vukicevic 1992; Park & Droegemeier 2000). If such regions are found, an agent communicates with an adaptive observing system, such as the radars being developed by the US National Science Foundation (NSF) Center for the Collaborative Adaptive Sensing of the Atmosphere (Brotzge et al. 2006; Plale et al. 2006a), and new targeted observations are collected. The process then repeats or is modified automatically if other specified criteria are met.
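The brokering step at marker 3, in which model parameters are degraded until the job fits the available allocation, can be sketched as a simple negotiation loop. The cost model and the degradation order (coarsening grid spacing first) are illustrative assumptions, not the actual LEAD broker logic.

```python
# Hedged sketch of resource negotiation: if the requested forecast
# cannot be scheduled in time, coarsen the lowest-priority parameter
# (here, grid spacing) until the job fits or no configuration works.

def forecast_cost(grid_spacing_km, duration_h, domain_cells):
    # Cost grows with duration and roughly with the inverse cube of
    # grid spacing (finer grids need more points and smaller timesteps).
    return domain_cells * duration_h / grid_spacing_km ** 3

def negotiate(available_cpu_hours, grid_spacing_km, duration_h,
              domain_cells=1_000_000, max_spacing_km=12.0):
    """Coarsen the grid until the job fits the allocation, or give up."""
    while forecast_cost(grid_spacing_km, duration_h,
                        domain_cells) > available_cpu_hours:
        if grid_spacing_km >= max_spacing_km:
            return None  # cannot satisfy even the coarsest configuration
        grid_spacing_km *= 2  # degrade resolution one step
    return {"grid_spacing_km": grid_spacing_km, "duration_h": duration_h}

# A 1 km, 6 hour request that does not fit is coarsened to 2 km.
plan = negotiate(available_cpu_hours=5_000_000,
                 grid_spacing_km=1.0, duration_h=6)
```

A real broker would negotiate over several parameters at once (duration, domain size, ensemble membership) according to the user-assigned priorities mentioned above; the loop structure is the same.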
It is important to recognize that the objective of adaptive systems is to improve upon their static counterparts in some manner, ideally one that formally optimizes, or at least quantitatively improves upon, certain aspect(s) of performance. Systems or components may adapt in time, space or modality, and the adaptation can be automated, manual, objective, heuristic, etc. Furthermore, adaptation can occur in a variety of locations within the system (i.e. within the workflow or within a model), at multiple levels and in highly connected, nonlinear ways.
It is important to note that dynamic adaptation has limitations as well. For example, by repeatedly being exposed to output from a prediction model running in the same configuration day after day, forecasters learn model idiosyncrasies and thus are able to account for them when interpreting output. Furthermore, many forecast products need to be available on a regular basis, and thus products generated via dynamic adaptation are viewed as an important supplement to, but not a wholesale replacement for, them. As described in §6, LEAD is exploring the modality and value of adaptation in the context of severe convective storm forecasting.

Creating a mesoscale numerical forecast using LEAD
A user unfamiliar with the enormous technical complexities of producing a single mesoscale model forecast, even a static one, using real observations and data assimilation can, with LEAD, configure such a forecast in a matter of minutes. (At the 2006 Unidata Users Workshop, in which some 50 faculty, staff and researchers applied LEAD in this manner, Prof. David Dempsey of San Francisco State University noted that 'I spent days last summer learning how to install, configure, run and display output from the WRF model. With LEAD, I was able to do virtually the same thing in part of an afternoon, and I needed far less computer expertise to do it'.) Note that LEAD does not hide the complexity of the associated cyberinfrastructure but rather, by placing it in a web services framework, allows the user to manage it effectively and thereby focus considerably more attention on the research or education problem at hand. The first step in conducting a forecast involves selecting observations and, for this, LEAD has developed a complete data query, acquisition and storage system (Plale et al. 2004) that leverages existing infrastructure, including the Unidata framework and THREDDS catalogues and servers. Observations from numerous datasets relevant to mesoscale meteorology are held online for up to 6 months and made accessible to the user via a geographical selection tool (figure 3). Next, the EXPERIMENT BUILDER (figure 4a) is invoked, allowing the user to define projects and, within them, conduct experiments. In most cases, experiments involve executing workflows (Gannon 2007; Gannon et al. 2007b), which consist of individual web services chained together to complete a given task (e.g. assimilate the data to produce initial conditions, execute a model, send output to a visualization engine, store the output). In LEAD, a graphical tool (figure 4b) is used to compose, edit and monitor workflows, which are invoked on a specified resource to perform the desired work. Users may select workflows from a repository and use them as is, modify them, or generate their own. User-developed tools (e.g. a forecast verification code) can be converted into and registered as services within LEAD and thus added to workflows. In order to ensure quality of service for time-critical meteorological experiments, especially forecasts (see §6), LEAD developed a fault tolerance recovery system (Fowler et al. 2008; Kandaswamy et al. 2008) based upon concepts such as over-provisioning and job resubmission.
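The over-provisioning and job-resubmission ideas behind such fault tolerance can be sketched as follows. This is an illustrative outline under stated assumptions, not the cited implementation: the deadline handling, resource names and error model are all hypothetical.

```python
# Sketch of deadline-aware resubmission: a time-critical step is tried
# on each candidate resource in turn until one succeeds or the forecast
# deadline passes. Over-provisioning corresponds to holding more
# candidate resources than a single run strictly needs.
import time

def run_with_resubmission(submit, resources, deadline_s,
                          clock=time.monotonic):
    """Try each resource in turn; resubmit on failure while time
    remains. `submit(resource)` returns output or raises RuntimeError."""
    start = clock()
    last_error = None
    for resource in resources:
        if clock() - start > deadline_s:
            break  # no point starting a run that cannot finish in time
        try:
            return submit(resource)
        except RuntimeError as err:
            last_error = err  # record and fall through to the next resource
    raise TimeoutError(f"no resource completed in time: {last_error}")

# Usage: the first (hypothetical) node fails; the job moves to the next.
def flaky_submit(resource):
    if resource == "node-a":
        raise RuntimeError("node failure")
    return "forecast output"

result = run_with_resubmission(flaky_submit, ["node-a", "node-b"],
                               deadline_s=60)
# result == "forecast output": node-a failed, node-b picked up the job
```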
Given the numerous functions performed by LEAD, the automated generation and cataloguing of metadata is critical for managing large datasets, maintaining provenance information and allowing users to organize and search hundreds of experiments that might be part of a single project. LEAD performs these tasks using a metadata schema built upon existing geophysical standards (Plale et al. 2006b; Simmhan et al. 2007). Metadata, output and other information are stored automatically in the MYLEAD catalogue (figure 5; Plale et al. 2004, 2007), which is a personal workspace for each user that accommodates sharing across a variety of domains. Users may visualize forecast output using the Unidata IDV (Murray et al. 2003) and apply a variety of post-processing tools including the ADaM data mining environment (Rushing et al. 2005).
Real-time application of LEAD to severe weather prediction

(a) Static deterministic and ensemble forecasts
For more than a decade, the Center for Analysis and Prediction of Storms (CAPS) at the University of Oklahoma has collaborated with the National Oceanic and Atmospheric Administration (NOAA) Storm Prediction Center (SPC) and National Severe Storms Laboratory to study fine-scale atmospheric predictability via real-time forecasts performed during the US spring severe weather season. During spring 2005, this work involved using the WRF model to create 2 km grid spacing forecasts over two-thirds of the continental USA, with initial conditions specified by the National Centers for Environmental Prediction (NCEP) operational model analysis (Kain et al. 2005, 2008). The forecasts provided dramatic evidence that the predictability of organized deep convection is, in some cases, an order of magnitude longer (1 day) than suggested by prevailing theories of atmospheric predictability (e.g. Lilly 1990).
In 2007, with additional funding from NOAA, LEAD applied some of its technology to the NOAA Hazardous Weather Test Bed (http://www.nssl.noaa.gov/hwt/), which is a multi-institutional programme designed to study future analysis and prediction technologies in the context of daily operations. The 2007 effort sought to go well beyond previous capabilities by addressing the following two important LEAD-related challenges: (i) the use of storm-resolving ensembles for specifying uncertainty in model initial conditions and quantifying uncertainty in model output and (ii) the application of dynamically adaptive, on-demand forecasts that are created automatically, or by humans, in response to existing or anticipated atmospheric conditions. Specific goals include providing an initial assessment of the following:
- the quantitative skill of storm-resolving ensemble forecasts compared to their deterministic counterparts at similar (experimental) and coarser (operational) grid spacings;
- the predictability of deep convection and organized mesoscale convective systems;
- the extent to which dynamically adaptive prediction leads to quantitative forecast improvements, possible negative consequences of adaptation and an evaluation of strategies for making decisions regarding when, where and how to adapt; and
- the ability of the TERAGRID to accommodate both scheduled and on-demand applications that have strict quality of service requirements and use a substantial portion of available resources for an extended period of time.

The 2007 Spring Experiment (see also Kong et al. 2007a; Weiss et al. 2007; Xue et al. 2007) extended from 15 April to 8 June with all forecasts run on dedicated NSF TERAGRID resources at the National Center for Supercomputing Applications (NCSA) and the Pittsburgh Supercomputing Center (PSC).
The forecast suite included the following:
- a 33 hour, 10 member, two-thirds continental US-scale ensemble at 4 km grid spacing (run at PSC using a mixture of initial condition and physics perturbations);
- a 33 hour, single 2 km grid spacing deterministic forecast in the same domain as the ensembles;
- one or more 6-9 hour nested grid forecasts at 2 km spacing launched automatically over regions of expected severe weather, as determined by mesoscale discussions or tornado watches (run at NCSA); and
- one 6-9 hour nested grid forecast, per day, at 2 km grid spacing launched manually when and where deemed most appropriate (run at NCSA).
Much of the technology used to run the ensemble forecasts already existed within CAPS and thus was reused. An important aspect of the Spring Experiment was that the daily forecasts were evaluated not only by operational forecasters in the NOAA SPC, but also by dozens of faculty, staff and researchers who visited the Hazardous Weather Test Bed in Norman, Oklahoma during the 7 week period. A formal procedure is employed by the SPC to evaluate the daily forecasts, and additional details may be found in Kain et al. (2008).
To provide an example of the sorts of output generated by the WRF model in comparison with the observations (see Xue et al. (2007) for more details), figure 6 shows NEXRAD radar composite reflectivity (a proxy for instantaneous precipitation intensity, with the composite value computed as the maximum over all altitudes within a given vertical column) at 1800 on 24 May 2007, and figure 7 shows the ensemble mean and spread of forecast reflectivity, the ensemble-derived probability of reflectivity exceeding 35 dBZ and a 'spaghetti' plot of 40 dBZ reflectivity contours, valid at 1800 on 24 May 2007. Owing to the spatially discrete nature of the convection, the magnitude of the ensemble mean is not very meaningful (Kong et al. 2007b); the ensemble mean will almost surely underestimate intensity.
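The composite reflectivity defined above is simply the column maximum of the three-dimensional reflectivity field. A minimal NumPy illustration (the array shapes and values are hypothetical):

```python
# Composite reflectivity: the maximum over all altitudes within each
# vertical column of a 3-D reflectivity field.
import numpy as np

def composite_reflectivity(reflectivity_dbz):
    """reflectivity_dbz: array of shape (nz, ny, nx); return the
    column maximum, i.e. the maximum over the vertical axis (axis 0)."""
    return reflectivity_dbz.max(axis=0)

field = np.zeros((3, 2, 2))   # 3 vertical levels on a 2x2 horizontal grid
field[1, 0, 0] = 45.0         # a strong echo at mid-levels in one column
composite = composite_reflectivity(field)
# composite has shape (2, 2); composite[0, 0] == 45.0
```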
The only region where the mean reflectivity exceeds 40 dBZ is southeast Minnesota (figure 7a), where the cold front consistently anchored the precipitation among most of the ensemble members (Xue et al. 2007). Maximum mean reflectivity remains in the 30-40 dBZ range along the line, and the positioning of the middle portion of this line is no better than that of the control or 2 km forecast. Figure 7b shows that the spread of reflectivity has values between 20 and 25 dBZ along much of the line, indicating significant uncertainties at the convective scale. Figure 7c shows that the pattern of greater than or equal to 50 per cent probability of reflectivity exceeding 35 dBZ matches that of the largest ensemble mean reflectivity very well, and therefore has a similar position error. The spaghetti plot in figure 7d suggests more spread in the position than the probability field and is therefore more indicative of position uncertainty.
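The ensemble products discussed above (mean, spread and exceedance probability) are straightforward reductions across the member dimension. A small NumPy sketch with hypothetical member values, which also illustrates why the mean underestimates intensity for spatially discrete convection:

```python
# Ensemble mean, spread (standard deviation across members) and the
# probability of reflectivity exceeding 35 dBZ at each grid point.
import numpy as np

def ensemble_products(members_dbz, threshold=35.0):
    """members_dbz: array of shape (n_members, ny, nx)."""
    mean = members_dbz.mean(axis=0)
    spread = members_dbz.std(axis=0)
    exceedance_prob = (members_dbz > threshold).mean(axis=0)
    return mean, spread, exceedance_prob

# Four members at a single grid point: two intense, two weak echoes.
members = np.array([[[50.0]], [[20.0]], [[40.0]], [[10.0]]])
mean, spread, prob = ensemble_products(members)
# mean[0, 0] == 30.0, well below the strongest member (50 dBZ), while
# prob[0, 0] == 0.5 (two of four members exceed 35 dBZ)
```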
(b) Dynamically adaptive forecasts

As noted previously, LEAD also conducted two types of on-demand, dynamically adaptive forecasts during the 2007 Hazardous Weather Test Bed experiment: those launched at the discretion of forecasters and those launched automatically over regions of expected hazardous weather, as determined by mesoscale discussions or severe weather watches. The former are described in Brewster et al. (2008), so we focus here on the latter.
Using its ENSEMBLE BROKER client (Alameda et al. 2007), the interface for which is known as SIEGE, NCSA developed a simple trigger capability that parsed mesoscale discussion and severe weather watch information from the SPC, via a Really Simple Syndication (RSS) feed, and instructed the ensemble broker to launch 6 hour WRF forecast workflows accordingly (Wilhelmson et al. 2008). Typically, 18 km singly nested and 2 km triply nested forecasts were triggered automatically using the NCEP North American Model (NAM; e.g. Rogers et al. 2005) gridded data and the WRF processing package for initialization. This trigger service also generated identifiers that made deriving statistics from the end products relatively easy. The domain centres of all forecasts triggered automatically in this manner, over 1000 in total, are shown in figure 8, and quantitative evaluation of the results is now underway.

The 2008 experiment followed closely the 2007 model configuration, except for the use of a larger domain, forecasts initiated at 0000 with a duration of 30 hours, and the assimilation of WSR-88D (NEXRAD) level II data in all but one of the ensemble members. For additional details, see Xue et al. (2008) and Kong et al. (2008).

The verification of forecasts containing explicitly resolved convection poses a considerable challenge, as noted by Kong et al. (2007a,b), owing to the fact, for example, that a forecast having zero overlap between a predicted storm and its real-world counterpart may have significant practical value, yet produce a very low skill score based on conventional statistical measures. Adding dynamic adaptation makes verification even more problematic. With these challenges in mind, results from the experiments described above are being analysed, both qualitatively and quantitatively, and will be reported in forthcoming papers.
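The RSS-based trigger logic described in this section can be sketched as follows. This is a hypothetical outline, not the NCSA SIEGE code: the keyword list, item fields and request structure are assumptions made for exposition.

```python
# Hypothetical sketch of watch-based triggering: poll an RSS feed of
# SPC products and emit one forecast-workflow request for each new
# mesoscale discussion or severe weather watch.
TRIGGER_KEYWORDS = ("mesoscale discussion", "tornado watch",
                    "severe thunderstorm watch")

def extract_triggers(feed_items, seen_ids):
    """feed_items: dicts with 'id', 'title', 'lat', 'lon'.
    Return workflow requests for unseen items matching a keyword;
    seen_ids is updated in place so items trigger only once."""
    requests = []
    for item in feed_items:
        if item["id"] in seen_ids:
            continue  # already triggered a forecast for this product
        if any(k in item["title"].lower() for k in TRIGGER_KEYWORDS):
            seen_ids.add(item["id"])
            requests.append({"centre": (item["lat"], item["lon"]),
                             "duration_h": 6, "model": "WRF"})
    return requests

seen = set()
launches = extract_triggers(
    [{"id": "md-101", "title": "Mesoscale Discussion 101",
      "lat": 35.2, "lon": -97.4}],
    seen)
# launches contains one 6 hour WRF request centred on (35.2, -97.4)
```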

LEAD in education
LEAD operates an education and outreach programme (Yalda et al. 2005; Clark et al. 2007; Meyers et al. 2007) that seeks to bring authentic learning experiences to students across a variety of age groups. One example through which LEAD has engaged college and university students is the weather challenge (WXCHALLENGE).
During the past few decades, the academic meteorology enterprise has supported a national collegiate weather forecast contest (WXCHALLENGE; http://www.wxchallenge.com/) that seeks to engage both graduate and undergraduate students in practical forecasting under a variety of geographical and phenomenological circumstances. In it, individual participants forecast maximum and minimum temperature, precipitation category and maximum sustained wind speed for selected US cities. WXCHALLENGE provides students with an opportunity to compete against their peers and faculty mentors at other institutions (64 nationwide in 2006-2007), the prize being a trophy and bragging rights for a full year. In spring 2007, 75 students and faculty staff from 10 institutions (seven non-LEAD and three LEAD), who already were participating in WXCHALLENGE, were invited to join a pilot programme, comprising two 2 week station forecast periods plus a 3 week tournament extension, in which they were allowed to generate their own daily WRF forecasts from the LEAD portal and use them, in conjunction with standard products from the National Weather Service, to prepare their forecasts (Clark et al. 2008). The pilot project was designed to broaden the exposure of LEAD to a limited number of users for the purpose of evaluating system stability and reliability, ease of learning and use, the ability of the TERAGRID to accommodate dozens of on-demand forecasts, the potential benefits wrought by local models applied to local forecast problems and the manner in which students chose to configure their forecasts. Perhaps most importantly, WXCHALLENGE placed very sophisticated technology under the control of students, thus providing a hands-on opportunity to learn about numerical modelling, physics parametrizations, computational fluid dynamics, web services, etc. (Clark et al. 2008).
During the 7 week pilot project, users launched a total of 279 workflows and generated 0.6 TB of output. Over 160 processors were reserved on the Tungsten system at NCSA 5 days each week from 1000 to 2000 hours EDT. Upon completion of the pilot programme, participants were asked to complete a survey, the results of which are being used to refine the production release of the portal and prepare for an expanded release to a much larger population of WXCHALLENGE participants in autumn 2008 (see Clark et al. 2008).

Future enhancements, opportunities and remaining challenges
LEAD research and development during 2008-2009 focuses principally on enhancements designed to increase the functionality and stability of the system and thus to enable broad, sustained use, maintainability and extensibility well beyond the NSF ITR grant, which ends in autumn 2009. Foremost is the ability for users to edit namelist input files and incorporate them into workflows for key applications, such as WRF and ADAS. Closely related is the capability for users to edit, compile and manage their own versions of application source codes (e.g. WRF)-a capability that now exists, but has not yet been exposed in a general fashion. Also being developed is the capability for users to run 'parametric workflows', in which selected parameters of the WRF (for example) can be varied systematically over a specified range within a single workflow.
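The 'parametric workflow' capability described above can be illustrated with a small sketch. The parameter names (`dx_km`, `mp_physics`, `forecast_hours`) are hypothetical stand-ins for WRF namelist settings, and the sweep mechanism is an assumption made for exposition, not the actual LEAD implementation.

```python
# Sketch of a parametric workflow: vary selected namelist parameters
# over specified ranges and emit one complete run configuration per
# combination, all within a single logical workflow.
from itertools import product

def parametric_configs(base_namelist, sweep):
    """base_namelist: dict of namelist settings.
    sweep: dict mapping parameter name -> list of values to try.
    Yields one complete namelist per combination of swept values."""
    names = list(sweep)
    for values in product(*(sweep[n] for n in names)):
        config = dict(base_namelist)      # copy, then override swept keys
        config.update(zip(names, values))
        yield config

# Hypothetical base configuration; sweep the microphysics scheme.
base = {"dx_km": 2.0, "mp_physics": 8, "forecast_hours": 6}
configs = list(parametric_configs(base, {"mp_physics": [2, 6, 8]}))
# three configurations, identical except for the microphysics scheme
```

Sweeping several parameters at once simply multiplies the combinations, so in practice such a capability would be paired with the resource brokering described earlier to keep the total job count feasible.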
The LEAD vision also involves building upon growing education and outreach programmes to extend many of the LEAD resources into progressively lower grade levels and into communities for which even basic capabilities are unavailable. Because weather is experienced by every human and is an excellent motivating factor for studying science, our vision includes using LEAD to stimulate interest and broaden participation in science, technology, engineering and mathematics (STEM) education at the grade levels where most students choose, sometimes unwittingly, to avoid science as a career.