Ten Practical Questions to Improve Data Quality

On the Ground

High-quality rangeland data are critical to supporting adaptive management. However, concrete, cost-saving steps to ensure data quality are often poorly defined and understood. Data quality is more than data management. Ensuring data quality requires 1) clear communication among team members; 2) appropriate sample design; 3) training of data collectors, data managers, and data users; 4) observer and sensor calibration; and 5) active data management. Quality assurance and quality control are ongoing processes that help rangeland managers and scientists identify, prevent, and correct errors in past, current, and future monitoring data. We present 10 guiding data quality questions to help managers and scientists identify appropriate workflows to improve data quality by 1) describing the data ecosystem, 2) creating a data quality plan, 3) identifying roles and responsibilities, 4) building data collection and data management workflows, 5) training and calibrating data collectors, 6) detecting and correcting errors, and 7) describing sources of variability. Iteratively improving rangeland data quality is a key part of adaptive monitoring and rangeland data collection. All members of the rangeland community are invited to participate in ensuring rangeland data quality.


Introduction
High-quality data are a critical component of rangeland research and management, where short- and long-term implications of management decisions have significant policy, economic, and ecological impacts. Data collected on rangelands are diverse, gathered by observers, sensors, and remote sensing through inventories, monitoring, assessments, and experimental studies. Rangeland data are used and re-used in a variety of management and research contexts. Rangeland data applications include, but are not limited to, adjusting stocking rates 1 ; evaluating conservation practices 2 ; assessing land health at local, regional, and national scales [3][4][5] ; determining restoration effectiveness 6,7 ; developing or improving models 8,9 ; and advancing our understanding of rangeland ecosystem responses to management decisions 10 and natural disturbances. 11 To evaluate progress toward meeting management objectives, managers often use a combination of datasets. 12 Use-based monitoring, such as forage utilization, enables managers to adapt management in response to short-term thresholds. 1 Site-scale monitoring data collected using probabilistic sample designs are often used to infer condition and trend across spatial and temporal scales, 13 such as the Natural Resources Conservation Service (NRCS) National Resources Inventory (NRI) and Bureau of Land Management (BLM) Assessment, Inventory, and Monitoring (AIM) programs. In all uses of rangeland data, confidence in data-supported decision-making is boosted by high-quality data and eroded by errors and data issues. These issues also relate to rangeland research, where inference from research studies, experimental monitoring, treatments, and practices is also used to support management decisions. 6
For example, the National Wind Erosion Research Network (NWERN) uses a small number of research sites to calibrate dust emission models that can then be run on monitoring datasets such as AIM and NRI to provide managers and conservation planners with dust estimates. 8 If the data from NWERN were found to be faulty, all subsequent dust estimates across multiple study sites would also be faulty. Therefore, any discussion of rangeland data must be paired with a discussion of data quality among land and natural resource managers, conservation planners, and researchers.
Ensuring data quality involves more than maintaining and managing data. This distinction is often overlooked in rangeland research and management, 14 despite the widely recognized need for quality data to support effective decision-making. Data quality describes the degree to which data are useful for a given purpose due to their accuracy, precision, timeliness, reliability, completeness, and relevancy. 15 Data management is the process of collecting, annotating, and maintaining quality data so they are findable, accessible, interoperable, and re-usable. 16 Recent efforts to improve rangeland data quality have focused on improving the effectiveness of data management, 17 including describing the data lifecycle, 18 building data management plans, 19 following data standards, 20 using metadata, 21 and leveraging software for data management. 22 Although high-quality data are a consequence of good data management, and good data management identifies data quality issues, data management is not the only process that contributes to data quality. Data quality is also the result of clear communication among team members, well-documented study objectives, careful selection of methods and sample designs, adequate training, frequent calibration, and appropriate analysis. 23 All members of the rangeland community, including data managers and data collectors, have a role in improving and maintaining data quality. 14 While the importance of data quality is broadly accepted in the rangeland community, specific steps for ensuring data quality are often unclear, overlooked, or considered synonymous with data management. To address data quality, many inventory and monitoring efforts refer to quality assurance (QA) and quality control (QC) as "QA/QC," but the meaning of QA/QC can be highly variable between programs and individuals. 12,24
The purpose of QA/QC is to increase the repeatability, defensibility, and usability of data by 1) preventing errors whenever possible, 2) identifying errors that do occur, 3) fixing errors with the correct value when possible, and 4) describing and noting remaining errors that cannot be fixed so they can be excluded from analyses. 23 To achieve these goals, all members of a study or monitoring team, including data managers, must have a shared understanding of data quality and of the actions they are responsible for to ensure the desired level of data quality is attained.
We find it useful to separate the term QA/QC into its two components: QA and QC ( Fig. 1 ). QA is a proactive process to prevent errors from occurring. 12 QC is a retrospective process to detect and correct errors 12,23 and includes outlier, logical, and missing data checks and expert review of data, which occur, sometimes iteratively, throughout the data lifecycle. Although QA and QC are two distinct processes, both are question driven. QA asks "What could go wrong? How can we prevent it?" and QC asks "What is going wrong? What did go wrong? Where did it go wrong? Why did it go wrong? Can we fix it?" Because both sets of questions are important, we encourage the rangeland community to adopt "QA&QC" rather than "QA/QC," which implies that one can exist without the other and is frequently interpreted as a single process (QC).
Here we present 10 practical, overarching QA&QC questions for the rangeland community to adopt ( Table 1 ). If asked regularly and answered thoroughly, these 10 questions can be used to establish projects, build data management plans, evaluate existing research and monitoring programs, prioritize limited resources, and improve collaboration within data collection efforts.

Figure 1. The data lifecycle documents the progression of data through planning, data collection, data review, data maintenance and storage, and data analysis and interpretation. Quality assurance occurs continuously throughout the data lifecycle, whereas quality control begins after data are collected. For simplicity we have identified only five lifecycle stages; however, this framework can easily be expanded or contracted to accommodate a different number of lifecycle stages. 14 Additionally, Questions 9 and 10 can be considered QC questions for the current data collection cycle and QA questions to adapt future data collection.

Figure 2. A general conceptual model of the data ecosystem and data flow. Monitoring data can exist in a range of states. Raw data include the original observations or values in paper format, in a personal electronic file (e.g., Excel, Microsoft Access database, or ESRI file geodatabase), or in an enterprise database (e.g., SQL Server or Postgres). Raw data may be transcribed from paper to an electronic file to a database. Indicators are derived from the raw data and can be direct indicators (e.g., bare soil, vegetation composition) or combined with covariates to produce modeled indicators (e.g., dust flux). Data may also exist as interpretations of monitoring data using benchmarks, site-scale analysis, or landscape analysis. For each data state, there is an opportunity for data to degrade due to errors of omission (i.e., missing data), errors of commission (incorrect values or observations), or incorrect assumptions regarding the data. Once raw data are in a degraded state it is extraordinarily difficult to achieve a reference state again, although it may be possible to reverse degraded indicators and interpretations. For every type of data, metadata provide critical "data about the data" that enable the use and re-use of data. Rangeland managers and scientists who work with data can build a more detailed version of this conceptual model, appropriate to their data, to anticipate resource needs, potential weak points in the data flow, and where QA&QC steps can prevent or correct degraded data.

What is my data ecosystem?
Successful implementation of QA&QC is most effective when data collectors, data managers, and data users have a shared understanding of what kinds of data are being collected, how those data are collected and stored, how data will be used, and where there are opportunities for error. 19 To build this shared understanding, we recommend constructing a conceptual diagram of the data ecosystem ( Fig. 2 ). In describing the data ecosystem, scientists and managers identify the different kinds of data they are working with, how those data might be transformed from data collection to data storage to data analysis, and how those data will be documented through metadata. This helps identify where personnel and technological (e.g., data collection applications, databases, and analysis software) resources are needed and anticipate weak points and opportunities for preventing errors. Within the data ecosystem, it is useful to envision different states (e.g., raw data, calculated indicators or variables, and interpreted data) as well as what each of those states might look like when they are corrupted. If we can anticipate the conditions under which the data no longer accurately represent rangeland condition, it is easier to prevent those issues from occurring. For example, in building a conceptual model of a data ecosystem, a team might notice that they are planning to collect data on paper and store those data in a database. However, the team might note that they currently do not have a process for digitizing the data so that it can be ingested into the database; therefore, additional staff time will be needed to enter and check those data to prevent transcription errors. Similarly, while describing the anticipated analyses, a team realizes that the planned database schema will require transforming data into another data format, so they are able to plan and automate that process.
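As a toy illustration of these data states, a direct indicator can be derived from raw observations and then interpreted against a benchmark. The hit codes, function name, and benchmark value below are invented for the sketch, not drawn from any monitoring protocol:

```python
# Raw data state: pin-drop records from a line-point intercept transect
# (invented codes: "S" = bare soil, "L" = litter, others = species hits).
raw_hits = ["S", "ARTR2", "S", "L", "BOGR2", "S", "L", "ARTR2", "S", "S"]

# Indicator state: bare soil percentage is derived from, not observed in, the raw data.
def bare_soil_pct(hits):
    return 100.0 * sum(h == "S" for h in hits) / len(hits)

# Interpretation state: compare the indicator against a hypothetical benchmark.
indicator = bare_soil_pct(raw_hits)
meets_benchmark = indicator <= 30.0  # e.g., objective: no more than 30% bare soil
assert indicator == 50.0 and meets_benchmark is False
```

An error of commission in `raw_hits` (a mistyped code) would propagate silently into the indicator and the interpretation, which is why the raw data state is the most important one to protect.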
Although calculated and interpreted data can often be restored with some effort as long as the raw data are sound, the opportunities for degraded raw data to be corrected are limited because it is difficult, if not impossible, to replicate field conditions from the raw data collection event. 25 The kind of data (e.g., qualitative vs. quantitative, sensor vs. observational) and the available resources will guide the selection of appropriate data quality actions. 26 The conceptual model of the data ecosystem also recognizes that errors will occur, and therefore includes a process for documenting errors in metadata when they do occur. It is incumbent upon land managers and researchers who collect and use rangeland data to have a detailed conceptual model of their data to enact a data quality plan that promotes a desirable data workflow, preserves data quality, and documents the data and any known issues.

What is my data quality plan?
A data quality plan, informed by an understanding of the data ecosystem (Question 1), can make it easier to anticipate where there are opportunities for error and how those errors can be prevented. A data quality plan describes 1) how sample designs and analyses are checked to make sure they meet objectives, 2) strategies for data collector training and calibration, 3) descriptions of the maximum allowable variability in the data, 4) how to detect errors, 5) how to correct those errors if possible, and 6) how to properly annotate the errors so the original value is still recorded and an explanation of the change is given. For instance, how will the team handle location coordinates that look incorrect? Where will the original value be recorded, and how will the change be described? This is necessary in case the updated value is later proven to be incorrect and an additional change based on the original data is needed.
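As a sketch of point 6, an error correction can be annotated so that the original value survives alongside the change. Everything here (the `Correction` record, field names, and values) is hypothetical, not part of any rangeland data standard:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Correction:
    """One annotated edit: the original value is never overwritten."""
    field_name: str
    original: str
    corrected: str
    reason: str
    editor: str
    edited_on: str = field(default_factory=lambda: date.today().isoformat())

# A plot record plus its running correction log.
plot = {"plot_id": "A-101", "latitude": "106.48"}  # suspicious latitude value
log = []

def correct(record, log, field_name, new_value, reason, editor):
    # Log the original before changing the record, so the edit is reversible.
    log.append(Correction(field_name, record[field_name], new_value, reason, editor))
    record[field_name] = new_value

correct(plot, log, "latitude", "36.48",
        reason="Transcription error; GPS track and plot photo support 36.48",
        editor="data manager")

assert log[0].original == "106.48" and plot["latitude"] == "36.48"
```

Because the log retains the original value and the justification, a later reviewer can revisit or reverse the change if the updated value is itself proven incorrect.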
A data quality plan should encompass the entire data lifecycle ( Fig. 1 ), from sample design to analysis, and address the role of each team member in the data collection effort. 27 Because data quality tasks are often captured across a range of documents, it is important to plan how and where you will describe your data quality plans. 28 In addition to important QA&QC steps recorded in data management plans, other data quality plans might be described in protocol documents, 12 sample design documentation, 29 and analysis workflows. 30 We also encourage developing a process for revising the data quality plan in response to insights gained from collecting, managing, and analyzing data. Assigning version numbers and dates to data quality plans will help future data users understand the data ecosystem at the time data were collected. With a documentation strategy in place, Questions 3 to 10 can be used to populate and improve those data quality and data management plans.

Who is responsible?
Rangeland data collection is often a collaborative, interdisciplinary process. 6 Every member of the monitoring or study team who interacts with data is responsible for maintaining and ensuring the quality and integrity of those data. While in some cases the land manager, project leader, data collector, data manager, analyst, interpreter, and data QC specialist are the same person, often these roles are filled by multiple individuals with different levels of experience or even from different organizations. For instance, the data collector may have little connection to how the data are analyzed and interpreted, whereas the data manager and analyst sometimes are not intimately familiar with the data collection protocols. Within data collection teams, assigned roles and responsibilities also ensure that data quality tasks are appropriately distributed according to skillset. This is particularly important as data collectors also have the greatest power to detect and correct errors before they are embedded in the dataset. Without a shared understanding of how quality data will be collected and stored, errors are likely to occur. Therefore, clearly defining who is responsible for what, and when, is critical to successfully maintaining data quality. 19 Discretely identified roles that clearly tie to the broader monitoring or study objectives empower each member of the team to take ownership of preventing, detecting, correcting, and documenting any errors within their domain and toolset. Detailed timelines of when tasks are to be completed can help budget resources to complete data quality tasks and identify where there might be lapses in data quality due to heavy workload. The longer data stay in a file cabinet or hard drive, the more institutional knowledge is lost as data collectors leave and project leads focus on other projects.
Clearly communicating roles has added benefits when multiple kinds of data are involved, as collecting and managing observational data may have different requirements compared with sensor data. 31

How are data collected?
Data quality steps will differ depending on whether data are collected electronically or on paper data sheets. Electronic data collection applications provide a cost-efficient method of quickly capturing accurate data while reducing error rates. 31,32 For instance, hand-recorded geospatial coordinates are often transposed or erroneous; electronic capture of study locations can reduce this common error. While more and more data collection programs use electronic data collection, 33,34 considerable amounts of rangeland data are still recorded on paper datasheets. Although the costs of equipment purchase, training, and form design to support electronic data capture are greater than for paper, these are up-front investments, whereas the labor costs of data entry and error checking are continual ( Table 2 ). 32 Learning to design electronic forms for field data collection may take time, but once the skill is learned, subsequent forms can be developed quickly with minimal effort and easily shared within the rangeland community, either through rangeland-specific applications (e.g., Database for Inventory, Monitoring, and Assessment, 33 Vegetation GIS Data System, 35 and LandPKS 34 ) or customizable survey software (ESRI Survey123 forms, https://survey123.arcgis.com/; Open Data Kit, https://opendatakit.org/). Electronic data capture also improves data quality through automated data quality checks (see Question 8), automated geospatial data capture, allowable data ranges, field standardization (e.g., only numbers allowed in number fields), controlled domains or options (e.g., plant species name codes) for each field, and automatic linking of different data types (e.g., photos and tabular data).
Cloud-based data uploads from mobile devices to enterprise databases (e.g., ESRI's Survey123 to ArcGIS Online workflow) and automated QC scripts (e.g., the Georgia Coastal Ecosystems sensor QC toolbox) enable real-time error checks that provide feedback to data collectors. This allows data collectors to correct issues during the field season. 30,31 We encourage the rangeland community to explore the many low-cost options for electronic data capture, but recognize that paper data collection may be the appropriate solution for some data collection teams due to lack of resources, the size of the team, or field settings (e.g., wet conditions where waterproof devices are unavailable or remote locations where recharging batteries is difficult). At a minimum, it is important to have a paper data collection plan as a backup, as screen glare, extreme temperatures, low batteries, and lack of signal are all common challenges of electronic data capture. Raw data in an electronic format are also easily ingested into electronic data storage platforms or databases (see Question 5). Emerging mobile data collection platforms (e.g., ESRI Survey123, Open Data Kit) allow for cloud-based data upload and automated data submission. Additionally, a comprehensive data capture and data storage workflow can make rangeland data more readily available for use in data-supported decision-making and research. We anticipate that the availability of electronic data capture applications and central data repositories will continue to increase and become integral to rangeland data collection.
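The automated checks mentioned above (required fields, allowable ranges, controlled domains) can be sketched as a small validation function run at capture or upload time. The rules, field names, and species codes are illustrative, not taken from any specific program:

```python
# Hypothetical controlled domain of species codes for this project.
ALLOWED_SPECIES = {"BOGR2", "ARTR2", "PASM"}

def validate_record(rec):
    """Return a list of error messages; an empty list means the record passed."""
    errors = []
    # Required field: the form will not submit without a plot identifier.
    if not rec.get("plot_id"):
        errors.append("plot_id is required")
    # Allowable range: cover percentages must fall between 0 and 100.
    cover = rec.get("foliar_cover_pct")
    if not isinstance(cover, (int, float)) or not 0 <= cover <= 100:
        errors.append("foliar_cover_pct must be a number between 0 and 100")
    # Controlled domain: only recognized species codes are accepted.
    if rec.get("species_code") not in ALLOWED_SPECIES:
        errors.append(f"unknown species_code: {rec.get('species_code')}")
    return errors

good = {"plot_id": "A-101", "foliar_cover_pct": 42.5, "species_code": "BOGR2"}
bad = {"plot_id": "", "foliar_cover_pct": 142, "species_code": "XXXX"}
assert validate_record(good) == []
assert len(validate_record(bad)) == 3
```

Running such checks on the device gives data collectors immediate feedback while they are still standing on the plot, when errors are cheapest to fix.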

How are the data stored and maintained?
Proper data management before, during, and after a study is one of the most critical, and often overlooked, parts of data quality. 36 Improper data management can lead to loss of data, reduced inference, misleading conclusions, improper exposure of personally identifiable information, an erosion of trust in the data (by stakeholders or the public), and inability for others to use data in both the short and long term. 27 Rangeland data include raw data (see Question 4), as well as calculated indicators or variables, sample design information, interpreted data, additional tables (e.g., crosswalk tables or those with site-level information), geospatial data, and analysis datasets (e.g., benchmarks). Planning for data management includes identifying standard formats for field types (e.g., date, text, integer formats), creating naming conventions, and setting up file and folder structures, backup plans, and security for protected and personally identifiable information. 20,27 Recent technological and practical advances enable data management to proceed more quickly and efficiently than ever before. 32 These advances include practical guidance on structuring data as "tidy data," where each observation is a row, each variable is a column, and each value is a cell. 22 Although plain text files and spreadsheets like Microsoft Excel may be used for storing and visualizing rangeland data, relational databases such as the ESRI file geodatabase and Microsoft Access, open-source databases such as MySQL, and enterprise versions of these databases (e.g., SQL Server, Postgres) allow users to link different kinds of tidy data together in a coherent structure.
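As a minimal sketch of linking tidy tables in a relational database, using Python's built-in sqlite3 module (the table and column names are invented for the example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Two tidy tables: one row per plot, one row per field visit observation.
cur.execute("CREATE TABLE plots (plot_id TEXT PRIMARY KEY, ecological_site TEXT)")
cur.execute("""CREATE TABLE observations (
    plot_id TEXT REFERENCES plots(plot_id),
    visit_date TEXT,
    bare_soil_pct REAL)""")
cur.executemany("INSERT INTO plots VALUES (?, ?)",
                [("A-101", "Loamy Upland"), ("A-102", "Sandy Bottom")])
cur.executemany("INSERT INTO observations VALUES (?, ?, ?)",
                [("A-101", "2023-06-01", 18.0), ("A-102", "2023-06-02", 35.5)])

# A join presents the same data from a different viewpoint for QA&QC review.
rows = cur.execute("""SELECT p.ecological_site, o.bare_soil_pct
                      FROM observations o JOIN plots p USING (plot_id)
                      ORDER BY o.visit_date""").fetchall()
assert rows == [("Loamy Upland", 18.0), ("Sandy Bottom", 35.5)]
```

Because plot attributes live in one table and observations in another, a correction to a plot's ecological site is made once and flows automatically into every query, rather than being hunted down across spreadsheet copies.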
Relational databases 1) improve storage and access to data by allowing users to efficiently organize and search the database, 2) support complex queries and calculations that present the data in different ways, 3) visualize the data from multiple viewpoints to aid in the QA&QC and analysis processes, and 4) centralize data across data collectors and over time. 37 Data management and storage systems also make it easier to share and standardize data, either directly with partners, via web services, or through data repositories. In addition to storing raw, calculated, and analyzed data, data management also includes curating metadata. Metadata enable the re-use of data by providing managers and researchers with the information needed to interpret and use the data. Standardized data formats and metadata documentation (e.g., FGDC, ISO, EML) are most useful when they include data history records, a data dictionary of field name meanings, documented known errors, spatial projection (e.g., NAD83), and date format (e.g., ISO 8601) to guide appropriate use of the data. Metadata provide a validation of data quality to others (see Question 8); thus metadata are a core component of any dataset. 21

Figure 3. Calibration among data collectors (see Question 7) is an important process to minimize observer variability in the line-point intercept method (A), especially when the true value is not known or is difficult to measure. 12 For successful calibration in the BLM AIM and NRCS NRI programs, the line-point intercept absolute range of variability among observers should be less than or equal to 10% (B). 12,50 Photo courtesy of Rachel Burke.

Box 1. Calibration among data collectors
Calibrating data collectors is the primary control on detecting and reducing observer variability in rangeland data collection (see Question 7). Calibration among data collectors, as used by the AIM program, addresses observer and measurement error during data collection. It acts as a mechanism of quality assurance by providing time for data collectors to discuss discrepancies in data and clarify differences in protocol interpretation. Data collection begins only after all data collectors are calibrated. Results of AIM calibration exercises ( Fig. 3 ) are used to identify sources of error and protocol misinterpretations, which allows data collectors and project managers to improve training, protocols, and QA&QC practices to mitigate those specific issues. Calibration data from regional AIM training sessions help observers and instructors identify areas of improvement prior to data collection ( Fig. 3 ). Each observer records measurements on the same transect and those observations are compared. If the range of variability among observers is less than the tolerance range (e.g., 10% for line-point intercept), the calibration is successful and formal data collection may begin. If observers do not successfully calibrate on all indicators for a method, observers discuss the results, identify sources of confusion, and repeat the calibration exercise on a new transect.

How will training occur?
Training is the primary opportunity to ensure that team members understand how to collect, manage, and use data properly and consistently. Frequent training, together with clear roles and responsibilities (Question 3), reduces errors due to personnel turnover and provides staff with updates to protocols and workflows. Rangeland monitoring courses are offered in many university programs to give young rangeland professionals exposure to the rangeland data collection and monitoring community (see Newingham et al., this issue). These university courses, as well as in-person national monitoring training programs and web-based training resources, provide new and experienced users with further guidance (e.g., https://www.landscapetoolbox.org/training). Web-based training activities, including manuals, courses, and recorded presentations, provide an introduction or brief refresher on how to collect data and use data collection tools (e.g., data collection apps, water quality instruments) when travel to in-person training is impractical. For field-based collection methods, we recommend in-person training as the primary learning method, supplemented by web-based training. In the field, instructors can demonstrate techniques, answer questions, and provide feedback to data collectors in a more dynamic way than is possible in remote learning settings. Field trainings should also include data capture, either with electronic apps or on paper data sheets, so that data entry can be reviewed and field data workflows, such as daily backups to avoid data loss, are practiced. In these trainings, data collectors benefit from exercises that involve reviewing data for completeness, correctness, and consistency (Question 8) and making corrections as needed. Ideally, all data collectors would attend an in-person training at the beginning of each field season.
Many monitoring programs, including AIM, NRI, and Interpreting Indicators of Rangeland Health, hold yearly, standardized field trainings to reach the rangeland data collection community.

What is the calibration plan?
Calibration by comparison of measurements to a standard or among data collection specialists helps data collectors identify and correct implementation and equipment errors before they occur during data collection. Calibration is not to be taken lightly. A faulty sensor or an uncalibrated field technician can result in incorrectly collected data, and if the calibration error is within the range of expected values, the error may never be detected, resulting in erroneous conclusions. Depending on the data, calibration may occur between data collectors ( Box 1 , Fig. 3 ), 12 against a known value, 38,39 or through double-sampling (i.e., repeat sampling of the same attribute with two different methods to improve precision). 40 A calibration exercise is successful if the indicator estimated by data collectors is within an allowable range of variability. 12 If an indicator value falls outside the tolerance range, calibration results are reviewed by the team (data collectors, project leaders, and instructors) at the plot to identify the sources of variability and re-train data collectors. Sensor equipment calibration schedules should follow the factory-recommended calibration intervals. For observational data, we recommend that all data collectors calibrate early and often. For instance, following the Monitoring Manual for Grassland, Shrubland, and Savanna Ecosystems, 12 data collectors must successfully calibrate prior to data collection and then monthly or when entering a new ecosystem, whichever occurs first. Similarly, for species composition by weight and other production methods, recalibration may occur more frequently during early and rapid phenological change or when encountering a new precipitation pattern, landform, utilization rate, or vegetation change. If a new data collector joins the data collection team, a calibration event also is triggered.

Figure 4. Visualizing monitoring data can be used to identify outliers, missing data, and other data errors (Question 8). Visual data checks can include looking for consistency or correlation between methods, such as bare ground estimates from the line-point intercept and canopy gap methods (A). Data visualization can also identify where and why incorrect values were entered. For instance, in the BLM AIM and NRCS NRI, data collectors are required to use the ecological site name recognized by the NRCS; however, in some instances those names are unknown to the data collectors, and so the data collectors use a different name or leave the field blank (B). As a result, it may be assumed there is no ecological site ID available, which may not always be the case. In all cases, photos or site revisits are valuable in confirming or correcting errors.

Rangelands
Although it is not common practice to publish calibration results alongside rangeland data, we encourage the rangeland community to adopt this practice. Publishing calibration results can verify that calibration steps were taken and detail the observer variability within the dataset (Question 9). Calibration data are also important when describing advantages and disadvantages between methods and prior to replacing an existing method with a new one. 41 Calibration results may provide opportunities for including observer variability as a covariate in analysis. Public calibration data can identify areas of improvement for teaching data collection methods (Question 6), where if one program is especially successful at calibration, the community can learn from those successful training and data collection practices.
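The pass/fail rule used in these calibration exercises (observer range within an allowable tolerance, e.g., 10% for line-point intercept indicators) can be sketched as below. Treating the tolerance as an absolute range in percentage points is an assumption of this sketch:

```python
def calibration_passes(observer_values, tolerance_pct=10.0):
    """True if the absolute range of one indicator across observers
    is within the tolerance (assumed here to be percentage points,
    e.g., 10 for line-point intercept cover indicators)."""
    return max(observer_values) - min(observer_values) <= tolerance_pct

# Three observers estimate bare soil cover (%) on the same transect.
assert calibration_passes([22.0, 26.0, 30.0]) is True    # range of 8 is within 10
assert calibration_passes([18.0, 26.0, 34.0]) is False   # range of 16 exceeds 10
```

A failed check would trigger the review-and-repeat cycle described in Box 1: discuss discrepancies, identify sources of confusion, and recalibrate on a new transect.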

Are the data complete, correct, and consistent?
Frequent review of rangeland data for completeness, correctness, and consistency will detect errors and missing data in a timely and efficient manner ( Fig. 4 ). Errors detected in this review process are best addressed in the field, during data collector review, but these checks are also important steps in data storage and analysis workflows. Many of these data checks can be automated using digital data collection forms and web-based dashboards (e.g., Tableau, ESRI ArcGIS Insights). Data are complete if every data element is present, so that every field in every data form is filled in for every method required for the project. Data are correct if they are accurate and follow the data collection protocol. For instance, a correct application of the line-point intercept method requires accurate plant identification, proper pin drop technique, and consistent species code selection following a known taxonomic reference (e.g., USDA PLANTS codes, unknown plant protocol) in the correct location on the datasheet. 12 Although data reviewers might find it difficult to check pin drop technique after the fact, if plant identification and the other elements of a pin drop are recorded correctly, the likelihood of other methodological errors is lower. It is also helpful to review data for likely spelling mistakes (e.g., squirel, sqiurrel, squirell), as typos and unclear handwriting result in species misidentification and erroneous values. Data checks can also verify that measured values fall within allowable ranges (e.g., percentages must be between 0% and 100%).

Table 3. Summary of BLM AIM lotic core indicator crew and intra-annual variability (Question 9) as assessed by residual mean square error (RMSE), coefficient of variation (CV), and signal-to-noise (S:N) ratio. 47 We used S:N to assess indicator precision, where S:N < 2 equals low precision; ≥ 2.0 to < 10 equals moderate precision; and ≥ 10 equals high precision. * Indicator rated as having high precision for at least two of the three measures. † Indicator rated as having high precision for at least one measure and moderate for a second. ‡ Indicator rated as having low precision for two or more measures. § Outliers were removed from total phosphorus analyses for one pair of sites in the 2013-2015 study and two pairs in the 2017 study; outlier inclusion resulted in Moderate/Low/Low and Low/Moderate/Low ratings, respectively.
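The completeness and range checks described above are the kind most easily programmed into electronic data capture or review dashboards. As a minimal sketch (the field names, required fields, and allowable ranges are illustrative assumptions, not a published schema), a record-level check might look like:

```python
# Sketch: automated completeness and range checks for one monitoring record.
# Field names, required fields, and ranges are illustrative assumptions.

REQUIRED_FIELDS = ["plot_id", "date", "species_code", "cover_pct"]
ALLOWED_RANGES = {"cover_pct": (0, 100)}  # percentages must be 0-100%

def check_record(record):
    """Return a list of data quality issues for one record (empty = clean)."""
    issues = []
    # Completeness: every required field present and non-empty
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"missing: {field}")
    # Correctness: measured values fall within allowable ranges
    for field, (low, high) in ALLOWED_RANGES.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            issues.append(f"out of range: {field}={value}")
    return issues

# A record with a blank species code and an impossible cover value
issues = check_record({"plot_id": "P1", "date": "2023-06-01",
                       "species_code": "", "cover_pct": 120})
```

Running such checks while crews are still in the field lets errors be corrected during data collector review rather than discovered months later.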
Correct data can also be verified by consistency checks, which confirm that data follow expected patterns 16 or logical relationships among data collection programs, between methods, over time, and within the ecological potential of the site. 38 Method consistency checks, for instance, might verify that stream bankfull channel width is greater than wetted width when sampling below flood stage, or that total canopy gap is equal to or less than bare soil cover ( Fig. 4 ). Ecological consistency checks rely on local knowledge to ensure that rangeland data are consistent with our understanding of ecosystem processes and change. Specific checks include ensuring that species are consistent with ecological site potential and, where repeat measurements are available, that changes in species composition are plausible given climate and management data. Where outliers exist, ecological checks can determine whether those outliers are due to site heterogeneity, extreme conditions, or error. 42 For instance, stream pH values below 6 or above 9 are only possible if substantial alteration has occurred (e.g., acid mine drainage). As rangeland ecosystems change, we urge extreme caution before removing outlier values from analyses, as these values may represent previously unobserved disturbances (e.g., fire, drought, and climate change) or novel ecosystems. 43 We therefore recommend a "preponderance of evidence" approach, using photos and other datasets, to identify erroneous outliers. 29

Quality assurance plans should contain data quality objectives that set desired levels of completeness, correctness, and consistency. 23 If data do not meet these objectives, corrective action is taken, if possible, and all data edits are tracked (see Question 2) with a clear rationale for the edit. If no corrective action is possible, data are omitted if they are clearly wrong or, if they are questionable but not clearly wrong, flagged as suspect with a clear comment about why they may not be appropriate to use in certain analyses. For example, a vegetation cover value deemed too high to be plausible that cannot be fixed would be excluded from an analysis of average cover but could still be included in an occupancy analysis. If electronic data capture is part of the data collection program (see Question 4), many checks for completeness, correctness, and consistency can be programmed into data collection applications to prevent common errors. However, ecological checks generally require manual review of data after collection and a level of expertise that individual data collectors may not have. Photos and data visualization also can assist with these ecological checks ( Fig. 4 ).

Box 2
Studying variance decomposition in the BLM AIM wadeable stream and river core methods
The BLM Lotic AIM program conducted a study to quantify intra-annual variability (see Question 9) for two iterations of the wadeable stream and river AIM field protocol. Approximately 10% of the total monitoring locations were resampled: 25 locations for the first protocol iteration (2013-2015) and 37 for the second (2017). Locations were distributed proportionally among geographical regions and stream types to adequately characterize spatial variation and the types of streams data collectors encountered. Although the study aims included separating sampling and nonsampling error, this proved difficult. To minimize within-season temporal variation and attempt to isolate data collector bias, locations were sampled within 4 weeks of each other. The first study assessed crew variability among all possible pairs of data collectors, and crews were not aware of repeat sampling. The second study assessed crew variability between a single crew and all other crews because of crew logistical constraints. Within-season variability was quantified using residual mean square error (average deviation, in native units, among repeat measurements), the coefficient of variation (variability between repeat measurements scaled to the mean), and the signal-to-noise ratio (estimate of sample variability relative to site variability; Table 3 ). Each measure of variability was rated as corresponding to high, moderate, or low repeatability and then used as a line of evidence to determine overall repeatability of the BLM Lotic AIM wadeable stream and river core indicators. As a result of these two studies, adaptive monitoring principles were applied. 46 Some indicators were omitted from the program (e.g., ocular estimates of instream habitat complexity), and protocol changes were made to others (e.g., floodplain connectivity) to improve consistency among data collectors (see Question 10). Measures of indicator precision were comparable to those of other monitoring programs, 47 which assures data users of the high quality of Lotic AIM data and its comparability to other monitoring programs.
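The method and ecological consistency checks described above can be sketched as simple flagging rules. In keeping with the "preponderance of evidence" approach, suspect values are flagged for review rather than deleted. The field names are illustrative assumptions; the pH bounds follow the 6-to-9 range noted in the text:

```python
# Sketch: consistency checks that flag, rather than delete, suspect values.
# Field names are illustrative assumptions; relationships follow the text.

def consistency_flags(site):
    """Return review flags for one site's measurements (empty = consistent)."""
    flags = []
    # Method consistency: bankfull channel width should exceed wetted width
    # when sampling below flood stage
    if site["wetted_width_m"] > site["bankfull_width_m"]:
        flags.append("wetted width exceeds bankfull width")
    # Ecological consistency: stream pH outside 6-9 is suspect unless
    # substantial alteration (e.g., acid mine drainage) is documented
    if not (6.0 <= site["ph"] <= 9.0) and not site.get("alteration_noted"):
        flags.append(f"suspect pH: {site['ph']}")
    return flags

# Hypothetical site with two inconsistencies
flags = consistency_flags(
    {"bankfull_width_m": 3.2, "wetted_width_m": 4.0, "ph": 5.4}
)
```

Flags like these would then be resolved manually using photos, comments, and local ecological knowledge, since automated rules cannot distinguish a novel disturbance from an error.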

What are the sources of variability?
Even if data are complete, correct, and consistent, it is important to identify the general sources of variation in a dataset. In addition to spatial and temporal ecological variation, rangeland data vary because of differences among data collectors. Collectively, these factors add noise (uncertainty) to rangeland data that obscures our capacity to detect differences among locations or changes through time. 44 Sampling error occurs when an estimate differs from the true value because only a portion of the entire population has been sampled. 12 Sample design, stratification, and sample size influence how well ecological variation is characterized through space and time (see Stauffer et al., this issue, for a review of this topic). Sampling error is an important source of variability and should be considered before collecting or analyzing data. Here we focus on variance components that result from nonsampling errors (i.e., errors not due to the limitations of sample designs in measuring ecological variability), which can be addressed through QA&QC. Sampling and nonsampling variance components can be combined in power analyses to determine the size of changes a data collection effort can detect and to assist with designing better studies ( Box 2 ). 45 Describing variability across data collectors can identify which indicators data collectors struggle to measure consistently ( Box 2 , Question 7) and improve data collection protocols and training ( Box 1 , Question 6). Ultimately, certain indicators may not be measurable at desired levels of precision no matter how many replicates are taken or how well data collectors are trained.
After careful consideration through the adaptive monitoring process, 46 new methods of measuring these indicators may be selected, the indicators may be omitted from the study, or the indicators may only be sampled in situations where the indicators are needed, and less precise data are acceptable.
Quantifying different components of indicator variability is time intensive and expensive; thus, only a few monitoring programs and studies have conducted such analyses. 47 , 48 If similar data are collected across monitoring programs and studies, those data may be used to quantify sampling and nonsampling error across locations and years, but estimates of within-season variability could differ among programs. For example, the precision of stream indicators such as bankfull width, percent fine sediment, and percent stream pool habitat differs among monitoring programs that use relatively similar field methods. 47 Such field measurement variation, or intra-annual variability, can result from the combined effects of measurement variation among different field crews, within-season environmental variability, and changes in location. Intra-annual variability is likely the variance component of most interest to monitoring programs assessing trend across years, because it determines what inferences can properly be drawn in analysis. For example, if percent vegetative cover changes from 80% to 90% between years 1 and 2, but data collected within the same year by two different data collectors differ by 10% at a monitoring location, any change in cover of less than 10% could simply be due to observer bias rather than management changes. Ideally, monitoring programs and long-term studies would quantify variability among crews within a season for each major iteration of a protocol ( Box 2 ).
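As a sketch of the repeat-visit variability measures used in Box 2 and Table 3, the following computes RMSE, CV, and a simple signal-to-noise ratio from paired within-season revisits. The formulas follow the verbal definitions in the text (RMSE in native units, CV scaled to the mean, S:N as among-site variance relative to within-site variance); the data values are hypothetical:

```python
# Sketch: repeat-visit variability measures (illustrative data and formulas).
import statistics

def variability_measures(pairs):
    """pairs -- list of (visit1, visit2) values at the same site in one season."""
    # Within-site (noise) variance: for one pair, the sample variance is d^2 / 2
    within_var = sum((a - b) ** 2 / 2 for a, b in pairs) / len(pairs)
    rmse = within_var ** 0.5  # average deviation in native units
    site_means = [(a + b) / 2 for a, b in pairs]
    cv = rmse / statistics.mean(site_means)  # variability scaled to the mean
    # Signal-to-noise: among-site (signal) variance relative to noise variance
    among_var = statistics.variance(site_means)
    sn = among_var / within_var if within_var > 0 else float("inf")
    return rmse, cv, sn

# Hypothetical paired revisits of four sites (e.g., bankfull width, m)
rmse, cv, sn = variability_measures(
    [(3.1, 3.3), (5.0, 4.6), (2.2, 2.1), (8.4, 8.0)]
)
```

Under the Table 3 thresholds, an S:N of 10 or more would rate as high precision, meaning differences among sites are easily distinguished from revisit noise.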

How can we adapt to do better next time?
Improving rangeland data quality involves using the QA&QC questions to evaluate data and adaptively manage monitoring and research programs. Data collection, especially within monitoring and long-term studies, is an iterative process, with continual improvements based on feedback from the team, metrics from training and calibration, implementation of data management systems, and results of data review. 36 Even in the best data collection systems, mistakes will be made throughout the data collection process, and new situations or "edge cases" may be encountered that highlight opportunities for clarifying protocols. Successful data collection efforts identify and learn from those mistakes and adjust for the next field season or the next study. Rangeland studies and monitoring programs can also learn from each other by sharing these mistakes and lessons learned with the community. Through adaptive monitoring, QA&QC Questions 1 to 9 can be revisited and refined in subsequent monitoring cycles to produce a higher-quality dataset. For example, within the BLM AIM program, data management protocols, calibration protocols, training, and electronic data capture programs are updated and revised annually in response to feedback from data collectors, data users, and errors found during QA&QC. However, we caution against rapid changes in monitoring programs and long-term studies, as substantial shifts can limit power to detect change or differences over space and time. Therefore, when a comparative analysis is critical, care should be taken to ensure that any updates to the monitoring program or study are thoughtfully considered and that other data sources (e.g., remote sensing 11 ) are available to provide a preponderance of evidence in detecting trend. 36

Conclusions
High-quality rangeland data are key to data-supported decision-making and adaptive rangeland management. We have presented 10 QA&QC questions that managers, data collectors, and scientists can address to ensure data quality and thereby increase the efficacy of monitoring and other data collection efforts. The answers to these 10 questions can guide the appropriate personnel, data management tools, and analysis strategies to maintain data quality throughout the data lifecycle. Given the expense of collecting and managing rangeland data, improving data quality workflows will reduce the frequency of costly errors and ensure that rangeland data are fit for use in decision-making and in rangeland research and modeling. In the experience of the authors, high-quality data are also more likely to be collected once and used for many purposes, which increases the efficiency of rangeland monitoring. Research studies and assessment, monitoring, and inventory programs can improve data quality by thoroughly describing the data ecosystem, clearly defining roles and responsibilities, adopting appropriate data collection and data management strategies, identifying sources of error, preventing those errors where possible, and describing sources of measurement variability. Ensuring data quality is an iterative process that improves through adaptive management of monitoring and inventory programs. The QA&QC questions posed in this paper apply to all members of the rangeland community and to all data collected in experimental studies, inventories, short-term monitoring, and long-term monitoring programs. We encourage interagency and interdisciplinary partnerships to discuss these questions early so that data quality is ensured as a collaborative process. Improving data quality will improve our ability to detect condition, pattern, and trend on rangelands, which is needed to improve adaptive management and co-production of scientific research for natural resource management.

Declaration of Competing Interest
S.E.M. is a Guest Editor for the Special Issue and J.W.K. is Editor-in-Chief of Rangelands, but they were not involved in the handling, review, or decision process for this manuscript. The content of sponsored issues of Rangelands is handled with the same editorial independence and single-blind peer review as that of regular issues.