Microphone variability and degradation: implications for monitoring programs employing autonomous recording units

Autonomous recording units (ARUs) are emerging as an effective tool for avian population monitoring and research. Although ARU technology is being rapidly adopted, there is a need to establish whether variation in ARU components and their degradation with use might introduce detection biases that would affect long-term monitoring and research projects. We assessed whether microphone sensitivity impacted the probability of detecting bird vocalizations by broadcasting a sequence of 12 calls toward an array of commercially available ARUs equipped with microphones of varying sensitivities under three levels (32 dBA, 42 dBA, and 50 dBA) of experimentally induced noise conditions selected to reflect the range of noise levels commonly encountered during avian surveys. We used binomial regression to examine factors influencing probability of detection for each species and used these to examine the impact of microphone sensitivity on the effective detection area (ha) for each species. Microphone sensitivity loss reduced detection probability for all species examined, but the magnitude of the effect varied between species and often interacted with distance. Microphone sensitivity loss reduced the effective detection area by an average of 25% for microphones just beyond manufacturer specifications (-5 dBV) and by an average of 66% for severely compromised microphones (-20 dBV). Microphone sensitivity loss appeared to be more problematic for low frequency calls where reduction in the effective detection area occurred most rapidly. Microphone degradation poses a source of variation in avian surveys made with ARUs that will require regular measurement of microphone sensitivity and criteria for microphone replacement to ensure scientifically reproducible results. 
We recommend that research and monitoring projects employing ARUs test their microphones regularly, replace microphones with declining sensitivity, and record sensitivity as a potential covariate in statistical analyses of acoustic data.


INTRODUCTION
Ornithologists use a number of methods to track changes in avian populations, with point count-based methods being one of the most commonly employed approaches (Ralph et al. 1995, Matsuoka et al. 2014). Point counts generally make use of trained observers who conduct surveys over a set duration during which all birds seen or heard within a specified distance of the observer are counted (Ralph et al. 1995, Matsuoka et al. 2014). Point count protocols are widely used in monitoring programs such as the North American Breeding Bird Survey (BBS) to infer population status, trends, and habitat relationships, thus providing an important tool for species conservation (Peterjohn and Sauer 1999, Sauer et al. 2003, Sauer et al. 2013). For many groups of birds, detections during point count surveys are primarily based upon acoustic cues (DeJong and Emlen 1985, Brewster and Simons 2009); therefore, acoustic recording technologies offer an alternative sampling method to supplement traditional point counts (Hobson et al. 2002, Klingbeil and Willig 2015). Although point counts by either human observers or acoustic recordings are widely used, there is increasing recognition that these methods are susceptible to imperfect detection that can lead to biased estimation of species density or abundance (Thompson 2002, Royle et al. 2005). Imperfect detection results from variable availability of cues (e.g., songs) given by a species or individual, and variation in the ability of observers to perceive these signals once they are available (Alldredge et al. 2007a, b, Diefenbach et al. 2007). Although several studies have focused on factors influencing detection probability by human observers (e.g., Alldredge et al. 2007a, b, Simons et al. 2007, Stanislav et al. 2010), less is known about what factors drive variation in detection probability where acoustic recording technologies are used as the primary survey tool. The recent emergence and growing popularity of autonomous recording units (ARUs) in avian monitoring has created a need to investigate how variation in the electronic components of these recording technologies could introduce biases into this survey method.
Interest in the application of ARUs in avian research is in part because of their potential to address several of the known biases associated with traditional point count surveys. Programmability facilitates devising recording schedules that can directly reduce or possibly eliminate biases associated with variation in call availability with respect to time of day or season (Brandes 2008, Venier et al. 2012). Furthermore, the ease of obtaining repeat "visits" to the same location can also facilitate the application of statistical methods such as N-mixture modeling (Royle and Nichols 2003) to estimate and account for biases in detection probability. ARUs are well suited to monitor species that are logistically difficult to monitor (e.g., nocturnal species) using human observers (Goyette et al. 2011, Rognan et al. 2012) and can reduce impacts on wildlife (Carey 2009) as well as potential biases caused by the presence of an observer (Gutzwiller and Marcum 1997, Riffell and Riffell 2002). In addition, recordings have the added benefit of creating a permanent record of the acoustic environment that can be viewed on a spectrogram (Digby et al. 2013), listened to multiple times (Haselmayer and Quinn 2000), analyzed by multiple analysts to verify species identifications (Hobson et al. 2002), or even slowed down to enumerate certain species based on temporal separation between calls (Drake et al. 2016).
Despite the many advantages of ARUs, they also present several potential challenges to application in research and monitoring. Unlike ARUs, human observers can use visual cues, estimate distance to observations, directly associate observations with specific microhabitats, and generally detect birds over a larger area than most sound recorders (Hutto and Stutzman 2009, Sidie-Slettedahl et al. 2015). Although most of the above limitations can be addressed via appropriate survey design, ARUs have the additional challenge that the ability to detect a signal can vary with the choice of recording unit (Venier et al. 2012, Rempel et al. 2013) and microphone specifications (Fristrup and Mennitt 2012). Microphone sensitivity (essentially how efficiently a microphone converts a signal from sound pressure to electrical energy) is a concern not only when initially purchasing recording equipment for a project, but also through time as the equipment is used and potentially degraded.
Similar to long-held concerns regarding the impact of age-related hearing loss in humans on point count surveys (Ramsey and Scott 1981, Emlen and DeJong 1992, Farmer et al. 2014), degradation of microphone sensitivity through time could affect the detection of sounds by the recorder and therefore the conclusions of long-term studies. We are unaware of any studies discussing microphone sensitivity loss relating to bioacoustic monitoring, but other fields have identified sensitivity loss in microphones as an issue. For example, in cochlear implants, more than 25% of microphones examined experienced a gradual loss of sensitivity over time, and a one decibel reduction in sensitivity reduced speech recognition by one word per minute (Razza and Burdo 2011), illustrating the impact that equipment degradation can have on performance. As more monitoring programs incorporate ARUs, it becomes increasingly important to understand how equipment wear might affect detection to ensure that data quality is maintained through time.
We conducted an experiment by broadcasting species calls toward an array of ARUs with microphones of varying quality to assess the effect of microphone sensitivity loss and environmental noise on distance-related probability of detecting sounds recorded by ARUs. Specifically, our objective was to quantify the differences in detection caused by microphone sensitivity loss over a realistic range of survey conditions. We recognize that the types (brand/model) of ARUs and microphones may vary between studies; however, our experiment provides a clear example of how degradation of equipment may affect data collection. We discuss the implications of microphone degradation for long-term projects and provide general recommendations for quality control.

METHODS

Microphone sensitivity
We measured the sensitivity of SMX-II microphones for the Song Meter (Models SM2 and SM2+, Wildlife Acoustics, Maynard, Massachusetts, USA) ARU, which are commonly used in bird monitoring programs. Specifically, we installed the latest firmware version available (version 3.3.7) and set the gain jumpers on the ARUs to 0 dB while leaving all other jumpers at the factory settings (i.e., 2.5 V bias enabled and 3 Hz high-pass filter cut-off). We then set the Song Meters to calibration mode and let the ARU stabilize for a minimum of two minutes prior to microphone calibration. After removing windscreens from the microphones, we attached each microphone to the left microphone jack of the ARU and fit a sound level calibrator (Model 407744, Extech Instruments, Nashua, New Hampshire, USA) over the end of the microphone. The sound level calibrator emits a 1 kHz pure tone at 94 dB, from which a sensitivity reading can be obtained from the ARU. To determine if microphone sensitivity varied in relation to microphone age, we measured the sensitivity of a population of 369 microphones and divided them into three groups based on the number of field seasons (> 1 mo) the microphones were deployed: microphones used during 2 to 4 field seasons (n = 75), microphones used during a single field season (n = 151), and microphones purchased in 2014 but never deployed (n = 143). We performed a Kruskal-Wallis test followed by post hoc Mann-Whitney U tests to determine whether groups differed from one another.
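The age-class comparison above can be sketched as follows. This is an illustrative Python sketch (our analysis was not necessarily run this way) with simulated sensitivity readings; the group means and sample sizes loosely follow the cohorts described in the Results, but the spreads and values are hypothetical, not our measured data:

```python
import numpy as np
from scipy.stats import kruskal, mannwhitneyu

rng = np.random.default_rng(42)
# Simulated microphone sensitivities (dBV); medians follow the Results,
# standard deviations are assumed for illustration only.
new = rng.normal(-40.9, 1.2, 143)          # purchased 2014, never deployed
one_season = rng.normal(-41.9, 1.2, 151)   # one field season
multi_season = rng.normal(-42.8, 1.4, 75)  # two to four field seasons

# Omnibus test: do the three age groups share a common distribution?
h_stat, p_omnibus = kruskal(new, one_season, multi_season)
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_omnibus:.3g}")

# Post hoc pairwise Mann-Whitney U tests between groups
for label, (a, b) in {
    "new vs. one season": (new, one_season),
    "new vs. multi-season": (new, multi_season),
    "one vs. multi-season": (one_season, multi_season),
}.items():
    u_stat, p_pair = mannwhitneyu(a, b)
    print(f"{label}: U = {u_stat:.0f}, p = {p_pair:.3g}")
```

In practice a multiple-comparison correction (e.g., Bonferroni) would usually accompany the pairwise tests.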

Field experiment
We conducted our field experiment in the rural municipality of Foam Lake, Saskatchewan, Canada, in an open field with flat terrain covered by graminoid vegetation ~0.5 m tall.We performed three replicate trials after the breeding season was largely over (26 July-8 August) on nights (22:30-01:45 hours) with little to no wind (average wind 0 to 2.2 km/h; measured using an anemometer; Kestrel 4000 NV Wind Meter, Kestrel Meters, Birmingham, Michigan, USA).Average ambient noise ranged from 32.1 to 33.0 dBA during broadcast trials, measured using a data-logging sound pressure level meter (Model C-322, Reed Instruments, Wilmington, North Carolina, USA).
We deployed 12 triads of ARUs in a linear array spanning 220 m with triads spaced at 20-m intervals (Fig. 1).We used a combination of SM2 and SM2+ units in the array, after first ensuring that all ARUs operated within 0.5 dBV of one another using the sound level calibrator, and repositioned the gain jumpers to factory settings (i.e., 48 dB) while leaving all other jumpers in place.Within each triad, we spaced recorders 15 cm apart based on the position of the microphone jack and aligned the left channel toward the broadcast location because our objectives only required one microphone per recorder.
Using the results of our microphone sensitivity test (described above), we quantified the deviation of the sensitivity of each individual microphone (n = 331) from the mean sensitivity rounded to the nearest integer of our unused cohort of microphones (i.e., -41 dBV).Microphones were then assigned to one of three sensitivity classes: sensitivity loss (0 to 10th percentile; deviation of -20.50 to -2.30 dBV), no sensitivity loss (45th to 55th percentile; deviation of -0.05 to 0.30 dBV), and above mean sensitivity (90th to 100th percentile; deviation of +1.40 to +3.30 dBV).For every broadcast trial, we randomly drew 12 microphones from each of the three sensitivity classes and equipped a recorder within each triad with a microphone from each sensitivity class, permitting us to compare the effect of microphone sensitivity loss on detection under identical conditions.
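The percentile-based class assignment and per-trial draw can be sketched as follows; the deviation values here are simulated for illustration (the real distribution of our 331 microphones is not reproduced):

```python
import numpy as np

rng = np.random.default_rng(7)
# Simulated deviations (dBV) of 331 microphones from the rounded mean
# sensitivity of the unused cohort (-41 dBV); negative = less sensitive.
deviation = rng.normal(0.0, 1.5, 331)

# Percentile windows matching the three sensitivity classes
p10, p45, p55, p90 = np.percentile(deviation, [10, 45, 55, 90])
sensitivity_loss = deviation[deviation <= p10]                  # 0-10th
no_loss = deviation[(deviation >= p45) & (deviation <= p55)]    # 45th-55th
above_mean = deviation[deviation >= p90]                        # 90th-100th

# For each broadcast trial, draw 12 microphones per class so that every
# triad holds one microphone from each sensitivity class.
trial_draw = {
    name: rng.choice(mics, size=12, replace=False)
    for name, mics in [("loss", sensitivity_loss),
                       ("no_loss", no_loss),
                       ("above", above_mean)]
}
```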
During trials, we broadcast white noise to simulate environmental noise experienced under wind conditions representative of those experienced during most bird surveys. We used an experimental approach because it allowed us to assess the effect of ambient noise while minimizing potential complications brought about by variable (uncontrolled) wind speed and direction. Prior to our experiment, we empirically determined the ambient noise experienced under wind conditions falling within standardized protocols for avian surveys such as the BBS and the North American Marsh Bird Monitoring Protocol (Conway 2011). Specifically, we simultaneously measured wind speed (km/h) using a Kestrel 2000 anemometer (Kestrel Meters, Birmingham, Michigan, USA) and sound pressure level (dBA) using a model C-322 sound pressure level meter (Reed Instruments, Wilmington, North Carolina, USA). Based on graphical inspection of the relationship between wind speed and sound pressure level, we simulated Beaufort 3 winds (13-19 km/h) using 50 dBA of white noise and Beaufort 2 winds (6-12 km/h) using 42 dBA of white noise, and did not use any white noise to represent Beaufort 0 (< 2 km/h) and Beaufort 1 (2-5 km/h) winds. To simulate wind noise, we placed speakers 1 m in front of each triad of recorders and broadcast white noise to expose microphones to 42 dBA and 50 dBA of noise for each of the respective "wind trials" (Fig. 2).
We broadcast a sequence of bird calls toward the array of recorders using a FoxPro Firestorm digital game caller (FOXPRO Inc., Lewistown, Pennsylvania, USA). We repeated the broadcast from 20 and 30 m away from the first triad of recorders; combined with the positioning of our 12 triads (above), this produced recording samples at 24 10-m intervals for distances ranging from 20-250 m. The majority of our broadcast sequence consisted of wetland-associated species because ARUs are likely the most effective way to monitor this group of birds (Sidie-Slettedahl et al. 2015). We included the primary vocalizations of American Bittern Botaurus lentiginosus (AMBI), Le Conte's Sparrow Ammodramus leconteii (LCSP), Nelson's Sparrow Ammodramus nelsoni (NESP), Pied-billed Grebe Podilymbus podiceps (PBGR), Sedge Wren Cistothorus platensis (SEWR), and Yellow Rail Coturnicops noveboracensis (YERA). The sequence also contained Sora Porzana carolina (SORA) "per-weep" and "whinny" calls and Virginia Rail Rallus limicola (VIRA) "tick-it" and "grunt" calls. We also included songs of two forest-dwelling species (Black-and-white Warbler Mniotilta varia, BAWW; Ovenbird Seiurus aurocapilla, OVEN) to allow comparison of our findings to those of Alldredge et al. (2007a), who looked at similar effects on surveys by human observers.

Fig. 2. Experimental setup showing one of 12 triads of autonomous recording units used to evaluate the effect of microphone sensitivity loss on detection of calls. We simulated wind noise by broadcasting 42 dBA and 50 dBA (measured 1 m away) of white noise from a pair of speakers placed in front of each triad.
Because call loudness will influence the distance over which calls can be detected, we attempted to have the broadcast sequence reflect the volume of actual bird calls to improve the applicability of our findings. To achieve this, we broadcast species calls toward four experienced birders standing 50 m away (following Hobson et al. 2002), but repeated this over 5 dBA increments and had the birders identify which volumes they considered accurate for each species. To encompass the range of observer estimates and represent a potential range of variation in a species' call loudness, we broadcast each call type at two decibel levels; quieter species (BAWW, LCSP, NESP, SEWR, YERA) were broadcast with a maximum of 95 and 105 dBA at the source, whereas louder species (OVEN, PBGR, SORA, VIRA) were broadcast with a maximum of 105 and 120 dBA at the source (as opposed to 1 m from the source). We broadcast AMBI calls with a maximum of 105 and 111 dBA at the source because we could not create a recording of this species that reached 120 dBA without noticeable distortion. With the use of two data-logging decibel meters, we determined that calls broadcast at 95, 105, and 120 dBA at the source corresponded to approximately 70 (SD 2.9), 75 (SD 4.4), and 86 dBA (SD 4.4) at 1 m away, respectively. These sound levels are consistent with previous studies that have attempted to measure the volume of bird vocalizations (Brackenbury 1979, Drake et al. 2016).
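For intuition about how these broadcast levels decay with distance, free-field spherical spreading loses 6 dB per doubling of distance. The helper below is a simplified sketch that ignores atmospheric absorption, ground effects, and vegetation, all of which add further attenuation in the field:

```python
import math

def spl_at_distance(spl_1m_dba, distance_m):
    """Sound pressure level (dBA) at distance_m metres, given the level
    measured 1 m from the source, under pure spherical spreading
    (inverse square law): SPL(d) = SPL(1 m) - 20 * log10(d)."""
    return spl_1m_dba - 20 * math.log10(distance_m)

# A loud call measured at 86 dBA at 1 m (our 120 dBA-at-source level)
for d in (20, 100, 250):
    print(f"{d} m: ~{spl_at_distance(86, d):.0f} dBA")
```

Under this idealized model the loudest calls remain above our quietest (32 dBA) noise floor across the full 20-250 m array, consistent with the large detection radii we observed for loud species.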
All recordings were processed in a laboratory setting by a single analyst using Raven Pro software (version 1.5 beta) and high-quality, noise-cancelling headphones (Bose® QC15; Bose Corporation, Framingham, Massachusetts, USA). The analyst could use either visual evidence on spectrograms or sound to identify vocalizations. We ensured the analyst was blind to all treatment information (i.e., distance from sound source, microphone sensitivity, and noise level) by saving broadcast sequences as randomly numbered files.

Analysis
Data were entered as binomial outcomes where a call type for a given volume was either detected (1) or not detected (0). For each species, we used generalized linear regression modeling with a binomial error family and a complementary log-log link to examine factors influencing the probability that calls were detected. We chose the complementary log-log link because it tends to provide a better fit to skewed data (Hosmer et al. 2013), and comparison against initial fits using logit link functions suggested better fits, especially near the intercept where probability of detection should equal 1 in our experiment. We treated distance and microphone sensitivity loss as independent continuous variables, whereas white noise treatment was included as a categorical variable. We did not treat broadcast volume as a variable, but rather pooled the two volumes together to account for uncertainty and variability in the actual loudness of species. All models treated distance as a second-order polynomial fit to account for the expected decay in sound pressure level following the inverse square law (Marten and Marler 1977).
We developed a set of a priori candidate models (Table 1) that considered main effects and relevant interactions. For most models, we computed 85% confidence intervals based on the normal approximation interval. However, certain models experienced complete separation; when this occurred, we calculated 85% confidence intervals based on profile likelihoods (Heinze and Schemper 2002) using the MASS package in R (Venables and Ripley 2002).
Table 1. Candidate set of models used in analyses including distance to the sound source (Dist), noise treatment (Treat), microphone sensitivity loss (Mic), and relevant interactions among them.

We used Akaike's information criterion corrected for small sample size (AICc; Burnham and Anderson 2002) to determine the most parsimonious model for each species. We discarded models in which the 85% confidence interval contained zero (Arnold 2010) and defaulted to the next best model that did not contain "pretending parameters" (sensu Anderson 2008). Using the best approximating model, we determined the effective detection radius and subsequently the effective detection area of the ARU for each call type.
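The AICc ranking and normal-approximation intervals reduce to a few lines; this generic sketch uses hypothetical log-likelihoods, not values from our fitted models:

```python
def aicc(log_lik, k, n):
    """AICc: AIC plus the small-sample correction of Burnham and
    Anderson (2002); k = number of parameters, n = sample size."""
    aic = -2.0 * log_lik + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

def ci85(estimate, se):
    """85% confidence interval from the normal approximation;
    1.4395 is the 92.5th percentile of the standard normal."""
    z = 1.4395
    return estimate - z * se, estimate + z * se

# Hypothetical comparison: main effects only vs. added interaction
main_effects = aicc(log_lik=-510.2, k=5, n=1152)
with_interaction = aicc(log_lik=-506.8, k=7, n=1152)
print(main_effects, with_interaction)
```

A model whose extra parameters do not shift the likelihood enough to lower AICc by more than ~2 units is a candidate "pretending parameter" case, which is where the 85% interval check above comes in.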

RESULTS
A cohort of 143 new (unused) microphones had a median microphone sensitivity of -40.9 dBV (range -43.5 to -37.7 dBV; SD 1.2; Fig. 3). We note that our measurements differ by 5 dBV from the manufacturer's sensitivity specifications of -36 ± 4 dBV associated with the internal microphone element, but hereafter use our measurements (i.e., -41 ± 4 dBV) when discussing microphone specifications. Field use had a statistically significant (Kruskal-Wallis test χ² = 130.44, p < 0.0001) effect on microphone sensitivity (Fig. 4), and post hoc Mann-Whitney U tests between the groups show that all groups differ significantly from one another. The median sensitivity of microphones deployed for one season was 1.0 dBV lower (U = 4224, p < 0.0001) than the median sensitivity of new microphones at -40.9 dBV (Fig. 4). Similarly, the median sensitivity of microphones deployed for two or more seasons was 1.9 dBV less sensitive (U = 1126, p < 0.0001) than new microphones at -40.9 dBV, corresponding to a median sensitivity difference of 0.9 dBV (U = 3580, p < 0.0001) between microphones deployed for a single season and microphones deployed for two or more seasons (Fig. 4). Our initial design should have yielded 648 6-minute sound files; however, mechanical failure and human error caused a subset of recordings to be lost or discarded. In total, we processed 576 6-minute sound files, resulting in 13,824 calls across the 12 call types potentially available for detection.
Our model selection process resulted in different models being selected to explain variation in call detection between species; however, all of the selected models included the main effects for microphone sensitivity, distance, and noise treatment (Table 2). The most frequently selected model among species included the main effects as well as an interaction between distance and treatment and provided the best fit for five call types (BAWW, LCSP, SEWR, SORA whinny calls, VIRA grunt calls). Two call types (AMBI and VIRA tick-it calls) included the main effects with the distance by microphone interaction as the best model. A model having main effects with the distance by treatment interaction as well as the distance by microphone interaction was the most appropriate model for NESP and YERA calls. Finally, three call types (OVEN, PBGR, SORA per-weep calls) had top models that only included the main effects. LCSP was the only species for which we selected the second-ranking model as our best model because the 85% confidence interval around the estimate included zero. A summary of parameter estimates and model ranking is available in Appendices 1-3.

Fig. 4. The centerline of each box represents the median microphone sensitivities of -40.9, -41.9, and -42.8 dBV, respectively. Boxes represent the data between the 25th and 75th percentiles, and the whiskers represent 1.5 times the interquartile range. Outliers in each population are represented by dots. The solid and dotted horizontal lines depict the stated manufacturer specifications of mean sensitivity and expected variation of -36 dBV ± 4 dBV.

For all species, increasing distance, noise, and loss of microphone sensitivity decreased detection probability (see Appendix 4). The effect of microphone sensitivity loss on detection decreased with increasing distance for the four species (NESP, …). We identified the effective detection radius of each call type under various noise treatments and with increasing levels of microphone sensitivity loss (Table 3) to determine the effective detection area. We did not calculate the effective detection area of loud species under ambient conditions because their effective detection radii were beyond our experimental range of 250 m. Additionally, NESP calls exceeded our experimental range under ambient conditions, so we excluded this species from the quiet group when summarizing our results.
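Given a fitted detection function, the effective detection radius and area follow by integrating detection probability over the survey circle; the coefficients below are hypothetical placeholders, not estimates from our models:

```python
import numpy as np
from scipy.integrate import quad

def detection_prob(r, b0=4.0, b1=-0.025, b2=-1e-5):
    """Detection probability at r metres from a cloglog model with a
    second-order polynomial in distance (hypothetical coefficients)."""
    eta = b0 + b1 * r + b2 * r**2
    return 1.0 - np.exp(-np.exp(eta))

def effective_detection_area(w=250.0):
    """Effective detection area (ha): integrate p(r) over the circle out
    to the truncation distance w, i.e. area = integral of p(r)*2*pi*r dr,
    then convert m^2 to hectares."""
    area_m2, _ = quad(lambda r: detection_prob(r) * 2.0 * np.pi * r, 0.0, w)
    return area_m2 / 10_000.0

def effective_detection_radius(w=250.0):
    """Radius (m) of the circle whose area equals the effective area."""
    return float(np.sqrt(effective_detection_area(w) * 10_000.0 / np.pi))
```

Recomputing the area with a sensitivity-loss term added to eta, then taking the ratio to the undamaged case, reproduces the kind of percent-loss summaries reported in Table 4 and Appendix 5.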
The effective detection area (ha) of all species decreased with increasing microphone sensitivity loss (Fig. 5, Fig. 6, and Table 4). For quiet species, LCSP calls were least affected by microphone sensitivity loss, whereas SEWR calls were most affected under all noise treatments except ambient conditions, where the effective detection area of YERA initially dropped faster. For loud species, VIRA tick-it calls were least affected by microphone sensitivity loss, whereas AMBI calls were most severely affected under both noise treatments analyzed. A complete summary of the effective detection area (ha) and the percent loss of each call type with increasing levels of microphone sensitivity loss is available in Appendix 5.

Table 3. Effective detection radius in meters for ten species (and 12 calls) under three levels of environmental noise (dBA) and for microphones 0, 5, 10, 15, and 20 dBV below our measured manufacturer sensitivity specifications. Species include American Bittern (AMBI), Black-and-white Warbler (BAWW), Le Conte's Sparrow (LCSP), Nelson's Sparrow (NESP), Ovenbird (OVEN), Pied-billed Grebe (PBGR), Sedge Wren (SEWR), Sora (SORA; per-weep and whinny calls), Virginia Rail (VIRA; grunt and tick-it calls), and Yellow Rail (YERA). Results omit species/noise combinations where predicted effective detection radii were beyond the range of measured distances in our experiment.
Wind noise reduced the effective detection area across the range of microphone sensitivity; for brevity, we present the effect of noise based on microphones showing no sensitivity loss. The increase in noise from calm conditions to Beaufort 2 conditions caused the effective detection area of quiet species to decrease by an average of 76% (range 70-81%; SD 5%), with a further reduction of 69% (range 64-72%; SD 4%) between Beaufort 2 and Beaufort 3 conditions. Thus, the effective detection area was reduced by 93% (range 92-93%; SD 1%) between calm and Beaufort 3 conditions for quiet species. For loud species, noise caused approximately 25% less of a reduction in effective detection area than observed for quiet species; the change in noise between Beaufort 2 and Beaufort 3 conditions caused the effective detection area of these species to decrease by 44% (range 19-58%; SD 13%). AMBI calls were least affected by the increase in noise, losing only 19% of their effective detection area.

DISCUSSION
We provide strong evidence that microphone sensitivity decreases with field use. Furthermore, we show that lower microphone sensitivity reduces the effective area sampled by a microphone and thus induces distance-related biases in detection probability for all species. The observed reduction of effective detection area suggests that our results are similar to those found in cochlear implants (Razza and Burdo 2011) in that any sensitivity loss in a microphone results in reduced performance. Compared with microphones showing no sensitivity loss, microphones with sensitivity readings 5 dBV below manufacturer specifications (i.e., -46 dBV) reduced species' effective detection areas by an average of 25%, and the least sensitive microphones tested (i.e., -61 dBV) reduced them by an average of 66%. Large reductions in effective detection area should not be surprising given that microphones have the same function as ears during point counts, and variation in hearing ability among observers has the potential to greatly reduce the sampling area (Ramsey and Scott 1981). Although humans tend to experience the greatest age-related hearing losses at high frequencies (Ramsey and Scott 1981, Emlen and DeJong 1992, Farmer et al. 2014), our results suggest that microphones may be more vulnerable to low frequency losses; AMBI and PBGR calls, the two calls in our experiment that are primarily below 1.5 kHz, experienced the fastest decline in effective detection area (58% and 39% decline with a 5 dBV loss in sensitivity, respectively), and both experienced reductions of > 85% with a 20 dBV reduction in sensitivity.
An initial inspection of spectrograms from recordings made with worn microphones suggests possible mechanisms for the lower detection rates of low frequency calls. Microphones appear to exhibit greater static over low frequencies and may be more vulnerable to sensitivity losses at low frequencies only. Although more detailed experiments and quantification would be required to define the mechanisms and frequency-specific wear that may be occurring in microphones, our results suggest that there may be no single (uniform) criterion to determine when a change in microphone sensitivity is sufficient to warrant replacement, because multiple methods exist to process sound recordings and each will vary in the degree to which damage affects it. For example, although the lower signal-to-noise ratio caused by static will lower detection for sound analysts (Fig. 7; compare sound files in Appendices 6 and 7), auditory inspections of recordings may yield higher detection rates than visual scans because masked visual signatures may still be audible (Fig. 7a; see sound file in Appendix 6). Furthermore, automated detection or recognition software will likely be more affected by worn microphones than human analysts because noise impedes the ability of automated recognizers to detect sounds (Bardeli et al. 2010). For most species, background noise greatly reduced the analyst's ability to detect calls, suggesting that noise affects recording-based surveys in a similar way as point count surveys (e.g., Simons et al. 2007, Pacifici et al. 2008). Not surprisingly, the magnitude of the noise effect was greater for quiet species than loud species. For example, for microphones at -41 dBV, the addition of 8 dBA of white noise (from Beaufort 2 to Beaufort 3 conditions) reduced the effective detection area of loud calls by 19-58% and of quiet calls by 64-72%. Past research suggests that wind affects detection of calls in ARU surveys more severely than in-person surveys (Digby et al. 2013), and it is possible we did not experience the true effect wind has on recordings because our experimental design did not include wind gusts, which would tend to cause recordings to "clip" (Zakis 2011) and further affect detection. Additionally, white noise may not have been ideal to represent sound produced by wind in open habitats, where greater environmental noise can occur at lower frequencies (Zakis 2011), thereby masking only portions of a call and not affecting detection as severely (Koper et al. 2015). However, our use of white noise was likely appropriate to simulate the interference caused by leaf rustle in deciduous forests and, likely, in dense cattail vegetation, which tend to follow a similar sound profile (Turnbull, personal communication). Our approach also allowed us to reduce variation associated with changes in wind speed and direction and thus allow more refined inference with regard to the interaction between microphone sensitivity and realistic levels of environmental noise.
The predicted effective detection radii of all species appear large compared with other studies (e.g., Alldredge et al. 2007a), but this may be because we broadcast calls at night in conditions with little to no wind (conditions ideal for temperature inversions that can increase the distance at which sounds are heard), or because we performed the experiment in an area lacking trees and other tall vegetation that can attenuate sound (Schieck 1997, Pacifici et al. 2008). We were surprised to find that NESP calls traveled farther than YERA calls, but this is likely because our broadcast volume for NESP was unrealistically loud. Despite large effective detection radii, groups of calls attenuated as we expected within the boundaries of our experiment when ARUs were equipped with microphones near manufacturer specifications. Specifically, quiet species had smaller effective detection radii than loud species, and high frequency sounds attenuated faster than low frequency sounds (Schieck 1997, Alldredge et al. 2007a). These volume- and frequency-dependent attenuation patterns did not necessarily hold when ARUs were equipped with degraded microphones. For loud species, certain low-frequency calls became less detectable than mid-frequency calls once microphone sensitivity dropped to -44 dBV, and for quiet species, certain mid-frequency calls became less detectable than high-frequency calls once microphone sensitivity dropped to -48 dBV. Additionally, certain loud species became less detectable than quiet species once microphone sensitivity dropped to -52 dBV.
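For readers reproducing this kind of calculation, the link between a fitted detection model and the effective detection area can be sketched as follows. The logistic coefficients below are hypothetical, chosen only to illustrate the geometry: because area scales with the square of the radius, even a moderate contraction of the effective detection radius produces a disproportionately large loss of area.

```python
import math

def detection_radius(b0, b_dist, p=0.5):
    """Distance (m) at which a logistic model logit(p) = b0 + b_dist*d
    predicts detection probability p (the 0.5 threshold used in Fig. A4)."""
    return (math.log(p / (1 - p)) - b0) / b_dist

def detection_area_ha(radius_m):
    """Effective detection area (ha) of a circle with radius in metres."""
    return math.pi * radius_m ** 2 / 10_000

# Hypothetical coefficients: the intercept drops as sensitivity is lost,
# while the distance slope is held fixed for simplicity.
b_dist = -0.03
r_new = detection_radius(6.0, b_dist)    # fresh microphone: 200 m
r_worn = detection_radius(3.0, b_dist)   # worn microphone: 100 m
loss = 1 - detection_area_ha(r_worn) / detection_area_ha(r_new)
print(f"radius halved, area reduced by {loss:.0%}")  # 75%
```

This squared relationship is why the percent reductions in effective detection area reported in Table 4 grow so quickly with sensitivity loss.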
Loss of microphone sensitivity with repeated field use could confound temporal comparisons in monitoring and research programs if quality control guidelines are not established. For example, failure to replace damaged microphones over multiyear studies could create the appearance of a decline in an otherwise stable population, whereas periodic replacement of microphones could theoretically induce cyclic patterns in detection tied to the schedule of microphone renewal in long-term studies. If a 10% difference in the number of birds detected can cause undesirable bias in trend estimates from index-based surveys (as stated in Rempel et al. 2013), ensuring that monitoring programs use microphones within a certain range of sensitivity will help maintain data quality. Some units may be able to compensate for varying microphone sensitivity by adjusting a gain setting on the unit. However, further experiments will be needed to determine what effect this has on detection, because adjusting the gain based on the 1 kHz test tone may result in excessive gain at higher frequencies and may not change the signal-to-noise ratio of microphones that exhibit static at lower frequencies.
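The concern that a gain adjustment cannot restore signal-to-noise ratio follows directly from the arithmetic of linear gain: multiplying every sample by the same factor raises the call and the microphone's self-noise equally. A minimal sketch, with made-up sample values:

```python
import math

def rms(samples):
    """Root-mean-square level of a sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def apply_gain(samples, gain_dB):
    """Apply a gain in dB as a linear multiplier (10^(dB/20))."""
    g = 10 ** (gain_dB / 20)
    return [g * s for s in samples]

call = [0.05, -0.04, 0.06, -0.05]        # made-up call samples
static = [0.005, -0.006, 0.004, -0.005]  # made-up noise-floor samples

before = rms(call) / rms(static)
after = rms(apply_gain(call, 12)) / rms(apply_gain(static, 12))
print(round(before, 6) == round(after, 6))  # gain leaves the ratio unchanged
```

Gain can compensate for a quieter output level overall, but any static already present in the signal is amplified along with the calls.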
We recommend that all microphones be uniquely identified (labeled), their sensitivity measured immediately upon purchase, and then retested after each field season to track changes in microphone sensitivity. Furthermore, recording time-specific estimates of microphone sensitivity and tracking which microphones were deployed on a given ARU and recording location would allow the inclusion of microphone sensitivity as a covariate in statistical models to potentially adjust for differences in detection between sites or years. Performing a single point check using a commercially available 1 kHz/94 dB sound level calibrator should be sufficient to identify microphones with poor sensitivity because sensitivity loss appears to be greatest at low frequencies. However, we did not sweep the whole frequency spectrum of our microphones to detect frequency-specific damage; thus, future work over a broader frequency range would be useful, although the difficulties and costs associated with it might not be worthwhile. Furthermore, although static appeared to occur more frequently in microphones with lower sensitivity readings, an increase in the noise floor of a microphone can occur independently of sensitivity loss, meaning a single point check will not reliably detect this kind of damage. Thus, periodic inspection of spectrograms would also be useful to determine whether other problems exist. This may be especially important for programs heavily reliant upon recognition software.
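As an illustration of the single point check described above, microphone sensitivity in dBV can be computed from the RMS voltage the microphone produces for the calibrator's 1 kHz, 94 dB SPL (1 Pa) tone. The spec constants below follow the SMX-II figures quoted in Fig. 4 (-36 dBV ± 4 dBV); the replacement rule itself is an assumption for illustration, not a manufacturer recommendation.

```python
import math

SPEC_SENS_DBV = -36.0   # manufacturer-stated mean sensitivity (see Fig. 4)
SPEC_TOL_DBV = 4.0      # stated tolerance, +/- dBV

def sensitivity_dBV(vrms_at_94dB_spl):
    """Sensitivity in dBV re 1 V for the calibrator's 1 kHz, 94 dB SPL tone."""
    return 20.0 * math.log10(vrms_at_94dB_spl)

def needs_replacement(sens_dBV, spec=SPEC_SENS_DBV, tol=SPEC_TOL_DBV):
    """Illustrative rule: flag microphones below the lower spec bound."""
    return sens_dBV < spec - tol

mic_sens = sensitivity_dBV(0.010)   # 10 mV RMS output is roughly -40 dBV
print(mic_sens, needs_replacement(mic_sens))
```

Logging `mic_sens` against each microphone's unique label at every check would produce exactly the time-specific covariate recommended above.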
The cost of a sound level calibrator and replacement microphones, as well as the work involved in testing microphones, presents additional costs that should be accounted for in cost-benefit scenarios (sensu Hutto and Stutzman 2009), but such scenarios should also consider the benefit of the repeat visits obtainable via ARU use. It would be useful to conduct longitudinal studies to understand rates of microphone decay, determine whether damage is associated with specific environmental conditions, and examine whether the magnitude at which microphone sensitivity loss affects detection is equal across habitat types. Lastly, although our experiment used a single microphone per recorder, ARUs function with two microphones and record in stereo; thus, further research should investigate whether matching microphones with similar sensitivities for sound localization is preferable to pairing microphones with different sensitivities to sample similar areas.

CONCLUSION
Microphone variation and degradation present a potential source of bias that monitoring and research programs will have to guard against to maintain data quality. Although our results are specific to the particular products we tested, the patterns observed in this study are likely to apply across ARUs generally. While we have highlighted distance-related heterogeneity in detection probability caused by variation in microphone sensitivity, it is important to note that the range of variation in microphone sensitivity we observed is still roughly half of that observed between human observers (Ramsey and Scott 1981). We have outlined approaches that can easily be used to document and maintain quality control on microphone sensitivity by testing and replacing microphones as necessary. Determining when to replace microphones will depend on a project's objectives, target species, and the methods used to process the recordings.

Fig. 1. Schematic showing the experimental setup used to evaluate the effect of microphone sensitivity loss on detection of calls. The setup consisted of 12 triads (4 are pictured) of autonomous recording units (ARUs) arranged in a linear array spanning 220 m, with individual triads spaced at 20-m intervals. Within each triad, ARUs were spaced 15 cm apart and equipped with one microphone from each of the three sensitivity classes. We simulated environmental noise caused by windy conditions by broadcasting white noise from speakers installed 1 m in front of each triad (WN) and broadcast calls 20 m (B2) and 30 m (B1) in front of the first triad of ARUs.

Fig. 4. Box plot of measured microphone sensitivity (dBV) for three cohorts of SMX-II microphones: microphones purchased in 2014 and never deployed (n = 143), microphones used for only one field season (n = 151), and microphones used for 2-4 consecutive field seasons (n = 75). The centerline of each box represents the median microphone sensitivity of -40.9, -41.9, and -42.8 dBV, respectively. Boxes span the 25th to 75th percentiles, and whiskers represent 1.5 times the interquartile range. Outliers in each cohort are represented by dots. The solid and dotted horizontal lines depict the manufacturer-stated mean sensitivity and expected variation of -36 dBV ± 4 dBV.

Fig. 7. Spectrograms of a Pied-billed Grebe call from autonomous recording units equipped with (1) a degraded microphone showing severe static and (2) a normally functioning microphone showing no static. Both recordings were made within the same triad and during the same broadcast trial. Accompanying recordings for these spectrograms are available in Appendices 6 and 7, respectively.

Figure A4.2. Detection probabilities of a) Nelson's Sparrow and b) Sedge Wren for five levels of microphone sensitivity loss under three noise conditions, based on the selected binomial regression model. The dotted horizontal line indicates where detection is 0.5.

Figure A4.3. Detection probabilities of a) Yellow Rail and b) Ovenbird for five levels of microphone sensitivity loss under three noise conditions, based on the selected binomial regression model. The dotted horizontal line indicates where detection is 0.5.

Figure A4.4. Detection probabilities of a) Sora per-weep calls and b) Sora whinny calls for five levels of microphone sensitivity loss under three noise conditions, based on the selected binomial regression model. The dotted horizontal line indicates where detection is 0.5.

Figure A4.5. Detection probabilities of a) Virginia Rail grunt calls and b) Virginia Rail tick-it calls for five levels of microphone sensitivity loss under three noise conditions, based on the selected binomial regression model. The dotted horizontal line indicates where detection is 0.5.

Figure A4.6. Detection probabilities of a) American Bittern and b) Pied-billed Grebe calls for five levels of microphone sensitivity loss under three noise conditions, based on the selected binomial regression model. The dotted horizontal line indicates where detection is 0.5.

Table 2. Top binomial regression models for 12 species call types, based on model rankings obtained using Akaike's information criterion corrected for small sample size (AICc).

Table 4. Percent reduction (and standard deviation) in the effective detection area (ha) of quiet and loud calls under various noise treatments with 5, 10, 15, and 20 dBV of microphone sensitivity loss. Quiet calls include Black-and-white Warbler (BAWW), Le Conte's Sparrow (LCSP), Nelson's Sparrow (NESP), Sedge Wren (SEWR), and Yellow Rail (YERA), whereas loud calls include American Bittern (AMBI), Ovenbird (OVEN), Pied-billed Grebe (PBGR), Sora (SORA; per-weep and whinny calls), and Virginia Rail (VIRA; grunt and tick-it calls). Results omit call/noise combinations where predicted effective detection radii were beyond the range of measured distances in our experiment. A complete summary showing the reduction in effective detection area for each call type is available in Appendix 5.
Model ranking based on Akaike's information criterion corrected for small sample size (AICc) for each call type. Models include distance (Dist), noise treatment (Treat), microphone sensitivity loss (Mic), and relevant interactions. Parameter estimates are given for all binomial regression models. Within a species, models are ranked from lowest to highest ΔAICc score.

Table A5.3. Effective detection area (ha) of 12 call types for varying degrees of microphone sensitivity loss under 50 dBA noise conditions.
Table A5.5. Percent effective detection area loss of 12 call types for varying degrees of microphone sensitivity loss under 42 dBA noise conditions.
Table A5.6. Percent effective detection area loss of 12 call types for varying degrees of microphone sensitivity loss under 50 dBA noise conditions.