Validation of Trøndelag Apnoea Score Proxy for Obstructive Sleep Apnoea in the General Population of Norway: The HUNT Study

The aim was to validate a new seven-item “TASC” (Trøndelag Apnoea Score) proxy for obstructive sleep apnoea (OSA) against polysomnography in the general population. Objectives included validation against different polysomnographic criteria, stratification by age and gender, and estimation of OSA prevalence. From the fourth wave of the Trøndelag Health Study (HUNT4), 1,201 participants were randomly invited to a substudy focusing on sleep and headaches, of whom 232 accepted and 84 (64% women, mean age 55.0 years, and standard deviation 11.5 years) underwent polysomnography. The TASC proxy sums seven binary items for snoring, observed breathing pauses, restricted daytime activities, hypertension, body mass index (≥30 kg/m2), age (≥50 years), and gender (male). A single night of ambulatory (home) polysomnography was analysed using both the recommended and optional hypopnoea criteria of the American Academy of Sleep Medicine (AASM). We found 65% sensitivity and 87% specificity (Cohen's κ = 0.53, 95% confidence interval 0.34−0.72) for TASC ≥ 3 against AHI ≥ 15 (recommended AASM criteria). Validity was similar against AHI ≥ 30 but lower against AHI ≥ 5 and against the optional AASM criteria. Sensitivity and overall validity were higher among men and those above 50 years of age. The prevalence of an apnoea-hypopnoea index (AHI) of at least 5, 15, or 30 using the recommended (and optional) AASM criteria was 73% (46%), 37% (18%), or 15% (5%). A seven-item TASC proxy for OSA showed good validity and may be useful in screening and epidemiological settings. Sensitivity, specificity, and validity vary considerably by cut-off, by polysomnographic scoring criteria, and by gender and age strata.


Introduction
Obstructive sleep apnoea (OSA) is linked to a reduced quality of life, with unrefreshing sleep, sleepiness, fatigue, and depressive mood as potential mediators [1,2]. It is strongly associated with adverse health conditions including atrial fibrillation, heart failure, stroke and coronary heart disease [3], and motor vehicle accidents [4]. It is therefore important to develop and validate proxy diagnoses for OSA in epidemiological studies.
The third edition of the International Classification of Sleep Disorders (ICSD-3), by the American Academy of Sleep Medicine (AASM), defines OSA by the number of predominantly obstructive, respiratory events per hour (apnoea-hypopnoea index (AHI), preferably by polysomnography (PSG)), symptoms, and comorbidities [2]. The criteria are met by AHI ≥ 15 alone but also by AHI ≥ 5 plus at least one of the following: snoring, breathing pauses, daytime symptoms of poor sleep (sleepiness, nonrestorative sleep, fatigue, or insomnia symptoms), or a diagnosis of a listed comorbidity (including hypertension). However, researchers typically operationalise the lone AHI cut-offs of 5, 15, and 30, referred to as "mild," "moderate," and "severe" OSA.
The prevalence of mild and moderate-to-severe OSA in the general population ranges from 9% to 38% and from 6% to 17%, respectively [5]. Prevalence is known to increase with body mass index (BMI), age, and male gender [2,6], but it also doubles with the use of the recommended 2012 AASM criteria compared with 2007 AASM criteria [5,7,8]. Accordingly, there are concerns that both the symptoms and comorbidities accepted by the ICSD-3 and the latest AASM criteria for OSA are too inclusive. Two population-based studies among men above 40 years of age estimated the prevalence of ICSD-3 OSA at 52.2% (AHI ≥ 10, 2007 AASM criteria) and 74.4% (AHI ≥ 5, 2012 AASM criteria) [9,10]. It is therefore of interest to estimate OSA prevalence by different AHI cut-offs and old versus new AASM criteria for hypopnoea. Also, to our knowledge, OSA proxies have not previously been comparatively validated against the old and new AASM criteria.
Although useful in epidemiological and some clinical settings, OSA proxies may never completely replace objective sleep testing [11]. The popular, eight-item STOP-Bang questionnaire was developed for surgical populations to rapidly evaluate the risk of OSA, which is associated with perioperative complications [12]. It appeared superior to pre-existing questionnaires [13,14] and has since been validated in numerous populations [15]. Since neck circumference measurements are rarely available in large epidemiological studies relying on questionnaires, including the fourth wave of the Trøndelag Health Study (HUNT4), it is of great interest to develop a STOP-Bang-inspired proxy without neck circumference.
While the STOP-Bang accepts any of tiredness, fatigue, or sleepiness as daytime symptoms of OSA, it is of interest, for simplicity, to settle on one daytime symptom of OSA. A recent large cohort study found only a weak association between the Epworth Sleepiness Scale (≥11) and the AHI [16], and excessive daytime sleepiness is less typical among women [17]. Meanwhile, elderly cases may have fewer symptoms altogether [18]. These differences may impact validity and necessitate gender- and age-specific proxy cut-offs [19]. Finally, the ICSD-3 states no minimum frequency of daytime symptoms. It is therefore of interest to validate a new proxy incorporating different daytime symptoms, of different frequencies, specified in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) and the ICSD-3 (e.g., insomnia), in age and gender strata.
The general aim of this population-based study was to validate a new seven-item proxy for OSA, named the Trøndelag Apnoea Score (TASC), using items for snoring, breathing pauses, daytime symptoms, hypertension, BMI, age, and gender, against a PSG-based diagnosis, in an adult general population subsample from HUNT4 in Norway. The main objective was to validate several cut-offs of the TASC against AHI ≥ 5, AHI ≥ 15, and AHI ≥ 30, using the recommended AASM criteria. Secondary objectives were to study validity by the choice of daytime symptom (restricted daytime activities, sleepiness, or tiredness), study validity by the choice of minimum frequency of symptoms, stratify validity by age and gender, validate the TASC against the optional AASM criteria, and estimate the PSG- and proxy-based prevalence of OSA.

Methods
2.1. Participants. HUNT4 took place between August 2017 and February 2019. All residents above 20 years of age (meeting the World Health Organization criteria of adulthood) of the defunct Nord-Trøndelag county were invited to two questionnaires, a structured interview and a clinical examination [20], which 56,078 out of 96,469 residents (58%) underwent. Next, HUNT4 participants from Stjørdal municipality were invited by postal mail to an approved HUNT4 substudy named Sleep and Pain. Stjørdal municipality is a 938 km2 agricultural area with a small town centre and 23,165 inhabitants, sufficiently representative of Trøndelag county. Out of 1,201 randomly selected HUNT4 participants, 232 (19%) agreed by telephone and were scheduled appointments at the interview site in Stjørdal in November 2017. They completed a waiting room questionnaire about sleep and health pending a face-to-face interview about sleep, health, headache, and pain. Finally, the participants of the Sleep and Pain study were invited to a nerve conduction study (included for differential diagnostics of restless leg syndrome) and a single-night ambulatory PSG, which was initially accepted by 87 participants. However, two participants withdrew after the nerve conduction study, and the polysomnogram of another was omitted because of technical issues, resulting in 84 participants for the current PSG study.
The HUNT4 organisation performed the randomisation, and the regional ethics committee approved the invitation letter. The median time delay between questionnaire completion in the Sleep and Pain substudy and the PSG study was 11 months (range 3-22 months, 94% within 15 months). We documented clinical findings including advice for clinical follow-up in the electronic patient journal, sending copies to the participant and their general practitioner. Participants with previously diagnosed OSA (10 out of 232: 4%; 4 out of 84 with PSG: 5%) or any other disease were included. The validity of questionnaire-based diagnoses for insomnia, primary headaches, and restless leg syndrome has been published for this sample [21][22][23].

2.2. Questionnaires and Other Health-Related Data. In the Sleep and Pain substudy, participants completed the Karolinska Sleep Questionnaire (KSQ) [24], which includes items for snoring, apnoeas, and presumed daytime symptoms of sleep disturbances (Supplementary Table 1). Other questionnaires included the Epworth Sleepiness Scale [25], Insomnia Severity Index (ISI) [26], and the Hospital Anxiety and Depression Scale (HADS) [27]. Participants listed their health conditions, which were supplemented by their medication list. On the day of the ambulatory PSG, the height and weight of participants were measured using a wall-mounted height measuring tape and a bathroom-type body weight scale (SECA®, Hamburg, Germany), respectively.

Sleep Disorders
2.3. The TASC Proxy. Inspired by the STOP-Bang [12], the TASC proxy dichotomises and sums seven OSA-relevant items. From the KSQ, loud and embarrassing snoring (according to others), breathing pauses during the night (according to others), and restricted daytime activities (spare time, school, or job) were each scored if reported "mostly/at least three times a week" (Supplementary Table 1). The remaining four items were hypertension (by questionnaire or list of medications), BMI above 30 kg/m2, age above 50 years, and male gender. The main TASC proxy used restricted daytime activities as its daytime symptom of OSA because of its inclusion in the main HUNT4 questionnaire and its close resemblance to the DSM-5 insomnia diagnosis (criterion B). However, we also incorporated bothersome daytime sleepiness and bothersome daytime tiredness/fatigue into alternative proxies: "TASC-sleepy" and "TASC-tired." Finally, a more liberal "TASC-monthly" allowed the three KSQ items to be reported "sometimes/at least once a month."
2.4. PSG: Setup and OSA Diagnoses. The PSG equipment for the single night of unattended, ambulatory PSG was mounted at St. Olavs Hospital, Trondheim University Hospital, at 12:00 the preceding day. Participants were instructed to avoid alcohol, hypnotic drugs, and napping after dinner (unless done routinely), to go to sleep between 22:00 and 00:00 under undisturbed conditions (as normal), and to document the lights off and lights on time plus any awakenings (e.g., visits to the toilet). The equipment was dismantled at 08:00 the following morning.
The PSG was recorded using SOMNOscreen plus PSG equipment (SOMNOmedics GmbH®, Randersacker, Germany). Six electroencephalography (EEG) electrodes were placed according to the International 10-20 system: F3, F4, C3, C4, O1, and O2. Two electrooculographic electrodes were placed: 1 cm laterally and 2 cm above the right eye canthus and 1 cm laterally and 2 cm below the left eye canthus. Mastoid M1 and M2 were reference electrodes for the electrooculographic and the contralateral EEG electrodes. Surface electromyography was registered from the submental and bilateral anterior tibial muscles. Nasal flow and naso-oral thermistor sensors were affixed above the upper lip. Thoracic and abdominal piezoelectric respiratory effort belts were applied. Pulse oximetry was recorded from the index finger. The PSG was analysed using DOMINO® (version 3.0.2, SOMNOmedics).
As per the latest wording of the AASM manual (February 2023) [28], we used both the recommended (1a) and the optional (1b) hypopnoea criteria to calculate the AHI. The recommended hypopnoea criteria score a hypopnoea if there is an airflow signal drop of ≥30% for ≥10 seconds, associated with either an EEG arousal or a 3% oxygen desaturation. The more conservative, optional hypopnoea criteria instead require the hypopnoea to be associated with a 4% oxygen desaturation, disregarding any arousal. Our senior sleep expert calculated the AHI manually for each polysomnogram, first using the optional criteria and then (several months later) using the recommended criteria on raw PSG data, without any previously scored markers but not formally blinded to the initial scoring. For validation purposes, we used AHI cut-offs 5, 15, and 30. For prevalence estimation, we additionally assigned the ICSD-3 diagnosis to participants with AHI ≥ 5 plus at least one of the following: AHI ≥ 15, a complaint of snoring, breathing pauses, a daytime symptom (restricted activities, sleepiness, or tiredness; Table 1) or any insomnia symptom (difficulty falling asleep, falling back asleep, or waking up too early), at least three times a week, or any self-reported comorbidity listed by the ICSD-3 [2].
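The two hypopnoea rules described above can be sketched in a few lines of Python. This is only an illustration of the wording in this section, not the scoring software or the manual procedure used in the study. The `FlowReduction` structure, the function names, and the example events are hypothetical, and the AHI is taken as apnoeas plus hypopnoeas per hour of sleep.

```python
from dataclasses import dataclass


@dataclass
class FlowReduction:
    """One candidate hypopnoea (hypothetical structure for illustration)."""
    flow_drop_pct: float     # drop in the airflow signal, in percent
    duration_s: float        # duration of the drop, in seconds
    desaturation_pct: float  # associated oxygen desaturation, in percent
    has_arousal: bool        # whether an EEG arousal accompanies the event


def is_hypopnoea(event: FlowReduction, criteria: str) -> bool:
    """Apply the recommended (1a) or optional (1b) rule as summarised in the text."""
    if event.flow_drop_pct < 30 or event.duration_s < 10:
        return False
    if criteria == "recommended":  # arousal OR >= 3% desaturation
        return event.has_arousal or event.desaturation_pct >= 3
    if criteria == "optional":     # >= 4% desaturation, arousals disregarded
        return event.desaturation_pct >= 4
    raise ValueError("criteria must be 'recommended' or 'optional'")


def ahi(n_apnoeas: int, events: list, sleep_hours: float, criteria: str) -> float:
    """Apnoeas plus scored hypopnoeas per hour of sleep."""
    n_hypopnoeas = sum(is_hypopnoea(e, criteria) for e in events)
    return (n_apnoeas + n_hypopnoeas) / sleep_hours


def ahi_band(value: float) -> str:
    """Label by the conventional lone AHI cut-offs of 5, 15, and 30."""
    if value >= 30:
        return "severe"
    if value >= 15:
        return "moderate"
    if value >= 5:
        return "mild"
    return "none"


# Invented example events, not study data.
example_events = [
    FlowReduction(35, 12, 3.0, False),
    FlowReduction(40, 15, 1.0, True),
    FlowReduction(50, 20, 4.5, False),
    FlowReduction(25, 30, 5.0, True),   # flow drop too small
    FlowReduction(60, 8, 4.0, False),   # too short
]
```

With these invented events, the same recording yields a higher AHI under the recommended rule than under the optional rule, which is the mechanism behind the prevalence differences reported later.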
Out of the 84 participants with complete PSG analyses, response rates for the three KSQ items regarding daytime symptoms were 100%. However, 10 participants (12%) failed to answer the item regarding breathing pauses, of whom four (5%) also failed to answer the item about loud and embarrassing snoring. For the primary analysis, the response option "never" was imputed for these 14 blank responses so as to include all 84 participants. In a supplementary sensitivity analysis, we treated the blank responses to these questions as missing observations, initially leaving only 74 participants with determined proxy scores. For many of the proxy cut-offs, however, these 10 participants could be classified as either definite positives or definite negatives owing to the score from the remaining TASC items. Hence, the final sample size in the supplementary analysis varied between 78 and 82.
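The scoring rule and the treatment of blank responses described above can be illustrated with a minimal Python sketch. It assumes the thresholds given in the abstract (BMI ≥ 30, age ≥ 50) and uses hypothetical short labels for the four KSQ response options; the function names are ours and are not taken from the study's analysis code.

```python
from typing import Optional, Tuple

# KSQ response options in increasing frequency (short labels are hypothetical).
FREQ = {"never": 0, "rarely": 1, "sometimes": 2, "mostly": 3}


def tasc_bounds(snoring: Optional[str], pauses: Optional[str],
                restricted: Optional[str], hypertension: bool,
                bmi: float, age: float, male: bool,
                min_freq: str = "mostly") -> Tuple[int, int]:
    """Return the lowest and highest TASC score compatible with the answers.

    A blank KSQ response is passed as None: it adds nothing to the lower
    bound (as if "never" were imputed) and one point to the upper bound.
    """
    fixed = int(hypertension) + int(bmi >= 30) + int(age >= 50) + int(male)
    low = high = fixed
    for response in (snoring, pauses, restricted):
        if response is None:
            high += 1
        elif FREQ[response] >= FREQ[min_freq]:
            low += 1
            high += 1
    return low, high


def classify(low: int, high: int, cutoff: int = 3) -> Optional[bool]:
    """Definite positive (True), definite negative (False), or undetermined (None)."""
    if low >= cutoff:
        return True
    if high < cutoff:
        return False
    return None


# A participant with blank snoring and breathing-pause items.
low, high = tasc_bounds(None, None, "mostly", True, 31.0, 60, False)
print(low, high, classify(low, high, cutoff=3))
```

The lower bound corresponds to the primary analysis (blank responses imputed as "never"). In the supplementary analysis, a participant remains classifiable at a given cut-off whenever both bounds fall on the same side of it, and passing `min_freq="sometimes"` mimics the more liberal TASC-monthly variant.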

Population Characteristics.
Out of a total of 84 participants with complete PSG analyses, 55 (65%) were women and 60 (71%) were above 50 years of age (Table 1). The mean age of the total sample was 55.0 years, and the mean BMI was 27.2. Twenty-four percent of participants had a BMI above 30, 21% had hypertension, 21% were medicated for other cardiovascular diseases or risk factors, and 32% had DSM-5 insomnia by a diagnostic interview. Other comorbidities were less frequent (e.g., 5% diabetes, 2% asthma, 1% cancer history, and 8% polyneuropathy). The mean AHI of the total sample was 14.9 and 7.9 using the recommended and optional AASM criteria, respectively, being higher among men and participants above 50 years of age (Table 1).

3.7. Validity of the TASC Proxy, Excluding Blank Respondents to Snoring and Breathing Pauses (Recommended AASM Criteria). Excluding blank respondents to snoring and breathing pauses (according to others), instead of imputing missing responses, yielded slightly higher sensitivity and slightly higher, but broadly similar, overall validity (Cohen's κ 0.33−0.58, Supplementary Table 3).

Discussion
In this population-based sample, we found good validity (65% sensitivity, 87% specificity, Cohen's κ = 0.53) of a seven-item STOP-Bang-inspired proxy for OSA (TASC), using the cut-off ≥ 3, against PSG-based AHI ≥ 15 (recommended AASM scoring criteria). Validity was similar against AHI ≥ 30, but mostly acceptable against AHI ≥ 5. There were minimal differences when incorporating different, alternative daytime symptoms into the TASC proxy. Sensitivity and overall validity were higher among men compared with women and in those above versus below 50 years of age. Validity was only acceptable using the conservative optional AASM criteria. Using the recommended AASM criteria, the prevalence of AHI ≥ 5, AHI ≥ 15, and AHI ≥ 30 was 73%, 37%, and 15%, versus 46%, 18%, and 5%, using the more conservative optional criteria. The prevalence of ICSD-3 OSA was 61% with the recommended and 37% with the optional AASM criteria. TASC ≥ 3 was reasonably prevalent in this sample, at 32% overall.

Comparisons with Other Validation Studies.
In a recent systematic review and meta-analysis, Chen et al. [32] identified five validation studies of the STOP-Bang in the general population [14,33-36]. Against AHI ≥ 5, AHI ≥ 15, and AHI ≥ 30, the authors estimated pooled sensitivity at 73%, 88%, and 92% and pooled specificity at 66%, 42%, and 38%, respectively. Given the pooled prevalence rates, this corresponds to Cohen's κ estimates of merely 0.39 against AHI ≥ 5, 0.17 against AHI ≥ 15, and 0.07 against AHI ≥ 30. While we found similar estimates of sensitivity and specificity, we found considerably higher validity overall (Cohen's κ = 0.41−0.53). However, one should note that the STOP-Bang cut-off was fixed at ≥3 while we let the TASC cut-off vary between ≥2 and ≥4. Considering that three out of the five identified studies [14,33,35] used a 4% desaturation threshold only to score hypopnoea, the pooled results should perhaps be compared to our results against the optional AASM criteria instead, which indicated lower validity (Cohen's κ = 0.33−0.38).
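The conversion from sensitivity, specificity, and prevalence to Cohen's κ used in the comparison above follows from the expected proportions in a two-by-two table. The sketch below is a hedged illustration of that arithmetic, with a function name of our own choosing. It is checked against this study's rounded figures for TASC ≥ 3 versus AHI ≥ 15, not against the pooled STOP-Bang data, whose prevalence rates are not restated here.

```python
def kappa_from_rates(sensitivity: float, specificity: float, prevalence: float):
    """Cohen's kappa and proxy-positive proportion implied by three rates.

    All arguments are proportions in [0, 1]; prevalence refers to the
    reference (gold-standard) diagnosis.
    """
    true_pos = sensitivity * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    proxy_positive = true_pos + false_pos   # proportion flagged by the proxy
    observed = true_pos + true_neg          # observed agreement
    expected = (proxy_positive * prevalence
                + (1 - proxy_positive) * (1 - prevalence))  # chance agreement
    kappa = (observed - expected) / (1 - expected)
    return kappa, proxy_positive


# Rounded values reported in this article: TASC >= 3 against AHI >= 15.
kappa, proxy_positive = kappa_from_rates(0.65, 0.87, 0.37)
print(round(kappa, 2), round(proxy_positive, 2))
```

With these rounded inputs, the function returns a κ of about 0.53 and a proxy-positive proportion of about 0.32, in line with the reported κ and the 32% prevalence of TASC ≥ 3. The same expression shows why the proportion flagged by a proxy depends jointly on sensitivity, specificity, and the reference prevalence.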

Beyond the proxy cut-off, comparisons with the pooled results are hindered by the use of type 3 devices (without recording of sleep and EEG arousals) in two studies [33,36], which have roughly 90% sensitivity and specificity against the gold-standard PSG [37]. Given an underlying relationship between questionnaire scores and the PSG, the use of type 3 devices introduces nondifferential misclassification of cases and noncases which will weaken the observed STOP-Bang validity. In all, our present TASC scores appear more valid than the STOP-Bang although there are few population-based studies using the gold-standard PSG [14,34].
Whereas the STOP-Bang only requires the symptom frequency "often" for daytime tiredness, fatigue, or sleepiness (T) [12], the current KSQ-based TASC proxy specifies a symptom frequency of "mostly/at least three times a week" for snoring, breathing pauses, and daytime symptoms. We found no advantage of the more relaxed frequency criteria "sometimes/at least once a month" on overall validity, against any AHI cut-off, using any set of AASM criteria.
Considering the focus on frequency criteria, the current proxies may also be likened to the Berlin Questionnaire [38]. It focuses on the "STOP" items (particularly snoring and tiredness) and requires a frequency of "nearly every day" or "3−4 times a week" for five out of 10 items, congruent with our main KSQ response option of "mostly/at least three times a week." In a systematic review, Senaratna et al. [39] identified two validation studies of the Berlin Questionnaire in the general population. Hrubos-Strøm et al. [40] found only 37% sensitivity and 84% specificity (Cohen's κ = 0.20) against AHI ≥ 5 and 43% sensitivity and 80% specificity (Cohen's κ = 0.13) against AHI ≥ 15, using a 4% desaturation threshold for hypopnoea scoring. Meanwhile, Kang et al. [41] found 69% sensitivity and 83% specificity (Cohen's κ = 0.48) against AHI ≥ 5 and 89% sensitivity and 63% specificity (Cohen's κ = 0.40) against AHI ≥ 15, using a 3% threshold to score hypopnoeas. Hence, from the few available studies in the general population, the explicit use of a minimum symptom frequency (Berlin Questionnaire and the current TASC) may be an improvement on the vaguer wording of the STOP-Bang.
Marti-Soler et al. [34] derived and optimised a five-item score called NoSAS in a large population-based sample. Against AHI ≥ 20, the NoSAS outperformed both the STOP-Bang and Berlin Questionnaire in two separate cohorts (Cohen's κ = 0.37−0.39 vs. 0.15−0.22). This difference may not be completely attributed to the use of predefined cut-offs for the STOP-Bang and Berlin Questionnaire, as the NoSAS also had a greater area under the ROC curve (0.74−0.81 vs. 0.63−0.68).
The validity of the current TASC ≥ 3 may also be compared with that of proxies for interview-verified headache and sleep disorder diagnoses in the same sample [21][22][23]. As judged by Cohen's κ, TASC ≥ 3 performed similarly to proxies for headache suffering, migraine, insomnia, and unspecified restless leg syndrome (Cohen's κ = 0.45−0.57) and better than the proxy for tension-type headache (Cohen's κ = 0.33).
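Because the area under the ROC curve does not depend on a predefined cut-off, it helps separate the quality of a score from the choice of threshold. The following sketch shows one common way to obtain cut-off-specific sensitivity and specificity and the area under the curve for an integer score such as a 0 to 7 proxy. The function names and the toy data are hypothetical, and this is not the software used in any of the cited studies.

```python
def roc_points(scores, has_osa, max_score=7):
    """Sensitivity and specificity of the rule 'score >= cut-off' for each cut-off."""
    cases = [s for s, y in zip(scores, has_osa) if y]
    noncases = [s for s, y in zip(scores, has_osa) if not y]
    points = []
    for cutoff in range(0, max_score + 2):
        sensitivity = sum(s >= cutoff for s in cases) / len(cases)
        specificity = sum(s < cutoff for s in noncases) / len(noncases)
        points.append((cutoff, sensitivity, specificity))
    return points


def auc(scores, has_osa):
    """Area under the ROC curve via the Mann-Whitney formulation.

    Equals the probability that a randomly chosen case scores higher than a
    randomly chosen non-case, with ties counted as one half.
    """
    cases = [s for s, y in zip(scores, has_osa) if y]
    noncases = [s for s, y in zip(scores, has_osa) if not y]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in noncases)
    return wins / (len(cases) * len(noncases))


# Invented toy data (eight participants), for illustration only.
toy_scores = [0, 1, 2, 2, 3, 4, 5, 3]
toy_osa = [False, False, False, True, False, True, True, True]
print(auc(toy_scores, toy_osa))
print(roc_points(toy_scores, toy_osa)[3])  # cut-off >= 3
```

A score with a larger area under the curve offers a better sensitivity and specificity trade-off across cut-offs overall, which is why a difference in area cannot be explained by an unfavourable predefined cut-off alone.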

Items and Strata.
There were minimal differences in validity between the different daytime symptoms. The agreement proportion between any two TASC proxies (≥3) with different daytime symptoms (restricted activities, sleepiness, or tiredness) was at least 96% (Cohen's κ ≥ 0.92, not tabulated), partially because 72% to 89% of participants with TASC ≥ 3 did not report the targeted symptom at least three times a week. The choice of daytime symptom may then seem insignificant, but the agreement proportion of the three daytime symptoms was comparatively low, at 86% (Cohen's κ 0.46−0.59). Hence, the choice of daytime symptom may have a larger effect on simpler proxies that are less reliant on other items and on proxies that require a lower minimum frequency. We advocate restricted daytime activities as the most clinically relevant daytime symptom of OSA given both its concordance with the DSM-5 diagnosis of insomnia and recent studies reporting weak associations between OSA and daytime sleepiness [16,42]. Restricted daytime activities may also be partially viewed as the result of daytime tiredness, sleepiness, or fatigue.
Altogether, 63% to 72% of participants with OSA (depending on the proxy) reported at least one symptom (tiredness, snoring, or breathing pauses) at least three times a week. This contrasts with another population-based study, among middle-aged adults, in which only a minority of participants with moderate-to-severe OSA reported symptoms [42]. The discrepancy may be due to the authors' use of the Epworth Sleepiness Scale, known to correlate poorly with the AHI [16], and highlights the need to standardise symptom evaluation by OSA proxies. Note that proxy-positive participants include asymptomatic cases, as a TASC score of four can be obtained from hypertension, BMI, age, and gender. The fact that asymptomatic cases do not necessarily benefit from treatment [43] makes such proxy scores more suitable for epidemiological studies than for clinical decision-making.
OSA: obstructive sleep apnoea; AHI: apnoea-hypopnoea index; AASM: American Academy of Sleep Medicine; TASC (Trøndelag Apnoea Score): a seven-item OSA proxy with one potential point for each item: loud and embarrassing snoring (according to others) "mostly/at least three times a week," breathing pauses during the night (according to others) "mostly/at least three times a week," restricted daytime activities (spare time, school, or job) "mostly/at least three times a week," hypertension, BMI ≥ 30, age ≥ 50 years, and male gender. See Figure 1 for receiver operating characteristic curves.
While the STOP-Bang uses BMI ≥ 35, we chose BMI ≥ 30 for better suitability to our population-based sample with a mean BMI of 27.2.Similarly, previous studies have found the optimal BMI cut-off to depend on ethnicity and gender, down to 30 for women [19] and as low as 27.5 in certain populations [44].However, there is a growing concern that BMI fails to capture adiposity in all demographics.Some studies have used the waist-to-hip ratio as an alternative among women [45].Perhaps the limitations of the BMI partially explain why our TASC proxy was more valid among men than women.Using health care use as a clinical endpoint, a large longitudinal study among persons aged between 45 and 85 years (age range of 85% of our sample) found similar risks whether obesity was defined by BMI, waist circumference, waist-hip-ratio, or body fat percentage [46].The correlation between obesity and health care use was however weaker in the higher age categories [46].
Snoring and breathing pauses according to a bed partner may need special consideration since many participants lack a bed partner. As the 10 blank respondents had a higher AHI overall, we found slightly higher validity when not imputing these responses to "never." Perhaps the lack of a bed partner should be viewed as a marker of poorer health, including greater risk of OSA. In a supplementary analysis (not tabulated), we explored six-item proxies by successively removing one of the seven TASC items. Notably, the removal of breathing pauses slightly increased the estimate of validity (Cohen's κ = 0.55), while the removal of snoring or BMI slightly decreased the estimate of validity (Cohen's κ 0.48 and 0.45, respectively).
We found higher validity among those above (vs. below) 50 years of age. By comparison, we previously found lower validity of questionnaire-based diagnoses for insomnia and restless leg syndrome among elderly participants in the same population, proposing age-dependent decreases in reading comprehension or increases in competing causes of symptoms as potential mechanisms [21,22]. Both these factors may have been weakened by the current inclusion of nonsymptom items and by the lower age cut-off in the current study (≥50 vs. ≥65 years). We also found the TASC to be more sensitive, and more valid overall, among men compared with women, similarly to Bauters et al. [33]. The positive relation between validity and both age and male gender is evident from the stratified summary of the PSG (Table 1), in which older participants and men (in particular) show more obstructive sleep than their counterparts.
OSA: obstructive sleep apnoea; AHI: apnoea-hypopnoea index; AASM: American Academy of Sleep Medicine; TASC (Trøndelag Apnoea Score): a seven-item OSA proxy with one potential point for each item: loud and embarrassing snoring (according to others) "mostly/at least three times a week," breathing pauses during the night (according to others) "mostly/at least three times a week," restricted daytime activities (spare time, school, or job) "mostly/at least three times a week," hypertension, BMI ≥ 30, age ≥ 50 years, and male gender; TASC-sleepy: requiring bothersome daytime sleepiness (as opposed to restricted daytime activities); TASC-tired: requiring bothersome daytime tiredness/fatigue (as opposed to restricted daytime activities); TASC-monthly: requiring the KSQ items "sometimes/at least once a month" (as opposed to "mostly/at least three times a week"). See Figure 2 for receiver operating characteristic curves for AHI ≥ 15.
4.3. The Choice of Proxy Cut-Off. While we let Cohen's κ compare the overall validity of different proxy cut-offs, the intended application must also be taken into account. TASC ≥ 3 (sensitivity = 65%, specificity = 87%, Cohen's κ = 0.53) may be optimal for correlation studies wherein specificity is key, so as not to dilute identified cases with false positives. Conversely, TASC ≥ 2 (sensitivity = 87%, specificity = 62%, Cohen's κ = 0.45) may be more suitable in screening settings, although OSA proxies are deemed unfit to replace objective sleep testing in the clinical setting [11]. Regarding prevalence estimation, one may be tempted to choose the cut-off that produces the closest prevalence estimate to the gold-standard reference in the validation study. However, Diggle [47] has shown that the proxy prevalence (and its closeness to the gold-standard) depends on the interplay between sensitivity, specificity, and the gold-standard prevalence itself. Hence, the optimal proxy cut-off for prevalence estimation should be based on overall validity rather than on the closeness in prevalence between proxy and gold-standard in a given study.
In our study, TASC ≥ 2 produced the closest prevalence estimate to the ICSD-3 diagnosis and AHI ≥ 5, while TASC ≥ 3 was the closest to AHI ≥ 15 (recommended AASM criteria).
4.4. The Choice of AHI Cut-Off. The prevalence of OSA varied greatly with the choice of AASM (or ICSD-3) criteria and AHI cut-off and between gender and age categories. The particularly high prevalence of AHI ≥ 5 (and ICSD-3 OSA), at 73% with recommended AASM criteria, raises questions about its clinical and epidemiological relevance, in this population at least. The issue partly remains for AHI ≥ 5 using the more conservative, optional AASM criteria (46% prevalence). We therefore suggest greater relevance of AHI ≥ 15 than AHI ≥ 5, using the latest recommended AASM criteria. The choice of AHI cut-off also affected the balance between sensitivity and specificity of our OSA proxies (at a fixed proxy cut-off). Compared with AHI ≥ 15, the proxies were more specific (less sensitive) against AHI ≥ 5 and more sensitive (less specific) against AHI ≥ 30, a trend also seen in previous validation studies [32]. Although a formal mathematical proof of this relation is beyond the scope of this study, one should note that changes in the AHI cut-off and the proxy cut-off have opposite effects on the balance between sensitivity and specificity.

Strengths and Limitations.
A major strength of this study is its population-based recruitment of 1,201 HUNT4 participants. However, the sequential recruitment of participants via those who underwent the interview [21][22][23], and the joint invitation to the PSG study and a nerve conduction study, may have lowered the participation rate to 7% out of the initial 1,201 HUNT4 participants. By comparison, we achieved an 18% participation rate for ambulatory PSG alone in the HUNT3 PSG study [48]. Selection bias is most evident in the over-representation of women, elderly, and persons with insomnia (interview focus). On the other hand, an enrichment of subjects with sleep health issues ensured an adequate number of OSA cases from 84 PSGs.
Regarding the PSG procedure itself, one major strength was the analysis using both the recommended and the optional AASM criteria. While there was a considerable delay between questionnaire completion and the PSG for many participants, sleep questionnaires like the Pittsburgh Sleep Quality Index seem to be reliable across several months [49], hypertension and untreated OSA can be considered stable traits (OSA prevalence increasing very slowly until age 65) [5], and BMI was calculated during the PSG setup. Although the AHI is known to vary between consecutive nights, suggesting repeated PSGs for a clinical diagnosis [50], estimated AHI night-to-night reliability is high in most studies [51]. Using a single night of PSG might be considered a weakness, but the so-called "first night effect" appears minimal for ambulatory PSG recordings [52,53].

Conclusion
In this population-based sample, we found good validity of a new seven-item TASC proxy for OSA, against AHI ≥ 15 using a PSG-based gold-standard with the recommended AASM criteria. Validity was similar against AHI ≥ 30, but lower against AHI ≥ 5 and against the more conservative, optional, AASM criteria. Sensitivity and overall validity were higher among men compared with women and in those above versus below 50 years of age. A seven-item TASC proxy for OSA should accordingly be useful in epidemiological studies. Researchers and clinicians should note how sensitivity, specificity, and validity vary by cut-off, polysomnographic criteria, and demographic strata.

Figure 1: ROC curves of the TASC proxy against AHI categories (recommended AASM criteria). AHI: apnoea-hypopnoea index; AASM: American Academy of Sleep Medicine; ROC: receiver operating characteristic; AUC: area under the curve; CI: confidence interval; TASC (Trøndelag Apnoea Score): a seven-item OSA proxy with one potential point for each item: loud and embarrassing snoring (according to others) "mostly/at least three times a week," breathing pauses during the night (according to others) "mostly/at least three times a week," restricted daytime activities (spare time, school, or job) "mostly/at least three times a week," hypertension, BMI ≥ 30, age ≥ 50 years, and male gender.

Figure 2: ROC curves of the TASC proxy, by daytime symptom and symptom frequency, against AHI ≥ 15 (recommended AASM criteria). AHI: apnoea-hypopnoea index; AASM: American Academy of Sleep Medicine; ROC: receiver operating characteristic; AUC: area under the curve; CI: confidence interval; TASC (Trøndelag Apnoea Score): a seven-item OSA proxy with one potential point for each item: loud and embarrassing snoring (according to others) "mostly/at least three times a week," breathing pauses during the night (according to others) "mostly/at least three times a week," restricted daytime activities (spare time, school, or job) "mostly/at least three times a week," hypertension, BMI ≥ 30, age ≥ 50 years, and male gender; TASC-sleepy: requiring bothersome daytime sleepiness (as opposed to restricted daytime activities); TASC-tired: requiring bothersome daytime tiredness/fatigue (as opposed to restricted daytime activities); TASC-monthly: requiring the KSQ items "sometimes/at least once a month" (as opposed to "mostly/at least three times a week").
KSQ items, distribution (%) of responses: never-rarely/a few times a year-sometimes/at least once a month-mostly/at least three times a week

Table 3: Validity of the TASC proxy by daytime symptom and symptom frequency (recommended AASM criteria).

Table 4: Gender- and age-stratified validity of the TASC proxy (recommended AASM criteria).