Validity of the Early Years Check-In (EYCI) in a Cross-Sectional Sample of Families

Background: The objective of the present study was to develop and test the validity of the Early Years Check-In (EYCI), a new tool that measures parent and educator concerns regarding children's development. The study examined the EYCI's agreement with the 3rd edition of the Bayley Scales of Infant and Toddler Development (BSID-III), an established measure of child development. Two possible thresholds were explored: one to identify children with a probable delay, and another to identify children at the borderline functioning threshold. Methods: Parents of children aged 18 to 42 months were recruited from childcare settings across Ontario, Canada. The study proceeded in two phases. Phase I, intended to pilot the measure, included 49 children. Phase II, a test of the validity of the final version, included 199 children. Parents and educators completed the EYCI for the child, while a blinded assessor completed the BSID-III. Results: The EYCI demonstrated good sensitivity and specificity (86 and 82%, respectively) as a parent-completed tool that identifies children with a probable delay. However, the positive predictive value (15%) suggests the EYCI is likely to over-identify children. When identifying children who demonstrated borderline delay, the EYCI demonstrated good sensitivity (80%) but poor specificity (49%). Results from educator-completed EYCIs were poor for both probable and borderline delay. Conclusions: While further research is required, the EYCI shows promise as a parent-completed tool, particularly to identify more severe cases of delay. Results with educators were poor overall. Future research investigating accuracy of educators in different types of early childcare centres is needed.


INTRODUCTION
Identification of delay and vulnerability during the early years is an important first step to connecting families to services and supports (1)(2)(3). The early years represent a time of neuroplasticity, and investing in addressing developmental problems can have a significant impact on later development (4)(5)(6). Despite interest in identifying developmental vulnerability and problems in early childhood, concerns remain about the coverage of current surveillance and screening efforts, which largely occur within primary care, and the number of children who start school with a delay or low readiness for school, especially among low-income groups (7,8). At present, there is a lack of tools and research that align with current practice guidelines regarding developmental surveillance and screening.
Current guidelines in both Canada (9) and the United States (10,11) highlight the importance of attending to parental concerns as part of regular monitoring and surveillance efforts in primary care, such as well-baby visits. Universal screening is not recommended. Instead, screening might be necessary when parental concerns are present (9)(10)(11)(12)(13). Research examining how parents reflect on their concerns suggests that an informal question about concerns, or even a single question about concerns, is insufficient (14). An alternative approach is to use a tool to elicit parental concerns.
When examining tools and approaches to developmental surveillance and screening, it is important to consider both the accuracy of tests and the consequences of applying the tests in contexts where prevalence is low (15). Regarding test accuracy, it is recommended that tests have a sensitivity (the ability to detect true cases) of 80% and specificity (the ability to exclude non-cases) of 70% (16). When prevalence of a disorder is low, the number of false positives is high and the positive predictive value (PPV), the proportion of people who score positive on the test that have the disorder, is low. The negative predictive value (NPV), which is the proportion of people who score negative on the test that do not have the disorder, is high under the same conditions. The low PPV is an important issue for assessing delay because estimates of the prevalence of delay among children and youth range from about 4.1% to about 13% (16)(17)(18). Therefore, even an instrument that meets sensitivity and specificity guidelines is still likely to have low PPV and, by extension, a large proportion of false positives. There are no set guidelines for PPV or NPV in surveillance and screening. The cost and impact of a false positive or a false negative should be considered when determining acceptable PPV and NPV values. If the cost of a false positive is high, either in financial resources related to referrals and further assessment, or in terms of stress for families, then a low PPV is problematic. Therefore, any universal developmental surveillance approach should consider how to keep the risks associated with a false positive low.
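To illustrate how prevalence drives PPV, the following minimal sketch applies Bayes' rule to the guideline values quoted above (sensitivity 80%, specificity 70%) at the two prevalence estimates cited (4.1% and 13%). The function name `positive_predictive_value` is hypothetical and chosen only for illustration.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Share of positive test results that are true cases (Bayes' rule)."""
    true_pos = sensitivity * prevalence                    # cases correctly flagged
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # non-cases wrongly flagged
    return true_pos / (true_pos + false_pos)


# Guideline accuracy (16) at the lower and upper prevalence estimates cited in the text.
ppv_low_prev = positive_predictive_value(0.80, 0.70, 0.041)
ppv_high_prev = positive_predictive_value(0.80, 0.70, 0.13)
print(f"PPV at 4.1% prevalence: {ppv_low_prev:.2f}")   # 0.10
print(f"PPV at 13% prevalence: {ppv_high_prev:.2f}")   # 0.28
```

Under these assumptions, roughly one in ten positive results would be a true case at the lower prevalence, and fewer than three in ten at the higher prevalence, even though the test meets the recommended accuracy.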
The Parents' Evaluation of Developmental Status (PEDS) (19) is a well-known measure of parental concerns. The PEDS includes 8 items asking parents whether they have any concerns (no, yes, a little) about cognitive functioning, language, motor skills, or social functioning. Two additional items ask parents to list their concerns. The PEDS has been employed both for developmental surveillance, to elicit parental concerns in primary care practice, and as a developmental screening tool. However, research on the psychometric properties of the PEDS has reported variable results. Sices et al. (20) found good sensitivity and specificity of the PEDS in a small (n = 60) sample of young children aged 9 to 31 months attending primary care. Glascoe (21) reported similar results with a sample (n = 295) of children aged 4.5 and older; however, the sensitivity and specificity for children under age 4.5 were 0.68 and 0.66, respectively. Limbos and Joyce (22) reported a sensitivity and specificity of 0.73 and 0.68 in a sample of 334 children aged 12-60 months. PPV values for the PEDS have varied from 19 (22) to 50% (20), while other studies have reported values between 20 and 30% (21,23). The variability in results of the PEDS is not unique, but rather the rule in terms of developmental screening tools. For example, results for another commonly used screening tool, the Ages and Stages Questionnaire (ASQ) (24), show wide variation (22, 25-29).
Another important consideration in developmental surveillance is how to attain maximum coverage of the population. Primary care contacts in the early years, such as regular check-ups and immunization visits with physicians, are important but may be insufficient for regular monitoring and surveillance. While many jurisdictions offer regular check-ups during the first year to 18 months, these become less frequent after 18 months (30), just as developmental issues become more apparent (25,31,32). Also, a number of challenges have been identified with well-baby visits, including lack of time, lack of training, and resistance to changing practice (33). Research examining developmental surveillance and screening practice has reported large variability in whether surveillance or screening occurred as well as the methods used. In the U.S., the proportion of families with young children aged 9 to 35 months who received developmental surveillance from a health professional in the last year varied across States from about 19 to 60%. Rates of developmental screening were only slightly lower, ranging from 17 to 59% (34). In terms of methods used, recommendations vary by region. In Ontario, it is recommended that the Nipissing Developmental Screen and Rourke Baby Record be used in conversations with parents (35). These challenges highlight the need for a comprehensive model of population monitoring and surveillance that involves settings and professionals beyond primary care.
Incorporating monitoring and surveillance into early childcare settings can complement existing practices in primary care. There are many features of early childcare settings that can facilitate the identification of children who show signs of delay or vulnerabilities. Registered Early Childhood Educators (RECEs) are trained in child development, see families on a regular basis, have multiple opportunities to observe a child, and can offer connections to other organizations and services when there is an identified need (36,37). Regular surveillance and screening already occur in school-aged children and it is accepted that using teacher-reported instruments in schools can be an effective method for identifying school-aged children who show signs of delay or developmental vulnerability (38)(39)(40). In contrast, developmental surveillance in early childcare settings is an area that has received little attention in the research literature. This persists despite concerns regarding under-identification of delay at this age (2,3,18). Recent studies have supported the feasibility of developmental screening in childcare centres, with providers believing it is an important part of their role (41,42). Unfortunately, these studies focused on screening rather than surveillance.
The few studies that have been conducted in early childcare settings show promise. Dereu et al. (43) found that an early educator-completed tool showed better sensitivity and specificity than parent-reported instruments in detecting autism in children ages 3 to 39 months. Other research has reported that with regular assessment by early childhood educators, earlier identification of children with autism can occur (44,45). This work, however, is limited as it has focused solely on autism. Exploring the ability of tools to detect other kinds of developmental problems and delays in this setting is required.
Only a small percentage of children attend licensed childcare (LCC) facilities. It is difficult to get an exact estimate as both participation in programs and how programs are defined vary by region. In Canada, about 20-38% of children under 4 attend a daycare centre (46). This number does not distinguish between licensed and unlicensed centres, so it represents an overestimate of licensed childcare attendance. Over 50% of young children in the United States have been reported to be in centre-based care, but the degree to which these centres are monitored and licensed varies widely by State (46)(47)(48). While licensed childcare offers potential for developmental surveillance, it is not sufficient to provide wide coverage of young children. There are, however, a number of community-based family-focused centres that offer programs for parents and their families. These centres offer a way to increase the reach of surveillance efforts beyond LCC settings. In Ontario, Canada, these include Early Ontario Child and Family Centres (EarlyON), which offer programming for children ages 0-6 years and their parents. These centres are staffed with professionals, such as RECEs, and act as a community hub for connections to other organizations, including public health and other agencies that offer secondary and/or tertiary services (49).

DEVELOPMENT AND RATIONALE OF THE EARLY YEARS CHECK-IN (EYCI) TOOL
We developed the Early Years Check-In (EYCI) for use by parents and early years professionals (i.e., early educators in early childcare settings). The EYCI was developed with input from parents, early educators, and experts in child development, psychometrics, and health measures scale development. The tool was designed for parents and ECEs of children ages 18 months to 6 years; 18 months was set as the lower age band due to difficulties detecting delay at earlier ages (25,31,32). We wanted to ensure coverage across 4 developmental domains (social-emotional; motor; language; and cognitive). We started with focus groups with parents and educators to learn which areas of development parents and educators think about, and which factors would facilitate or discourage use. Our results supported the use of 11 questions, with 10 relating to a different area of potential concern and one overall or "global concern" question. The domains of the EYCI largely align with the PEDS, but with two additional areas that capture concerns regarding emotions and overall concerns. Results from focus groups also indicated that for the tool to be used, it needed to be easy to understand and quick to complete. Parents also preferred a visual analogue scale (VAS) to express their concern rather than boxes or numbers (another critical difference between the EYCI and the PEDS). Both parents and educators liked the blue color gradient used in the VAS as it felt more neutral than other colors (particularly red, orange, or yellow). Once we developed the EYCI questions, we completed cognitive interviews to ensure questions were understood by parents and educators. Further detail regarding the tool's development has been reported elsewhere (13).
We conducted a psychometric examination of the EYCI in two phases. The goals of phase I were to: (1) evaluate the item functioning of the EYCI (e.g., response patterns and reliability), (2) measure initial agreement with a standardized instrument, and (3) understand parental concerns by measuring associations between the EYCI and parent and child history and functioning. The goals of phase II were to: (1) conduct a more robust test of the agreement with the standardized instrument and (2) identify thresholds for detecting probable and borderline delays. For both phases, the EYCI was completed by both parents and educators. Agreement between parents and educators on the EYCI tool was assessed, as well as the agreement of each type of rater with the BSID-III. The EYCI was developed as a surveillance tool, to identify children for whom further actions such as screening might be beneficial. As such, when testing the EYCI we aimed to maximize sensitivity.

Participants
Both LCC and EarlyON centres were recruited in Ontario, Canada. As these centres are already mandated to support parents and children, and have established relationships with families, they were identified as optimal settings for this work. Phase I included 63 children recruited from a total of 104 children who attended 3 centres (2 OEYCs and 1 LCC) in Hamilton, Ontario. Non-participating families were deemed ineligible (8%), declined participation (12%), or did not attend with a parent or guardian (3%). Phase II included 255 children recruited from a total of 704 children attending one of 28 centres across Ontario. Other families were ineligible (19%), did not attend with a guardian or parent (23%), or declined participation (15%). Details of recruitment and reasons for exclusion are outlined in Figure 1.
Inclusion criteria for families required ability to communicate in English, a child aged between 18 and 42 months, and ability to attend a developmental assessment. The upper age band was set based on the criterion measure employed in the study. We also only included families with a child that the educator had spent at least one hour with in the past 2 months (assessed via educator report). We excluded all children when a parent reported a prior diagnosis of a developmental condition, such as cerebral palsy, muscular dystrophy, pervasive developmental disorder, an autism spectrum condition, attention deficit hyperactivity disorder, and developmental coordination disorder. This was done because there would naturally be concerns about children with existing diagnoses, which would artificially increase the measure's performance, and because there is little clear benefit to identifying children with already-identified difficulties.

Demographic Survey
Parents/guardians reported their age, marital status, age of parent at birth of their first child, their child's age, prematurity, and their own and their child's country of birth.

Access to Services
Parents completed a short survey regarding their child's current and past experiences with developmental services. The type of services included speech and language, physiotherapy, visual impairment, services related to behavioral issues, developmental delay, and other services.

Early Years Check-In (EYCI)
The EYCI assesses parental concerns across 11 items: 10 domain-specific items and one overall item that measures global concern. The full list of items is shown in Figure 2. Items are assessed using a 15 cm long VAS, which is shaded in blue from left to right. A "No Concerns" anchor is on the far left end and "Very Concerned" is on the right end (see Figure 2). The EYCI was administered electronically and on paper. When used on paper, parents were asked to draw a vertical line on the scale; scoring took place by measuring from the left anchor to this line using a ruler. Two research team members measured each scale using the same ruler to ensure accuracy in scoring. When there was a discrepancy of under 0.5 cm, the lower score was entered. When a discrepancy of greater than 0.5 cm was present, a third person measured the item. Each time the third person measured the item, it agreed with one of the first two measures and the value that was consistent across 2 measures was entered. This occurred in ∼1.3% of all completed EYCIs.
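The paper-scoring rule described above can be sketched as a small function. This is an illustration only: the function name `resolve_item_score`, the treatment of a discrepancy of exactly 0.5 cm as small, and the choice to define "agreement" as the earlier measurement closest to the third one are assumptions not specified in the text.

```python
from typing import Optional

DISCREPANCY_LIMIT_CM = 0.5  # limit from the text; handling of exactly 0.5 cm is assumed


def resolve_item_score(first_cm: float, second_cm: float,
                       third_cm: Optional[float] = None) -> Optional[float]:
    """Return the score to enter for one paper VAS item.

    Returns None when the first two measurements disagree by more than the
    limit and no third measurement has been supplied yet.
    """
    if abs(first_cm - second_cm) <= DISCREPANCY_LIMIT_CM:
        # Small discrepancy: enter the lower of the two measurements.
        return min(first_cm, second_cm)
    if third_cm is None:
        return None
    # Large discrepancy: enter whichever earlier measurement the third one
    # agrees with (assumed here to mean the closest of the two).
    return min((first_cm, second_cm), key=lambda value: abs(value - third_cm))


print(resolve_item_score(3.2, 3.4))        # 3.2
print(resolve_item_score(3.0, 5.0))        # None (third measurement needed)
print(resolve_item_score(3.0, 5.0, 5.0))   # 5.0
```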

Developmental Functioning and Delay
The Bayley Scales of Infant and Toddler Development, 3rd edition (BSID-III) (50) was used to identify developmental functioning and delay. The BSID-III measures development in five domains: Cognitive, Language, Motor, Social-Emotional, and Adaptive. Cognitive, Language, and Motor are measured through a standardized assessment, and Social-Emotional and Adaptive are measured through a parent report questionnaire. For validation purposes, we used the lowest score on any domain to identify children with possible developmental issues.

Procedure
The study received ethical approval from the Hamilton Integrated Research Ethics Board at McMaster University. Once a centre agreed to take part, an information package that included the study process and consent forms was sent. The educator consent forms were reviewed, and interested educators provided informed, written consent. In all centres, study advertisements for families were displayed or distributed either electronically or on paper. Research staff attended centres and spoke to families to share information and assess interest. Interested families were asked questions regarding their child's age, whether their child had ever received a diagnosis for developmental delay or disability, and the child's and caregivers' ability to communicate in English. Next, they were asked if their child had spent time with an educator in the centre. If they did not know an educator at the centre, they were excluded (see Figure 1).
Once families were recruited, they were invited to complete study materials on paper or electronically on a study tablet and an appointment for completing the BSID-III was scheduled within two weeks. Assessments were conducted by a registered psychometrist or trained research assistants who were blinded to the EYCI results. A list of participating children at each centre was compiled for educators to identify children with whom they were most familiar. If a family consented, but the child could not be matched with a participating educator, they were excluded from the study (see Figure 1). Educators were asked to complete the EYCI for the child within 2 weeks. If the educator needed more time, an extra week was given. This occurred at about 15% of centres.

Analyses
We used the alpha coefficient, a common estimate of internal consistency (51), and Spearman correlation to measure associations between parent and educator ratings and between individual EYCI items and BSID-III domains. Spearman correlation was used due to the skewed nature of the data (51). To evaluate the ability of the measure to identify children with potential delay, we used receiver operating characteristic (ROC) curve analysis. The area under the ROC curve (AUC) is a widely used measure of agreement, and can be interpreted as the probability that a randomly selected participant with a delay will have a higher score than a randomly selected participant without delay. Values from 0.5 to ∼0.7 reflect poor accuracy, values of 0.7-0.9 reflect moderate accuracy, and values above 0.9 are highly accurate (52).
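The probabilistic interpretation of the AUC given above can be made concrete with a short sketch that counts, over all pairs of one child with delay and one without, how often the child with delay has the higher score. The scores below are invented for illustration and are not study data; the function names and the handling of band boundaries (and of values below 0.5) are assumptions.

```python
from typing import Sequence


def auc_by_pairs(scores_delay: Sequence[float], scores_no_delay: Sequence[float]) -> float:
    """Share of (delay, no-delay) pairs where the delay score is higher; ties count one half."""
    wins = 0.0
    for delay_score in scores_delay:
        for other_score in scores_no_delay:
            if delay_score > other_score:
                wins += 1.0
            elif delay_score == other_score:
                wins += 0.5
    return wins / (len(scores_delay) * len(scores_no_delay))


def accuracy_band(auc: float) -> str:
    """Bands as described in the text (52); exact boundary handling is assumed."""
    if auc > 0.9:
        return "high"
    if auc >= 0.7:
        return "moderate"
    return "poor"


# Hypothetical maximum concern scores (cm on the VAS), not study data.
delay = [9.1, 7.4, 3.0]
no_delay = [0.5, 1.2, 3.0, 4.8, 0.0, 8.0]
auc = auc_by_pairs(delay, no_delay)
print(f"AUC = {auc:.2f} ({accuracy_band(auc)})")  # AUC = 0.81 (moderate)
```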
We sought to identify two thresholds for the EYCI: (1) a high concern threshold, where the advice of a health professional and further screening might be appropriate, and (2) an elevated concern threshold, that identified children who would benefit from closer monitoring and where actions might be appropriate.
For the high concern threshold, we selected the "extremely low" threshold of the BSID-III (-2 SD). For the elevated threshold, we used the "borderline functioning" threshold of the BSID-III (-1.5 SD). We compared the highest score on any individual EYCI item to the lowest level of functioning on any BSID-III domain.
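The comparison described above, the highest EYCI concern against the lowest BSID-III domain, can be sketched as follows. The sketch assumes BSID-III composite scores scaled to a mean of 100 and an SD of 15, so that -2 SD corresponds to 70 and -1.5 SD to 77.5, and it assumes that scores at or below the cut-off count as cases; neither detail is stated in the text. All names and example values are hypothetical.

```python
from typing import Mapping, Sequence

COMPOSITE_MEAN = 100.0  # assumed composite scaling, not stated in the text
COMPOSITE_SD = 15.0     # assumed composite scaling, not stated in the text


def is_case(domain_scores: Mapping[str, float], sd_below_mean: float) -> bool:
    """True when the lowest domain composite is at or below the chosen cut-off."""
    cut_off = COMPOSITE_MEAN - sd_below_mean * COMPOSITE_SD
    return min(domain_scores.values()) <= cut_off


def max_item_concern(item_scores_cm: Sequence[float]) -> float:
    """Highest concern marked on any single EYCI item."""
    return max(item_scores_cm)


# Hypothetical child, not study data.
child_bsid = {"cognitive": 95, "language": 75, "motor": 100,
              "social-emotional": 90, "adaptive": 88}
child_eyci = [0.4, 6.2, 1.0, 0.0, 2.3, 0.8, 0.0, 1.1, 0.5, 0.0, 3.0]

print(is_case(child_bsid, 2.0))      # False: lowest score 75 is above 70
print(is_case(child_bsid, 1.5))      # True: lowest score 75 is below 77.5
print(max_item_concern(child_eyci))  # 6.2
```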

Sample Characteristics
The mean age in phase I was lower (M = 29.7 months, SD = 7.0) than in phase II (M = 34.4 months, SD = 5.2). The majority of children in both samples were typically developing. In phase I, one child exhibited potential delays and six showed borderline delays. In the larger validation study (phase II), 7 children exhibited potential delays and 18 showed borderline delays. More detailed sample characteristics are described in Table 1. Twelve educators completed EYCIs for participating children across the centres in phase I, and 57 educators completed EYCIs for children in phase II.

Phase I Results
The EYCI showed excellent internal consistency for parents (alpha = 0.95; 95% CI: 0.93-0.97) and educators (alpha = 0.93; 95% CI: 0.91-0.96). All items had item-total correlations above 0.3. Item distributions are presented in Table 2. All items were positively skewed, due to low levels of concern expressed by most participants. Parents used the full range of the VAS to rate their concerns, as demonstrated by similar observed and possible minimum and maximum values. Educators reported lower concerns than parents across all items. Across both parents and educators, scores on two items were very low: How this child moves, and How this child uses their hands.
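For readers less familiar with the internal consistency statistic reported above, the following sketch computes the alpha coefficient from a respondents-by-items table using the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The ratings are invented for illustration and are not study data; the function name is hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(item_scores: Sequence[Sequence[float]]) -> float:
    """Alpha coefficient; item_scores[i][j] is respondent i's score on item j."""
    n_items = len(item_scores[0])
    columns = list(zip(*item_scores))                          # one tuple per item
    item_variance_sum = sum(variance(column) for column in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1.0 - item_variance_sum / total_variance)


# Hypothetical VAS ratings (cm) from four respondents on three items, not study data.
ratings = [
    [0.0, 0.3, 0.1],
    [1.2, 0.8, 1.5],
    [4.0, 5.1, 3.6],
    [8.5, 7.9, 9.0],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}")
```

Values close to 1 indicate that respondents who express concern on one item tend to express concern on the others as well.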
Correlations between the EYCI and domains of the BSID-III indicated low and non-significant correlations between the EYCI maximum item score and motor functioning domain of the BSID-III for parents (< 0.20). We tested the ability of the EYCI to identify borderline delay (i.e., the −1.5SD cut-off on the BSID-III) with 7 children identified as cases; the low number of children who scored below the −2SD threshold (n = 1) meant we were unable to test the threshold for probable delay. For borderline delay, agreement for parents was at the high range of moderate (AUC = 0.88 [95% CI: 0.76-1.00]), while that for educators was at the low range of moderate with very wide confidence intervals (AUC = 0.74 [95% CI: 0.56-0.92]).

Summary of Phase I Results
The AUC results suggest the parent version of the EYCI is able to identify children who might have a delay. Results with educators were less strong. The wide confidence intervals suggest the need to test the tool with a larger sample. Results from both parents and educators showed low endorsement of both motor items and low correlations between these items and the motor domain of the BSID-III. We therefore modified the tool to include examples for each motor item to prompt parents and educators to consider motor movements more specifically.
Both parents and educators showed little domain specificity in agreement between domains on the EYCI and the closest corresponding BSID-III domain. For parents correlations between matched domains on the EYCI and BSID-III were no higher than the EYCI and minimum BSID-III score (see Table 3A). For the educator-completed EYCI, all EYCI items showed the strongest relationship with the language domain of the BSID-III (see Table 3B).
Based on ROC results, a maximum item score of 8.65 on the EYCI can identify children with a probable delay as identified on the BSID-III with good sensitivity and specificity (see Table 4). The positive predictive value for detecting probable delay was low (see Table 4; 6 children correctly identified and 35 false positives). When identifying children who meet the borderline cut-off, an EYCI score of 4.95 provides good sensitivity but unacceptable specificity. The positive predictive value for identifying borderline functioning was also low (see Table 4).
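The counts quoted above are enough to reproduce the PPV for probable delay directly. The sketch below also derives sensitivity, under the assumption that the seven phase II children with potential delay described under Sample Characteristics all had a parent-completed EYCI; that pairing is our assumption, and Table 4 remains the authoritative source. Function names are hypothetical.

```python
def ppv_from_counts(true_positives: int, false_positives: int) -> float:
    """Share of children flagged by the tool who are true cases."""
    return true_positives / (true_positives + false_positives)


def sensitivity_from_counts(true_positives: int, total_cases: int) -> float:
    """Share of true cases that the tool flags."""
    return true_positives / total_cases


# 6 correctly identified children and 35 false positives, as reported in the text.
ppv = ppv_from_counts(6, 35)
# 7 cases is taken from the Sample Characteristics (assumed to apply here).
sensitivity = sensitivity_from_counts(6, 7)

print(f"PPV = {ppv:.2f}")                  # PPV = 0.15
print(f"Sensitivity = {sensitivity:.2f}")  # Sensitivity = 0.86
```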

DISCUSSION
We developed and tested the EYCI, a tool for early childcare settings to complement existing monitoring and surveillance efforts in the early years. Considering the differences between the parent- and educator-completed EYCI, we discuss each separately. While further research is needed with larger samples to better determine the effectiveness of the EYCI, current results show promise and indicate that further investigation is warranted for the parent-completed tool. The parent version of the EYCI was able to identify children who scored very low (-2SD) on the BSID-III with reasonable accuracy. The performance of the EYCI in identifying children with milder delays was not as strong. This is consistent with previous research that has reported detecting delay in milder cases is more challenging (10).
While there are no widely accepted guidelines for developmental surveillance, the identified EYCI threshold for identifying children who scored very low on the BSID-III meets recommended guidelines for early developmental screening (16). The sensitivities and specificities of the EYCI are consistent with some of the higher estimates of the PEDS (20, 21). EYCI results with the borderline functioning (-1.5SD) cut-off on the BSID-III were not as strong. While it was possible to identify a value with good sensitivity, specificity was quite low. Results suggest that the EYCI is likely to over-identify children with potential delay. The EYCI's PPV for identifying very low BSID-III scores was low, and lower than in prior work on the PEDS (20-23). The low PPV is impacted by the low prevalence (3.4%) of delay in the sample compared to prior work on the PEDS (20-23). While the prevalence of delay in our sample is lower than estimates of delay or disability (16)(17)(18), we excluded children with a diagnosed developmental condition. This lowers the prevalence of delay, but provides a more accurate test of the functioning of the tool, as asking about "concerns" is problematic for children with diagnosed delays. This prevalence is consistent with other community-based samples (25).
While this is an important limitation, it is important to note that the purpose of the EYCI is to identify parental concerns that recommend further actions. When parents reach or exceed the threshold for high concerns that was used to identify children very low on the BSID-III (-2SD), actions might include discussion with a professional or completing a screening tool. When parents reach or exceed the threshold for elevated concerns (-1.5SD), monitoring concerns in the short term or discussion with a professional might be warranted. This is consistent with recommendations for screening in the early years, as the screener would only take place in the presence of parental concerns. As such, the EYCI is ideally used within a supportive relationship with an early-years professional where conversation and discussion can take place to explore parents' concerns.
We propose this approach limits the risks and costs of false positives in a community-based surveillance context. Conversations with professionals can help identify and address the cause of the parental concern even when a delay is not present. This is important as unaddressed parental concerns have been linked to lower scores on developmental tests (though not delayed scores), parental well-being, parenting behaviors, and relationships among family members (53)(54)(55). Rather than considering concerns that do not correspond to delay as error, there is evidence that parents engage in a close observation of their children and their health, temperament, and environment, which all play a role in parents' level of concern (56, 57). Research regarding the implications of a false positive, in terms of impacts on families, educators, and services, is required to better assess the impact of a false positive test. This work is important as the PPV for any tool will be low in contexts when prevalence is also low (15).
Phase II results were not as strong at identifying borderline cases of delay as the phase I results, although there were overlapping confidence intervals for the two phases of work. There were no demographic differences in the samples that explain this discrepancy. There were some differences in maternal education between the samples; however, past research has suggested parent concerns are valid across different levels of parental education (23). One difference between samples that might be relevant was the higher proportion of children identified with a motor delay in phase II compared to phase I. The EYCI showed a low and non-significant correlation with the motor domain of the BSID-III. This suggests the EYCI might not be good at detecting motor delays. Unfortunately, there were too few cases to assess accuracy of the EYCI by BSID-III domain. This highlights the need for further testing in a larger sample to better identify if there are domains that the EYCI might be more limited in addressing. The weak agreement between the EYCI and BSID-III on motor functioning is an important area to address in the tool. Problems with the motor items, including low endorsement and low agreement with the BSID-III, prompted changes to these items. While the endorsement of motor concerns was more consistent with other EYCI items in phase II than phase I, these changes did not appear to increase the relationship between the EYCI and motor domain of the BSID-III. These results could reflect parents' lack of awareness regarding motor delays.
Educator results in the larger validation sample showed poor agreement with the BSID-III, suggesting that, in the study context, educators were not able to identify developmental problems with sufficient accuracy. The agreement between educators and an independent assessment in the current study was poorer than reported in prior research. The childcare settings included in the current study were broader than in prior research, which has focused on LCC centres. While including broader early childcare settings allowed for increased reach, educators in these settings do not have the same opportunities to observe children as educators in LCC centres. Children in LCCs attend on a regular basis, which allows educators more frequent contact and observations. There is more variability in EarlyON centres, where some families might attend programs regularly and others infrequently. Further research examining the tool's accuracy in different childcare settings can help identify what conditions might be necessary for accurate identification of delay by educators. In particular, it would be useful to determine what minimum level of contact or observation is required to improve accuracy to an acceptable level.
The different patterns of correlations between parent- and educator-completed EYCI with the BSID-III domains suggest parent and educator concerns are distinct. For parents, EYCI items appear to capture general concerns regarding how a child is developing, rather than concerns that are specific to a particular domain of functioning. Past research has reported some distinctions regarding the specificity of parental concerns, with some research indicating specificity regarding the domain of parental concern and functional impairment of the child (58), and other studies indicating non-specific relationships (53). The variability in sampling strategies (general population versus at-risk) and outcomes used (identifying a specific disorder versus any delay) limits the ability to identify clear patterns in the literature. Educators' concerns on the EYCI were almost solely related to language development on the BSID-III. To our knowledge, this is the first study that has examined the specificity of early educator concerns. The difference in how parents and educators form concerns might help explain both the lack of item-level agreement on the EYCI and the large discrepancy in accuracy between parents and educators. The focus on language for educators suggests that training educators on other domains, including cognitive and motor skills, could potentially improve the accuracy of the tool for educators. Considering the poor performance of the educator-completed EYCI in both phases of work at both thresholds of the BSID-III, the EYCI is not recommended for use by educators at this time.
A limitation of the study was the small proportion of children who demonstrated a developmental delay. Considering that delays are not the only source of parental concerns, an important question for future research is to assess the ability of the EYCI to predict future problems that include diagnosis, but also other developmental issues such as school readiness.

CONCLUSIONS
This study was the first test of the validity of the EYCI tool. The EYCI measures parent and educator concerns regarding children with potential developmental delay in the early years. Results suggest promise for the parent-completed EYCI, but further work and testing are needed. Specifically, we recommend testing in a larger sample, examining the impact of false positives, and comparing the EYCI to existing practices or tools. The tool is not recommended for educator completion at this time.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Hamilton Integrated Research Ethics Board. Written informed consent to participate in this study was provided by the participants' legal guardian.