Information and Communication Technologies to Support Early Screening of Autism Spectrum Disorder: A Systematic Review

The aim of this systematic review is to identify recent digital technologies used to detect early signs of autism spectrum disorder (ASD) in preschool children (i.e., up to six years of age). A systematic literature search was performed for English-language articles and conference papers indexed in PubMed, PsycInfo, ERIC, CINAHL, WoS, IEEE, and ACM digital libraries up until January 2020. Given the usefulness of and interest in this area of research during the Covid-19 emergency, a follow-up search was conducted to cover the literature published until December 2020. In total, 2427 articles were initially retrieved from the database search, and an additional 481 articles were retrieved from the follow-up search. Finally, 28 articles met the inclusion criteria and were included in the review. The studies included involved four main interface modalities: natural user interface (e.g., eye trackers), PC or mobile, wearable, and robotics. Most of the papers included (n = 20) involved the use of Level 1 screening tools. Notwithstanding the variability of the solutions identified, psychometric information points to considering the available technologies as promising supports in clinical practice for detecting early signs of ASD in young children. Further research is needed to understand the acceptability, and to increase use rates, of technology-based screenings in clinical settings.


Introduction
Autism spectrum disorder (ASD) is a category of neurodevelopmental disorder characterized by persistent deficits in social communication and social interaction across multiple contexts as well as restricted, repetitive patterns of behavior, interests, or activities [1]. The care and social needs of preschool children with ASD (typically up to six years of age), in particular, are significant [2,3], usually extend to parents and siblings [2,4,5], and require substantial community resources [2,6,7]. In response to these needs, early detection of ASD has become a priority for primary care and other community settings [8] to provide early intervention services and to improve outcomes [2,9].
Timely (i.e., early) identification of ASD may be achieved by implementing screening methods and instruments that allow health and other professionals (e.g., social care, educators) to conduct a rapid and relatively inexpensive evaluation of this condition in young children [10]. Screening measures that are suitable for identifying ASD are already available and can vary by format (e.g., parent-report versus direct observation), scope, and target population [11]. With regard to the scope of the screening instruments, "broadband" screens cover multiple developmental domains, while "narrow" screens cover only those signs and symptoms specific to the condition of interest [11,12]. With regard to the target population, screening instruments can be used to conduct universal population-wide testing (also referred to as "universal screening" or Level 1 screening), or to identify possible signs of ASD in high-risk populations, such as siblings of children with ASD or those

Full-Texts' Inclusion and Exclusion Criteria
The following inclusion criteria were used in selecting the studies for the review:
1. The paper had to report on the development and/or implementation of technology arrangements (whether commercially available or not, and whether specially developed for screening or adapted from solutions available for different purposes) aimed at detecting early signs of ASD across a range of clinical settings (e.g., primary care; specialized clinics/services) and other settings such as laboratory, home, or school.
2. The studies had to target children aged ≤6 years. Studies involving broader age ranges were included provided that they involved children within this age group (i.e., age ≤6 years).
3. The studies had to provide quantitative information on the capability of the technology (or the technology-based approach) of:
a. Screening for ASD at the population level (Level 1 screening; L1), such as children evaluated by primary care physicians, or
b. Screening for ASD in a subsample of the population identified as at risk for the disorder (Level 2 screening; L2), such as a referred clinical sample with a variety of developmental concerns, siblings of children with ASD, pre-term children, children with genetic syndromes usually associated with ASD, or children with a diagnosis of other neurodevelopmental disorders [29].

Excluded from the review were studies:
1. Reporting on a retrospective analysis of existing databases of evaluation records that were not directly implemented in the aforementioned applied settings and/or did not involve the target users (i.e., health professionals; caregivers);
2. Focusing on invasive or non-invasive techniques to investigate biological processes and structures (e.g., electroencephalography, brain imaging, electrodermal activity);
3. Using technology to investigate physiological (e.g., heart rate; eye movements), behavioral (e.g., vocal or movement patterns; crying), or cognitive differences between children with/at risk of ASD and controls, where the purpose was not to develop a screening tool;
4. Providing training to professionals on the use of a screening tool.

Data Coding and Extraction
The studies that met the aforementioned inclusion criteria were coded in terms of participant characteristics (i.e., number, age-range and sex), target users of the technology, indicators used to assess ASD condition, types of technology used, context(s) of use of the technology, screening level, and maturity of the technology. A brief description of each technology identified, the methodology for its evaluation, and its psychometric properties were also provided.
Country of origin of the study was reported based on (i) the information provided in the methodology, or (ii) the affiliation of the corresponding or the first author of the paper. To classify the types of technologies used in each paper, we adapted the classification proposed by Kientz et al. [30], which includes six different types of interface, namely (a) personal computers (PC) or mobile, (b) shared interactive interfaces, (c) virtual, augmented, and mixed reality, (d) sensor-based and wearable, (e) natural user interfaces, and (f) robotics. Likewise, to rate the maturity of the technologies identified, we used the maturity levels proposed by Kientz et al. [30], that is, (a) functional prototype or (b) publicly available. Specifically, a functional prototype refers to technology that has been developed and used by the intended users for the target purposes but may require assistance with setup, use, or maintenance. Technologies classified as publicly available, in contrast, refer to commercial products, open-source software, or applications available for download on websites or mobile marketplaces (even if no longer available at the time of the present review).
When not specifically mentioned in the paper, we conceived L1 screening as applying to (a) all children regardless of the risk status (such as the M-CHAT), (b) tools implemented to assess children during routine pediatric visits, (c) experimental or observational studies that compared children with a diagnosis of ASD with neurotypical children. In contrast, we conceived L2 screening tools as (a) targeted at children already identified as being at increased risk (e.g., due to a positive family history), and/or (b) used to distinguish between ASD and other neurodevelopmental disorders.
Finally, we extracted relevant information on psychometric properties typically used for screeners, when available. Metrics extracted included (1) sensitivity (the percent of cases with ASD classified by the instrument as ASD); (2) specificity (the percent of cases without ASD classified as not having ASD); (3) positive predictive validity (the percent of cases accurately predicted as having ASD); and (4) negative predictive validity (the percent of cases accurately predicted as not having ASD). Measures of accuracy in distinguishing between clinical and non-clinical groups were also considered relevant.
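These four metrics all derive from a 2x2 table crossing screening outcome with diagnostic status. As a quick reference, they can be computed as below (a minimal sketch; the function name and example counts are ours, not taken from any reviewed study):

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard screening metrics from a 2x2 confusion table.

    tp: children with ASD flagged by the tool (true positives)
    fp: children without ASD flagged by the tool (false positives)
    fn: children with ASD missed by the tool (false negatives)
    tn: children without ASD correctly not flagged (true negatives)
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of ASD cases classified as ASD
        "specificity": tn / (tn + fp),  # share of non-ASD cases classified as non-ASD
        "ppv": tp / (tp + fp),          # share of positive screens that truly have ASD
        "npv": tn / (tn + fn),          # share of negative screens that truly do not
    }

m = screening_metrics(tp=45, fp=10, fn=5, tn=90)
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on the prevalence of ASD in the screened sample, which is one reason L1 (population-level) and L2 (at-risk) studies are not directly comparable on these two metrics.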

Inter-Rater Agreement
The first author calculated the inter-rater agreement between the three raters pairwise on all titles (n = 2283) and abstracts (n = 229). Based on rating criteria (see details in Appendix B), proportional agreement on the titles and abstracts was calculated by taking the number of agreements and dividing this by the number of agreements plus disagreements, multiplied by 100. Their agreement ranged between 65% and 84% for the titles, and 93% and 96% for the abstracts.
Consensus was reached on the titles and abstracts with disagreement after the three raters reviewed them again together. Inter-rater agreement was also checked on the summary points of the variables coded (see above). The first author extracted the information for the 28 papers included and a second rater extracted the information for eight randomly selected papers. The two authors agreed on 149 of the 152 summary points checked (i.e., 19 summary points per article multiplied by 8 articles). Following the same formula used above, the percentage of agreement was 98%. The two raters then discussed the discrepancies until a 100% agreement was reached.
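The proportional agreement formula used above reduces to a single ratio; a minimal sketch (the function name is ours):

```python
def proportional_agreement(n_agree, n_disagree):
    """Percent agreement between two raters:
    agreements / (agreements + disagreements) * 100.
    """
    return 100 * n_agree / (n_agree + n_disagree)
```

For instance, the 149 of 152 summary points on which the two raters agreed correspond to proportional_agreement(149, 3), which rounds to the 98% reported above.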

Overview of the Results
We identified 28 studies that used mainstream or adapted information technologies to screen children up to 6 years of age for ASD (see Table 1). Seven of the included studies [22,31,32,33,34,35,36] involved children recruited from primary care or pediatric services, while five studies involved children referred to tertiary care or specialized ASD centers [37,38,39,40,41]. A total of 7308 children participated in the studies; of these, 3498 were males and 1851 were females. In nine studies, gender information was missing.
The majority of the studies reported in the papers identified were conducted in the USA (n = 19). Seven studies were conducted in as many countries: China [51], Peru [49], the UK [48], Italy [52], France [54], Colombia [56], and Sri Lanka [46]. Two papers either did not provide information [55] or provided unclear information as to the country of origin of the participants recruited [45].
Notes to Table 1: 2 The two groups did not differ in terms of mental age. * AU, autism-specific; CO, computing; ED, education; MED, medical; SBS, Social/Behavioral Science. Abbreviations followed by -J indicate journal papers, -C conference papers. ** PC, personal computers and multimedia; M, mobile applications; SII, shared interactive interfaces; VR/AR/MR, virtual, augmented, and mixed reality; SW, sensor-based and wearable; NUI, natural user interfaces; ROB, robotics. *** F, functional prototype; P, publicly available.

Types of Technologies Used
The studies included in the review involved four main interface modalities, namely (a) natural user interface (NUI), (b) PC or mobile, (c) wearable, and (d) robotics. Figure 2 illustrates the frequencies of the different interfaces used within each category.
The second category (i.e., PC or mobile) included 16 papers. The studies reported by Abbas et al. [37] and Kanne et al. [39] were included in both categories (i.e., PC and Mobile) as they combined the two strategies within the same application. In a similar vein, the studies reported by Egger et al. [25] and Carpenter et al. [36] were included in both the NUI and PC/Mobile categories. Accordingly, 11 papers reported on the use of computerized solutions (PC or mobile platforms) to administer parent-reported questionnaires [22,31,32,33,34,35,37,39,46,47,55], and seven papers employed screening tools in which videos were collected from [37,39,43,50] or shown via [25,36,42] parents' mobile/PC devices.
The third category (i.e., wearable) included two papers [45,52] that used wearable sensors to track the kinematics of children's movements while they were performing specific reaching and grasping movements.
The fourth category (i.e., robot) included one paper [56] that reported on the use of a humanoid robot to assess joint attention skills.


Screening Level
The majority of the papers included in the review (71%; n = 20) involved the use of L1 screening tools. A detailed analysis of the differences between the two screening approaches according to relevant study characteristics (e.g., target population; type of interface used) was not performed because of the relatively low number of L2 papers. However, it should be noted that all papers involving parent-reported questionnaires (n = 11) focused on an L1 screening approach. In contrast, papers involving L2 screening tools mostly used objective screening measures such as eye tracking (n = 3), audio recording (n = 1), or kinematics (n = 1). The identified papers were grouped according to the different age ranges of the populations involved. Detailed descriptions of each study are provided in Table 2.

Table 2. Description of the screening tools, administration time, study aims, and key psychometric results.

Cognoa (Abbas et al. [37]). System composed of (a) a short questionnaire about the child, completed by the parent, and (b) identification of specific behaviors by trained analysts after watching 2-3 short videos of the child within their natural environment, captured by parents using a mobile device. Time: not reported. Based on the responses to the questionnaire and the analysis of the videos, the authors trained two independent ML classifiers and combined their outputs into a single screening assessment.

CHICA (Bauer et al. [22]). Upon check-in to the clinic, CHICA administers two pre-screener questions for the parent to complete in the waiting room. The M-CHAT may also be administered (at the 24-month visit only) and automatically scored. The results of the pre-screening process are provided to the clinician before the visit. Time: not reported. Aim: to assess change in ASD screening rates after implementation of CHICA at two community-based clinics.

Ben-Sasson et al. (2018) [47]. System combining (a) automated text analysis of parental concerns with (b) minimal standard questioning taken from the M-CHAT-R to identify risk of ASD. Time: not reported. Proof-of-concept study assessing the association between the text analysis combined with standard questions and clinicians' ratings of ASD risk on a scale from 1 (no risk) to 4 (high risk).

Digital M-CHAT-R/F (Campbell et al. [31]). The digital M-CHAT-R/F automatically scored answers provided by parents and presented and scored follow-up questions for secondary screening of medium-risk results (score of 3-7). The score report was provided to the physician before the visit. Time: about 20 min. Prospective study assessing the uptake of the digital M-CHAT on service process measures (i.e., accuracy of documentation of screening results and appropriate action for positive screens); acceptability was also investigated with participating physicians.

Tablet-based video screening [25,36]. Short movies presented on a tablet; the embedded tablet camera recorded facial movement so that affect and head position could subsequently be analyzed by means of computer vision analysis. Time: about 10 min. Proof-of-concept study assessing the feasibility and accuracy of the tablet-based screening procedure.

CHICA ASD module (Downs et al. [32]). Participants were recruited at their pediatric primary care visit. Based on EHR information and pre-screen questions answered by parents, CHICA alerts the clinicians to either refer the child for an ASD evaluation or administer the M-CHAT-F (or M-CHAT-R/F). Time: multi-phase process. Randomized controlled trial involving four clinics (two using CHICA with the ASD module; two without) to assess the percentage of children screened at the 18-month or 24-month visits.

Eye-tracking battery (Frazier et al. [38]). Children were shown a series of scenes representing 7 distinct stimulus paradigms (e.g., gaze following and joint attention; abstract shape movement). Time: 5-10 min (3). Proof-of-concept study to validate an Autism Risk Index (ARI) and an Autism Severity Index (ASI) using eye-tracking metrics. ARI test accuracy (AUC) for children <4 years and ≥4 years was 0.92 and 0.93, respectively; the ASI was strongly associated with ADOS-2 total severity scores (r = 0.58-0.67).

iPad M-CHAT. M-CHAT on the iPad provided to children's parents while they were being triaged. Time: about 2 min (4). Study comparing the effectiveness of the M-CHAT in electronic versus paper format in an outpatient clinic setting; parents were also asked to rate their experience with the iPad M-CHAT. The study did not follow up on the final diagnosis of the patients screened.

Cognoa (Kanne et al. [39]). The Cognoa tool includes (a) a 15-item parent-report questionnaire and (b) a 1-2 min home video observation of the at-risk child captured via the parent's smartphone (see also Abbas et al. [37]). Time: not reported. The performance of Cognoa in detecting at-risk children was compared with established ASD screening measures (M-CHAT-R/F; SRS; SCQ; CBCL).

Pupillometry. Participants' eye gaze was monitored while they looked at images on a computer screen to obtain task-evoked pupil measurements and to test differences between dark and light conditions. Time: not reported. Experimental study testing baseline pupil size and pupil responses to visual stimuli (faces, objects, and avatar) in three groups: (1) ASD; (2) age-matched; (3) mental-age-matched. Pupil size correctly predicted group membership for 89% of the participants in the ASD group, 63% in the mental-age-matched group, and 63% in the chronological-age-matched group.

Social robot ONO (Ramirez-Duque [56]). Social robot used to elicit and assess joint attention during triadic (i.e., child-therapist-robot) interactions. Time: not reported. Proof-of-concept study assessing differences in joint attention between children with a diagnosis of ASD and children with other neurodevelopmental disorders.

ASDTests. App based on two short versions of the AQ and Q-CHAT screening methods, targeting four age ranges (≤36 months; 4-11 years; 12-16 years; ≥17 years). The app automatically computes the total score of the questionnaires compiled by the caregivers and, if the result is above a specified threshold, refers them for a specialized assessment; it also produces a report in PDF.

Tablet gaze-preference test (Vargas-Cuentas et al. [49]). A tablet was used to present a 1-min video displaying a social scene with playing children and an abstract scene with moving shapes on either side of the screen. The child's face was recorded while watching the video using the tablet's front camera, and gaze preference was then calculated automatically. Time: 5-10 min. Proof-of-concept study exploring (a) the performance of the automatic eye-gaze detection algorithm compared to manual scoring, and (b) ASD children's scene preferences. The correlation between the manual and the automatic classifications for left/right gaze was 73.2%.

Wan et al. (2019) [51]. Children attended to a muted video clip of a female speaking while their gaze was tracked. Time: about 10 s. Proof-of-concept study comparing gaze fixations of children with ASD with those of neurotypical peers to assess the accuracy of the test in discriminating between the two groups by means of a machine learning method (support vector machine).

VIRSA (Young et al. [42]). Parents completed VIRSA ratings when their child was 6, 9, 12, and 18 months old, and again 2 weeks later to examine test-retest reliability.

Table notes: 1 Based on results from the parent questionnaire only (all ages). 2 Total sample. 3 Estimate based on a similar study by Frazier et al. [57]. 4 According to the opinion of the majority of respondents (45.1%). 5 Entire age range (18-72 months). 6 All available subjects (n = 126; 69% Geo threshold). 7 Using 69% Geometric Fixation Cutoff. 8 Fixation time for the body and mouth. 9 For 18-month VIRSA with concurrent 18-month diagnosis.

L1 Screening Tools Solutions Tested with Children up to 30 Months
Nine papers were identified that involved children in the 16-30 months age range [22,25,31,32,34,35,36,46,47]. Of these, two papers reported on studies aimed at adapting the M-CHAT for administration via tablet [3,31]. Benefits of using a tablet over the traditional paper-and-pencil format were clearly highlighted by Campbell et al. [31], who documented that after implementation of the digital M-CHAT (a) the proportion of children screening positive with accurate documentation in the Electronic Health Records (EHR) increased from a mean of 54% to 92%, and (b) the proportion of physicians referring a child for a developmental assessment after a positive score increased from 56% to 100% (see also Major et al. [58] for secondary analyses).
Three studies reported on the use of automated EHR [22,32,35] to facilitate screening procedures within pediatric clinics. Both Bauer et al. [22] and Downs et al. [32] (see also [59], not included in this review) implemented the Child Health Improvement Through Computer Automation system (CHICA). CHICA is a computer decision support system developed to facilitate surveillance and screening for ASD in primary pediatric care services by automating the administration and scoring of the M-CHAT. Although encouraging results were observed in terms of increased screening of children for ASD, both studies raised concerns about physicians' responses to alerts indicating that a patient had a concerning M-CHAT result. In a similar line of investigation, Schrader et al. [35] implemented the Smart Early Screening for Autism and Communication Disorders (Smart ESAC) in a pediatric service. Results indicated a statistically significant reduction in the average age of referral after the implementation of the Smart ESAC compared to the 16 years prior to system implementation.
Ben-Sasson et al. [47] created a survey through which parents recruited via online advertisement could describe in their own words their concerns regarding their child's social-communication development. Parents were further asked to complete the M-CHAT-R/F and the Autism Spectrum Quotient (ASQ) questionnaire. The authors were able to reliably predict the risk status of a child being on the spectrum by supplementing their written descriptions with only one of 11 questions taken from the M-CHAT-R.
Wingfield et al. [46] developed a mobile-based questionnaire with automatic scoring to be administered by non-specialist health/social workers in low-income countries. The system is a set of 21 "yes-no" questions for the parents. Preliminary evidence shows high accuracy in distinguishing between already diagnosed children with ASD and their neurotypical peers.
Finally, two studies used mobile devices to track facial expressions [25,36]. Egger et al. [25] developed an iPhone/iPad-based application to screen for signs of ASD in the general population. The app includes a short set of questionnaires as well as four brief videos. While the child watches the videos, the camera embedded in the device records his or her face. The recorded videos are then uploaded by the caregivers to a server that automatically analyzes the child's facial expressions and attention to estimate the risk of ASD. Preliminary results indicated that (a) the majority of parents were willing to upload the full videos of their children, and (b) significant associations were found between emotion and attention measures and age, sex, and autism risk status (based on M-CHAT scores). Similar encouraging results were reported by Carpenter et al. [36], who seemingly used the same system as that tested by Egger et al. [25].

Solutions Tested with Children up to Six Years
Vargas-Cuentas et al. [49] presented a 1-min video displaying a social scene with playing children and an abstract scene with moving shapes on either side of the screen. Children's eye gaze while watching the video was automatically tracked to assess spatial preference. Results from the proof-of-concept study comparing the eye gaze of children with ASD with that of their neurotypical peers showed that the former group spent 26.9% to 32.8% of the time gazing at the social scene, compared to 44.2% to 50.7% for the control group.
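The spatial-preference score in this kind of test reduces to the share of validly tracked frames in which gaze falls on the social half of the screen. A minimal sketch (function name and labels are ours, not Vargas-Cuentas et al.'s actual implementation):

```python
def social_preference(frame_labels):
    """Percent of on-screen frames spent gazing at the social scene.

    frame_labels: one label per video frame, e.g. "social", "abstract",
    or "away" (gaze off-screen or undetected). Returns None if the child
    never looked at either scene.
    """
    on_screen = [f for f in frame_labels if f in ("social", "abstract")]
    if not on_screen:
        return None
    return 100 * sum(f == "social" for f in on_screen) / len(on_screen)
```

Under this scoring, the percentages reported above correspond to group means of the per-child preference values.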
Anzulewicz et al. [48] used two commercially available gameplays running on iPad to record children's movements while interacting with the device. Differences between children with a diagnosis of ASD and their neurotypical peers were estimated by means of a machine learning algorithm which resulted highly accurate in distinguishing the two groups based on the sole kinematics information.
Wan et al. [51] used an eye tracker to distinguish children with ASD from their neurotypical peers. They developed a rapid screening session which involved the presentation of a video showing a speaking girl for a very brief time interval (i.e., about 10 s). Automatic analysis of children's gaze produced reliable results in distinguishing between the two groups (i.e., ASD and neurotypical). Despite several differences in gazing behavior between the two groups while watching the speaking face, only the fixation times at the moving mouth and body could significantly discriminate the ASD group from the control group with acceptable classification accuracy.
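As an illustration of this kind of fixation-based group classification, the sketch below substitutes a simple nearest-centroid rule for the support vector machine used by Wan et al., so it stays dependency-free; the two features (mouth and body fixation times) follow the paper's finding, but all names and values are ours:

```python
def centroid(points):
    """Mean (mouth, body) fixation-time vector of a training group."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

def classify_fixations(sample, asd_group, td_group):
    """Assign a (mouth, body) fixation sample to the nearer group centroid."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    c_asd, c_td = centroid(asd_group), centroid(td_group)
    return "ASD" if dist2(sample, c_asd) < dist2(sample, c_td) else "TD"
```

A trained SVM instead learns a maximum-margin boundary between the two groups, but the decision it produces has the same form: a new child's fixation profile is mapped to one side of a boundary estimated from labeled examples.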
Duda et al. [33] tested the Mobile Autism Risk Assessment (MARA) screening tool with children aged between 16 months and 17 years referred to a developmental-behavioral pediatric clinic. MARA is a 7-item parent questionnaire that can be administered via an electronic platform with automatic scoring. Before its implementation in a clinical setting, the questionnaire was validated in a series of preliminary studies [60]. Results from the implementation study showed that children who received a clinical ASD diagnosis were more likely than those without a clinical ASD diagnosis to receive a MARA score indicative of ASD. Importantly, the respondent could complete the MARA questionnaire either at home or in the clinic. Based on this preliminary clinical validation, two further papers by Abbas et al. [37] and Kanne et al. [39] tested the Cognoa application with children aged between 18 and 72 months. Cognoa is a mobile-based application (i.e., tablet; smartphone) using the same algorithm as MARA. It follows a two-stage approach to ASD screening whereby a parent (a) answers a 15-item questionnaire and (b) uploads through the mobile phone at least one 1-2 min video of the child, recorded in different everyday scenarios (e.g., mealtime, playtime, or conversations). Videos are then rated by specialized assessors to determine the need for further assessment. Results indicated that Cognoa (a) performed similarly to other screening measures (i.e., M-CHAT-R/F; SCQ; SRS; CBCL-ASP), and (b) was able to reliably screen children across the whole 18-72-month age range, thus covering the screening age gap between 30 and 48 months.
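A two-stage flow of this kind (parent questionnaire plus rated home video) can be sketched as a simple pipeline; the equal-weight averaging and the 0.5 referral threshold below are illustrative assumptions of ours, not Cognoa's actual combination rule:

```python
def two_stage_screen(questionnaire_risk, video_risk, threshold=0.5):
    """Combine two risk scores (each scaled to [0, 1]) into one decision.

    questionnaire_risk: score derived from the parent questionnaire
    video_risk: score derived from the assessor-rated home videos
    The equal weighting and the threshold are illustrative only.
    """
    combined = 0.5 * (questionnaire_risk + video_risk)
    return "refer for full evaluation" if combined >= threshold else "no referral"
```

In practice, combining two partially independent information sources in this way is what lets such tools trade off the sensitivity of parent report against the specificity of direct behavioral observation.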
In a similar vein, Tariq et al. [50] created a mobile web portal to test the ability of machine learning to reliably detect autism based on short home videos of children. The results suggest that machine learning may enable rapid ASD detection outside of clinics, thus reducing waiting periods for access to care and reaching underserved populations.

L2 Screening Tools Solutions Tested with Children up to 18 Months
Two papers were included that involved children up to 18 months [42,43]. Young et al. [42] developed a web-based application named Video-referenced Infant Rating System for Autism (VIRSA). The application is intended to be used by parents and shows pairs of videos of parents and infants playing together. After the presentation of each pair, the respondent is asked to judge which video is most similar to the child being rated. The application was tested with infants who had an older sibling with ASD, with preliminary results showing that VIRSA correctly identified all children diagnosed with ASD at 18 months.
Talbott et al. [43] reported on the feasibility of instructing parents to administer specific semi-structured behavioral probes using the Telehealth Evaluation of Development for Infants (TEDI). This approach proved reliable and acceptable to parents, although the sample involved was relatively small (i.e., 11 children).

Solutions Tested with Children up to 48 Months
Four papers were identified involving children aged between 10 and 48 months. Pierce et al. [41] developed the GeoPref test based on the assumption that a preference for geometric shapes over social content might be a reliable biomarker of ASD (see also [61]). The test involved the use of an eye tracker that monitored the gaze behavior of the child while he or she watched a video presenting dynamic geometric images paired with a video presenting dynamic social images. Results showed that a subset of ASD toddlers who fixated on the geometric images more than 69% of the time was accurately identified as being on the spectrum with high specificity. These promising results were further replicated by Moore et al. [40] using longer and more complex social scenes (see also [62] for the use of the GeoPref test as a symptom-severity prognostic tool).
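The GeoPref decision rule itself is a single fixation-percentage cutoff (69%, as reported for the test); a minimal sketch, with function and argument names ours:

```python
GEO_FIXATION_CUTOFF = 69.0  # percent, as reported for the GeoPref test

def geopref_positive(geo_fixation_ms, social_fixation_ms,
                     cutoff=GEO_FIXATION_CUTOFF):
    """Flag a toddler if fixation on geometric images exceeds the cutoff.

    geo_fixation_ms / social_fixation_ms: total fixation time (ms) on the
    geometric and social videos, respectively.
    """
    total = geo_fixation_ms + social_fixation_ms
    pct_geometric = 100 * geo_fixation_ms / total
    return pct_geometric > cutoff
```

A high cutoff like this trades sensitivity for specificity: only a subset of ASD toddlers exceed it, but those who do are flagged with few false positives, consistent with the results reported above.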
Wedyan and Al-Jumaily [45] conducted a proof-of-concept study to investigate the use of a wrist-worn light sensor to monitor object manipulation skills of children while they inserted a ball into a plastic tube. Automatic classification of the movement data was able to differentiate children at high risk of ASD from those at low risk with high accuracy.
Oller et al. [44] used the Language ENvironment Analysis (LENA) system to collect whole day audio recordings of infants in their homes. They further developed an automated approach to data analysis that was able to differentiate vocalizations produced by neurotypical children from those produced by children with ASD or language delay.

Solutions Tested with Children up to Six Years and Older
Two papers were included in this group. Frazier et al. [38] estimated an Autism Risk Index by means of eye-tracking technology used to record fixations of children while presented with a variety of social and nonsocial visual stimuli. The results indicated that, for children with ASD aged 48 months and older, the index was able to classify their clinical condition with very good accuracy. Classification accuracy was also strong for children aged 30 months or younger.
Ramirez-Duque [56] tested the feasibility of using a social robot with a humanoid appearance to elicit and assess joint attention in children with a diagnosis of ASD. The robot was used in triadic interactions. The results showed that children with ASD produced fewer joint attention-related behaviors compared to a control group of children with other neurodevelopmental disorders.

Technology Maturity
About half (57%; n = 16) of the papers identified were classified as reporting on a Functional Prototype (see Figure 3). Of these prototypes, 10 (62%) were L1 screening tools. Similarly, of the papers reporting on technologies classified as publicly available (n = 12), the majority (92%; n = 11) reported on L1 screening tools. Almost all the screening tools classified as publicly available (n = 10) were PC/Mobile interfaces used to administer parent-reported questionnaires for L1 screening. In contrast, functional prototypes were mostly represented by NUI interfaces (56%; n = 9), of which five involved the use of eye trackers. Table 2 reports key information on the psychometric properties of the screening tools assessed in the papers identified. Five studies reported all four metrics considered relevant for a screening tool (i.e., Sp; Se; PPV; NPV), and 18 papers reported at least one of these psychometric metrics or provided information on accuracy in detecting risk of ASD. Of the papers reporting psychometric information (n = 23), eight reported sensitivity and specificity values equal to or over 75%. It should be noted, however, that sensitivity values below this threshold may not be indicative of poor psychometric properties, as the tool may be reliable in detecting specific ASD subgroups (e.g., [41]).
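For reference, the four psychometric metrics above (Se, Sp, PPV, NPV) follow directly from the 2x2 confusion matrix of a screening test. A minimal sketch, with hypothetical counts and naming of our own:

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute the four screening metrics from 2x2 confusion-matrix counts.

    tp: at-risk children correctly flagged; fn: at-risk children missed;
    fp: not-at-risk children flagged; tn: not-at-risk children cleared.
    """
    return {
        "Se": tp / (tp + fn),    # sensitivity: proportion of true cases flagged
        "Sp": tn / (tn + fp),    # specificity: proportion of non-cases cleared
        "PPV": tp / (tp + fp),   # positive predictive value
        "NPV": tn / (tn + fn),   # negative predictive value
    }
```

For example, with hypothetical counts tp = 9, fn = 1, fp = 1, tn = 89, both Se and PPV equal 0.90; note that PPV and NPV, unlike Se and Sp, depend on the prevalence of ASD in the screened sample.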

Discussion
Prospective identification of early signs of ASD is widely considered a priority to ensure that children at risk of this condition have timely access to specialized services and interventions [11]. The aim of this paper was to provide healthcare and other practitioners with an overview of the technologies available to support them in the identification of overt behavioral signs of ASD in children up to six years of age. Overall, the solutions identified varied greatly in terms of screening modalities (e.g., questionnaires, behavior observations), type of interface used (e.g., tablets, eye tracker), the granularity of behavioral indicators used to estimate the risk for ASD (e.g., from subtle eye movements to behaviorally defined clinical symptoms), intended technology users (e.g., parents, clinicians), and age ranges covered by the screening tools developed. Notwithstanding such variability, psychometric information points to considering available technologies as promising supports in clinical practice to detect early signs of ASD in young children. In light of these findings, some considerations may be put forward.
First, one of the main barriers to ASD screening seems to be implementing such activity within routine clinical practice due to lack of administration or scoring time [9]. The literature identified in the current review suggests that the administration and the scoring of either existing (e.g., M-CHAT) or newly developed parent-reported questionnaires can be automated through machine learning (ML). Such ML-based solutions can be implemented within the EHR of specific primary care or specialized services (e.g., CHICA), and are effective in reducing the burden on care staff. Specifically, the evidence reviewed indicates a rapid increase in the number of children screened for ASD during the visits. Despite such encouraging results, however, it remains unclear whether clinicians would take advantage of this automated approach to screening. For instance, in the study by Downs et al. [32], almost half of positive M-CHAT results were not followed up by clinicians. A possible strategy to cope with this issue may be automating the whole screening process to ensure that at-risk children are properly assessed [32].
Second, several mobile solutions have been developed that allow data collection on children's behaviors in non-clinical settings (e.g., home). The most affordable and effective solutions include the use of smartphones to record videos of children in their daily contexts, which are subsequently analyzed (i.e., scored) by expert clinicians [37,39]. In these studies, home-made videos could be further supplemented by short questionnaires to improve the accuracy of the screening process. Alternatively, Young et al. [42] substituted text-based with video-based questionnaires to enable detection of ASD in infancy and clearly showed that video can be used to improve parent reporting of early development. Together, mobile-based solutions may be considered a strategy to (a) reduce the burden on health services, (b) increase the number of screened children, and (c) accelerate the diagnostic process. Further research is needed, however, to explore whether these mobile-based screening strategies can also be effective when used in other settings and by other users, such as kindergarten and pre-school teachers. Indeed, few screening tools have been developed for these stakeholders (i.e., pre-school teachers), despite their importance as informants on ASD children's social behaviors compared to their normative peer groups [63,64]. As mobile, interactive, and smart technologies (e.g., smartphones, tablets, robots) become increasingly available in educational settings to foster children's learning and creativity (e.g., [65][66][67]), teachers can be trained to use them also to contribute to the screening of young children, thus providing valuable information on children's behavior in socially rich environments (e.g., kindergartens; primary schools).
Third, encouraging evidence is available on the use of technology combined with ML to detect early signs of ASD through the monitoring and subsequent analysis of biobehavioral markers, such as speech, movement, and gaze behavior. In particular, monitoring of eye gaze behavior by means of an eye tracker emerged as the most used screening strategy to (a) distinguish between children at risk and neurotypical children (e.g., [49,51]), (b) perform L2 screening procedures (e.g., [38]), or (c) identify ASD subgroups [41]. Overall, current evidence suggests that monitoring of eye gaze should not be considered a replacement for more traditional screening practices (e.g., parent-reported questionnaires), but an additional source of information about early signs of ASD. As already mentioned, screening is indeed widely considered a multistep process, whereby failing a L1 assessment would require a secondary screener (L2) before initiating a diagnostic process [27]. Based on the present findings, we argue that the increased availability of affordable and reliable eye trackers could facilitate the diffusion of this screening strategy as L2 screeners in a variety of contexts. However, more research is needed on (a) the integration of this technology in routine clinical practice, (b) whether the use of eye trackers is acceptable to clinicians, and (c) how the information gathered from the analysis of children's eye movements can be integrated with the results obtained from more traditional screening tests.
Voice recordings and movement observation, as well as social robots, were further strategies identified in the present review to screen for ASD in young children (e.g., [52,53]). Although promising, these emerging technologies may be considered to be at an earlier stage of development compared to eye tracking.
Fourth, maturity of screening solutions in terms of technological development was found to be well balanced across maturity levels (i.e., Publicly Available, Functional Prototypes), but highly unbalanced with regard to the level of screening. Specifically, almost all the solutions included in the Publicly Available category belong to L1 (or universal) screening tools. This is not surprising, given that the majority of the L1 screening solutions identified are parent-reported questionnaires, which include already validated (and available) tools (e.g., M-CHAT). Based on this finding, it can be argued that the transition from traditional to technology-based screening tools may be primarily based on the adaptation of currently available screening strategies (i.e., questionnaires).
Fifth, understanding the feasibility, acceptability, and effectiveness of implementing telehealth assessment is becoming of fundamental importance to cope with limitations to health service delivery due to either low available resources (e.g., lack of trained staff) or public health emergencies (e.g., coronavirus disease 2019) [68,69]. As shown in the study by Talbott et al. [43], this approach required the active involvement of parents, who had to elicit target behaviors and collect data to be shared with expert clinicians. Though telehealth assessment proved acceptable to parents, more research is needed to understand its applicability to parents who may experience language barriers or are less confident with technology.
Sixth, although we attempted to provide a comprehensive overview of the technology-based solutions available to screen for ASD, some limitations may have reduced the number of potentially relevant screening solutions identified. For instance, we excluded papers reporting on screening tools at a conceptual design phase that were not tested with the target population. Two further limitations include the decisions (a) to focus on screening tools assessing overt children's behaviors, thus excluding technologies to detect biological markers related to the ASD condition, and (b) to exclude literature focusing exclusively on ML approaches to ASD screening that were not implemented in clinical settings.
In conclusion, the results of the present review of the literature suggest that technology may be a valuable support for ASD screening. Already validated parent-reported questionnaires may be easily adapted to be administered through mobile platforms to speed up the administration and scoring processes. Commercially available mobile technologies may be used to extend the screening process to children's life settings (e.g., home, kindergartens). In addition, more sophisticated technologies such as eye-trackers may be considered as a valid supplement to traditional screening measures.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix B
In this first step, the titles of the papers retrieved will be reviewed by three independent researchers (Lorenzo Desideri, Patricia Pérez-Fuster, and Gerardo Herrera) and scored as not relevant (0), probably relevant (1), or relevant (2). The scores will be added to make a sum score ranging from 0 to 6. All publications with a sum score of 6 will be selected for the next step. In general, in case of doubt, please keep the title in the list (i.e., if the age range is not specified, if the target population is not clear or may include ASD together with other populations, if it is not clear whether it relates to screening/monitoring/intervention, or if it is not clear whether it is a review paper or a primary study). Table A5. Instructions for titles scoring.

Score: 0 points
Instructions: (a) Title refers to a different age range than 0-6; OR (b) Title refers to a different term than autism (i.e., elderly or cerebral palsy, but not autism-related terms); OR (c) Title refers to a different application area than screening/monitoring or intervention; OR (d) Title is related to a systematic review or meta-analysis (instead of being a primary study); OR (e) Title is related to genetic/biochemical research.
Examples: Title 1: "Digital images as meaning bridges: Case study of assimilation using avatar software in counselling with a 14-year-old boy". Explanation: The study satisfies two inclusion criteria: (1) it involves autism, (2) it refers to a technology-based intervention. However, it is explicitly mentioned that it does not focus on pre-school children. Title 2: "Technology-mediated learning in students with ASD. A bibliographical review". Explanation: The study satisfies two inclusion criteria: (a) it involves autism, (b) it refers to a technology-based intervention. However, it is a systematic review.

Score: 1 point
Instructions: (a) Title includes any term related to autism spectrum disorder condition (autis* or asperger* or pervasive or PDD or PDDNOS or pervasive develop* or autistic) OR (b) Title refers to (any kind of) technology-based intervention or screening (or monitoring); AND (c) Title does not qualify for any of the 5 options that apply for 0 points.
Examples: Title 1: "Sustained Community Implementation of JASPER Intervention with Toddlers with Autism". Explanation: The article refers to autism (and toddlers), which is the focus of our study. Even if we don't know whether JASPER is a technology-based intervention, it is worth including this article in the next step. Title 2: "Factor Analysis of the Childhood Autism Rating Scale in a Sample of Two Year Olds with an Autism Spectrum Disorder". Explanation: The study satisfies two inclusion criteria: (a) it involves autism, (b) it refers to a tool for diagnosis. I know that the Childhood Autism Rating Scale is an observational tool, but I would prefer to be highly inclusive in this very first step.

Score: 2 points
Instructions: (a) Title includes any term related to autism spectrum disorder condition (autis* or asperger* or pervasive or PDD or PDDNOS or pervasive develop* or autistic) AND (b) Title refers to (any kind of) technology-based intervention or screening (or monitoring); AND (c) Title does not qualify for any of the 5 options that apply for 0 points.
Examples: Title 1: "Randomised controlled trial of an iPad based early intervention for autism: TOBY playpad study protocol". Explanation: The study satisfies the inclusion criteria: (a) it involves autism AND (b) it refers to a technology-based intervention. Title 2: "Automatic newborn cry analysis: a non-invasive tool to help autism early diagnosis". Explanation: The article refers to autism (and newborns). We might suppose that the mentioned "tool" is a kind of digital technology. Hence, it would be better to include this title in the next step.
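The title-selection rule described above (three independent reviewers, each scoring 0-2, with only a unanimous sum score of 6 passing to the next step) can be sketched as follows; the function name and data shape are our own:

```python
def title_decision(scores):
    """Decide whether a title passes the first screening step.

    scores: the three independent reviewers' scores, each 0 (not relevant),
    1 (probably relevant), or 2 (relevant). Per Appendix B, only titles
    with a sum score of 6 are selected for the next step.
    """
    if len(scores) != 3 or any(s not in (0, 1, 2) for s in scores):
        raise ValueError("expected three scores, each 0, 1, or 2")
    return "selected" if sum(scores) == 6 else "excluded"
```

Note that requiring a sum of 6 is equivalent to requiring that all three reviewers independently score the title as relevant (2), i.e., a unanimity rule.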