Pupillometry via smartphone for low-resource settings

The photopupillary reflex regulates the pupil's reaction to changing light conditions. Being controlled by the autonomic nervous system, it is a proxy for brain trauma and for the condition of patients in critical care. A prompt evaluation of brain trauma can save lives. Skilled clinicians can perform it with a simple penlight, whereas less specialized ones have to resort to a digital pupillometer. However, many low-income countries lack both specialized clinicians and digital pupillometers. This paper presents the early results of our study aimed at designing, prototyping and validating an app for testing the photopupillary reflex via Android, following the European Medical Device Regulation and relevant standards. After a manual validation, the prototype underwent a technical validation against a commercial infrared pupillometer. The proposed app performed as well as the manual measurements and better than the commercial solution, with lower errors, higher and significant correlations, and significantly better Bland-Altman plots for all the pupillometry-related measures. The medical device was designed based on our expertise in low-resource settings. Such environments impose more stringent design criteria due to contextual challenges, including the lack of specialized clinicians and consumables, poor maintenance, and harsh environmental conditions, which may hinder the safe operationalization of medical devices. This paper provides an overview of how these unique contextual characteristics are cascaded into the design of an app in order to contribute to the United Nations Sustainable Development Goal 3: Good health and well-being. © 2021 The Author(s). Published by Elsevier B.V. on behalf of Nalecz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Background
The photopupillary reflex regulates pupil dilation and constriction according to the intensity of the light that hits the retina, and is controlled by the sympathetic and parasympathetic nervous systems. Therefore, this reflex is used as an indirect measure of the state of the central and autonomic nervous systems [1]. The key medical applications of photopupillary reflex measurements include the detection of brain trauma and the assessment of its severity [2,3], the assessment of the level of anesthesia and pain [4,5], an aid to the certification of death [6], the evaluation of alcohol [7] and drug intoxication [8,9], and the study of ophthalmological diseases such as diabetic retinopathy and Horner's syndrome [1,10]. A quick evaluation of brain trauma via pupillometry, i.e., the measurement of pupil size, symmetry and reactivity, can make a difference to the patient's health and future life and is an essential part of the supportive care provided in this case [11,12]. The early management of traumatic brain injury, in fact, minimizes the progression of the injury and improves recovery and clinical outcomes [12,13]. Accordingly, in many high-income countries, technologies for photopupillary reflex analysis have been proposed [14][15][16][17], including smartphone-based ones [18][19][20][21]. Recently, an app for tracking the photopupillary reflex using trained object detectors was introduced [22]. As regards pupil and iris detection algorithms, various technical solutions are available, including edge detection and Hough transform [23], Starburst transform [24], blob detection algorithms [25], watershed segmentation [26], gradient vector flow snake-based methods [27], and deep learning [28]. However, very little has been proposed for low- and middle-income countries (LMICs), where traumatic brain injury is becoming one of the main causes of morbidity and mortality.
In fact, Africa owns less than 5% of the motor vehicles in the world, yet accounts for 10% of global deaths caused by vehicular injuries [29]. In LMICs, and in particular in low-resource settings (LRSs), there is a lack of expertise and diagnostics to assess brain trauma [30]. Accordingly, the United Nations (UN) aims to "halve the number of global deaths and injuries from road traffic accidents". This is target 3.6 of the UN Sustainable Development Goal (SDG) number 3, Good health and well-being [31].
The photopupillary reflex can be measured with a simple penlight. Despite the simplicity of the device, accurate and reliable assessments of the photopupillary reflex require an experienced user: Couret et al. [32] demonstrated that penlight observation of the photopupillary reflex in neurocritical care is prone to human error, limited reproducibility and low precision. In many LMICs, diagnosis and healthcare delivery are hindered by the lack of specialized clinicians, alongside the lack of resources and a poor supply chain [33]. An alternative is the digital pupillometer, i.e., a medical device performing automated pupillometry using infrared cameras. Such devices are expensive and not designed (i.e., not resilient) to operate in the harsh environments (i.e., dusty, warm, humid, with unstable power supply etc.) typical of Sub-Saharan Africa (SSA).
This article presents the early results of our study aimed at designing, prototyping and validating a mobile app, based on relevant international and military standards, for testing the photopupillary reflex via Android in LRSs.
The aim of this app is to act as a screening tool that can be used by nurses (or also lay-users) to test the direct pupillary reflex in order to screen the incoming patients' conditions (e.g., suspected presence of brain injuries) and plan further investigations. This is crucial in LRSs.
Specifically, this paper describes the acquisition of videos, the signal processing and their technical validation. The results from eight field studies in SSA have informed the contextualized and user-driven design, and can also be relevant for informing the design of other devices for LMICs. In fact, we added design criteria to address the challenges typical of SSA, which include the lack of specialized clinicians, the scarcity of funds, spare parts and consumables, and poor maintenance, all of which hinder the safe and efficient operationalization of medical devices. This paper demonstrates how these peculiar contextual characteristics can be cascaded into the design of a mobile app and the redesign of a medical device.
This work was inspired and informed by existing regulations and standards. In particular, those related to existing pupillometers were taken into consideration, because of their similarity to our solution. A further detailed analysis of standards and requirements will be needed in the later stages to pass from prototype to product.

Ethnography-driven user-need and contextual analysis in LMICs
Designing medical devices for LRSs requires the synergy of different but complementary methodologies, comprising not only engineering, scientific and quantitative techniques, but also qualitative approaches such as ethnographic research [34]. Ethnography applied to design is, in fact, one of the keys to furthering current technological progress, as it allows designers and researchers to understand the design challenges more deeply, with a focus on a particular kind of end-user and the surrounding context. For this reason, we conducted the need and context analyses with a mix of methodologies (Fig. 1).
They were structured in three steps, which were iterated twice: general formalization, contextualization in SSA countries, and field studies in Benin and Uganda.
The first requirements were identified by reviewing the literature on medical devices and their related standards (i.e., ISO 14971 - Medical devices - Application of risk management to medical devices; IEC 62366 - Medical devices - Part 1: Application of usability engineering to medical devices; IEC 62304 - Medical device software - Software life cycle processes; ISO 15004 - Ophthalmic instruments - Fundamental requirements and test methods), and by holding focus groups with international experts in medical device design and management, and in hospital engineering. Five focus groups were held with world-leading experts in biomedical and clinical engineering during international conferences [35][36][37][38][39].
The contextualization in SSA was performed by: 1) administering surveys to African Scholars; 2) holding focus groups with biomedical and clinical engineers in SSA countries (in accordance with the ethical approval REGO-2018-2283). Five focus groups were held in SSA and were attended by delegates from more than 12 SSA countries (two during the Africa-Health conferences, two in Benin at the Ecole Polytechnique d'Abomey-Calavi, and one in Ethiopia) [40].
Three field studies were conducted in Benin in April 2017, January 2018 and November 2019 and one in Uganda in October 2019. During these studies, several aspects of medical devices and medical locations were analyzed. This included electric measurements, examinations of medical devices, inspections of medical locations in 6 African hospitals and semi-structured interviews with the available staff, including biomedical engineers, technicians, nurses, doctors and hospital directors [41][42][43].
In collaboration with the International Federation of Medical and Biological Engineering (IFMBE) African Working Group [40], focus groups were organized.
The ethnographic analysis was conducted in accordance with the ethical approval REGO-2018-2283, obtained from the Biomedical and Scientific Research Ethics Committee.
For quality assurance, we followed the prescriptions of the European regulations on medical devices, which equate medical apps to medical devices. Moreover, we based our work on the 5As principles of the World Health Organization (WHO), i.e., affordability, availability, adequacy, accessibility, and appropriateness, in line with the solutions proposed in the WHO compendium of innovative health technologies for LRSs.

Development of the smartphone-based pupillometer
The development of the smartphone-based pupillometer followed 5 stages, namely the smartphone pupil stimulation and video acquisition, the preprocessing, the image processing, the system integration and the technical validation, which are described in the following subsections. These stages were developed and validated with videos acquired from 11 healthy subjects in accordance with the ethical approval obtained from the Ethical Committee of University of Campania Luigi Vanvitelli. Further details about the dataset can be found elsewhere [14].

Smartphone pupil stimulation and video acquisition
During the first feasibility study, the pupil of a subject with light brown eyes was stimulated with the flash embedded in a smartphone (i.e., ZTE Blade C341), with the illuminance set at 480 lx and the duration at 500 ms. The photopupillary reflex was captured with a second smartphone, namely a Samsung Galaxy A7 (2016) with a 13-megapixel camera, although the final app integrates both functions, so that a single smartphone without tripod support is needed for future use. In the final app the video recording and the flash are synchronized as follows: the flash starts 2 s after the recording has started and lasts for 500 ms, and the recording is stopped when 9 s in total are reached. This allows a recording of about 6-7 s of the pupil reaction, in line with the typical duration of the event [44]. Both smartphones were selected based on their availability at the time of the experiments and were placed on a tripod at a distance of 8 cm from the subject's face. In the literature, distances in the range of 8-15 cm have been used [20,45,46]. As the luminance varies with the distance from the light source following an inverse-square law [47], the luminance at various distances (i.e., 5-20 cm, with a 1-cm step) was also evaluated to check whether small differences in distance could significantly affect it. In the above-mentioned range, i.e., 11.5 ± 3.5 cm, the luminance had an average percentage change of 10.7%. Furthermore, we investigated whether the flashes at 8 cm and at 15 cm would trigger a pupil reaction, and whether the minimum reached by the pupil would be similar. This was tested with two smartphone models available at the time of writing (i.e., a Doogee S60 Lite and an iPhone 7). The pupil minimum, expressed as the normalized pupil/iris ratio, varied in the range 71.2 ± 2.7% (percentage variation of 3.8%) with the Doogee S60 Lite flash, and in the range 50.7 ± 0.4% (percentage variation of 0.8%) with the iPhone 7.
This difference is probably due to the more powerful flash embedded in the second smartphone model. Therefore, assuming the test will be run with the same device, the pupil contraction will remain substantially the same in the recommended distance range.
The two smartphones were placed at a height such that the eye was at the center of the frame. The opposite eye was neither stimulated by the flash, nor covered. Throughout these experiments the ambient light was measured using a luxometer (Dr. Meter, LX1010B), in order to ensure an approximately constant baseline light intensity.
The videos were then fed to a dedicated algorithm as detailed below. At this prototyping stage all the signal processing is performed in Matlab.
In the future, the definitive algorithm will either run on a dedicated server application that will include a Matlab compiled dynamic link library (DLL), to make the app accessible from older smartphone models, or will be embedded in the mobile device itself, for the best-performing models.

Preprocessing
The images contained in the frames of the video are preprocessed by converting them to gray-scale (see Fig. 2, IIa), binarizing them according to a threshold, and applying morphological opening (i.e., erosion followed by dilation) and closing (i.e., dilation followed by erosion) to remove any unrelated dark pixels or particularly small objects (see Fig. 2, IIb). The binarization phase is a pivotal pre-step and can be influenced by the overall light intensity of the frames: if the intensity changes over the frames, the results may not be ideal. When the acquisition is performed with a high light intensity (e.g., flashlight on), a higher threshold is selected: the mean intensity of the first 3 frames is taken as the baseline, and the higher threshold is applied whenever the mean intensity of a frame is more than 5% greater than the baseline. The values of the two thresholds were determined empirically. The final app records 9-s H.264-encoded videos with a 30-fps framerate and an average size of 11.5 MB.
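The preprocessing chain described above can be illustrated with a minimal Python sketch (the prototype itself runs in Matlab, so this is not the authors' code). The two threshold values, the 3x3 structuring element, the convention that dark pixels map to 1, and all function names are assumptions introduced only for illustration.

```python
from statistics import mean

# Hypothetical gray levels: the paper only states that the two
# thresholds were determined empirically.
LOW_THRESHOLD = 40
HIGH_THRESHOLD = 90


def frame_mean(frame):
    """Mean gray level of a frame given as a list of rows of 0-255 values."""
    return mean(p for row in frame for p in row)


def pick_thresholds(frames, n_baseline=3, rise=0.05):
    """Use the higher threshold for frames more than 5% brighter than baseline."""
    baseline = mean(frame_mean(f) for f in frames[:n_baseline])
    return [HIGH_THRESHOLD if frame_mean(f) > baseline * (1 + rise)
            else LOW_THRESHOLD for f in frames]


def binarize(frame, threshold):
    """Assumed convention: dark pixels (candidate pupil) become 1."""
    return [[1 if p < threshold else 0 for p in row] for row in frame]


def _morph(mask, combine):
    """Apply min (erosion) or max (dilation) over a 3x3 neighbourhood.

    Windows are simply truncated at the image border.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = combine(window)
    return out


def erode(mask):
    return _morph(mask, min)


def dilate(mask):
    return _morph(mask, max)


def opening(mask):
    """Erosion followed by dilation: removes small isolated objects."""
    return dilate(erode(mask))


def closing(mask):
    """Dilation followed by erosion: fills small holes."""
    return erode(dilate(mask))
```

In this sketch, a single isolated pixel disappears after `opening`, whereas a 3x3 block survives unchanged, which is the behaviour expected when removing small unrelated objects before pupil detection.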

Image processing
The image recognition algorithm consists of two main parts: the pupil recognition and the iris recognition. The iris part was included as a result of the contextualization of the design, as further explained in the results subsection 3.1, "Ethnography-driven user-need and contextual analysis".

Pupil recognition algorithm.
The algorithm (see Fig. 2) starts by prompting a user input, i.e., the framing of the part of interest (i.e., the eye). The user can draw a rectangular box, superimposing it on the first frame of the video, and the coordinates of such polygon will be utilized to crop the frames used during the tuning step of the algorithm (see Fig. 2, I). The latter consists in running the commands for preparing the images (as described above) and for finding the pupil and its center, only over the first three frames of the video.
As regards the pupil recognition, three different approaches were tested, namely blob-detection algorithm, circular Hough transform, and watershed transform. These methods were assessed computing the mean absolute error (MAE) and the correlation with the manual measurement (Pearson's r). In particular, the blob-detection algorithm used in the comparison was a variation of the original that consisted in removing the spikes and substituting them with the average value of the points preceding and following the spike.
The blob-detection algorithm was inspired by Barragan's [25] algorithm, which detects the "blob" with the greatest area contained in the picture, circles it, and finds its center. The diameter output of the tuning stage is called baseline diameter, because it is then utilized to automatically calculate the cropping frame dimensions for the main part of the algorithm, by creating a framing square with a side equal to four times the baseline diameter, centered on the pupil. The same pupil recognition algorithm is then called again upon all the newly-cropped frames and an array containing the unprocessed diameter is saved (see Fig. 2, III).
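As an illustration of the blob-based step and of the automatic cropping square, the following Python sketch finds the largest 4-connected blob in a binary mask and derives a crop box with a side of four times the baseline diameter. It is not the Matlab implementation: the use of 4-connectivity, the centroid as center, and the equivalent-circle diameter (the diameter of a circle with the same area as the blob) are assumptions, as are the function names.

```python
from collections import deque
from math import pi, sqrt


def largest_blob(mask):
    """Return ((cx, cy), diameter) of the largest 4-connected blob, or None.

    The diameter is that of a circle with the same area as the blob
    (an assumption made for this sketch).
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, blob = deque([(y0, x0)]), []
            while queue:  # breadth-first flood fill of one blob
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) > len(best):
                best = blob
    if not best:
        return None
    cy = sum(y for y, _ in best) / len(best)
    cx = sum(x for _, x in best) / len(best)
    return (cx, cy), 2 * sqrt(len(best) / pi)


def crop_box(center, baseline_diameter, factor=4):
    """Square (x_min, y_min, x_max, y_max) with side factor * baseline diameter."""
    cx, cy = center
    half = factor * baseline_diameter / 2
    return (cx - half, cy - half, cx + half, cy + half)
```

For example, `crop_box((10, 10), 5)` yields a 20-pixel square centered on the pupil; clipping the box to the image boundaries is left out of this sketch.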

Iris recognition algorithm.
The algorithm (see Fig. 3) starts by prompting a user input, i.e., the framing of the part of interest (i.e., the iris). A circle is superimposed over the first frame and the user can resize it according to the iris boundary in the frame. Given the position of three points of such circle, its equation is derived as well as the baseline radius of the iris, which is then used as a parameter for the circular Hough transform algorithm (see Fig. 3, II).
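The derivation of the circle from three of its points can be sketched as follows. This is the standard circumcircle formula, written in Python for illustration; the function name and the choice to raise an error for collinear points are assumptions of this sketch.

```python
from math import hypot


def circle_from_points(p1, p2, p3):
    """Return ((cx, cy), radius) of the circle through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if d == 0:
        raise ValueError("the three points are collinear")
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return (cx, cy), hypot(x1 - cx, y1 - cy)
```

The returned radius plays the role of the baseline iris radius that is passed on to the circular Hough transform.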
In this case, the same three algorithms tried out for the pupil recognition were tested as well.
Before being fed to this algorithm, the frames are preprocessed as described above (see Fig. 3, I, IIIa, IIIb). Moreover, as the outcome of the application of the circular Hough transform algorithm depends on the sensitivity, an extra precaution is taken in this direction. In fact, although the initial sensitivity is set to 0.88, if no circle is found during the first run, the sensitivity is increased by 0.02 until a circle is found. An array containing the unprocessed diameter is saved.
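The sensitivity escalation can be sketched in Python as a simple retry loop. The `detect` argument is a hypothetical callable standing in for the circular Hough transform (it is assumed to return a possibly empty list of circles), and the upper bound of 1.0 is an assumption based on the sensitivity being expressed in the range [0, 1]; the paper does not state when the search stops.

```python
def find_circle_with_escalation(detect, frame, start=0.88, step=0.02,
                                max_sensitivity=1.0):
    """Retry circle detection with increasing sensitivity.

    detect(frame, sensitivity) is a hypothetical detector returning a
    list of circles. Returns (circle, sensitivity) or (None, None).
    """
    k = 0
    while True:
        # Recompute from the start value to avoid accumulating float error.
        sensitivity = round(start + k * step, 2)
        if sensitivity > max_sensitivity:
            return None, None
        circles = detect(frame, sensitivity)
        if circles:
            return circles[0], sensitivity
        k += 1
```

Raising the sensitivity makes the detector accept weaker circle candidates, so stopping at the first hit keeps the lowest sensitivity that still finds the iris.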

Postprocessing.
The acquired pupil measurements were often subject to artifacts such as blinking and image overexposure due to flash. For this reason, a series of functions were applied in order to smooth spikes and filter out noise. In particular, the affected data were identified, removed from the dataset and the gaps were filled using an interpolation.
As regards the flash, the above-mentioned threshold for the binarization was designed to tackle this problem. However, the method is not completely robust and can fail to switch to the correct threshold in the first and final frames of the flash, when it is not at full brightness. Consequently, the flash-related frames were removed and substituted with a linear interpolation. This would not affect the overall performance as the first part of the constriction phase of the photopupillary reaction is steep and approximately linear.
As regards the blinking artifact, it partially or completely obstructs the pupil, making the algorithm track nothing or a larger area (e.g., a shadow under the eyelid). Such a sudden change of the detected area is a good indicator of when the pupil detection fails. Detrending and differentiation were used to identify these sudden changes: the data points affected by blinking are flagged when the local derivative exceeds a threshold in magnitude. Also in this case, the flagged points are removed and linear interpolation is used to fill the gaps.
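A simplified Python sketch of this artifact-removal step is given below. It flags every sample reached through a frame-to-frame step larger than a threshold and fills the flagged samples by linear interpolation between the nearest unflagged neighbours. The detrending stage is omitted, the threshold value is left to the caller, and the function names are hypothetical; samples inside a long flat artifact would not be flagged by this simple first-difference rule.

```python
def flag_spikes(values, threshold):
    """Flag samples whose step from the previous sample exceeds the threshold."""
    flagged = [False] * len(values)
    for i in range(1, len(values)):
        if abs(values[i] - values[i - 1]) > threshold:
            flagged[i] = True
    return flagged


def interpolate_flagged(values, flagged):
    """Replace flagged samples by linear interpolation between good neighbours.

    Flagged samples at either end take the value of the nearest good sample.
    """
    good = [i for i, f in enumerate(flagged) if not f]
    if not good:
        raise ValueError("no unflagged samples to interpolate from")
    out = list(values)
    for i, is_flagged in enumerate(flagged):
        if not is_flagged:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            t = (i - left) / (right - left)
            out[i] = values[left] + t * (values[right] - values[left])
    return out
```

Note that the sample right after a spike is also flagged, because the step back to the true value is large as well; the interpolation then simply bridges both samples.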
Finally, the ratio between the diameter of the pupil and that of the iris is calculated and normalized to the initial value. The values related to the frames preceding the pupil reaction are identified and substituted with 100% values. The part of the array related to the pupil reaction is fitted with a Gamma function, as suggested by Knapen et al. [48] (see Fig. 2, IV).
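The normalization step can be written as a short Python sketch. It assumes that the "initial value" is the ratio measured in the first frame and that the index of the first reaction frame (`reaction_start`, a hypothetical parameter) is already known; the Gamma fit is not reproduced here.

```python
def normalized_ratio(pupil_diameters, iris_diameters, reaction_start):
    """Pupil/iris ratio in percent of its initial value.

    Frames before reaction_start are set to 100%, as described in the text.
    """
    ratios = [p / i for p, i in zip(pupil_diameters, iris_diameters)]
    initial = ratios[0]  # assumption: first frame taken as the reference
    percent = [100.0 * r / initial for r in ratios]
    return [100.0] * reaction_start + percent[reaction_start:]
```

Because pupil and iris diameters scale by the same factor when the eye-to-phone distance changes, the ratio is unaffected by that factor: multiplying both input lists by, say, 1.3 leaves the output unchanged.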
The algorithm also calculates some variables relevant to pupillometry: Pupil minimum, the minimum size reached by the pupil at the end of the constriction phase, which was considered to start when the pupil/iris normalized ratio fell under 98% of its original value; Latency, the delay in the pupil response, calculated as the time between the start of the flash and the start of the constriction phase; Max constriction velocity, the maximum rate of change of the pupil diameter during the constriction phase; Mean constriction velocity, the average rate of change of the pupil diameter during the constriction phase; Mean dilation velocity, the average rate of change of the pupil diameter during the dilation phase, which is contiguous to the constriction phase and was considered to end when the pupil/iris normalized ratio rose back above 98% of its original value; T75, the time taken by the pupil to recover 75% of the amplitude of the constriction, starting from the peak of the constriction.
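The variables listed above can be computed from the normalized trace roughly as in the following Python sketch. Several details are assumptions made for illustration: velocities are reported as magnitudes in percentage points per second, the constriction amplitude is taken as 100% minus the pupil minimum, the maximum velocity is estimated from frame-to-frame differences, and the function and key names are hypothetical.

```python
def pupil_metrics(trace, fps, flash_frame, onset_level=98.0):
    """Compute pupillometry variables from a normalized pupil/iris trace (%).

    Returns None if the trace never falls below the onset level.
    """
    start = next((i for i in range(flash_frame, len(trace))
                  if trace[i] < onset_level), None)
    if start is None:
        return None
    i_min = min(range(start, len(trace)), key=trace.__getitem__)
    minimum = trace[i_min]
    # End of dilation: first sample back above the onset level, else last sample.
    end = next((i for i in range(i_min + 1, len(trace))
                if trace[i] > onset_level), len(trace) - 1)
    # T75: recovery of 75% of the constriction amplitude (assumed 100 - minimum).
    target = minimum + 0.75 * (100.0 - minimum)
    i_t75 = next((i for i in range(i_min, len(trace))
                  if trace[i] >= target), None)
    steps = [abs(trace[i + 1] - trace[i]) for i in range(start, i_min)]
    return {
        "pupil_minimum": minimum,
        "latency_s": (start - flash_frame) / fps,
        "max_constriction_velocity": max(steps) * fps if steps else 0.0,
        "mean_constriction_velocity":
            (trace[start] - minimum) * fps / (i_min - start) if i_min > start else 0.0,
        "mean_dilation_velocity":
            (trace[end] - minimum) * fps / (end - i_min) if end > i_min else 0.0,
        "t75_s": None if i_t75 is None else (i_t75 - i_min) / fps,
    }
```

With a 30-fps recording and the flash starting 2 s into the video, `flash_frame` would be 60; at this frame rate the latency is resolved in steps of about 33 ms.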

System integration
The resulting pupillometry system has been designed as a three-tier application. Presentation layer: the Android app will be used for acquiring the video samples and firing the flash. Logic layer: a connection with the server code performing the analyses will be developed, with the app acting as client; in the case of high-performance devices, both the client and the server software will run on the mobile device. Data layer: our system will be linked to the database and web application described in [15][16][17] through RESTful dialog and dedicated APIs.

App design
The app was developed in Android Studio, using Java for the implementation of functions and XML for the design of the user interface, targeting Android-based smartphones with an API level of at least 21 (i.e., Android 5.0 Lollipop), because of the use of the "camera2" package. This choice allows 94.1% of Android users to use our app, as only 5.9% of Android-based smartphones worldwide have an API level lower than 21 (similar trends can be found in Africa), according to the Android platform/API version distribution reported in Android Studio [49].
The logo, representing an eye-shaped logarithmic spiral, was hand-drawn and digitized using GIMP (GNU Image Manipulation Program).

Technical validation

Video acquisition procedure and image processing
All the frames of one of the acquired videos were analyzed both with the Matlab algorithm and with manual measurements.
In particular, the frames were analyzed manually by two independent authors who were blinded to the output of the pupillometer, in order to reduce the risk of bias: for each frame the diameters of the pupil and of the iris were measured twice and averaged in order to reduce the measurement error. Also in this case, the values related to the frames preceding the flash were identified and substituted with 100% values. Subsequently, Pearson's r and the associated p-value, the root mean square error (RMSE) and the MAE were calculated for the Gamma-fitted signal and the raw automated signal, compared to the manual measurements. Moreover, the error rate was estimated by calculating the percent error and counting how frequently it exceeded a 10% threshold.
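The agreement measures used in this validation can be sketched in plain Python as follows. The sketch computes the MAE, the RMSE, Pearson's r and the share of frames whose percent error exceeds 10%; the p-value is not computed, the percent error is assumed to be relative to the manual measurement, and the function and key names are hypothetical.

```python
from math import sqrt


def agreement_stats(automated, manual, tolerance_pct=10.0):
    """Compare an automated trace with manual measurements, frame by frame."""
    n = len(automated)
    errors = [a - m for a, m in zip(automated, manual)]
    mae = sum(abs(e) for e in errors) / n
    rmse = sqrt(sum(e * e for e in errors) / n)
    mean_a, mean_m = sum(automated) / n, sum(manual) / n
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(automated, manual))
    var_a = sum((a - mean_a) ** 2 for a in automated)
    var_m = sum((m - mean_m) ** 2 for m in manual)
    pearson_r = cov / sqrt(var_a * var_m)
    # Percent error relative to the manual value (assumed reference).
    pct_errors = [100.0 * abs(e) / abs(m) for e, m in zip(errors, manual)]
    error_rate = 100.0 * sum(p > tolerance_pct for p in pct_errors) / n
    return {"mae": mae, "rmse": rmse, "pearson_r": pearson_r,
            "error_rate_pct": error_rate}
```

Since the traces are normalized ratios expressed in percent, the MAE and RMSE returned here are in percentage points, matching the units in which the errors are reported.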

Benchmarking
Our pupil tracking algorithm was also validated against the output of an IR pupillometer (DP-2000 - NeurOptics). The gamma fit in this case was not needed because the flash does not interfere with the IR recording. The technical validation was based on the output variables specified above. In particular, the variables output by the IR pupillometer were normalized with respect to the initially measured pupil size in order to make them comparable with those resulting from our algorithm.
The RMSE and MAE were calculated for each variable and for both algorithms, comparing them with those coming from the manual measurements, taken by two independent and blinded authors. Subsequently, 3 Bland-Altman plots were generated for each of the 4 variables, after testing whether their residuals were normally distributed with a Shapiro-Wilk test [50] (normality being a necessary condition for such plots). The Bland-Altman plots compared the app algorithm and the IR algorithm, the app algorithm and the manual measurements, and the IR algorithm and the manual measurements.
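For reference, the quantities summarized by a Bland-Altman plot, i.e., the bias and the 95% limits of agreement, can be computed as in this Python sketch. It uses the conventional bias ± 1.96 standard deviations of the paired differences; the Shapiro-Wilk normality check is not reproduced, and the function name is hypothetical.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for two paired measurement sets."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread
```

Narrower limits of agreement indicate that the two methods can be used more interchangeably for the variable under test.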

Testing the safety of the flash
An experiment was set up to test the safety of use of a smartphone flash on the human eye, although the safety of the procedure has been confirmed by preliminary research [45]. The smartphone was placed on a tripod and an operator held the sensor of a luxometer (Dr. Meter, LX1010B) in front of the camera at a distance of interest for pupillometry, i.e., 8.5 cm. Firstly, the illuminance with the flash off (i.e., ambient light) was recorded; secondly, the flash was turned on and the illuminance in this condition was recorded. Hence, the illuminance of the flash alone could be calculated as the difference between the two readings and compared against the ISO standards [51] after a conversion to W/cm².

Ethnography-driven user-need and contextual analysis in LMICs
The contextual analysis highlighted that SSA countries have [41,43]: extremely limited resources; an insufficient number of healthcare professionals and of specialized doctors; inadequate hospital infrastructures; a highly unstable mains power supply; poor transport infrastructure and supply chain; and an uneven distribution of resources, which are concentrated in the capital to the detriment of remote areas. Nonetheless, SSA can count on a very young population, a wide diffusion of mobile phones and smartphones [52], and ICT literacy. There is a wide diffusion of one dominant smartphone operating system (i.e., 86.39% of smartphones are based on Android) [53] and a good coverage of wireless telecommunication. Prospectively, the SSA market of medical devices is fast growing (the compound annual growth rate is around 6%) [54]. The adoption of new technologies meets limited inertia and healthcare operators are resilient. In fact, working in challenging conditions pushes workers to deal with unpredictable conditions and events, developing a great capability to react to, respond to and recover from emergencies. Nonetheless, this positive attitude comes with evident risks too. Often, non-specialized personnel respond to the malfunctioning of medical devices with creative shortcuts, which tend to become chronic solutions, prone to new risks, hindering the recovery of the initial level of effectiveness and safety [55]. Finally, a massive "brain drain" affects doctors and specialized doctors, who move to other countries for better opportunities, further depleting SSA healthcare systems [56].
The results of the contextual analysis have been discussed with African scholars and healthcare personnel in Benin, Ethiopia and South Africa, resulting in a series of specifications for the local manufacturing of a resilient pupillometer, with its consumables and spare parts. The design should be low-cost and based on free design and manufacturing processes; it should empower non-specialized healthcare personnel by providing clear guidance or affordances; it should possibly be battery-based and resilient to the unstable power supply, resilient to misuse, maintenance-free and easy to clean; and it should be based on Android smartphones, possibly compatible with the high degrees of ingress protection (e.g., IP68) described in IEC 60529 and with rugged and military standards (e.g., MIL-STD-810G).
None of the pupillometers reviewed proved sufficiently resilient for LMICs. Existing smartphone solutions meet the cost requirement, but, as emerged from our study, this is not the only criterion for being resilient in LMICs. For example, most of the proposed solutions rely heavily on accessories and spare parts, including external LEDs, filters, and lenses, which will shorten the lifetime of the device in SSA. In fact, such parts would be difficult to retrieve, repair or replace in LRSs [41,43].
Moreover, a closer examination of the design principles of a pupillometer revealed two technical requirements: a computational load compatible with an old Android smartphone; and the use of no accessories, or only of accessories that could be locally manufactured (e.g., 3D printed).
This last criterion particularly influenced the design of the app. The majority of existing pupillometers utilize visible light to stimulate the pupil and infrared (IR) cameras to film its constriction, in order to avoid artefacts. Most smartphones do not contain IR cameras, therefore visible light was used both to stimulate the pupil, using the phone flash, and to film its reaction with the phone camera. As a consequence, the video frames coinciding with the flash were overexposed due to the sudden change of luminosity and the proximity of the subject, requiring the adoption of a fitting algorithm to recover the missing pupil diameter in those frames. Moreover, phone camera framerates are lower than those of many pupillometers. Thus, the proposed algorithm fitted the acquired diameter data first with a linear fitting, in order to recover missing data due to the flash, and then with a Gamma distribution for approximating missing frames, reconstructing the complete response of the pupil, as proposed in [48]. The interpolation also reduced the blinking artifacts, which also affect standard pupillometry. Moreover, variations in the distance between the eye and the device created artifacts in the estimation of the pupil diameter. These artifacts could be limited with a recycled-plastic 3D-printed accessory clipped onto the mobile phone, aimed at keeping the eye-to-phone distance constant. However, since a 3D printer might not be available, the proposed algorithm for the recognition of the pupil reflex was based on the ratio between the diameter of the pupil and that of the iris. In fact, while the pupil diameter reacts to light, the iris diameter does not. The ratio was normalized to the value measured before the flash to facilitate the reading of the pupil diameter. The adoption of these features required a specific technical validation of the final algorithm and app.

Development of the smartphone-based pupillometer
A total of 4 videos were recorded, in which the eye was stimulated 3 times in order to be sure to capture a good-quality response (i.e., absence or reduced number of blinks).

Preprocessing and image processing
Three methods were tested for the pupil and iris detection: a blob-detection algorithm, the circular Hough transform, and the watershed transform. The blob-detection algorithm outperformed the other methods, with a lower MAE (3.9% versus 4.55% for the Hough transform and 21.25% for the watershed transform) and a higher correlation (Pearson's r of 0.95 and p-value < 0.00001, versus 0.84 and p-value < 0.00001 for the Hough transform, and −0.03 and p-value of 0.83 for the watershed transform), and was therefore selected for the pupil tracking (Fig. 2, III). This choice also avoided the introduction of an extra user input, i.e., the pupil radius range, which is necessary for the Hough transform to work. The Hough transform outperformed the other methods in tracking the iris. Consequently, the Hough transform was used for the iris tracking (see Fig. 3, IV). Fig. 4 shows the comparison of three signals, namely the gamma-fitted ratio, the algorithm, and the manual measurements. The removal of the flash and blink artifacts via post-processing is evident in Fig. 4.

App design.
The app, named Oida (meaning "I have seen" and "I know", from Ancient Greek "ὁράω"), for tracking the photopupillary reflex is being finalized. As of now, the app comprises a Main Activity, an Instructions Activity and a Camera Activity. It is available in two languages: English and French, both widespread languages in SSA.

Video acquisition and image processing
During the manual validation (see Fig. 4), the Gamma-fitted ratio was highly and significantly correlated with the manual measurement (Pearson's r = 0.963 and p-value < 0.0001, versus Pearson's r = 0.982 and p-value < 0.0001 for the raw automated signal (app)), with an RMSE and an MAE of 3.20% and 2.24%, respectively (versus 3.96% and 3.09% for the raw signal). Moreover, the error rate for the Gamma-fitted ratio was 7.14%.

Benchmarking
Ten videos acquired by clinical ophthalmologists with the IR pupillometer on healthy subjects were analyzed with the IR pupillometer software and with the app algorithms in Matlab. Fig. 5 shows the pupil reaction over the frames, as captured by the three different methods. The resulting measures were compared with those calculated by hand after manually annotating the diameter of the pupil in each video frame. The MAE and RMSE demonstrated a significant improvement in comparison with the software provided with the commercial device, for all the variables, as reported in Table 1. The agreement among the measurement methods, namely app algorithms/IR method, app algorithms/manual method and IR method/manual method, was estimated with Bland-Altman plots [57,58] (plots are not reported for brevity, but are available upon request), following the Shapiro-Wilk test for normality [50].
All the differences input to the Bland-Altman plots were normally distributed, with 0.842 < W < 1 [59] at a 95% confidence level. Table 2 reports the 95% limits of agreement for each variable (the narrower the better). The agreement between the app and the manual method was better than that of the other pairs of methods.
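As a reminder of how the 95% limits of agreement of a Bland-Altman analysis are obtained, the following minimal sketch computes the bias (mean of the paired differences) and the limits as bias ± 1.96 standard deviations of the differences. This is the standard formulation under the normality assumption, not a reproduction of the scripts used in this study. The sample values are hypothetical, and the Shapiro-Wilk normality check is not included.

```python
import statistics


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    # Sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired measurements from two methods, for illustration only.
app_values = [1.0, 2.0, 3.0, 4.0]
manual_values = [0.0, 2.0, 2.0, 4.0]

bias, lower, upper = bland_altman_limits(app_values, manual_values)
print(bias, lower, upper)
```

The width of the interval, upper minus lower, is the quantity to compare across pairs of methods: a narrower interval indicates closer agreement.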

Testing the safety of the flash
The base illuminance (i.e., that of the ambient light) was measured at 200 lx; since the illuminance in the "flash on" state was 680 lx, the illuminance of the flash alone was 480 lx. The comparison of this value to the above-mentioned ISO standards ensured the safety of the procedure. In fact, 480 lx converts to 7.03·10⁻⁵ W/cm² under the hypothesis of an average wavelength of 555 nm (the ISO standards set the maximum allowed value to 0.706 W/cm²).
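The conversion above can be checked with a few lines of arithmetic. The sketch below assumes, as in the text, monochromatic light at 555 nm, for which the luminous efficacy is 683 lm/W by definition of the candela. The function and constant names are illustrative.

```python
LUMINOUS_EFFICACY_555NM = 683.0  # lm/W, valid for monochromatic light at 555 nm
ISO_LIMIT_W_CM2 = 0.706          # maximum allowed value quoted in the text


def illuminance_to_irradiance_w_cm2(lux, efficacy=LUMINOUS_EFFICACY_555NM):
    """Convert illuminance (lx = lm/m^2) to irradiance in W/cm^2."""
    watts_per_m2 = lux / efficacy
    return watts_per_m2 / 1e4  # 1 m^2 = 10^4 cm^2


flash_on_lux = 680.0
ambient_lux = 200.0
flash_only_lux = flash_on_lux - ambient_lux  # 480 lx

irradiance = illuminance_to_irradiance_w_cm2(flash_only_lux)
print(f"{irradiance:.3g} W/cm^2, below limit: {irradiance < ISO_LIMIT_W_CM2}")
```

For a broadband source such as a white LED flash, the effective luminous efficacy is lower than 683 lm/W, so the 555 nm hypothesis gives only an approximate figure; the margin with respect to the quoted limit is, however, about four orders of magnitude.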

Discussion
This paper presented the design and technical validation of an app for the measurement of the pupillary reflex, intended to be used in LRSs. Given the absence of specific regulations or clear guidelines for the design of medical devices for LRSs, we adopted the prescriptions of the European regulations on medical devices. The first part of this paper illustrated how local needs and contextual analyses can be performed by enriching engineering design with ethnographic methods. The second part presented and discussed the technical validation of the software, which was performed in two steps: validation of the acquisition, and benchmarking of our app against a commercial IR pupillometer, assuming as gold standard the frame-to-frame manual annotation of pupillary video recordings from 10 subjects. The very low errors and high correlation resulting from the former validation confirmed that a smartphone-based pupillometry acquisition without accessories was viable. This concept was corroborated by the low errors and narrow limits of agreement for the variables resulting from the second validation. The latter showed that the proposed solution, despite being based on a simple app and a smartphone in order to be sustainable in resource-scarce settings, is able to perform just as well as, and often better than, the benchmark. These results were possible due to the interpolation algorithm and the normalization of the pupil diameter by the iris diameter, which minimised artefacts due to hand motions and to the use of visible light for pupil stimulation (via the mobile phone flash) and video acquisition. In fact, commercial pupillometers use IR for image acquisition, which is not available in the majority of smartphones. Indeed, the app described achieved better results than the commercial IR medical device.
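The benefit of normalizing the pupil diameter by the iris diameter can be illustrated with a minimal sketch. It assumes that a change in the phone-to-eye distance scales both diameters in the image by the same factor, so that their ratio is unaffected. The eye dimensions and scale factors below are hypothetical.

```python
def pupil_iris_ratio(pupil_diameter_px, iris_diameter_px):
    """Dimensionless pupil size, normalized by the iris diameter in the same frame."""
    if iris_diameter_px <= 0:
        raise ValueError("iris diameter must be positive")
    return pupil_diameter_px / iris_diameter_px


# Hypothetical eye: constant pupil and iris size, observed at three distances.
true_pupil_mm, true_iris_mm = 4.0, 12.0
pixels_per_mm = [10.0, 11.5, 9.2]  # changes as the hand-held phone moves

raw_pupil_px = [true_pupil_mm * s for s in pixels_per_mm]
ratios = [pupil_iris_ratio(true_pupil_mm * s, true_iris_mm * s) for s in pixels_per_mm]

print(raw_pupil_px)  # varies with distance although the pupil did not change
print(ratios)        # stays constant
```

The raw pixel diameter would wrongly suggest a pupil reaction whenever the hand moves, whereas the ratio only changes when the pupil itself does, since the iris diameter is essentially constant for a given eye.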
Moreover, the comparison with existing literature suggested that the proposed solution is the only one designed for LMICs and rigorously validated. In 2013, Tae-hoon Kim et al. [18] proposed a smartphone-based pupillometer that works with an Android app and an add-on device, which contains two types of LEDs and an IR filter. Their results showed that their system could be a good candidate for pupillometry; however, it had not been validated against a CE-marked or FDA-cleared commercial pupillometer. Moreover, the required accessories would make its use in LMICs inconvenient. In 2017, Mariakakis et al. [19] proposed an iPhone-based pupillometer that works with a box similar to the one used for virtual reality headsets and makes use of convolutional neural networks. The box was used to eliminate ambient light and control the distance to the person's face. Nonetheless, the authors themselves acknowledged that such a box could be a hindrance when measuring the pupil light reaction of an unconscious patient and when tracking the whole reaction to the flash (i.e., the dilation phase cannot be captured because of the lack of lighting). Their design, in fact, only allowed assessing the pupil constriction phase and seems to require a server connection in order to work. In 2018, McAnany et al. [20] performed a study showing that the iPhone camera could be used for this purpose, comparing it with an IR camera, which was not medical rated. In 2019, the start-up Brightlamp introduced an iPhone app for tracking the photopupillary reflex based on trained object detectors and requiring no accessory. This app was manually validated, similarly to part two of our validation, with no benchmarking, resulting in a higher MAE (2.9%) and wider limits of agreement for the pupil constriction (±14%, which improved to ±9% after bias correction). However, a recent study by McKay et al. [60] benchmarked Brightlamp against a portable IR pupillometer, demonstrating that this particular iPhone app has poor repeatability and is not a practical tool for supporting clinical decisions. Nonetheless, in general, iPhone-based pupillometry relying on the Hough transform was shown to be possible and accurate enough by Neice et al. [61].
Moreover, the use of iPhones in SSA is quite uncommon due to their cost and because there is no rugged iPhone model. In 2019, Vigário et al. [21] proposed a system for the continuous monitoring of the pupil using a smartphone, the Virtoba support for mobile phones and two LEDs. However, the system does not provide the typical pupillometry stimulus (i.e., flash in the eye) and its validation was limited to physiological data found in literature (i.e., the reaction of the pupils to a cold stress test). None of the designs above were conceived for LRSs. As emerged from our contextual analysis, in fact, basing part of the design on add-on parts can turn out to be counterproductive either in a possible early health technology assessment phase or when already on the market. Add-on devices increase the need for spare parts, which will probably not be available in LRSs. For this reason, the authors of this paper suggest that the "less is more" philosophy should be adopted when considering additional parts for a device conceived for these settings. Although the study was focused on pupillometry, its findings on the design can be relevant for other applications. For instance, it emerged clearly that affordability is not the only criterion for a device to be suitable for LRSs. Many other issues should be considered during the design, including affordance, ease of deployment and use, resilience to underlying infrastructures that may be unstable, availability of spare parts and consumables, and available underlying technologies.
Another issue that emerged is the tendency to release apps with healthcare ambitions without proper technical validation (e.g., manual validation and/or benchmarking for apps). In recent years, both the FDA and the European Commission equated medical software (including apps) to medical devices, making validation essential to guarantee safety and adequate performance. For this reason, in this paper, we decided to adopt the European perspective for CE marking medical apps.

Limitations
This study presents the preliminary results of the design and technical validation of a smartphone app. The results are valid for, and limited to, one Android smartphone model; further testing could include more models. In these further tests, the flashes of the different smartphone models should be checked for safety against the relevant standards.
The current design relies on a server connection, which may be a bottleneck, although many remote areas of LRSs (e.g., Africa) are currently served by good-quality mobile phone services. To overcome this limitation, future versions of the app will also include the processing algorithms. In this regard, artificial intelligence may also be explored. While this solution may be difficult to run on very old smartphones, it should run smoothly on the other models.
Furthermore, a possible bias in the feasibility study might have been introduced because the opposite eye was not covered and could have been partially stimulated by the changes in the ambient light. However, the ambient light was measured and maintained as constant as possible throughout the experiment. In this regard, healthcare workers will need to be instructed to cover the opposite eye in order to avoid bias in the pupillary reactions.
Moreover, as of now, the app does not give any result in millimetres; future versions may include this feature only for the pupil size, as it would be redundant for the pupil/iris ratio.
Finally, the performance of the app has so far been evaluated only on light brown eyes. Darker shades should be investigated, as they may be more challenging for pupillometers relying on visible light only. Future experiments could test the application on subjects with three types of iris colour (i.e., fair, medium, and dark). In this way, the efficiency of our application on different iris colours could be evaluated. This could also inform future upgrades of the app software to make it more efficient.

Conclusions
This paper presented the design and technical validation of a mobile app for smartphone-based pupillometry, suitable for use in LMICs. The performance of the app algorithm is promising: being able to compete with the algorithm of a commercial IR pupillometer medical device, it suggests furthering the study with more smartphone models and transitioning towards a dedicated server application and/or a completely standalone app. The performance of the algorithms of the app, as confirmed by the technical validation, is sound: the proposed solution, by exploiting the pervasive presence of smartphones in LMICs and by not requiring expensive settings or complex procedures, represents a significant improvement towards an extensive screening of eye pathologies and brain trauma worldwide.

Ethics approval and consent to participate
The ethnographic analysis was conducted in accordance with the ethical approval REGO-2018-2283, obtained from the Biomedical and Scientific Research Ethics Committee.
The videos used for the technical validation were acquired from 11 healthy subjects in accordance with the ethical approval obtained from the Ethical Committee of the University of Campania Luigi Vanvitelli (Registration number 500, approval title "Studio pilota sull'utilizzo della pupillometria cromatica per la diagnosi e il monitoraggio delle degenerazioni retiniche ereditarie").