Remote videolink observation of model home sampling and home testing devices to simplify usability studies for point-of-care diagnostics

Both home sample collection and home testing using rapid point-of-care diagnostic devices can offer benefits over attending a clinic/hospital to be tested by a healthcare professional. Usability is critical to ensure that in-home sampling or testing by untrained users does not compromise analytical performance. Usability studies can be laborious and rely on participants attending a research location or a researcher visiting homes; neither has been appropriate during COVID-19 outbreak control restrictions. We therefore developed a remote research usability methodology using videolink observation of home users. This avoids infection risks from home visits and ensures the participant follows the test protocol in their home environment. In this feasibility study, volunteers were provided with models of home blood testing and home blood sampling kits including a model lancet, sampling devices for dried blood spot collection, and a model lateral flow device. After refining the study protocol through an initial pilot (n = 7), we compared instructions provided as written instructions alone (n = 5), written plus video instructions (n = 5), or written and video instructions plus videolink supervision by the researcher (n = 5). All users were observed via video call to define which test elements could be assessed remotely. All 22 participants in the study accessed of point-of-care testing methods in the home setting.
This article is included in the Coronavirus (COVID-19) collection.
I have read the study by Sarah Needs et al. entitled “Remote videolink observation of model home sampling and home testing devices to simplify usability studies for point-of-care diagnostics”. The authors developed remote videolink observation of POCT home users for an IgM and IgG dengue virus diagnostic tool. This approach was designed to avoid infection risks while checking whether or not the participants follow the test protocol at home.
This study is interesting and of practical value since it provides useful information for implementing the procedure in the diagnosis of diseases or the periodic monitoring of other pathological conditions during a pandemic such as the one we are now facing. This methodology, however, could be of help even after the COVID-19 pandemic since it could significantly improve the relationship between doctors and patients in rural areas or when patients have difficulty travelling to clinical centres. Nevertheless, I have some points to bring to the attention of the authors.
○ The paper is too long and should be significantly reduced. It should be written to allow readers to catch the key points of the proposed procedure.
○ In the Abstract results are not described. They should, instead, be outlined.
○ A detailed statistical analysis should be presented. Results should be expressed as median and range in the text, not only in the graphs. In the graphs, there are some regression lines but R² along with both intercept and slope are lacking. Results are striking so the p value could be omitted. Although the analysis refers to very small groups, a comparison among them with ANOVA (Analysis of Variance) should be performed. A non-parametric ANOVA (Kruskal-Wallis test) is to be preferred to counterbalance a possible non-normal distribution of the data.
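As an illustration of the non-parametric comparison suggested above, the following is a minimal Python sketch of the tie-corrected Kruskal-Wallis H statistic for three groups. The volumes are invented placeholders, not data from the study, and the function names are hypothetical. In practice a library routine such as SciPy's `scipy.stats.kruskal` would normally be used; the p value here relies on the chi-square approximation, which is only rough for groups as small as n = 5.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, df, p) for a list of groups, with tie correction.

    p is only computed for three groups (df == 2), where the chi-square
    survival function has the closed form exp(-H / 2); otherwise p is None.
    Raises ZeroDivisionError if every pooled value is identical.
    """
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    h = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h /= 1 - ties / (n_total ** 3 - n_total)
    df = len(groups) - 1
    p = math.exp(-h / 2) if df == 2 else None
    return h, df, p


# Hypothetical dispensed volumes (µL) for the three instruction groups.
written_only = [31, 44, 38, 22, 47]
written_video = [36, 41, 39, 35, 43]
supervised = [40, 42, 37, 45, 34]

h_stat, dof, p_value = kruskal_wallis([written_only, written_video, supervised])
print(f"H = {h_stat:.2f}, df = {dof}, p = {p_value:.3f}")
```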


Introduction
Point of care diagnostics and differences with home testing and sampling
Point of care (POC) tests are most commonly operated by trained users in a healthcare environment (for example screening tests for blood cholesterol or glucose); however, POC technology has made self-testing in the home feasible. Whilst pregnancy testing and blood glucose monitoring for diabetics remain the most common home self-tests, home use devices are also available for other conditions including infectious disease screening such as HIV (Peck et al., 2014; Wei et al., 2018). Kits are also available for home sampling for testing indicators of diabetes, cholesterol, urinary tract infections, and chlamydia, with results being determined by diagnostic laboratory analysis after mailing the sample (Shih et al., 2011). Some patients use home testing devices routinely as part of their self-care, such as blood glucose monitoring for diabetics.
Home testing and sampling devices offer a range of benefits in medicine, including the ability to inform treatment decisions, overcoming reluctance to undertake testing (e.g. in sexually transmitted disease diagnostics), removing the barrier of the patient having to attend a diagnostic testing centre, convenience, and the potential for expanding testing with reduced infrastructure requirements, the latter being particularly important in low-resource settings (Garcia et al., 2015). The value of self-testing is not straightforward, however, and there are disadvantages as well as advantages (den Oudendammer & Broerse, 2019).

Accuracy of POC testing and home testing and sampling
The relative accuracy of POC testing vs laboratory alternatives has three major components. Firstly, the analytical performance (i.e. accuracy in making biological measurements) is precisely defined in a controlled testing environment and measured in known samples. Secondly, the clinical performance (i.e. ability to make clinically important decisions) is determined using carefully selected positive and negative populations and is, as with analytical performance, usually well defined, typically by the manufacturer, at least for a defined population. Thirdly, there is the test performance in real-world use. Real-world accuracy can often fall significantly short of that expected based on technical performance alone. For example, human error can affect the test outcomes (Figueroa et al., 2018; Rennie et al., 2007), and clinical performance in larger and more diverse populations may also reduce the overall accuracy of the diagnostic test (when compared to the more defined population used for formal clinical performance evaluation).
In settings where end-users are untrained or semi-trained, the potential for performance to vary due to changes in operation can increase (compared to tests operated by trained users), but for both untrained and trained users, significant errors may arise from variations in following the test instructions. Usability tests of malaria rapid diagnostic tests (RDT) have found that users make errors in positioning the sampling devices for the test, carrying out the steps in the right order, following the test times correctly, and interpreting the results (Seidahmed et al., 2008). Users may also fail to refer to the instructions altogether (Weinhold et al., 2018). These problems can be heightened by the complexity of multi-step test kits and result in errors which may affect the test result and consequent decision making.
Medical device manufacturers are required to meet safety and performance requirements to gain regulatory approval and CE marking prior to product sale. This includes usability of the device by the intended end-user, taking into account the level of training of the intended end-user, the use environment and the user-device interface, which includes any instructions or instructional video provided. During development, the medical device needs to have testable elements that relate to the usability of the device, including the usability of its frequently used features, and products must meet usability standards (including EN 62366). In the case of lateral flow rapid test devices, use in the field has, for some tests, been found to be less accurate than use in a more controlled setting.

Usability methodology
To make use of home tests, the overall performance must be measured in a trial, and the clinical performance of the home test and home sampling procedure calculated by comparing test results with disease state for the trial populations. This clinical performance must be considered against a gold standard laboratory diagnostic sampled by a healthcare professional. In some cases, a lower accuracy can be balanced against the benefits of home testing. But this overall clinical performance for home testing should be further segmented to understand two distinct elements: the analytical performance of the home testing/sampling method, which relates to its technical properties and can be evaluated in a laboratory; and usability, which determines how accurately the method was executed by the home user and can be evaluated by observing home users execute the task using the instructions provided. The latter can influence the former if analytical performance is compromised by errors or incorrect interpretation of instructions for use or test results.
In the execution of diagnostic tests, participant observations have been used to identify recurrent user errors, where the researcher records user errors using a checklist (Seidahmed et al., 2008; Wei et al., 2018). Patterns in these data can signal critical parts of the procedure that are particularly difficult to carry out, and where more or better information is required to help the user avoid common mistakes. For example, in a study investigating HIV finger prick self-tests in China, Wei et al. (2018) found that most errors were during the collection of the specimen (drawing and collecting the blood). Another study investigating malaria RDT use found errors during preparation of the test and interpretation of the results (Seidahmed et al., 2008). A mixed-methods approach combining performance observation with open-ended user input can provide further insight. In a study comparing the usability of different HIV self-test prototypes, video observation of participants and monitoring of performance were combined with an exit questionnaire and qualitative interviews. Qualitative interviews pointed to specific parts of the instructions which were not clear to participants and revealed that some participants had not understood how to use the home sampling devices (Peck et al., 2014). Exit questionnaires have been used to measure perceptions of acceptability of a self-test and ease of use (Smith et al., 2016). This approach is especially useful when data are being collected to guide the design or redesign of a particular set of instructions, as it can help to locate missing or misleading content, problems of style in the writing of instructions and problems in searching for and finding information (Atlas, 1981).
Various studies have looked at the use of different representational approaches in instructions for untrained users. To make use of current access to technology and considering the effect that multimedia can have on learning and performance (Mayer et al., 2020), different avenues have been explored for instructions for POCT. For example, a usability study for an autoinjector compared four instruction formats: a quick guide, a quick guide plus a dummy training device, a quick guide plus a patient leaflet, and a quick guide plus a video (Allaert et al., 2018). The study found that, although the format of instruction did not affect performance to compromise the results of the test, participants who had used a leaflet and a video made fewer minor errors. Another study looking at malaria rapid diagnostic tests compared user performance with manufacturer's instructions, simplified pictorial instructions, and a training session plus the pictorial instructions (Harvey et al., 2008). It found that the instruction format had a significant effect on how accurately participants were able to administer the test. Given the available evidence, we wanted to explore whether it was possible to compare different instruction formats using videolink observation and assessment of home testing and sampling practices. For the purposes of this feasibility study, we used manufacturers' existing instructions.

Remote consultation and research methodology
During the COVID-19 pandemic, infection control measures including social distancing and 'stay-at-home' orders needed to address national outbreaks have made it challenging to assess the usability of novel devices or sampling protocols, or to adapt devices previously used only in clinical settings. At the same time, diagnostic services have been overwhelmed by the massive expansion of COVID-19 testing alongside closure of many clinical settings to restrict contact, especially between vulnerable groups and potentially contagious patients. This has reinforced the need for home testing and home sampling, which can reduce the infection hazard not only for COVID-19 tests but also for other vital diagnostic tests (Torjesen, 2020). However, inviting volunteers to attend a usability study in such a scenario is equally challenging, creating a barrier to innovation. The infection control challenge is greatest for some of the people in most need of novel testing methods (e.g. care workers). Yet it is critical that usability of home testing or home sampling is optimised for those in most need. We focus here on developing research methodology for the remote assessment of usability of testing procedures and devices within volunteers' homes.
During COVID-19-related restrictions, videolink procedures are being used more frequently for remote medical consultations (Greenhalgh et al., 2020b). Visual cues provide valuable information about the patient's condition and indicators for diagnosis (Greenhalgh et al., 2020a). This is also true for usability studies (Shah & Gupta, 2017; van der Weegen et al., 2014). Although exit questionnaires can be used to collect user perceptions, directly observing participants executing the tasks is pivotal in usability testing, and using only questionnaires reduces the amount of information collected. Because face-to-face research settings are not viable in the current pandemic, we wanted to understand what level of information could be collected using video calls.
This feasibility study investigated the use of remote assessment of POC rapid tests and sampling techniques by untrained users using model tests with existing instructions. Information was collected about participants' handling of home test kits, their capacity to return reliable images of sufficient quality and their use of instructions. The study included collection of both quantitative and qualitative measurements. Our model device designs are open source and published to allow anyone to 3D print these for more detailed study, or to modify and improve usability.

Overall study design and recruitment
This pilot was designed to establish whether remote observation was a suitable methodology for observing usability and for quantifying the accuracy of blood testing kits used in the home. The home testing packs used in this study included 3 components of home testing and sampling methods. These components were tested by observing lancet use, liquid handling for lateral flow rapid tests and liquid handling of blood sampling medical devices. Model lancets and rapid tests were 3D printed on a Prusa i3 MK3 using PLA filament. The open source designs for these models, in the form of OpenSCAD-2019.05 files and STL 3-dimensional geometry files, can be downloaded and used or edited (Extended data (Needs, 2020b)).
Participants were recruited via University of Reading email distribution lists from April 2020 to June 2020. Participants were included based on age (18-69) and locality to Reading for test kit delivery. Participants from high-risk groups were excluded. This included those who identified as being on the government list of high risk for COVID-19, had had an organ transplant, were receiving cancer treatment, had a severe lung condition, had a condition that made them more likely to get infections, had a weakened immune system, were pregnant, had a serious heart condition, or were over 70. Those who identified that they lived with anyone from a high-risk group were also excluded. Participants had to complete a video call using Microsoft Teams whilst they followed instructions in the testing pack and were observed by the researcher and, after completing the supervised task, send images taken on their smartphone camera to the researcher via email. Data were recorded during the video call using a template to add observations during the call (Needs, 2020a). By avoiding face-to-face contact between researcher and test user, the only infection hazard was distribution of model test kits. To minimise this risk, the test kits were prepared in a sterile filtered air cabinet and 3D printed parts were sanitised with 70% ethanol. After delivery, participants were instructed to leave the packs untouched for 48 h prior to opening. In this pilot, participants also completed a screening questionnaire so that anyone at greater risk of severe COVID-19 disease could be identified and excluded from this feasibility study. However, the study could be extended to a broader pool of participants as appropriate.

Blood sampling kits
In parallel to assessing the remote observation methodology, we explored whether model components could be designed to simulate home sampling and home testing devices, using computer aided design (CAD) and rapid prototyping (3D printing). 3D printed model lancets were used to test whether users could operate them correctly, with a simple arrow indicating correct orientation allowing assessment of ease and consistency of following instructions (Figure 1).
Blood collection medical devices were also included; these are designed to deliver a fixed volume of blood from a fingerstick to a POC test or onto a filter for home sampling. Home blood sampling can use three methods: firstly, blood spots can be dried onto a filter and a fixed quantity tested in the lab using a punched disc (Tuaillon et al., 2010); secondly, blood can simply be collected into a small tube with products such as BD microvette, and the laboratory can process serum or plasma; thirdly, a fixed volume of blood can be dispensed onto a filter or into a tube, allowing the laboratory to elute the dried blood and process based on the volume dispensed. We chose the third method in order to explore whether home users could accurately dispense a fixed volume using these devices. Microsafe (40 µL) (Safe-tec LLC, USA supplied by Fitech, MS-40) and PTS Collect (40 µL) capillary blood collection tubes (Polymer Technology Systems Inc, USA supplied by Fitech, 10392) were used. Participants were given a tube containing 1 mL of simulated blood (2% PME red food colouring, 20% ethanol, 78% water). The test packs included a copy of the manufacturer's instructions for each collection device (Figure 2).
Instructions for the rapid lateral flow test were adapted from those used in the SD BIOLINE Dengue Duo test product, to create a model lateral flow kit for evaluation. This rapid test was selected because it is a widely used lateral flow product that shows sufficient accuracy to be clinically useful in dengue fever diagnostics (Gan et al., 2014); however, this specific product is intended for healthcare professional operation rather than home use by an untrained user. The model lateral flow test kit has two wells marked IgG and IgM. Filter paper inserts were designed to permit simple remote measurement of the volume of simulated blood deposited by participants: the higher the volume, the further along the filter the red dye travelled. Participants were provided with a Nalgene 4 mL capacity dropper bottle filled with simulated blood (2% PME red food colouring, 20% ethanol and 78% water); these dropper bottles are routinely included in many lateral flow products to add buffer alongside sample. Participants were asked to follow the modified instructions in Figure 3, designed to represent the real test instructions and assess users' ability to deposit different volumes into distinct parts of the device, but without requiring real blood.
To determine if participants could accurately image a rapid test result to record the result, for example for public health records or remote analysis, participants were asked to place the device on a template alongside representative images of a negative and positive lateral flow test (Figure 4). These tests were deliberately selected as hard to interpret examples where the lateral flow test result was not clear-cut, by using blood with a significant level of haemolysis staining the test strip red. The participant was asked to photograph this with the digital camera on their own smartphone and return these images to the researcher. This acted as a baseline to identify if images sent to the researcher were reliably of sufficient quality to identify test results of a known test. This also captured the volume of simulated blood deposited by each user, as the distance travelled by the red dye was proportional to the volume deposited.
Participants were randomly allocated into three groups: group 1 had written instructions alone and were not supervised (n = 5); group 2 were provided with both written and video instructions and encouraged to view the instructional video prior to starting the test (n = 5); and group 3 were given written and video instructions and were supervised by the researcher (n = 5) (Video 1). When they had completed the tasks, participants were asked to complete a questionnaire. At the end of the call participants were required to photograph the study consent form, questionnaire, blood sampling task and rapid home test task and send the images to the researcher.
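The allocation step described above can be sketched in a few lines of Python. This is an illustrative sketch only: the text does not state how randomisation was performed, so the shuffle-and-split approach, the participant identifiers, the group labels and the seed are all assumptions.

```python
import random


def allocate(participant_ids, group_names, seed=None):
    """Shuffle participants and split them into equally sized groups."""
    if len(participant_ids) % len(group_names) != 0:
        raise ValueError("participants must divide evenly between groups")
    rng = random.Random(seed)      # seeded generator so the allocation is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(group_names)
    return {
        name: shuffled[i * size:(i + 1) * size]
        for i, name in enumerate(group_names)
    }


# Hypothetical identifiers for the 15 participants in the main study.
participants = [f"P{i:02d}" for i in range(1, 16)]
groups = allocate(
    participants,
    ["written only", "written + video", "written + video + supervision"],
    seed=2020,  # arbitrary seed, chosen only for this example
)
for name, members in groups.items():
    print(name, members)
```

Recording the seed alongside the allocation makes it possible to audit the group assignment later.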

Statistical analysis
GraphPad Prism 8.0 was used to generate graphs.

Results and discussion
This feasibility study aimed to explore the practicalities of remote observation of usability of home tests and home sampling in a real home setting during social distancing and movement restrictions imposed in response to the COVID-19 pandemic. We needed to establish that participants could connect reliably using video calling, and determine whether it was possible for the researcher to observe the participant following test instructions during use. At the same time, we wanted to check whether we could assess the actual volume of sample dispensed remotely, using images taken by participants. We wanted to see whether the methodology could identify specific errors in operation and in following instructions, and whether the use of 3D printed model tests allowed us to measure volumes dispensed. The feasibility study was not designed to determine specific error rates with specific home testing kits, nor to define differences in participants' ability to follow differing instruction formats, but to check whether such comparisons could be made using this methodology.

Suitability of 3D printed model kit for remote usability studies
The instructions and model kits were designed to mimic liquid handling techniques associated with home sampling and rapid tests, adapted to explore the accuracy of liquid dispensing by home users and identify use practices via videolink. Where appropriate, real fixed-volume blood sampling medical devices were used. However, model lateral flow tests and model lancets were designed and 3D printed. In order to establish usability, there was no need for users to actually take a fingerstick or blood sample, therefore model lancets were used to avoid hazardous sharps. A 3D printed model was used to evaluate instruction use and check if users could orient the model lancet correctly. Lateral flow devices were 3D printed firstly to allow use of a modified format that permitted quantitative measurement of the volume of simulated blood delivered by the user to the model lateral flow device. Secondly, they permit rapid iteration of device design in future to improve usability. The models still very closely mimic real devices (number of wells and device size), and we conclude this remote usability methodology would allow use of real home testing and home sampling devices to be assessed instead of models.

Implications for recruitment of videolink methodology
Recruitment of volunteers for a remote study requires consideration. Because volunteers must have access to both a device with internet connectivity and the ability to make video calls, plus a smartphone for imaging test devices, there will be a recruitment bias and individuals without access, alongside those who may be less confident of their ability to use Internet-based applications or smartphones will be excluded. This may affect the demographic representation of participants including age and socioeconomic status. This could be significant to the study of home testing procedures, if individuals who are less confident with new technology, such as smartphones, are also less confident in their use of diagnostic devices and home sampling processes. Individuals over the age of 69 and people vulnerable to severe COVID-19 disease were also specifically excluded from this feasibility pilot for their increased risk of infection, and so we were unable to explore participation rates in that age group.
However, balancing these disadvantages, there may be positive impacts on recruitment from the remote study methodology, allowing participation by volunteers who may not have participated if the study required travel to a research site. These could include individuals in remote locations, those with limited access to means of travel, those with childcare or caring responsibilities, or those with disabilities which restrict their ability to travel. During restrictions imposed by COVID-19 control measures, this may become especially important, as many people may be reluctant to attend, or advised not to attend, a research venue, for shielding reasons as well as travel restrictions. The number of enrolment forms returned was lower than the number of initial responses expressing interest in participation. Of 34 expressions of interest, 7 did not return forms and 5 returned forms but were not eligible to take part. A participation rate of 22 out of 34 initial volunteers nevertheless remains high, suggesting there is no inherent barrier with this methodology. For this feasibility pilot, volunteers were mostly recruited from an academic background (13/22 were University members, either staff or students), and so, having established the feasibility of the videolink observation method, it must be considered that response rates may differ when sampling the general population.

Feasibility of videolink methodology
The study aimed to identify whether participants were able to join a video call and return images of the test components. Participants were required to complete a Microsoft Teams video call with a researcher (30 minutes), fill in a questionnaire and send images of the tasks and questionnaire plus the consent form back to the researcher. Of the eligible participants who were selected and completed the screening questionnaire, all completed the video call (22/22) and all but one returned images of the task to the researcher within 48 h with no reminders (21/22). This suggests that the study format was not a major barrier to recruitment and participation levels were high after initial signup.
While all scheduled participants completed the video call, extra support was needed for some participants. In around 27% of cases (6/22), participants needed further instructions on how to join the video call at the start time of the scheduled call, and some participants (3/22) asked for further details on how to join prior to the scheduled meeting time. We therefore recommend including step-by-step instructions on how to use the video conferencing tool and providing a backup contact method for the participant to ask the researcher for this extra support.
An initial pilot study was conducted with 7 participants to investigate the videolink observation method and review the usability of our 3D printed devices and instructions. This pilot study was valuable and identified several flaws in the initial instructions that arose from a disparity between the written and visual instructions which led to errors in use. Following the pilot, a refined instruction set was prepared, and a full iteration of the study was conducted with a further 15 participants. Based on experience in the pilot, each task was individually packaged and numbered to help participants identify the correct materials for each task. We recommend running a small pilot to validate methodology and to identify any errors or inconsistencies around the instruction given to participants, prior to any larger study enrolment.

Returning high quality images to the researcher and interpreting rapid test results
Images returned to the researcher should be of sufficient quality to analyse. The quality of images can vary since it relies on the participants' smartphones, which differ in specifications. Sending images, for example of completed tests, to healthcare professionals for analysis has become an important tool for remote testing and telemedicine to confirm result interpretation (Wong & Dunn, 2018). One important purpose of our feasibility study was therefore to determine whether home users were able to reliably return images of tests for interpretation or analysis electronically, using their own smartphone camera. All participants successfully returned analysable images of the blood sampling test and model rapid test (Figure 5a). Alongside returning images of tests, the study checked whether participants could correctly identify rapid test results using a standard lateral flow interpretation guide. When asked to identify the results of a real lateral flow test device which used whole blood (Figure 4), around 9.1% of the participants incorrectly identified the negative NS1 dengue test as invalid and around 13% incorrectly identified the positive test as invalid (Figure 5b).

Quantitative assessment of liquid handling using remote testing is possible
To analyse specific rapid test components, the study included a simple filter-paper based device designed to mimic rapid tests or blood sampling techniques. This method tested whether the images returned could be quantified. We initially established in the laboratory that the area (for a fixed volume of simulated blood dispensed onto a filter square) or height (for drops of simulated blood added to the simulated lateral flow device) that the red dye travelled into the filter was directly proportional to the liquid volume dispensed (Figure 5c-d). The blood collection devices selected for evaluation are designed to transfer a fixed volume of blood sample into a point-of-care test device; these are commercially available medical devices used, for example, for blood cholesterol measurements. Using the images sent to the researcher, the volume dispensed for each test was calculated (Figure 5e-f). In this study, we found the blood sampling volume dispensed by study participants ranged from 3-56 µL and 15-48 µL for the Microsafe and PTS Collect tubes respectively. The volume dispensed by the user was likewise calculated for the rapid test. The IgG and IgM tests had dispensed volumes of 45-116 µL and 49-187 µL respectively; instructions required delivery of 3 drops and 5 drops respectively into these two test channels, which in laboratory conditions corresponded to 101 and 173 µL respectively. This is a large range if a downstream assay requires a precise blood volume to function correctly, and it therefore warrants further investigation to determine whether this variation in volume dispensed by the home kit user might affect accuracy.
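The conversion from an image measurement to a dispensed volume can be sketched as a linear calibration. The Python sketch below is an assumption about how such a calculation might be done, not the analysis used in the study: it fits a straight line to laboratory calibration points by ordinary least squares, reports slope, intercept and R², and inverts the line to estimate volume from a measured dye height. The calibration values and function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


def estimate_volume(measurement, slope, intercept):
    """Invert the calibration line to convert a measurement into a volume."""
    return (measurement - intercept) / slope


# Hypothetical calibration: known volumes (µL) and measured dye heights (mm).
cal_volume_ul = [20, 40, 60, 80, 100]
cal_height_mm = [5.0, 9.0, 13.0, 17.0, 21.0]

slope, intercept, r_squared = fit_line(cal_volume_ul, cal_height_mm)
print(f"slope = {slope:.3f} mm/µL, intercept = {intercept:.3f} mm, R^2 = {r_squared:.3f}")

# Dye height measured from a participant's photograph (hypothetical value).
print(f"estimated volume = {estimate_volume(11.0, slope, intercept):.1f} µL")
```

Measurements taken from photographs would first need converting to physical units, for example using a feature of known size in the image; that step is not shown here.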

Specific use errors identified by videolink observation
Although images sent to the researcher can quantify how much volume is dispensed for each user/test, these images alone cannot provide information on how the devices are used. The videolink observation was therefore intended to allow identification of user errors that could account for the variable volume dispensed. Prior to starting the kits, participants were asked to adjust their webcam such that their hands were visible and that the model lancet, blood spot sampling and rapid test could be monitored by the researcher during operation and recorded using an assessment template (Needs, 2020a). All video calls were visible in enough detail to identify the number of drops dispensed for the rapid test and observe the level of simulated blood in the capillary collection tubes. In approximately 50% of cases participants were asked to adjust the camera or move their hands closer to the camera to improve the observation, and we therefore recommend planning the study protocol carefully so that common webcam setups allow the researcher to observe the kit during use. The most common errors identified for the capillary collection tubes were not waiting for the liquid to reach the fill line completely (5/15) or squeezing the ends of the tube to fill them (4/15). The latter error would not necessarily lead to a significant difference in volume dispensed (Figure 5e). In one case for the Microsafe tube, an unexpected result was observed where the volume dispensed was significantly lower than others, but no user error was observed by the researcher. In this case the liquid was not fully expelled from the tube, transferring only a small volume. The participant was asked to confirm the use of the device afterward and no error was identified. By video it was confirmed that the tube was filled to the marked line with the simulated blood, but the tube had not been squeezed. It was confirmed that after the participant squeezed the tube to release the liquid the simulated sample remained in the tube. Therefore, this was not categorised as a user error, and the low volume dispensed could instead be from a device failure.
Using the video call, errors in use were highlighted, and direct observation allowed us to record these specific errors. When these known errors were excluded, the liquid volume ranged from 22-48 µL and 26-37 µL for the Microsafe and PTS collect tubes respectively. In some cases the correct amount of liquid was still dispensed in spite of a user error; in these cases, participants had squeezed the tube to fill it with simulated blood up to the correct fill line (Figure 5e).
Use of the dropper bottle resulted in fewer errors compared to the blood collection tubes. The video observation allowed us to identify how each participant used the dropper bottle. Variations in liquid handling technique could in many cases be clearly linked to the volume dispensed. Our model test required the addition of 3 and 5 drops respectively for the IgG and IgM lanes; this was designed to determine if adding complexity affected user accuracy. In spite of this complexity, all participants managed to identify the correct number of drops to add; however, there was still variation in the liquid dispensed. We believe the main identifiable reason for this variation was holding the dropper bottle so close to the filter paper that full drops were not formed (12.5%), with one participant combining this with taking an image before the 2-5 minute timing. Since the liquid did not have time to fully dry, the measured volume may have appeared even lower.
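To put the dropper bottle variation in context, the dispensed ranges reported earlier in this section (45-116 µL for IgG and 49-187 µL for IgM) can be expressed relative to the laboratory volumes for 3 and 5 drops (101 and 173 µL). The short Python sketch below does only this arithmetic; the dictionary layout and the function name `summarise_lane` are illustrative assumptions, and the percentages are a descriptive summary rather than an analysis performed in the study.

```python
# Values taken from the text: drops required, laboratory volume for that
# number of drops, and the range dispensed by participants (all in uL).
LANES = {
    "IgG": {"drops": 3, "lab_ul": 101.0, "observed_ul": (45.0, 116.0)},
    "IgM": {"drops": 5, "lab_ul": 173.0, "observed_ul": (49.0, 187.0)},
}


def summarise_lane(lane):
    """Return mean laboratory drop volume and observed range as % of laboratory volume."""
    low_ul, high_ul = lane["observed_ul"]
    return {
        "lab_ul_per_drop": lane["lab_ul"] / lane["drops"],
        "low_pct_of_lab": 100.0 * low_ul / lane["lab_ul"],
        "high_pct_of_lab": 100.0 * high_ul / lane["lab_ul"],
    }


summaries = {name: summarise_lane(lane) for name, lane in LANES.items()}
for name, s in summaries.items():
    print(
        f"{name}: {s['lab_ul_per_drop']:.1f} uL per drop in the laboratory; "
        f"participants dispensed {s['low_pct_of_lab']:.0f}%"
        f" to {s['high_pct_of_lab']:.0f}% of the laboratory volume"
    )
```

Expressed this way, the lowest volumes in both lanes fall well under half of the laboratory volume, which is consistent with incomplete drop formation being the main identifiable source of variation.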
We therefore believe the most important benefit of the video call methodology, compared to simply requesting that participants return the images, was the ability to identify use errors live and to attribute variation in volume to specific user behaviour. Using the videolink we were able to identify, by direct observation, if the participant used a blood sampling device in a way that would lead to errors in sample collection or test performance. For home sampling, the laboratory may be able to correct some errors. For example, errors using the collection tube led to both increases and decreases in sample volume; assuming the laboratory can adjust the sample volume prior to testing, a decrease in sample volume is more likely to negatively affect assay results due to insufficient material.

The effect of instruction format on usability
Following a pilot to refine the study protocol (n=7), 15 participants were split into three groups with different instruction levels: written; written and video; written and video with videolink supervision by the researcher. Through supervising the participants via the videolink, we were able to check whether they could access the online video instruction. All participants were able to access the video instruction, hosted on YouTube using a private link, with no technical issues (Needs, 2020a).
The focus of this study was to determine the feasibility of, and to optimise, the videolink methodology for observing usability in the context of following instructions. Although participants were split into three groups based on different levels of instruction, these groups were not large enough to identify which instruction format led to fewer errors. Furthermore, the instructions used in this study were not developed specifically for improved usability and were instead adapted from existing rapid test products that were designed for use by a trained professional tester. However, we still examined whether there was any evidence of differences between instruction formats in spite of these limitations, and explored whether participants felt more or less comfortable with the different formats. The group with both video instruction and researcher supervision made the fewest user errors overall (Figure 6), with only 2 errors identified in this condition compared to 6 in the unsupervised category and 4 in the video/unsupervised category. An expanded study with larger groups is therefore warranted to explore if this apparent reduction in user error with direct videolink supervision is significant.
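The error counts above can be normalised by group size to give a simple descriptive comparison. The Python sketch below assumes that each group had 5 participants (15 participants split into three groups) and that the "unsupervised" category corresponds to written instructions only; both the group labels and the function name `errors_per_participant` are illustrative assumptions, and no statistical significance is implied.

```python
PARTICIPANTS_PER_GROUP = 5  # assumed: 15 participants split evenly into three groups

# Error counts as reported in the text; the group labels are our interpretation.
ERRORS_OBSERVED = {
    "written only (unsupervised)": 6,
    "written + video (unsupervised)": 4,
    "written + video + supervision": 2,
}


def errors_per_participant(error_count, group_size):
    """Mean number of observed use errors per participant in a group."""
    return error_count / group_size


rates = {
    group: errors_per_participant(count, PARTICIPANTS_PER_GROUP)
    for group, count in ERRORS_OBSERVED.items()
}

for group, rate in rates.items():
    print(f"{group}: {rate:.1f} errors per participant")
```

With groups this small the rates are descriptive only, which is why a larger study is needed before any difference between formats can be tested formally.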
Alongside showing that this methodology allows analytical assessment of volumes dispensed, and permits observing whether these volume errors were associated with user errors, this study protocol allowed participants to report their experience using the devices and instructions. The user experience questionnaire identified a general trend that participants were more likely to report that they felt confident when provided with direct supervision, with increasing numbers of less confident participants for video and written instructions or written instructions alone (Figure 7a). There was an even clearer suggestion that participants found instructions easier to follow with supervision than without (Figure 7b). Increased confidence in performance has been noted with video-aided instructions (Alexander, 2013; Shah & Gupta, 2017). Whilst our initial data support this observation in the context of home testing kits, both the written and video instructions in our pilot were designed purely for exploring method feasibility, recruitment was not designed to represent the broader population, and group sizes were too small to be confident of the differences seen here. Having shown that it is possible to quantify user experience and accuracy simultaneously with remote observation, it is now vital to expand this study with larger groups using bespoke instructions in different formats, and to use this methodology to define the optimal instruction format for eliminating user errors whilst maximising user confidence.

Conclusion
This study shows it is feasible to undertake remote usability studies of rapid tests used in the home or of home sampling for laboratory diagnostics. The use of remote videolink methodologies for research will become increasingly important during and after the coronavirus pandemic, as social distancing and fewer face-to-face meetings are required. These methods would also allow the inclusion of individuals who may be at higher risk from COVID-19 and who would not be able to attend in-person meetings. A further benefit may be the ability to observe users following instructions in their home environment, without requiring either participant travel to a research site or a researcher visit to participants' homes.
We identified a number of factors that are important to consider when designing a videolink study. An initial pilot is essential to allow the kit format and instructions sent out to participants to go through an initial review process before the final study is launched. It is important to identify which metrics are to be measured and whether these can be quantified or recorded remotely; this includes quantitative elements likely to affect analytical performance, as well as qualitative factors such as interpretation and user confidence. In this study we measured a number of metrics that could have been captured without direct observation (volume of liquid dispensed and questionnaire), but we were also able to collect further information through direct observation, for example by identifying specific user errors, that would not otherwise have been recorded. Videolink observation of home testing kits allowed the researcher to determine accurate use of equipment for home testing tasks, and we were able to attribute most quantitative errors to a specific action by the user.
Identification of the correct demographic for a study is important to gain insight into the usability of a home test for the target population. For people to participate in studies with videolink observation they must be able to make a video call using an internet-connected device, potentially excluding individuals not confident with technology; this barrier is reducing as such communication becomes more common. We used Microsoft Teams in this study as most of those recruited through our University contact list regularly use this platform, but the use of a wider selection of publicly available videolink platforms including FaceTime and WhatsApp might simplify participation for many important target groups. Likewise, exploration of diverse platforms for hosting instructional videos may be important. However, the advantages of allowing a wider participant demographic using more familiar platforms must be balanced against possible privacy and data security concerns using each different platform.
We conclude that videolink observation provides a viable method to conduct usability studies remotely, with several benefits. Although preliminary, we found some evidence that remote supervision by the researcher increased participants' confidence in using these home testing kits. Further research on home sampling instructions using this methodology is now important to identify ways to increase participant accuracy, ideally without supervision. Future work will also include research with users on the design of instructions to accompany home-testing kits. This will consider the extent to which the form of the instructions (spoken, written, pictorial on paper or screen) leads to fewer errors in sampling.

Ethical considerations
Informed written consent was obtained from all participants. Ethical approval to undertake this study was received from the University of Reading, reference code 23/2020.