Challenging issues of integrity and identity of participants in non-synchronous online qualitative methods
Methods in Psychology

Qualitative data collection is increasingly occurring online, with data collection methods often lacking the synchronous contact between researchers and participants present in more traditional methods of qualitative data collection such as face-to-face interviews. Despite numerous benefits of non-synchronous online methods of qualitative data collection, such methods also pose unique challenges concerning participant eligibility and data quality in the qualitative domain. Due to a longer tradition of conducting non-synchronous quantitative online data collection, researchers have discussed issues related to data quality for use within quantitative research, and developed techniques to address such issues. However, such discussions have not taken place within qualitative research and, due to the differences in types of data and theoretical underpinnings, only some of the techniques developed in quantitative research can be appropriately applied in qualitative research. In this paper, we address this knowledge gap by providing an important ‘how to guide’, presenting techniques to help address threats to data quality and integrity in non-synchronous online qualitative research. We start by outlining techniques developed for use in quantitative research that can be appropriately transferred to qualitative paradigms, before proposing techniques to manage challenges faced specifically by non-synchronous online qualitative research. We go on to discuss some of the potential pitfalls which can prevent the implementation of these techniques and how to overcome them. Finally, we urge researchers to be transparent about the techniques they implement to optimise data quality and to adopt a proactive rather than reactive approach to maximising data quality in qualitative research studies.


Introduction
Within the last two decades, the use of qualitative methods has gained momentum in the field of psychology, a trend accompanied by an increasing diversity of qualitative methods. Traditional qualitative data collection methods such as interviews and focus groups necessitate synchronous contact with participants (e.g. face-to-face, telephone, or video communication) (Braun and Clarke, 2013). Whilst such methods remain popular, methodological innovation has expanded the types of research data that can be collected and, crucially, how data are collected. Many methods offer the opportunity for use online with and without synchronous contact between researcher and participants, facilitating large participant samples with relatively little researcher time dedicated to participant recruitment and data collection. Such data collection methods include textual data from qualitative surveys, diaries, and story completion (Clarke et al., 2017; Meth, 2017), visual data in the form of photos, drawings, or other images (e.g., Pain, 2012), and combinations of these data collection methods (e.g. Favaro et al., 2017; Hayfield and Wood, 2019). Although many of these methods have long been used within qualitative psychology, the recent move towards their use online enables investigation of previously unexplored or inaccessible areas and populations, and thereby opens new avenues of discovery. For example, online delivery of methods has considerably widened the available participant population, and because such methods are more time and cost efficient, researchers are able to collect more data and conduct more studies in limited time using fewer resources. It is these online qualitative methods of data collection that do not involve synchronous participant contact that we discuss in this paper.
For simplicity, throughout this paper, we use the term non-synchronous online methods when referring to all non-synchronous online participant recruitment and data collection methods, and non-synchronous online qualitative/quantitative methods to distinguish between research paradigms. Within the term non-synchronous online methods, we include those which employ non-synchronous screening processes (e.g. screening questions answered via email), and such processes are discussed in section 2 below. In addition to this natural evolution of qualitative online methods, the COVID-19 pandemic has acted as a catalyst, accelerating this process by moving many studies online. Such studies are likely to have been rapidly re-designed to adapt to the current situation, leaving researchers little time to consider many of the implications of online methods (e.g. Dodds and Hess, 2020). However, as with all research design decisions, researchers must ensure that their choices regarding the use of novel and non-synchronous online research methods are driven by their research objectives rather than simply practical constraints. Additionally, researchers must ensure appropriate alignment across their research questions, methods of data collection and analysis, and overall research paradigm.
The benefits and opportunities that non-synchronous online qualitative methods provide have been well documented (e.g., Subramaniam and Wuest, 2018), as have the challenges posed by non-synchronous online quantitative methods (e.g., Godinho et al., 2020; Teitcher et al., 2015; Ward and Meade, 2018). However, the challenges posed by non-synchronous online qualitative methods, specifically the potential threats to data quality and integrity, are rarely discussed. This lack of discussion can be ethically problematic as it prevents transparency regarding how the research community can ensure that non-synchronous online qualitative research adheres to key ethical principles. Recent literature has opened up this ethical conversation with regard to safeguarding participants' rights (Gupta, 2017; Sugiura et al., 2017). As this literature provides key discussions of such ethical concerns, we do not cover them in detail here. Instead, we signpost the reader to Gupta (2017) and Sugiura et al. (2017) to consider specific ethical issues related to research participants in more detail. Here, we focus on how the lack of discussion regarding potential threats to data quality in non-synchronous qualitative online methods presents practical challenges, as researchers may be left unsure how to ensure the quality and integrity of their data. This in turn leads to additional ethical challenges, as without clear guidance or standards related to ensuring data quality and integrity, research findings and conclusions may be undermined. The purpose of this paper is therefore to critically reflect on the implications of using non-synchronous online qualitative methods and initiate discussions related to the implementation of techniques to ensure data quality and integrity.
This commentary begins by outlining techniques for ensuring data quality and integrity which have been developed for use in non-synchronous online quantitative methods and are transferrable to qualitative paradigms. We then discuss the additional considerations relevant to non-synchronous online qualitative methods and present a comprehensive selection of techniques for qualitative researchers to draw upon. We end by outlining our aspirations for continuing the conversation in this area, with the aim of propelling qualitative online methods forward in a rigorous and robust manner (see section 4).

Learning from previous literature
The current literature exploring issues related to ensuring the quality and integrity of data collected through non-synchronous online methods has focused exclusively on quantitative methods. Therefore, although there may be many useful techniques within the current literature that are appropriate for use with qualitative methods, it is crucial that such techniques are examined thoroughly to ensure their appropriateness for use in qualitative paradigms. Here we consider the challenges to data quality and integrity which are shared across qualitative and quantitative paradigms, and the techniques to address these issues which have been developed within quantitative paradigms that can be transferred to qualitative paradigms.
Common across qualitative and quantitative study designs are three types of study respondents: eligible participants, ineligible participants, and fraudulent participants. Eligible participants are those who meet the study eligibility criteria. Ineligible participants are well-intentioned individuals who are unaware that they do not meet the eligibility criteria. Finally, fraudulent participants are participants who intentionally complete study tasks inappropriately. Many studies provide financial reimbursement for participants to rightfully acknowledge their time and effort (Jones and Liddell, 2009;NHS Health Research Authority, 2014). Unfortunately, these payments may make a study a target for fraudulent participants, who participate for financial gain.
The current literature has focused on helping quantitative studies prevent such fraudulent responses (Godinho et al., 2020) and also careless responses (Godinho et al., 2016; Goldammer et al., 2020; Ward and Meade, 2018) from eligible participants who do not fully engage with the task. Fraudulent participants may use automated systems (known as 'bots'), or manually submit responses to the study. Fraudulent participants will likely submit multiple responses, although a single submission does not guarantee the response is not fraudulent. To help ensure that data is only provided by eligible participants, non-synchronous online qualitative studies can implement four simple design features which have been used within quantitative research (see Bowen et al., 2008; Godinho et al., 2020; Teitcher et al., 2015). Firstly, a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a short task designed to be easy for humans but too complex for computers (Butterfield et al., 2016). Many online study platforms (such as Qualtrics) have the facility to add CAPTCHA easily to a study, making it a quick and simple way to help prevent automated responses. Secondly, most online study platforms have settings which can prevent multiple responses from the same IP address. This may help to prevent or deter participants from completing the study more than once, although fraudsters may use technology which disguises their IP address. Thirdly, including a clause in the information sheet and consent form outlining the conditions under which payment will not be sent provides researchers with an appropriate position for withholding payment and may dissuade would-be fraudulent participants. Finally, requiring responses for all questions and setting minimum limits for free text boxes makes fraudulent responses harder.
For example, in our recent online study employing story completion methods (Jones et al., 2020) we set a minimum limit of 900 characters for stories, a decision informed by both methodological and analytical guidance (Clarke et al., 2017). However, caution is needed when requiring responses to questions as participants have the right to choose not to respond to a question. One resolution to this ethical dilemma is to include a "prefer not to answer" response to such questions, thus allowing participants to submit a response while choosing not to answer the question. As this may not be appropriate for all questions/tasks, we suggest researchers consider the ethical implications prior to implementing forced responses.
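The minimum-length rule and the "prefer not to answer" option described above could also be checked programmatically, for example when responses are exported from a survey platform. The following Python sketch is illustrative only: the function name, the returned labels, and the exact opt-out wording are assumptions rather than features of any particular platform, and only the 900-character minimum comes from the study described above.

```python
from typing import Optional

MIN_STORY_CHARS = 900             # minimum used in the story completion study
OPT_OUT = "prefer not to answer"  # hypothetical wording of the opt-out option


def check_response(text: Optional[str],
                   min_chars: int = MIN_STORY_CHARS,
                   allow_opt_out: bool = True) -> str:
    """Classify one free-text answer as 'ok', 'opted_out', 'missing' or 'too_short'."""
    cleaned = (text or "").strip()
    if not cleaned:
        return "missing"
    if allow_opt_out and cleaned.lower() == OPT_OUT:
        return "opted_out"
    # Length is counted after trimming, so padding with spaces does not help.
    if len(cleaned) < min_chars:
        return "too_short"
    return "ok"
```

Returning a label rather than rejecting the answer outright leaves the decision about what to do with short or missing answers to the research team, in keeping with the ethical caution noted above.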
In addition to these design features, there are two techniques relating to study procedures that can be implemented to help confirm the eligibility of participants. Firstly, making the study accessible only via personal links sent by the researcher allows the researcher to act as a 'gatekeeper', only giving access once eligibility has been confirmed. We recommend asking potential participants to email the researcher, who then asks a series of screening questions to assess eligibility, and answers any questions that the participant may have. This process also helps to ensure that participants are fully informed about the study prior to consenting, although it may present some concerns regarding participant anonymity. However, although this screening process can help to ensure the eligibility of participants, it is not a panacea. Fraudsters may pass screening by providing false information, or by being eligible to take part (therefore providing accurate information) and then fraudulently taking part multiple times. Additionally, the anonymity afforded by non-synchronous online qualitative methods may result in attracting participants who may not wish to engage with synchronous qualitative methods. Therefore, researchers should consider what constitutes the minimal amount of information needed to assess eligibility and avoid enquiring beyond this. We recommend that researchers consider such participant-related ethical concerns when deciding if and how to implement these techniques. For more information on participant-related ethical concerns, please see Gupta (2017) and Sugiura et al. (2017). Secondly, we recommend checking that the email addresses provided for sending payment are not duplicated. Generating a document solely for tracking payments ensures that data protection is adhered to, excess personal information is not stored, and data that are stored are anonymous.
Such a document only needs to include details of the payment (i.e., payment method), the date the payment was executed, and the email used for payment.
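As an illustration of the payment-tracking document and the duplicate-email check described above, the following Python sketch stores only the three fields mentioned (payment method, payment date, and email) and reports any email address that appears more than once. The class and function names are hypothetical, and the normalisation step (trimming and lower-casing) is an assumption about how trivially different spellings of the same address might be caught. It would not detect a fraudster using several distinct addresses.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class PaymentRecord:
    """One row of the payment log: nothing beyond what is needed to pay."""
    email: str
    method: str
    paid_on: date


def normalise_email(email: str) -> str:
    """Trim and lower-case so trivially different spellings compare equal."""
    return email.strip().lower()


def find_duplicate_emails(records: List[PaymentRecord]) -> Dict[str, int]:
    """Return each normalised email that occurs more than once, with its count."""
    counts: Dict[str, int] = {}
    for record in records:
        key = normalise_email(record.email)
        counts[key] = counts.get(key, 0) + 1
    return {email: n for email, n in counts.items() if n > 1}


# Example with made-up addresses.
log = [
    PaymentRecord("a@example.org", "voucher", date(2020, 5, 1)),
    PaymentRecord(" A@Example.org ", "voucher", date(2020, 5, 2)),
    PaymentRecord("b@example.org", "bank transfer", date(2020, 5, 2)),
]
duplicates = find_duplicate_emails(log)  # {'a@example.org': 2}
```

A duplicate found this way is a prompt to look more closely before sending payment, not proof of fraud in itself.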
The techniques outlined here were developed for use within quantitative paradigms and are appropriate for use within qualitative paradigms as they are not impacted by the different epistemological or ontological assumptions underpinning the different paradigms. However, there are some challenges related to data quality and integrity which are either specific to qualitative research or have been tackled within quantitative paradigms in ways unsuitable for use in qualitative paradigms (e.g., statistical analysis of patterns of responses as suggested by Goldammer et al. (2020)).

Challenges and opportunities unique to qualitative paradigms
The techniques outlined above focus on preventing fraudulent or ineligible responses from being submitted; however, it is also important to consider how to detect such responses once they have been submitted. Due to the differences in types of data and theoretical underpinnings of qualitative and quantitative methods, the detection of fraudulent or ineligible responses requires different techniques across paradigms.
As soon as is practical following receipt of a response, and prior to any payment being sent, data should be screened for duplicate or incongruous responses. Timely screening prevents potentially problematic responses from accumulating, a potential danger with intentional fraud where many responses may be submitted in a short period. Prompt screening also ensures that eligible participants receive payment without delay and is important for ensuring the appropriate safeguarding of vulnerable individuals. Safeguarding is outside the scope of this paper; however, researchers must consider how to manage disclosures or other safeguarding concerns which may arise in the data. When screening data, it is important to balance the competing needs of ensuring only data provided by eligible participants are included, and not discarding eligible, but unusual, responses. Such a balance is particularly important in non-synchronous online methods as they may attract different participants as compared to face-to-face and other synchronous methods. Therefore, researchers should avoid being overly influenced by their expectations of what the data 'should' look like. Ideally, response screening would be facilitated by someone who is blinded to the study hypothesis and aims. In our experience it is relatively easy to identify fraudulent responses as they tend to contain inappropriate phrases, unrelated to the question. Table 1 shows examples of fraudulent responses that we received to our story completion study (see Jones et al., 2020). The examples show how fraudulent responses may be simply bizarre and nonsensical, or they may include attempts to answer the question.
It is worth noting here that the 'richness' of qualitative data offers qualitative methods an advantage as suspicious responses are easier to identify. For example, the grammar, syntax, and word choices demonstrated in the fraudulent responses in Table 1 provide signs that these responses to the story completion task were not genuine eligible responses. Additional signs of fraudulent or ineligible responses are those responses which do not follow the instructions, fail to complete the task, or responses completed in an unfeasibly fast time, although such responses may also come from eligible participants. Any suspicious responses identified should be discussed, where possible, with the research team and a consensus reached regarding the exclusion of any responses.
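Some of the signs described above, namely duplicate responses and unfeasibly fast completion times, lend themselves to a simple automated first pass before the research team reads the data. The Python sketch below is one possible approach and not a procedure taken from our study: the field names (`id`, `text`, `seconds`), the 120-second threshold, and the 0.9 similarity cut-off are all assumptions that would need to be set for each study. It only flags responses for discussion; it does not exclude anything.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Set


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace before comparing two responses."""
    return " ".join(text.lower().split())


def flag_for_review(responses: List[Dict],
                    min_seconds: float = 120.0,
                    similarity: float = 0.9) -> Dict[str, Set[str]]:
    """Map response id -> reasons ('fast', 'similar') it merits a closer look."""
    flags: Dict[str, Set[str]] = {}
    # Flag completion times below the (assumed) feasibility threshold.
    for r in responses:
        if r["seconds"] < min_seconds:
            flags.setdefault(r["id"], set()).add("fast")
    # Compare every pair of responses for identical or near-identical text.
    for i, a in enumerate(responses):
        for b in responses[i + 1:]:
            matcher = SequenceMatcher(None, normalise(a["text"]),
                                      normalise(b["text"]), autojunk=False)
            if matcher.ratio() >= similarity:
                flags.setdefault(a["id"], set()).add("similar")
                flags.setdefault(b["id"], set()).add("similar")
    return flags
```

The pairwise comparison is adequate for the sample sizes typical of qualitative studies but scales quadratically with the number of responses. As noted above, fast or unusual responses may also come from eligible participants, so any flags raised this way should feed into the team discussion rather than replace it.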
In addition to preventing and identifying fraudulent and ineligible responses, non-synchronous online qualitative research must ensure that eligible participants are able to provide high-quality data (i.e., data that allows the researcher to answer the research question). In such studies, the researcher is unable to provide clarifications or ask probing questions, creating three potential problems that may impact data quality. Firstly, participants may not fully understand what is being asked of them. They may not realise that they have misunderstood the task or may simply exit the study rather than seek clarification. Secondly, for a variety of reasons, participants may not provide the level of detail needed. However, different study designs require different levels of detail, and individual responses to non-synchronous online qualitative methods are likely to be less 'rich' compared to data generated by synchronous methods (Davies et al., 2020). Finally, participant fatigue or lack of engagement can result in responses that lack appropriate detail or are missing altogether. As previous literature in this area has focused on detecting careless responses in non-synchronous online quantitative research (e.g., Goldammer et al., 2020; Meyer et al., 2013), and thus on numerical data, it is limited in its application to qualitative studies. To address this gap in the literature, we propose the following measures, although we strongly encourage researchers to consider the specific ways in which these could be implemented in each study.
To help guide participants to provide high-quality data, task instructions need to be clear and precise but avoid overburdening participants or leading them towards certain responses. We recommend three specific considerations when writing task instructions and questions. Firstly, we recommend setting a minimum limit for the free text sections as outlined above in section 2, and if appropriate, stating there is no maximum limit on response length. Studies will require differing minimum limits depending on the specific task, the analytical approach used, and the other tasks within the study (i.e., to prevent overall participant burden). Secondly, we recommend stating that there are no right or wrong answers and encouraging the participant to provide as much detail as possible in their responses. Finally, involving participants in the design stages is considered best practice (see UK Public Involvement Standards Development Partnership, 2019), and can help improve the readability of questions, and the appropriateness of the tasks. Conducting pilot studies can also help to identify elements that do not work (e.g., survey links not working) or which participants have trouble completing (e.g., questions that are often left incomplete). Together, pilot studies and the involvement of participants can help to ensure that the study design and instructions are likely to be acceptable to, and understood by, participants.

Table 1
Examples of fraudulent responses to our story completion study (Jones et al., 2020), where the story stem asked participants to think about attending a hypothetical 10-year school reunion and what participants tell people about their lives.

Nonsensical fraudulent response:
Only when you feel miserable can you feel something. It's not that I don't care. It's just a deep hiding. It doesn't matter, but it doesn't put it on the face. I have seen your most loving face and the softest smile. In the cool state of the world, the lights give me the ability to go along, and love while walking. The night lonely feeling is dead, the night wind blows the heart, only the cold and the cold, the flowers withered and the undeserted, the old flourishing has fallen alone.

Fraudulent response with some attempt to answer the question (reference to 10-year timeframe):
Ten years ago, I thought ten years later my husband would love me as much as he did now. Although my face was old, no daughter's lively and lovely, young fashion, but his love for me did not reduce a little, and not because of the daughter and migrated. Now, I am young and many gold, I have many unfinished career, interest and knowledge, part of my life. Where is my career in ten years? Will the lifelong career be to foster the growth of the daughter.

Fraudulent response with several attempts to answer the question (reference to future plan, pain condition, school reunion):
I might be a CRPS* medical specialist. Doctors have always been my favorite job. When I save people, I earn respect. CRPS has always been my most worried, the persistent pain gave up a lot of interesting work and things, and some of my friends have been troubled by CRPS. I can feel their mood, as I can do, so I hope I can become an expert in CRPS in the next ten years, helping myself and helping me. Students, help more social people.

* CRPS refers to Complex Regional Pain Syndrome, a particular pain condition. Our study explored the future stories of adolescents who have CRPS.

Implementation pitfalls
Ideally, researchers should implement all the techniques outlined in sections 2 and 3; however, this may not be possible due to a variety of potential pitfalls. Although the security techniques outlined in section 2 are quick to set up and require no further input from the researcher once in place, they are not a panacea, and implementing these alone cannot guarantee the eligibility of all participants. Rather, we see these as basic protections to be incorporated into the standard study set-up for all online studies.
In contrast, the techniques related to study procedures outlined in sections 2 and 3 increase the amount of study administration required. We therefore suggest considering these techniques during the development of study funding applications (where possible) to ensure the appropriate resources are available for study management. As with many aspects of study management, a greater number of resources will likely be related to higher quality data, and a balance between pressure on resources and data quality must be reached. Although implementing these techniques may initially require additional resources, they are likely to provide more efficiency overall, due to the reduction of the burden created by fraudulent responses, ineligible participants, and inappropriate or incomplete responses. If full implementation is not possible, researchers should consider which techniques they may be able to implement, specifically considering which threats to data quality their study is most at risk from.

Conclusions
In this commentary, we have outlined techniques that can help to ensure data quality and integrity in non-synchronous online qualitative research methods. These techniques, summarised in Table 2, cannot solve all threats to data quality. Rather, they form the start of an evolving conversation around ensuring that the use of non-synchronous online qualitative methods is not at the expense of data quality and integrity. Therefore, we invite our colleagues to both use the techniques we propose, and actively engage with the ongoing conversation through, for example, future in-depth reviews, studies, commentaries, and case studies.
As a community of researchers using qualitative methods, we would be wise to learn from the knowledge and techniques developed to ensure high data quality in non-synchronous online quantitative methods (see e.g. Bowen et al., 2008; Curran, 2016; Godinho et al., 2020). However, we must also acknowledge the specific challenges and opportunities related to data quality presented by non-synchronous online qualitative methods. As qualitative researchers, we are familiar with methods that require us to build rapport with our participants, and we often gain very personal insight into their lives. As non-synchronous online methods prevent this type of contact with participants, we must find alternative ways of ensuring the eligibility of our participants and the quality of the data they provide. Additionally, as the data we are gathering is more nuanced than numbers, it may be easier to identify suspicious responses or poor-quality data.
It is essential for researchers to be open and transparent about these issues, and not let pressures around participant recruitment and publications deflect from the need for reliable, trustworthy, and high-quality data. Some of the techniques we propose can be easily implemented with minimal additional work, while others may require researcher time and resources, which are often in short supply. However, we recommend against viewing these techniques as optional extras, used only when there are ample resources. We believe these techniques should be considered essential requirements when conducting non-synchronous online qualitative research. Therefore, we encourage all qualitative researchers using such methods of data collection to consider the issues explored here and critically evaluate their research design to ensure that their data is of the highest quality.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Funding
Dr Jordan & Dr Caes declare a grant from the Pain Relief Foundation which supported this work. The funders had no involvement in the writing of this manuscript.