Indicators for the use of Robotic Labs in Basic Biomedical Research

Robotic labs, in which experiments are carried out entirely by robots, have the potential to provide a reproducible and transparent foundation for performing basic biomedical laboratory experiments. In this article, we investigate whether these labs are applicable in current experimental practice. We do this by text mining 1628 papers for occurrences of methods that are supported by commercial robotic labs. We find that 62% of the papers have at least one of these methods. This and our other results provide indications that robotic labs can serve as the foundation for performing many lab-based experiments.

reproduce results in preclinical cancer studies, potentially explaining the failure of several costly oncology trials (Begley and Ellis, 2012).

Munafò et al. (2017) outline several potential threats to reproducible science including p-hacking, publication bias, failure to control for biases, low statistical power in study design, and poor quality control. To address these issues, the Reproducibility Project: Cancer Biology, in its reproduction of 50 cancer biology papers, used commercial contract research organizations (CROs) as well as a number of other interventions, such as registered reports (Errington et al., 2014). They argue that CROs provide a better basis for replication as they are both skilled in the expertise area and independent, in turn reducing risk of bias.

Extending this approach of providing an industrialized basis for performing experiments is the introduction of large amounts of automation into experimental processes. At the forefront of this move towards automation is the introduction of "robotic labs". These are labs in which the entire

where it took 280 hours to reproduce a single computational experiment in computational biology (Garijo et al., 2013). While there are still challenges to reproducibility even within computational environments (Fokkens et al., 2013), robotic labs potentially remove an important variable around infrastructure. They provide, in essence, a programming language for biomedical research.

While this promise is compelling, a key question is whether robotic labs would be widely applicable to current methods used in biomedical research. This question can be broken down into two parts:

To answer this question, we use an approach inspired by Vasilevsky et al. (2013)

Our aim was to construct a meaningfully sized corpus that covered representative papers of basic lab-based biomedical research. Additionally, for reasons of processing efficiency, we selected papers from Elsevier because we had access to the XML versions of the papers in a preprocessed fashion. To build our corpus, we first selected journals categorized under "Life Sciences" in ScienceDirect, specifically those marked under "Biochemistry, Genetics and Molecular Biology". We then filtered for journals categorized as

To define what methods could be automated by a robotic lab, we built a list of available and soon-to-be-available methods from the Transcriptic and Emerald Cloud Lab websites as of March 10, 2017. This list contained 107 methods. We term a method that can be executed within a robotic lab a robotic method. We manually mapped those lists to MeSH concepts from the Investigative Techniques [E05] branch. We were able to map 71 methods to MeSH concepts. During the mapping procedure, we selected leaf nodes

SoDA's exact setting. Adopting such a dictionary-based approach translates to high precision in method identification, sacrificing recall. Using this approach means that we cannot determine complete coverage of all methods used in a paper.
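The precision and recall trade-off of a dictionary-based approach can be illustrated with a minimal Python sketch. It is not the annotation tool used in this study: the dictionary entries, the function names (`build_pattern`, `annotate_methods`) and the sample sentences are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical dictionary: lowercase surface forms mapped to a concept label.
# These entries are illustrative only, not the dictionary used in the study.
METHOD_DICTIONARY = {
    "polymerase chain reaction": "Polymerase Chain Reaction",
    "pcr": "Polymerase Chain Reaction",
    "enzyme-linked immunosorbent assay": "Enzyme-Linked Immunosorbent Assay",
    "elisa": "Enzyme-Linked Immunosorbent Assay",
    "cell culture": "Cell Culture Techniques",
}


def build_pattern(terms):
    """Compile one case-insensitive pattern that matches any dictionary term."""
    # Longest terms first so multi-word forms are preferred over shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    # Word boundaries stop "pcr" from matching inside a longer token.
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


PATTERN = build_pattern(METHOD_DICTIONARY)


def annotate_methods(text):
    """Return the set of concept labels whose surface forms occur in text."""
    return {METHOD_DICTIONARY[m.group(0).lower()] for m in PATTERN.finditer(text)}


# Exact dictionary terms are found (high precision).
found = annotate_methods("Samples were amplified by PCR and quantified by ELISA.")

# A paraphrase that avoids the dictionary terms is missed (lower recall).
missed = annotate_methods("DNA was amplified in a thermocycler.")
```

In this sketch the second sentence describes an amplification step but yields no annotation, which is the kind of omission that prevents complete coverage of the methods used in a paper.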

After annotation, analysis was performed by matching the lists detailed above with the output annotations. The analysis procedure code is available in Groth and Cox (2017).
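The matching step can be sketched as follows. This is a minimal illustration under stated assumptions, not the code published in Groth and Cox (2017): the per-paper annotation sets, the paper identifiers and the `summarize` function are hypothetical.

```python
from collections import Counter

# Hypothetical annotation output: paper id -> set of method concepts found.
annotations = {
    "paper_a": {"Polymerase Chain Reaction", "Cell Culture Techniques"},
    "paper_b": {"Passive Cutaneous Anaphylaxis"},
    "paper_c": set(),
    "paper_d": {"Polymerase Chain Reaction"},
}

# Hypothetical subset of concepts that were mapped to robotic lab offerings.
robotic_concepts = {"Polymerase Chain Reaction", "Cell Culture Techniques"}


def summarize(annotations, robotic_concepts):
    """Split each paper's methods into robotic and non-robotic and tally them."""
    robotic_counts = Counter()      # concept -> number of papers using it
    non_robotic_counts = Counter()  # concept -> number of papers using it
    papers_with_method = 0
    papers_with_robotic = 0
    for methods in annotations.values():
        if methods:
            papers_with_method += 1
        robotic = methods & robotic_concepts
        if robotic:
            papers_with_robotic += 1
        robotic_counts.update(robotic)
        non_robotic_counts.update(methods - robotic_concepts)
    return {
        "papers_with_method": papers_with_method,
        "papers_with_robotic": papers_with_robotic,
        "robotic_share": papers_with_robotic / len(annotations),
        "robotic_counts": robotic_counts,
        "non_robotic_counts": non_robotic_counts,
    }


summary = summarize(annotations, robotic_concepts)
```

Keeping the annotations as sets per paper means each method is counted at most once per paper, so the counters report the number of papers in which a method appears rather than the number of mentions.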

Within our 1628-article corpus, 1165 articles were identified as having at least one method, as defined by matching to a MeSH investigative technique. In total, we identified 151 unique methods used across the corpus.

Using the mapping to robotic labs discussed above, we identified 1011 articles, or roughly 62% of the total corpus, that have at least one method that can be executed within a known robotic lab. Of the 1165 papers where the procedure recognized a method, the mean number of robotic methods within an article is 1.5. Figure 1 shows the number of times a robotic method or a non-robotic method occurs within a paper. For example, in roughly 19 papers, a robotic method occurs 4 times.

Table 1 lists robotic methods that occur in more than 15 papers. Of the 59 potential robotic methods, 33 occurred within our corpus. We analyze this list in more detail later in the discussion section.

Additionally, as discussed, we identified the most common non-robotic methods. There were 118 unique non-robotic methods in total, and methods appearing in at least 15 papers are presented in Table 2.

We note that robotic methods appear more frequently in articles. For example, the most frequently occurring robotic method occurs in 15 times more articles than the most frequently occurring non-robotic method.
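The reported counts can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers stated in the text; the variable names and the `pct` helper are introduced here for illustration.

```python
# Counts as stated in the text.
corpus_size = 1628
papers_with_any_method = 1165
papers_with_robotic_method = 1011
unique_methods = 151
robotic_methods_found = 33
non_robotic_methods_found = 118
potential_robotic_methods = 59


def pct(part, whole):
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Share of the whole corpus with at least one robotic method.
robotic_share_of_corpus = pct(papers_with_robotic_method, corpus_size)

# Share of the whole corpus with any recognized method.
any_method_share_of_corpus = pct(papers_with_any_method, corpus_size)

# Among papers with a recognized method, share with a robotic method.
robotic_share_of_annotated = pct(papers_with_robotic_method, papers_with_any_method)

# Share of the potential robotic methods that occurred in the corpus.
robotic_methods_observed = pct(robotic_methods_found, potential_robotic_methods)

# Robotic and non-robotic unique methods should add up to the total.
methods_add_up = robotic_methods_found + non_robotic_methods_found == unique_methods
```

Under these figures, about 62.1% of the corpus contains a robotic method, about 71.6% contains any recognized method, and the 33 robotic plus 118 non-robotic methods account for all 151 unique methods.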

We return to our initial questions: 1) do basic biomedical papers reuse existing methods and, 2) if so, are

Combined with its pervasiveness in biomedical research labs, these factors make PCR an attractive choice for automation.

With the exception of cell culture, the other methods in Table 1 also consist of highly automatable tasks. Just as thermocycler technology is relatively standardized, so too are the equipment, kits and protocols used for methods like HPLC and ELISAs. Biomedical labs are using nearly identical protocols in many instances, yet introducing their own variability due to human use. In these cases,

Table 2 represents the most commonly identified non-robotic methods. We combed through the list of

Table 1.

Additionally, many of the methods tagged are not applicable in the context of a biomedical laboratory pipeline. For example, "passive cutaneous anaphylaxis" refers to a clinical event; its appearance reflects the nature of MeSH as an information management vocabulary as well as potential outliers in our document corpus.

In terms of the second question, our analysis suggests that the research represented by this corpus of literature has the potential for using robotic labs in at least some aspects of the described experimental

Looking more deeply at the actual methods identified, the top robotic methods in Table 1 are a mix of both workflow techniques (e.g., cell culture, transfection) and endpoint measurements (e.g., qPCR, ELISA).

Roughly 6% of our corpus had more than 3 robotic methods within one paper, which we believe to be

extraction; however, this was applied only to a small number of papers. In future work, we aim to apply these recent advances to deepen our analysis. Based on the challenges listed above, we believe that the numbers presented here are an underestimation of the total number of robotic methods that can be applied in biomedical research.

Finally, while we believe the selected corpus reflects the body of literature that would most likely use robotic labs, it could be argued that a much larger corpus would be more informative. This investigation is also left to future work.

Reproducibility is of increasing concern across the sciences. Robotic labs, particularly in biomedicine, provide the potential for reducing the quality control issues between experiments while increasing the transparency of reporting. In this article, we analyzed a subset of the biomedical literature and found that more than 60% of the papers have some methods that are supported by existing commercial robotic labs. Furthermore, we find that basic methods are indeed "popular" and are increasingly being covered by robotic labs.

While there will always be labs that specialize in the development of new methods, given these indicators, we believe that robotic labs can provide the basis for performing a large percentage of basic biomedical research in a reproducible and transparent fashion.