Six Laws of Open Source Drug Discovery

Six to swear by! Society needs effective and affordable medicines. We currently have at our disposal essentially one system to discover and develop drugs, and there are many areas where this system struggles to deliver, for example to combat antimicrobial resistance, or tropical diseases, or dementia. It is sensible to cultivate alternative, competing approaches to drug discovery and development. A genuinely new alternative is to open up the entire research cycle, abandoning secrecy altogether. This “open source” approach has now been trialed and the lessons learned distilled to six laws of operation that help to clarify working practices. This article examines and explains those laws, which can be adopted by anyone wishing to create medicines using an inclusive, public process.

innovations. Many of these are in the technical side of our discipline: methods in organic synthesis, assay technologies, machine learning, or approaches based in fundamental biology. Some innovations arise in allied disciplines of economics, or law. But in parallel we must try to notice when our work patterns are constrained by social structures that may limit our abilities to function most effectively. Are we going about things in the right way? Are we innovating in the way that we work?
There are large-scale initiatives in pharma trying to address this under the banner of "open innovation". [1] The term is nebulous, but there is typically some re-orientation of a company to be more outward-facing: to place problems in the public domain and attempt to broaden the net of expertise in order to solve those problems faster. The broad range of such initiatives means that while some are highly valuable, others are branding exercises. Either way they will in general not change the way we work: just as for secretive, inward-facing endeavours, participants in open innovation projects typically work in closed groups, revealing only that which is felt worth sharing. The open part of open innovation is the shared problem, but not necessarily the proposed, and attempted, solutions.
Examples of an alternative way of working are all around us, but largely viewed as insufficiently serious for science. Wikipedia has transformed the way knowledge is curated, aggregated, and shared, yet is probably the tool that scientists deny using most frequently. The software that underpins the internet was built using a method similar to that behind Wikipedia, specifically that the details can be seen by anyone and people can work together openly to improve the content or take it in a new direction. There is no appetite for secrecy or silos because those stifle our ability to access the best ideas, and why would you want to do that? This method is open source, distinct from open innovation and to be distinguished from open access, which term refers to our ability to read papers: valuable, but not the transformative feature of initiatives like Wikipedia. In open source the community participates in peer-reviewed public work. Depending on the exact licence of the project, one will have high levels of freedom to act. Crucially, one may act while the work is happening, rather than merely after it is complete. You are a player, not an observer. There is cooperation towards goals that operates alongside an open competition in how work is done, i.e., collaboration within a rather brutal, open arena of expertise.
Over the last 15 years, I have worked with large numbers of people who are interested in open source in drug discovery because the approach promises to be able to solve problems in exactly the way that the pharma industry, if acting alone, cannot. If one values competition, as I suspect all scientists do, then a competition of approaches must be seen as healthy. If open source can compete with proprietary methods in a commercial space as valuable as telecommunications (i.e., the projects leading to Android vs. iOS), can the same competition of ideas benefit another commercially-important area: drug discovery?
I strongly suspect it can. The answer depends on the disease. But I am running this experiment with others because the answer is not yet clear.
This article is intended to clarify key terms. One cannot set off for the moon without some basic principles of operation, and vagueness can be worse than nothing. We ought to be clear what "open source drug discovery" means. When I started in this endeavour the term was aspirational, but we have now run several successful projects with a clear and unified set of principles, and I would like to highlight these because they are useful. Operations have been distilled to six laws that have held strong in the real world and which are the subject of this article.

How the Laws Came About
As with any endeavour designed to elicit maximum inclusion and productivity, there is a fine line to navigate between freedom to operate on the one hand and a sense of order and unified purpose on the other. Yuval Noah Harari makes a nice point in his book Sapiens that corporations exist beyond the physical: that they may possess a lasting identity without any permanent assets. [2] It is the same with open source initiatives. The lack of a tangible "thing" or "building" remains one of the most challenging ideas for those coming to an open source project for the first time, yet the fluidity of its structure and constitution provides the kind of resilience you observe when trying to spear a shoal of fish. "What is an open source project?" is a complicated question, but a short form answer might be "people, and their enthusiasm for a shared public mission".
The Six Laws attempt to capture some of this structure, but were not set up a priori. They arose from the first project I ran in an open way: the search for a robust route to an enantiopure version of the world's most widely used anthelmintic, praziquantel. [3] The need for such an improvement had been explicitly mentioned as a research priority by the tropical disease research division of the World Health Organisation. To solve it, I proposed some chemically elaborate approaches and placed them online in an attempt to kick-start an open approach to solving this chiral switch. My intention was to mimic the hive mind that was, in late 2004, having a major impact on the construction of new software and Wikipedia. The project became busy once we were modestly funded. At this point reality collided with the grandeur of grant proposals and the scientific approach we were taking changed in response to precisely the expert collective intelligence I had hoped for, freely given and mostly from the private sector. The solution emerged swiftly and was solid. The science was unquestionably accelerated through the inputs of well-qualified strangers.
Over a dinner in Cape Town hosted by Kelly Chibale I got talking with Tim Wells, the CSO of the Medicines for Malaria Venture (MMV), who asked the obvious question: could we use the same approach to the discovery not of better chemical routes, but of new chemical entities? The conversation went back and forth until we decided that conversation can only take people so far, and it is sometimes better to learn by doing. Open Source Malaria (OSM, initially called Open Source Drug Discovery for Malaria) was born from that conversation and the invaluable wisdom of the MMV team of Jeremy Burrows and Paul Willis. I realized that this might turn into a larger endeavour: there were no such initiatives anywhere else. Genomics collaborations, such as the Human Genome Project, had pioneered the sharing of data sets with clear commercial potential, downstream. There continues to be a great deal of impactful research into open source tools in cheminformatics. [4] The Open Source Drug Discovery project in India was, despite its name, operating a crowdsourcing initiative as opposed to something that was open source. [5] The Structural Genomics Consortium were pioneering the bold sharing of chemical probes (not drugs or their analogs) and were not proposing the sharing of the full research cycle that led to them. [6] We were proposing something different: secrecy-free creation of new medicines.
To minimize confusion, it was important to capture the core principles of OSM in a way that would be simple to understand for potential incomers and to ensure those joining in knew the level of mischief they would be getting into. So on July 25th 2011, led by what I felt to be the most important lessons learned from the WHO project, I wrote down the 'Six Laws' of OSM. [7] The list always brings out the most phone cameras when I speak about open source drug discovery, an observation that has led to this article. What I find remarkable is how well these Laws have stood the test of time, holding essentially unaltered for 8 years and guiding the involvement of, to date, over 300 people on the four OSM campaigns. The Laws can apply equally well to the next campaigns that I, or others, create under OSM's banner or to similar projects in other areas of drug discovery that people might want to run. [8] The Laws are intended to guide behaviour, to free people to act to the best of their ability within a framework that promotes a distinctive discovery process. They do not hasp and hoop the contributors, and can be changed in the future if they are found to be faulty.
The first three Laws clarify day-to-day operations. The others are more subtle, big-picture concepts.

1) All Data and Ideas Are Freely Shared.
This is a deceptively short Law that we ought to unpick if we are not to miss some essential features. We might call it "The Condensed" law. When experiments are performed, those need to be recorded in a laboratory notebook (obviously), and that notebook (wherever it is) needs to be available to read in its entirety. This means it needs to be online and, to be of any practical use, not behind a password wall. It is remarkable how many academic scientists are still using paper lab books that reside on desks, as if in homage to Leonardo da Vinci. We have not the space to cover the subject of notebooks here (typically "Electronic Laboratory Notebooks" or ELNs), [9] but given people's propensity to have opinions, we cannot prescribe a particular ELN (which would be a distraction) and should instead focus on the core FAIR principles of findability, accessibility, interoperability, and reusability. [10] It is important that there is an electronic record and that we can find it. An open source ELN would be desirable. Some exist, but they are not perfect, or well-supported. [11] It is not as though a commercial ELN is a derogation of the first law, if that solution allows all the data to be seen, exported and re-used (so, open file formats are good). We should just find a solution that works for the scientist or the team and ensure the contents are openly available. In OSM there are many examples that aim for this; [12] some entries may fail to live up to all the formal requirements, and we can only try.
The lab notebook needs to contain all the project data. That means the TLCs, the NMR data (i.e., the file, not just a PDF), everything. You may say "but who will ever read such a thing?" to which the answer is "who ever reads a lab notebook cover to cover?" There must be a primary repository of information from which everything else comes. Without full disclosure we break an important line of trust between participants.
The word "ideas" is included in the Law since it is important not only what has been done, but what is going to be done. Naturally, ideas may be better placed in a location more suited to focused discussion of objectives than an ELN. For some time we struggled with this, and have found an efficient solution on a widely-used software development platform called GitHub, which has an "issue tracker" function (essentially a place to discuss, and resolve, smaller problems). [13] Ideas can be mooted, discussed, opened and closed, assigned to individuals for action, and incorporated into other discussions. Conclusions and decisions can be folded into a summary wiki describing where the project is up to. Thus forward planning is essential, but needs functionality not present in ELNs.
The words "freely shared" are included because people should not feel inhibited in sharing everything they can. But there is the other, more substantial, meaning of "free" which brings up the subject of licences (the word "free" has a particularly complicated nuance in software). [14] When things are shared, what are the terms? There are many licences for open source software projects, none of which quite work for us because drug discovery involves tangible objects, ideas, data and assay platforms. To allow for an easily-understandable structure in OSM we adopted the Creative Commons CC-BY licence (used by Wikipedia) meaning you can use anything you want in the project, including for the purposes of making money, provided you cite the project. [15] This works as an interim, until we come up with something more robust. Note that there is no "viral" aspect to the licence: if you use something in OSM, you need not share your work under the same terms. While this would be fundamentally desirable, and is clearly implied by participation in the first place, obligations of this sort might prevent people getting involved for no other reason than people are cautious about constraining their future. So, nothing viral.
The First Law establishes the day-to-day way of working. Conduct experiments, keep a comprehensive lab notebook that people can see, and which contains all your data, share it with a clear (e.g., CC-BY-style) licence, and be sure to use your conclusions to share your ideas of what should be done next.

2) Anyone May Participate at Any Level.
Science is a team sport. There is no point in doing it unless you want to work with others to get things done more quickly, and there is no point in working open source unless you want to work with intelligent strangers. If you want to learn how to play with Lego, or you want to build something big with it, tip the box on the floor and make room for others. If you are going to define at the outset who can participate, then you are looking for a traditional collaboration. If you do not know who might be the best person to work with, go open source. So: allow anyone to participate, and allow them to do anything they want. This second Law, then, is the combined "Freedom and Low-hierarchy" law.
The words "any level" remind us that people might want to do many different things, from commenting on data to designing a synthesis to actually carrying out an experiment through to starting a whole set of experiments themselves. All of these things have happened in OSM. Senior pharma professionals have run experiments. Cohorts of school children have innovated. It is almost always the case that people have contacted me or others first to ask, in essence, "are you sure, I mean I can just go ahead and do that?" to which the answer is always "yes." But the words "any level" also refer to the freedom to work across institutional boundaries, or to act in a way that subverts how one may be seen by others. Undergraduates can (and have) debated points of substance with senior professors without rank becoming an issue. Unlike much of the execrable online conversation we may see in our daily lives, the interactions online in a focused science project are refreshingly productive.
The second law means that we are intentionally setting up an arena of ideas and expertise, and entry is unrestricted. The emphasis on inclusiveness is a useful reminder that open source drug discovery is not "anti-pharma" and indeed has thrived on the expert contributions of many talented scientists from the private sector.

3) There Will Be No Patents.
People with experience of the patent system, upon reading the no-patent Law, will remind me that a patent is an intentionally open statement of achievement, designed to be provided in sufficient detail that someone else may replicate the work. This public statement is made in return for the protections a patent affords the inventor to nurture the commercial development of the invention. Permit me to put a temporary pin in the observation that the system sits on a firm bedrock of protections given to inventors by the state.
People with experience of the patent system are also aware that in many cases a patent provides not a clear declaration of achievement but rather broad legal claims that border on scientific obfuscation. Patents are frustrating to read and reading a new one is the triumph of hope over experience.
The 3rd Law is intended partly for clarity: patents are out. The law exists not because of the faults with patents but because of the secrecy they require. The required secrecy poisons the effectiveness of the research that is upstream, depriving it of the efficiencies that one might gain through openness. So one cannot adopt the workaround (suggested by many interested in the aim of affordable medicines) of patenting and then licensing out the results. This is a perfectly reasonable thing to do, but it is not compatible with an unrestricted open source R&D community.
This 3rd Law takes us towards a bigger picture view of matters since it begs the question: if you don't have patents, how are you going to take a drug through to market? Perhaps in some cases this is a moot point, if for example there is not going to be a realistic market or there is a structural problem with obtaining a financial return (as currently plagues the field of drugs to counter antimicrobial resistance). The answer to the question obviously depends on the nature of the relevant disease, but the question is a major one since an effective answer to it would subvert the status quo of essentially all the drug development work currently going on around the world. The answer is complicated and being discussed in many places, with many possible solutions. [16] Space here prevents us from reviewing them all, but I would like to highlight just one that is of topical interest.
Some time ago I was speculating on the core challenge of commercial development of open work: that one could not "protect" an idea that was already in the public domain. [17] I was struck by how appropriate it would be that one should be able to demonstrate that an invention works in the field and then be rewarded somehow, after the fact. Particularly if the invention were developed openly, allowing others to benefit from the details of the research along the way. Clearly for such a thing to work economically (i.e., to recoup expenses or reimburse any investors) the inventor would need a form of temporary protection that triggers when a certain point is reached. It turns out I have family history associated with this idea arising from an invention during the industrial revolution, as I have described elsewhere, [18] and there are examples of related ideas, [19] but to translate all this into drug discovery terms one would need, not a patent, but a different temporary exclusivity granted by the state that allows some level of cost recovery (I am retrieving my earlier pin at this point). This might sound outlandish to those focused too much on patents, but it should not because it exists already, in the form of various flavours of regulatory data exclusivity available to drug inventors and intended to protect them in just this way. The idea has been mooted as compatible with open source approaches [20] and has now been excitingly instantiated in a real company using this idea in its operations, M4K Pharma. [21] Open research leading to an engaged community leveraging guarantees arising from existing regulatory arrangements as the business model. This is an exciting idea that we can try out in the coming years, and one that anyone should welcome into the chocolate box of competing ideas that we surely need if we are serious about trying genuinely new things.

4) Suggestions Are the Best Form of Criticism.
This is the "no asshole" rule that simply reminds us all to be constructive when being critical. Open source requires a potentially confronting playing field of competence in which it is perfectly possible for an undergraduate to correct a seasoned academic in a permanent public record. This structure needs to be embraced, but the publicly-viewable nature of the record of work is one of the concerns I hear from senior academic colleagues: "What if we make a mistake?" There are two parts to such an objection:
a) Data could be wrong, and people may waste their time. As an objection to openness, this is mostly a phantom worry. Data can always be wrong, and people can always be led down unproductive paths. We are all aware of clear, recent examples that make us worried about reproducibility in the academic literature. [22] An advantage of an open lab notebook is that the level of uncertainty can be laid out clearly (if, for example, a fashionable positive result has been obtained against a background of nine unfashionable negatives), and data can be surfaced with a label saying "PRELIMINARY DATA (DRAGONS)" which is not a level of disclosure one frequently sees in high impact factor journals.
b) Data could be wrong, and we may look foolish. This is possible, but I suspect this, too, is a phantom. If doing science teaches us anything it is to be humble before the experiment. As humans we try to see truth by looking at the error-prone flickering images on the wall of the cave. It is natural to make mistakes, and good scientists forgive the mistakes of others. Yet at the same time, the need to work openly and in real time requires the best of us: the best-kept record, the most careful conclusions, the best-prepared arguments, much as we vacuum and tidy before people come round for dinner. To do this "live" (to manage complexity, interconnected ideas, and hypotheses "live and online") is something that is thrilling to do and, I hope, for the public to watch. It captures just the sorts of real ups and downs of science that we see in the best science dramas and does not sanitize the details in the way that must happen for much of the science communication we might witness on the TV, for example. We learn quickly from our own mistakes and those of others, and it is a shame if pride inhibits this.
Error is inevitable. We need to encourage the brightest among us. These twin facts necessitate our being able to criticize others in public. But one never knows what one's fellow scientists are dealing with in their offline lives. Law 4 says "Be kind, always".

5) Public Discussion Is More Valuable than Private Email.
A simple "no email" rule. Email is bilateral, or multilateral, but not usually public. If we want to ensure there are no "insiders" we cannot use it. This is again a law that people can find difficult at first. Frequently, initial contributors will email in ideas or data, and there is a cost of translating these into the public domain, with permission and maybe redaction. As soon as people are comfortable with writing publicly, that cost vanishes. In fact, the GitHub issue tracker (mentioned above) has a nice feature in that it is compatible with email (with an associated but acceptable risk that people may write a private note that appears on the public site). But we do not yet have a perfect solution to light online conversations. Some GitHub threads run to 100 comments or more and become a little cumbersome if nobody digests them into a smaller number of active tasks or forks them to fresh conversations, but these problems beset email chains too. People are reticent to write trivially small notes in public, but there are platforms that might suit (such as Slack and Reddit), and the reticence may anyway be a generational issue. But ensuring key conversations are kept relevant and not swamped by contributions can only really be solved by active research coordination, and it is better for such coordination to be in public (rather than email) so that we ensure everyone is up to speed.

6) An Open Project Is Bigger Than, and Is Not Owned by, Any Given Lab.
The "Under a Bus" Law. If a project leader ceases to be the leader (encounter with a bus, or, worse, disinterest), it must be possible to continue the research. This law is therefore a quality-control reminder of the need for Law 1: that all data and ideas need to have been shared for a seamless continuance to be possible. A project's leader is leader by virtue of their behaviour, not by virtue of a name tag (I am reminded of the tragicomic award of the Sheriff badge to Billy Curtis' character Mordecai in the Clint Eastwood movie High Plains Drifter). Leadership in open source projects (the role of the "benign dictator") is a fascinating subject in its own right, but the Law here acts to remind us that the project and its outcomes are king. In the first iteration of this Law there were the additional words "The Aim is to Find a Drug for Malaria as Quickly as Possible," to remind us that all these laws and considerations were there in order to help us leverage the power of a new way of working in order to develop a medicine to help people enjoy their lives. If I decide that a certain target or series of molecules is no good then I need to leave open the option that someone else disagrees and needs to be able to build on what I have done, unfettered by my biases and my hoarding of any data.
This Law is also making a subtle point about branding: that if one wishes people to get together to work on something voluntarily then one needs to minimize ownership. It is not about the person, but the project. Wikipedia would unquestionably have been less successful if it had been called "Jimmy Wales' Encyclopaedia". The irony of my publishing this paper is not unappreciated, but is countered by Law 2, that anyone can take part at any level.
The Law is also about immortality. Ask a coder how to make their code immortal. Back it up to 100 private servers? Place it, etched, in a tomb? No, the answer often given is to open it up. If you need to stop, make sure that what you have done need never be done again, and that it is clear how someone else can build on it later. To achieve this requires digital infrastructure with a commitment to a level of permanence that we are increasingly coming to rely upon (e.g. PubChem, arXiv). This makes one think of all the drug discovery campaigns that have been pursued in academia and industry that have either not been published at all, or published incompletely, or stopped for strategic rather than scientific reasons and then not shared. It is unsettling to consider all the work we might inadvertently repeat. To the public it ought to be a scandalous waste of resources. This is not to criticize the scientists who stopped projects on solid grounds, but merely to rail against the assumption that nobody else knows better. There are a lot of flags planted in sand that serve only to mark unpublished data where there should instead be trees of data awaiting gardeners.
If the laws resonate with you, then jump in.

Matthew H. Todd
Chair of Drug Discovery

University College London
Acknowledgement
I thank Drs. Alice Motion, Chris Swain and Lindi Todd for fruitful discussions. I also thank all the many and talented contributors to OSM and related initiatives for boldly road-testing open source drug discovery and, thereby, helping to create the six laws.