The Si elegans project at the interface of experimental and computational Caenorhabditis elegans neurobiology and behavior

Objective. In light of recent progress in mapping neural function to behavior, we briefly and selectively review past and present endeavors to reveal and reconstruct nervous system function in Caenorhabditis elegans through simulation. Approach. Rather than presenting an all-encompassing review on the mathematical modeling of C. elegans, this contribution collects snapshots of pathfinding key works and emerging technologies that recent single- and multi-center simulation initiatives are building on. We thereby point out a few general limitations and problems that these undertakings are faced with and discuss how these may be addressed and overcome. Main results. Lessons learned from past and current computational approaches to deciphering and reconstructing information flow in the C. elegans nervous system corroborate the need for refining neural response models and linking them to intra- and extra-environmental interactions to better reflect and understand the actual biological, biochemical and biophysical events that lead to behavior. Together with single-center research efforts, the Si elegans and OpenWorm projects aim at providing the required, in some cases complementary, tools for different hardware architectures to support progress in this direction. Significance. Despite its seeming simplicity, the nervous system of the hermaphroditic nematode C. elegans with just 302 neurons gives rise to a rich behavioral repertoire. Besides controlling vital functions (feeding, defecation, reproduction), it encodes different stimulus-induced as well as autonomous locomotion modalities (crawling, swimming and jumping). Because of this dichotomy between system simplicity and behavioral complexity, C. elegans has challenged neurobiologists and computational scientists alike.
Understanding the underlying mechanisms that lead to a context-modulated functionality of individual neurons would not only advance our knowledge on nervous system function and its failure in pathological states, but have directly exploitable benefits for robotics and the engineering of brain-mimetic computational architectures that are orthogonal to current von-Neumann-type machines.


Introduction
Nervous systems are a special invention of nature for every multicellular organism 'on the move'. Living on a different timescale with other prey and defense mechanisms than hunt and escape, 'rooted' organisms such as plants, fungi or sponges process stimuli through a far less complex network of less specialized information processing cells compared to neurons in nervous systems (Desalle and Tattersall 2012). How complex does a nervous system need to become to let a multitude of adjustable stimulus-response actions (usually referred to as behavioral output) emerge? The nematode Caenorhabditis elegans (C. elegans) provides nature's answer.
C. elegans, a tiny roundworm (length: 1 mm, diameter: 80 μm) with a life span of a few weeks, is among the five best characterized organisms in nature (Epstein and Shakes 1995). With less than 1% of its population being males, the nematode proliferates predominantly as a quasi-clone through hermaphrodites. These consist of exactly 959 cells, including 95 body wall muscle cells and 302 neurons that fall into 118 classes (WormBook 2016, Altun and Hall 2016). Its nervous system has been almost completely mapped by electron microscopy (White et al 1986). It was also the first multicellular animal whose genome was completely sequenced (C. elegans Sequencing Consortium 1998). The nematode's behavior and its underlying operation principles are the subject of numerous past and ongoing studies (Bono and Villu Maricq 2005, Corsi et al 2015), which have led to an extensive body of knowledge on this creature. At the end of the last century, this inspired biologists and neurocomputational researchers to simulate not only the C. elegans nervous system, but the organism and its development in its entirety. We will point out some of the key works in this field and then focus on the scope and status of two concerted simulation initiatives, the OpenWorm and the Si elegans projects.

C. elegans as a model organism in neural computation
With the advent of sufficiently powerful computational resources in the 1980s, researchers began to use computers for the simulation of all kinds of natural phenomena, among them the events in nervous systems. Modeling neural systems has diverse roots and inspirations, most of them being inductively derived from first principles (e.g., Hodgkin and Huxley 1952) or deduced from direct observation. The nematode C. elegans was considered an ideal system to start with. Besides the pharyngeal system (Bhatla et al 2015), the most accessible circuit in C. elegans is its body wall muscle control system responsible for locomotion (Gjorgjieva et al 2014). It consists of 75 motor neurons (out of a total number of 113 motor neurons) of 8 classes that innervate 79 (out of 95) body wall muscle cells arranged posterior to the head along the dorsal and ventral cords (Riddle et al 1997, Gjorgjieva et al 2014, Zhen and Samuel 2015, Altun and Hall 2016). Although the nature of the driving neural signals is still under debate (electrotonic or graded-regenerative potentials versus self-terminating action potentials (Lockery and)), its output can be visualized rather easily and is thus verifiable and quantifiable by direct comparison with time-lapse recordings of worm postures (Stephens et al 2008) and movements. Recent advances in designing genetically encoded optogenetic switches (Xu and Kim 2011) and fast-response calcium indicators (Gottschalk 2014) paired with real-time tracking (Faumont et al 2011), imaging techniques (Moy et al 2015) and laser ablation experiments (Gray et al 2005, Rakowski et al 2013) allow the association of postural changes with activity patterns and signal flow (Nguyen et al 2016, Venkatachalam et al 2016). Therefore, the majority of publications on simulating C. elegans focus on various aspects of the sensory-motor loop (Lockery 2011, Cohen and Sanders 2014, Gjorgjieva et al 2014, Zhen and Samuel 2015) and its driving inputs (Schafer 2015).
Most of these studies were inspired by the seminal work of Niebur and Erdös, who, based on reported behavioral descriptions of the nematode's locomotion, laid the groundwork for the mathematical formalization of C. elegans nervous system function by taking body physics, environmental and biomechanical properties and the resulting interaction forces into account (Niebur and Erdös 1991, Niebur and Erdös 1993). In 1992, a book entitled 'AY's Neuroanatomy of C. elegans for Computation' was published that provided a neural circuitry database of the nematode, compiled from the literature available at that time (Achacoso and Yamamoto 1992). Several mathematical tools in the form of BASIC and FORTRAN programs were made available that allowed the visualization and manipulation of morphological and circuit data to reveal correlations and functional roles. Despite the wealth of supplied data and the inspiring examples, the book was cited more often as a general reference for small network architectures than as a basis for actual C. elegans network analysis studies. Since then, diverse strategies for describing the neural events that lead to postural change have been proposed, of which only a few are mentioned here.
Among them are event-driven models consisting of an asynchronous system based on pulse modulation (Claverol et al 1999); compartmental conductance-based models exclusively for muscle cells (Boyle and Cohen 2008); a central pattern generator that drives the forward movement of a physics-based rigid body representation of the nematode (Mailler et al 2010), inspired by Niebur and Erdös (1991); neuromuscular control systems that rely on a sensory feedback mechanism based on bistable dynamics without the need for a modulatory mechanism except for a proprioceptive response to the physical environment (Boyle et al 2012, Williamson 2012); dynamic neural networks based on a differential evolution algorithm in the head and body with a central pattern generator in between, acting on a locomotion model with 12 multi-joint rigid links (Deng and Xu 2014); evolutionary algorithms for the identification of a minimal klinotaxis network (Izquierdo and Beer 2013); and genetic algorithms to train 3680 synaptic weights within the motor connectome to replicate behaviors based on sensory-motor sequences (Portegys 2015), as recently reviewed by Gjorgjieva et al (2014) and Izquierdo and Beer (2016). The lack of sufficient electrophysiological and biochemical data continues to fuel the connectome debate on whether the emergence of a certain behavior can be predicted solely from a network analysis (Jabr 2012, Seung 2012) featuring simplistic bistable neurons (Roberts et al 2016) or requires a more detailed description of neural events including other signaling modalities such as neuromodulators (Trojanowski and Raizen 2015) and proprioceptive (Butler et al 2015) or mechanosensory feedback (Bryden and Cohen 2004, Karbowski et al 2008).

Past named projects on simulating C. elegans
In 1997, researchers at the University of Oregon in the USA proposed 'NemaSys'. It aimed at developing a computer simulation environment for C. elegans to support basic research and education in C. elegans and systems computational neuroscience. Due to C. elegans's simplicity, an anatomically detailed model of the entire body and nervous system was perceived as an attainable goal. Over the years, a concerted effort employing electrophysiology, calcium imaging, quantitative behavioral analysis, laser ablation and mathematical modeling led to the identification of the mechanism and simple computational rules by which C. elegans computes the time derivative of chemosensory input (Ferree and Lockery 1999). The results were transcoded into a phototaxis response algorithm to control and analyze the trajectories of a custom-made robot (Morse et al 1998).
In 1998, 'The Perfect C. elegans Project', a collaboration between researchers at Sony, Keio University in Japan and the University of Maryland in the USA, set out to introduce synthetic models of C. elegans to further enhance our understanding of the underlying principles of its development and behavior, and of life in general. Initial efforts focused on a realistic simulation of a subset of biological observables by providing a Java-based visualization tool for embryogenesis including cell position, kinematic interactions between cells, cell division, cell fate, neural connections and thermotaxis. Ultimately, a complete synthetic model of the nematode's cellular structure and function, including genetic interactions, was envisioned. The concepts and first steps were outlined in an initial report (Kitano et al 1998), but were not followed up on.
In 2004, researchers at the Hiroshima and Osaka universities in Japan aimed at developing a virtual C. elegans in the 'Virtual C. elegans Project'. Based on data on the spatial and structural layout of the nematode, they proposed a dynamic body model with muscles to analyze motor control. It was founded on a neural oscillator circuit to generate rhythmic movement. It could be shown that the model qualitatively generates rhythmic movements similar to those of wildtype and mutant nematodes. Another demonstration was a real-coded genetic algorithm to drive a kinematic locomotion model that responded to gentle-touch stimuli (Suzuki et al 2005a, 2005b).

Recent concerted project initiatives on simulating C. elegans
Emerging from the 'CyberElegans' project, the 'OpenWorm' project (USA, 2011-present; www.openworm.org) is an international open science project to simulate C. elegans from the cellular level upwards on standard and graphical processing unit (GPU)-enhanced computers via OpenCL. The long-term goal is to provide a full simulation of the C. elegans hermaphrodite. The first target is a description of the worm's locomotion by simulating the 302 neurons and 95 body wall muscle cells (Szigeti et al 2014). Among the currently available modules is a realistic flexible worm body model including the muscular system and a partially implemented ventral neural cord (Palyanov et al 2011, Openworm Browser 2014). It is based on a merged and extended connectome dataset, which is similar to a system of spherical particles of different sizes that was reported to model both the nematode and its environment during movement and feeding behavior (Rönkkö and Wong 2008). Currently available open-source resources include the OpenWorm browser, the NeuroML C. elegans connectome, Sibernetic and 'Geppetto' (Geppetto Contributors 2016), a web-based multi-algorithm, multi-scale simulation platform for simulating complex biological systems and their surrounding environment (Openworm Community 2016).
Around the same time, NEMALOAD ('nematode upload'; USA, 2012-2014; github.com/nemaload) initiated the integration of a number of recent experimental imaging technologies (Marblestone et al 2013, Schrödel et al 2013) to learn how one neuron affects another in C. elegans. The project was structured in four consecutive stages that were supposed to build on one another. In the molecular biology stage, C. elegans strains were to be functionalized with optogenetically encoded sensors and actuators (e.g., calcium indicators, photo-stimulators and inhibitors) for the tracing and manipulation of neural activity. In the imaging stage, this activity flow was to be recorded in freely behaving worms at neuronal resolution. In the perturbation stage, individual neurons were to be excited optically by means of a custom-made two-photon digital holography system to map their contributions to a certain behavior. In the final modeling stage, automation tools for the correlation of neural activity with behavior were to enable the development of a dynamic model of the worm's behavior in a simulated environment that mirrors the experimentally observed behavior in its natural or laboratory environment. This was expected to elucidate the underlying information processing structure. In 2014, these activities merged with the OpenWorm project.
The most recent concerted effort in emulating C. elegans is the Si elegans project (EU, 2013-present; www.si-elegans.eu). It aims at providing a closed-loop, open-source, peer-contribution platform based on brain-mimetic principles for the emulation and reverse-engineering of C. elegans nervous system function in a behavioral context. Si elegans was motivated by the lack of a holistic closed-loop simulation environment, where neural events can be linked to as well as altered by their behavioral outcome. In this, the overall objectives are very similar to previous and ongoing endeavors. The chosen approach is slightly different, though. The emulated nervous system consists of a dedicated hardware infrastructure that, unlike software implementations, permits true parallelism in the intra-neural as well as inter-neural signal processing. It is based on 329 field-programmable gate arrays (FPGAs), whose circuit architecture is parallel by design. Unlike functionally pre-defined neuromorphic computing systems (Furber 2016) based on very large-scale integrated circuit technology (Mead and Conway 1980), which, surprisingly, has not yet been exploited in the context of emulating C. elegans nervous system function, FPGAs are freely reconfigurable circuit fabrics that can accommodate distinct neural response models, either just one for each C. elegans neuron (requiring 302 FPGAs) or several at a time. Similarly, FPGAs can carry one or several other models that interact with neurons, such as models of downstream muscle response (e.g., 27 FPGAs sharing up to 6 muscle models each to emulate the 95 C. elegans striated body wall muscles and 60 non-striated muscles) and algorithms of subsequent body physics. These circuit-embedded response models may be dynamic and context-aware (Machado et al 2014, 2015, 2016) and thus evolve over time.
This adaptation is not restricted to simply adjusting, e.g., synaptic weights, but may allow the model to respond differently as a function of the (sensory) signal type and origin, environmental conditions (e.g., temperature) and their history, or of the local level of 'neuromodulatory biochemical background' at a given time. In view of the high number (359 200) of adaptive logic modules of the chosen FPGAs (Altera Stratix V GX), models are thus allowed to include aspects that are often ignored in computational neuroscience. For instance, the complexity of the dendritic tree suggests its involvement in the computational pre-processing of incoming signals such as their temporal filtering and amplitude modulation and its effect on altering synaptic properties (Smith 2010). Likewise, electrical junctions account for about 9% (recent work suggests up to 48% (Hall 2016)) of the overall interconnectivity between neurons, thereby constituting signal propagation pathways and, owing to their permeability to small molecules, modulation pathways that are alternative to synaptic transmission.
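The FPGA allocation quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming one neuron model per FPGA and assuming that the 27 muscle FPGAs serve the striated and non-striated muscles together; the constant and function names are hypothetical and are not taken from the Si elegans code base.

```python
# Figures quoted in the text; all names here are illustrative only.
NEURONS = 302
STRIATED_BODY_WALL_MUSCLES = 95
NON_STRIATED_MUSCLES = 60
MUSCLE_FPGAS = 27
MAX_MUSCLE_MODELS_PER_FPGA = 6


def fpga_budget():
    """Return the FPGA count and muscle-model headroom implied by the text."""
    neuron_fpgas = NEURONS  # assumption: exactly one neuron model per FPGA
    muscles = STRIATED_BODY_WALL_MUSCLES + NON_STRIATED_MUSCLES
    muscle_capacity = MUSCLE_FPGAS * MAX_MUSCLE_MODELS_PER_FPGA
    return {
        "total_fpgas": neuron_fpgas + MUSCLE_FPGAS,
        "muscles": muscles,
        "muscle_capacity": muscle_capacity,
        "spare_muscle_slots": muscle_capacity - muscles,
    }


if __name__ == "__main__":
    print(fpga_budget())
```

Under these assumptions, the 302 neuron FPGAs and 27 muscle FPGAs add up to the 329 devices mentioned in the text, and the muscle FPGAs offer slightly more model slots than there are muscles.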
Originally, the nematode's entire connectome was to be implemented by a static light-projection scheme to ensure interference-free, parallel synaptic information transfer with high temporal fidelity (Petrushin et al 2014, Ferrara et al 2016). The axonal output of each FPGA neuron would have triggered a light-emitting diode, whose light was projected through a patterned mask onto 'synaptic' photodetectors of only those postsynaptic neurons that the respective presynaptic neuron connected to. This required each of the 302 light sources to carry a different neuron-specific projection mask. The final system features a dynamic version of such an opto-electrical connectome based on digital light processing technology. The reconfigurable digital micromirror devices substitute the static projection masks to allow for exploring the impact of changes in neural interconnectivity on neural information processing. Due to its complexity and costs, it is currently restricted to the synaptic signal transmission between the 20 neurons of the pharyngeal sub-network (footnote 1) as a proof-of-concept implementation. The remaining 279 neurons (footnote 2) exchange synaptic and gap-junction information through an Ethernet backbone. To nevertheless ensure the temporal parallelism inherent to biological networks and events, the hardware-based network will operate on a central clock (50 MHz). Some neural operations will require more FPGA 'hardware clock' cycles than others. At the cost of real-time operation, the supervising FPGA-based controller will thus ensure that all model operations of all neurons, including the inter-neural signal transmission, within a 'biological clock' cycle are completed before a new one starts. Any delays related to different lengths of the axonal arbor or to synaptic properties can be incorporated in the respective neural models on the individual FPGAs.
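The lockstep scheme described above can be illustrated in software. The sketch below is a toy model under stated assumptions, not the Si elegans controller: each neuron model reads only the states of the previous 'biological clock' cycle, and the cycle lasts as long as the slowest model needs at the 50 MHz hardware clock. The neuron names, update rules and cycle counts are invented for illustration.

```python
HARDWARE_CLOCK_HZ = 50_000_000  # central clock frequency quoted in the text


def lockstep_update(states, models, cycle_costs):
    """Advance all neurons by one biological clock cycle.

    states:      dict name -> value at the end of the previous cycle
    models:      dict name -> callable(previous_states) -> new value
    cycle_costs: dict name -> hardware clock cycles the model needs (assumed)

    Returns (new_states, elapsed_seconds). No model sees a partially
    updated network, and the cycle ends only when the slowest model is done.
    """
    snapshot = dict(states)  # frozen view of the previous cycle
    new_states = {name: model(snapshot) for name, model in models.items()}
    slowest = max(cycle_costs[name] for name in models)
    return new_states, slowest / HARDWARE_CLOCK_HZ


if __name__ == "__main__":
    # Two hypothetical neurons that simply copy each other's previous state.
    states = {"a": 1.0, "b": 0.0}
    models = {"a": lambda s: s["b"], "b": lambda s: s["a"]}
    costs = {"a": 100, "b": 500}  # made-up hardware cycle counts
    states, elapsed = lockstep_update(states, models, costs)
    print(states, elapsed)
```

Because both toy neurons read the frozen snapshot, their values are swapped rather than overwritten in sequence, which is the software analogue of completing all operations of one biological cycle before the next one starts.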
Inspired by previous work on a closed-loop simulation framework for body, muscles and neurons (Voegtlin 2011), this biomimetic hardware nervous system emulation controls a virtually embodied and physically realistic representation of the nematode (via soft-body physics) in an equally realistic three-dimensional virtual behavioral arena (e.g., an agar Petri dish) (Mujika et al 2014, 2016). There, the virtual C. elegans will encounter commonly tested stimuli (e.g., touch, chemicals, electric fields, light and/or temperature gradients) at any pre-defined time. These, together with characteristics of the environment (e.g., the shape of the plate, substrate properties) and the initial position and orientation of the nematode, can be batch-defined in a dedicated behavioral experiment configuration interface. The definition parameters are translated into an editable extensible markup language (XML) schema. During an experiment, the sensory experience is transmitted to the sensory neurons in the FPGA network. Based on published knowledge on network-internal circuitry and signal processing pathways, the sensory input (and proprioceptive information) will generate a motor output to instruct the muscles of the virtual worm on what to do next. In this closed-loop scenario, it will furthermore be possible to read out any network state (e.g., synaptic weights) at any given time for the reverse-engineering of network function. The simulation results, both the neuron variable traces and the body motion, can be visualized and downloaded after the simulation is over. To make the Si elegans framework user-friendly for novice and expert users alike, several model generation (e.g., drag-and-drop) and import (e.g., from existing simulation engines) functionalities are provided (Krewer et al 2014, Morgan et al 2015). The current model design is based on the Low Entropy Model Specification (LEMS) language. In a neural network configuration graphical user interface, the user places neuron and synapse models in a graphically represented C. elegans connectome and can parametrize specific neuron models. Once the chosen models generate a behavioral output that is comparable to observations in real laboratory experiments, the platform will allow the neuroscience community to better understand, if not anticipate, the neural mechanisms that underlie behavior. The open-source, peer-contribution Si elegans platform is publicly accessible through platform.sielegans.eu. Its early implementation and functionality may be compared with personal computers (PCs) in the 1970s: just like the PC hardware and its basic operating system at that time, the Si elegans platform provides a basic computational framework to model C. elegans nervous system function and observe the generated behavioral output. Its usefulness in predicting neural function to reproduce a certain behavior will therefore strongly depend on its adoption and on contributions by both the biological and neurocomputational communities.

Footnote 1. The pharyngeal network is thought to be connected to the main network by only a single gap junction between I1 and RIP.
Footnote 2. CANL/R were excluded since they have no obvious synapses. VC6 was omitted as well since it only makes one neuromuscular junction.
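The closed loop described above (environment to sensory input, network to motor output, body back to environment, with traces recorded for later inspection) can be sketched as a deliberately simple toy. Everything in this example is an assumption made for illustration: the one-dimensional arena, the proportional 'network' and all function names are hypothetical stand-ins and do not reflect the Si elegans platform or its interfaces.

```python
def sense(position, target):
    """Toy environment: the stimulus is the signed distance to a target."""
    return target - position


def network_step(stimulus, gain):
    """Stand-in for the emulated nervous system: proportional motor command."""
    return gain * stimulus


def body_step(position, motor_command):
    """Stand-in for body physics: the command displaces the worm directly."""
    return position + motor_command


def run_closed_loop(steps, position=0.0, target=10.0, gain=0.5):
    """Run the loop and record (step, stimulus, motor, position) traces."""
    trace = []
    for step in range(steps):
        stimulus = sense(position, target)        # environment -> sensory neurons
        motor = network_step(stimulus, gain)      # network -> motor output
        position = body_step(position, motor)     # muscles/body -> new position
        trace.append((step, stimulus, motor, position))
    return trace


if __name__ == "__main__":
    for row in run_closed_loop(3):
        print(row)
```

The point of the sketch is the data flow rather than the biology: each iteration feeds the consequence of the previous motor output back into the next sensory input, and the recorded trace plays the role of the downloadable neuron variable and body motion records.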

Discussion and conclusion
Biological nervous systems are robust and highly adaptive information processing entities that surpass current von-Neumann-type computer architectures in almost all aspects of sensory-motor integration. While they are slow and inefficient in the serial processing of stimuli or data chains, they outperform artificial computational systems in seemingly ordinary pattern recognition, orientation or navigation tasks due to their parallel and multifactorial information processing capabilities. In terms of number of neurons and interneural connections, C. elegans is the most prominent and astonishing example of how a minimalistic nervous system can process a multitude of different stimuli and sustain a diverse repertoire of behavioral outcomes. This apparent discrepancy suggests that other mechanisms must be involved, which render this nervous system computationally more powerful than a mere assembly of stereotypic computational units. Indeed, despite its seeming simplicity, C. elegans is keeping the third generation of biologists, neural engineers and computational neuroscientists busy in elucidating the underlying principles of how genes translate into nervous system function and a certain behavioral phenotype.
When Sydney Brenner proposed C. elegans as a model organism to the Medical Research Council in the UK in 1963, he stated that 'We intend to identify every cell in the worm and trace lineages' (Brenner 1963). While this goal has been achieved, it became clear that this information is insufficient to deduce the cells' contributions to behavior. Several key questions are still unanswered. One of them is to what level of detail a simulation has to drill down to let realistic behavior emerge; we currently lack the biological knowledge to decide this. Will we need to uncover and formalize the entirety of the molecular machineries that underpin worm biology or will a more abstracted, thermodynamics-inspired description faithfully elicit the observed behavior in silico? Although we know most of the neurons' roles and purposes (e.g., sensory, inter, motor, projection, local/solitary), little is still known about their identity (excitatory or inhibitory) and the relevance of the individual connections (including gap junctions). Furthermore, evidence suggests the existence of parallel, sometimes opposing (inhibitory versus excitatory) circuits. Similarly challenging are circuits that diverge from a common starting point to different endpoints. In addition, the neural dynamics of different neurons are not uniform and even vary between individuals. Moreover, they may be modulated by extrasynaptic neural activation mechanisms including diffusible biochemical regulators (e.g., neuromodulators) or physical parameters (e.g., temperature, proprioception) (Bargmann and Marder 2013). These, in turn, may vary with internal states (e.g., starved versus satiated) and the environmental conditions. On top of that, synapses are constantly remodeled not only in response to behavioral experience, but in a context-sensitive and time- or activity-dependent manner on the timescale of milliseconds to weeks (Friston 2011). Thus, the neural circuit of C. elegans, despite its quasi-static wiring diagram, features many dynamic and difficult-to-capture mechanisms that encode different behavioral outcomes.
Due to this complexity and the many unknowns, any simulation approach is almost doomed to start with naïve and oversimplified assumptions. No matter how a simulation framework is conceptualized, the above findings strongly suggest keeping it as flexible, extensible and scalable as possible to accommodate new insights into the mechanisms that govern the nervous system function underlying a particular behavioral phenotype. This may include a deviation from standard reasoning: instead of building population- or neuron-specific response models (Marder and Taylor 2011), an even more fine-grained approach may become necessary that provides a variety of adaptive models for one and the same neuron, each responding to context-specific events. For this reason, new computational architectures such as GPUs and FPGAs are being explored to lift the constraints on the required hardware resources. In doing so, the OpenWorm and Si elegans initiatives both aim at providing the required tools in support of answering the question of how the C. elegans nervous system encodes behavior.