NDDN: A Cloud-Based Neuroinformation Database for Developing Neuronal Networks

Electrical activity of developing dissociated neuronal networks is of great significance for understanding the general properties of neural information processing and storage. In addition, the complexity and diversity of network activity patterns make them ideal candidates for developing novel computational models and evaluating algorithms. However, few databases focus on the changing network dynamics during development. Here, we describe the design and implementation of the Neuroinformation Database for Developing Networks (NDDN), a repository for electrophysiological data collected from long-term cultured hippocampal networks. The NDDN contains over 15 terabytes of multielectrode array data consisting of 25,380 items collected from 105 culture batches. Metadata, including culturing and recording information and stimulation/drug application protocols, are linked to each data item. A Matlab toolbox named MEAKit is also provided with the NDDN to ease the analysis of downloaded data items. We expect that the NDDN may contribute to both experimental and computational neuroscience.


Introduction
Spontaneous neural activity plays a critical role in the development and function of the nervous system [1-6]. The diversity and complexity of neural activity patterns are believed to be the key to understanding how a neuronal network operates, whether it is in or out of its normal state [7-16]. Investigations of the characteristics of spontaneous activity, stimulus-evoked responses, and drug-mediated activity are among the fundamentals of neuroscience research. Further, computational biologists and computer scientists are also inspired by the dedicated organization of the structures and functions of neuronal networks [12, 13, 17-20]. Developing neuronal networks have been found to share similar and critical aspects with other biological networks, the globally interconnected computer network (the Internet), social networks, and even the universe [17, 19, 21-26]. Gathering information on the evolving dynamics of developing neuronal networks is therefore attracting increasing attention [11, 27].
As an important in vitro model of the nervous system, dissociated neuronal networks have been used by neuroscientists for more than 30 years. Until the last decade, however, it was difficult to observe network-wide activity in cultured neurons over the long term, or to manipulate the network dynamics, without impairing the health of the culture [11, 20, 28]. Large random neuronal networks grown in vitro on multielectrode arrays have been demonstrated to display general properties of neural systems and enable extensive measurement and manipulation of neuronal dynamics with little interference with the network state [10, 28-30]. However, although many databases have been built to promote genetic/proteomic research and investigations of EEGs and ECGs [31, 32], a comprehensive database focused on collecting and sharing the electrophysiological activity of long-term developing neuronal networks is still lacking. Because such data are scattered across laboratories around the globe, they are difficult and inconvenient to access and use.
In the present study, we describe the construction of the NDDN (Neuroinformation Database for Developing Networks), a new repository for electrophysiological data acquired from long-term cultured hippocampal networks, together with the features of both the NDDN and the data it stores.

Dissociated Hippocampal Cultures.
All experimental procedures used in this study were approved by the Animal Ethics Committee of Huazhong University of Science and Technology. As described previously [8, 33, 34], the hippocampus was quickly extracted from E18-19 Wistar rat embryos and then gently dissected and dissociated with trypsin (Sigma, 10 min at 37°C). The hippocampal neurons were plated onto a culture dish with an embedded multielectrode array (MEA; Ayanda Biosystems SA, Lausanne, Switzerland; Multichannel Systems, Reutlingen, Germany) at a density of 2500 cells/mm² (Figures 1 and 2). To improve cell adhesion, the array was coated with poly-L-lysine before seeding. The culture medium contained 1 ml neurobasal medium (Invitrogen) with B27 supplement (Invitrogen), 10% equine serum (HyClone), and 0.5 mM Glutamax (Invitrogen). Half of the medium was replaced every 2 days. The dishes were kept in a water-jacketed incubator at 37°C and 5% CO2.

Data Collection and Organization.
Raw data were collected using an MEA1060 recording system (Multichannel Systems, Reutlingen, Germany). Each dish contains 59 recording electrodes. Extracellular signals were continuously sampled at 25-50 kHz. We used a threshold method (5 × standard deviation) to convert the raw data stream into a spike train. Raw data were saved in *.mcd format by MC_Rack (Multichannel Systems, Reutlingen, Germany) and could later be converted into hierarchical data format (HDF) for sharing. The *.mcd files can also be read into Matlab through the Neuroshare interface (http://neuroshare.sourceforge.net/) with the provided toolbox. Spike data were saved in *.mat format as Matlab structures. All data files were uploaded to the distributed storage of the server, and the path to each data file in the file system was saved as a property of the corresponding data item in the database.
Metadata (refer to Table 1 for detailed information), which contain the experimental details for each data file, are organized as experimental sets rather than individual items. To facilitate data management, users can tag metadata items with categories; items from the same experiment, or with identical experimental protocols, conditions, and parameters, can therefore be grouped into a working set. If available, microscopy images of the corresponding cultured network were collected and linked to the metadata item.
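The threshold step can be illustrated with a minimal Python sketch. It is not the detection code used for the NDDN: the function name `detect_spikes`, the use of the absolute deviation from the trace mean (so that both polarities are caught), and the rule of keeping only the first sample of each threshold crossing are assumptions made for illustration. Only the factor of 5 × standard deviation comes from the text.

```python
import statistics


def detect_spikes(signal, sample_rate_hz, k=5.0):
    """Return spike times (in seconds) for one electrode trace.

    A spike is registered at the first sample where the absolute deviation
    from the trace mean exceeds k times the standard deviation of the trace.
    Polarity handling and the "first sample of a crossing" rule are
    illustrative assumptions, not the documented NDDN procedure.
    """
    mean = statistics.fmean(signal)
    threshold = k * statistics.pstdev(signal)
    spike_times = []
    above = False
    for index, value in enumerate(signal):
        crossing = abs(value - mean) > threshold
        if crossing and not above:
            spike_times.append(index / sample_rate_hz)
        above = crossing
    return spike_times


# Synthetic trace: 1000 samples at 25 kHz with one large negative deflection.
trace = [0.0] * 1000
trace[500] = -100.0
spikes = detect_spikes(trace, sample_rate_hz=25000)
```

In practice the standard deviation is often estimated from a spike-free baseline segment rather than from the whole trace, because large spikes inflate the estimate; the sketch uses the whole trace only to stay short.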

Database Design and Implementation.
We used the MySQL relational database management system to store and query data. Eleven tables were created to provide indexed and structured data organization, as well as secure data access. The entity-relationship (ER) data model representing the relationships among these tables is shown in Figure 3. The core table is named "item." Each item in the database is identified by a unique ID and has a record in the "item" table, which stores essential information about the item, such as the owner, the type, and the access permission, and can be further linked to detailed experimental descriptions. If the item is a data file, its path is saved as "fileloc" in the item record. If the item is a photo or a piece of code, the data are saved directly in the item table as "data." Related metadata are linked through the table named "Metadata," and the tagging information is stored in the "Tag" table, which is linked to the main item table by the "Map" table.
The NDDN is designed to provide access control for each group of users. Users do not have to save their passwords in the database. Authentication is based on the OpenID framework and is performed by third-party providers, such as Google and Microsoft account services. The returned OpenID identity is saved in the database and later used to identify the user. Therefore, NDDN users can log into the system with their Gmail or Hotmail accounts, or with other credentials that support the OpenID protocol. In the database, each data item has an "owner" property, and access permission may be granted to three classes of users: the item owner, the group to which the owner belongs, and all users. Table 2 shows the UNIX-like permission code string for data items. Additionally, to provide a detailed record of data manipulation, login actions and data access history are saved in the database.
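As a rough illustration of how a UNIX-like permission string could be evaluated for the three user classes, consider the Python sketch below. The exact code format used by the NDDN is defined in Table 2 and is not reproduced here; the nine-character layout (`rw-r-----`, in the style of `ls -l`), the function name `can_access`, and its parameters are all assumptions.

```python
def can_access(code, action, user, user_groups, owner, owner_group):
    """Decide whether `user` may perform `action` on an item.

    `code` is assumed to be a nine-character string such as "rw-r-----":
    three characters each for the owner, the owner's group, and all users.
    As on UNIX, the first class that matches the user decides the outcome.
    """
    if len(code) != 9:
        raise ValueError("expected a nine-character permission string")
    if action not in ("r", "w", "x"):
        raise ValueError("action must be 'r', 'w', or 'x'")
    owner_bits, group_bits, all_bits = code[0:3], code[3:6], code[6:9]
    if user == owner:
        return action in owner_bits
    if owner_group in user_groups:
        return action in group_bits
    return action in all_bits


# Hypothetical item owned by "alice" in group "lab_a".
code = "rw-r-----"
owner_can_write = can_access(code, "w", "alice", {"lab_a"}, "alice", "lab_a")
member_can_read = can_access(code, "r", "bob", {"lab_a"}, "alice", "lab_a")
member_can_write = can_access(code, "w", "bob", {"lab_a"}, "alice", "lab_a")
outsider_can_read = can_access(code, "r", "carol", {"lab_b"}, "alice", "lab_a")
```

Evaluating the most specific class first mirrors UNIX file semantics, so an owner's rights are never widened by more permissive group or public bits.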

Service-Based Web Portal.
The core functions of the NDDN have been implemented as RESTful (representational state transfer) web services in PHP. Following the MVC (model-view-controller) pattern, resources in the database can be reached via URIs, for example, "http://bmp.hust.edu.cn/neuro_db/v2/data/618352/retrieve." Data access may require a token, which is returned after the user has been authenticated. After a data query, structured information is returned in XML format, and the raw/spike data are returned in binary form.
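A client for such a service might assemble requests along the following lines. This is a sketch under assumptions: the URI pattern is taken from the example in the text, but the way the token is transmitted (here a query parameter named `token`) and the XML element names in the sample response are hypothetical and would need to be checked against the actual API list.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://bmp.hust.edu.cn/neuro_db/v2"


def build_retrieve_url(item_id, token=None):
    """Build the retrieve URI for a data item.

    Passing the token as a query parameter is an assumption; the service
    may expect it in a header or in the request body instead.
    """
    url = f"{BASE_URL}/data/{int(item_id)}/retrieve"
    if token:
        url += "?" + urlencode({"token": token})
    return url


def parse_item_xml(xml_text):
    """Flatten a one-level XML record into a dictionary of strings."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


# Hypothetical response body; the real element names are not documented here.
sample_xml = "<item><id>618352</id><type>spike</type></item>"
record = parse_item_xml(sample_xml)
url = build_retrieve_url(618352, token="abc")
```

The sketch only builds the URL and parses a response body; the network call itself (for example with `urllib.request.urlopen`) is omitted so that the example runs without access to the server.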

Customized Scripts and Data Visualization.
The NDDN provides a Python interface implemented through an interprocess communication (IPC) mechanism. With the preinstalled SciPy package (http://www.scipy.org/), users can upload their scripts and perform various computing tasks on NDDN data. For security reasons, access from Python to the file system and other critical system resources is limited. The visualization of NDDN data items was also implemented using SciPy and Matplotlib (http://matplotlib.org/). Graphics are generated at the back end and then displayed at the front end. In the NDDN, all Python scripts, including visualization scripts and user-uploaded algorithms, are also saved as data items with specific item types. Their access permission rules are identical to those of regular items. Recordings taken with stimulation and/or drug application were grouped with the spontaneous recordings made before and after the application, which helps users to perform quantitative analyses of whether or how the network dynamics were affected by a specific protocol.

Results and Discussion
For a database of developing neuronal networks, the diversity of spontaneous recordings across the developmental ages of cultured neuronal networks is an important index. The distribution of the recording ages (days in vitro, DIV) of existing data items is shown in Figure 4 and is also regularly updated on the statistics page of the NDDN. Numerous experiments have been conducted with cultured neuronal networks between 1 and 9 weeks in vitro [5, 7-9, 11, 28, 30]. In the NDDN, most recordings fall within a similar time range, providing a rich repertoire of data resources for neuroinformatics and modeling research. Further, we have data from more than 15 culture batches that lived for over 150 DIV. These data sets of long-term developing networks are believed to benefit the understanding of the evolving dynamics of neuronal networks during in vitro development [5].
Figure 4: The histogram shows the distribution of DIVs of NDDN items (in percentage). Note that most recordings were taken before 100 DIV. Although long-term cultures were rare, the NDDN still has recordings between 200 and 300 DIV. Numbers in black boxes show the actual numbers of items in the corresponding DIV range.
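For readers who want to reproduce a DIV distribution of this kind from downloaded metadata, a minimal Python sketch follows. It assumes that DIV is the number of whole days between the culture date and the recording date (with the plating day counted as DIV 0) and uses a 50-day bin width; both choices, as well as the function names, are illustrative rather than taken from the NDDN.

```python
from collections import Counter
from datetime import date


def days_in_vitro(culture_date, recording_date):
    """DIV as whole days since plating (plating day = DIV 0, by assumption)."""
    return (recording_date - culture_date).days


def div_histogram_percent(divs, bin_width=50):
    """Map the start of each DIV bin to the percentage of items it holds."""
    if not divs:
        return {}
    counts = Counter((d // bin_width) * bin_width for d in divs)
    total = len(divs)
    return {start: 100.0 * n / total for start, n in sorted(counts.items())}


# Made-up DIV values for five items, for demonstration only.
example_divs = [7, 14, 21, 60, 210]
histogram = div_histogram_percent(example_divs)
age = days_in_vitro(date(2015, 1, 1), date(2015, 1, 22))
```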

Website Interface and NDDN Web Services.
Users can access the NDDN at http://bmp.hust.edu.cn/neuro_db/. Pages of introductory materials and related publications can be browsed without login. Access to the data query and download pages requires educational user authentication. Currently, the website is hosted on a university server (Intel Xeon E3 CPUs with 64 gigabytes of RAM) in the China Education and Research Network (CERNET), as required by our university; this may block some international access owing to government policy. We are applying for permission to deploy the database in a public cloud run by a private company. An example page of data query and built-in visualization using the provided Matlab toolbox is shown in Figure 5.

Table 2: Permission codes of data items (listed by role, e.g., owner).
Figure 5: Example query results are shown in the table. The output of built-in visualization functions is shown below. The array-wide spike detection rate is shown in the line graph with red dots. The channel activity heat map is shown next to the line graph.
The NDDN accepts various search conditions: culture date, recording date, culture dish number, DIV, stimulation protocol, drug name, and the operator who conducted the experiment. As mentioned in Methods, tags can be attached to individual data items. Users can put the same tag on all the items in a returned search result, which saves the result for future reuse. For example, after a specific search that refines a previous result or combines multiple conditions, the user can load the result directly by its tag the next time. In addition, users can specify whether their own tags are exposed to all users, which helps to keep the database organized and encourages constructive sharing. The core functions of the NDDN are exposed as web service APIs (application programming interfaces). Table 3 shows the list of core APIs. Each request contains all of the information necessary to complete it, so the client does not need to hold any session state. User authentication information is sent to the client after a successful login and is used as a token for subsequent calls. The stateless RESTful web service APIs help researchers to develop tools that can directly download data from, or upload data to, the NDDN.
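The tag-based reuse of search results can be modeled with a small relational sketch. SQLite (through Python's standard `sqlite3` module) stands in for MySQL here, and the three tables are heavily simplified: apart from the table names "item," "Tag," and "Map," every column name, the demo rows, and the helper functions are assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE item (id INTEGER PRIMARY KEY, days_in_vitro INTEGER, drug TEXT);
CREATE TABLE tag  (id INTEGER PRIMARY KEY, name TEXT, owner TEXT,
                   is_public INTEGER);
CREATE TABLE map  (item_id INTEGER REFERENCES item(id),
                   tag_id  INTEGER REFERENCES tag(id),
                   PRIMARY KEY (item_id, tag_id));
"""


def open_demo_db():
    """Create an in-memory database with three made-up items."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO item (id, days_in_vitro, drug) VALUES (?, ?, ?)",
        [(1, 14, None), (2, 21, "drug_A"), (3, 35, "drug_A")],
    )
    return conn


def tag_items(conn, item_ids, name, owner, is_public=False):
    """Attach one new tag to every item of a search result."""
    cur = conn.execute(
        "INSERT INTO tag (name, owner, is_public) VALUES (?, ?, ?)",
        (name, owner, int(is_public)),
    )
    conn.executemany(
        "INSERT INTO map (item_id, tag_id) VALUES (?, ?)",
        [(item_id, cur.lastrowid) for item_id in item_ids],
    )
    return cur.lastrowid


def items_for_tag(conn, name, user):
    """Reload a saved result; private tags are visible to their owner only."""
    rows = conn.execute(
        "SELECT m.item_id FROM map AS m JOIN tag AS t ON t.id = m.tag_id "
        "WHERE t.name = ? AND (t.owner = ? OR t.is_public = 1) "
        "ORDER BY m.item_id",
        (name, user),
    )
    return [row[0] for row in rows]
```

A typical use would run a multi-condition query once, pass the returned IDs to `tag_items`, and later call `items_for_tag` instead of repeating the search.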

Matlab Toolbox.
The NDDN also provides a toolbox written in Matlab to make NDDN data easier to use. The MEAKit (multielectrode array ToolKIT) toolbox is freely available at the NDDN website. Users can also contribute to the future development of the toolbox by reporting bugs and issues, or by committing code to their own branch at the GitHub portal (https://github.com/pujb/meakit). The key functions are listed in Table 4. Briefly, M-functions in the MEAKit toolbox are grouped into categories by purpose. Data files can be loaded into the Matlab workspace by I/O functions, for example, util_load_mcds(). We have already implemented multiple built-in functions in the "Calculation" directory to perform commonly used classical analyses of neural dynamics [30], as well as some newly adopted algorithms, such as neuronal avalanche analysis and fractal quantification [5]. Users can visualize the results in their own way, or they can use functions in the "Plot" directory to generate graphs that meet the common publishing standards of scientific journals. Figure 6 shows the changing firing patterns of an example neuronal network during development. Scripts written for specific purposes are located in the "Scripts" directory. For detailed information, see the toolbox reference topics in the "Help" directory.
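To give a sense of what a neuronal avalanche analysis computes, the Python sketch below follows the commonly used definition in which spikes from all electrodes are pooled into time bins and an avalanche is a run of consecutive non-empty bins bounded by empty ones. It is not MEAKit code: the function name, the choice of spike count as the avalanche size, and the 4 ms bin width in the example are assumptions.

```python
def avalanche_sizes(spike_times, bin_width_s, duration_s):
    """Return the size (total spike count) of each avalanche.

    `spike_times` holds spike times in seconds pooled over all electrodes.
    An avalanche is a maximal run of consecutive non-empty time bins.
    """
    n_bins = int(duration_s / bin_width_s) + 1
    counts = [0] * n_bins
    for t in spike_times:
        if 0 <= t <= duration_s:
            counts[int(t / bin_width_s)] += 1

    sizes = []
    current = 0
    for count in counts:
        if count > 0:
            current += count          # avalanche continues
        elif current > 0:
            sizes.append(current)     # empty bin closes the avalanche
            current = 0
    if current > 0:                   # avalanche running at end of recording
        sizes.append(current)
    return sizes


# Made-up pooled spike times (seconds) over a 40 ms window.
pooled = [0.001, 0.002, 0.006, 0.021, 0.030, 0.031]
sizes = avalanche_sizes(pooled, bin_width_s=0.004, duration_s=0.04)
```

The resulting size list is usually summarized as a distribution on log-log axes; the bin width strongly affects that distribution and should be chosen and reported explicitly.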

Comparison with Other Databases.
Compared with the rich repertoire of online bioinformatics databases, far fewer databases aim to provide electrophysiological information on neuronal networks, let alone focus specifically on multielectrode array data from networks developing in vitro. The CARMEN project (http://www.carmen.org.uk/) provides a powerful international cooperation framework for sharing code, data, and models of multiple levels of the neural system [27]. The Allen Brain Atlas (ABA) (http://www.brain-map.org/) and the Visible Brain-wide Networks (VBN) project (http://vbn.hust.edu.cn/) are well known for their valuable image resources and powerful tools [35]. Among these databases, the CARMEN project aims to provide multiple types of data (including electrophysiological data and images) from various sources at different levels of the neural system (from the cellular level to the whole-brain level). Its strength is a virtual laboratory framework that supports not only whole-brain-level data but also online exploitation of neurophysiological data and online code execution and analysis. The Allen Brain Atlas and the VBN project aim to provide cell type databases, toolboxes, and detailed images of brain connectivity. Currently, there are few reports of a database focused on dissociated cultured neuronal networks. Clearly, dissociated neuronal networks lack many features of the intact brain, but the essential nature of the neural cells, and of the network formed by the neurons and other neural cells, is preserved in these cultures. Therefore, observing the neurons and how they form and develop into a network may help us to better understand the mechanisms of the brain. Also, many detailed and dissected analyses of neural circuits are not feasible in living animals and humans. Here, we collected our dissociated neuronal network data with multielectrode arrays (MEAs).
Spontaneous activity and activity under stimulation and drug application were recorded. Although the NDDN is clearly limited in its data sources and data types, we aim to release these data as a specialized database tailored to neuronal networks developing on MEAs. To the best of our knowledge, the NDDN currently provides a large data set, including developmental information on cultured neuronal networks, that is not available elsewhere.

Conclusions
Electrophysiological activity patterns in developing neuronal networks are of great importance in fundamental research on neural dynamics and neural coding. Here, we described a new database for developing networks, which currently holds over 15 terabytes of data. The NDDN can be used by computational neuroscientists and modelers to extract network characteristics and derive new models, shedding new light on the development and evaluation of novel algorithms. Experimental neuroscientists may also benefit from the NDDN, which can be seen as a database of preliminary trials with various experimental protocols. We expect that the NDDN will serve researchers in related fields as a basis for insight into neural dynamics at the network level.

Conflicts of Interest
The authors declare that they have no conflicts of interest.