Proceedings of the 2016 MidSouth Computational Biology and Bioinformatics Society (MCBIOS) Conference


space"; and William Slikker, Jr., "Regulatory Implications of Genomics and Bioinformatics for Food and Drug Safety".
There were three workshops. Workshop 1, "Using Cyberinfrastructure to Scale your Science", was provided by Jason Williams, Ph.D., and described the computational tools and services developed by CyVerse (formerly iPlant Collaborative) to upload, share, and analyze large biological datasets for a variety of applications from genomics to phenotypic analysis. Workshop 2, "Gene Network & Systems Genetics", by Robert Williams, Ph.D., provided hands-on experience with GeneNetwork (www.genenetwork.org). GeneNetwork is one of several web resources that combine public genomic and genetic data with open-source code for genome-to-phenome analysis. Participants were introduced to human, mouse, rat, and plant genomic datasets and code that can be used in a wide range of bioinformatic, medical, and agronomic settings. Workshop 3, "Next Generation Sequencing Analysis", was provided by Rakesh Kaundal, Ph.D., and focused on the analysis of RNA-Seq data for differential gene expression and related statistical tests using R/Bioconductor.
There were 12 breakout sessions. Each session had one featured speaker and four oral presenters presenting their research.
The Drug Discovery and Development Colloquium was organized by the student leaders of MCBIOS and was held at the University of Alabama at Birmingham from June 28-30, 2016. The colloquium was organized to showcase the synergistic interaction between chemistry, biology, pharmacology, and bioinformatics in the process of drug development and to promote a professional dialogue between students and experts from different disciplines interested in the processes of drug discovery and development.
Best Paper Award, MCBIOS 2016: "VDJML: A file format with tools for capturing the results of inferring immune receptor rearrangements" by Inimary Toby (1st author), Lindsey Cowell (senior author) and 21 co-authors [1]. Best

Selecting papers for the MCBIOS XI Proceedings
A total of 27 papers from the work presented at MCBIOS 2016 were submitted to be considered for publication in the Proceedings, and 14 papers were accepted (52% acceptance rate). At least two reviewers anonymously peer-reviewed each submitted paper, and acceptable papers were quantitatively ranked on the basis of three evaluation criteria: Novelty (1-5), Impact (1-5) and Clarity (1-3). Editors who were co-authors of submitted papers were not permitted to handle their own papers editorially. Papers generally fell into three categories:

Networks and pathways
Hyundoo Jeong et al. proposed a probabilistic approach for comparing protein-protein interaction (PPI) networks [2]. The approach estimates the steady-state network flow between nodes of different PPI networks using a Markov random walk model. The proposed approach was evaluated using multiple PPI networks and was found to be accurate and computationally inexpensive.
Xueyuan Cao et al. describe CC-PROMISE, a method to integrate any two forms of quantitative high-dimensional molecular data, such as genotype, copy number, methylation, mRNA expression, or miRNA expression, with multiple clinical endpoints for a cohort of patients [3]. This approach identifies genes for which some form of molecular data shows a biologically meaningful association with multiple related endpoints.
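The steady-state random-walk idea behind the PPI network comparison [2] can be illustrated on a single toy graph. The sketch below is not the authors' algorithm: it only approximates the stationary distribution of a simple random walk by power iteration. The function name and the four-node network are hypothetical, and the graph is assumed to be connected with no isolated nodes.

```python
def stationary_distribution(adjacency, iterations=200):
    """Approximate the stationary distribution of a random walk by power iteration."""
    n = len(adjacency)
    # Row-normalize the adjacency matrix into transition probabilities.
    # Assumes every node has at least one edge (no zero rows).
    transition = []
    for row in adjacency:
        total = sum(row)
        transition.append([w / total for w in row])
    # Start from the uniform distribution and repeatedly apply one walk step.
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return dist


# Hypothetical toy network: a triangle (nodes 0, 1, 2) with node 3 attached to node 2.
toy_network = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
steady_state = stationary_distribution(toy_network)
```

For an undirected, connected, non-bipartite graph such as this one, the walk settles on probabilities proportional to node degree, so the best-connected node (node 2) receives the largest share.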
Eshleman and Singh describe an innovative approach to mining social network data from Twitter to extract potential adverse drug events [4]. They look for mentions of potential complications alongside drug names to identify recurring patterns, which opens the possibility of identifying previously unsuspected and undocumented adverse events.
Khunlertgit et al. proposed an integrative model to identify subnetworks as molecular markers that relate to cancer status and improve cancer outcome prediction [5]. Such markers can help improve prognosis and diagnosis of cancers, and their study showed that incorporating topological information from prior knowledge to identify the biomarkers may provide additional information for cancer classification.

Genomics & transcriptomics
Yongsheng Bai et al. developed a program called MMiRNA-Viewer [6] for interactive visualization of the expression relationships between miRNA-mRNA pairs in both tumor and normal samples in a single graph, to help users better explore the relationships between these two entities.
Detection of indels in NGS data has received much less attention than the detection of SNPs. This is in part due to technology limitations, commonly resolved by the "indel realignment" step in bioinformatics pipelines. Vo and Phan described an alternative approach to detect indels by incorporating known genetic variant information in the alignment and variant calling steps [7]. They report improved accuracy in detecting known and novel indels. In particular, their method is well designed to resolve indels that are located in proximity of each other.
Chen et al. address one of the main challenges in phylogenetic tree construction, and make a compelling case for the need to improve feature frequency profile (FFP) methods [8]. The authors proposed a phylogenetic tree construction method based on counting the frequency of triplet translation in prokaryotic DNA, which uses the information contained in genes more fully than the traditional FFP-k method and has lower computational complexity.
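As a rough illustration of frequency-based sequence comparison, the sketch below counts in-frame triplets in two short DNA strings and measures how far apart the resulting profiles are. It is a simplification and not the method of Chen et al.: it reads only one frame, does not translate triplets, and the use of Euclidean distance, the function names, and the example sequences are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def codon_profile(sequence):
    """Relative frequency of each non-overlapping triplet in reading frame 0."""
    seq = sequence.upper()
    # Step through the sequence three bases at a time, ignoring a trailing partial triplet.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def profile_distance(p, q):
    """Euclidean distance between two triplet frequency profiles."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


# Hypothetical example sequences (four triplets each).
profile_a = codon_profile("ATGGCCATGTAA")
profile_b = codon_profile("ATGGCCGCCTAA")
distance = profile_distance(profile_a, profile_b)
```

Pairwise distances of this kind, computed over whole genomes, are the sort of input a distance-based tree-building procedure would consume.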
In Sujoy Roy et al., the authors encode the cooccurrence of miRNAs and terms within MEDLINE abstracts into a matrix, and then perform singular value decomposition to produce a lower dimensional manifold [9]. They show how the resultant decomposition can be used for miRNA and term queries, and they perform clustering and functional annotation of the miRNAs. The authors present an interesting utilization of unsupervised text mining for the annotation of miRNAs.
Glass and Dozmorov report the impact of parameters such as sample size, cell type-specific proportion variability, and mean squared error on power analysis of linear regression used for estimating cell type-specific gene expression [10]. They incorporate these analyses, which can determine whether significant detection is probable, into LRCDE, an R package that performs linear regression cell type-specific differential expression detection on a gene-by-gene basis.
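One common way to frame cell type-specific estimation by linear regression is to model each sample's measured expression of a gene as a proportion-weighted sum of per-cell-type expression levels. The sketch below fits that model for a single gene and two cell types using the 2x2 normal equations. It is an assumption-laden illustration, not the LRCDE implementation: the function name, the no-intercept model, and the noise-free example data are all hypothetical.

```python
def estimate_cell_type_expression(proportions, observed):
    """Least-squares fit of observed = p1*x1 + p2*x2 for one gene (two cell types)."""
    # Build the entries of the normal equations (P^T P) x = P^T y.
    a11 = sum(p1 * p1 for p1, _ in proportions)
    a12 = sum(p1 * p2 for p1, p2 in proportions)
    a22 = sum(p2 * p2 for _, p2 in proportions)
    b1 = sum(p1 * y for (p1, _), y in zip(proportions, observed))
    b2 = sum(p2 * y for (_, p2), y in zip(proportions, observed))
    det = a11 * a22 - a12 * a12
    if det == 0:
        # Happens when the proportions are identical in every sample.
        raise ValueError("singular system: cell proportions do not vary across samples")
    # Solve the 2x2 system by Cramer's rule.
    x1 = (b1 * a22 - a12 * b2) / det
    x2 = (a11 * b2 - a12 * b1) / det
    return x1, x2


# Hypothetical data: four samples with (cell type 1, cell type 2) proportions,
# and observed expression generated without noise from x1 = 10, x2 = 2.
sample_proportions = [(0.2, 0.8), (0.5, 0.5), (0.7, 0.3), (0.9, 0.1)]
sample_expression = [3.6, 6.0, 7.6, 9.2]
x1, x2 = estimate_cell_type_expression(sample_proportions, sample_expression)
```

The determinant check hints at why variability in cell type proportions matters: when proportions barely differ between samples, the system is close to singular and the per-cell-type estimates become unstable.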
Liu et al. describe a workflow and a database useful for plant geneticists and bioinformaticians [11]. While workflows for sequencing data for human and model organisms have been extensively tested, their application to plant sequencing data is less developed. This workflow is optimized for high-performance processing of plant sequencing data. Extendable to other organisms, it will aid both novices and sequencing data analysis professionals.

Imaging
Mutlu Mete et al. reported a generalizable machine learning framework and applied it to the classification of cocaine-dependent subjects using brain imaging [12]. This framework reduces data dimensionality through information-theory-based feature selection and then applies a statistical machine learning classifier. The results demonstrated that the proposed framework efficiently classifies cocaine-dependent and healthy individuals and can be generalized to other classification tasks.
Sertan Kaya et al. describe a novel method for automatic detection of skin lesion malignancy in dermoscopy images using the texture homogeneity along the periphery of the lesion [13]. This method performs with high accuracy and can contribute to early diagnosis of malignant melanoma by distinguishing it from benign lesions.

Miscellaneous
Inimary Toby et al. describe VDJML, a community standard to annotate V(D)J rearrangements of immune receptors and antibodies [1]. These genes are not germline encoded but rather generated somatically in response to infection. Up until now, the field has lacked a good standard format to report, describe and model sequencing data gathered on these immune receptors. This paper, which won this year's Best Paper Award, provides a means of representing this data that can be expanded and improved by the immunoinformatics community.
Lee et al. perform a pilot study in Dynamic Topic Modeling (DTM) for the analysis of time-series gene expression data [14]. The authors use DTM to perform unsupervised clustering of time-series gene expression profiles for drug treatment experiments. They map differentially expressed genes to "words" and drug treatments to "documents", which are then clustered into "topics".

Future meetings
The 14th Annual MCBIOS conference will be held at the Embassy Suites, Little Rock, Arkansas, on March 23rd-25th, 2017. The conference theme will be "Bioinformatics and the Development of Therapeutics: Make them Better, Make them Safer".