Signal processing, encompassing the analysis, understanding, detection, estimation, and modelling of events and trends, the way they evolve, and the abnormalities and anomalies affecting them, has attracted many researchers around the globe. Signal processing theory rests on a mathematical foundation, yet its applications have helped information technologists discover and invent new realities, branching off into communications, acoustics, speech, music, biomedical engineering, networking, control, and many other fronts in research and development. The field has maintained a remarkable balance between theory and application, and its footprint is enormous. Linear algebra, data transforms, and signal distributions have perhaps played the major roles in most of these applications.

On the other hand, the pioneering work on artificial neural networks by Warren McCulloch and Walter Pitts in the 1940s, inspired by the structure of the central nervous system, was another celebrated milestone in data assessment and machine learning. Machine learning often seeks generative models for the problem under study; the inference models and their parameters, inherently relying on Bayesian learning, are determined by the data and their environments.

Machine learning and information-theoretic ideas can help statistical signal processing overcome the barriers of linear models and mitigate the need for Gaussianity and stationarity assumptions. Statistical signal processing and inductive inference algorithms provide common ground at the overlap between signal processing and machine learning, which has given rise to elegant areas of research such as adaptive and nonlinear signal processing, intelligent systems, and multitask cooperative networking.

Audio and video processing, brain-computer interfacing, and self-organized and cognitive information systems are but a few of the many application domains of joint machine learning and signal processing systems. We may further categorize the areas where signal processing and machine learning meet as learning theory and techniques; graphical models and kernel methods; data-driven adaptive systems and models; pattern recognition and classification; distributed, Bayesian, subspace/manifold, and sparsity-aware learning; multi-set data analysis and multimodal data fusion; perceptual signal processing in audio, image, and video; cognitive information processing; and multichannel adaptive and nonlinear signal processing, together with their vast applications, including speech and audio, image and video, music, biomedical signals and images, communications, bioinformatics, biometrics, computational intelligence, genomic signals and sequences, social networks, games, and the smart grid. In addition, learning algorithms are inherently involved in many other applications such as latent variable analysis and blind source separation.

In this special issue there is no room to cover all of the above aspects. Instead, only a few topics have been explored and examined by some prominent researchers. These topics are summarized, and their importance acknowledged, below.

In the first paper, titled “Text-informed audio source separation. Example-based approach using non-negative matrix partial co-factorization” (10.1007/s11265-014-0920-1), the authors address text-informed speech source separation. For this purpose, a new variant of the non-negative matrix partial co-factorization approach, based on a so-called excitation-filter-channel speech model, is proposed. In this model, linguistic information is shared between a speech example and the speech within the mixture.
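To give a flavour of the shared-basis idea, the sketch below jointly factorizes an example spectrogram and a mixture spectrogram with a common speech basis, using plain Euclidean multiplicative updates. It is only an illustrative simplification: it implements a generic non-negative partial co-factorization, not the authors' excitation-filter-channel model or their cost function, and all names, ranks, and parameters are assumptions.

```python
import numpy as np

def partial_cofactorization(V_ex, V_mix, k_speech=20, k_noise=20,
                            n_iter=200, eps=1e-9, seed=0):
    """Minimal sketch of NMF partial co-factorization with a shared basis.

    V_ex  : magnitude spectrogram of the speech example  (freq x T1)
    V_mix : magnitude spectrogram of the mixture         (freq x T2)
    The speech basis Ws is shared by both factorizations:
        V_ex  ~= Ws @ H_ex
        V_mix ~= Ws @ Hs + Wn @ Hn
    Multiplicative updates minimize the summed Euclidean cost.
    """
    rng = np.random.default_rng(seed)
    F = V_ex.shape[0]
    Ws = rng.random((F, k_speech))
    Wn = rng.random((F, k_noise))
    H_ex = rng.random((k_speech, V_ex.shape[1]))
    Hs = rng.random((k_speech, V_mix.shape[1]))
    Hn = rng.random((k_noise, V_mix.shape[1]))

    for _ in range(n_iter):
        V2 = Ws @ Hs + Wn @ Hn
        # Activation updates (example, speech-in-mixture, noise-in-mixture).
        H_ex *= (Ws.T @ V_ex) / (Ws.T @ Ws @ H_ex + eps)
        Hs   *= (Ws.T @ V_mix) / (Ws.T @ V2 + eps)
        Hn   *= (Wn.T @ V_mix) / (Wn.T @ V2 + eps)
        # Basis updates; Ws is pulled by both the example and the mixture.
        V1 = Ws @ H_ex
        V2 = Ws @ Hs + Wn @ Hn
        Ws *= (V_ex @ H_ex.T + V_mix @ Hs.T) / (V1 @ H_ex.T + V2 @ Hs.T + eps)
        Wn *= (V_mix @ Hn.T) / (V2 @ Hn.T + eps)

    speech_estimate = Ws @ Hs   # speech part of the mixture spectrogram
    noise_estimate = Wn @ Hn    # background part
    return speech_estimate, noise_estimate
```

The speech and background estimates can then be turned into time-domain signals by the usual spectrogram masking and inverse transform steps, which are outside the scope of this sketch.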

The second paper, “A model-free de-drifting approach for detecting BOLD activities in fMRI data” (10.1007/s11265-014-0926-8), introduces a model-free method for efficiently capturing drifts in functional magnetic resonance imaging (fMRI) data. The proposed algorithm applies first-order differencing to the fMRI time-series samples in order to remove the drift effect. A linear least-squares fit followed by wavelet thresholding is then used to optimally estimate the drift. In the final stage, the de-drifted fMRI voxel response is obtained by removing the estimated drift from the fMRI time series. The method's performance is assessed using simulated and motor-task real fMRI data sets obtained from both block- and event-related designs.
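As a rough illustration of the drift-removal idea, the toy sketch below estimates a slowly varying drift with a linear least-squares polynomial fit and subtracts it from a synthetic voxel time series. It is not the authors' algorithm: their first-order differencing and wavelet-thresholding stages are omitted, and the function names, parameters, and synthetic data are purely illustrative.

```python
import numpy as np

def dedrift_voxel(y, poly_order=2):
    """Toy de-drifting: fit a slowly varying drift by linear least squares
    (low-order polynomial) and subtract it from the voxel time series.
    The paper's differencing and wavelet-thresholding refinements are
    deliberately omitted in this sketch."""
    t = np.arange(len(y), dtype=float)
    drift = np.polyval(np.polyfit(t, y, poly_order), t)
    return y - drift, drift

# Example: a noisy boxcar "activation" riding on a slow quadratic drift.
rng = np.random.default_rng(0)
t = np.arange(200, dtype=float)
drift = 0.002 * (t - 100.0) ** 2                 # slow scanner drift
bold = np.where((t % 40) < 20, 1.0, 0.0)         # toy block-design response
y = bold + drift + 0.3 * rng.standard_normal(t.size)
dedrifted, estimated_drift = dedrift_voxel(y)
```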

The third paper, “Blind suppression of nonstationary diffuse acoustic noise based on spatial covariance matrix decomposition” (10.1007/s11265-014-0922-z), addresses the suppression of nonstationary diffuse noise, a common and challenging problem. In this approach, the observed spatial covariance matrix is decomposed into signal and noise components. In their design, the authors exploit spatial invariance, instead of temporal invariance, to regularize the inherently ill-posed decomposition problem.

The problem of scrutinising multiple gene expression microarray datasets for the identification of gene subsets that are consistently co-expressed across them is tackled in the fourth paper, titled “Application of the Bi-CoPaM method to five Escherichia Coli datasets generated under various biological conditions” (10.1007/s11265-014-0919-7). The developed clustering technique better exploits the biological facts that differentiate between various contexts and processes. Consequently, the authors draw biological hypotheses relating some genes whose biological processes are currently unknown to their potential processes. These hypotheses can serve as starting points for future gene discovery studies.

The fifth contribution, “Blind separation of orthogonal mixtures of spatially-sparse sources with unknown sparsity levels and with temporal blocks” (10.1007/s11265-014-0918-8), addresses the problem of blind separation of a static, linear, orthogonal mixture. The separation is not based on statistical assumptions (such as independence) but on the sparsity of the sources. Unlike some established solutions to this problem, the paper proposes two pre-processing stages that improve the algorithm's performance. The authors verify experimentally that the improved algorithm achieves a higher recovery rate than alternative source separation methods in such contexts, including K-SVD, a leading method for dictionary learning.

Next, in the paper “Sparse coding with anomaly detection” (10.1007/s11265-014-0913-0), a solution to the problem of simultaneous sparse coding and anomaly detection is pursued. It is based on the assumption that the majority of the data vectors comply with a sparse representation model, whereas any anomaly is caused by an unknown subset of the data vectors, the outliers, which significantly deviate from this model. The proposed approach uses the alternating direction method of multipliers (ADMM) to recover the sparse and outlier components simultaneously for the entire collection, and it provides a unified solution for both jointly sparse and independently sparse data vectors. An application to the detection of irregular heartbeats in electrocardiogram (ECG) recordings is demonstrated.
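The decomposition idea can be illustrated with a minimal sketch that models the data as X ≈ DA + E, with entrywise-sparse codes A and a column-sparse outlier matrix E, so that only a few data vectors are flagged as anomalies. Note that this sketch uses simple alternating proximal-gradient and closed-form shrinkage updates rather than the paper's ADMM formulation, and all variable names, penalties, and thresholds are assumptions.

```python
import numpy as np

def soft(x, t):
    """Elementwise soft-thresholding (the proximal operator of the l1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code_with_outliers(X, D, lam_a=0.1, lam_e=1.0, n_iter=200):
    """Toy sparse-plus-outlier decomposition X ~= D @ A + E.

    A is entrywise sparse (l1 penalty); E is column-sparse (l2,1 penalty),
    so non-zero columns of E mark the anomalous data vectors. Alternating
    updates are used instead of the paper's ADMM formulation.
    """
    n_atoms, n_samples = D.shape[1], X.shape[1]
    A = np.zeros((n_atoms, n_samples))
    E = np.zeros_like(X)
    step = 1.0 / np.linalg.norm(D, 2) ** 2       # 1 / Lipschitz constant

    for _ in range(n_iter):
        # Proximal-gradient (ISTA) step for the sparse codes A.
        R = X - E - D @ A
        A = soft(A + step * (D.T @ R), step * lam_a)

        # Closed-form l2,1 shrinkage for the outlier matrix E (column-wise).
        R = X - D @ A
        norms = np.linalg.norm(R, axis=0, keepdims=True)
        scale = np.maximum(1.0 - lam_e / np.maximum(norms, 1e-12), 0.0)
        E = R * scale

    # Columns with non-zero outlier energy are the detected anomalies.
    anomalies = np.flatnonzero(np.linalg.norm(E, axis=0) > 1e-8)
    return A, E, anomalies
```

In the heartbeat application described in the paper, the columns of X would be individual beats and D a dictionary learned from normal beats; here both are left abstract.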

In the seventh paper, “Learning incoherent subspaces: classification via incoherent dictionary learning” (10.1007/s11265-014-0937-5), the authors present a new method for learning discriminative incoherent subspaces from data, based on so-called supervised iterative projections and rotations. They employ the algorithm to learn incoherent subspaces that model signals belonging to different classes, and the method is effectively used as a feature transform for supervised classification.

Finally, the eighth paper, “Entropy power inequality for learning optimal combination of kernel functions” (10.1007/s11265-014-0899-7), deals with designing an appropriate kernel for specific data, a problem of great interest to many practitioners in machine learning. The approach in this study evaluates Gaussianity through the entropy power inequality in order to learn the kernel combination. Using a number of benchmark datasets, it is shown that the classification performance of the method is comparable or superior to that of conventional multiple kernel learning methods.

We hope that these eight papers, which cover a number of areas within the joint signal processing and machine learning research arena, will generate new seeds for future thought.

Guest Editors