ABSTRACT
Large collections of annotated single-cell RNA sequencing (scRNA-seq) experiments are being generated across different organs, conditions, and organisms on different platforms. Transferring annotations from this growing database of single-cell expression data to a new, unannotated experimental dataset can accelerate insights into the underlying biology. Many approaches have been proposed for aligning and unifying such heterogeneous datasets. In this work, we recognized the need for a robust, data-driven distance metric to map annotations across datasets. To this end, we applied a one-shot training approach, Siamese Neural Networks (SNN), to learn a distance metric that differentiates between known, annotated cell types. Requiring only a small training set, we demonstrated that the SNN can make predictions across different scRNA-seq platforms, identify novel cell types, and transfer annotations across samples.