1 Introduction

Microorganisms, or microbes, are living organisms that are too small to be seen with the naked eye but are visible under a microscope. The world of microbes includes viruses, bacteria, fungi, algae, etc. Microorganisms play an important part in our ecosystem. They are beneficial in monitoring environmental changes, decomposing waste materials, wastewater management, food processing, etc. Many microorganisms are also pathogenic and can cause disease in humans, plants and other living organisms. They are responsible for various deadly diseases such as tuberculosis, plague, anthrax, toxoplasmosis, HIV infection, etc. [1]. The coronavirus disease (Covid-19), for which more than 623,000 deaths have been reported worldwide, is also caused by a pathogenic virus strain, namely SARS-CoV-2 [2]. Because microorganisms are of such importance, their study is vital to scientists working in clinical microbiology, agriculture, medical science and food production. To understand the biological, genetic and physiological characteristics of microorganisms, they are cultured and observed under a microscope. These traditional methods, however, are labor-intensive and expensive [3]. Moreover, microorganisms share such strong morphological similarities that classifying them manually can be exhausting. To make microorganism classification less laborious, an automatic machine learning (ML) assisted microorganism image recognition tool requiring less human intervention can be developed. ML methods have been used extensively in many application areas such as speech recognition [4], health care [5], business forecasting [6] and agriculture [7]. In particular, deep learning (DL), a sub-field of ML, has brought tremendous success in areas such as image recognition [8], object segmentation and classification [9], pattern recognition [10] and autonomous vehicles [11].

For many years, researchers have been applying ML to develop automatic tools that can assist scientists in microscopic image recognition for species-level identification and classification. Some literature reviews already target the microorganism image recognition domain. Li et al. [12] presented a comprehensive review of content-based microscopic image analysis (CBMIA) methods applied to microorganism classification. The authors analyzed and discussed the image analysis methods used for image pre-processing, feature extraction, post-processing, classification and evaluation. In another study, Kulwa et al. [13] reviewed image processing and ML methods applied solely to the segmentation of microorganism images. In addition, Li et al. [14] provided a literature review of clustering methods employed for microorganism image analysis.

Unlike the related works cited above, this work focuses exclusively on ML approaches applied in the microorganism image recognition field. In this field, ML techniques have been used for image classification, feature extraction and image segmentation, and this work covers all of these aspects. To identify research gaps, it is important to study the limitations of existing research. In this paper, a careful methodical analysis of the selected studies, along with the datasets they used, has therefore been carried out to identify the limitations of each work. In addition, we have reviewed DL approaches applied to image classification, segmentation and feature extraction.

In this review paper, a systematic approach has been adopted to analyse the ML based approaches used by researchers in the microorganism image recognition field. The paper covers the important research papers in this field published between 1995 and 2021. The publications are presented in chronological order. The selected papers have been examined to answer research questions formulated to understand the trends in the use of ML techniques in this field.

This review is organized as follows. Sect. 2 describes the methodology used to conduct the review, including the research questions, selection process and inclusion–exclusion criteria. Sect. 3 provides a detailed methodical analysis of the selected research papers; summaries of the papers and their limitations are presented in tables. Sect. 4 discusses the findings that provide research directions in the microorganism image recognition field. Lastly, the conclusion and future scope are given in Sect. 5.

2 Methodology

To conduct the systematic review presented in this paper, a review protocol was developed. The protocol aims at steering the review process and reducing the risk of publication bias. The first step was to frame the research questions. Relevant research papers were then searched for in various online databases and digital libraries. Finally, the number of papers was reduced according to the inclusion and exclusion criteria.

2.1 Research Questions

In this systematic review, the authors have analyzed in depth the research methodologies used by researchers by framing some important research questions. Table 1 presents the research questions framed to conduct this review.

Table 1 Research questions

2.2 Search Process and Sources of Information

To conduct this review, the authors searched for significant research on image recognition of various microorganisms using ML. ML approaches began to be applied to microorganism image recognition in the 1990s, and no relevant study was found before 1995. The selected papers are therefore limited to the interval from 1995 to 2021. Figure 1 shows the yearly distribution of the selected articles. The following online databases and digital libraries were searched for this review:

Fig. 1 Yearly distribution of selected articles

Keywords used to search for relevant studies were “Microorganism classification” OR “Detection”, “Bacteria identification” OR “classification”, “algae”, “protozoa”, “fungi”, “ML”, “neural networks”, “DL”.

2.3 Inclusion and Exclusion Criteria for Article Selection

After searching the different online databases and libraries, a large number of research papers was collected. To filter out the relevant papers and to avoid publication bias, criteria for the inclusion and exclusion of studies were defined. The inclusion criteria were: (1) articles written in English; (2) articles considering ML techniques only; (3) studies able to answer the framed research questions. The exclusion criteria covered unrelated studies, duplicate studies, abstract-only papers and articles unable to answer the framed research questions. Figure 2 presents the article selection process. After going through this systematic process, the authors selected 100 publications for this review. Figure 3 shows the number of articles selected from the different databases. The articles include research papers and chapters published in reputed international journals and in the proceedings of national and international conferences.

Fig. 2 Article selection process

Fig. 3 Number of articles selected from different databases to conduct this review

3 ML in Microorganisms Image Recognition

ML techniques have achieved great success in image recognition tasks such as medical image classification [15], object detection [16], face recognition [17], traffic sign classification [18], etc. In microbiology, researchers have employed ML techniques for the image recognition of four types of microorganisms: bacteria, algae, protozoa and fungi. Figure 4 shows the impact of ML techniques on image recognition of various microorganisms. In the following sub-sections, the authors address research question RQ1, framed in Sect. 2.1, by reviewing the ML based approaches implemented for the image recognition of different microorganisms. Detailed summaries of the methodologies, along with their limitations, are also presented in tables.

Fig. 4 Impact of ML techniques on image recognition of various microorganisms

3.1 ML in Bacteria Image Recognition

Bacteriology studies the ecology, biochemistry, morphology and genetics of bacteria. Both land and water ecosystems rely heavily on bacteria, which cycle important nutrients such as sulphur and nitrogen. Beneficial bacteria have economic importance in many areas such as food processing, genetic engineering, biochemistry, pest control and fibre retting. Escherichia coli is used in the preparation of vitamin K and riboflavin. In retting, Clostridium butyricum is used to separate flax, hemp and jute [19]. In contrast, there are also pathogenic bacteria, which cause illnesses and diseases such as food poisoning, cholera, staph infections, tuberculosis and many more. Mycobacterium tuberculosis is responsible for tuberculosis, one of the world’s largest epidemics. As stated in [20], approximately 220,000 deaths are reported each year in India due to tuberculosis. Sample images of some bacteria species are shown in Fig. 5. Because of the pathogenicity of various bacteria species and the need to characterize the ecologically and economically beneficial species, bacterial strains are classified at species level. ML techniques are widely employed by researchers for studying different bacteria species.

Fig. 5 Example of microscopic images of bacteria species. a Vibrio cholerae [125], b Tuberculosis bacteria [36]

In 1998, Veropoulos et al. [21] proposed an artificial neural network (ANN) based technique for the automatic recognition of tuberculosis bacteria in Ziehl–Neelsen (ZN) stained sputum smear images. The dataset included 267 bacillus images and 88 non-bacillus images. The methodology involved edge-based segmentation using the Canny edge detector, followed by extraction of shape features using the discrete Fourier transform. A multi-layered neural network with two hidden units was then trained using back propagation for image classification. Liu et al. [22] presented a computer-aided automatic system called “CMEIAS” for the classification of different bacterial morphotypes. The dataset included bacterial cell images of morphotypes such as cocci, curved rods and regular rods. The proposed technique involved image segmentation using a threshold method and extraction of shape, size and gray density based features such as area, roundness, major axis length and minor axis length. In the next step, a tree classifier was designed using different measures of the extracted features and the K-nearest neighbor (K-NN) classification technique.
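As an illustration of shape features derived from the discrete Fourier transform, the following minimal sketch computes Fourier descriptors of a closed contour. It is not the implementation of [21]: the function name `fourier_descriptors`, the number of descriptors and the normalization steps are assumptions chosen for illustration, and the contour is assumed to be traversed counter-clockwise so that the first harmonic is non-zero.

```python
import cmath
import math


def fourier_descriptors(contour, n_desc=4):
    """Return n_desc shape descriptors of a closed contour.

    contour: list of (x, y) boundary points in counter-clockwise order
    (hypothetical input format; needs at least n_desc + 2 points).
    """
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    # Plain O(n^2) discrete Fourier transform of the complex boundary signal.
    coeffs = [
        sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # coeffs[0] encodes the centroid (translation), so it is skipped.
    # Dividing by |coeffs[1]| removes scale; taking magnitudes removes rotation.
    scale = abs(coeffs[1])
    return [abs(coeffs[k]) / scale for k in range(2, 2 + n_desc)]


# Usage: an 8-point square and a shifted, enlarged copy give the same descriptors.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
enlarged = [(3 * x + 5, 3 * y - 2) for x, y in square]
print(fourier_descriptors(square))
print(fourier_descriptors(enlarged))
```

Descriptors of this kind can be passed to any classifier; the invariances make them insensitive to where, at what size and at what orientation an object appears in the image.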

Men et al. [23] implemented support vector machines (SVM) for image recognition of heterotrophic bacteria colonies. The acquired colour image of the bacteria colonies was pre-processed using de-noising, brightness balancing, smoothing and enhancement. Image segmentation was then performed using a suitable threshold method. In the next step, shape features of the heterotrophic colonies, such as area, equivalent diameter and perimeter, were extracted. Feature extraction yielded 300 data samples, of which 200 were heterotrophic colonies and 100 were non-heterotrophic. SVM was then applied to classify the images as heterotrophic or non-heterotrophic. Chen et al. [24] proposed an ML based approach to count and classify bacterial colonies in petri dish images. Firstly, to choose a proper image segmentation technique, the authors identified whether the colony images were chromatic or achromatic. Then, the dish/plate regions were identified using contrast limited adaptive histogram equalization (CLAHE) for image enhancement and the Otsu threshold method for identification. In the next step, colonies were extracted from the identified regions: for chromatic images, colour similarity between colonies was used, and for achromatic images, the Otsu threshold method was employed. Following this, the watershed algorithm was applied to separate clustered colonies. SVM was then trained using colour and shape features of the bacterial colonies to classify and enumerate the extracted colonies. Xiaojuan et al. [25] proposed an ML based approach for the image recognition of wastewater bacteria species. The methodology involved image segmentation using a combination of mathematical morphology based edge detection and an iterative threshold method. In the next step, morphological and invariant moment based features were extracted, and principal component analysis (PCA) was used for dimensionality reduction. For classification, an optimized adaptive accelerated back propagation algorithm was proposed. 
The authors also compared the training rate of the proposed algorithm with that of the traditional back propagation algorithm, and the experimental results showed a better training rate for the proposed algorithm. Kumar et al. [26] proposed an ANN based approach for the image classification of five bacteria species, namely Lactobacillus brevis, Staphylococcus epidermidis, Bacillus thuringiensis, Escherichia coli and Listeria innocua. The approach involved background correction followed by extraction of textural, optical and shape features. In the next step, nine optimal features were selected, and a type of ANN called the probabilistic neural network (PNN) was employed for classification. The authors also compared the classification performance of PNN with that of a back propagation neural network (BPNN). The experimental results indicated that PNN takes less time for classification than BPNN with the same number of parameters. Akova et al. [27] presented a supervised ML approach for the detection of novel and unknown bacteria serovars. The study was performed using 28 distinct serovars of five bacteria species, namely Listeria, Salmonella, Vibrio, E. coli and Staphylococcus. The proposed approach involved extraction of Haralick texture descriptors and moment invariants as features. A Bayesian method was then employed to classify images as known or unknown. The performance of classifying bacterial strains as known or unknown was evaluated by plotting the receiver operating characteristic (ROC) curve. The proposed approach achieved an average accuracy of 95% for serovar-level classification of bacterial species. Osman et al. [28] implemented a combination of a genetic algorithm (GA) and ANN for the detection of tuberculosis bacteria in ZN stained slide images. The dataset included 960 tuberculosis images obtained from 120 slide images. The proposed approach involved image segmentation using colour image segmentation, K-means clustering and a region growing algorithm. 
A median filter was then applied for noise removal. Following this, Hu moment invariants were extracted as features, and GA was employed to select the optimum features. The selected features were then used to train an ANN with the Levenberg–Marquardt algorithm to classify images as ‘possible TB’ or ‘true TB’. Zhai et al. [29] presented an automatic approach for recognizing and counting Mycobacterium tuberculosis bacilli in ZN stained sputum smear images. The proposed approach involved a two-step image segmentation method. Firstly, the images were transformed from the RGB to the HSV color space, and coarse segmentation was performed by segmenting the hue component using a threshold method. In the next step, the images were transformed to the CIE L*a*b* color space, and the lightness component was segmented using an adaptive threshold method. The final segmentation was obtained by combining both segmentation results. Shape features were then extracted, and a decision tree algorithm was used to classify objects as touching bacillus, single bacillus or non-bacillus. The count of tuberculosis bacilli was then obtained manually by counting the touching bacillus objects and single bacillus objects. The counting performance of the algorithm was tested on 100 tuberculosis images, on which it achieved a detection accuracy between 81 and 90%. The authors also used the proposed approach for tuberculosis diagnosis with 300 positive samples and 50 negative samples, and calculated the detection rate in terms of specificity and sensitivity. The experimental results showed 100% sensitivity and 94% specificity. Zeder et al. [30] implemented ANN for assessing the image quality of fluorescently stained microscopic bacteria images. The dataset consisted of 25,000 images belonging to three classes: high quality, medium quality and low quality. The proposed ANN achieved a good identification accuracy of 94%. Hiremath et al. 
[31] presented an ML based approach for the image classification of six types of bacterial cells, namely cocci, streptococci, diplococci, staphylococci, tetrads and sarcinae. The approach began by acquiring 350 images containing the bacterial cells under study. The bacterial cell regions of each category were then segmented using a global threshold method. In the next step, the segmented regions were labeled and geometric shape features were extracted. Using the extracted features, three ML algorithms, namely 3σ, K-NN and ANN, were trained for classification. The ANN was trained using the back propagation algorithm. The classification performance was evaluated using tenfold cross-validation. The experimental results showed a better classification accuracy of 99% for ANN than for the other two classifiers. Rulaningtyas et al. [32] implemented ANN for image classification of tuberculosis bacteria. The dataset included 100 binary images of tuberculosis bacteria. The methodology involved extracting shape features using geometric descriptors. Classification was then performed using an ANN with a momentum of 0.9, a learning rate of 0.5 and a single hidden layer of 20 neurons. The classifier was trained using the back propagation method. Osman et al. [33] proposed a hybrid multilayer perceptron (MLP) based ANN for the classification of tuberculosis bacteria using ZN stained tissue slide images. The dataset included 1603 tuberculosis bacteria images belonging to three classes: non-TB, overlapped-TB and TB. The proposed approach involved image segmentation using a colour filter, K-means clustering, a median filter and a region growing algorithm. Following this, shape features were extracted and classification was performed using the hybrid MLP. In the hybrid MLP, the input layer was directly connected to the output layer through weighted connections. The model was trained using a combination of the modified recursive prediction error algorithm and extreme learning methods.
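Several of the studies above rely on threshold-based segmentation, and Chen et al. [24] name the Otsu method explicitly. The sketch below shows the core of Otsu thresholding in plain Python, assuming a flat list of integer grey levels as input; the function name `otsu_threshold` and the input format are illustrative assumptions, not details taken from the reviewed papers.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of grey levels in the lower class
    best_t, best_score = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so no split is defined here
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant 1/N^2 dropped).
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Usage: a dark background (around 10) and bright objects (around 200).
sample = [10] * 50 + [12] * 50 + [200] * 40 + [205] * 60
t = otsu_threshold(sample)
mask = [p > t for p in sample]  # True marks candidate object pixels
```

Because the threshold is derived from the image histogram alone, no manual tuning is needed, which is one reason the method appears so often as a first segmentation step.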

Chayadevi et al. [34] implemented unsupervised learning algorithms for extracting bacterial clusters in microscopic images. The dataset consisted of 320 digital microscopic images of bacteria species. The methodology involved image pre-processing using a threshold method and binarization. A feature set was then generated with 81 features such as perimeter, eccentricity and circularity. Following this, bacterial clusters were extracted using an ANN technique called the self-organizing map (SOM) and the K-means clustering algorithm. For counting and recognizing individual bacterial types, the Freeman chain code method was used. The clustering performance was evaluated using five quality measures, namely intra-cluster distance, inter-cluster distance, cluster separation, cluster compactness and overall cluster quality. Based on these measures, the authors concluded that SOM performed better in extracting bacterial clusters from microscopic images. Ahmed et al. [35] implemented SVM for the classification of bacteria species using a laser light scattering technique. The study was performed on images of scatter patterns formed by ten Vibrio species. The technique involved extraction of Zernike and Chebyshev moments along with Haralick texture descriptors using a grid computing approach, resulting in a feature vector containing thousands of features. The most prominent features were then selected using Fisher’s criterion and used to train SVM for classification. Ayas et al. [36] implemented the random forest (RF) algorithm for the image segmentation and classification of tuberculosis bacteria using ZN stained sputum smear images. The methodology is shown in Fig. 6. The authors also compared the segmentation and classification performance of RF with that of other ML techniques and concluded that RF outperformed them. Govindan et al. [37] proposed an ML based approach for tuberculosis identification in ZN stained sputum smear images. 
The proposed technique included image segmentation using a decorrelation stretching technique and K-means clustering. Shape based features were then extracted, and SVM was used to classify images as TB-positive or TB-negative. Nie et al. [38] proposed a DL based framework for the segmentation and classification of bacterial colony images. The dataset was composed of 862 images of growing bacteria colonies. The methodology employed a conditional deep belief network (CDBN) based on restricted Boltzmann machines for segmenting the bacterial colony image into different regions (patches), such as bacteria colonies, plate and agar, and for extracting patch-level features. These features were then used to train SVM to classify image patches as foreground or background. In the next step, a convolutional neural network (CNN) was used to predict the bacterial colony in each foreground image patch. Ghosh et al. [39] proposed an automatic approach for the detection of tuberculosis bacteria regions in ZN stained sputum smear images. Firstly, the regions containing bacteria were highlighted and segmented using a threshold method. Features such as shape, granularity and colour were then extracted from the segmented regions. Using these features, a classifier was designed with fuzzy membership functions to predict the presence of tuberculosis bacteria in sputum smear images. Seo et al. [40] implemented ML algorithms for the classification of five species of Staphylococcus bacteria using a hyper-spectral imaging system. The species under study were S. haemolyticus, S. sciuri, S. hyicus, S. simulans and S. aureus. Firstly, spectral signatures were extracted from the region of interest. In the next step, outliers were removed using the Mahalanobis distance method. PCA was then applied for dimensionality reduction. Finally, classification was performed using SVM and partial least squares discriminant analysis, of which SVM performed better in terms of accuracy and kappa coefficient. Priya et al. 
[41] proposed an automatic classification method for tuberculosis bacteria using sputum smear images. The dataset included 100 sputum smear images, from which 1278 labeled bacilli objects and 259 outliers were extracted for training. The approach involved image segmentation using an active contour method. Shape features were then extracted, and a fuzzy entropy function was applied to select the important feature descriptors. Using these feature descriptors, an MLP was trained using SVM for classification. The authors also implemented a back propagation learning based MLP and compared its classification results with those of the proposed classifier, which outperformed the back propagation based MLP. Ferrari et al. [42] proposed a DL based system for counting bacterial colonies in culture plate images. The approach involved acquiring blood agar plate images and segmenting the bacterial colonies using an adaptive and mixed global threshold method. The images so obtained were pre-processed using enhancement techniques. CNN was then employed to classify these images into seven classes. The authors also compared the proposed system with a hand-crafted features based SVM method and the conventional watershed method; the proposed system outperformed both. Lopez et al. [43] presented a DL based method for classifying microscopic smear patches to identify tuberculosis bacteria. The methodology involved patch extraction from microscopic sputum smear images followed by data augmentation, resulting in 29,310 patch images. In the next step, CNN was trained to classify images as positive or negative patches. The training was done using three versions of the patches: R-G, RGB and grey scale. ROC curves were used for the comparative analysis of the different versions. The best accuracy of 99% was obtained using the R-G colour format. Turra et al. [44] implemented CNN for hyper-spectral imaging based bacteria identification. 
The methodology involved data normalization and spatial-spectral analysis for the extraction of colony spectral signatures. CNN was then employed to classify the spectral signatures of each bacterial colony. The authors also compared the CNN classification results with those of SVM and RF. The experimental results showed the best accuracy of 99.7% for CNN. Zielinski et al. [45] proposed a DL based approach for the recognition of bacterial species. The authors also provided a dataset called DIBaS (Digital Images of Bacterial Species) containing 660 images of 33 species. The approach worked by extracting texture features and deep features. Texture descriptors were extracted using the scale-invariant feature transform (SIFT), and deep features were extracted using three CNN architectures: AlexNet, VGG-M and VGG-VD. In the next step, a pooling encoder was applied to obtain a single feature vector. The feature vector was then used to train SVM and RF for classification.
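When a feature vector contains thousands of entries, as in the work of Ahmed et al. [35], a ranking criterion is needed before training a classifier. The following sketch scores each feature by the ratio of between-class to within-class variance, which is one common form of Fisher's criterion. The exact formulation used in [35] is not given in this review, so the formula, the function name `fisher_scores` and the toy data are assumptions for illustration only.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Score each feature column by between-class over within-class variance.

    X: list of samples, each a list of feature values.
    y: list of class labels, one per sample.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = mean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        scores.append(between / within if within > 0 else float("inf"))
    return scores


# Usage: feature 0 separates the two classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 1.0], [0.9, 3.0], [5.0, 4.0], [5.1, 2.0], [4.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
scores = fisher_scores(X, y)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(ranking)  # indices of features, most discriminative first
```

Keeping only the top-ranked features reduces the input dimension of the classifier without requiring any model training during selection.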

Fig. 6 Flowchart of RF based identification of tuberculosis bacteria in ZN stained sputum smear images [36]

Wahid et al. [46] proposed a transfer learning based approach for the classification of bacterial microscopic images. The dataset consisted of 500 grayscale images of five bacterial species, namely Clostridium botulinum, Neisseria gonorrhoeae, Vibrio cholerae, Borrelia burgdorferi and Mycobacterium tuberculosis. The images were pre-processed in a series of steps. Firstly, the images were converted from grayscale to RGB. The images were then flipped and translated. In the next step, deep features were extracted using a pre-trained Inception deep CNN model. For classification, the last three layers of the Inception model were replaced by a fully connected layer, a softmax layer and a classification output layer. Andreini et al. [47] proposed a DL based technique for the segmentation of bacterial colonies in agar plate images. The proposed technique involved synthetic data generation and semantic image segmentation using a ResNet CNN model.
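The pre-processing steps described for [46] (grayscale to RGB conversion, flipping and translation) can be sketched on a tiny image stored as nested lists. How [46] performed the conversion and which offsets were used is not stated here, so channel replication, the zero fill value and the helper names below are assumptions made for illustration.

```python
def gray_to_rgb(img):
    """Replicate each grey value into three identical channels (assumed method)."""
    return [[(p, p, p) for p in row] for row in img]


def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def translate(img, dx, dy, fill=0):
    """Shift the image by dx columns and dy rows, padding with `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


# Usage: one source image yields several augmented training samples.
image = [[1, 2],
         [3, 4]]
augmented = [image, flip_horizontal(image), translate(image, 1, 0)]
rgb_samples = [gray_to_rgb(sample) for sample in augmented]
```

Augmentation of this kind enlarges a small dataset, and the three-channel conversion matches the input format expected by CNNs pre-trained on colour images.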

Hay et al. [48] proposed a DL based approach for the classification of bacterial and non-bacterial objects in larval zebrafish intestines using three-dimensional microscopy imaging. The proposed technique involved extracting manually labeled three-dimensional regions of interest using histogram equalization, followed by data augmentation. A three-dimensional CNN was then used to classify regions as bacteria or non-bacteria. The authors also compared the three-dimensional CNN with hand-crafted features based techniques such as RF and SVM and concluded that the three-dimensional CNN outperformed both, with a classification accuracy of 89.3%. Mohamed et al. [49] applied SVM for bacterial microscopic image classification. The dataset was generated by extracting 200 images of 10 bacteria species from the DIBaS database. Firstly, the images were converted from RGB to grayscale and then enhanced by applying histogram equalization. In the next step, features were extracted using the bag of words (BOW) model. These features were fed as input to SVM for classification. The methodology is shown in Fig. 7.
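Histogram equalization, used for enhancement in both studies above, remaps grey levels so that their cumulative distribution becomes approximately uniform. The following minimal sketch works on a flat list of integer grey levels; the function name `equalize_histogram` and the input format are assumptions, and the reviewed papers may well have used library implementations instead.

```python
def equalize_histogram(pixels, levels=256):
    """Map grey levels through the normalised cumulative histogram."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of grey levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        return list(pixels)  # constant image: nothing to stretch

    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1))
            for p in pixels]


# Usage: a low-contrast patch is stretched to the full 0..255 range.
patch = [100, 100, 101, 102]
enhanced = equalize_histogram(patch)
```

The mapping is monotonic, so the ordering of intensities is preserved while the contrast between neighbouring grey levels increases.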

Fig. 7 Workflow of bacteria image classification using BOW model and SVM [49]

Rahmayuna et al. [50] implemented genus-level classification of bacteria species using ML techniques. The dataset consisted of 600 optical images of four bacteria species, namely Listeria sp., Escherichia sp., Staphylococcus sp. and Salmonella sp. The methodology involved image enhancement using CLAHE followed by extraction of Zernike moments and texture features. The optimal features were then selected manually and used as input to SVM for classification. The authors employed two SVM kernels, the radial basis function and the linear kernel. The experimental results showed a better accuracy of 90.33% for SVM with the radial basis function.
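The two kernels compared in [50] differ in how they measure similarity between feature vectors. The sketch below gives their standard definitions; the value of `gamma` is a hypothetical choice, since the kernel parameters used in [50] are not reported in this review.

```python
import math


def linear_kernel(x, z):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - z||^2).

    gamma=0.5 is an illustrative default, not a value taken from [50].
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


# Usage: the RBF kernel equals 1 for identical vectors and decays with distance.
u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v))
print(rbf_kernel(u, u), rbf_kernel(u, v))
```

Because the RBF kernel depends only on distance, it can model non-linear class boundaries in the original feature space, which may explain why it sometimes outperforms the linear kernel on texture and moment features.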

Panicker et al. [51] proposed a DL based method for the detection of tuberculosis bacilli in sputum smear images. The dataset included 120 images, from which 900 bacilli regions and 900 non-bacilli regions were extracted for training. The approach involved image de-noising using a fast non-local technique and segmentation using the Otsu threshold method. The segmented objects were classified using CNN. The proposed model was evaluated using 22 sputum smear images containing 1817 bacilli. The experimental results showed 97.13% sensitivity, 78.4% precision and an 86.76% F-score. Traore et al. [52] applied CNN for the image classification of Vibrio cholerae and Plasmodium falciparum. The CNN was trained using 200 Plasmodium falciparum and 200 Vibrio cholerae images. The CNN architecture included six convolutional layers and one fully connected layer. Each convolutional layer was followed by a ReLU activation function and a max-pooling layer. Training was done using a stochastic gradient method. The model was tested on 80 images of both species.
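The three figures reported for [51] are linked: the F-score is the harmonic mean of precision and sensitivity (recall). The short check below recomputes it from the two reported values; only the formula is assumed here, and the inputs are the percentages quoted above.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)


# Values reported for [51]: 78.4% precision and 97.13% sensitivity.
f1 = f_score(0.784, 0.9713)
print(f"{f1:.4f}")
```

The result is approximately 0.868, in line with the reported F-score of 86.76%. The gap between sensitivity and precision indicates that the detector misses few bacilli but raises a noticeable number of false alarms.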

Ahmed et al. [53] developed a hybrid model using CNN and SVM for the classification of bacteria species. The dataset consisted of 800 images of seven bacteria species. The methodology involved image pre-processing by manual cropping and conversion of the images from grayscale to RGB. Feature extraction was then performed using the Inception-V3 CNN architecture, and the extracted features were flattened using an average pooling function. In the next step, SVM was trained on these features for classification.
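One way to read the flattening step is as global average pooling, in which every feature map produced by the CNN is reduced to its mean value. This interpretation is an assumption, as is the nested-list representation used in the sketch below; the review does not specify how [53] implemented the pooling.

```python
def global_average_pool(feature_maps):
    """Reduce each 2-D feature map to its mean, giving one value per channel.

    feature_maps: list of channels, each a list of rows of numbers
    (hypothetical layout standing in for a CNN output tensor).
    """
    pooled = []
    for fmap in feature_maps:
        total = sum(sum(row) for row in fmap)
        pooled.append(total / (len(fmap) * len(fmap[0])))
    return pooled


# Usage: two 2x2 feature maps become a 2-element feature vector.
maps = [[[1, 2], [3, 4]],
        [[0, 0], [0, 8]]]
vector = global_average_pool(maps)
print(vector)
```

The resulting fixed-length vector can be passed directly to a classical classifier such as SVM, regardless of the spatial size of the feature maps.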

Mithra et al. [54] proposed a DL based technique for identifying and counting tuberculosis bacteria in sputum smear images. The dataset consisted of 500 sputum images. The proposed technique involved noise reduction using adaptive median filtering. The images were then converted to grayscale, and a channel area thresholding algorithm was employed for image segmentation. In the next step, feature extraction was done using speeded up robust features (SURF) and a location oriented histogram. These features were then used to train a CDBN based on restricted Boltzmann machines for counting and for classifying images as few bacillus, no bacillus or overlapping bacillus. Treebupachatsakul et al. [55] presented a DL based method to recognize two species of bacteria. The study was performed using two datasets, each containing sample images of the Staphylococcus aureus and Lactobacillus delbrueckii bacteria species. The authors used the LeNet CNN method for classification.

Bonah et al. [56] implemented two ML algorithms, SVM and linear discriminant analysis, for the image classification of food-borne pathogenic bacteria species using hyper-spectral imaging. The authors also employed meta-heuristic optimization algorithms (grid search (GS), GA and particle swarm optimization (PSO)) to optimize the SVM parameters, and optimal wavelength selection methods (competitive adaptive reweighted sampling (CARS), synergy interval (SI), ant colony optimization (ACO), SI-GA and GA) to reduce the number of wavelengths. The experimental results showed that the combination of SVM, CARS and PSO achieved better results. In the same year, Treebupachatsakul et al. [57] implemented the LeNet CNN architecture for the image classification of three bacteria species (Micrococcus spp., Staphylococcus aureus, Lactobacillus) and one yeast species (Candida albicans). The dataset used in this study contained standard resolution images of the species. For comparison, the same architecture was applied to high resolution images of the species considered in this study, selected from the dataset presented in [45]. The experimental results showed an accuracy of more than 98.6% on both standard and high resolution images. Mhathesh et al. [58] implemented a three-dimensional CNN to locate and classify Vibrio cholerae bacteria in three-dimensional microscopic images of zebrafish. The authors employed a Vibrio cholerae image dataset containing three-dimensional images of zebrafish contaminated by Vibrio cholerae bacteria. The proposed methodology involved separating and manually labeling Vibrio cholerae in each three-dimensional zebrafish image by applying a Gaussian method. The three-dimensional CNN was then trained on these images for feature extraction and classification. Table 2 summarizes the papers reviewed on bacteria image recognition and gives a detailed analysis of them.

Table 2 Summary of research papers reviewed on ML methods for bacteria image recognition

3.2 ML in Algae Image Recognition

Algae are unicellular or multicellular photosynthetic eukaryotes. They are typically a large group of aquatic plants and are used as bio-indicators for monitoring freshwater ecosystems. They produce carbohydrates and oxygen that are used by other organisms. Algae are important as a food source, as a fertilizer, in fish farming and in reclaiming alkalinity. Algae are not typically considered pathogens, but they have indirect negative effects on the environment and on humans. Harmful algal blooms affect organisms by producing natural toxins. Green algae are responsible for a disease called protothecosis in humans, cats, dogs and cattle [59]. Sample images of some algae species are shown in Fig. 8. Because of the significance of algae in the ecosystem, scientists are working on their taxonomic classification. Researchers have used various ML techniques to develop assistive tools that detect algae abundance and classify algae based on their characteristics.

Fig. 8

Example of microscopic images of different algae genera [73]

Thiel et al. [60] implemented discriminant analysis for the identification of blue-green algae. The dataset consisted of image samples of seven blue-green algae species and two green algae species. The methodology involved image segmentation using the Sobel edge detection technique and a threshold technique based on the Laplacian of Gaussian operator. The images were then enhanced using neighborhood averaging and processed to extract Fourier descriptors, moment invariants and texture descriptors. Using these descriptors, classification was performed with discriminant analysis. The classifier correctly predicted 155 of the 158 image samples. Tang et al. [61] presented an automatic system for the recognition of real-time plankton images. The experiment was performed on 1869 images of six plankton species, obtained using a Video Plankton Recorder. The proposed system applied the mean shift method to segment the grayscale images into binary images. Following this, shape and granularity features were extracted. In the next step, the Karhunen–Loeve transform and the Bhattacharyya distance were employed for feature selection. The selected features were then fed as input to an ANN technique called Learning Vector Quantization (LVQ) for classification. Alvarez et al. [62] presented an ML-based approach for diatom image recognition. The approach involved image enhancement using a spatial domain method. The enhanced image was then divided into blocks. Following this, contrast stretching was performed, and image frequency based parameters were extracted from the power spectrum of the images to generate a feature vector. The feature vector was fed to an LVQ neural network for classification, and the model was trained using competitive learning. Luo et al. [63] implemented an active learning based multi-class SVM for plankton image classification. The dataset consisted of 7440 images of five plankton species. Another ML-based approach was presented by Blaschko et al. [64] for the image recognition of plankton species. The dataset consisted of images of plankton species such as diatoms and dinoflagellates. The images were acquired using a FlowCam instrument. The approach involved image segmentation using snake and intensity based methods. Then five types of features, namely differential, contour representation, moment, texture and shape features, were extracted. Classification was done using single as well as ensemble classifiers. Single classifier models were developed using five ML algorithms, namely K-NN, SVM, decision tree, ridge regression and naïve Bayes. Ensemble classifier models were based on boosting and bagging methods. The models were trained using single and mixed feature vectors and the results were compared. Better classification results were achieved with the mixed feature vector and SVM. Jalba et al. [65] implemented ML techniques for diatom identification based on contour analysis. The methodology involved contour extraction followed by adaptive smoothing of the contours. In the next step, the Freeman chain code method was applied to extract curvature measures. Following this, the curvature scale space was obtained and shape features were extracted using the mean shift method. K-NN and decision tree algorithms were then employed for classification. The study was performed using two datasets. The first dataset consisted of 120 contour images of six demes of the Sellaphora pupula species. The second dataset consisted of 781 contour images of 37 different diatom species. Tao et al. [66] proposed a technique for the classification of red tide algae. The experimental study was carried out on images of six known algal species and two unknown algae, which were treated as outliers. The proposed methodology involved intensity based segmentation of the images to extract the region of interest, followed by shape, texture and differential feature extraction. A naïve Bayes classifier was then used to remove the unknown algae, after which a linear SVM was employed for classification. The performance of the linear SVM classifier was compared with that of a K-NN classifier and an SVM with a radial basis function kernel, and the linear SVM was found to be the better classifier. The model was evaluated using threefold cross-validation and achieved an accuracy of 74.3%. In 2010, Tao et al. [67] improved the previously proposed technique by using support vector data description to remove the unknown algae, which raised the accuracy to 82.3%.
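The Freeman chain code step mentioned for [65] can be pictured with a short example. This is a minimal sketch that assumes the contour is already available as an ordered list of 8-connected (row, col) pixels; the direction numbering (0 = east, increasing counter-clockwise) is one common convention, and the function name and toy contour are hypothetical rather than taken from the reviewed paper.

```python
# Map a (d_row, d_col) step between neighbouring pixels to a Freeman code.
# Assumed convention: 0 = east, codes increase counter-clockwise,
# with rows growing downward as in image coordinates.
DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}

def freeman_chain_code(contour):
    """Encode an ordered list of (row, col) contour pixels as direction codes."""
    codes = []
    for (r0, c0), (r1, c1) in zip(contour, contour[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in DIRECTIONS:
            raise ValueError(f"{(r0, c0)} and {(r1, c1)} are not 8-neighbours")
        codes.append(DIRECTIONS[step])
    return codes

# Closed unit square traversed east, south, west, north (toy data).
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
chain = freeman_chain_code(square)

# Differences between successive codes (mod 8) give a crude turning measure
# from which curvature-like features can be derived.
turns = [(b - a) % 8 for a, b in zip(chain, chain[1:])]
```

Because the code stores only relative steps, it is independent of where the contour sits in the image, which is one reason chain codes are convenient for shape description.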

Xu et al. [68] presented a technique for the classification of red tide algae images. The methodology involved extracting the region of interest using the Otsu self-adaptive method, followed by smoothing closed contours using the Canny edge detection technique. In the next step, shape features were extracted, and an ensemble of SVM, summation of negative probability and semi-supervised fuzzy C-means clustering was used for classification. Mosleh et al. [69] conducted a study on the image recognition of freshwater algae species. The dataset consisted of images of five algae genera, namely Navicula, Oscillatoria, Chroococcus, Microcystis and Scenedesmus. The study involved image pre-processing using histogram equalization for contrast enhancement and a median filter for noise removal. Image segmentation was then performed using Canny edge detection. In the next step, shape and texture features were extracted: shape features such as area, perimeter, major axis and minor axis, and texture features obtained using the Fourier transform and PCA. The extracted features were fed as input to an MLP for classification. The MLP was trained using the back propagation algorithm. Drews- et al. [70] proposed a semi-supervised and active learning based ML approach for the classification of microalgae. The experiment was performed on two datasets acquired using FlowCam equipment. The first dataset consisted of four classes: Misopores, Flagellates, Pennate diatom and others. The second dataset also included four classes: Flagellates, Pennate diatom, Porocentrales and Gymnodinium. Both datasets were imbalanced. The approach involved extracting features such as width, length and transparency using FlowCam. In the next step, a Gaussian mixture model together with the expectation-maximization algorithm was used for classification. The classification results were further optimized by using active learning. Schulze et al. [71] proposed an automatic system for phytoplankton identification.
The dataset included images of 10 phytoplankton species along with some images of detritus and unknown plankton species. The images were segmented using a region growing approach and Sobel edge detection. Following this, a feature set was generated from features extracted with the measurement function in ImageJ (a Java-based image processing software), Fourier descriptors, the gray-level co-occurrence matrix, image moments, rotation invariants, symmetry measurements and some fluorescence features. Using all extracted features, an ANN with the Elliott activation function was trained. Coltelli et al. [72] presented an unsupervised learning based approach for the automatic identification of algae. The dataset included 53,869 images of 23 algal strains. The methodology involved image pre-processing for contour detection and object segmentation. Following this, shape and colour features were extracted to generate a feature vector, which was fed as input to an unsupervised neural network technique, the SOM, for identification of the algal strains. Promdaen et al. [73] introduced an automatic image recognition method for microalgae. The dataset included 720 images of 12 genera from three types of microalgae, namely blue-green algae, euglenoids and green algae. The proposed method involved pre-processing each image by resizing it and converting it from colour to grayscale. For image segmentation, the dataset was divided into two groups: rod-shaped and non-rod-shaped genera. Images of rod-shaped genera were segmented using multi-resolution edge detection and images of non-rod-shaped genera using single-resolution edge detection. In the next step, texture and shape features were extracted. These features were then used for classification with an SVM trained using the sequential minimal optimization (SMO) technique. Dannemiller et al. [74] presented an SVM-based approach for image segmentation of algae species. The approach involved image quality enhancement using the Retinex algorithm.
Then three types of spatial features, namely mean, variance and frequency, were extracted. Using these features, an SVM was trained to classify images as background or algae. The dataset included 100 background images and 100 algae images. Medina et al. [75] proposed an ANN-based methodology for the detection of algae in underwater pipelines. The dataset included 19,921 annotated video frames. The methodology involved image segmentation using the Hough transform and Canny edge detection, with a Gaussian filter for noise removal. In the next step, wavelet-based features were used to train a neural network for classification. The network was an MLP with one input layer of 23 neurons, one hidden layer of 12 neurons and one output layer, and it used the hyperbolic tangent activation function. The algorithm was further optimized with a clustering based method to reduce the false positive rate. The authors also compared the proposed ANN architecture with SVM. The accuracy achieved was similar for both techniques, but the processing time was lower for the ANN. A pixel-wise classification approach was presented by Qiu et al. [76] for the segmentation of microscopic images of Chaetoceros. In the first step, images were resized and converted to grayscale. Then a grayscale surface direction angle model was applied to extract pixel-level feature maps. In the next step, connected region pre-segmentation was done using a threshold method to select correct training samples. An SVM was then trained on the extracted feature maps to classify pixels as background or object. Correa et al. [77] proposed two approaches to address dataset imbalance in microalgae classification. The study was conducted on an imbalanced dataset consisting of 24,302 images of 19 microalgae species. The first approach involved resampling using the synthetic minority oversampling technique (SMOTE) and random resampling. The second approach was based on a cost matrix to reduce the cost of the minority class.
The authors implemented five ML algorithms, namely SVM, MLP, K-NN, decision trees and naïve Bayes, for classification. The classification performance was evaluated using the kappa and F-score metrics. Resampling and SMOTE achieved favorable kappa values of 0.981 and 0.810, respectively. Medina et al. [78] proposed a DL-based approach for detecting algae in underwater pipelines. The dataset consisted of 41,992 video frames annotated as algae and non-algae. The methodology involved a comparison of a multi-layer perceptron and a CNN for classification. The multi-layer perceptron was trained using shape and texture features. The authors also performed post-processing based on spatial and temporal analysis to reduce false positives. Experimental results showed better classification accuracy for the CNN. Giraldo-Zuluaga et al. [79] proposed an automatic identification approach for Scenedesmus algae in microscopic images. The dataset consisted of Scenedesmus coenobia images belonging to four classes: coenobia with one cell, coenobia with 2 cells, coenobia with 4 cells and coenobia with 8 cells. The proposed methodology involved noise removal using CLAHE. Following this, image segmentation was implemented using a threshold method. In the next step, shape and texture features were selected using sequential forward selection. The selected features were then used to train an ANN and an SVM for classification. Experimental results showed a better classification accuracy of 98.63% for the SVM. Dannemiller et al. [80] proposed an ML-based approach for algae identification. The approach employed a non-uniform background correction technique for image quality enhancement. The enhanced images were then partitioned into blocks to extract texture features. Using these features, an SVM was trained to classify image regions as algae or background. The training dataset included 100 algae regions and 100 background regions.
Lakshmi et al. [81] proposed an ML-based approach for algal image classification. First, 400 Chlorella algae images were acquired and pre-processed with a median filter to remove noise. Texture features were then extracted. For classification, the authors presented a comparative study of various neural networks and a DL-based CNN. Performance was evaluated by plotting the ROC curve. The experimental results showed a higher accuracy of 91.82% for the CNN. Wu et al. [82] implemented an ANN for the recognition of algal blooms in synthetic aperture radar images. The experiment was performed on 74 dark regions extracted from 17 synthetic aperture radar images, of which 39 were caused by algae and 35 by wind (non-algal). The methodology involved image pre-processing using filtering, linear transformation, geometric correction and radiation calibration. To extract the dark regions, K-means clustering and a region growing algorithm were applied. In the next step, texture features, shape features and gray features were extracted. Using these features, an ANN was trained with the back propagation algorithm to classify each dark region image as algal bloom or non-algal bloom. Deglint et al. [83] implemented three ANN models for the identification of algae. The study was performed using images of six species of algae. The first classification model was trained using morphological (shape) features, the second using fluorescence-based spectral features and the third using a combination of shape and fluorescence-based spectral features. The authors concluded that the second and third models, which used spectral features, achieved better results. Park et al. [84] proposed a hierarchical learning method based on semantic features of non-negative matrix factorization for red tide algae image recognition.
To further enhance the image recognition process, the authors also used image entropy and a roundness measure to recognize species. The experiment was performed using 3500 images of 63 red tide algae species. The authors also compared the proposed technique with other ML techniques such as SVM, naïve Bayes and ANN, and concluded that the proposed technique performed better. Iamsiri et al. [85] proposed a modified version of the multi-resolution segmentation algorithm previously proposed in [73]. The authors also proposed a skeleton-based shape descriptor as a feature. The technique involved pre-processing the outer blue channel of the input image as the grayscale version of the input image. In the modified approach, edge-based segmentation and region-based segmentation were combined, and eccentricity was computed on the final segmentation. The authors also performed edge enhancement on the grayscale image using CLAHE. This technique did not substantially improve on the previous accuracy in [60], but it reduced the ambiguity between species. Sanchezh et al. [86] proposed an ML-based technique to identify diatoms at their life cycle stages. The methodology involved feature extraction using phase congruency, elliptical Fourier descriptors and Gabor descriptors. Feature selection was then done using PCA and linear discriminant analysis. For classification, the authors implemented both supervised and unsupervised algorithms: K-NN and SVM for supervised learning and the K-means algorithm for unsupervised learning. The experimental results showed 99% accuracy for the supervised algorithms and 98% for the unsupervised algorithm.
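The SMOTE resampling used in [77] to counter class imbalance works by interpolating between minority-class samples. The following is a minimal sketch of that idea, assuming small feature vectors stored as Python lists; the function name `smote`, its parameters and the toy data are hypothetical, and production work would normally rely on an established library implementation instead.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples, each lying on the line segment between
    a randomly chosen minority sample and one of its k nearest minority
    neighbours (squared Euclidean distance)."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of the base sample among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy minority class with two features per sample (hypothetical values).
minority = [[1.0, 1.0], [1.5, 1.2], [2.0, 1.0], [1.2, 1.8]]
new_points = smote(minority, n_new=6)
```

Every synthetic point stays inside the region already spanned by the minority class, which is why SMOTE tends to be less prone to overfitting than simply duplicating minority samples.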

Ruiz-Santaquiteria et al. [87] presented a comparative study of semantic and instance segmentation methods for diatom image segmentation. The authors implemented semantic segmentation using SegNet and instance segmentation using Mask-RCNN. On the basis of the experimental results, the authors concluded that instance segmentation performed better overall, with a specificity of 91%, a precision of 85% and a sensitivity of 86%. Semantic segmentation achieved a higher sensitivity of 95%, but its specificity dropped to 60% and its precision to 57%. A flowchart of Mask-RCNN applied to diatom image segmentation is shown in Fig. 9. Table 3 summarizes the papers reviewed on algae image recognition and gives a detailed analysis of them.
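The trade-off reported above between sensitivity on the one hand and specificity and precision on the other follows directly from how the three measures are defined on pixel counts. The sketch below uses the standard definitions; the counts are hypothetical and are not taken from [87].

```python
def segmentation_scores(tp, fp, tn, fn):
    """Compute sensitivity, specificity and precision from pixel counts
    (tp/fp = true/false positives, tn/fn = true/false negatives)."""
    sensitivity = tp / (tp + fn)   # share of true object pixels that were found
    specificity = tn / (tn + fp)   # share of background pixels left as background
    precision = tp / (tp + fp)     # share of predicted object pixels that are correct
    return sensitivity, specificity, precision

# Hypothetical counts for a method that over-segments:
# few missed object pixels, but many background pixels marked as object.
sens, spec, prec = segmentation_scores(tp=90, fp=30, tn=70, fn=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} precision={prec:.2f}")
```

A method that labels more pixels as object lowers the number of false negatives, which raises sensitivity, but it also adds false positives, which lowers both specificity and precision.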

Fig. 9

Flowchart of Mask-RCNN for diatom image segmentation [87]

Table 3 Summary of research papers reviewed on ML methods for algae image recognition

3.3 ML in Protozoa Image Recognition

Protozoa are single-celled eukaryotes that are either parasitic or free-living. They feed on other microorganisms. Sample images of some protozoa species are shown in Fig. 10. Environmental microorganisms also belong to this group. Protozoa play an important role in sanitization and help keep drinking water safe. They also excrete phosphorus and nitrogen, and their presence in soil increases plant growth. They can live in animal and human bodies. Inside the human body, they cause diseases such as transitory edema, diarrhea and intestinal diseases, and can damage the brain and eyes [88]. Traditionally, they were detected using physical or chemical reactions and molecular biology methods based on RNA or DNA. Those methods were expensive and time-consuming. Researchers have therefore used various ML algorithms to develop tools for the automatic image analysis of protozoa.

Fig. 10

Example of microscopic images of protozoa species. a Peranema, b Euglypha, c Coleps, d Aspidisca cicada [93]

Widmer et al. [89] proposed an ANN-based approach to identify Cryptosporidium parvum oocysts in microscopic images. The training image set consisted of 525 digitized microscopic images. First, these images were cropped to 36 × 36 pixels. The cropped images were then pre-processed and fed as input to an ANN for classification into two classes: oocyst and non-oocyst. The ANN was trained using the back propagation algorithm. A different set of 362 images was used to evaluate the proposed approach, which achieved 81% accuracy. This work was extended by Widmer et al. [90] to the image classification of two protozoa species, namely Cryptosporidium parvum and Giardia lamblia. In this work, shape-based features were extracted and two ANN models were developed for classification. The first ANN model was trained using 1586 images of Cryptosporidium parvum. The second ANN model was trained using 2431 images of Giardia lamblia. The model trained with Giardia lamblia images performed best in terms of classification accuracy, correctly classifying 99.6% of Giardia cyst images and 91.8% of Cryptosporidium oocyst images. Weller et al. [91] presented an unsupervised ANN-based technique for the classification of dinocyst images. The dataset included 908 images. The methodology involved extracting features based on shape, colour and texture. The SOM clustering algorithm was then used to identify five different dinocyst clusters, namely proximate dinocysts, freshwater algae, proximochorate dinocysts, chorate dinocysts and proximate dinocysts with long horns. In the next year, Castonan et al. [92] presented an image recognition system for Eimeria species. The dataset included 3891 micrographs of oocysts of seven Eimeria species. The system first adjusted the contrast of the images using histogram equalization. The images were then segmented using a threshold method.
In the next step, the sequential forward selection algorithm was implemented to generate a feature vector with 13 features of three types: curvature, texture and geometric shape. Following this, a Bayesian classifier based on the Gaussian distribution was employed for classification. Ginoris et al. [93] performed a comparative study of three ML algorithms for the image classification of wastewater protozoa and metazoa. The algorithms selected for the study were ANN, decision trees and discriminant analysis. The study was performed on 23 classes of metazoa and protozoa. For experimental purposes, the images were divided into two groups: non-stalked and stalked microorganisms. The approach processed the images in a series of steps. First, the images were pre-treated by applying histogram equalization for image enhancement. Noise removal was then performed using a median filter, and the region of interest was defined. Following this, image segmentation was done using a threshold method. Figure 11 shows the main steps of the protozoa image pre-processing method. In the next step, the selected algorithms were employed for classification. The experimental results showed that neural networks and discriminant analysis performed better than decision trees. However, the overall recognition accuracy was only 50% for stalked images and 85.8% for non-stalked images. For stalked protozoa identification, Amaral et al. [94] presented an ANN-based approach. The study was performed on eight different species, subclasses and genera of stalked protozoa. The images were pre-processed in a series of steps. First, the images were enhanced using histogram equalization. Regions of interest were then extracted using the Otsu threshold method. Following this, post-processing was done to eliminate debris. In the next step, morphological, signature and simple geometric shape based features were extracted. The extracted features were then used for classification with discriminant analysis and a feed-forward ANN trained by back propagation.
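Several of the studies above, including [94], extract regions of interest with the Otsu threshold method. The following is a minimal sketch of that method on a flat list of 8-bit gray values: it picks the threshold that maximises the between-class variance of the two resulting pixel groups. The function name and the toy pixel values are assumptions for illustration only.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximises between-class variance;
    pixels <= t form one class and pixels > t the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class so far
    sum0 = 0  # sum of gray values in the lower class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Toy data: five dark background pixels and five bright object pixels.
pixels = [20, 22, 25, 21, 23, 200, 210, 205, 198, 202]
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]
```

Because the threshold is derived from the histogram alone, the method needs no manual tuning, but it assumes a roughly bimodal intensity distribution.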

Fig. 11

Main steps of Protozoa image pre-processing method, a pre-treated image, b region of interest, c binary image after segmentation, d final image [93]

Suzuki et al. [95] proposed an ML-based technique for the segmentation and classification of human intestinal parasites. The segmentation process involved the image foresting transform, image quantization, border enhancement and ellipse matching. For classification, the authors compared various ML algorithms: ANN-MLP, ANN-MLP with Bagging, ANN-MLP with AdaBoost, the optimum path forest classifier, SVM, and SVM with AdaBoost and optimum path forest. Based on the experimental results, optimum path forest was selected for classification.

Li et al. [96] presented an ML-based technique for the image classification of environmental microorganisms. The study was performed on two datasets, each containing 200 images and 10 classes. In the first dataset, the images of each species (class) came from distinct subspecies (subclasses). In the second dataset, the images of each species came from the same subspecies. The approach involved extraction of shape descriptors such as Fourier descriptors and edge histograms, and geometrical descriptors such as area, complex rate and perimeter. The authors also introduced internal structure histograms to describe shape. Following this, an SVM was employed for classification. The best classification accuracy of 89.7% was achieved with the geometrical feature space on the second dataset. In another study, Li et al. [97] extended the previous work on environmental microorganism image classification [96] by introducing an image pre-processing step, namely image segmentation, before feature extraction and classification. The authors tested six image segmentation methods: a manual method based on mouse clicking, the Otsu threshold method, Canny edge detection, Sobel edge detection, the watershed algorithm, and a method combining Canny edge detection, the zero-crossing LoG technique and the watershed transform. Based on the segmentation results, Sobel edge detection was employed for the final image segmentation. In the next step, shape descriptors such as Fourier descriptors, internal structure histograms, edge histograms and geometrical descriptors were extracted and fed as input to an SVM for classification. The proposed methodology achieved an accuracy of 89.7%, a specificity of 99%, a similarity of 99% and a sensitivity of 99%. In the next year, Yang et al. [98] introduced a robust two-dimensional shape descriptor for environmental microorganism classification.
The methodology involved segmenting the microorganisms from the background by applying a manual method based on double clicking together with the Sobel edge detection technique. In the next step, the orientation of the segmented objects was normalized, followed by feature extraction. These features were then normalized and fed as input to a multi-class SVM for classification. Apostol et al. [99] developed an application, RadSS, that uses SVM for the automatic identification of radiolarian species. The system extracted shape and texture features such as circularity, roundness, contrast and Haralick texture descriptors, followed by feature selection using PCA. An SVM was then used for classification. The evaluation was done on random images selected from the training set during cross-validation. Experimental results showed that the SVM trained with basic texture features and 4 principal components achieved a highest accuracy of 88.33%. In comparison, the SVM with Haralick texture descriptors achieved a highest accuracy of 95% using 6 principal components. Abdalla et al. [100] presented an ML-based approach for the identification of Eimeria species in chickens and rabbits. The experiment was performed using two datasets. The first dataset consisted of 4402 microscopic images of 7 Eimeria species found in chickens. The second dataset consisted of 2902 microscopic images of 11 Eimeria species found in rabbits. The approach involved image enhancement using image filling and complement operations. Image segmentation was then performed using the Otsu threshold method, followed by pixel-level feature extraction. The optimal features were selected using the ReliefF method and then used for classification with K-NN and ANN. The model was evaluated using 5-fold cross-validation. The ANN achieved the best accuracy of 96.6% for the first dataset and 91.9% for the second dataset.
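The k-fold cross-validation used for evaluation in studies such as [100] can be sketched as a simple index split. This is a minimal illustration that assumes the samples have already been shuffled; the function name `k_fold_indices` and the sample count are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs so that every sample is used for
    testing exactly once. Earlier folds take the remainder, so fold sizes
    differ by at most one."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop

# Five folds over twelve (already shuffled) samples.
folds = list(k_fold_indices(n_samples=12, k=5))
for train_idx, test_idx in folds:
    # A classifier would be fitted on train_idx and scored on test_idx here;
    # the reported accuracy is then the mean score over the five folds.
    pass
```

Averaging over folds gives a more stable accuracy estimate than a single train/test split, which matters for the small image sets common in this field.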

Keceli et al. [101] presented a comparative study of different traditional features and deep features for the image classification of radiolarian species. The dataset included microscopic images of four radiolarian species. The approach involved image binarization using the Otsu threshold method, followed by shape, texture and deep feature extraction. The deep features were extracted using a pre-trained CNN called AlexNet. Feature selection was performed on the deep features using the ReliefF method. The authors used four ML algorithms for classification: AdaBoost, SVM, RF and K-NN. Experimental results showed that better accuracy was achieved using deep features. Zhong et al. [102] proposed an ML-based framework for the classification of foraminifera species. The study was performed on 1437 images of foraminifera species collected under different illuminations. The authors performed a comparative study of the bag-of-features framework and pre-trained CNN architectures such as VGG16, ResNet50 and InceptionV3 for feature extraction. In the next step, four ML algorithms, namely RF, ANN, K-NN and SVM, were compared for classification. Experimental results showed better performance for the ANN trained on merged ResNet50 and VGG16 features. Kosov et al. [103] proposed a DL-based framework for the image classification of environmental microorganisms. The study was performed using 400 images of 20 microorganisms. The methodology involved deep feature extraction with a pre-trained VGG16 deep neural network, followed by extraction of simple shape features and texture descriptors. In the next step, a conditional random field based model was developed for the localization and classification of environmental microorganisms. Pho et al. [104] used the RetinaNet architecture to segment and identify protozoa species along with their life cycle stage. The study was performed using cyst and oocyst images of six protozoa species at different life cycle stages.
The dataset consisted of 38 images for training and 31 images for testing. Solano et al. [105] extended the RadSS application for radiolarian classification proposed in [99]. In this study, the authors implemented both supervised and unsupervised techniques for radiolarian species classification. First, shape and texture features were extracted, followed by feature selection using the coefficient of variation technique. For unsupervised learning based classification, the SOM and K-means clustering algorithms were employed. The accuracy achieved using the clustering techniques was 88.75%. For supervised learning based classification, RF, Lazy K-Star and naïve Bayes techniques were used. The naïve Bayes technique achieved the best accuracy of 88.89%.

Vijayalakshmi et al. [106] implemented a transfer learning approach to identify the malaria parasite (Plasmodium falciparum) in blood smear images. The approach involved feature extraction using the VGG19 CNN model, followed by fine-tuning the last three layers of VGG19 using SVM for classification. The dataset consisted of 1030 images with falciparum and 1520 images without falciparum. Mitra et al. [107] used a CNN for the classification of six species of foraminifera. The study was performed on the dataset presented in [102]. First, feature extraction was performed using two pre-trained CNN architectures, ResNet50 and VGG16. The resulting feature map was fed as input to three fully connected layers for classification. To avoid overfitting, dropout layers were used between the fully connected layers. Dionisio et al. [108] employed a CNN for genus-level as well as species-level classification of radiolarian species of nine different genera. The species-level classification was performed only for two species of the Pseudostylosphaera genus. In the next year, Liang et al. [109] proposed an optimized Inception-V3 CNN architecture for the image classification of environmental microorganisms. The research was done using EMDS (Environmental Microorganism Dataset), prepared by applying image augmentation operations such as the black-box method, affine image transformations and the white-box method. The authors employed a genetic algorithm to fine-tune the hyper-parameters of the fully connected layer of Inception-V3, i.e., the dropout rate and the number of neurons. Experimental results showed an accuracy of 92.9% for the proposed optimized Inception-V3.

Zhang et al. [110] proposed a low-cost U-net architecture for the image segmentation of environmental microorganisms. The architecture included a CNN based on Inception-V3, concatenate operations and U-net. The segmentation results obtained with the proposed architecture were further refined in a post-processing step that applied a dense conditional random field to capture global image information. To evaluate the segmentation results, the authors employed six performance metrics: Jaccard, Dice, precision, recall, volumetric overlap error and accuracy. Table 4 summarizes the papers reviewed on protozoa image recognition and gives a detailed analysis of them.
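The six segmentation metrics listed for [110] can all be derived from the overlap between a predicted mask and a ground-truth mask. The sketch below uses the commonly cited definitions (with volumetric overlap error taken as one minus the Jaccard index); the exact formulations in the reviewed paper may differ, and the function name and toy masks are hypothetical.

```python
def overlap_metrics(pred, truth):
    """Compare two binary masks given as flat lists of 0/1 labels of equal
    length (1 = object pixel, 0 = background pixel)."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)

    jaccard = tp / (tp + fp + fn)              # intersection over union
    return {
        "jaccard": jaccard,
        "dice": 2 * tp / (2 * tp + fp + fn),   # overlap weighted towards agreement
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "voe": 1 - jaccard,                    # volumetric overlap error
        "accuracy": (tp + tn) / len(pairs),
    }

# Toy masks of eight pixels (hypothetical data).
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 1, 1, 0]
scores = overlap_metrics(pred, truth)
```

Jaccard and Dice ignore true negatives, so unlike accuracy they are not inflated when the background dominates the image, which is typical for microscopy.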

Table 4 Summary of research papers reviewed on ML methods for protozoa image recognition

3.4 ML in Fungi Image Recognition

Fungi are found in almost every habitat, but most of them live on the ground, mostly in plant material or soil. They include moulds, some mushrooms and yeasts. A group of fungi called decomposers grows on dead plant matter, where it plays a significant role in carbon cycling. Together with bacteria, they release nitrogen, oxygen and phosphorus into the atmosphere. Sample images of fungi are shown in Fig. 12. Most fungi are saprophytic and not pathogenic, but some species release toxins and cause diseases in humans, plants and animals. Aspergillus flavus is a pathogenic fungus that produces aflatoxins and other secondary metabolites [111], which are injurious to both animals and humans. Researchers have employed various ML as well as DL techniques to study fungal species.

Fig. 12

Example of microscopic images of fungi. a Penicillium, b Aspergillus

Jin et al. [112] applied ML techniques to classify hyperspectral images of Aspergillus flavus strains as toxigenic or atoxigenic. The study was performed on two image sets: the first was collected using a halogen light source and the second using an ultraviolet light source. The acquired images were pre-processed in batches through a series of steps. First, geometric correction and data format conversion were done. Following this, scene calibration, wavelength assignment and noise removal were performed. After image pre-processing, PCA was implemented for selecting optimal spectral bands and for data correlation. Further data reduction was done by using GA to select optimal combinations of spectral bands. These bands were then used by SVM for classification. The authors employed K-fold cross-validation for training as well as testing. The ultraviolet image set gave the better average classification accuracy, exceeding 95%. Yu et al. [113] proposed an ML-based approach for the classification of yeast cell images into three classes: no bud, small bud and large bud. The dataset consisted of 240 yeast cell images. First, threshold-based segmentation and image enhancement were performed. In the next step, shape-based features such as compactness, axis ratio and bud-to-area ratio were extracted. Following this, three ML techniques (Mahalanobis distance, SVM and K-NN) were employed for classification. SVM gave better accuracy and a shorter processing time. Tleis et al. [114] presented an ML approach to classify cells of the yeast Saccharomyces cerevisiae as healthy or non-healthy. The approach involved extraction of texture and shape features. The dataset consisted of 1380 images and was imbalanced. To improve the classification accuracy, three sampling techniques, namely over-sampling, under-sampling and SMOTE, were employed.
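To make the SMOTE step concrete, the sketch below generates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. It is a simplified illustration and not the implementation used in [114]: the function name `smote`, its parameters and the random choice of the base sample are assumptions, and the feature vectors in the example are made up.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic new samples from a list of minority feature vectors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a base sample at random (simplification of the original scheme,
        # which loops over every minority sample in turn).
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-dimensional shape/texture features of the minority class.
minority_cells = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_cells = smote(minority_cells, n_synthetic=5, k=2, seed=1)
```

Because every synthetic point lies on a segment between two real minority samples, the new samples stay inside the region already occupied by that class, which is what distinguishes SMOTE from plain duplication in random over-sampling.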
Tleis et al. [114] then performed feature selection using PCA, correlation feature selection and the information gain method. For classification, the authors compared different ML algorithms, namely RF, random committee, LogitBoost, bagging, random subspace, AdaBoost, logistic model trees, decision table and simple logistic. The classification models were evaluated by plotting ROC curves. The best accuracy of 92.27% was achieved with the simple logistic model. Liu et al. [115] proposed an ML-based approach for fungi detection in microscopic fecal images. The proposed approach involved image segmentation using the Otsu threshold method. Then eight shape descriptors such as area, perimeter, eccentricity and concavity point were extracted. In the next step, the values of the different descriptors were counted and the images were classified into three classes: two-circle fungi, three-circle fungi and budding fungi. The classification was performed using ANN in two steps. In the first step, ANN was used to select feature values from images taken at distinct focal lengths. In the next step, ANN was used to classify the fungal images. Back propagation was applied in both steps. The model was trained using 924 and 979 fungi images respectively. Zhang et al. [116] proposed an ML technique for the detection of fungi in leucorrhea images. The methodology involved image segmentation, feature extraction, dimensionality reduction and classification. For image segmentation, an optimized threshold method was applied. The features were extracted using a CNN whose architecture included one input layer, three convolutional layers and one output layer. Each convolutional layer except the last was followed by a subsampling layer. The authors used histogram of gradient to extract edge features from the feature maps obtained at each layer of the convolutional process. PCA was used to reduce the dimensionality of the feature vectors. The resulting feature vector was used by SVM for classification.
Zhang et al. [116] also implemented two other techniques for the same problem: a CNN LeNet architecture and a histogram of gradient based SVM. The proposed approach achieved better specificity and sensitivity than both. Tahir et al. [117] used SVMs for fungus spore detection. The methodology involved acquiring air sample images containing fungus spores. Then, patches of size 78 × 78 × 3 were extracted from the images. The patches were pre-processed by applying 7 × 7 Gaussian filters for smoothing and a median filter for image sharpening. In the next step, histogram of gradient features and handcrafted features such as color, size and shape were extracted. These features were then used to train SVM for classification. The model achieved 88% accuracy. In the next year, Tahir et al. [118] employed CNN for the classification of five types of fungus spores, namely Penicilliodes, Versicolor, Restrictus, Eurotium and Cladosporium. The CNN architecture is presented in Fig. 13. The model achieved a promising classification accuracy of 94.8%. The authors also proposed a novel dataset consisting of 40,800 labeled fungus spore images. Arredondo-Santoyo et al. [119] presented a transfer learning based approach for the characterization of dye de-colorization in fungal strains. The study was performed on an imbalanced dataset of 1024 fungi assay images belonging to four classes depending on the de-colorization level. To address the class imbalance and overfitting problems, the proposed technique involved SMOTE and data augmentation. The authors used traditional features, expert features and deep features. For deep feature extraction, various transfer learning models such as ResNet50, VGG16, VGG19 and GoogleNet were used. The authors combined these feature extraction techniques with different ML classifiers, namely SVM, K-NN, logistic regression, MLP, RF and randomized trees.
In the study by Arredondo-Santoyo et al. [119], the best accuracy of 96.5% was obtained for the combination ResNet-C-SVM (ResNet with controlled image and SVM). Zhou et al. [120] proposed a DL-based approach for identification of filamentous fungi in microscopic images. The study was performed on three Aspergillus species. The approach involved converting images from color to gray, applying a threshold method to convert the gray image to a binary image, and detecting sub-images using the position of each conidium. In the next step, two CNN models, namely GoogleNet and AlexNet, were employed for classification. GoogleNet achieved the best training accuracy of 95% and a test accuracy of 69.25%. Due to the small dataset, AlexNet achieved a lower training accuracy of 85%. Another CNN-based approach was proposed by Hao et al. [121] for the detection of Candida albicans fungi in microscopic leucorrhea images. First, a threshold technique called the maximum inter-class variance technique was applied for image segmentation. Then the segmented fungi sub-images were classified using CNN as images with fungi or images without fungi. To further recognize the fungi, the template matching method and the concave point detection method were employed to determine circles and concave points respectively. Zielinski et al. [122] proposed a DL-based technique for the microscopic image classification of fungal species. The experiment was conducted on 180 images of five fungal species. First, the images were pre-processed using background removal and contrast stretching techniques. Then pre-trained deep neural networks such as AlexNet, InceptionV3 and ResNet were employed for feature extraction. Following this, a bag of words model was used for feature aggregation. These features were provided as input to SVM for patch-based classification of images.
Ma et al. [123] performed a comparative study of different CNN models, namely ResNet50, VGG19, InceptionV3, Xception and InceptionResNet V2, for the image classification of Aspergillus fungi. The experiment was performed on 17,142 images of seven Aspergillus fungi species, namely A. clavatus AI1, A. flavus AI2, A. flavus AI1, A. niger AI1, A. terreus AI1, A. nidulans TN02A7 and A. fumigatus A293. The Xception model achieved the best training accuracy of 99.8% and test accuracy of 99.7%. The Xception model was further validated using another image set containing 2,853 images of the seven Aspergillus species considered in this study. On the validation image set, the Xception model achieved 98.2% accuracy. Table 5 summarizes and gives a detailed analysis of the papers reviewed on fungi image recognition.
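Several of the fungi studies above segment images with the Otsu threshold, also known as the maximum inter-class variance technique (for example Liu et al. [115] and Hao et al. [121]). A minimal sketch of the idea follows; it assumes 8-bit grey levels supplied as a flat list, the function name `otsu_threshold` is hypothetical, and the convention that pixels strictly above the threshold are foreground is a choice made here, not one taken from the reviewed papers.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance,
    where class 0 holds pixels <= t and class 1 holds pixels > t."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0 so far
    sum0 = 0  # sum of grey levels in class 0 so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break     # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy image: dark background (10) and bright objects (200).
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
binary = [1 if v > t else 0 for v in pixels]
```

The threshold is chosen from the histogram alone, so the method needs no training data, which may explain its popularity as a first segmentation step before feature extraction.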

Fig. 13

CNN architecture for fungus detection [118]

Table 5 Summary of research papers reviewed on ML methods for fungi image recognition

4 Discussion

In this paper, we have reviewed the research on applying various ML techniques to image recognition of microorganisms. The reviewed studies show that many researchers have effectively used ML techniques to automate the traditional methods of microorganism classification and identification. Researchers have used different imaging techniques, including microscopic imaging and hyperspectral imaging, for the identification, detection and classification of microorganisms. ML-based methods have also performed well on ZN-stained sputum smear images for the identification of tuberculosis bacteria. This section aims to address the research questions (RQ2-RQ6) framed in section II, Subsection A.

4.1 Image Pre-processing

The general approach adopted for image recognition of microorganisms involves image pre-processing, feature extraction, feature selection and classification. Image pre-processing is performed to enhance the significant features of an image and to remove noise caused by variations in colour or brightness. In the microorganism classification field, smoothing techniques such as the median filter and the Gaussian filter are the most widely adopted techniques for noise removal. Another important step of image pre-processing is to determine the regions of interest using image segmentation. Researchers have adopted local as well as global image segmentation techniques for microscopic image analysis. The segmentation techniques broadly used in the selected studies can be categorized as: (1) edge detection based techniques, (2) region based techniques, (3) threshold methods and (4) clustering based methods. Various ML techniques such as the U-net architecture, SVM and K-means clustering have also been implemented for segmentation purposes. Recently, researchers have also explored DL-based architectures such as SegNet and Mask-RCNN for semantic and instance segmentation respectively.
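As an illustration of the noise-removal step, the sketch below applies a median filter to a small grey-level image stored as a list of rows. It is a minimal example under stated assumptions: the function name `median_filter` is hypothetical, border pixels use only the part of the window that lies inside the image, and the lower median is taken so that every output value is an existing pixel value.

```python
from statistics import median_low


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    filtered = []
    for r in range(rows):
        row_out = []
        for c in range(cols):
            # Collect the neighbourhood, clipped at the image borders.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            row_out.append(median_low(window))
        filtered.append(row_out)
    return filtered


# A flat background with a single "salt" noise pixel in the centre.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
clean = median_filter(noisy)
```

Unlike a Gaussian filter, which averages the outlier into its neighbours, the median discards it entirely, so isolated impulse noise disappears while step edges are largely preserved.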

4.2 Feature Extraction and Selection

Feature extraction and selection is another prominent step in image classification. Feature selection is performed after feature extraction to select the most relevant features. It provides various benefits such as improving accuracy and reducing overfitting and training time. For the classification of microorganisms, the most common features used were shape, texture and differential features. Researchers have employed various descriptors for the extraction of these features. Table 6 presents the types of features used in microorganism classification and the descriptors used to describe them. Recently, some researchers have employed CNN for feature extraction and reported better classification performance using CNN feature maps. Researchers have also explored feature selection techniques such as PCA, GA and linear discriminant analysis to improve performance by reducing redundancy.
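To show what a handcrafted shape descriptor looks like in practice, the sketch below computes area, perimeter and compactness from a binary mask. It is an illustrative approximation only: the function name `shape_descriptors` is hypothetical, the perimeter is estimated crudely by counting exposed pixel edges, and compactness is taken as 4πA/P², which is one of several definitions in use and not necessarily the one adopted in the reviewed studies.

```python
import math


def shape_descriptors(mask):
    """Return area, perimeter and compactness of the foreground (1s) in a 2-D mask."""
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            # Count every pixel edge that faces background or the image border.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                outside = rr < 0 or rr >= rows or cc < 0 or cc >= cols
                if outside or not mask[rr][cc]:
                    perimeter += 1
    # Assumed definition: 4 * pi * area / perimeter^2 (higher = more compact).
    compactness = 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"area": area, "perimeter": perimeter, "compactness": compactness}


# A compact 3 x 3 blob versus an elongated 1 x 9 rod of the same area.
blob = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)] for r in range(5)]
rod = [[1] * 9]
blob_features = shape_descriptors(blob)
rod_features = shape_descriptors(rod)
```

Both shapes have the same area, yet the rod has a much lower compactness, which is the kind of contrast that lets shape features separate, say, round cells from rod-shaped ones.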

Table 6 Descriptors used for different types of features extraction for microorganism classification

4.3 Classification Techniques

Researchers have explored various ML classifiers depending on the type of microorganism to be identified and the type of data modality. Classifiers based on handcrafted features, such as K-NN, the naïve Bayes classifier, SVM and ANN, have been employed for microorganism classification. Recently, DL-based techniques such as deep neural networks, deep belief networks and CNN have also been introduced for the classification of microorganisms. Researchers have also combined CNN and SVM to improve the classification accuracy. However, DL architectures require a long training time. To address this problem, some authors have proposed metaheuristic optimization algorithms such as PSO and GA. Meta-algorithms such as AdaBoost and bagging were also explored to improve model performance. Figure 14 shows the number of research papers in which a particular ML technique was used. As shown in Fig. 14, SVM, ANN and deep neural networks are the classifiers most frequently implemented by researchers for classifying images of different microbes.
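The handcrafted-feature route can be illustrated with the simplest of the classifiers listed above, K-NN, which labels a sample by majority vote among its nearest training samples. The sketch below is a minimal illustration: the function name `knn_predict`, the two-dimensional feature vectors (imagined here as compactness and axis ratio) and the class labels are hypothetical and not drawn from any reviewed dataset.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training samples closest to query."""
    # Sort training indices by Euclidean distance to the query vector.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (compactness, axis ratio) features for two yeast cell classes.
train_X = [[0.90, 1.1], [0.85, 1.2], [0.88, 1.0],
           [0.40, 2.5], [0.35, 2.8], [0.45, 2.6]]
train_y = ["no bud", "no bud", "no bud",
           "large bud", "large bud", "large bud"]

label = knn_predict(train_X, train_y, [0.87, 1.15], k=3)
```

K-NN has no training phase, but its result depends on feature scaling; features on very different scales would normally be standardised before computing distances.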

Fig. 14

Distribution of ML techniques used in reviewed research papers

4.4 Performance Metrics

After feature extraction, selection and model implementation, the next important step is to assess the effectiveness of the model. In the microorganism image recognition field, researchers have mostly reported accuracy, recall, precision, sensitivity, F-score and specificity for classification tasks; recall and sensitivity denote the same quantity under different names. In addition, for image segmentation tasks, metrics such as the Jaccard index, Dice coefficient, sensitivity (recall), specificity, precision, accuracy and volumetric overlap error have been employed.
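For a binary classification task, the metrics listed above all follow from the four confusion-matrix counts. The sketch below is a minimal illustration: the function name `classification_metrics`, the use of 1 as the positive label and the convention of returning 0.0 for an empty denominator are assumptions, and the labels in the example are invented.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute common binary classification metrics from two label sequences."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Assumed convention: an empty denominator yields 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # identical to sensitivity
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f_score": ratio(2 * precision * recall, precision + recall),
    }


# Ten images: 1 = contains the target microorganism, 0 = does not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = classification_metrics(y_true, y_pred)
```

On imbalanced datasets, which are common in this field, accuracy alone can be misleading, which is one reason precision, recall and specificity are usually reported alongside it.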

4.5 Development Trends

ML techniques have been used efficiently for the image recognition of only four types of microorganisms: bacteria, algae, fungi and protozoa. Initially, ML techniques such as SVM, ANN, RF and naïve Bayes were developed for automatic microorganism image recognition. These techniques achieved good classification performance on high-dimensional microorganism image data even with small datasets. To improve the performance further, ensemble techniques such as bagging and boosting were also explored. These methods reduce the variance and bias components of the classification error. The methodologies based on these techniques are semi-automatic, as the feature extraction process is still manual. To fully automate the microorganism image recognition procedure, researchers employed DL techniques such as CNN for automatic feature extraction and classification. CNN outperformed other feature extraction techniques by extracting significant features without human direction. In some publications, the authors also combined CNN with SVM and RF to design more robust classifiers. These models combine the automatic feature extraction of CNN with the better classification ability of SVM and RF. Table 7 shows the development trend of ML-based research in microorganism image recognition.

Table 7 Development Trend in ML based research in microorganism image recognition

During 1995–2005, significant research was observed only for algae and bacterial image recognition. From 2005 onwards, researchers also started implementing ML techniques for image recognition of protozoa and fungi. ML-based research on microorganism image recognition was mostly active during the years 1995–2019.

For bacterial image recognition, ML-based research has evolved from 1998 onwards. ML techniques were broadly employed for image recognition of tuberculosis bacteria, counting bacterial colonies in culture plate images and species-level classification of various bacteria such as Staphylococcus, Vibrio cholerae and food-borne pathogens. The data used by the researchers was mostly private, in the form of digital microscopic images. From 2015 onwards, researchers also applied DL techniques for bacterial image recognition. For algae images, ML-based research has been observed from 1995 onwards. The studies were mostly performed for detecting and classifying microalgae, diatoms, harmful algal blooms, freshwater algae and red tide algae. The researchers mainly used microscopic and hyperspectral imaging techniques. In some publications, the FlowCam particle analyzer was also used for generating algae image data and extracting features. DL-based research has also been observed from 2017 onwards. Research on the identification and classification of protozoa using ML techniques has been reported from 2002 onwards. The studies were mostly performed on wastewater protozoa, dinoflagellates, Eimeria, radiolarians, environmental microorganisms and Plasmodium. DL techniques in protozoa image recognition have also been observed from 2017 onwards. Less extensive but significant research has been performed on fungi image recognition from 2009 onwards using ML and DL methods.

4.6 Challenges

While reviewing various ML approaches implemented for the image recognition of microorganisms, the authors observed some difficulties faced by researchers. Due to the unique characteristics of microorganism species, different microorganisms pose different challenges to ML-based research. Microscopic images of some microorganisms, such as bacteria, contain overlapping organisms. It is important to separate them to reduce the misclassification rate of the ML model. Some authors have explored techniques such as the method of concavity, marker-controlled watershed and multi-phase active contour to separate overlapped bacilli. Priya et al. [124] presented a comparative study of these techniques for separating overlapped tuberculosis bacilli and reported the best accuracy of 93.3% using the method of concavity. However, more research needs to be done using different bacterial species. Another challenge is the morphological similarity among microorganism species such as protozoa and fungi.

Traditional ML algorithms with hand-crafted features are less efficient for such species. In [46, 51, 118, 123], the authors employed DL techniques for feature extraction as well as classification. However, neither the quantity nor the quality of the data is sufficient for developing DL models. The datasets used were also small, whereas DL models require large amounts of training data to yield optimal results. In [45], the authors also combined CNN with SVM and RF for bacterial image recognition, but the dataset lacks quality. Thus, the main challenge ML-based research is facing is the lack of quality data. There are no benchmark datasets available online, and researchers have mostly used private data. For these reasons, ML techniques have been employed for very few microorganisms.

5 Conclusion and Future Scope

Microorganisms have an important role to play in many activities of life. They affect other organisms and the surrounding environment both positively and negatively. Since they were first observed under a microscope in the seventeenth century, researchers have shown a deep interest in studying these tiny life forms and the impact they have on other species and the environment. They usually cluster in different colonies and change in shape and behavior over their life cycle. To aid experts in dealing with such problems during identification and to automate the process, ML techniques have been widely applied in many aspects of microbiology. This review aims to explore and analyze the ML-based methodologies employed by researchers for the image analysis of different microbes. The review is based on articles published in various reputed journals and conference proceedings during the period 1995–2021.

Research on microorganisms using ML started around 1995 with basic ML algorithms and statistical techniques. Over time, the models have been improved using different pre-processing techniques as well as feature extraction and selection. This review shows that ML techniques can be successfully implemented for the classification and identification of microorganisms. Researchers have explored ML techniques for the image analysis of four types of microorganisms, namely bacteria, algae, protozoa and fungi. DL-based techniques have also been proposed recently. Algorithms such as deep belief networks and CNN have been employed for feature extraction as well as classification. Still, researchers have studied very few species, because very few datasets are publicly available. DL performs efficiently for image analysis in many fields, but in the microorganism field its performance is not up to the mark due to the lack of data. The models developed so far work well on a particular dataset but suffer performance degradation on other datasets. Some researchers have combined DL techniques with other ML algorithms and metaheuristic optimization techniques. These models have performed better, but this type of research is still at an early stage in the microbiology field.

In future, researchers can explore other DL techniques such as MobileNet [126], recursive neural networks [127], auto encoders [128], generative adversarial networks [129], deep residual dense networks [130], attention-based CNN [131], long short-term memory [132] and wavelet CNN [133] in the microbiology field. DL-based models can be further optimized using methods such as batch normalization [134] and parameter compression [135] to reduce overfitting and memory cost. Till now, researchers have used very few techniques for dimensionality reduction. Techniques such as locally linear embedding [136] and t-distributed stochastic neighbor embedding [137] can be employed to deal with non-linear and high-dimensional data. The classification accuracy can also be improved by using different hybrid systems such as Neuro-fuzzy [138] and R-CNN (RNN + CNN) [139]. In addition, more research can be done in microorganism image recognition using swarm intelligence techniques such as particle swarm optimization [140], artificial bee colony [141] and the salp swarm algorithm [142] in combination with ML methods. Owing to the ability of ML techniques to deal with high-dimensional microorganism data and their continuously improving performance, ML-based research has wide scope in the microbiology field. However, it requires the cooperation and joint efforts of researchers from different fields such as informatics, medicine and biology.