1 Introduction

Additive manufacturing (AM) is a group of computer-controlled processes in which three-dimensional objects are manufactured by depositing material layer by layer [1]. AM, also known as 3D printing, is the backbone of Industry 4.0. Automotive, defence, aerospace, healthcare, and general manufacturing are some of the prominent areas where additive manufacturing is replacing conventional manufacturing [1]. The main strengths of AM are reduced manufacturing cost and time, improved rapid prototyping, geometrical independence, rapid repair, and the ability to produce complex geometries using more sophisticated designs. Moreover, AM helps in reducing the weight of the object [2]. Lightweight production of metallic objects is significant, particularly in the aerospace industry, as the reduced weight contributes to reduced fuel consumption and carbon dioxide (CO\(_{2}\)) emissions [3]. According to an estimate by Gebler et al. [4], by 2025, AM will reduce global manufacturing costs by 170-593 billion US dollars, with a 2.54-9.30 exajoules reduction in energy and a 130.5-525.5 million tonnes (Mt) reduction in CO\(_{2}\) emissions.

Laser powder bed fusion (LPBF) is a manufacturing technique in which geometrically complex objects are produced by melting pre-defined regions of a metal powder bed, which then solidify, layer after layer [5]. LPBF is the most widely recommended method for building metal objects [6]. Typical defects in LPBF include incomplete fusion of powder particles, porosity, powder contamination [7], cracks, surface deformation, irregularities in powder re-coating, and balling [8]. Among these defects, porosity is the most frequent and the most challenging to detect. Porosity compromises mechanical properties such as fatigue life [9]. Pores are particularly difficult to detect because they are too small to observe with the naked eye. Several processing parameters, such as laser power, powder morphology, layer thickness, scan strategy, scan speed, gas flow, and hatch spacing, can either directly or indirectly contribute towards the creation of porosity [8]. Incomplete fusion holes, voids, and keyholes are the main porosity types. Incomplete fusion occurs when insufficient laser power only partially melts the powder layer [10]. The partially melted layer fails to merge with the layer below, causing porosity [11]. Keyhole pores are formed by a high energy input that vaporises the powder and leaves gas bubbles in the solidified metal [12]. Voids, on the other hand, are caused by rapid cooling, which increases the residual stress of the melt pool [13].

The size and shape of the pores vary between porosity types. Keyhole pores are round or spherical and comparatively small, typically smaller than 50µm [14,15,16,17]. The gaps created by a lack of fusion are irregular, narrow, and elongated, and are usually larger than 200µm [14, 16, 17]. The largest pore size observed by Zhang et al. [18] was 340µm. Here, the pores were irregular in shape and could have resulted from a cluster of smaller pores. Numerous factors influence porosity, both directly and indirectly. Increasing the energy density caused tiny, round pores (50-110µm), whereas decreasing it below the optimum value resulted in pores as large as 250µm [19]. Du Plessis [20] stated that at higher laser power, the keyhole pores increased in number and size (from 30µm up to 400µm), whereas keyhole pore formation decreased at higher scan speeds. Choo et al. [21] studied the effect of varying the laser power from 50 to 150%; the maximum reported pore size was 340µm. Leung et al. [22] categorised porosity into two types: gas pores and pores near the oxide layers. Gas pores observed in the experiments were around 250µm in size, whereas pores near the oxide layer merged with gas pores and ranged from 50 to 500µm. Mireles et al. [23] presented an image-based closed-loop control of the electron beam melting (EBM) process. Artificial spherical pores of 600µm to 900µm were created in test cylinders and successfully detected from powder bed images. Mireles et al. [24] designed spherical, triangular, cylindrical, and cubic pores of 100µm to 2000µm in their 3D test specimens. The experiments revealed that re-melting the porous layer reduced porosity.

Real-time identification of porosity is complex and challenging. Porosity can be detected using post-build evaluation techniques. However, post-build analysis can be expensive, time-consuming, and laborious [25]. Several destructive (microscopic cross-sectional analysis) and non-destructive (Archimedes density measurement, gas pycnometry and XCT) methods are in practice to detect porosity [26]. Destructive methods like microscopic cross-sectional analysis slice the 3D object to identify defects. However, this results in extra cost, time, effort, and wasted material.

The broader acceptance of AM technologies, especially in aerospace and the medical domain, is hindered by the lack of in situ defect detection, as quality assurance is essential in these fields [27]. The main challenges to in situ monitoring are limited view of the build chamber, the poor spatial resolution of cameras, high temporal load, and the enormous amount of data collected [28]. Machine learning (ML) models are data driven and known for their efficient and effective handling of large data sets. However, applications of ML in LPBF are relatively new, and several issues are limiting the performance of ML solutions. The absence of publicly available data sets, the high cost of data capture, the installation of sensors, and data labelling are considerable challenges to solve for ML applications to be practical in in-process monitoring of the LPBF process [29]. Moreover, the scarcity of data, lack of experience in labelling data, lack of expertise in selecting good features, and the issue of over-fitting and under-fitting of the derived ML models are also hindering the applicability of ML solutions in LPBF [30].

This research aims to develop an ML model capable of identifying seeded defects in layer images from an LPBF build. Our study aims to resolve the issues hindering ML applications in AM, such as data capturing, data labelling, extracting valuable features from the data, class imbalance, and overfitting and underfitting of the ML models. A further aim is to demonstrate that effective hyper-parameter tuning could assist in producing an efficient model from scratch without the need for transfer learning. We constructed 3D metal test specimens with rich porosity defects to acquire data for experiments. We created seeded defects to simulate porosity artificially. The experiments covered a range of pore sizes to assess the ability of the LPBF to produce the pores and to identify any limits associated with the smallest detectable pore in the layer images captured during the build process. Three cylinders were designed containing seeded porosity. The porosity is inserted into the cylinders at various locations, with different shapes and dimensions. We constructed seeded defects as it is challenging to create porosity in a controlled manner, especially when building the objects with several other objects in the same printing job. It is also difficult to identify the pores on images due to their small sizes. Using seeded porosity accomplishes two main objectives:

  1. It provides us with a rich, synthetic data set able to mimic a range of pore sizes and types to allow detectable defects to be studied.

  2. It provides geometrical information to assist in labelling the image data captured during the build process.

Artificial seeded porosity helps in data labelling because the exact location of each pore in the powder bed images is known from the CAD file, which significantly reduces the need for experts at the labelling stage and avoids expensive, time-consuming destructive methods. We designed two labelling approaches for the image data set:

  1. The image data set is labelled according to the CAD design information.

  2. The same image set is labelled with the help of post-build XCT scans.

The fundamental intuition here is that the seeded CAD information helped build a valid and reliable model, and the XCT showed how we could use non-destructive testing on the 3D samples to tune the model for non-seeded applications. The non-destructive XCT scans of the test cylinders were obtained and correlated with the in-process build images. A complete analysis was performed to accurately evaluate the ML model’s ability to detect different types and sizes of pores.

Convolutional neural networks combine advanced image processing techniques with deep neural networks. We trained a deep convolutional neural network (CNN) using the in-process images. A CNN approach was selected primarily because of its superior ability to extract features from images automatically, without human intervention. Manually extracting valuable features from images is extremely difficult [30] and can often fail to capture local, spatially related information. Automatic feature extraction makes CNNs extremely powerful and computationally more efficient when compared to other standard ML models such as decision trees [31], random forests [32], and support vector machines (SVMs) [33]. In a comparison by Chouiekh and EL Haj [34], CNNs outperformed traditional ML models such as SVMs, random forests, and gradient boosting classifiers, both in terms of accuracy and training time. CNNs are scalable and capable of handling big data. Another significant advantage of CNNs over traditional ML classifiers is their ability to incorporate transfer learning, allowing complex models to be trained efficiently for a wide range of applications. Transfer learning is the process of retaining the learned weights of a trained model and reusing them to solve new, similar problems with little re-training [35, 36]. Going forward, the authors intend to use transfer learning on the current models to identify different defect types. Transfer learning will allow the learned weights of the current CNN model to be retained and transferred to new, possibly more complex, deep CNN models to identify a broader range of defects.

The remaining sections of the paper are organised as follows: Section 2 discusses the related work. Section 3 details the materials used in the experiments and the chosen model structure and explains the experimental methodology. Section 4 covers the results and discussion, followed by conclusions in Sect. 5.

2 Related work

Porosity is one of the most studied defects in AM, and its detection from images captured during printing has received considerable attention. Capturing porosity in images is challenging and requires a high-resolution camera and a well-lit build chamber. Identifying the existence of porosity in LPBF parts is complex, and different researchers have followed different experimental approaches. Mireles et al. [24] designed pores ranging from 100 to 2000µm in size and of various shapes (spherical, cubic, circular, triangular, and prismatic) in their test specimens. Hot isostatic pressing (HIPing) and re-scanning were used to reduce the porosity. CT scanning performed 60% better than infrared (IR) cameras in recording porosity on images. Moreover, the IR camera used could not detect pores smaller than 600µm. Similarly, Mireles et al. [23] created artificial seeded spherical pores of 600µm to 900µm in test cylinders. Other researchers have created 3D metal objects with porosity induced by varying the laser power, scan speed, and hatch spacing [25, 37,38,39]. ML algorithms are widely used for defect detection in LPBF with varying degrees of success. Many studies have used different ML algorithms to identify porosity from powder bed images. The most prominent of those studies are discussed here.

One of the significant works that developed an ML model using images from the build chamber was carried out by Aminzadeh and Kurfess [40]. They employed a Bayesian classifier to detect defects from images. The proposed framework achieved a precision of 89.5%. Gobert et al. [25] aimed to identify porosity types (gas pores and elongation voids) using images from the build chamber and XCT scan data, labelled by human experts. The layer-wise images were classified as nominal or flawed using a binary linear support vector machine (SVM) with an accuracy of 85%. A similar effort by Kwon et al. [37] identified porosity by using 13,200 images from the build chamber. The experiments were divided into seven groups with laser powers ranging from 50W to 350W, keeping the rest of the printing parameters constant. The experiments revealed that a laser power of less than 250W caused porosity. The trained neural network identified porosity images with a failure rate of less than 1.1%. Zhang et al. [38] trained a CNN model on images of a 3D metal object printed with titanium powder using direct laser deposition. Five specimens, each 15mm long, were printed with scan speeds of 1-4mm/s and laser powers of 150-250W. Both destructive (cross-sectional analysis) and non-destructive (XCT scan) tests were performed to identify the porosity location in the test objects. The model achieved an accuracy of 91.2% for micropores as small as 100µm [38].

In situ melt pool monitoring is another highly studied area. Many defects occur around the melt pool, such as keyhole porosity, spatter, balling, and under-melting. Various studies have focused on melt pool monitoring to investigate the causes and identification of defects. Yuan et al. [29] proposed a semi-supervised CNN for selective laser melting (SLM) process monitoring. A total of 1,200 individual tracks of 5mm were printed using 316L stainless steel powder. The data set consisted of frames of 250 \(\times\) 250 pixels extracted from 1,200 LPBF videos, out of which 700 were labelled manually. The experiments showed that the semi-supervised approach achieved an accuracy of 93.8% compared to 92.2% for the supervised approach [29]. The experiments by Li et al. [39] aimed to identify porosity with a semi-supervised ML approach, to overcome the laborious and time-consuming labelling of data and the occasional unavailability of labelled data sets for supervised learning [39]. A study based on infrared images was carried out for zinc and its alloys. Data mining, statistical data analysis, and feature extraction techniques were employed to build a real-time in situ monitoring system. Cubes of size 5 \(\times\) 5 \(\times\) 5 mm were printed using zinc on an SLM prototype machine. The experiments revealed the effectiveness of plume stability monitoring for zinc [41]. In a study by Scime and Beuth [42], in situ monitoring of LPBF employed computer vision and unsupervised ML techniques to identify keyhole porosity and balling signatures from the melt pool. The test specimens were built using an EOS M290 LPBF machine with Inconel 718 powder. The features detected in situ were related to the post-build analysis, enabling the classification of the in situ melt pool signatures as defects (keyhole porosity, balling). However, the authors concluded that defect detection based on melt pool signatures was not reliable and required further experimentation [42]. In addition to powder anomalies and melt pool monitoring, Scime et al. [43] presented comprehensive findings in a further study relating to microstructural defects such as porosity, spatter, and soot. They proposed a novel CNN model capable of real-time detection and identification of defects. The model was successfully tested on six different metal printers belonging to three different technologies: electron beam fusion, powder bed fusion, and binder jetting [43]. Instead of using optical camera images, Bartlett et al. [44] employed full-field infrared thermography to identify lack-of-fusion defects. Four cylinders, 20mm in diameter and 6mm in height, were printed using AlSi10Mg powder with a layer thickness of 50µm on an X-line 1000R SLM machine. The lack-of-fusion defects were detected with an 82% success rate. Pores larger than 0.5mm were detected successfully. However, pores smaller than 0.5mm were detected only 50% of the time [44].

Plume and spatter signatures were used to train a deep belief network (DBN) on images captured by a near-infrared camera. The test specimens were printed on a custom-designed SLM machine with an integrated infrared camera. Five experimental scenarios were designed by varying the scan speed over 50-500mm/s and the laser power over 50-150W. The proposed DBN identified five different melted states with an accuracy of 83.4%. Moreover, the proposed framework required fewer parameters and less feature extraction and signal processing [45].

An in situ, bi-stream, deep convolutional neural network, which aimed to identify insufficient layer densification, trace discontinuity, and surface deformation defects in LPBF, successfully identified the process-induced errors with an accuracy of 99.4% [46]. Scime and Beuth [47] aimed to identify six anomalies in LPBF: recoater hopping, recoater streaking, debris, superelevation, part damage, and incomplete spreading. Previously, Bag of Words (BoW) and a CNN had been used for multi-object detection from a single image. The BoW technique relies heavily on human input. The authors proposed a multi-scale CNN (MsCNN) based on transfer learning from the AlexNet CNN, which was trained on a colour image data set called ImageNet. The images were captured from 53 builds (of 3D printed objects) on an EOS M290 LPBF machine. The training data set was composed of 10,071 multi-scale patches, out of which 3,827 were defect-less, 1,896 recoater hopping, 527 recoater streaking, 666 superelevation, 1,297 disturbance, and 1,858 incomplete spreading patches, manually labelled by human experts. The proposed MsCNN outperformed the BoW and CNN and was less affected by human bias [47]. In a similar effort to identify powder anomalies by the same authors, a computer vision-based solution was proposed that could be used as a real-time control model after further improvements [48]. The image data set of 2,402 images, out of which 1,040 were fault-free images, 264 recoater hopping patches, 228 recoater streaking patches, 187 debris, 314 superelevation, 264 part failure, and 105 incomplete spreading patches, was captured from an EOS M290 LPBF machine using only the built-in camera and LED light. The proposed computer vision algorithm successfully identified failure modes and the exact location of flaws in the final product with microscopic accuracy. However, deep learning algorithms and further improvements in the accuracy of the ML models will be required before it can be used as an in situ monitoring algorithm [48].

Identifying porosity from powder bed images is crucial for real-time identification and for avoiding expensive post-processing methods. Porosity is a complicated defect to detect and avoid in LPBF. The reported studies that used powder bed images for porosity detection used non-neural network ML models with accuracy generally less than 90%. The few neural network-based studies had better results than the non-neural network solutions, but their performance could be improved significantly by using more sophisticated deep learning techniques. CNNs are extremely powerful due to their ability to extract features automatically from images. The cited studies covered a range of similar defects in LPBF, such as recoater problems, surface deformation, layer densification, and track discontinuity, that were identified with excellent accuracy using neural network-based solutions. However, the very small size of pores makes porosity difficult to detect with high precision using CNNs. This paper aims to employ sophisticated deep learning techniques to enhance CNN performance. Extensive and automated hyper-parameter tuning, a balanced data set, artificial data augmentation, and extensive evaluation of the CNN model using various criteria could drastically improve the model's learning. Apart from that, data quality could be enhanced by devising better and more accurate labelling techniques and improving the images' sharpness using affine transformations. The proposed CNN model aims to pioneer the detection of micro-scale defects in LPBF. The model's learning will provide a solid base for identifying similar defects from powder bed images.

3 Materials and methods

We constructed the specimens by utilising LPBF. A typical LPBF process is depicted in Fig. 1. The recoater system spreads a uniform dose of metal powder over the build platform. A laser (or lasers) melts specific regions on the newly spread metal powder layer. 3D objects are formed by successively melting 2D cross-sections of the whole 3D object. After each layer is melted, the powder bed is lowered by the thickness of the powder layer, and a new powder layer is spread. The oxygen level inside the chamber is kept to a minimum to avoid metal oxidation during the melting process by introducing inert gases such as argon or nitrogen.

Fig. 1 Illustration of the LPBF process [42]

A sample geometry was designed to incorporate seeded defects of different sizes, both to explore the minimum pore size that the LPBF machine can create, validated by XCT of the built parts, and to identify the smallest pore size that the layer imaging camera can capture. Spherical and cubic defect shapes were employed to see whether the shape affects the detection of the pores. The pores constructed in the test cylinders varied in size from 20µm to 2mm, as shown in Fig. 2.

Fig. 2 Computer-aided design of the three cylinders

Three cylinders, each with a height of 30mm and a diameter of 12mm, were designed to contain seeded pores within them. Two cylinders, B1 and B2, had spherical pores, whereas B3 had cubic pores. Mireles et al. [24] performed similar seeded porosity experiments. Spherical and cubic pores were selected here to study the application of deep neural networks in identifying defects from the images irrespective of the shape of the pores. The cylinders were produced using an SLM500HL machine (SLM Solutions, Germany) from A20X powder with a 20-53µm size distribution. The cylinders were built on an aluminium substrate held at 150 \(^\circ\)C throughout the build process. The process parameters used were 360W laser power, 1500mm/s scan speed, 100µm scan spacing, and a layer thickness of 30µm. The stripe scan strategy was used.

The layer imaging camera used in this study is the LayerCam system of the SLM500HL machine. The imaging system comprises two Baumer TXG20 cameras, capturing 2,600 \(\times\) 1,440-pixel images of the build area (500 \(\times\) 280 mm). The approximate pixel size of the camera system is therefore 0.2mm/pixel. The cameras are not centred on the build area, so the SLM500HL machine carries out image transformation operations before the images are saved for the user. The machine automates the capturing of layer images, and flash strips inside the machine are triggered when the images are captured. An image is captured after each powder layer is spread and after each layer is melted. Therefore, two images are captured per layer. Capturing data on every layer of the build enables the examination of the specimens' interior and exterior. The images are saved with a timestamp and layer number in the file name.
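The pixel size quoted above follows directly from the image resolution and the imaged build area. The short sketch below reproduces that arithmetic and, as an illustration, estimates how many pixels a pore of a given size would span. The function names and the list of example pore sizes are assumptions made for this sketch; only the resolution, build area, and 20µm to 2mm size range come from the text.

```python
# Sensor resolution and imaged build area as reported in the text.
IMAGE_PX = (2600, 1440)          # width, height in pixels
BUILD_AREA_MM = (500.0, 280.0)   # width, height in millimetres


def mm_per_pixel(area_mm, image_px):
    """Return the (horizontal, vertical) size of one pixel in millimetres."""
    return tuple(a / p for a, p in zip(area_mm, image_px))


def pore_extent_px(pore_size_um, pixel_mm):
    """Return how many pixels a pore of the given size (in micrometres) spans."""
    return (pore_size_um / 1000.0) / pixel_mm


pixel_x, pixel_y = mm_per_pixel(BUILD_AREA_MM, IMAGE_PX)
print(f"pixel size: {pixel_x:.3f} x {pixel_y:.3f} mm")

# Illustrative pore sizes spanning the designed 20 um to 2 mm range.
for size_um in (20, 200, 600, 2000):
    extent = pore_extent_px(size_um, pixel_x)
    print(f"{size_um:>5} um pore spans about {extent:.1f} px")
```

Under these numbers a pore needs to be roughly 200µm across before it covers a full pixel, which is one way to read the later question of the smallest pore the layer camera can capture.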

3.1 Flow-chart of the experiments

The flowchart of the experiments is shown in Fig. 3. The powder bed images and augmented images were enhanced at the image enhancement stage. The region of interest (ROI) was extracted from the enhanced image corpus. The data labelling stage used XCT images and CAD design of the test cylinders as the benchmark for labelling the images data. XCT images and CAD design were only used as reference data. That is why a dotted line shows their connection to the data labelling stage. The labelled data were passed to hyperparameter tuning, and the proposed CNN model was trained on labelled images using the best hyperparameters for the model. Finally, the model was evaluated against various criteria and produced excellent predictions on test images.

Fig. 3 Flowchart of the experiments

3.2 Data acquisition

After the build was completed, the layer images from the build chamber were processed before passing to the ML models. Building more than one object simultaneously in an LPBF machine is common to save time and money. When printing the cylinders, this was the case as they were built with various other objects. The irrelevant portions of images were discarded through standard image processing, and the regions of interest (ROI) containing the cylinders were extracted. The final images of each cylinder were 190 pixels in height and 150 pixels in width. During the printing, an error with the camera system resulted in some missing layers in the image data set. Missing images and different starting and ending points for different cylinders were handled by renaming the cropped images. The new names consisted of a porosity flag, cylinder name, layer number, and pore size. The final data set had 963 images for each cylinder.
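The renaming step can be illustrated with a small helper that encodes and decodes the four fields mentioned above. The exact naming pattern used by the authors is not given, so the underscore-separated layout, the `LayerImage` type, and the function names below are hypothetical and serve only to show how such a scheme keeps the label and geometry information attached to each cropped image.

```python
from typing import NamedTuple


class LayerImage(NamedTuple):
    porosity: bool      # True if the layer is labelled as containing a pore
    cylinder: str       # cylinder name, e.g. "B1", "B2" or "B3"
    layer: int          # layer number within the build
    pore_size_um: int   # nominal pore size in micrometres, 0 when no pore


def build_name(item: LayerImage, ext: str = "png") -> str:
    """Encode the four fields into a file name (hypothetical layout)."""
    flag = 1 if item.porosity else 0
    return f"{flag}_{item.cylinder}_{item.layer:04d}_{item.pore_size_um}.{ext}"


def parse_name(name: str) -> LayerImage:
    """Recover the four fields from a name produced by build_name."""
    stem = name.rsplit(".", 1)[0]
    flag, cylinder, layer, size = stem.split("_")
    return LayerImage(flag == "1", cylinder, int(layer), int(size))


example = LayerImage(porosity=True, cylinder="B2", layer=412, pore_size_um=600)
name = build_name(example)
print(name)
print(parse_name(name))
```

Because the layer number is stored in the name rather than inferred from file order, missing layers do not shift the labels of the remaining images.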

3.3 Data labelling

A supervised machine learning model requires labelled data for training. Labelling the images is a big challenge in AM and requires the input of domain experts. Data labelling is a laborious, time-consuming, and human-intensive task. We designed the 3D objects with seeded pores to overcome this hurdle. Two approaches were formulated to label the images. For both approaches, experts in additive manufacturing supervised the labelling process.

Approach 1: CAD-assisted labelling. The images from the printer were labelled based on the CAD file design. The 30mm cylinders, built with a uniform layer thickness of 30µm, resulted in 1,000 images. Since the pores were designed in the CAD file, the location of the pores was known. A visual representation of the CAD designs, along with the pore sizes and shapes, is shown in Fig. 2.
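A minimal sketch of the CAD-assisted idea is given below: with a 30µm layer thickness, a 30mm cylinder has 1,000 layers, and a pore of known height and size overlaps a computable range of them. The pore positions in `example_pores` are hypothetical, as are the function names; the real pore layout is defined by the CAD design in Fig. 2.

```python
import math

LAYER_THICKNESS_UM = 30       # layer thickness from the process parameters
CYLINDER_HEIGHT_UM = 30_000   # 30 mm cylinder height


def layers_for_pore(centre_z_um, size_um, layer_um=LAYER_THICKNESS_UM):
    """Return the 1-based layer numbers whose slab overlaps the pore.

    Layer k is taken to cover heights from (k - 1) * layer_um to k * layer_um.
    """
    z_min = centre_z_um - size_um / 2
    z_max = centre_z_um + size_um / 2
    first = math.floor(z_min / layer_um) + 1
    last = math.ceil(z_max / layer_um)
    return range(first, last + 1)


def label_layers(pores, n_layers):
    """Return one 0/1 label per layer (index 0 corresponds to layer 1)."""
    labels = [0] * n_layers
    for centre_z_um, size_um in pores:
        for k in layers_for_pore(centre_z_um, size_um):
            if 1 <= k <= n_layers:
                labels[k - 1] = 1
    return labels


n_layers = CYLINDER_HEIGHT_UM // LAYER_THICKNESS_UM
# Hypothetical pores: (centre height in micrometres, size in micrometres).
example_pores = [(3_000, 600), (15_000, 2_000)]
labels = label_layers(example_pores, n_layers)
print(n_layers, "layers,", sum(labels), "labelled as porosity")
```

A label derived this way states where a pore was designed, not whether it actually formed, which is the gap the XCT-assisted approach is meant to close.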

Approach 2: XCT-assisted labelling. We employed post-build XCT scans to assist with the image labelling. The XCT images are superior in terms of quality and a reliable source to see whether the pores formed or not inside the final 3D object. Some of the layer images, along with their corresponding XCT images, are shown in Fig. 4.

Fig. 4 Sample powder bed images and their corresponding XCT images. The images on the left are from the powder bed, and those on the right are the corresponding images from the XCT analysis

3.4 Data augmentation

The data set consisted of 2889 images. However, the data were highly imbalanced. There were 2386 non-porosity images and only 503 images that contained porosity when using the CAD-assisted labelling approach. The class imbalance was even greater in the XCT-assisted labelling approach, resulting in 2578 non-porosity and only 311 porosity images. Class imbalance is often a significant hurdle in training an unbiased machine learning model, particularly when the minority class is the one we wish to predict accurately. To address this problem, we employed data augmentation methods to over-sample the minority class, i.e., the porosity images. The data augmentation parameters used in the experiments are shown in Table 1 along with their values. Data augmentation is an effective way to combat class imbalance. The vertical flip turns the image upside down, and the horizontal flip mirrors it left to right by reversing the order of the pixels in each row. The width and height shifts move the image pixels in one direction while keeping the image size constant. The empty spaces created during the shifting were filled by copying the nearest pixels. The data produced by augmentation are not artificial per se; they are created by slightly altering the current data pool in a controlled and conscious manner. Kang et al. [49] used data augmentation to improve fire detection using a tiny YOLO algorithm. Similarly, many deep learning medical applications require data augmentation due to the scarcity and expensive nature of medical data [50]. The new, balanced data set consisted of 5135 total images, of which 2578 were non-porosity and 2557 were porosity.

Table 1 Data Augmented parameters to over-sample the porosity images
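The four augmentation operations described in this section can be sketched on a small grey-scale image stored as a list of rows. This is an illustrative pure-Python version, not the implementation used in the experiments; the function names and the shift amounts are assumptions, and the actual parameter values are those listed in Table 1.

```python
def vertical_flip(img):
    """Turn the image upside down (reverse the order of the rows)."""
    return [row[:] for row in reversed(img)]


def horizontal_flip(img):
    """Mirror the image left to right (reverse the pixels within each row)."""
    return [row[::-1] for row in img]


def width_shift(img, dx):
    """Shift pixels dx columns to the right (left if dx < 0).

    Vacated positions are filled with the nearest original pixel.
    """
    w = len(img[0])
    return [[row[min(max(x - dx, 0), w - 1)] for x in range(w)] for row in img]


def height_shift(img, dy):
    """Shift pixels dy rows down (up if dy < 0), filling with the nearest row."""
    h = len(img)
    return [img[min(max(y - dy, 0), h - 1)][:] for y in range(h)]


img = [[1, 2, 3],
       [4, 5, 6]]

print(vertical_flip(img))     # rows swapped
print(horizontal_flip(img))   # each row reversed
print(width_shift(img, 1))    # first column repeated
print(height_shift(img, 1))   # first row repeated
```

Each operation keeps the image size constant, so augmented images can be fed to the network without further resizing.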

3.5 Data pre-processing

The images captured from the SLM500HL contained considerable noise, so image thresholding was applied to improve their quality. The grey-scale pixel histograms of the images show two peaks. Therefore, Otsu's binarisation [51], which is designed for such bimodal histograms, was applied to the images. Figure 5 shows a sample of porosity and non-porosity images after the Otsu thresholding. The images were re-scaled, and their pixel values were normalised before feeding them into the CNN. The data set was split into 70% training and 30% testing using a stratified split to ensure the same class distribution in both train and test sets. The training data set had 1804 non-porosity and 1790 porosity images, and the test data set had 774 non-porosity and 767 porosity images.

Fig. 5 Sample images showing porosity and non-porosity in a layer image. The left image is the layer image, and the right image is the same image binarised using Otsu thresholding [51]
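For readers unfamiliar with Otsu's method, the sketch below shows the idea in plain Python: the threshold is chosen to maximise the between-class variance of the grey-level histogram, which works well when the histogram has two peaks. This is a generic illustration with hypothetical function names and a synthetic pixel list, not the code used in the study, where a library routine would normally be preferred.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance.

    pixels: iterable of integer grey levels in the range [0, levels).
    Pixels <= t form one class and pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_low, sum_low = 0, 0
    for t in range(levels):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue            # no pixels in the lower class yet
        w_high = total - w_low
        if w_high == 0:
            break               # all pixels are already in the lower class
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        between_var = w_low * w_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarise(img, t):
    """Map pixels above the threshold to 255 and all others to 0."""
    return [[255 if p > t else 0 for p in row] for row in img]


# Synthetic bimodal example: a dark cluster and a bright cluster.
pixels = [10] * 50 + [12] * 50 + [200] * 50 + [210] * 50
t = otsu_threshold(pixels)
print("threshold:", t)
print(binarise([[10, 200], [210, 12]], t))
```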

3.6 Hyper-parameter tuning

A convolutional neural network was selected to distinguish between porosity and non-porosity images. CNNs are known for their ability to extract features from images and are widely used in medical [52,53,54] and commercial applications [55, 56]. However, selecting a CNN architecture and training it from scratch requires significant computational resources and time. Moreover, selecting the correct values for the CNN's hyper-parameters is crucial for its learning and predictive ability.

The goal is to design an effective CNN architecture with good predictive capability and a reasonable training time. We experimented with six different CNN architectures to find the best architecture, starting from a simple one convolutional layer, one max-pooling layer, one dense layer to three convolutional layers, three max-pooling layers, and two dense layers. The architecture of the six models is shown in Table 2.

Table 2 CNN Model’s Architectures

The hyper-parameters associated with each layer of the CNN were tuned over a range of values; the hyper-parameters and their ranges are shown in Table 4. We employed the ADAM optimiser for our experiments. ADAM is a well-known optimisation algorithm that is both computationally and memory efficient and is known to perform well on big data problems [57]. Finding the best hyper-parameter values for a CNN model requires extensive computational and memory resources. Commonly used hyper-parameter optimisers, such as random search and Bayesian optimisation, are slow and require significant amounts of memory. Hyperband uses adaptive resource allocation and early stopping to speed up hyper-parameter optimisation and is reported to be 5 to 30 times faster than Bayesian optimisation methods [58]. It trains several models for a few iterations, discards the lower-performing half, and repeats the process until only the best model is left. We therefore used the Hyperband optimisation algorithm to find the best values of the hyper-parameters.

The data set was split into training, validation, and test sets. First, the data set was split into 70% training and 30% testing. The training portion was then split again into 70% training and 30% validation, giving effective proportions of 49% training, 21% validation, and 30% testing (Table 3). The training set was used to train the candidate models, and the validation set was used to evaluate them. The test set was kept separate and used only in the final testing of the model. We used stratified splitting, which ensures the same proportion of classes in the training, validation, and test sets.

The best-performing hyper-parameter values for each of the six CNN models are shown in Table 5, along with the accuracy and loss of the respective model. All six architectures achieved high accuracy and low loss. We selected the architecture of model 3 and its hyper-parameters for our experiments. Model 3 achieved the best accuracy and loss on the test data and had comparatively fewer trainable parameters, making it a more efficient model.
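The halving idea behind Hyperband mentioned above can be illustrated with a minimal sketch. This is a simplified successive-halving loop and not the tuner used in the experiments: the candidate configurations, the `toy_evaluate` function, and the doubling of the training budget are hypothetical placeholders, and full Hyperband additionally runs several such rounds with different starting budgets.

```python
def successive_halving(configs, evaluate, min_epochs=1):
    """Keep the better half of the candidates each round until one remains.

    configs  : list of candidate hyper-parameter settings (hypothetical)
    evaluate : callable(config, epochs) -> validation loss (lower is better)
    """
    survivors = list(configs)
    epochs = min_epochs
    while len(survivors) > 1:
        # Score every surviving candidate with the current training budget.
        scored = sorted(survivors, key=lambda cfg: evaluate(cfg, epochs))
        # Discard the worse half; always keep at least one candidate.
        survivors = scored[:max(1, len(scored) // 2)]
        # Give the remaining candidates a larger budget in the next round.
        epochs *= 2
    return survivors[0]


def toy_evaluate(config, epochs):
    # Stand-in for "train for `epochs` epochs and return the validation loss".
    # Assumed toy behaviour: the loss is lowest at a learning rate of 1e-4.
    return abs(config["learning_rate"] - 1e-4) + 1.0 / (epochs + 1)


candidates = [{"learning_rate": lr} for lr in (1e-2, 1e-3, 1e-4, 1e-5)]
best = successive_halving(candidates, toy_evaluate)
print(best)
```

In a real tuning run, `evaluate` would train the CNN for the given number of epochs and return its validation loss, so poor configurations are abandoned after only a small training budget.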

Table 3 Train–Test Split of the dataset
Table 4 Hyper-parameters and their range of values for fine-tuning various models
Table 5 CNN Models and their accuracy on the best hyper-parameter values
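The two-stage stratified split described in this section can be sketched in plain Python as follows. This is an illustrative sketch only: the label counts (880 non-porosity and 120 porosity images) are hypothetical, and `stratified_split` is a helper written for this example rather than the function used in the experiments.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx) with similar class proportions in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 0 = non-porosity, 1 = porosity.
labels = [0] * 880 + [1] * 120

# Step 1: 70% training / 30% testing.
train_val_idx, test_idx = stratified_split(labels, test_fraction=0.30)

# Step 2: split the training part again into 70% training / 30% validation.
train_val_labels = [labels[i] for i in train_val_idx]
tr_pos, val_pos = stratified_split(train_val_labels, test_fraction=0.30, seed=1)
train_idx = [train_val_idx[i] for i in tr_pos]
val_idx = [train_val_idx[i] for i in val_pos]

print(len(train_idx), len(val_idx), len(test_idx))
```

Applying 70/30 twice leaves roughly 49% of the images for training, 21% for validation, and 30% for testing, with the porosity share kept approximately equal in all three subsets.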

3.7 Convolutional Neural Network model

The CNN is a robust classifier that can transfer its learning from one problem to a similar one [59, 60]. We did not use a pre-trained CNN model via transfer learning in this instance because of the uniqueness of the problem at hand. Well-known pre-trained CNN models, such as VGG, ResNet, EfficientNet, and Inception, are primarily trained on the ImageNet data set of natural images. None of them has been trained on powder bed images, which differ from the ImageNet images. To demonstrate the ineffectiveness of pre-trained CNN models for this task, we downloaded VGG16 and trained it on powder bed images for porosity identification. Although it was trained on a balanced data set, the model failed to identify a single porosity image and predicted all images as non-porosity, giving an accuracy of 52%.

The novel CNN constructed for the experiments is shown in Fig. 6. The hyper-parameter values were selected based on the tuning results described in the previous section. The final model consists of two convolutional layers, each followed by a max-pooling layer. Both convolutional layers had 96 filters, used ‘same’ padding, and had a kernel size of 3. The first convolutional layer used the ‘tanh’ activation function, and the second used the ‘relu’ activation function. Each max-pooling layer used a sliding window (stride) of size 4. The flatten layer was followed by two dense layers with zero dropout. The first dense layer had 448 units and used the ‘tanh’ activation function. The final dense layer, which used the soft-max activation function, had two neurons corresponding to the possible outcomes: “porosity” or “non-porosity”. The ADAM optimiser was used with a learning rate of 0.0001, and the model was trained using the categorical cross-entropy loss function.
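To make the layer description concrete, the following sketch traces the tensor shape through the network and counts the trainable parameters. It is a rough illustration under stated assumptions: the 128 x 128 single-channel input size is hypothetical (the real image size may differ), and the pooling window and stride are both assumed to be 4.

```python
def conv_same(shape, filters, kernel=3):
    """Output shape and parameter count of a 'same'-padded convolution."""
    height, width, channels = shape
    params = (kernel * kernel * channels + 1) * filters  # weights + biases
    return (height, width, filters), params


def max_pool(shape, pool=4):
    """Output shape of non-overlapping max pooling (window = stride = pool)."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)


def dense(units_in, units_out):
    """Parameter count of a fully connected layer (weights + biases)."""
    return (units_in + 1) * units_out


# ASSUMPTION: 128 x 128 greyscale input; not taken from the study.
shape = (128, 128, 1)
total = 0

shape, params = conv_same(shape, filters=96)   # convolution 1 (tanh)
total += params
shape = max_pool(shape)
shape, params = conv_same(shape, filters=96)   # convolution 2 (relu)
total += params
shape = max_pool(shape)

flat = shape[0] * shape[1] * shape[2]          # flatten layer
total += dense(flat, 448)                      # first dense layer (tanh)
total += dense(448, 2)                         # soft-max output layer

print(shape, flat, total)
```

Under these assumptions almost all parameters sit in the first dense layer, which is why the pooling size has a strong effect on the overall model size.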

Fig. 6 Convolutional Neural Network Model

Python 3.9.0 was used along with TensorFlow 2.5.0, OpenCV 4.5.3, and Keras Tuner 1.0.3. The experiments were carried out on an Intel Core i7 machine with 8GB RAM and a 4GB NVIDIA GeForce GTX 1050 graphics processing unit (GPU). Hyper-parameter tuning required the most resources: it took almost 2 hours and 45 minutes and created almost 10GB of checkpoints. In contrast, the final model required only 10 seconds per epoch for training.

4 Results and discussion

A comparison of the results obtained with the two labelling approaches, CAD-assisted and XCT-assisted labelling, is shown in Figs. 7 and 8. The model trained on images labelled using the CAD design achieved an accuracy of 90%. In contrast, the model trained on XCT-labelled images achieved 97% accuracy in distinguishing porosity images from non-porosity images.

The experiments revealed that the spherical and cubical pores appeared similar on the powder bed images. We designed the pores in different forms to obtain a variety of pore shapes. However, only gas porosity appeared circular during printing, whereas lack of fusion and voids appeared irregular, elongated, and non-uniform. In our experiments, both the spherical and the cubical pores appeared irregular and distorted.

Fig. 7 Comparison of model evaluation metrics for both labelling approaches on non-porosity class

Fig. 8 Comparison of model evaluation metrics for both labelling approaches on porosity class

Class imbalance is one of the most common challenges in data-driven solutions. It occurs when one class has far more occurrences than the other classes in the data set. In that case, accuracy can be a misleading evaluation metric because it is dominated by the majority class, and a model trained on imbalanced data tends to be biased towards that class. In our experiments, more than 88% of the images were without porosity in the XCT-assisted labelling approach, whereas 83% of the images belonged to the non-porosity class in the CAD-assisted labelling approach.
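A small worked example shows why accuracy alone can mislead on such data. The counts below are illustrative only (1000 images with an 88% non-porosity share, similar to the XCT-labelled proportion), and the "classifier" is a deliberately trivial one that labels every image as non-porosity.

```python
# Illustrative counts only: 1000 images, 88% of them non-porosity.
n_non_porosity, n_porosity = 880, 120

# A trivial "classifier" that labels every image as non-porosity.
true_negatives = n_non_porosity
false_negatives = n_porosity
true_positives = 0

accuracy = (true_positives + true_negatives) / (n_non_porosity + n_porosity)
porosity_recall = true_positives / (true_positives + false_negatives)

print(f"accuracy = {accuracy:.2f}, porosity recall = {porosity_recall:.2f}")
```

The accuracy looks respectable even though not a single porosity image is found, which is why class-wise metrics are reported alongside accuracy.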

The CNN model was trained on both balanced and imbalanced data sets to study the impact of class balance on the model’s training. We used data augmentation methods to combat the class imbalance problem: the XCT-labelled images were balanced by oversampling the minority class. The balanced XCT-labelled data set had 2578 non-porosity (50.20%) and 2557 porosity (49.80%) images. The model’s accuracy on the balanced XCT-labelled images was 97%. However, it is always desirable to evaluate a model with several criteria instead of relying solely on accuracy. In addition to precision and recall, we used the F1-measure, the harmonic mean of precision and recall, which accounts for class imbalance better than accuracy alone. Further insight can be obtained from the confusion matrix and the additional metrics derived from it. The precision of a binary classifier is given by:

$$\begin{aligned} Precision = \frac{TruePositive}{TruePositive + FalsePositive} \end{aligned}$$

Precision is the proportion of true positives among all predicted positive cases. In other words, it specifies how many of the predicted positive cases are actually positive. Precision can be an excellent metric for a manufacturer when controlling false positives is critical. Recall, on the other hand, better reflects the model’s performance when false negatives have a high impact. Recall is calculated as:

$$\begin{aligned} Recall = \frac{TruePositive}{TruePositive + FalseNegative} \end{aligned}$$

Recall measures how many of the actual positive cases are correctly identified. In binary classification with significant class imbalance, a critical task is often to weigh the false-positive rate against the false-negative rate. For instance, a false negative is more dangerous than a false positive in cancer diagnostics. Similarly, in our case, a false negative is more critical: if an image with porosity is identified as non-porosity, the build continues and the finished object contains porosity. This is why recall is a better evaluation criterion for the model’s performance. For the non-porosity class, the recall of the model trained on in-process CAD-labelled images was 97%, compared with 99% for imbalanced XCT-labelled images and 97% for balanced XCT-labelled images. However, there is a significant difference in the recall of the two approaches for images with porosity. The recall on CAD-labelled images was 54%, while the recall on imbalanced XCT-labelled images was 85%. The recall improved significantly, from 85% to 97%, when the model was trained on balanced XCT-labelled images. The model’s precision, recall, and F1-score on balanced XCT-labelled images were 97%, 97%, and 97%, respectively.

In comparison, the model’s precision, recall, and F1-score on imbalanced XCT-labelled images were 89%, 85%, and 87%, respectively. The balanced data set resulted in better, more generalised, and less biased training, as the model achieved higher precision, recall, and F1-score. For the majority class, non-porosity images, the model performed well in every case, and there was very little difference in precision, recall, and F1-score, as shown in Fig. 7.
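The metrics discussed above can all be derived from the four confusion-matrix counts. The sketch below shows one way to compute them; the counts passed in are hypothetical and are not the confusion matrix of this study, and `binary_metrics` is a helper written for this example.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy


# Hypothetical counts, with porosity treated as the positive class.
precision, recall, f1, accuracy = binary_metrics(tp=85, fp=10, fn=15, tn=890)
print(f"precision={precision:.3f} recall={recall:.3f} "
      f"f1={f1:.3f} accuracy={accuracy:.3f}")
```

Note how the accuracy in this example stays high because of the many true negatives, while recall exposes the missed porosity images.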

The poor recall for CAD-labelled images was due to the high false-negative rate, as can be seen in Fig. 9. The false-negative rate of the CAD-assisted labelling approach was 45.65%, meaning that 45.65% of all images with porosity were wrongly classified as non-porosity images by the model. This is a very high number of misclassifications for the crucial class. In contrast, the false-negative rate was only 14.72% for imbalanced XCT-labelled images and 3.26% for balanced XCT-labelled images. Only a small fraction of porosity images was therefore misclassified by our model when trained on balanced XCT-labelled images. The better performance of XCT-assisted labelling is mainly due to the more accurate labelling of the image set, whereas the CAD-assisted labelling had many non-porosity images wrongly labelled as porosity images. The false-positive rates of the model were low: 2.6% for the CAD-labelling approach, 1.36% for imbalanced XCT-labelled images, and 2.84% for balanced XCT-labelled images. Both the CAD and XCT labelling approaches therefore worked well in predicting the non-porosity images, and only a tiny percentage of non-porosity images were wrongly predicted as porosity images. A significant difference between the false-negative rate (14.72%) and the false-positive rate (1.36%) was observed for imbalanced XCT-labelled data. For balanced XCT-labelled images, however, the difference between the false-positive (2.84%) and false-negative (3.26%) rates was insignificant, indicating better, less biased model training.
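The false-negative rates quoted above are directly tied to the porosity recall values reported earlier, because recall and the false-negative rate share the same denominator and sum to 100%. The short check below uses only the percentages quoted in the text; the dictionary names are illustrative.

```python
# False-negative rates (%) for the porosity class, as quoted in the text.
false_negative_rate = {
    "CAD-labelled": 45.65,
    "XCT-labelled, imbalanced": 14.72,
    "XCT-labelled, balanced": 3.26,
}

# Recall = TP / (TP + FN) and FNR = FN / (TP + FN), so recall = 100% - FNR.
implied_recall = {name: round(100.0 - fnr, 2)
                  for name, fnr in false_negative_rate.items()}

for name, recall in implied_recall.items():
    print(f"{name}: recall = {recall}%")
```

Rounded to whole percentages, the implied values agree with the reported recalls of 54%, 85%, and 97%.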

Fig. 9 Comparison of false-positive and false-negative results from CAD-assisted labelling and XCT-assisted labelling

As stated previously, we chose spherical and cubical shapes for the seeded pores in our test cylinders. The experiments revealed that the spherical and cubical pores have a similar appearance in the layer images. The model achieved an accuracy of 97% on the image data set labelled with the help of XCT. The model’s accuracy in predicting pores of different sizes is shown in Fig. 10. The model attained an accuracy of 96.09% on the largest pores (2mm).

Fig. 10 Model accuracy for different pore sizes

Similarly, the model predicted the 1-mm-, 0.8-mm-, and 0.5-mm-sized pores with more than 80% accuracy. Using the current printing setup, the smallest pore visible on in-process images was 0.2mm, and the model was 66.67% accurate in predicting the 0.2-mm-sized pores. The model’s ability to correctly identify pores decreased as the pore size decreased. Apart from accuracy, precision, recall, and F1-score, the model’s loss curves were also examined to ensure good training and generalisation. The loss curve of the final model trained on balanced XCT-labelled images is shown in Fig. 11. The model over-fits, as the training and test loss curves diverge after the initial training epochs. We therefore employed \(l_2\) kernel regularisation with a regularisation factor of 0.01 and 50% dropout to reduce over-fitting. The final model’s loss curve with regularisation is shown in Fig. 12.
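For reference, \(l_2\) kernel regularisation adds a penalty on the squared kernel weights to the training loss. Assuming the standard formulation (the symbols below are illustrative notation, and which layers were regularised is not specified here), the regularised objective can be written as:

$$\begin{aligned} \mathcal{L}_{total} = \mathcal{L}_{CE} + \lambda \sum_{k} \sum_{i} w_{k,i}^{2}, \qquad \lambda = 0.01 \end{aligned}$$

Here \(\mathcal{L}_{CE}\) is the categorical cross-entropy loss and \(w_{k,i}\) is the \(i\)-th kernel weight of regularised layer \(k\). Dropout with a rate of 50% additionally sets a randomly chosen half of a layer’s activations to zero at each training step and is switched off at inference.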

Fig. 11 Loss curve of final model with no regularisation

Fig. 12 Loss curve of final model with \(l_2\) regularisation and dropout

4.1 Critical evaluation of overall results

Porosity is a challenging defect in AM. It has been reported that certain porosity types are not detectable with image processing [43]. Because the pores are so small, the layer imaging systems of many LPBF printers fail to capture them. In our experiments, the failure to capture pores smaller than 0.2mm was due to the camera system, whose resolution is 0.2mm/pixel. A pixel is the building block of an image, so the smallest detail that can be captured corresponds to the camera system’s pixel size. That is why pores smaller than 0.2mm were not captured on powder bed images. XCT analysis is a post-build technique known for its high-resolution images. The images acquired by XCT analysis of 3D metal objects are much clearer, cleaner, and of higher definition. XCT scanning is therefore better at identifying small pores than in-process imaging. Mireles et al. [24] stated that XCT scanning could find approximately 60% of smaller-sized pores when compared to an infrared imaging camera.
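The resolution argument can be made concrete by converting pore diameters into pixels. The sketch below uses the 0.2mm/pixel resolution and the designed pore sizes mentioned in this study; the rule that a pore must span at least one full pixel to be resolved is a simplifying assumption made for illustration.

```python
PIXEL_SIZE_MM = 0.2  # camera resolution reported in the text (mm per pixel)

# Designed pore sizes mentioned in the study (mm).
pore_sizes_mm = [2.0, 1.0, 0.8, 0.5, 0.2, 0.1, 0.05, 0.02]


def pixels_across(diameter_mm, pixel_size_mm=PIXEL_SIZE_MM):
    """Approximate number of pixels spanned by a pore of the given diameter."""
    return diameter_mm / pixel_size_mm


for size in pore_sizes_mm:
    n_pixels = pixels_across(size)
    # Simplifying assumption: at least one full pixel is needed to resolve a pore.
    status = "resolvable" if n_pixels >= 1 else "below one pixel"
    print(f"{size:>5} mm -> {n_pixels:5.2f} px ({status})")
```

A 0.2mm pore covers only about one pixel, which is consistent with it being the smallest pore visible on the in-process images and the hardest one to classify.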

The accuracy of the CAD labelling approach was lower than that of the XCT labelling approach because of incorrectly labelled images. The images were labelled according to the CAD design, but some pores designed in the CAD file were not created in the final test specimen. As a result, some images were labelled as porosity images when they should have been labelled as non-porosity images, and this mislabelling impaired the model’s learning. Some of the correctly predicted images are shown in Fig. 13. The images numbered 1, 2, 6, and 7 in Fig. 13 had porosity and were correctly predicted by the model. The remaining images (3, 4, 5, and 8) were correctly identified as non-porosity images. However, the model failed to classify some images correctly; examples are shown in Fig. 14. The images numbered 2, 3, 4, and 8 in Fig. 14 showed no porosity but were wrongly labelled as porosity images, and the model classified them as non-porosity images. The images numbered 1, 5, 6, and 7 were correctly labelled but wrongly classified: images 1 and 5 were porosity images classified as non-porosity images, whereas images 6 and 7 were non-porosity images classified as porosity images. The high number of wrongly predicted images resulted in poor performance, with a recall of only 54% despite 90% accuracy.

Fig. 13 Sample images of correct predictions of the CAD-labelled images

Fig. 14 Sample images of incorrect predictions of the CAD-labelled image model

The poor predictive performance of the model trained on CAD-labelled images was due to the mislabelling of images. Because of its very small size, porosity is difficult to observe with the naked eye on powder bed images. We therefore used the CAD design information to label the powder bed images. However, because of the limited resolution of the camera system (0.2mm/pixel), pores smaller than 0.2mm could not be captured on powder bed images. That is why the smallest pore our model could identify was 0.2mm in diameter. The pores of size 0.1mm, 0.05mm, and 0.02mm were not formed in the test cylinder and did not appear on the in-process images. Nevertheless, under the CAD-assisted labelling approach, the corresponding images were wrongly labelled as porosity images.

For the second approach, we employed XCT analysis to verify whether the pores had actually been created in the test cylinders. XCT images are much clearer and cleaner, and the pores captured on them can be observed with the naked eye. The XCT analysis was used to check whether pores smaller than 0.2mm were formed in the parts. No such pores could be seen in the XCT data; therefore, the LPBF machine may not be capable of forming features of this size. XCT-assisted labelling labels a powder bed image as a porosity image only if its corresponding XCT image shows porosity. This is a more reliable, but more expensive, way of labelling the powder bed images. The model’s excellent performance on XCT-labelled images is mainly due to the correct and realistic labelling of the powder bed images.
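The difference between the two labelling rules can be summarised in a few lines of code. This is a hypothetical sketch: the `Layer` record, its field names, and the three example layers are invented for illustration and do not reflect the data format used in the experiments.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One powder bed image plus the information used to label it (hypothetical)."""
    image_id: str
    designed_pore_mm: float   # pore size in the CAD design, 0.0 if none
    pore_seen_in_xct: bool    # whether the matching XCT image shows a pore


def cad_label(layer):
    # CAD-assisted: porosity whenever the design contains a pore in this layer.
    return "porosity" if layer.designed_pore_mm > 0 else "non-porosity"


def xct_label(layer):
    # XCT-assisted: porosity only if the pore is actually present in the part.
    return "porosity" if layer.pore_seen_in_xct else "non-porosity"


layers = [
    Layer("layer_a", designed_pore_mm=0.5, pore_seen_in_xct=True),
    Layer("layer_b", designed_pore_mm=0.05, pore_seen_in_xct=False),  # designed, never formed
    Layer("layer_c", designed_pore_mm=0.0, pore_seen_in_xct=False),
]

mislabelled = [layer.image_id for layer in layers
               if cad_label(layer) != xct_label(layer)]
print(mislabelled)
```

Layers whose designed pore never formed are exactly the ones on which the two rules disagree, and these are the images that the CAD-assisted approach labels incorrectly.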

The pores in this study were designed into the CAD file, so the LPBF machine processes them differently from how pores typically occur. The contour of each pore was scanned, which does not usually happen when porosity (lack of fusion or keyhole) forms. Because the pore boundary was scanned, the pore had a clear outer edge and was distinctly different from the melted material around it. Moreover, the layers at the start and end of each spherical pore were not visible in the images because of the changing cross-sectional area of the pore. Under the XCT-assisted labelling approach, in-process images were categorised as porosity images only if their corresponding XCT images showed porosity. This resulted in a better-labelled image data set and, hence, a better-performing model. Some of the images correctly predicted with the XCT-assisted approach are shown in Fig. 15. It can be observed that the model successfully distinguished between porosity and non-porosity images. The images numbered 1, 2, 5, and 6 in Fig. 15 had porosity and were successfully identified by the model, while images 3, 4, 7, and 8 had no porosity and were correctly classified as non-porosity images. On XCT-labelled images, the model achieved 97% accuracy with 85% recall on imbalanced data, and 97% accuracy with 97% recall on balanced data.

Fig. 15 Sample images of correct prediction for the XCT-labelled image model

The gas flow inside the build chamber sometimes blows spatter onto the recently melted section of the powder layer. Because of the image enhancement techniques employed in our experiments, this blown, unfused powder may look like porosity on the image although it is not. Fortunately, fewer than ten images in total had this problem. The images numbered 1, 4, 7, and 8 in Fig. 16 were predicted as porosity images by the model but in reality belonged to the non-porosity class. This is a rare phenomenon, and the few misclassified images in our experiments using the XCT-assisted labelling approach were mostly caused by it.

Fig. 16 Sample images of incorrect prediction of the XCT-labelled image model

The experiments revealed that the labelling of powder bed images from the printer can be improved significantly by using XCT images as a benchmark. This labelling technique is more reliable, realistic, and effective, and it yields better results. CAD-assisted labelling is simple but impractical and is generally only suitable for controlled proof-of-concept studies. We have successfully trained a deep CNN on powder bed images from an LPBF machine. The CNN model can identify pores as small as 0.2mm on in-process images. In a similar study by Mireles et al. [24], the smallest pore size identifiable by in-process imaging was 0.6mm. However, Zhang et al. [38] trained a CNN model capable of identifying pores as small as 0.1mm with an accuracy of 91.2%. The model’s performance depends on several factors, of which image quality and pore size are the most crucial. We fabricated 3D metal cylinders with pore sizes ranging from 0.2mm to 2mm, captured the creation of the pores on images, and trained a deep neural network that distinguishes porosity from non-porosity images with an accuracy of 97%.

4.2 Next steps

In this study, synthetic seeding was used to form artificial defects directly in order to demonstrate how ML can be applied to in-process imaging in LPBF. The size and shape of the seeded pores are very close to those of natural porosity in LPBF. The limitation of this study is the spatial resolution of the camera setup, which has a pixel size of 0.2mm. A higher-resolution camera or a reduced distance between the camera and the build area could allow smaller defects to be detected. Despite the hardware limitation of the current experiments, the study has established the efficacy of CNNs in identifying porosity from in-process LPBF images. It has also emphasised the significance of correctly labelling the LPBF images and of fine-tuning the model to enhance its performance. The proposed CNN model is scalable and accurate. Given higher-quality in-process images, the model should be able to detect pores smaller than 0.2mm.

Moreover, the current model could be used as a base model for future extensions of LPBF defect detection. We believe that it would provide a better base network for transfer learning, removing the need to train a model from scratch, augment data, and spend heavy computation on hyper-parameter tuning. Retaining and reusing the model’s learning will likely provide a better starting point and significantly reduce the training time and computational resources needed. In further experiments, process parameters such as laser power, scan speed, and hatch spacing could be varied to simulate natural porosity in LPBF builds more closely. In addition, future experiments will study the transferability of the current model’s learning and its ability to identify other porosity types, such as lack of fusion and voids, and other LPBF micro-defects, such as balling and surface deformation.

Identifying porosity from powder bed images would enable a closed-loop, real-time monitoring system. Advanced printers, such as the EOS M290 by EOS, can change printing parameters in real time. The proposed ML solution would enable practitioners to identify defects and adjust printing parameters such as laser power and scan speed to avoid and remove defects. By automating defect detection from in-process imaging, the large data sets of layer images collected from each build could be summarised to highlight potential issues to the machine operator. This would significantly reduce the production cost, as the 3D objects would require less post-build quality assessment. Overall, this will enable a more robust supply chain and reduce the time-to-market of 3D objects.

5 Conclusion

This study has established the efficacy of ML models in detecting LPBF defects from powder bed images. It investigated the application of a deep neural network model to predict porosity from in-process images of LPBF. Besides data capture, labelling is the biggest challenge in developing accurate ML models. The study proposed two approaches, CAD-assisted and XCT-assisted labelling, for labelling in-process images of test cylinders designed with seeded pores ranging in size from 0.02mm to 2mm. Experiments revealed that the XCT information is a better benchmark for accurate labelling. The CAD-assisted labelling was unreliable because pores designed into the CAD file might not be created in the final test specimen. The deep CNN model trained on the CAD-labelled data set achieved an accuracy of 90%, but its recall on porosity images was only 54% because of the incorrect labelling of the images. In contrast, XCT-assisted labelling of in-process images was reliable, accurate, and effective, and it produced better results: the deep CNN model distinguished porosity images from non-porosity images with an accuracy of 97%. The model successfully detected pores as small as 0.2mm on in-process images. This is a big step towards porosity identification from in-process images. Moreover, we found that the balanced data set resulted in more generalised and less biased training than the imbalanced data set: it improved the model’s precision from 89% to 97% and its recall from 85% to 97%. The proposed model’s accurate prediction of porosity defects will help reduce post-processing costs, and early, real-time, in-process porosity detection will enable AM machine operators to adjust the printing parameters to avoid defects.

In future experiments, an improved camera setup will help capture higher-definition images, which will in turn enable the capture of smaller pores. Defects such as porosity, balling, lack of fusion, and surface deformation will be created naturally in test objects by altering process parameters such as laser power, scan speed, and scan strategy. To develop these findings further, more objects will be designed to encourage defect formation during the LPBF process. Moreover, the current model’s learning will be tested on new defects by transferring its weights to new deep learning models.