Smart laser Sintering: Deep Learning-Powered powder bed fusion 3D printing in precision medicine



Introduction
Traditional medicines, which often adopt a one-size-fits-all approach, are effective in only 30-50 % of patients (Lancet, 2018). This has driven the pharmaceutical industry's push towards personalized medicine, where medications are tailored to individual needs in terms of dosage and composition (Nørfeldt et al., 2019). However, the pursuit of personalized medicine brings challenges. A major challenge is that manufacturing personalized medicines using traditional methods is very expensive and inefficient (Seoane-Viaño et al., 2021). Two state-of-the-art technologies offer promising solutions to these problems: three-dimensional (3D) printing and Machine Learning (ML) (Trenfield et al., 2022).
3D printing is a term that describes additive manufacturing technologies that develop 3D objects from computer-aided designs (CAD), layer by layer (Andreadis et al., 2022; Krueger et al., 2024). This advanced technology enables the seamless tailoring of medicines and has been successfully used to develop a range of drug delivery systems (Awad et al., 2022; Funk et al., 2024). Selective laser sintering (SLS) is a 3D printing powder bed fusion technology that has attracted attention for pharmaceutical applications owing to its suitability for large-scale production and simplicity (Hettesheimer et al., 2018; Seoane-Viaño et al., 2024). Utilizing primarily carbon dioxide lasers, this method fuses powder particles. The key advantages of SLS include the ability to produce intricate 3D objects without the need for support structures and the use of powder feedstock material without requiring solvents (Charoo et al., 2020). This technology also allows for the recycling of feedstock material and is adaptable to large-scale production. However, SLS printing was not initially designed to produce medicines. Therefore, developing drug formulations for 3D printing, known as pharma-inks, is an iterative process that is difficult to streamline since it relies on user expertise to ensure successful printing outcomes (Carou-Senra et al., 2024). This presents a significant barrier to its implementation in the clinic (Awad et al., 2021). Predicting printability before printing medicines could save costs and resources and eliminate the need for an expert in the clinic.
ML learns patterns from data rather than relying on explicit programming, and it has proven to be effective in making predictions in pharmaceutics (Gavins et al., 2022; Suryavanshi et al., 2023). In recent years, there has been a surge of interest in the potential applications of ML within the field of 3D-printed pharmaceuticals, with studies exploring different 3D printing technologies. ML has been employed to optimize various aspects of 3D printing (Goh et al., 2021), including process parameter optimization (Gan et al., 2019), quality control (Scime and Beuth, 2019), and the CAD of 3D printed products (Bin Maidin et al., 2012). Among these technologies, fused deposition modelling (FDM) has featured commonly, with several successful attempts at predicting printed medicines' printability and mechanical properties (Elbadawi et al., 2020; Ong et al., 2022). ML has also shown promise in predicting outcomes for pharmaceuticals printed using inkjet technology (Carou-Senra et al., 2023). Previous work demonstrated the feasibility of using a decision tree to predict the effect of energy density and particle size distribution on the SLS printability of irbesartan tablets (Madžarević et al., 2021) and the use of multi-modal data to predict the printability of SLS formulations (Abdalla et al., 2023). While there has been research exploring the use of ML in other fields using SLS printing and neural networks (NN) for other 3D printing technologies (Mahmood et al., 2021), there has been limited research on the use of Deep Learning (DL), a subset of ML which mimics human neural circuitry, for SLS printing (Azizi, 2023). Furthermore, none of the existing studies have explored explaining or addressing the trade-off between accuracy and prediction confidence, a frequent problem within the application of ML in healthcare (An et al., 2023) and pharmaceutics (Bannigan et al., 2023). Moreover, the vast array of factors influencing the printability of drug formulations, including molecular structure, mechanical properties, particle size, melting point, and glass transition temperature, among others, has resulted in an inconsistency in the features employed to characterize medicines for ML. Evaluating these features and developing calibrated models is crucial for enabling accurate and confident predictions of the 3D printability of medicines.
To this end, this study aimed to develop an interpretable, uncertainty-calibrated DL model to predict the printability of SLS formulations. Therefore, a Deep Ensemble was employed, which uses multiple NNs in parallel to make predictions, based on the state-of-the-art method for uncertainty quantification (UQ) developed by Lakshminarayanan et al. (2017). The Deep Ensemble was supplemented with explainability analysis and utilized to predict the printability of SLS formulations. Multiple feature sets were tested, revealing that the Morgan fingerprint (MFP) features offered the best approach for training the ensemble NN and yielded the best trade-off between confidence and accuracy, achieving 90 % accuracy and high confidence. Further explainability analysis revealed materials that contribute either positively or negatively to SLS printability, offering insights scientists can use to optimize SLS formulations.

Materials
All materials and suppliers can be found in Table S1.

Pharma-ink preparation process
Multiple formulations containing a variety of drugs and excipients were prepared. All formulations contained Candurin®, a photoabsorbent that is needed for printing using a blue diode laser that operates at a wavelength of 445 nm (Fina et al., 2017). The materials were sieved using a 180 μm sieve and weighed separately to make up 20 g of the final product. The materials were then mixed with a pestle and mortar until a uniform color was obtained and then sieved again using a 180 μm sieve.

SLS 3D printing
Cylindrical discs (10 mm diameter × 3.6 mm height) were designed on Onshape (Version 1.160, Boston, MA, USA) and the Standard Triangle Language (STL) files were exported into the Sintratec central programme (Version 1.1, Sintratec Kit, AG, Brugg, Switzerland). The pharma-inks were transferred to an SLS printer (Sintratec Kit, AG, Brugg, Switzerland) to print the products following the standard procedure found in the literature (Fina et al., 2017). Formulations were considered printable if the produced disc had no deformations or charring, had good structural integrity and shape, and maintained integrity during post-printing processing.

Particle size characterization
The particle size distribution of each material was measured in triplicate using a laser diffraction particle size analyzer, the Mastersizer Malvern 3000 (Malvern Panalytical, UK) with the Aero S dry powder dispersion attachment. An aliquot of the powder was added to the feeding tray, and air was used as the dispersion medium. The particle size distributions were obtained at 10 %, 50 %, and 90 % of the volume distribution.

Data curation
Data was utilised from Abdalla et al. (2023), which comprised information on 169 distinct medicines derived from 77 materials with varying material compositions, and whether they could be printed using a desktop SLS 3D printer (Sintratec Kit, AG, Brugg, Switzerland) into cylindrical discs (10 mm diameter × 3.6 mm height). This dataset was further supplemented with in-house formulations, which were developed in an identical manner, and literature data (Allahham et al., 2020; Barakh Ali et al., 2019; Davis et al., 2021; Fina et al., 2018a; Fina et al., 2018b; Hamed et al., 2021; Thakkar et al., 2021a; Thakkar et al., 2021b; Trenfield et al., 2018; Trenfield et al., 2020) to produce a dataset of 278 pharma-inks made up of 115 materials.

Machine learning models
All ML models were run on a MacBook Pro (Operating System: macOS 12.6; Processor: 2.9 GHz Dual-Core Intel Core i5; RAM Memory: 8 GB; Apple, CA, USA). Python (Version 3.10.4) was used to run the ML models (Python Software Foundation). DL models were run on a server (Operating System: Ubuntu 20.04 LTS; Processor: AMD EPYC 7282 16-core 2.8 GHz; RAM Memory: 512 GB; GPU: RTX 3090 24 GB). All ML models were deployed using the Scikit-learn (Version 1.1.1) Python package (Pedregosa et al., 2011), except for extreme gradient boosting, which was deployed through its own library (Xgboost Version 1.6.1) (Chen and Guestrin, 2016), and the Deep Ensemble, which was deployed using PyTorch Lightning (Version 2.0.4). To help visualize the data, two dimensionality reduction models were employed, t-SNE (van der Maaten and Hinton, 2008) and UMAP (McInnes et al., 2018).

Shapley values
SHapley Additive exPlanations (SHAP) is an algorithm that calculates each feature's contribution to a positive or negative prediction (Lundberg and Lee, 2017). To understand the decisions made by the ensemble, Shapley values were computed using the Python SHAP package (Version 0.42.1).
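To make the idea of a per-feature contribution concrete, the sketch below computes exact Shapley values for a toy scoring function by enumerating every subset of features. It is illustrative only: the SHAP package estimates these quantities for real models rather than enumerating subsets, and the feature names and the `toy_score` function are hypothetical, not taken from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley value of each feature for a set-valued scoring function."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_score(present):
    """Hypothetical printability score for the set of materials present."""
    score = 0.5
    if "polymer" in present:
        score += 0.3
    if "liquid_plasticiser" in present:
        score -= 0.4
    if "polymer" in present and "liquid_plasticiser" in present:
        score += 0.1  # interaction term, shared between the two features
    return score


features = ["polymer", "liquid_plasticiser", "colourant"]
phi = shapley_values(toy_score, features)
```

In this toy case the material that never changes the score receives a contribution of zero, and the contributions sum to the difference between the score with all materials present and the score with none.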

Deep ensemble
An ensemble model of 5 identical NNs was employed to predict the printability of medicines (Lakshminarayanan et al., 2017). Each member of the ensemble was trained on the entire dataset but with different initializations of the random weights. Each NN in the ensemble was a residual feed-forward network with N layers, each with a hidden size of H and a rectified linear unit (ReLU) activation function. The input to each network was a vector of size F, where F is the number of input features, and the output was a probability distribution obtained through a sigmoid activation function. Each network was independently trained using a binary cross-entropy loss function and the Adam optimizer. 1D Batch Normalization was used after each layer to improve the training stability and robustness to initialization. The predictions of the individual ensemble members were combined after the sigmoid activation by taking an average over the probabilities. The hyperparameters tuned were the learning rate (0.01, 0.001, 0.0001), the width (32, 64, 128 nodes) and depth (1, 2, 3 layers) of the networks, and the weight decay (0.01, 0.001, 0.00001). For the Embedding model, attention dropout was also tuned (0.0, 0.1, 0.2). Training was done for 25, 50 and 100 epochs for each model. The code for the NN has been made available as a supplementary file.
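The combination step described above, averaging the members' probabilities after the sigmoid, can be sketched in a few lines. This is a minimal illustration and not the supplementary code: the logits are made-up numbers, the 0.5 decision threshold is an assumption, and the standard deviation across members is included only as one simple way to look at disagreement.

```python
import math
import statistics


def sigmoid(z):
    """Map a raw network output (logit) to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def ensemble_predict(member_logits):
    """Average member probabilities after the sigmoid.

    member_logits holds one raw output per ensemble member for a single
    formulation. Returns the mean probability and the spread (population
    standard deviation) of the member probabilities.
    """
    probs = [sigmoid(z) for z in member_logits]
    return statistics.fmean(probs), statistics.pstdev(probs)


# Hypothetical outputs of 5 members for one formulation.
logits = [2.0, 1.5, 2.2, 1.8, -0.3]
mean_prob, spread = ensemble_predict(logits)
# Assumed decision rule: probability of at least 0.5 means printable.
label = "printable" if mean_prob >= 0.5 else "non-printable"
```

Averaging probabilities rather than hard labels keeps the information about how strongly each member leans, which is what the calibration metrics later in the paper evaluate.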
The input features, F, used for the first layer of each NN were varied. Prior to inputting into the NN, all features were normalized to a range of 0 to 1. The different features which were inputted into the model are described below:
• Drug formulation composition: The most fundamental feature was a one-hot encoded feature vector, F = 77, of the different materials that comprise the pharma-ink. Materials present were assigned a fraction representing their proportion in the formulation and absent materials were assigned 0. This approach was included because it is commonly used in the majority of 3D printing machine learning papers, allowing for direct comparison of model performance. Furthermore, this feature set is the most interpretable and provides clear insights into the utilized formulations.
• Embeddings: The feature set consisted of learning an input embedding lookup table based on material ID with an added bias depending on the proportion of the material in the drug, which was summed together, F = 32. Additionally, a self-attention mechanism was trialed on the input of each NN.
• Morgan fingerprint: The MFP is a binary vector which represents the molecular structure of materials (Morgan, 1965). The simplified molecular-input line-entry system (SMILES) notation for each material was obtained from PubChem, and the MFP (2048 bits, radius 2) was computed using RDKit (Version 2022.9.5). Each MFP was scaled by the proportion of the material in the pharma-ink, with the latter represented as an array of the material fingerprints, F = 16384.
• MFP and particle size: The MFP of materials, scaled according to their proportion in the pharma-ink, was combined with the particle size of the individual materials and a binary score to identify whether the material was sieved before printing. Particle size is critical for the SLS printer used, as the layer size is approximately 100 μm; hence, materials with a particle size greater than 180 μm do not print successfully. Particle size and size distribution also influence particle flow, making them valuable inputs for the model. Two different approaches were compared as inputs: using the median particle size (with a feature vector size of F = 16400) and using both the median and range of particle sizes (with a feature vector size of F = 16408). These variations were explored to understand the effect of different representations of particle size on the model's performance.
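The two proportion-based encodings in the list above can be sketched as follows. This is a simplified illustration under stated assumptions: the fingerprints are short hand-written bit lists standing in for the 2048-bit vectors produced by a cheminformatics toolkit, the material names are hypothetical, and the zero-padding of unused material slots is an assumed way of obtaining a fixed-length input.

```python
def composition_vector(formulation, material_index):
    """Fraction of each known material; absent materials stay at 0.

    formulation: {material name: fraction in the pharma-ink}
    material_index: {material name: position in the feature vector}
    """
    vec = [0.0] * len(material_index)
    for name, fraction in formulation.items():
        vec[material_index[name]] = fraction
    return vec


def scaled_fingerprint_features(formulation, fingerprints, max_materials):
    """Concatenate each material's fingerprint scaled by its proportion.

    Unused material slots are zero-padded (an assumption of this sketch)
    so that every formulation yields a vector of the same length.
    """
    n_bits = len(next(iter(fingerprints.values())))
    if len(formulation) > max_materials:
        raise ValueError("formulation has more materials than available slots")
    features = []
    for name, fraction in formulation.items():
        features.extend(bit * fraction for bit in fingerprints[name])
    features.extend([0.0] * ((max_materials - len(formulation)) * n_bits))
    return features


# Hypothetical materials with toy 4-bit fingerprints.
fingerprints = {"material_a": [1, 0, 1, 0], "material_b": [0, 1, 1, 0]}
material_index = {"material_a": 0, "material_b": 1, "material_c": 2}
formulation = {"material_a": 0.75, "material_b": 0.25}

comp = composition_vector(formulation, material_index)
mfp = scaled_fingerprint_features(formulation, fingerprints, max_materials=3)
```

Because the proportions are fractions and the fingerprint bits are 0 or 1, both encodings already fall within the 0 to 1 range mentioned above.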

Model performance
Model performance was evaluated using 5-fold cross-validation, where the data is split into 5 folds and the model's ability to predict each fold is evaluated (Ting et al., 2010). The four metrics used were accuracy, area under the receiver operating characteristic curve (AUROC), log loss, and Brier scores (Table 1).
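For readers who want to see how the four metrics relate to predicted probabilities, a minimal standard-library sketch is given below. It uses the standard binary definitions; the labels and probabilities are invented for illustration, and the study itself would have relied on library implementations rather than this code.

```python
import math


def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of thresholded predictions that match the labels."""
    hits = sum(int(p >= threshold) == y for y, p in zip(y_true, p_pred))
    return hits / len(y_true)


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


def brier_score(y_true, p_pred):
    """Mean squared difference between probabilities and binary labels."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def auroc(y_true, p_pred):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = printable) and predicted probabilities.
y_true = [1, 0, 1, 1, 0]
p_pred = [0.9, 0.2, 0.6, 0.4, 0.1]
scores = {
    "accuracy": accuracy(y_true, p_pred),
    "auroc": auroc(y_true, p_pred),
    "log_loss": log_loss(y_true, p_pred),
    "brier": brier_score(y_true, p_pred),
}
```

Note that the fourth sample is misclassified at the 0.5 threshold yet is still ranked above both negatives, so accuracy and AUROC can disagree; log loss and the Brier score penalise that prediction in proportion to how far its probability is from the label.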

Ensemble performance
This study aimed to develop an uncertainty-optimized DL model to assess the printability of SLS formulations. Therefore, an ensemble of five neural networks, a state-of-the-art uncertainty quantification method, was used. Initially, the study compared the performance of the ensemble NN with traditional non-DL-based machine learning models from previous work (Abdalla et al., 2023) to assess its suitability for the SLS dataset. Results from the comparative scores demonstrate that the Deep Ensemble notably outperformed the conventional models for the Formulation dataset (Fig. 1). This is evident from the low Brier score, which measures uncertainty, coupled with high accuracy and area under the receiver operator characteristic curve (AUROC). This finding contradicts existing literature, which demonstrates that tree-based models generally surpass DL for small-to-medium sized tabular data sets (Grinsztajn et al., 2022).
Due to the model's promising potential, it was evaluated further. In addition to the feature sets used previously (Abdalla et al., 2023), three additional feature sets were selected. Firstly, the MFP feature set was incorporated, given its successful track record in numerous drug discovery and pharmaceutical machine learning studies. The caveat is that the MFP is often considered too simplistic, with multiple materials having the same MFP (Dhakal et al., 2022). Secondly, particle size data was concatenated with the MFP within the model. This integration acknowledges the critical role of particle flow in SLS printability (Madžarević et al., 2021). Lastly, an experiment used an embedding lookup table based on material identification, coupled with an attention mechanism, to investigate the model's ability to learn insightful information about the constituent materials of the formulations.
Both the ensemble and individual NNs were trained using the different feature sets; the performance comparison for predicting the 3D printability of drug formulations is presented in Table 2. Across all metrics, the ensemble of independent NNs outperforms a single NN on average. The best-performing feature set incorporated both the MFP and particle size, achieving a high accuracy and AUROC and low uncertainty, as measured using the Brier score and log loss. This was closely followed by the MFP feature set alone. Both feature sets perform worse without the implementation of proportions, with an accuracy of 0.8824 (data not shown). The worst-performing features are Embedding (with attention) and Embedding (without attention). This is likely due to the small size of the dataset, meaning that the model is not able to learn the embeddings, especially when attention is employed. Overall, the results suggest that the MFP and particle size features are the most effective in confidently predicting the printability of SLS pharma-inks, and these features have been used in the qualitative analysis.

Table 1
Common ML metrics. TP: true positive, TN: true negative, FP: false positive, FN: false negative, AUROC: area under the receiver operator characteristic curve, N: population size, C: number of classes, y_i: actual value, p_i: predicted probability.

Metric | Focus | Calculation
Accuracy | The proportion of correct predictions among all predictions made | $\frac{TP + TN}{TP + TN + FP + FN}$
Log loss | Model calibration metric, error between the predicted probabilities and the actual binary labels | $-\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{i,c}\log(p_{i,c})$
Brier score | Model calibration metric, measures the mean squared difference between the predicted probabilities and the binary labels | $\frac{1}{N}\sum_{i=1}^{N}(p_i - y_i)^2$
AUROC | Measures the ability of a model to distinguish between positive and negative classes, across different probability thresholds | The area under the curve of true positive rate against false positive rate at different probability thresholds; 0.5 refers to random chance and 1.0 is a perfect model

Qualitative analysis
To gain insights into the decision-making processes of the model, the hidden features connecting to the output layer of each NN in the ensemble were extracted and compressed into two dimensions using UMAP and t-SNE. Although dimensionality reduction may lead to the loss of information regarding the model's decision-making process, it provides a visualization that enables general explanations for the model's decisions. The resulting plots are in Fig. 2. Two different NN hidden representations were plotted to highlight the ability of each NN to focus on different areas of the feature space. It is important to note that prior to inputting the data into the ensemble, dimensionality reduction of the data shows no inherent clustering or grouping of the data, demonstrating that the individual NNs can make meaningful decisions with the data. This is in line with previous research which demonstrated that unsupervised learning methods alone are insufficient for classifying the printability of medicines (O'Reilly et al., 2021).
The resulting plots show clear clustering for models that performed well, while poorly performing models have no inherent clustering (Figure S1). These visualizations can serve practitioners as a sanity check, which can be performed on the hidden features to see if the model to be deployed can make meaningful predictions. While the plots show large clusters of printable and non-printable formulations, they also identify subclusters of formulations with the same or similar materials and different grades and proportions of the same material. The model is also able to differentiate between polymers consisting of the same monomers and correctly classify their printability, despite not being given this information; this is likely through differences in particle size.

Table 2
Performance comparison on 5-fold cross-validation for different features predicting the printability of SLS formulations. Single denotes a single network; Ensemble denotes combining predictions of 5 networks. The best results are highlighted in bold. Abbreviations: MFP: Morgan fingerprint, AUROC: area under the receiver operator characteristic curve.
Particle size plays a role in classification, as seen in clusters with different material structures but similar particle sizes, for example, formulations containing Benecel K100LV and polyvinyl alcohol (Fig. 2, UMAP 1; highlighted in brown). Furthermore, incorporating material proportions was crucial in accurately classifying formulations. This is evident as printable and non-printable formulations with identical materials but varying proportions were correctly classified by the model, despite clustering together.
The plots demonstrate that the model correctly identifies materials that make formulations non-printable. For example, the model accurately clustered formulations containing Tween 80, polyethylene glycol (PEG) 2000, and PEG 400 as being non-printable. Similarly, the model correctly identified a cluster of materials containing triethyl citrate (TEC), Tween 80, and polypropylene glycol (PPG) as not printable, despite their different molecular structures (Fig. 2, UMAP 1; highlighted in violet). The only commonality was that their particle size was set to zero, as they were all liquids at room temperature. This demonstrates the model's ability to correctly identify the importance of particle size as an input, which played a crucial role in modifying the expected outcome.
The model correctly clusters formulations containing the same material but different grades; however, their correct classification is inconsistent. For example, formulations containing Blanose 12M31P EP (printable) and Blanose 9M31F and 7MF (non-printable) are grouped (Fig. 2, t-SNE 1; highlighted in red). Nevertheless, 12M31P EP is classified as printable by the model even though it has the same proportion of Blanose as the non-printable inks. This outcome exceeds expectations and highlights the model's ability to identify subtle material differences. On the other hand, a formulation containing Klucel LF was wrongly classified because it clustered with non-printable inks containing Klucel of different grades (Fig. 2, UMAP 2; highlighted in violet). Different grades of materials are challenging to distinguish because manufacturers determine grades differently and they require different features to be identified. For instance, Klucel grades have varying molecular weights and particle sizes, but molecular weight could not be included as a feature since the molecular weights of multiple materials were not disclosed by the manufacturers. As a result, the model had difficulty differentiating between the various grades of certain materials, highlighting a weakness in its ability to classify them correctly.
The plots provide other valuable insights into potential reasons for the misclassification of the printability of formulations. For instance, a formulation containing PEG 2000 is misclassified as printable because it clusters with other, larger molecular weight PEGs (Fig. 2, UMAP 1; highlighted in red). However, its smaller molecular weight results in a lower melting point, making it unprintable. Some misclassifications were also observed in formulations surrounded by unrelated materials in terms of structure and particle size, which could explain the inaccurate classification (Fig. 2, UMAP 2; highlighted in red). For example, the model struggles with classifying a formulation containing multiple materials: Candurin®, mannitol, magnesium stearate, polyvinylpyrrolidone (PVP), riboflavin, TEC, and xylitol. It was classified as printable, but there was no clear relationship with surrounding formulations, which were a mixture of printable and non-printable inks, indicating the model's confusion in grouping it (Fig. 2, t-SNE 2; highlighted in red). Misclassifications also occur due to the presence of proximate formulations with materials that had similar structures. One example is the incorrect classification of a formulation containing cellulose acetate as non-printable; it was grouped with non-printable formulations containing ethyl cellulose and chitosan, all of which are polysaccharides (Fig. 2, t-SNE 1; highlighted in violet).
While the combined feature set of particle size and MFP yielded the best performance in this study, the marginal difference compared to using the MFP alone warrants careful consideration. Notably, the MFP feature set had a higher AUROC, indicating a reduced likelihood of false positives, which could otherwise lead to significant waste. Integrating particle size necessitates additional laboratory work for its determination, adding complexity and resource demands to the process. Given the near-similar performance achieved by the MFP alone, without the requirement for any lab work, the justification for employing both MFP and particle size together becomes challenging. The MFP alone offers a pathway to save time and resources and enhance process automation. As a result, it was concluded that further studies should prioritize the exclusive use of the MFP, aligning with the broader goals of efficiency and effectiveness.

Exploring model performance on a larger dataset
The advantage of identifying an optimal feature set that bypasses the need for additional laboratory tasks allowed for dataset expansion by including formulations from the scientific literature. This integration, otherwise inhibited by the local unavailability of certain materials, can be used to improve the model's performance and broaden its applicability. The new dataset consists of 278 formulations made up of 115 materials. Exploratory data analysis of the new dataset revealed a similar trend to the initial data set. Candurin® was in 96.7 % of the drug formulations, and the most frequently used drug was paracetamol, which was in 26 % of all the formulations (Fig. 3A). Table S2 enumerates the frequency of use of each material. Overall, 53.8 % of the formulations were successfully employed to print tablets (Fig. 3), meaning the target predictions are slightly less balanced than the previous predictions. This is anticipated as the mined literature only included positive data, which is a major barrier to ML in formulation development (Bannigan et al., 2023).
The model was re-trained with the new data, using the MFP, the best-performing dataset, and the drug formulation composition to explore explainability. A reduction in model performance became evident with the introduction of the new dataset (Table 3), likely due to previous overfitting, i.e. the previous model was not generalizable. Although the previous model performed well on the unseen testing set, the materials in the previous dataset were much more similar to one another. The initial dataset included multiple grades of the same material and other materials that share many chemical characteristics, and hence similar MFPs, so the model could perform well on the test set. Conversely, the new data set includes many new materials and forms a more heterogeneous data set, so although performance is reduced, this model is likely to be less overfit and more robust for classifying external data. In line with previous findings, this data demonstrates that the model trained using the MFP outperformed the drug formulation composition model. The model based on formulation composition showed better performance than that observed with the smaller dataset. Additionally, on average, the ensemble improved on a single NN.
This research builds on the work of Abdalla et al. (2023), which demonstrated the high accuracy of predicting SLS printability of medicines. However, this paper shows enhanced model performance and identifies the MFP as the optimal feature set for predicting printability. The advantage of this feature set over the previously used in vitro characterisation methods (Fourier transform infrared spectroscopy, X-ray powder diffraction, and differential scanning calorimetry) is that it requires no prior lab work. Furthermore, the developed model is both uncertainty-optimised and more interpretable, allowing users to have greater confidence in the predictions and a better understanding of the model's decision-making process. These improvements in interpretability and uncertainty optimisation can be applied to all current machine learning research in 3D printing, which has yet to demonstrate these aspects effectively.
Multiple studies have been carried out in the past few years investigating the use of ML to predict printability. Previous research on the printability of formulations reveals that NNs have surpassed tree-based models in performance (Elbadawi et al., 2020; Muñiz Castro et al., 2021). This is notable because the analysis utilized tabular data, a condition under which NNs do not usually outperform tree-based models (Grinsztajn et al., 2022). In addition, the Deep Ensemble demonstrated performance comparable to other models, despite relying on a smaller dataset. A particular resemblance can be seen in the work by Madžarević et al. (2021), as they also employed SLS printing in their study. That work displayed performance metrics similar to the present study. However, it is worth noting that the data was largely consistent, comprising a small dataset (27 formulations) of similar formulations, all containing irbesartan. This similarity increases the likelihood of overfitting in the model. Most other papers in this field have focused on thermal properties, which are less relevant for SLS printability. Unlike the technologies that depend on thermal polymers, SLS printing emphasizes the importance of particle flow as a critical factor (Madžarević et al., 2021). Compared to other studies, the dataset herein was the smallest, yet the performance was on par with the others.

Validating the Deep ensemble
Finally, the developed model was tested using lab data to evaluate the model's application in real-life scenarios. New formulations (Table S3) were developed, and their compositions were inputted into the model to assess printability before actual printing trials. Subsequent printing attempts with the drug formulations validated the model's predictions. Overall, 52 different formulations were tested, of which 46 % were printable. Model performance on this data is in Table 4.
The reduced performance of the model on the new data was anticipated, as this data included materials that had never been encountered during training. This unfamiliarity is reflected in the high log loss scores, indicating that the new data falls outside the training data distribution. In line with the literature, this highlights the utility of an ensemble approach in determining such discrepancies (Nemani et al., 2023). This is particularly observed with the model trained using the MFP dataset, as most misclassified drug formulations are those containing materials that were not included in the training dataset. Some misclassifications were due to the presence of different grades of the same materials in the training dataset. For instance, a drug formulation containing Kollidon CL-SF was wrongly classified as printable, likely because the training set included the printable Kollidon CL-M. As both drug formulations would have an identical MFP, the model failed to differentiate between them. This issue could potentially have been mitigated by incorporating particle size into the model, enabling the recognition of different material grades. Such an addition would offer a more nuanced understanding of the differences between these closely related materials. The drug formulation composition model demonstrated significantly poorer performance compared to the MFP model. This outcome is expected, as the MFP dataset reveals the chemical composition of pharma-inks, enabling the model to learn fundamental insights about formulations. These insights can then be applied to other formulations with different materials but similar chemical properties. In contrast, the formulation dataset limits the model to leveraging prior knowledge of formulations that share the same materials. To improve the model's performance on materials within the initial dataset, the best approach is to re-train the model with an expanded dataset including new, more diverse materials. Other approaches to improve model generalizability include augmenting the data, adding noise to the training data, or adopting a semi-supervised approach to training the model (Bishop and Nasrabadi, 2006).
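One way an ensemble can surface predictions that fall outside the training distribution is to look at the entropy of the averaged probability and flag formulations where it is high. The sketch below illustrates that idea only; it is not the procedure used in this study, and the member probabilities and the 0.9-bit cut-off are hypothetical values chosen for the example.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli prediction with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def flag_uncertain(member_probs, max_entropy=0.9):
    """Average member probabilities and flag the result if entropy is high.

    max_entropy is an assumed, illustrative cut-off; in practice it would
    have to be chosen on held-out data.
    """
    mean_p = sum(member_probs) / len(member_probs)
    entropy = binary_entropy(mean_p)
    return mean_p, entropy, entropy > max_entropy


# Hypothetical member probabilities for two formulations.
familiar = flag_uncertain([0.90, 0.95, 0.85, 0.90, 0.92])
unfamiliar = flag_uncertain([0.9, 0.1, 0.6, 0.4, 0.5])
```

In this toy case the members agree on the first formulation and it is not flagged, whereas the second, on which they disagree, averages to roughly 0.5 and is flagged for manual review.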

Exploring explainability
To validate the model's decision-making and gain further insight into its decision-making, the hidden features linking to the output layer of each neural network in the ensemble were extracted and compressed into two dimensions using UMAP and t-SNE, and compared to the input data (Fig. 4).Again, the model demonstrated that although the data initially was not clustered, it could group the printable and nonprintable formulations.The subclusters here are similar to those seen before (Fig. 2) − formulations with different proportions of the same material and those with different grades of the same material and materials with similar structures are grouped.However, while the plots provide some interpretability to the model's performance, they do not provide explainability and other methods should be used for this purpose.
To further explore model explainability and discern which materials contribute to a printable drug formulation, as well as which materials to avoid, the SHAP values of the model trained on the formulation composition dataset were calculated (Fig. 5). Notably, most of the materials make no contribution to the decisions made, a result anticipated due to the infrequent use of most materials, which leads to zero values in most formulations and, thus, no contribution to the final decision. However, common trends were observed, such as the strong negative contribution of TEC, PPG, and PEG 400 to the printability of the formulations. This trend was consistent with the training dataset, where no inks containing these materials were found to be printable. In contrast, materials like Kollicoat® IR, Eudragit® RSPO, mannitol, and Eudragit® L100-55, which are frequently used in SLS formulations, were found to contribute positively to the decisions made. These plots hold significant value for scientists in guiding effective formulation development. By recognizing which materials contribute positively to SLS formulations and which have a negative impact, researchers can tailor their inks and include specific components to achieve a printable drug formulation.
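The reasoning behind the zero contributions can be sketched with an exact Shapley value calculation on a toy model. This is an illustration under stated assumptions, not the SHAP library or the authors' ensemble: the linear scoring function, its weights, the material names, and the all-zero baseline are hypothetical, and `shapley_values` is a name introduced here. The sketch enumerates every subset of features, which is only feasible for a handful of inputs.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to a baseline input."""
    n = len(x)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Features in the coalition take their real value, the rest the baseline.
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                values[i] += weight * (predict(with_i) - predict(without_i))
    return values

# Hypothetical materials and weights for a toy linear printability score.
materials = ["polymer_A", "plasticizer_B", "filler_C"]
weights = [2.0, -3.0, 1.0]

def predict(fractions):
    return sum(w * f for w, f in zip(weights, fractions))

formulation = [0.8, 0.2, 0.0]   # filler_C is absent from this formulation
baseline = [0.0, 0.0, 0.0]      # assumed reference: no material present

contributions = shapley_values(predict, formulation, baseline)
for name, value in zip(materials, contributions):
    print(f"{name}: {value:+.2f}")
```

In this toy case the absent material receives a contribution of zero because its value equals the baseline, while the remaining contributions sum to the difference between the prediction for the formulation and the prediction for the baseline.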

Conclusion
The innovative approach presented in this study fuses two powerful technologies, ML and 3D printing, to evaluate the capacity of various ML models to predict the printability of SLS formulations. An ensemble of NNs was trained on diverse features to confidently predict the printability of drug formulations, whilst focusing on interpretability. An ensemble NN trained on the MFP of pharma-inks emerged as the optimal approach to predict SLS printability, yielding the best balance between confidence and accuracy and achieving 90 % accuracy. Subsequent explainability analysis revealed the materials that contributed positively or negatively to SLS printability. This study is the first to evaluate the use of an interpretable, uncertainty-calibrated DL ensemble in the fields of both 3D printing and pharmaceutics, and the proposed workflow can be used to accelerate the development of personalized medicines.

Fig. 1. Comparison of traditional ML model performance with the Deep Ensemble. The higher the accuracy and AUROC, the better the performance; the lower the log loss and Brier score, the lower the uncertainty, indicating a better calibrated model. Abbreviations: RF: random forest, LR: logistic regression, SVM: support vector machine classifier, GB: gradient boosting, XGB: extreme gradient boosting, DTr: decision tree, MLP: multilayer perceptron, KNN: K nearest neighbors, EXTr: extra trees.

Fig. 2. t-SNE and UMAP visualization of hidden features of the input data and the final hidden layer of the different individual NNs. Printable, Not Printable predictions and Misclassifications of the ensemble are denoted, and circles highlight notable clusters.

Fig. 3. Exploratory data analysis. (A) The distribution of use of different materials in the drug formulations and (B) the printability of different inks.

Fig. 4. t-SNE and UMAP visualization of hidden features of the input data and the final hidden layer of the different individual NNs. Printable, Not Printable predictions and Misclassifications of the ensemble are denoted.

Fig. 5. SHAP analysis for the ensemble Neural Network (NN). The color of each dot represents the material contribution to the decision (high is pink and low is blue), and the horizontal position determines whether the material contributed positively or negatively to the printability of the formulation. The swarm plot for each member of the ensemble is shown, as well as the swarm plot for the entire ensemble.

Table 3
Performance comparison on 5-fold cross-validation for different features predicting the 3D drug formulation printability. Single denotes a single network; Ensemble denotes combining predictions of 5 networks. The best results are highlighted in bold. Abbreviations: MFP: Morgan fingerprint, AUROC: area under the receiver operating characteristic curve.

Table 4
Performance comparison of the Deep Ensemble on predicting the printability of SLS formulations. Abbreviations: MFP: Morgan fingerprint, AUROC: area under the receiver operating characteristic curve.