Machine learning based data driven inkjet printed electronics: jetting prediction for novel inks

Machine learning (ML) as a predictive methodology can potentially reduce the configuration cost and workload of inkjet printing. Inkjet printing has many advantages for additive manufacturing and printed electronics including low cost, scalability, non-contact printing and on-demand customization. Inkjet generates droplets with a piezoelectric dispenser controlled through frequency, voltage pulse and timing parameters. A major challenge is the design of jettable inks and the rapid optimization of stable jetting conditions whilst preventing common problems (no ejection, perturbation, satellite drop, multiple drops, drop breaking, nozzle clogging). Material consuming trial and error experiments are replaced here with a ML based jetting window. A dataset of machine and material properties is created from literature and experimental data. After exploratory data analysis and feature identification, various (linear and non-linear) regression models are compared in detail. The models are trained on 80% of the data and root mean square error (RMSE) is calculated on 20% test data. Simple polynomial relationships between the input and output features yield coarse prediction. Instead, small ensembles of decision trees (DTs) (boosted DTs and random forests) have improved predictive power for drop velocity and radius with RMSE of 0.39 m s−1 and 2.21 µm respectively. The mean absolute percentage error is 3.87%. The models are validated with experimentally collected data for a novel ink where no data points with this ink were included in the training set. Additionally, several classification algorithms are utilized to categorize ink and printer parameters by jetting regime (‘single drop’, ‘multiple drops’, ‘no ejection’). Categorization and regression models are combined to improve overall model prediction. This article demonstrates that ML can be used to predict ink jetting behavior from 11 different ink and printing parameters. Different algorithms are analyzed and the optimal combination of algorithms is identified. It is shown that experimental and literature data can be combined and an initial dataset is created that other reserachers can build on in the future. ML enables efficient material and printing parameter selection speeding up the development of novel ink materials for printed electronics by eliminating jetting experiments that are money, time and material intensive.


Introduction
As an additive manufacturing strategy for electronics, inkjet printing is promising due to its advantages such as room temperature deposition, absence of vacuum processes, compatibility with numerous cheap, flexible substrates, and large-scale roll-to-roll processing [1].Inkjet printing is a digital, maskless process that allows patterns to be changed on the fly.Various microelectronic devices have been inkjet printed, including transistors [2][3][4][5], RFID tags [6,7], or passives such as capacitors and inductors [8,9].Typical inkjet dispensing is a drop-on-demand (DoD) operation that ejects ink droplets from a piezoelectrically driven nozzle (see figure 1(a) for an illustration).
We consider a piezoelectrically actuated dispenser here that is controlled through pulse voltage, frequency, and timing parameters to convey the necessary droplet generation energy.These parameters have important consequences for printing consistency.Various inks are used for the printing of conductor, insulator, or semiconductor materials.These inks have different electrical and material properties for different applications: mass loading or concentration, particle size, viscosity, density, surface tension, and acoustic wave speed.All of these ink properties affect the jetting and the pattern formation on the substrate.Pattern formation on the substrate can be further manipulated by changing the properties of the substrate and the spacing and order in which drops are deposited [10][11][12][13][14][15][16][17].Here, we will focus on the ejection of drops from the nozzle, which is the first step in the inkjet process.There is only a small window of material and signal (e.g.jetting voltage, frequency, or timing) parameter combinations where there is stable jetting with optimal drop velocity and volume.Outside of this window, either there is no ejection from the nozzle or the jetting is unstable, and the drop breaks up into multiple droplets or satellites (see figure 1(b)).Drop quality is critical for successful manufacturing using inkjet printing.If the drops break or jet with satellites, the printed lines or circuit patterns will have irregular shapes and defects.This can adversely affect performance, yield, and variability of printed devices such as transistors.Therefore, methods need to be developed to achieve precise jetting.
Several theoretical and experimental studies have explored the underlying physics and experimental conditions for droplet generation and jetting characterization.Hoath et al estimated drop speeds from a range of industrial DoD inkjet print heads, namely Xaar, Spectra Dimatix, and MicroFab through simple theoretical models along with numerical simulations [18].They concluded that drop speed depends on fluid properties, nozzle exit diameter, and printer voltage magnitude, and different types of fluid (Newtonian, weakly elastic, or highly shear-thinning) demonstrate linearly increasing drop speed with voltage above a threshold.Using modeling and numerical simulations of fluids with varying fluid properties (surface tension, viscosity), they reported that the final drop speed is a function of voltage, the threshold voltage (a function of viscosity), and nozzle tip area.They maintained persistent frequency, rise time, fall time, and dwell time throughout the investigation.Liu et al showed that droplet formation is impacted not only by the fluid properties such as viscosity but also by the driving waveform parameters [19].Lai et al carried out computational fluid dynamics (CFD) to study physical phenomena of droplet ejection.They compared experimental results with numerical simulations and determined how nozzle channel curvature affects drop velocity, volume, and the quantity of satellite drops.In addition, they demonstrated that nozzle diameter, voltage amplitude, or frequency influence drop volume and velocity [20].Reis et al experimented with a piezoelectric inkjet printer to explore the drop volume and velocity as a function of voltage and frequency, and summarized drop volume as a function of Ohnesorge number [21].Yang et al eliminated satellite droplets by designing nozzles with super-ink-phobicity and ultralow adhesion with a view to enhancing Rayleigh filament instability along with speeding up drop pinch-off from the nozzle [22].Wu et al inspected the droplet formation in terms of volume and velocity resulting from monopolar drive voltage, frequency, timing, and compared the results with the acoustic wave propagation theory.They show that drop velocity and volume increase with increasing voltage [23].He et al created a binary fluid model with time-dependent actuation to investigate droplet formation in piezoelectric inkjet printing.They determined that nozzle wettability and high surface tension improve drop quality, and increase drop speed [24].Another common approach is to determine a jetting window of conditions under which stable droplets are formed [25][26][27].These windows are often expressed in terms of non-dimensional fluid mechanical numbers such Ideally, a stable stream of well-defined drops is created.Outside of this desirable regime, the ejected drops can break up into multiple droplets, or there can be no ejection.
as Z-number, Weber number (We), Capillary number (Ca), Ohnsorge number (Oh), or Reynolds number (Re).The stable jetting window is typically bounded by straight lines defined in terms of some of these non-dimensional quantities.There are several challenges with this approach.The boundaries of these windows are not very accurate, and different studies in the literature report jetting windows with somewhat different numerical values.This may be due to the fact that different reports study different printable materials that may behave differently.Another difficulty is that most of these non-dimensional numbers, except Oh and Z, contain a velocity term.This is no problem when results are analyzed after performing printing experiments.However, for a new ink, the drop velocity is generally not know a priori and needs to be determined experimentally as a function of printer parameters.Therefore, there is a need for a method that can predict jettability, drop velocity, drop volume, and optimal printer parameters for a new ink.
Machine learning (ML) is a technique that can potentially forecast drop velocity and volume as well as categorize jetting type.The strength of ML is that it can capture complex dependencies between a large number of input parameters and a desired output.This sets ML apart from the above reviewed analytical models.Whilst these models are powerful in terms of physical insight, they are limited in terms of quantitative predictions for a large and complex parameter space.In many cases, ML is a blackbox approach that does not give insight into how a model arrived at its answer; however, some algorithms such as decision trees (DTs) can also be interpreted to get insights into the underlying processes.Using ML, is motivated not only in terms of deriving a more accurate jetting window but also saving experimental time and cost [28].Recently, ML has been applied to the design of bioinks for a different printing method, extrusion printing of cell-laden gels [29,30].For inkjet printing, Wu et al have used a data-driven approach to predict drop speed and volume of a polymer ink.This work predicts jetting from only three signal parameters (voltage, pulse duration, and rise time) [31].These features are not sufficient if the material is varied as they do not include fluid properties such as viscosity, density, or surface tension.To create a more general model, a larger and more complete dataset is key.CFD simulations have been used to generate more training data with limited experimental validation [32,33].However, there is a shortage of experimental learning-based studies on printing and material attributes during inkjet printing at the micrometer scale.
Here, we demonstrate a more comprehensive ML approach that takes various ink and printer parameters into account (11 features in total).The model is trained on a range of inks to learn general dependencies using experimental data from our own lab and from literature.For previously untested inks, this learning-based approach can help select optimum ink material properties and identify the operational region of the nozzle signals to achieve the desired drop volume and velocity.

Methodology
The problem was divided into three parts: (a) collect data experimentally with different printing conditions and dissimilar materials, as well as compile literature data, (b) process experimental image data and explore feature importance, then relationships, (c) construct and validate predictive models for drop formation and finally assemble results.The chart in figure 2(a) represents the workflow.Lab image datasets of ejected drops are processed through an image processing pipeline, as shown in figure 2(b), to calculate the experimental drop velocity and radius.As the size of the dataset is of critical importance for ML, we combined our experimental data with literature data.The laboratory dataset was cleaned to exclude outliers and duplicates before merging the datasets.Then, feature relationships are extracted.For drop velocity and radius prediction, the same input features are pre-processed, sampled, and different forecasting models are applied to the training set with hyperparameter optimization.The optimized trained models are saved and tested on the test data.

Parameter variation
Data collection, identification of essential input features and target output are the preliminary steps for implementing a ML model.Input data assessment was completed in two stages: varying material and machine parameters.As a function of these inputs, the model predicts two continuous outputs (drop velocity and drop radius) as well as one discrete output (jetting quality with three levels: 'single drop' , 'no ejection' , or 'multiple drops').In the case where multiple drops are ejected simultaneously, the biggest drop is referred to as the primary drop, and the next biggest drop is termed as the secondary drop.
The critical machine parameters that are considered for successful inkjet printing are frequency, dwell voltage (V dwell ), echo voltage (V echo ), dwell time (t dwell ), echo time (t echo ), rise time (t rise ), fall time (t fall ) and nozzle diameter, as shown in figure 1(a).Each of these parameters has a different effect [34][35][36][37].The input waveform might consist of unipolar, bipolar, or sinusoidal pulses; nonetheless, the bipolar trapezoidal signal creates more stable drops.The signal waveform has a DC offset voltage level, which is called idle voltage, and it is set at zero.Other than this DC voltage level, the positive and negative pulse amplitudes have essential roles to play.The positive voltage amplitude is called dwell voltage, and the time required to reach this amplitude from the idle voltage is called the rise time, providing an interval for the initial fluid expansion.The negative voltage amplitude is named echo voltage.The time duration of the falling edge of the dwell pulse is labeled as fall time, which determines fluid compression and drop discharge time from the nozzle.Thus, echo voltage and its timing adjustment can potentially reduce unstable drop formation.Durations of the positive and negative voltage pulse plateaus are termed as dwell and echo time and need to be large enough to allow for pressure wave propagation through the dispenser.Conventionally, V dwell = −V echo and t dwell = 2 × t echo to allow pressure wave optimization within the nozzle [37].The collected literature data includes a wide range of values for V dwell , V echo , t dwell , and t echo .Density, viscosity, and surface tension were varied as material parameters affecting material printability.For the lab data collection, the feature space spans the following range: There are some physical restrictions for choosing the feature space.Materials parameters have been chosen to lie within the specifications of common inkjet nozzle manufacturers, Microfab and Fujifilm Dimatrix.There is a range for the printer parameter values (1-7) beyond which the nozzle does not work.When the frequency is above 2500 Hz or rise time, fall time, or dwell time exceed 40 µs, the jetting becomes unstable, and in most cases, the drop velocity and radius are not measurable.For a unipolar pulse, dwell voltage can go up to 80 V.In the bipolar case beyond 45 V, mostly unstable jetting is produced with echo voltage below −80 V. Considering all these physical restrictions, the above range was selected for collecting the lab data.

Experimental
The experimental data points were collected, varying the identified attributes as discussed in section 2.1, and merged with corresponding literature data points.

Data collection
Six solvents were used: triethylene glycol monoethyl ether (TGME), 2-propanol (IPA), 1-hexanol, toluene, methoxy ethanol, and a 50% ethylene glycol-water mixture.All solvents were reagent grade purchased from Sigma-Aldrich.Silver nanoparticle ink (ANP DGP 40LT-15C) with the major solvent TGME was bought from Advanced Nano Products, Co., Sejong, Korea.To have a variation in material properties, three binary mixtures of TGME and silver ink were prepared with 70%, 80%, and 90% silver ink concentration, respectively.In total, data was collected for six different solvents and three different concentrations of silver ink for model training and testing.An ink consisting of graphene oxide (GO) suspended in water and ethylene glycol was used for evaluation of the models with no data points from this ink being included in training.Jettability was tested using a custom-built inkjet printer using the Microfab JetDrive TM III electronics (Microfab Technologies, Inc., Plano, TX) controlled with a custom LabView program.The nozzle diameter was 60 µm (MJ-ATP-01-60-8MX, Microfab Technologies, Inc., Plano, TX).The nozzles can handle only low-viscosity materials with viscosity below 20 cP and surface tension between 20 and 50 mN m −1 .Viscosity was measured with a rheometer (microVISC-m, RheoSense Inc., San Ramon, CA).All the measured fluids are Newtonian.Viscosity was measured at different shear rates between 100 s −1 and 1000 s −1 , and viscosity was constant with shear rate.Surface tension was measured using the pendant drop method with a drop shape analyzer (FM40, Krüss Scientific, Hamburg, Germany).Density was measured using a precision balance.The inkjet nozzle was prepared by cleaning with acetone, IPA, and de-ionized (DI) water in a sonicator.A strobe light-emitting diode (LED) and camera are arranged co-linear with the nozzle and ejected drops to capture bright, clear, and focused drop images without blurring.The strobe frequency is equal to the drop ejection frequency.For each solvent, the same jetting setup was tested twice on two different dates.
Additionally, published data for six materials (alumina suspension in hydrocarbon media, paraffin, DI water, ethylene glycol, butyl carbitol, and silver nanopowder) were collected from literature and Microfab technotes [21,23,35,36,38].The collected literature data consists of different nozzle sizes and printers from different manufacturers for different materials, but contains only velocity information, not drop radius.Literature data was collected through the web-based image extraction tool WebPlotDigitizer.
Finally, 769 lab data points and 2176 literature data points [21,23,35,36,38] were combined.The full dataset as well as the Python code is available as supplementary material available online at stacks.iop.org/FPE/7/015009/mmedia.The version numbers of Python and the used libraries can be found in the supplementary information.Histograms of the collected lab data are displayed in figure 3(a).These histograms are useful as they indicate not only the range for each feature but also the jettable region using different colors.For instance, the voltage histogram shows that voltage is varied from 0 to 80 V. Below 10 V, jetting was not possible.This data also shows the obtained velocity and radius ranges.The observed radius values lie approximately within 25-55 µm, which corresponds to the nozzle radius 30 µm, but also depends on the other features that are varied at the same time.Figure 3(b) shows the distribution of the collected lab data points between the three different drop-ejection classes.Shannon entropy, known as Shannon Diversity Index in statistics, is used here as a measure of class balance using equations ( 1) and (2) [39].For the collected lab data with a set of 769 data points in three classes ('multiple drops' (346 points), 'single drop' (217 points), and 'no ejection' (206 points)), the entropy is computed with equation (1) where k = 3: where n is the total number of data points, k = 3 is the number of classes, and c i is the size or data count in each class.The Shannon homogeneity index E H is computed with equation ( 2) The value of E H is zero for a very unbalanced dataset, and for balanced data, the value should be close to 1 [40].For our lab dataset with 769 data points, the computed value of E H is 0.97.Therefore, the dataset is considered to be close to balanced.The remaining imbalance is due to the fact that the 'multiple drops' class corresponds to a larger portion of the parameter space because most unoptimized printing conditions lead to non-ideal jetting.

Image processing
All of these features are varied within their range as described in section 2.1, and the resulting drop images are captured with a camera using a VC500 video capture device with EZ-grabber version 3 software (Diamond Multimedia, Canoga Park, CA) with 720 × 480 pixels resolution.One pixel is equivalent to less than 1 µm.While capturing the image, a strobe LED illuminates the drops in flight.The generated drops are around 60 µm in diameter, and in the captured images, the drop diameter ranges from 40 to 80 pixels.The measurement error due to the one-pixel resolution limit is ±5% for the diameter and ±15% for the drop volume [41].For drop radius measurement, the delay is adjusted to obtain uniform round shaped drops without satellites.Figure 2(b) shows the image processing pipelines for the velocity and radius measurements.All the collected images are processed in OpenCV for Python performing RGB to grayscale conversion, noise elimination, binary thresholding, automatic cropping, and scaling to remove the nozzle area and keep only the generated drops.For radius measurement, SimpleBlobDetector is adopted from OpenCV, which works on thresholded binarized images (1: white background, 0: black drop foreground).In each binarized image, connected black drop pixels are grouped and form blobs.The centers of the blobs (drops) are computed, and blobs closer than the minimum threshold distance are merged.Finally, the centers and radii of the merged drops are computed and returned.For drop velocity calculation, two consecutive images of the same drop with 50 µs delay are used.After edge detection, the lowest bottom point of each drop is measured.The distance traveled by each drop is calculated from the difference in the position of the bottom point in two subsequent images and is divided by 50 µs to find the drop velocity [42].

Data processing
Drawing insights from data through ML requires careful data collection, pre-processing, and model selection.Data as the core of any ML algorithm needs to be in the form that the algorithm can understand and can reveal meaningful patterns from.Here, the 1st four steps of the CRoss-Industry Standard Process model for the development of ML applications are followed for all data management [43].First, as a process of data understanding, data engineering was carried out to convert raw literature data and lab data into a common structured form (CSV source) for ML.
Literature data is unstructured as it is a combination of documents, graph images, and table images.The collected lab data is also initially unstructured as it is collected as images.Literature data sources have been parsed, joined, and put into a tabular form.The unstructured lab image dataset is passed through the image processing algorithm as described in section 2.2.2 to measure the target drop velocity and radius values.The two sets of data are merged into the final structured CSV format.Second, a data processing operation is performed on the merged dataset.Data is cleaned to remove data points with outlier drop velocity and radius, to ensure that each row represents a unique printing condition, and to ensure that each column represents a distinct feature.Finally, we have 769 lab data points and 2176 literature data points.Feature engineering is implemented to understand the features and format them as expected by the ML model.Some missing values such as secondary drop radius and velocity, which occur in the case of 'multiple drops' , are imputed as zero for the 'single drop' and 'no ejection' cases.Outliers are removed using Scikit-learn [44].At first, 17 input features (material name, waveform type, printer name, material mass loading, dwell time, echo time, rise time, fall time, dwell voltage, echo voltage, frequency, density, viscosity, surface tension, wave speed, nozzle orifice diameter) and three targets (drop velocity, radius, jetting category) are collected.Then, irrelevant feature columns are dropped.Material name and material mass loading are represented by the material parameters density, viscosity, and surface tension.Waveform type is represented by the echo voltage value being zero for a unipolar waveform, and negative for a bipolar waveform.Printer name is represented by other printer attributes such as voltage, frequency, timing and nozzle diameter.The quality of each numerical input feature column is further improved through data standardizing (normalizing) and clipping outliers.The categorical target output (three jetting categories) is transformed to numerical representation through one-hot encoding.Third, the training and test evaluation subsets are selected through random sampling from the merged shuffled dataset.ML data models have been developed to create meaningful insights and forecast drop velocity, radius, and drop type using ensemble learning.The best performing ensemble algorithms are selected based on the RMSE for test data given by equation (3).Fourth, the algorithm performance is evaluated with untested data where N p = total number of data points, y i = target value, y p = predicted value.

Model architecture
Several different ML architectures were tested to quantitatively predict drop radius and velocity as well as classify the jetting regime.The most promising architectures are briefly discussed in this section.
More detailed descriptions can be found in the supplementary information.

Decision tree (DT)
DT is a white-box supervised learning procedure for discrete and continuous prediction tasks, which will be used here to forecast inkjet printing by learning simple decision rules from the printer and material features.In DT, the dataset is progressively partitioned at each node based on a thresholding criterion until it reaches a decision.Because of their simple if-then-else logic construction, decision rules are interpretable, and prediction cost is logarithmically dependent on the number of training samples.
Overfitting is suppressed by optimizing the minimum data sample size at each node and maximum tree depth.The Scikit-learn classification and regression trees (CART) adaptation [44] was implemented with binary trees using the 11 features and the largest information gain thresholding at each node [45,46].

Random forest (RF)
RF is an ensemble learning technique that builds N uncorrelated base learners (trees, linear models) by bootstrapping training data into different subsets [44].RF improves variance while aggregating uncorrelated trees through averaging and avoids overfitting [47].Being an average of multiple DT is more immune to training noise as opposed to a single DT.During each sampling, r(=√t) arbitrary features are chosen out of all t features to trade-off the sampling variance and reduce the correlation between the learners [48].As shown in figure 4(a), the RF regressor fits N DTs individually on bootstrap sampled subsets of the data and aggregates the trees through majority voting (for classification) or averaging (for regression).It took 0.517 s to train the model with grid search CV cross-validation to find the optimal number of trees N, maximum number of features r, and minimum number of samples in a leaf to set the stopping rule.

Gradient boosting (GB)
GB [49] was adopted here for regression and categorization tasks using Scikit-learn XGBoost [44].As shown in figure 4(b), boosting fits N DTs simultaneously on the training set and builds a recursive model to minimize the loss function RMSE.Crossvalidation training of the GB model with constrained (number of trees, tree depth, learning rate) optimization took 0.512 s.GB utilizes CART as base learners, as described in section 2.3.1.The maximum depth of each base learner tree was chosen through grid search CV.

Merged regressors and classifiers
A problem of applying tree-based predictors occurs in the case of zero output values, which can be erroneously predicted as small nonzero values.These zero values for drop velocity and radius correspond to the case where no drop is ejected.The model needs to Python package [50].A simple DT classifier with depth three and a K-nearest neighbor classifier with two neighbors were also constructed and performances were compared.

Model comparison and polynomial fitting
The final dataset consists of about 3000 data points pre-processed through a Scikit-learn pipeline [44] of normalization, scaling, categorical encoding, and improper and missing value elimination.Eleven important features (pulse duration, echo time, rise time, fall time, frequency, nozzle orifice diameter, voltage, echo voltage, density, viscosity, and surface tension) are selected through applying two different feature selection classifiers, GB and RF.In total, we trained 14 regressive models: 11 linear models (linear regression, ridge, ridgeCV, lasso, lassoCV, elastic net, Bayes ridge, orthogonal matching pursuit, Theil-Sen, RANSAC, Huber regressor) and three non-linear models (DT, RF, GB) [44].The models were trained with 80% of the complete dataset and the most important 11 features.The RSME (equation ( 3)) for 20% test data shows that simple linear relationships between the inputs and output do not give a good prediction and have a large RSME (see figures 5(a) and S1 for the corresponding R-squared values).Rather, non-linear regressive models, particularly DTs, RF, and GB, model the underlying physics with less error.Therefore, after deciding to implement non-linear regressive models, we focused on these best three models and also implemented averaging, weighted averaging, and majority voting with Scikit-learn [44] to minimize the RMSE of both drop radius and velocity prediction.The data were classified into three jetting regions: 'no ejection' , 'single drop' , 'multiple drops' .It was recognized that pulse duration, rise time, fall time, and frequency follow polynomial trends, while the others (nozzle diameter, viscosity, density, surface tension, voltage, echo voltage) exhibit linear trends.The polynomial degree that best fits the data without overfitting is determined by plotting the RSME as a knee curve in figure 5(b).Around 5 • , there is a sharp decline in the error rate showing that 5 • polynomial fitting most accurately describes the feature pattern relationships.Some of the other mentionable relationships from the collected data are displayed in figures 5(c)-(g), where dots represent experimental results, and solid lines are linear and polynomial fitting.Some important trends can be observed in these plots.Drop velocity and volume both show linear relationships with dwell and echo voltage while keeping other parameters unchanged, but maintain a polynomial relationship of 5th degree with the dwell time.Drop volume is plotted in these graphs rather than drop radius to highlight this linear relationship, which would be more difficult to interpret if drop radius was plotted.Figure 5(c) shows different slopes of drop volume with voltage depending on the pulse type and whether echo or dwell voltage is varied: approximately 65 pL V −1 for unipolar pulses (echo voltage set to zero), approximately 75 pL V −1 for bipolar waveform (echo voltage with a negative value), and approximately 55 pL V −1 for echo voltage variation at a fixed dwell voltage.Similar trends can be observed for drop velocity (see figure 5(d)).Figure 5(e) shows the minimum voltage value for creating drop ejection while other parameters are kept at a fixed value of dwell time 15 µs, rise and fall time 3 µs, frequency 1000 Hz, and echo voltage −30 V.As shown by Duineveld et al [51], the minimum velocity (V min ) required for creating a drop can be approximated by equation ( 4): where γ is surface tension, ρ is density and d is nozzle diameter.The calculated minimum velocity of each of the materials is plotted on the y-axis of the graph in figure 5(e), while the minimum required voltage value to create this minimum velocity was extrapolated from the linear fitting voltage vs velocity curve for bipolar pulses shown in figure 5(d).For TGME and silver, this velocity is found to be lower than for the low-viscosity materials.The measured minimum drop velocity (marked as the bar) leaving the nozzle in the lab setup is somewhat lower than the calculated result from equation (4) (marked as a circle) for all the materials highlighting the limitation of a simple analytical model.Futhermore, the required voltage cannot be predicted by the simple equation.Figure 5(f) shows that for increasing voltage (30-35 V), there is a peak shift in optimum dwell time.The maximum ejected drop volume and velocity shift towards the right for each of the materials.For low-viscosity materials, two peaks can be seen at different pulse dwell times.For example, IPA and hexanol show one small peak at 13 µs or 11 µs, and the other one at 27 µs or 25 µs respectively for 30 V. When rise and fall times are varied together by the same amount, the output velocity and radius maintain polynomial shapes (see figure 5(g)).Drop velocity and volume behavior of different materials are different when subject to the same signal parameters due to their differences in viscosity, surface tension, and density.It is not straightforward to predict drop velocity and volume from these material properties only.From the above results, it is evident that the prediction of jetting is a multi-dimensional problem.
The underlying behavior cannot be easily captured by simple linear or polynomial fitting, especially without a very large dataset.In the following sections, more sophisticated predictive methods are applied to the problem.

Drop velocity prediction
To predict the drop velocity from the machine and material parameters, the three most  pruned depth value of 10 makes the trees explainable and understandable as shown in figure 6(a).If it had not been optimized, the tree nodes would have expanded until all the leaves contain fewer than the minimum number of samples causing overfitting.For RF, 12 trees of depth 14 are selected through Grid-SearchCV with an average time utilization of 0.692 s.In this case, RF arbitrarily chooses a subset of the 11 features for final prediction with.The constraints for GB are selected through GridSearchCV: number of trees (10), maximum tree depth ( 14), learning rate (0.5), column sampling when constructing each tree (0.8), subsampling ratio of the training set to prevent overfitting (0.8), minimum child weight in each tree (2).Cross-validation training of the optimized GB model combines inputs from all ten trees for the final velocity decision through a voting process.Figure 6(a) illustrates how the DT model output can be interpreted.The maximum depth is set to three for better visualization.This example shows the echo voltage root node as the initial point for forecasting.The next split adds or subtracts a term to this sum, depending on the next node in the path.For each test data point, the path that matches the conditions is tracked, and an ultimate regression outcome is obtained.The output can be written as equation ( 5): Test Prediction = Bias + Root to decision node path contributions. ( It is evident in figure 6(a) that some features (echo voltage, viscosity) are utilized in multiple splitting stages, and so they are added as contributions several times.The value indicates the predicted velocity in each node.For instance, if the tree is used to predict the velocity for a test set of hexanol with echo voltage −30 V, density 815 kg m −3 , pulse duration 21 µs, voltage 30 V, frequency 1000 Hz, viscosity 4.59 cP, surface tension 25.73 mN m −1 , and nozzle orifice diameter 60 µm; it will follow the marked green path and will result in a velocity prediction of 3.13 m s −1 with a residual of (4.08 − 3.13) = 0.95 m s −1 , which is close to the RMSE of this simple tree model.The total contribution of viscosity is (2.28 − 2.05) + (3.13 − 2.9) = 0.47 m s −1 .The bias is 2.60 m s −1 , the contribution from fall time is 0.34 m s −1 , and the contribution of echo voltage is −0.27 m s −1 .Therefore, the overall prediction is the sum of all feature contributions and the bias equalling 3.13 m s −1 .
GB constructed with ten weighted trees has a much better overall RMSE of 0.398 m s −1 than a single DT with RMSE of 1.445 m s −1 .Weights are set for each tree output prediction, and an average is taken over them to calculate the final predicted velocity.Each of the booster trees has a maximum depth of 14.Each of the ten trees is not a very good predictor on its own compared to the aggregated prediction from the ten trees.Individual predictions of each booster tree can be explained by decomposing the prediction into the bias and contributions of different variables as shown in figure 6(b).
The optimized booster was used to predict the drop velocity of new data from the literature for the drop velocity of a silver nanopowder ink with different pulse amplitudes [38].None of the data from this paper was part of the training dataset.Two pulse amplitude values (36 V and 44 V) were tested with this model.The predicted velocity is very close to the experimental velocity.Each prediction can be expressed as a sum of feature contributions and bias from all the trees.With the help of the elif5 package code of Python, the decision path of the best trees for two test data points from [38] is broken down in figure 6(b).It shows that all the boosters predict a velocity of 2.195 m s −1 while the experimental velocity was 2.0 m s −1 , and the accuracy is 90.25%.The largest contributions for this particular test result are from echo voltage, voltage, viscosity, and surface tension.Bias is the mean velocity value of the training dataset.GB trees make dissimilar contributions for different datasets, although the bias (4.854 m s −1 ) remains the same for all.The right side of the table shows the aggregated boosting estimation for voltage 44 V as 4.468 m s −1 with an accuracy of 91.12%.For these two different test data points, the feature contributions are different as they are arranged based on their overall impact.For both tests, there is a noticeable impact of voltage, echo voltage, pulse duration, and nozzle orifice diameter.Figure S2 depicts the top 20 rules for velocity prediction extracted from the ten boosting trees by the Molnar rule fit algorithm [52].These rules are multiplied with their coefficients and summed to get the final prediction result out of the features.The voltage, nozzle orifice diameter, and pulse duration constitute the most important prediction rules.The importance column shows the percentage of data being affected by the corresponding rule.
The three tree-based regressive models exhibit difficulty in predicting zero values.For some attribute values, there is a distinct region where no drop ejection occurs.Velocity is zero and the model performance deteriorates.This 'no ejection' region can be separated from the jetting region ('single drop' , 'multiple drops') with a simple DT classifier.The intermediate regression values are multiplied with the classifier output, as in the algorithm shown in figure 4(c).The final predicted velocity multiplied with the classification result exhibits lower RMSE.To confirm that the predicted output agrees well with the real experimental results, the predicted output from the GB model was plotted against experimental data.It is clear that predicted velocity and test velocity agree well (see figure 7(a)), although there is some residual prediction error of 0.398 m s −1 .To further validate the GB model, it was tested with experimental velocity data for a graphene oxide ink that the training dataset contained no data points for.The ink has viscosity 8.7 cP, surface tension 57.96 mN m −1 , and density 1232 kg m −3 .The predicted velocity displays a linear trend with voltage in good agreement with the experimental data (see figure 7(b)), and the difference between the measured and predicted velocity is within the mentioned RMSE (0.398 m s −1 ) for the GB model.This is an important test for practical inkjet applications of the model because it means the behavior of new inks can be predicted without costly experiments.
The test results from the tuned simple DT, RF, and GB models in terms of RMSE are 1.55 m s −1 , 0.45 m s −1 , and 0.398 m s −1 , respectively as shown in figure 7(c).The DT prediction result is clearly the worst as the test and train RMSE are very large.RF is much better than DT as it selects features randomly during prediction.This means that it is less likely to under or overestimate the output for new untrained datasets.The additional regularization term and weight updates of GB help to avoid overfitting and result in the lowest test RMSE among these three models.The two best models (RF, GB) can be further enhanced by the implementation of simple ensemble methods: voting, averaging, and weighted averaging.In case of voting, the test predictions from RF and GB are regarded as votes and majority voting was adopted to get the final prediction output.The voting regressor module of Scikit-learn [44] is used as the voting model.For averaging, the training and test results from the two regressors are averaged separately to calculate training and test RMSE of the averaging model.For weighted averaging, three weight optimization techniques are deployed to minimize the final prediction RMSE: neural network (NN), RMSE minimization, and RF.The predicted final output (ŷ average ) is calculated using equation ( 6) with the RF and GB test prediction results (ŷ RF and ŷGB ) and their optimized weights (w RF and w GB ): For NN weight optimization, the number of hidden nodes was set to three, and the output node was set to one.The test prediction results from the two base models are fed through the input layer, and the prediction from the output layer is compared against the test dataset to calculate RMSE.For RMSE minimization, the RMSE (from equation ( 3)) is minimized using sequential linear-quadratic programming (SLQP) as an iterative optimization method for nonlinear problems or the Nelder-Mead method.The resulting weights from different weight optimization

Drop radius prediction
Drop radius is a better quantitative estimation of drop size than volume.It was measured through a graph-based blob edge detection algorithm from the processed drop images.For the regressive estimation of drop radius, the same 11 features were used as for drop velocity (voltage, echo voltage, echo time, rise time, fall time, pulse duration, nozzle diameter, frequency, viscosity, density, and surface tension).A challenge was that our lab data consists of primary and secondary drop volume.Here, primary denotes the main droplet.The secondary drop volume occurs due to drop breaking or multiple unwanted drops such as satellites as shown in figure 1(b).The volume was converted to total radius, which is the radius of an equivalent drop with the same volume as the primary and secondary drops combined.This radius represents the total fluid volume ejected from the nozzle irrespective of whether the jet subsequently breaks up into multiple drops.The information that there are multiple drops is stored as a categorical variable for classification and not determined by the radius or velocity value used for regression.The total drop radius (r t ) is calculated from the primary (r 1 ) and secondary (r 2 ) drop radius using the following formula: Secondary drop information is missing in the literature data as it only contained 'single drop' and 'no ejection' data.Finally, the lab and literature datasets were merged.Like velocity estimation, ten general models were fitted and tested on the dataset to identify which model is best for the prediction of total and primary drop radius.Again, the radius prediction is improved for the 'no-ejection' case by multiplying the results from CARTs, as shown in figure 4(c) similar to the velocity prediction model.It was found that the secondary drop radius is erratic and does not maintain an interpretable relationship with the signal and the material parameters.
Total and primary drop radius exhibit good training and test prediction results with three tree-based regressors (DT, GB, RF) as shown in figure 8(a).Weighted averaging was also adopted for the two best models (GB, RF) as described by equation ( 6).Weight optimization by SLQP is the best with the lowest total radius RMSE of 2.91 µm.Unlike velocity, most of the weight is applied to GB (0.85) and the rest to RF (0.15).This weighted averaging is effective for total drop radius estimation.However, majority voting among GB and RF gives better results for primary drop radius prediction.The test and train RMSE of each of the models are shown in figure 8(a).Figure 8(b) shows good agreement between the real and predicted total drop radius.It can be instructive to understand how RF has arrived at its radius prediction.RF is composed of a number of DTs, and it is quite impossible to comprehend the regression output by examining each tree.Each radius prediction can be decomposed into contributions from each feature and bias (mean radius of the training dataset).A sample primary drop radius prediction is shown in figure 8(c).The Python package tree interpreter is used for feature contribution interpretation.For every test data point, the contribution of each feature is not fixed; rather it changes according to the feature values depending on the decision path that is being traversed along each tree.The results show that viscosity, echo voltage, fall time and surface tension are the most important features affecting drop radius.For a sample test data point, the bias and feature contributions are broken down in table S1.There is a clear difference between velocity and radius prediction in terms of feature contribution.While for velocity the important contributions are from voltage, echo voltage, pulse duration and nozzle orifice diameter for this test sample, primary radius prediction depends on viscosity, echo voltage, fall time and surface tension values.Again, the radius prediction model was evaluated with untested graphene oxide ink data.The training dataset contained no data points with this ink.The results are displayed in figure 8(d).The predicted radius of the GO ink  maintains a linear relationship with the test voltage with a small residual error from the experimentally measured radius.

Jettability prediction
To further explore the conditions for stable jetting, data was categorized into three classes: 'no ejection' , 'single drop' and 'multiple drops' .Only 'single drop' is desirable.Because literature data does not have a clear indication of the jetting type, only data collected in our lab was incorporated in building this model.
All of these metrics are calculated with the Scikitlearn metrics package [44].The report is shown in table 3.Among these three classes, the 'single drop' class has the lowest scores because it can be misclassified as either 'no ejection' or 'multiple drops' .The model does not misclassify 'no ejection' as 'multiple drops' or vice versa as expected because these two classes are on opposite ends of the parameter space.Overall, these results demonstrate the good classification performance of the DNN.

Jetting window prediction
All the output results can be combined to forecast a jetting window.In the inkjet printing literature,  several types of windows have been proposed defined by pairs of non-dimensional numbers combining material parameters as well as drop velocity, such as Weber number (We) and Capillary number (Ca) defined by equations ( 12) and ( 13): We

Conclusion
Several ML models have been deployed successfully to investigate the impact of electrical signals and ink material properties on inkjet DoD drop velocity, volume, and jetting regime.Ensembles of DTs (GB and RF) were applied to predict the drop velocity and radius of 14 materials.The observed RMSE was 0.39 m s −1 and 2.12 µm respectively.The mean absolute percentage error is 3.87%.A neural network model was built to classify drop behavior as stable 'single drop' , 'multiple drops' , or 'no ejection' .It can predict the jetting category with 91.94% accuracy.The models were validated with an untested graphene oxide ink that was not part of the training dataset.Theoretical jetting windows are not very accurate for classifying the jetting type, and also require measurement of drop velocity experimentally.The model presented here can predict velocity accurately.Therefore, it is possible to omit the experiment and initial jetting classification can be obtained from the signal and material features only.The predicted jetting window will yield insights for the optimization of the printing conditions and ink material design.In the future, the dataset can be expanded by collecting data for more printable materials to improve the model accuracy.

Figure 1 .
Figure 1.(a) Illustration of the inkjet printing process.A voltage waveform is applied to the piezoelectric nozzle, which ejects ink droplets and is scanned relative to the substrate to create printed patterns.(b) Stroboscopic images of different jetting regimes.Ideally, a stable stream of well-defined drops is created.Outside of this desirable regime, the ejected drops can break up into multiple droplets, or there can be no ejection.

Figure 2 .
Figure 2. (a) Overall workflow of the data-driven inkjet optimization scheme.(b) Workflow of the image processing process to extract drop radius and velocity from experimentally collected stroboscopic drop images for each set of input features.

Figure 3 .
Figure 3. Input features and target distribution of the collected lab data.(a) Data distributions model input features (dwell time, voltage, rise time, fall time, frequency, echo voltage, viscosity, density, surface tension, echo time) and measured outputs (drop velocity, radius) displayed as histograms with y-axis representing the frequency of attribute occurrence.Different color bars represent different observed jettability regimes.(b) Target class balance of three jettability categories: 'multiple drops' , 'single drop' , and 'no ejection' .

Figure 5 .
Figure 5. (a) Comparison of test results between different algorithms for drop velocity and radius prediction.(b) Polynomial order optimization to predict output velocity.(c) Unipolar, bipolar, and echo voltage exhibit linear relations with drop volume.(d) Unipolar, bipolar, and echo voltage exhibit linear relations with drop velocity.(e) The minimum velocity of ejection for different materials, experimental and calculated.(f) Polynomial relation between dwell time and drop volume.(g) Polynomial relation between rise/fall time and drop volume.
promising models (DT, RF, and GB) are deployed.Based on individual performance, ensembles of DT, RF, and GB models are arranged through majority voting or weighted averaging to achieve the best performance.Tree-based ML models have a number of parameters to fine-tune through a search algorithm like Scikit-learn GridSearchCV [44].Grid Search CV makes use of k-fold cross-validation while exploring the best model parameter values to minimize velocity prediction errors.The hyperparameter values are saved and used later on to create the best estimator representative of each model.DTs are the most elementary model with fewer parameters to optimize.DT performance is optimized with maximum depth selection through GridSearchCV taking 0.171 s.A

Figure 6 .
Figure 6.Drop velocity prediction interpretation.(a) Example DT for velocity prediction.(b) Bias and feature contributions for test data from [38] for GB.

Figure 7 .
Figure 7. Velocity prediction models comparison.(a) Predicted velocity vs. real drop velocity for the DT drop classification (0/1) result multiplied with the GB velocity regression result.(b) Validating GB with new graphene oxide ink.The training dataset contained no data points from this ink.(c) Test RMSE comparison of different weight optimization models.

Figure 8 .
Figure 8. Drop radius prediction results.(a) Training and test RMSE for total and primary radius predictors.(b) Predicted total drop radius vs real total drop radius.(c) RF tree interpretation for primary radius prediction.(d) Validating RF drop radius model with new GO ink data.The training dataset contained no data points for this ink.
A three-layer DNN, a DT classifier with depth three and a K-nearest neighbor classifier with two neighbors are constructed and compared.All the classifiers are trained on 80% of the data and validated with the remaining test data (20%).The classification accuracy of these models for the test dataset is reported in table 2. DNN outperforms the other models having higher test accuracy.The confusion matrix for the best classifier (DNN) is shown in figure9as a demonstration of classification performance.The first type, 'multiple drops' , is tested with 66 actual data points, and the model accurately predicts 63.However, three (2.42% of total) of them are incorrectly labeled as 'single drop' .The next group, 'no ejection' , consists of 18 data points, of which two (1.61%) are classified erroneously.Out of 40 'single drop' instances, the model forecasts two (0.81%) as 'no ejection' and four (3.23%) as 'multiple drops' .This results in a total misclassification error of 8.06%.A classification report shows the categorizing performance of the final prediction algorithm in terms of precision (P), recall (R), f 1-score, and accuracy (A) as in equations (8)-(11) for each group concerning four outcomes: true positive (TP), true negative (TN), false positive (FP), and false negative (FN) TP + TN + FP + FN .(

Figure 10 .
Figure 10.(a) Experimental data points plotted on jettability window for nine different materials.(b) Predicted jettability plot created with the calculated We, Ca using predicted drop velocity from the Nelder-Mead weighted averaging model over the same lab dataset.There are no 'no ejection' data points, as the plot is on a log-scale and log(0) is undefined for the 'no ejection' class with zero velocity.

Table 1 .
(6)imized weights for calculating average of different models according to equation(6).Units of velocity RMSE are m s −1 .

Table 3 .
Jetting classification report for DNN.
[27]shown window was taken from literature[27].The data points are from our experimental dataset.Each material follows a straight line as expected.It can be observed that our lab data generally falls into the literature window as expected.However, there are some data points with 'multiple drops' within the pentagon-shaped jetting window, which should only contain single drops, showing the limitations of a simple window.'Noejection'with zero velocity cannot be plotted because of the log scaling of the Capillary and Weber axes, which both depend on velocity.The challenge with using such jettability windows is that data points can only be plotted retroactively after drop velocity has been measured experimentally.With the predictive model proposed here, drop velocity and jettable conditions can be predicted before conducting costly experiments, as shown in figure10(b).'Multiple drops' prediction generally occurs for large velocity values beyond the 'single drop' regime as expected.For toluene (on the lower boundary line of the jettable region), the 'single drop' and the 'multiple drops' prediction regions are mixed for both real experimental data points and points predicted by the model.The predicted jettability agrees well with experimental results.
where V is drop velocity, ρ is density, γ is surface tension, d is nozzle diameter, and µ is viscosity.Stable jettable conditions ('single drop') are enclosed in a window on a graph of two such non-dimensional numbers, as shown in figure10(a) for Ca-We.