The impact of unequal processing time variability on reliable and unreliable merging line performance

Abstract
Research on merging lines is expanding as their use grows significantly in the contexts of remanufacturing, reverse logistics and developing economies. This article is the first to study the behavior of unpaced, reliable, and unreliable merging assembly lines that are deliberately unbalanced with respect to their coefficients of variation (CV). Conducting a series of simulation runs with varying line lengths, buffer storage capacities, and unbalanced CV patterns delivers intriguing results. For both reliable and unreliable lines, the best pattern for generating higher throughput is found to be a balanced configuration (equal CVs along both parallel lines), except for unreliable lines with a station buffer capacity of six. In that case, the highest throughput results from the descending configuration, i.e. concentrating the variable stations close to the beginning of both parallel lines and the steady stations towards the end of the line. On the other hand, ordering from the least to most steady station provides the best average buffer level. By exploring the experimental Pareto Frontier, this study also shows the combined performance of unbalanced CV patterns for throughput and average buffer level. Study results suggest that caution should be exercised when assuming equivalent behavior from reliable and unreliable lines, or single serial lines and merging lines, since the relative throughput performance of some CV patterns changed between the different configurations.


Introduction
Merging assembly lines are mass production, stochastic queueing systems in series. They may appear following supply chain disruptions to meet short-term needs and are also natural models for the queuing networks of many manufacturing and computer systems, e.g., parallel computer networks, supply chains, automotive, electronics, window and door factories (Nahas et al., 2014). They often have work-in-process (WIP) inventories kept between stations in buffer storage locations. Fig. 1 illustrates a conventional merging assembly line of two parallel lines with N stations and N-1 buffers, and a merging station assembling the output of the two parallel lines.
Unpaced assembly lines are frequently used in remanufacturing operations, reverse logistics, and developing economies (Liu et al., 2020).
With the circular economy expanding and global supply chains extending further into less developed economies, these lines represent a growing proportion of today's industrial configurations. Therefore, continued research on merging and unpaced assembly lines is much needed and contributes both academic and practical value.
Most research on assembly line design has focused on balancing problems. It has been generally assumed that fluctuations in the tasks' processing times are negligible (Battaïa and Dolgui, 2013; Tiacci, 2015) and, as a consequence, a balanced line will result in the most efficient assignment of resources. However, assembly lines with manual tasks could be subject to potential production fluctuations (Inman, 1999; Slack, 1982) from a variety of sources, which can interrupt and hold up smooth line operations. In unpaced assembly lines, fluctuation sources can include variations in operator work speed, contingent upon experience, station tasks, and complexity and specificity of the assigned work elements (Doerr et al., 2004), or downtime due to machine breakdown (Li et al., 2009). These types of fluctuations can cause productivity losses and operation time variability.
Due to the effects of processing time variation on the blocking and starvation of the assembly line stations, it has been found that balancing an assembly line in terms of mean processing times might not be the most effective way to configure an assembly line for some scenarios, and that unbalanced patterns could produce a better performance (Hudson et al., 2015). The most widely known example of an unbalanced pattern is the 'bowl phenomenon' (McNamara et al., 2016), which maximizes the throughput of a serial line by assigning the fastest operators towards the center of the line.
In addition, assembly line designers must also consider where to position workers with different operation time variabilities, measured by the coefficient of variation (CV), as some studies have shown that task variability assignment has a significant effect on the performance of an assembly line (Lau, 1992; Öner-Közen et al., 2017) and it might be difficult to actually balance the CVs of all stations. For example, Shaaban et al. (2013) suggested that assigning the steadiest operators (lowest CV) towards the center of a single serial line with unreliable machines produced the highest throughput, whereas assigning the steadiest operators towards the end of the line produced the lowest average buffer content along the line. Thus, a line with unbalanced CVs, i.e. a line with uneven CV assignment along the line, might be more capable of handling fluctuations from unexpected events (e.g. reduced availability, failure) or inherent variability (e.g. human differences) than a balanced line.
Although previous research has shown performance gains from unbalancing single serial lines in terms of CV, research evaluating the performance of merging lines with unbalanced CV has been scarce. Results regarding the performance of unbalanced serial lines cannot be directly extrapolated to merging lines since the performance of some unbalanced patterns has been shown to differ between merging and single serial lines (Romero-Silva and . This paper intends to contribute to the field by addressing this gap and investigating whether (and which) unbalanced CV patterns have better performance than a balanced assignment of variability along the line.
Furthermore, since the effect of machine unreliability on the performance of unbalanced patterns in merging lines has also been shown to be significant (Shaaban and Romero-Silva, 2020), this paper studies the performance of merging lines with reliable and unreliable machines, and station operation time variability imbalance, to develop intuition about the behavior of merging lines. This study provides valuable, complementary insights to the traditional assembly line studies, which focused on balancing the stations' cycle times (see, e.g., Lai et al., 2016; Özcan, 2019) and buffer assignment optimization (see, e.g., Demir et al., 2014; Weiss et al., 2019), and examines whether a simple balanced assignment of CV along the line is better than an unbalanced assignment. We attain this objective by simulating a merging assembly line with two parallel lines and assessing its performance in terms of throughput and average buffer level. Furthermore, we test the performance of different unbalanced CV patterns under various line lengths, buffer capacities and machine-reliability profiles.
In the following sections, we first review the relevant literature, followed by a presentation of the research questions, motivation and study objectives. Subsequent sections discuss the methodology and experimental design, and provide the simulation results and analyses. The paper ends with a summary, discussion, conclusions and possibilities for the future development of the topic.

Single serial lines
Published research on single serial lines with unbalanced CV can generally be divided into reliable and unreliable lines. A review of pertinent studies follows.

Reliable lines with unbalanced CV
An early study by Anderson (1968) simulated a 4-station line, where the intermix of two deterministic stations with two variable stations was found to result in higher idle time (IT) levels than those of a balanced line. Other early studies on unbalanced lines suggest that incremental station placement, with the highest CV towards the end of the line, provides production improvements through lower IT, slight increases in throughput (TR), or sometimes both (see, e.g., Payne et al., 1972; Kala and Hitchings, 1973). Carnall and Wild's (1976) work found that bowl-shaped lines were associated with lower IT than inverted bowl-shaped lines and that unbalancing the CVs provided greater gains than unbalancing the mean processing times (MT). El-Rayah (1979) later confirmed Carnall and Wild's results regarding the superior performance of the bowl-shaped line arrangement.
De la Wyche and Wild (1977) studied lines with 3, 4 and 12 stations, and found that the shorter 3- and 4-station unbalanced lines experienced slight reductions in IT with the bowl pattern, but that the longer 12-station line experienced the opposite. This dichotomy suggested that the CV bowl pattern improvements only existed for shorter lines. Lau (1992) later studied unbalanced lines with 3 to 19 stations, and his observations corresponded with those of De la Wyche and Wild, finding that the bowl shape performed better in short lines and provided inferior results in longer lines (N ≥ 9). Shaaban and Hudson (2009) combined multiple factors to study the behavior of unpaced lines in terms of their CV, with varied buffer storage sizes, number of stations and unbalanced CV configurations. Their results suggested that the best pattern (a bowl-shaped arrangement) simultaneously provides lower idle times and lower average buffer levels (ABL) than those of an equivalent balanced line.
Incorporating recent findings from behavioral operations, Öner-Közen et al. (2017) used simulation to compare paced and unpaced lines, where workers can speed up their service times when needed in order to feed downstream workers or to unblock upstream workers. Their study found that unpaced lines are superior to paced lines in a realistic mixed-model production environment with a long line length. In unpaced conditions, an inexperienced worker should be placed in the middle of the line, while in paced conditions, he should be assigned to the first station. Workers capable of speeding up should be placed in the middle of the line in both line types.

Unreliable lines with unbalanced CV
Very little has been published on the performance of lines with unbalanced CV subject to failure, i.e. unreliable. Caridi et al. (2006) have shown that a higher overall CV can be good for unreliable paced lines and suggested looking at TR and inventory effects. Shaaban et al. (2013) investigated the performance of unpaced, unreliable, single lines with unbalanced CV through simulations of lines with five and eight stations, buffer capacities of one, two, four and six units, and 12 different configurations with unbalanced CV. Their results showed that the best unbalanced CV patterns in terms of TR or IT were those where the steadiest stations are concentrated near the center of the line (a bowl-shaped configuration). Conversely, the best ABL results were found from either concentrating the steadier operators towards the center (bowl arrangement) or close to the end of the line (descending order). Rehman and Zheng (2015) simulated various line lengths and mean buffer capacities for unpaced, unreliable and reliable production lines to evaluate the design of multi-product production lines operating under different CVs for different products. They found that the unbalanced lines either outperformed or performed similarly to balanced lines.

Merging lines
The literature on uneven CV allocation in merging lines is sparse since most of the studies on merging lines have focused on mean times and buffer allocation patterns. To our knowledge, only three studies have thus far been published on reliable or unreliable merging lines with unequal CV. Below is a review of those works.

Reliable merging lines with unbalanced CV
The first study on reliable merging lines with unequal CV was by Futamura (2000), who examined the optimal allocation of servers in tandem queueing networks with unbalanced CV. He found an interaction between the CV of service time distribution and the number of servers at stations. Leung and Lai (2005) later assessed the installation of parallel workstations to improve cycle times and concluded that off-line parallel systems reduced buffer requirements and sensitivity to unbalanced CV and MT. Bhatnagar and Chandra (1994) studied the effect of variability from unreliable stations and imperfect yields on 3-station merging line assembly systems. They found greater TR improvements resulted from increasing production at individual stations than from increasing buffer capacity (BC).

Unreliable merging lines with unbalanced CV
On the other hand, studies on unbalanced merging lines in terms of MT and BC (Romero-Silva and Shaaban, 2019; Shaaban and Romero-Silva, 2020) have found that the relative TR performance of unbalanced patterns changes between reliable and unreliable lines, i.e. the best pattern is different for reliable and unreliable lines. This suggests that it is critical to evaluate the performance of both reliable and unreliable lines when investigating the overall performance of merging lines in the presence of imbalances.
Summarizing the research on reliable and unreliable single serial lines shows that the bowl CV shape results in a better performance for shorter lines and that a descending pattern results in lower average buffer contents. On the other hand, to the best of our knowledge, the performance of unbalanced CV patterns on merging lines has not yet been studied. This study addresses this gap and contributes to the field by applying simulation and statistical analyses to determine if unbalancing the lines can result in better performance when compared with the use of balanced CVs on reliable and unreliable merging lines.

Research objectives and questions
We examine reliable and unreliable merging lines with only one source of imbalance by permitting CVs to differ amongst stations. The other variables, BC and MT, are set so all buffers have equal capacity and all MTs are held equal. Furthermore, we assess the performance of different unbalanced CV patterns assigned to a merging assembly system with two parallel assembly lines, while considering various line lengths, buffer capacities and machine-reliability profiles.
Since research on the behavior of unreliable merging lines with unbalanced CV patterns is scarce, this study contributes to the literature by evaluating the performance of these lines, as indicated by TR and ABL, under a number of unbalanced CV patterns.
The main research questions of this study are: 1. Do unbalanced CV patterns, line length and buffer capacity significantly contribute to line performance? 2. Does an unbalanced CV pattern impact the performance of simulated reliable and unreliable merging lines, when compared to that of a balanced line counterpart? 3. Which patterns are the best in terms of line TR and ABL for both reliable and unreliable merging lines? 4. Which pattern leads to the best combined TR and ABL line performance for both reliable and unreliable merging lines? 5. Does unreliability influence the relative performance of unbalanced CV patterns?

Methodology
Merging lines are challenging to evaluate and they cannot be precisely disaggregated. Exact solutions for merging line networks can only be obtained by using numerical methods to analyze the underlying Markov chain, and exact solutions are not computationally feasible for lines longer than three stations and non-exponential distributions. To circumvent these issues, simulation is often applied to study the problem under more general conditions. Computer simulation was deemed the most suitable tool for this study, and Simio 9.147 simulation software (Kelton et al., 2014) was employed to study the behavior of unbalanced merging lines.

Model description
In this study, we consider a merging assembly system with two parallel lines (Parallel Line 1 and Parallel Line 2), exemplified by Fig. 1. Each parallel line is a serial line with N stations, which are connected by N-1 buffers. The first stations of Parallel Lines 1 and 2 ($S_{1,1}$ and $S_{1,2}$, respectively) are never starved, i.e. as soon as they finish a job they can immediately start the next job. Station i on either parallel line ($S_{i,1}$ or $S_{i,2}$, respectively) feeds station i + 1 of the same parallel line through buffer i ($B_{i,1}$ or $B_{i,2}$, respectively). Since the buffers of this line have limited capacity, if $B_{i,1}$ ($B_{i,2}$) is full, then $S_{i,1}$ ($S_{i,2}$) is blocked, as it has no space to release its completed work. Moreover, if $B_{i,1}$ ($B_{i,2}$) is empty, then $S_{i+1,1}$ ($S_{i+1,2}$) is starved, as it has no material to start its operation.
Stations $S_{N,1}$ and $S_{N,2}$, the final stations in each parallel line, feed buffers $F_1$ and $F_2$, respectively, which are the buffers feeding the merging station. The merging station is starved if either $F_1$ or $F_2$ is empty, since it needs one component from each parallel line to start the final assembly operation. The merging station is never blocked.
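The blocking and starvation rules described above can be illustrated with a short departure-time recursion. The following Python sketch is not the Simio model used in this study; it is a minimal illustration for reliable stations only, and it assumes that the merge buffers F1 and F2 have the same capacity as the other buffers and that each buffer releases units in first-in, first-out order. The function name `simulate_merging_line` and its parameters are hypothetical.

```python
def simulate_merging_line(n_stations, buffer_cap, n_jobs, sample_time):
    """Return the completion times of n_jobs at the merging station.

    sample_time is any zero-argument callable returning a processing time.
    Assumptions (not taken from the study): reliable stations, F1 and F2
    share the capacity buffer_cap, FIFO buffers.
    """
    # depart[k][i][j]: time at which job j leaves station i of parallel line k
    depart = [[[0.0] * n_jobs for _ in range(n_stations)] for _ in range(2)]
    merge_start = [0.0] * n_jobs
    merge_done = [0.0] * n_jobs

    for j in range(n_jobs):
        for k in range(2):
            for i in range(n_stations):
                # Starvation: wait for the job to arrive from upstream
                # (the first station is never starved).
                arrival = depart[k][i - 1][j] if i > 0 else 0.0
                # The station must also have released its previous job.
                free = depart[k][i][j - 1] if j > 0 else 0.0
                finish = max(arrival, free) + sample_time()
                # Blocking: job j can leave only once the downstream buffer
                # has a free slot.
                if i < n_stations - 1:
                    jb = j - buffer_cap - 1
                    release = depart[k][i + 1][jb] if jb >= 0 else 0.0
                else:
                    jb = j - buffer_cap
                    release = merge_start[jb] if jb >= 0 else 0.0
                depart[k][i][j] = max(finish, release)
        # The merging station needs one unit from each line and is never blocked.
        prev_done = merge_done[j - 1] if j > 0 else 0.0
        merge_start[j] = max(depart[0][-1][j], depart[1][-1][j], prev_done)
        merge_done[j] = merge_start[j] + sample_time()
    return merge_done
```

A throughput estimate follows from the returned completion times, e.g. the number of completed jobs divided by the elapsed time after discarding an initial warm-up portion.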
The two parallel lines are identical in terms of reliability, line length and buffer capacity. The only characteristic that can differ between the two parallel lines is the unbalanced CV pattern. Slack's (1982) research on histograms of work times experienced in practice concluded that the work time distribution is positively skewed and is closely described by a Weibull distribution, with a CV value averaging around 0.274. Consequently, the Weibull probability distribution, with a mean of 10 time units and an average CV value of 0.274, was used to model the processing times of all stations in both parallel lines. Furthermore, the transfer time of work units between stations and buffers was considered negligible, and there is only one type of product going through the line, with no changeovers taking place.
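As an illustration of how a Weibull distribution can be matched to a given mean and CV, the sketch below solves for the shape parameter numerically and then derives the scale parameter. It relies on the standard Weibull moment formulas (mean = scale·Γ(1 + 1/shape)); the function names and the bisection bounds are assumptions made for this example, not part of the study's simulation model.

```python
import math


def weibull_cv(shape):
    """Coefficient of variation of a Weibull distribution with the given shape."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return math.sqrt(g2 / (g1 * g1) - 1.0)


def weibull_params(mean, cv, lo=0.1, hi=200.0, tol=1e-12):
    """Return (shape, scale) of a Weibull distribution with the given mean and CV.

    The CV depends only on the shape and decreases as the shape grows,
    so a simple bisection on the shape is sufficient.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_cv(mid) > cv:
            lo = mid  # CV still too high: the shape must be larger
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return shape, scale


shape, scale = weibull_params(mean=10.0, cv=0.274)
```

With these parameters, processing times could be drawn in Python with `random.weibullvariate(scale, shape)`, whose first argument is the scale and whose second is the shape.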

Research design
To permit the consideration of all desired levels of a specific factor combined with all levels of every other factor, a full factorial experimental design was chosen for the current study. In a full factorial design, it is possible to assess the effects of independent variables on dependent variables and investigate main and combined factor effects. In addition, more sensitive statistical tests are possible. Other less appropriate methods (e.g. Randomized Block Design or Latin Squares) set certain factors at a fixed level to control one or two sources of error, hence were avoided for this study.

Experimental factors
For parallel lines one and two, the independent variables and their levels were:
• Line length (N): 5, 8 and 11 stations.
• Buffer capacity (BC): 1, 2 and 6 units.
• Reliability (RL): reliable and unreliable.
• CV pattern: 10 patterns per parallel line (see Table 1).
We selected line lengths of 5, 8 and 11 for both parallel lines to consider odd and even numbers and to take into account the behavior of longer lines (N > 9), since different patterns can behave differently for longer lines (Lau, 1992). Buffer capacities of 1, 2 and 6 were modeled for both parallel lines to take into account restrictive and less restrictive buffered configurations.
Experiments with reliable lines included stations with no breakdowns, which allowed them to process work whenever they were not starved or blocked. On the other hand, experiments considering unreliable lines modeled stations that were subject to probabilistic failure. Based on empirical work from Inman (1999), this study modeled both the time between failures and the time to repair with the exponential probability distribution, characterized by the mean time between failures (MTBF) and the mean time to repair (MTTR), respectively. Furthermore, we used a failure rate of 0.001 breakdowns per time unit, and a repair rate of 0.010 repairs per time unit, i.e. MTBF was 1000 time units and MTTR was 100 time units. Consequently, station efficiency was approximately 91% [MTBF/(MTBF + MTTR) = 1000/(1000 + 100)], identical to that used by Altiok and Stidham (1983) and Hopp and Simon (1993). It is worth noting that times between failures are modeled based on machine operation processing time and not on production running time, and that all stations had the same MTBF and MTTR parameters when considering unreliable experiments.
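The efficiency figure can be checked both analytically and by sampling. The sketch below is only an illustration: it treats a single isolated station that alternates between exponentially distributed operating periods and repair periods, and the function names are hypothetical.

```python
import random


def availability(mtbf, mttr):
    """Long-run fraction of time a station is operational."""
    return mtbf / (mtbf + mttr)


def simulated_availability(mtbf, mttr, cycles, seed):
    """Estimate availability from sampled failure/repair cycles.

    Assumes exponential operating times (mean mtbf) and repair times
    (mean mttr), with operating time counted only while the station works.
    """
    rng = random.Random(seed)
    up = sum(rng.expovariate(1.0 / mtbf) for _ in range(cycles))
    down = sum(rng.expovariate(1.0 / mttr) for _ in range(cycles))
    return up / (up + down)


analytic = availability(1000.0, 100.0)  # 1000 / 1100
estimate = simulated_availability(1000.0, 100.0, cycles=100_000, seed=1)
```

For a long enough sample, the estimate should approach the analytic value of 1000/1100, which is roughly 0.909.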
To avoid the possibility of different patterns having different total CVs, which would directly impact the results (see, e.g., Khalil et al., 2008), the experiments were designed to have the same total CV for all the patterns. Three different variability profiles were used for the stations: relatively steady CV (S), relatively moderate CV (M), and relatively variable CV (V). For example, to be able to compare the bowl pattern in experiments with N = 5 (VSSMV) with the descending pattern (VVMSS), both patterns were comprised of two V stations, two S stations, and one M station, even though each pattern has a different order. A station that was assigned an S profile was effectively modeled with a processing time with CV = 0.04, an M station had a CV = 0.27, and a V station had a CV = 0.50 so that the average CV for the whole parallel line equaled 0.27. The merging station had an M profile for all the experiments. Table 1 summarizes the unbalanced CV patterns of each parallel line, where the variability profiles (S, M and V) assigned to each station are shown in the order of the stations on the line, i.e. starting on the left with the first station of the parallel line and ending on the right with the Nth station.
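The equal-total-CV constraint is easy to verify for the two N = 5 patterns named above. The following sketch uses the profile values from the text; the helper name `mean_cv` is hypothetical.

```python
# CV value assigned to each variability profile (values from the text).
PROFILE_CV = {"S": 0.04, "M": 0.27, "V": 0.50}


def mean_cv(pattern):
    """Average CV of a parallel line given its profile string, e.g. 'VSSMV'."""
    return sum(PROFILE_CV[p] for p in pattern) / len(pattern)


bowl = "VSSMV"        # bowl pattern for N = 5
descending = "VVMSS"  # descending pattern for N = 5

# Both patterns use the same multiset of profiles, so their mean CVs match.
same_mix = sorted(bowl) == sorted(descending)
bowl_cv = mean_cv(bowl)
descending_cv = mean_cv(descending)
```

Both patterns contain two V, two S and one M station, so each averages (2 × 0.50 + 2 × 0.04 + 0.27)/5 = 0.27.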
The unbalanced merging line features are summarized in Table 2 and in Table A1 in the Appendix.
Finally, even though the full factorial design of experiments accounts for 10 unbalanced CV patterns for Parallel Line 1 and 10 patterns for Parallel Line 2, which would result in 100 experiments per merging line configuration, we only considered a subset of 55 experiments to account for equivalent (mirror) experiments. For instance, an experiment with an unbalanced CV pattern in parallel line 1 (CVP1) of P1 and an unbalanced CV pattern in parallel line 2 (CVP2) of P2, i.e. the pair 'P1 P2', is equivalent to an experiment with a CVP1 of P2 and a CVP2 of P1, i.e. the pair 'P2 P1', because they are mirror configurations. Thus, mirror configurations were only considered once, e.g., the pair 'P2 P1' was considered in the experimental design but 'P1 P2' was not.
Considering all levels of all factors, our experimental design included (3 line lengths) x (3 buffer capacities) x (2 reliability profiles) x (55 CVP1 CVP2 patterns), for a total of 990 experiments. It is worth noting that, to ensure comparability among the experiments with different BC values, the unbalanced CV patterns for different values of BC are the same, when considering the same line length.
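The experiment count can be reproduced by enumerating unordered pattern pairs. In the sketch below, the ten pattern labels are placeholders (the text names 'BD', 'P1', 'P2', 'P3' and 'P8'; the remaining labels are assumed for illustration only).

```python
from itertools import combinations_with_replacement

# Placeholder labels for the 10 CV patterns; only some appear in the text.
patterns = ["BD"] + [f"P{i}" for i in range(1, 10)]

# Unordered pairs with repetition: mirror configurations are counted once.
pairs = list(combinations_with_replacement(patterns, 2))

line_lengths = [5, 8, 11]
buffer_capacities = [1, 2, 6]
reliability_profiles = ["reliable", "unreliable"]

n_experiments = (
    len(line_lengths) * len(buffer_capacities) * len(reliability_profiles) * len(pairs)
)
```

Ten patterns give 10 × 11 / 2 = 55 unordered pairs, and 3 × 3 × 2 × 55 = 990 experiments, matching the totals stated above.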

Performance measures and statistical analysis methods
The throughput/output rate (TR) and the average buffer level (ABL) were used as dependent performance measures (responses) for the entire line. TR is exceptionally valued in industries that deal with high volume to optimize their line designs and output. In contrast, ABL is valued in industries with high overheads, with costly space or raw material values. In these settings, a unit of buffer space can represent a substantial capital investment (Tempelmeier, 2003), so ABL is critical in lean buffering industries that focus on maintaining low levels of in-process stocks. Furthermore, to assess the combined TR and ABL performance, we identified the experimental Pareto Frontier, as it has been shown to be a useful and practical tool to analyze problems with more than one main objective (Efremov et al., 2009).
The TR and ABL data were analyzed using the following statistical methods:
• Analysis of Variance (ANOVA): to identify the relative contributions of the independent variables to the dependent performance variable.
• Multiple comparisons with control using Dunnett's t-test: to compare the performance of unbalanced merging lines to the balanced line control.
The 'R' software (The R Foundation, 2019), version 3.4.0 was used to statistically analyze the TR and ABL data. The packages agricolae, multcomp and rPref for 'R' were used for the Dunnett's t-test, the Tukey's HSD test, and finding the experimental Pareto Frontier, respectively.

Parameters of simulation runs
In generating representative simulation data, an appropriate non-steady-state warm-up period, without measurements, should be selected to ensure that observations are as close as possible to normal operating conditions. Law (2014) proposed carrying out an initial simulation of the system under investigation, with the selection of one output variable for observation. A trial procedure of a 5-station reliable merging line established that an initial run of 20,000 min generated acceptable steady-state behavior, so all data collected during the first 20,000 min were discarded. Three hundred independent replications of 120,000 min per experiment were carried out, excluding the first 20,000 min of warm-up.
The same random number seed was intentionally used per random variate (processing time, MTBF and MTTR of each station) in all experiments to generate an equivalent sequence of events, sharpen any configuration contrasts, and reduce overall simulation variance. To have a more representative sample of the responses of this study, 300 independent replications were run per experiment, where the random number seeds did change from replication to replication. The final reported values for TR and ABL for each experimental point are the average TR and ABL values of the 300 replications.
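One way to realize this seeding scheme (common random numbers across experiments, fresh seeds per replication) is to derive each random stream from a key that identifies the replication and the variate but not the experiment. The sketch below is an assumption about how such a scheme could be coded, not a description of the Simio configuration used in the study; the function name `variate_stream` and the key format are hypothetical.

```python
import random


def variate_stream(replication, line, station, variate):
    """Return a dedicated random stream for one variate of one station.

    The key omits the experiment (CV pattern, BC, N), so every experiment
    sees the same underlying random numbers in a given replication, while
    different replications use different seeds.
    """
    key = f"rep{replication}-line{line}-station{station}-{variate}"
    return random.Random(key)


# Same replication and variate: identical streams in any experiment.
a = variate_stream(0, 1, 3, "processing").random()
b = variate_stream(0, 1, 3, "processing").random()

# A different replication yields a different stream.
c = variate_stream(1, 1, 3, "processing").random()
```

The reported response for an experiment would then be the average over its replications.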

Simulation results and data analysis
An ANOVA was carried out to answer Research Question 1: Do unbalanced CV patterns, line length and buffer capacity significantly contribute to line performance? All of the main factors in the experimental design were considered to conduct the ANOVA; however, the combination of CVP1 and CVP2 was considered as a single factor (CVP1 CVP2) to have a balanced design of experiments.
The ANOVA results for both TR and ABL are shown in Table 3. For TR, Table 3 shows that all main effects and interactions were statistically significant, which answers Research Question 1. Following the recommendations from previous studies (Albers and Lakens, 2018; Lakens, 2013; Yigit and Mendes, 2018), we used the omega-squared estimator (Ω²) to assess the relative influence (effect size) of each factor on the responses. The resulting values of Ω² reported in Table 3 were calculated using the effectsize package for R (Ben-Shachar et al., 2020). Ω² values suggest that the factor with the strongest effect in terms of TR was the (un)reliability (RL) of the machines, followed by the buffer capacity (BC) and the line length (N). While the effect of the unbalanced CV patterns (CVP1 CVP2) was indeed statistically significant for TR, it is not as strong as the N:RL and BC:RL interactions. On the other hand, Ω² values suggest that the overall effect of the unbalanced CV patterns (CVP1 CVP2) on ABL is the second strongest of all factors, with BC producing the strongest effect on ABL. Moreover, the CVP1 CVP2:RL interaction was the third most influential factor for ABL. This suggests that, in terms of ABL, the performance of different unbalanced CV patterns might depend on RL. Furthermore, ANOVA results suggest that CV patterns have a higher impact on ABL than on TR, as the Ω² values for the CVP1 CVP2 factor as well as its interactions with other factors were greater in the ANOVA for ABL than in the ANOVA for TR.
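For reference, one common textbook form of the omega-squared estimator for an effect in a between-subjects ANOVA is given below. The text does not state whether this form or a partial variant was used for Table 3, so the expression should be read as background under that assumption rather than as the exact computation performed.

```latex
\hat{\omega}^2_{\text{effect}}
  = \frac{SS_{\text{effect}} - df_{\text{effect}} \, MS_{\text{error}}}
         {SS_{\text{total}} + MS_{\text{error}}}
```

Here $SS$ denotes a sum of squares, $df$ the degrees of freedom of the effect, and $MS_{\text{error}}$ the residual mean square. Subtracting $df_{\text{effect}} \, MS_{\text{error}}$ in the numerator is what makes the estimator less biased than eta-squared.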
The Dunnett's t-test and Tukey's HSD test were carried out to help to answer Research Questions 2 and 3. The Dunnett's t-test was used to compare all unbalanced patterns against the balanced pattern (BD BD), which was considered as the control experiment because it is the simplest and most straightforward pattern. If an unbalanced pattern had both better performance than the balanced pattern and a statistically significant difference with the balanced pattern, then Research Question 2 can be answered positively: An unbalanced CV pattern has an impact on the performance of simulated reliable and unreliable merging lines, as compared to that of a balanced line counterpart.
Tukey's HSD test was used to assess whether statistically significant differences existed among the experiments within each merging line configuration, i.e. sets of experiments with the same values for N, BC and RL. If the best performing pattern was found to have a statistically significant difference with any other pattern, it could be suggested that said pattern was the best, helping to answer Research Question 3: Which patterns are the best in terms of line TR and ABL for both reliable and unreliable merging lines? Tables 4 and 5 show the best performing patterns in terms of TR and ABL, respectively, per merging line configuration. Tables 4 and 5 contain information about the pattern with the best performance, the average value for the response (TR or ABL), and the p-value of Dunnett's test. Tukey's HSD test results are shown in the Appendix (Tables A2-A5) as groups of experiments (Tukey groups). In this regard, experiments with the same Tukey group letter/number, e.g. 'a', have no statistically significant differences and are considered to have equal values in statistical terms, whereas experiments with different Tukey group letters/numbers are considered to have statistically significant differences. A Dunnett's p-value less than
In Table 4 it can be seen that the 'BD BD' pattern reached the highest TR for all reliable merging line configurations and most of the unreliable configurations with BC = 1 and 2, except for the configuration with N = 11, BC = 2 and RL = UR, where the 'BD P8' pattern (balanced, descending) reached the highest TR, although no statistically significant difference was found between the 'BD P8' pattern and the balanced pattern (Dunnett's p-value = 1). The pattern 'P8 P8' (descending, descending) was the pattern attaining the highest TR for all UR configurations with BC = 6.
However, for experiments in which the highest TR was attained with unbalanced CV patterns, no statistically significant differences were found between the unbalanced CV patterns and the balanced pattern. So, in statistical terms, there is no difference between selecting a 'BD BD' pattern and a 'P8 P8' pattern in unreliable configurations (except for the UR scenario with N = 5 and BC = 1, where 'P8 P8' and 'BD BD' have statistically significant differences). The full TR results regarding Dunnett's and Tukey's tests for reliable and unreliable configurations can be found in Tables A2 and A3, respectively, in the Appendix. Therefore, the balanced pattern (BD BD) performs very well in terms of TR in all scenarios, whereas the 'P8 P8' pattern is either the best pattern or statistically equal to the best pattern in the majority of unreliable scenarios. Similarly, since the 'BD P8' pattern has no statistically significant difference with the best pattern of each unreliable scenario (and is the best in the UR scenario with N = 11 and BC = 2), it can be suggested that this pattern also performs very well in terms of TR in all unreliable scenarios.
The results for ABL are much more straightforward than for TR. In all configurations the 'P8 P8' pattern produced the lowest ABL, having statistically significant differences with all the other patterns in both Dunnett's and Tukey's tests. The full ABL results regarding Dunnett's and Tukey's tests for reliable and unreliable configurations can be found in Tables A4 and A5, respectively, in the Appendix.
Unbalanced CV patterns are shown to have an impact on the ABL performance of reliable and unreliable merging lines when compared to a balanced line. However, unbalanced CV patterns do not seem to have a statistically significant impact on TR performance when compared to a balanced line. This answers Research Question 2. Furthermore, the 'BD BD' pattern seems to be better than (or equivalent to) all the other patterns in terms of TR, whereas the 'P8 P8' pattern is the best in terms of ABL, which answers Research Question 3.
To answer Research Question 4 (Which pattern leads to the best combined TR and ABL line performance for both reliable and unreliable merging lines?), we identified the experimental Pareto Frontier for all the merging line configurations tested in this study by plotting the TR results against the ABL results. Figs. 2-4 present these scatterplots for N = 5, 8 and 11, respectively, and highlight the Pareto experimental points (red points in the scatterplot), which are the points where the ABL performance cannot improve without making the TR performance worse, and vice-versa. Figs. 2-4 also highlight the Nadir experimental points (blue points in the scatterplot), which show the worst possible performance for both TR and ABL.
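The notion of an experimental Pareto point can be made concrete with a small filter that keeps every experiment not dominated by another one, where higher TR and lower ABL are both preferred. The study itself used the rPref package for R; the Python sketch below is an independent illustration, and the four labeled data points are invented solely to exercise the function, not results from this study.

```python
def pareto_front(points):
    """Return the labels of non-dominated points.

    points maps a label to a (tr, abl) pair; TR is maximized, ABL minimized.
    A point is dominated if another point is at least as good on both
    measures and strictly better on at least one.
    """
    front = []
    for name, (tr, abl) in points.items():
        dominated = any(
            tr2 >= tr and abl2 <= abl and (tr2 > tr or abl2 < abl)
            for other, (tr2, abl2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Invented (TR, ABL) values for illustration only.
example = {
    "A": (0.090, 1.2),
    "B": (0.088, 0.8),
    "C": (0.087, 1.0),
    "D": (0.090, 1.5),
}
front = pareto_front(example)
```

In this invented example, "C" is dominated by "B" and "D" by "A", so only "A" and "B" remain on the frontier.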
In addition to the single performance results already analyzed (i.e. 'BD BD' performs very well for TR and 'P8 P8' for ABL), Fig. 2 shows that for experiments with N = 5, the pattern P3 (zigzag, MVMSM) offers a good combined performance for both TR and ABL, as experiments using this pattern ('P3 P3', 'P8 P3', 'BD P3') were located on the experimental Pareto Frontier in reliable configurations with N = 5. The same can be said for experiments with N = 5, BC = 1 and RL = UR. It should be noted that many experiments on the experimental Pareto Frontier with an N = 5, BC = 1 and RL = UR configuration were found to have no statistically significant differences among one another in terms of TR (see Table A3 in the Appendix).
For unreliable configurations with BC = {1, 2}, the 'BD P8' pattern (balanced, descending) shows a good combined performance for TR and ABL, a result that is repeated across all line lengths (see Figs. 2-4).
For unreliable experiments with BC = 6, the best-performing pattern for both TR and ABL in all line lengths is 'P8 P8'. This result shows how, in scenarios where the production flow is less stable (because of machine
Finally, Figs. 2-4 clearly show how the performance of unbalanced CV patterns differs between reliable and unreliable scenarios, since the set of Pareto experimental points is bigger for reliable than for unreliable lines, which answers Research Question 5 of the study (Does unreliability influence the relative performance of unbalanced CV patterns?). In fact, in Figs. 2-4 we can see the significant effect of the triple interaction BC:CVP1 CVP2:RL found in the ANOVA, as the performance of unbalanced CV patterns depends on both the (un)reliability of the stations and the buffer capacities.
The full list of patterns located at the Pareto Frontier can be consulted in Table A6 in the Appendix, whereas the interactive versions of Figs. 2-4 can be found in the supplementary material.

Summary of results
Several conclusions emerge from these results. This study shows that for TR, the best performance for both reliable and unreliable lines is generated by the 'BD BD' configuration (balanced CV pattern in both parallel lines). This is true for all N and BC levels considered, with a few non-significant exceptions, e.g. unreliable merging lines with BC = 6, in which case the 'P8 P8' pattern (descending CV pattern in both parallel lines) was the best. This generally contrasts with the reliable single line results of De la Wyche and Wild (1977) and Lau (1992) for relatively long lines. Our results also differ from the unreliable single line TR results of Shaaban et al. (2013), in that we did not find an overall good performance of bowl patterns in terms of TR, although some bowl patterns showed a good combined performance by appearing on the experimental Pareto Frontier of experiments with N = {8, 11}.
For ABL, the pattern 'P8 P8' consistently and statistically outperformed the equivalent 'BD BD' line for all the factor levels simulated, for both reliable and unreliable lines. This generally agrees with the results of Shaaban et al. (2013) for unreliable, CV-unbalanced single lines, who reported that the best CV pattern was also a descending order configuration.
It was also found that all four main factors (N, BC, CVP1 CVP2, and RL) are very highly significant at the 0.0001 level for both TR and ABL, which agrees with multiple previous studies (Conway et al., 1988; Hillier and So, 1991; Patti and Watson, 2010; Shaaban and Romero-Silva, 2020; Tan, 1998). Moreover, interactions between CVP1 CVP2, BC and RL were found to be relevant for the performance of ABL, as evidenced by the 'P8 P8' pattern being the best-performing pattern in terms of both TR and ABL for experiments with unreliable machines and BC = 6.
Since ABL performance gained a significant advantage from applying an unbalanced CV pattern, Table 6 shows a summary of the results attained when applying the best pattern for reducing ABL (P8 P8) compared with the ABL results attained when using the balanced CV pattern (BD BD). Table 6 also shows the percentage gain between 'P8 P8' and 'BD BD' in terms of ABL. The highest percentage advantage in ABL of the 'P8 P8' pattern was found to be 44.43% for reliable lines (at N = 11, BC = 6), and 9.34% for unreliable lines (at N = 5, BC = 1). Both percentage gains are very highly significant in statistical terms (based on Dunnett's and Tukey's test results).
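As a small worked illustration of the percentage gain reported above, the sketch below assumes that the gain is the reduction in ABL expressed relative to the balanced ('BD BD') value; the exact formula behind Table 6 is not stated in this section, so this is an assumption. The function name `abl_gain_pct` and the input values are hypothetical.

```python
def abl_gain_pct(abl_balanced: float, abl_unbalanced: float) -> float:
    """Percentage reduction in ABL relative to the balanced line (assumed formula)."""
    if abl_balanced <= 0:
        raise ValueError("the balanced ABL must be positive")
    return 100.0 * (abl_balanced - abl_unbalanced) / abl_balanced


# Hypothetical values for illustration only; not taken from Table 6.
print(abl_gain_pct(2.0, 1.5))  # 25.0
```

A positive value means that the unbalanced pattern holds less average inventory in the buffers than the balanced line.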
On the other hand, results from this study showed that the 'P9 P9' pattern (ascending CV pattern for both parallel lines) was the worst-performing pattern for both TR and ABL in all of the tested scenarios. Therefore, an assignment of stations in merging lines that orders the stations from the most stable to the most variable should be avoided entirely.

Discussion, conclusions and future research directions
This study contributes to the body of knowledge on production lines by delivering new insights on performance improvements for reliable and unreliable unbalanced CV merging assembly lines.
Attaining a balanced assembly line in terms of CV is unlikely and unrealistic, so research on unbalanced lines contributes both academic and practical value.
This study found that statistically significant performance improvements in terms of ABL can be achieved in reliable and unreliable merging lines with an unbalanced CV pattern (descending, descending). The potential savings obtained are substantial and very highly significant (about 44% for reliable lines and over 9% for unreliable lines in the best cases).
The potential of multiplying such savings over the lifespan of a production line suggests it could be worthwhile to deliberately unbalance reliable and unreliable merging lines in certain settings, especially since the improvement in ABL entails no or very little further capital or resource expenditures, and requires only appropriately reassigning line operators. So, line designers may be interested in exploring how to suitably unbalance the variability of their merging lines.
Note that although these results show that an imbalance of CV can be advantageous, there remains a possibility of unbalancing the CVs incorrectly, which would lead to an undesirable reduction in both TR and ABL performance, as shown by the poor performance of the 'P9 P9' pattern (ascending pattern in both parallel lines). Thus, line managers must make sound decisions that align with their goals. ABL should be prioritized in reliable and unreliable lines if the objective is to keep low levels of WIP for lean buffering or just-in-time policies, e.g. in the automotive and electronics industries. For low levels of WIP, the best pattern (P8 P8) or similarly advantageous unbalanced patterns (such as combinations of descending with other configurations) would be the most appropriate. In industries with high demand and fully utilized, expensive operators (e.g. the IT or pharmaceutical industries), TR should be prioritized. In these cases, the highest possible TR improvements would be found with the 'BD BD' pattern in most cases, and with the 'P8 P8' pattern for unreliable lines with BC = 6. If a manager prefers a combined overall TR and ABL performance, we have provided the patterns that performed well in terms of both TR and ABL by finding the experimental Pareto Frontier for every merging line configuration tested in this study. In particular, a monotone descending order (P8 P8) is the most favorable configuration when considering unreliable merging lines with BC = 2 and 6.
The results from this study also indicate that production line managers and designers should be cautious in assuming equivalence between the behavior of reliable and unreliable lines, since the performance of CV patterns differed between reliable and unreliable merging lines. Similarly, the behavior of single serial lines and merging lines should not be considered equivalent, because the performance of some patterns differs between them, as demonstrated by the results from studies investigating single serial lines, e.g. a bowl pattern increases TR in shorter single serial lines (Lau, 1992; Shaaban et al., 2013; Wyche and Wild, 1977). Interestingly, although the assumptions and models are very different (saturated vs. unsaturated lines, see  for an explanation of this difference), in this study we found the opposite of the suggestion presented by Suresh and Whitt (1990), who recommended ordering the variability of the line in a monotone increasing order (the P9 pattern in this study, which was the worst-performing pattern for TR and ABL) to increase the performance of tandem queues.
As is true of all research studies and methods, this study has certain limitations. While simulation is a valuable tool that can deal with complex line configurations more accurately than mathematical models, its results remain valid only for the particulars of the system and conditions simulated. Although a model represents reality, it is not real. Nevertheless, simulation allows us to rapidly and cost-effectively generate multiple alternatives to aid decision-making while avoiding the resource drains and case-specific nature of field observation.
Since the results are also based only on a limited number of configurations among an almost infinite number of alternatives for unbalancing the reliable and unreliable merging lines, there remain possibilities that have not been addressed. For instance, future studies could consider longer lines and different unreliability profiles for machine breakdown and repair. We only considered two unreliability profiles in this study, and it has been shown (Shaaban and Romero-Silva, 2020) that different unreliability profiles constitute a significant factor affecting the performance of merging lines. However, since this is the first study to evaluate the performance of merging lines with unbalanced CV patterns, we believe that the experimental design developed here is a meaningful effort to develop valuable intuition about the effects of unbalanced CV patterns on the performance of merging assembly lines.
Further research extensions of this study are possible to continue enhancing knowledge in this area and to aid production line engineers in improving line design decisions. Future research could explore unbalanced, reliable and unreliable merging lines with two or three joint sources of imbalance, allowing further complexity to be built in. Another option would be to investigate the reliable and unreliable unbalanced disassembly lines that make up a large proportion of the reverse supply chain and remanufacturing industries, while others could investigate the effects of variability imbalance under non-steady-state conditions.