Sequential batching with minimum quantity commitment in N-level non-exclusive agglomerative hierarchical clustering structures

Article history: Received February 5, 2020; received in revised format February 28, 2020; accepted March 5, 2020; available online March 5, 2020.

This study considers a sequential batching problem with a minimum quantity commitment (MQC) constraint in N-level non-exclusive agglomerative hierarchical clustering structures (AHCSs). In this problem, batches are created for item types included in clusters according to the sequence of the levels in a given AHCS such that the MQC constraint and the maximum and minimum batch size requirements are satisfied simultaneously. The MQC constraint ensures that more items than a committed minimum quantity are batched at a level before items not batched at that level are sent to the next level. We apply the MQC constraint to control effectively the degree of heterogeneity (DoH) in the batching results. We developed a sequential batching algorithm that minimizes the total processing cost of items, using identified solution properties to find better solutions for large practical problems. Results of computational experiments showed that the developed algorithm found very good solutions quickly and that it could be used in various practical sequential batching problems with the MQC constraint, such as input lot formation in semiconductor wafer fabrication facilities and determination of truckloads in the delivery service industry. We also found that dense clusters, small batch sizes, and a tight MQC constraint are effective in reducing the total processing cost. Additionally, a small batch size with a loose MQC constraint seems to help reduce the DoH in the batching results. Finally, we suggest that the density of clusters, batch size, and MQC tightness be determined simultaneously because of interactions among these factors.

© 2020 by the authors; licensee Growing Science, Canada


Introduction
Clustering is the classification of a set of item types into clusters so that item types in the same cluster have more similar attributes to each other than those in other clusters. Examples of clustering include segmenting various customers, classifying product types into product lines or families, and grouping delivery destinations into several zones. We can group item types into clusters only once; however, sometimes, clusters that were already made are grouped again into larger clusters or divided again into smaller clusters. This kind of clustering is called hierarchical clustering or multi-level clustering. Item types in the same cluster become more homogeneous in the divisive hierarchical clustering structure (DHCS), whereas they become more heterogeneous in the agglomerative hierarchical clustering structure (AHCS) as the level of hierarchy increases. In the AHCS, a cluster includes more and more item types as the level of the hierarchy increases. Also, AHCS can be configured so that an item type included in a cluster at a level belongs to only one cluster at the next level, or it has a configuration that an item type included in a cluster is included in more than one cluster at the next level. We call the former the exclusive AHCS and the latter the non-exclusive AHCS.
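The difference between the exclusive and the non-exclusive AHCS can be illustrated with a small sketch. The membership maps and the function name `is_exclusive` below are hypothetical and are not part of the notation used in this study; the sketch only assumes that each item type (or cluster) at one level is stored together with the set of clusters that include it at the next level.

```python
def is_exclusive(parents):
    """Return True if every child belongs to exactly one cluster at the next level.

    `parents` maps each item type (or cluster) at level n to the set of
    clusters at level n+1 that include it (hypothetical representation).
    """
    return all(len(clusters) == 1 for clusters in parents.values())


# Hypothetical two-level examples: in the second map, item type "b"
# is included in two clusters at the next level.
exclusive_ahcs = {"a": {"C1"}, "b": {"C1"}, "c": {"C2"}}
non_exclusive_ahcs = {"a": {"C1"}, "b": {"C1", "C2"}, "c": {"C2"}}

print(is_exclusive(exclusive_ahcs))      # True
print(is_exclusive(non_exclusive_ahcs))  # False
```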
Many studies have been conducted on clustering in the areas of manufacturing, biotechnology (BT), information technology (IT), and logistics. Refer to Murtagh and Contreras (2012), Lim et al. (2017), and Delgoshaei et al. (2019) for previous studies on hierarchical clustering algorithms. In manufacturing, there are various studies on clustering techniques for cell formation of cellular manufacturing systems. Vakharia and Wemmerlöv (1995) compared the performance of seven techniques for hierarchical clustering with ten performance measures and identified that selecting a suitable clustering technique is more critical than the choice of dissimilarity measure for better cell formation. Recently, Delgoshaei et al. (2019) conducted a literature review on clustering methods successfully used for cell formation to explain drawbacks and shortcomings of cell forming and scheduling the formed cells. Clustering techniques are also applied to scheduling and ordering problems. Chen et al. (2011) transformed a single machine batch scheduling problem into a special type of clustering problem and developed a solution algorithm using clustering techniques. Nagasawa et al. (2012) developed a method using canonical correlation and hierarchical cluster analysis (HCA) for classifying items according to shipping trends to determine a suitable ordering policy. Clustering is also actively being used in the IT field. Arifin and Asano (2006) used HCA to develop a method for image segmentation by histogram thresholding. Costa et al. (2013) developed a hierarchical method to classify structurally homogeneous clusters of XML documents considering multiple forms of structural components. Nunez-Iglesias et al. (2013) suggested an active learning paradigm for image segmentation to divide an image into meaningful parts using agglomerative hierarchical clustering. Bouguettaya et al. (2015) combined agglomerative hierarchical clustering and partitional clustering to reduce computational costs without losing the clustering performance while handling huge data collections. Zaitoun and Aqel (2015) compared clustering-based image segmentation techniques and recommended a hybrid solution that combines multiple methods for image segmentation because many factors influence the results of image segmentation. Huang and Ma (2019) suggested a new hybrid clustering method, which combines hierarchical clustering and K-means for image categorization.
Clustering techniques are widely used in the BT field for detecting clusters in genomic data (Langfelder et al., 2008; Cameron et al., 2012), expressing genes (Jun et al., 2010), classifying protein families (Saunders et al., 2012), and finding the hierarchical clustering structure (HCS) for the DNA sequence (Cheng et al., 2013). Andreopoulos et al. (2009) surveyed important clustering applications in biomedicine to find strengths and weaknesses of existing clustering algorithms and provide guidelines for matching them to biomedical applications. In the logistics field, clustering techniques are used for clustering demand nodes to solve vehicle routing problems (Özdamar & Demir, 2012; Coral et al., 2017; Comert et al., 2018; Hintsch & Irnich, 2020), for flow mapping and clustering to effectively discover major flow patterns from large point-to-point spatial flow data (Zhu & Guo, 2014), and for evaluating intercity ground transportation infrastructure and services to support the integration and development of urban agglomerations (Yue et al., 2019).
This study considers a sequential batching problem (SBP), wherein batching priorities are assigned based on AHCSs. The sequential batching arises in many areas including input lot formations in semiconductor wafer fabrication facilities, known as wafer fabs, bulk mail presorts for discounted mailing fees, and truck or container loads determinations in vendor managed inventory (VMI) environments. Unlike traditional batching problems, in this problem, batches are created sequentially from the first level of the AHCS given the highest priority to the last level given the lowest batching priority. For example, in a wafer fab, we can create an AHCS for batching priorities in the form of product hierarchy, which represents the hierarchical relationship among products. Product types are placed at the first level of the product hierarchy, whereas product lines, which are groups of product types with similar production processes, are placed at the second level. Product families, which are groups of product lines with a certain functional coherence, can be placed at the next level. The number of levels in an AHCS varies depending on the industry or company. In the postal service industry, the number of presort levels becomes that of the levels in the AHCS. For example, the Korea Post and United States Postal Service (USPS) have three and four presort levels for commercial bulk mail, respectively. Typically, many companies, including semiconductor manufacturers in other industries, have more than three hierarchical levels.
In a wafer fab, wafers are made up of lots and then released into the wafer fab, where the maximum number of wafers in a lot (i.e., lot size) is 25 in general. Also, a lot can be made up of wafers for single type of product or multiple types of products. We call the former homogeneous wafer lot and the latter heterogeneous wafer lot, which is split into several lots for multiple products during the wafer fabrication process. Because quantities to be released for product types are not a multiple of the lot size, the total number of lots (including the number of lots with a few wafers) increases significantly, which influences the productivity of the wafer fabrication if only homogeneous wafer lots are created. Therefore, it is required to create a certain number of heterogeneous wafer lots. However, it is important to create homogeneous wafer lots for each product type adequately before forming heterogeneous wafer lots for multiple product types, because excessive lot-split can lead to higher production costs. Here, an AHCS (based on product hierarchy) for product types is used to set priorities for sequential batching, wherein lots are formed sequentially from the first level of the AHCS to the last level. Particularly, the input lot formation by the sequential batching is very important for the customized small-volume production of semiconductor chips, such as application specific integrated circuits (ASIC) because hundreds of different kinds of products are fabricated in a typical wafer fab for ASIC chips of Korean global semiconductor manufacturers. See Kim and Lim (2012), Bang et al. (2012) and Lim et al. (2014) for the release of input lots and lot-split during the wafer fabrication process.
In the sequential batching process for the input lot formation, to avoid excessive lot-split, a certain number of homogeneous wafer lots is first formed for each product type before heterogeneous wafer lots for multiple product types are created. We refer to this number of (homogeneous) wafer lots as the committed minimum quantity at the first level of the AHCS, which should be appropriately set to control the heterogeneity degree more effectively, because the committed minimum quantity influences the number of lots with a few wafers and the total number of lots in the wafer fab, as mentioned above. Here, the degree of heterogeneity (DoH) can be defined as the ratio of production orders for which production quantities are satisfied by split lots from heterogeneous wafer lots. In this study, we include a special constraint, called the minimum quantity commitment (MQC) constraint, to effectively control the DoH in the batching results. The MQC constraint ensures that more wafers than the committed minimum quantity are formed as lots at a level before wafer lots are made at the next level of the AHCS. As high heterogeneous wafer lots lead to excessive lot-split in the wafer fabrication process, multiple deliveries for destinations occur from heterogeneous truckloads that include items for several destinations. Excessive multiple deliveries for destinations increase the total delivery costs and reduce customer satisfaction. The MQC constraint enforces that more items than the committed minimum quantity are formed as truckloads for each destination to avoid excessive multiple deliveries before truckloads for multiple destinations are formed.
As a study on the SBP for an exclusive AHCS, Lim et al. (2015) studied a presort loading problem (PLP), where three-level AHCS is defined by a discounted structure of mailing fees for presorted commercial bulk mail. Batches of mail pieces to be loaded onto mail trays are created for zip codes included in clusters according to the sequence of the levels in a given AHCS such that the maximum capacity of the mail tray and the minimum requirement for discounted mailing fees are satisfied. Lim et al. (2015) proved that PLP is a special case of transportation problem (TP) with the MQC constraint, where suppliers are divided into three levels, transportation costs from suppliers in the same level to any customers are not different from each other while transportation costs increase as the level of supplier increases and the increasing rate is gradually decreased by the tapering principle. They developed an exact solution method for the special case although the TP with the MQC constraint is NP-hard (Lim & Xu, 2006). Bang et al. (2016) developed a multi-dimensional dynamic programming (DP) algorithm for an input-lot formation in a semiconductor wafer fab, which is a kind of the SBP without the MQC constraint under N-level exclusive AHCS. As an extended version of Lim et al. (2015), Bang et al. (2016) extended the level of the AHCS from 3 to N and considered more general cost structure that does not follow the tapering principle. Lim et al. (2017) reduced the number of states in the DP algorithm developed by Bang et al. (2016). The DP algorithm, however, is not suitable to solve practical problems due to high computational complexity and very long computation time. Also, they did not consider nonexclusiveness of the AHCS and the MQC constraint for effective control over the DoH in the batching results, although they frequently appear in many application areas. See Lim et al. (2015) and Han et al. (2019) for a review of some studies on the MQC constraint. 
Studies on the batching problem have been conducted for a long time. However, there is no previous research on the SBP with the MQC constraint under the non-exclusive AHCS, although the MQC concept has been applied to supply-demand contracts and logistics-related decision-making. In this study, we consider the SBP with the MQC constraint in N-level non-exclusive AHCSs (shortly SBPM-NA). In this problem, batches are created for item types included in clusters according to the sequence of the levels in a given AHCS such that the MQC constraint as well as the maximum and minimum batch size requirements are satisfied simultaneously. Item types and their quantities to be batched are given at the first level of the AHCS, and those not batched at one level of the AHCS are collected and batched again at the next level. Note that batching at each level is not independent of batching at other levels, although the batching process is carried out sequentially from the first level to the last level, because the results of batching at one level affect the results at subsequent levels and are themselves influenced by the results at previous levels. It costs more to process batches made at higher levels of the AHCS, because more heterogeneous item types are batched together as the level of the AHCS increases. The MQC constraint ensures that more items than a committed minimum quantity are batched at a level before items not batched at that level are sent to the next level. We apply the MQC constraint to control effectively the degree of heterogeneity (DoH) in the batching results. We developed a sequential batching algorithm that minimizes the total processing cost of items, using identified solution properties to find better solutions for large practical problems.
We explain the SBPM-NA using an example truck-loading problem in section 2. Also, we identify some solution properties for the SBPM-NA and describe a sequential batching algorithm. Then, results of computational experiments and concluding remarks are given.

Fig. 1 shows a four-level non-exclusive AHCS for an example truck-loading problem modeled as the SBPM-NA. In the truck-loading example, item types, their quantities (to be batched), batches of items, batch processing costs, and unit processing costs correspond to destinations, quantities to be transported to destinations, truckloads, and their freight charges, respectively. A cluster consists of the destinations that a truck visits. Assume that each item has equal weight and volume and that items are transported to nine destinations using the same type of trucks. Also, assume that the quantities of items to transport to each of the nine destinations are known. There is a requirement on the minimum quantity of items that must be loaded (i.e., the minimum batch size requirement, shortly MIBR) and a limit on the maximum quantity of items that can be loaded (i.e., the maximum batch size limit, shortly MABL) onto each truck. That is, each truck has a minimum load requirement and a maximum load capacity. In addition, if a truck visits multiple destinations, the route and schedule for the destinations are fixed and known. Level 1 in Fig. 1 shows the case in which no destinations are grouped into clusters. In this case, batches are made by each item type (i.e., each destination) and, therefore, items loaded onto the same truck at Level 1 have the same destination. Level 2 has three clusters, Clusters 1-3, with nine destinations divided into these three clusters. Note that Destination 4 (Destination 6) is included in both Clusters 1 and 2 (Clusters 2 and 3). Items loaded onto each truck at Level 2 will be sent to destinations included in the same cluster.
Likewise, Level 3 has two clusters, and items loaded onto each truck at this level will be transported to destinations included in the same cluster of this level. Finally, all destinations are grouped into a single cluster at Level 4, and items loaded onto each truck at this level have them as their destinations. As the level increases, more destinations are included in a cluster. Furthermore, note that Destination 4 (Destination 6) is included in both Clusters 1 and 2 (Clusters 2 and 3) at Level 2, whereas Cluster 2 at Level 2 is included in both Clusters 1 and 2 at Level 3, resulting in a non-exclusive AHCS. In this example, batching items corresponds to making truckloads such that two loading constraints, the minimum requirement and the maximum limit, are satisfied. Also, the total number of batches corresponds to the total number of trucks used to transport items to their destinations. Although the total transportation cost becomes the minimum when each of the truckloads has only one destination, items for different destinations may be loaded onto some trucks to satisfy the two loading constraints. When we make a truckload with several destinations, the truckload should have destinations included in the same cluster. Note that transportation costs for truckloads at higher levels are higher than those at lower levels, because the truckloads at higher levels have more destinations to visit. Therefore, the truckloading decision should be made through Level 1 to Level 4 sequentially for minimizing the total transportation cost. First, items are batched to make truckloads at Level 1, and then items not batched at this level are collected by the cluster at Level 2 and batched again to make truckloads for Clusters 1-3 at Level 2. Similarly, items not batched at Level 2 are collected by the cluster at Level 3 and batched again. 
This batching process continues until Level 3, and all remnants not batched by Level 3 are sent to Level 4, where they are collected and processed individually. The MQC constraint ensures that more items than a committed minimum quantity are batched and made into truckloads at a level before items not batched at that level are sent to the next level.

Table 1 gives the problem data for the truck-loading example. More than 15 items must be loaded for the truckload (TL) freight charge, and up to 20 items can be loaded on a truck. Freight charges for truckloads made at Levels 1, 2, and 3 are 100, 120 (or 130), and 150, respectively. Items not made into truckloads by Level 3 incur the less-than-truckload (LTL) freight charge (12 per item). The TL and LTL freight charges are the batch processing and unit processing costs, respectively. Items are loaded onto several trucks from Level 1 to Level 4 sequentially, such that the truckloads satisfy the minimum truckload requirement and the maximum truck capacity.

Table 2 and Fig. 2 show a truck-loading decision by the sequential batching process and its graphical representation. Fifteen truckloads (from TL-1 to TL-15) are formed at Level 1, each of which contains only items for one destination. We call them homogeneous truckloads. For items not loaded at Level 1, three truckloads (TL-16, TL-17, and TL-18) are created for the three clusters at Level 2. The truckload TL-16 passes through destinations 4, 3, and 1, whereas TL-17 (TL-18) contains items for destinations 6 and 5 (7, 9, and 6). We call them heterogeneous truckloads. Items for destination 6 are divided and loaded onto two truckloads, TL-17 and TL-18. The sequential batching process for all items is completed at Level 2 without any items charged at the LTL freight rate. This is because the AHCS for the example problem is non-exclusive, where destination 6 is included in both Clusters 2 and 3 at Level 2.
If the AHCS is exclusive and the nine destinations are divided into three clusters, such as [1, 2, 3], [4, 5, 6], and [7, 8, 9], the batching process cannot be completed at Level 2. Instead, some of the items will be sent to Level 3 or Level 4, increasing the possibility that items for several destinations are loaded onto the same truck.

Table 3 and Fig. 3 show another truck-loading decision and its graphical representation for the example problem. Here, we create truckloads by filling items up to the maximum truck capacity. We can observe an important difference between the item quantities of TL-5 and TL-6 for destination 3 of the first truck-loading decision in Table 2 and those of the second decision in Table 3. In the first decision, the number of items loaded in each truck is determined as a value between the minimum requirement for the TL freight charge and the maximum capacity of the truck, instead of making full truckloads, as in the second decision. We can also find the same difference for destination 8. The total costs for the first and second decisions are 1,880 ((100×15)+(120×1)+(130×2)) and 1,896 ((100×14)+(120×1)+(130×1)+(150×1)+(12×8)), respectively. Multiple deliveries and receipts occur for five destinations (i.e., destinations 1, 3, 4, 6, and 7) in the first decision, as depicted in Fig. 2, whereas multiple receipts for more destinations (i.e., 1, 3, 4, 6, 7, and 8) occur in the second decision, as depicted in Fig. 3. Multiple deliveries from heterogeneous truckloads occur inevitably because there exist some remnants (not loaded at Level 1) for several destinations due to the minimum load requirement and the truck capacity. A situation similar to that of multiple deliveries also exists when forming input release lots in semiconductor wafer fabrication, wherein a cluster consists of product types.
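The two totals quoted above can be checked with a few lines of arithmetic. The sketch below only re-computes the figures given in the text (freight charges per truckload, truckload counts, and the LTL rate of 12 per item); the function name `total_cost` is ours. It also reports the share of the nine destinations that receive multiple deliveries in each decision, which is how the DoH is read for this example.

```python
def total_cost(truckloads, ltl_items=0, ltl_rate=12):
    """Sum TL freight charges plus the LTL charge for items left unbatched.

    `truckloads` is a list of (freight charge per truckload, number of truckloads).
    """
    return sum(charge * count for charge, count in truckloads) + ltl_rate * ltl_items


# First decision (Table 2): 15 truckloads at Level 1, three at Level 2.
first = total_cost([(100, 15), (120, 1), (130, 2)])
# Second decision (Table 3): full truckloads, 8 items left for the LTL rate.
second = total_cost([(100, 14), (120, 1), (130, 1), (150, 1)], ltl_items=8)
print(first, second)  # 1880 1896

# Share of destinations with multiple deliveries in each decision.
n_destinations = 9
doh_first = len({1, 3, 4, 6, 7}) / n_destinations
doh_second = len({1, 3, 4, 6, 7, 8}) / n_destinations
print(round(doh_first, 3), round(doh_second, 3))  # 0.556 0.667
```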
As mentioned earlier, it is necessary to make a certain number of heterogeneous wafer lots owing to the productivity issue and the lot size requirements. These heterogeneous wafer lots cause lot-splits during the wafer fabrication process. It is important to effectively control the DoH determined by the heterogeneous truckloads or heterogeneous wafer lots that cause multiple deliveries or lot-splits, because they are directly related to the total processing cost of items. As shown by the two decisions for the truck-loading example, the number of destinations where multiple deliveries occur can be decreased when the number of items loaded in each truck is determined as a value between the minimum requirement and the maximum truck capacity, instead of filling items up to the maximum truck capacity. We use the MQC constraint for effective control over the DoH. We explain the MQC constraint in more detail in the following section. Also, the non-exclusiveness of the AHCS should be considered to complete the sequential batching process at a lower level.

Notations to describe mathematically the SBPM-NA and the sequential batching algorithm are as follows.

Fig. 2. Graphical representation of sequential batching for results shown in
Parameters
n: index of levels (n = 1, 2, …, N)
( ): index of item types (for n = 1) or clusters (for n ≥ 2) at level n (Note that batches are made by each item type at Level 1. Also, note that i = 1 at Level 1, that is 1 ( ) , is different from i = 1 at Level 2, that is 1 ( ) , although we use the same index i for any level n for notational simplicity.)
Λ ( ): set of all item types (for n = 1) or all clusters (for n ≥ 2) at level n
Δ ( ): set of clusters at level n+1 that includes an item type ( ) (for n = 1) or a cluster ( ) (for n ≥ 2) at level n
Υ ( ): set of item types (for n = 2) or clusters (for n ≥ 3) at level n − 1 that are included in ( ) at level n
( ): the MABL of ( )
( ): the MIBR of ( ) (Note that ( ) = 0 for all ( ) , that is, the MIBR is not applied to level N. Instead, all remaining items not batched until level N − 1 are processed individually at the last level.)
( ): processing cost of items batched for ( ) (We assume that ( ) < ( ) for all n ≤ N − 1 and ( ) . That is, it costs more to process batches made at higher levels of the AHCS.)
( ): processing cost of items at the last level N (We assume that ( ( ) • ( ) ) > ( ) for all n ≤ N − 1 and ( ) , that is ( ) > ( ( ) ( ) ) . That is, it costs less to process batched items than unbatched ones.)

Decision Variables
( ): quantity of items to be batched (shortly QTB) for ( ) (Here, note that ( ) is the QTB and it is given in advance for all ( ) . Also, note that ( ) is the quantity of items not batched (shortly QNB) until level N − 1, and they are charged the unit processing cost ( ( ) ) at the last level N.)
( ): quantity of items batched (shortly QB) for ( ) (Here, 1 ≤ n ≤ N − 1.)
( ): the minimum number of batches (shortly MNB) needed to make batches with ( ) items satisfying the MQC constraints as well as the MIBR ( ( ) ) and MABL ( ( ) ) on the batch size for ( ) (Here, 1 ≤ n ≤ N − 1.)

The objective function minimizes the total processing cost, where the first term is the processing cost for items batched whereas the second term is the processing cost for items not batched until level N − 1.

Sequential Batching Algorithm for SBPM-NA
Let ( ) be the MNB needed to make batches with ( ) items satisfying the MIBR ( ( ) ) and MABL ( ( ) ). Also, let ( ) and ( ) be the minimum and maximum numbers of remnants, respectively, for ( ) when making ( ) batches with ( ) items to satisfy the MIBR and MABL. The following Property 1 to determine ( ) , ( ) and ( ) for given ( ) is derived from the solution property identified by Lim et al. (2015) for the SBP without the MQC constraint under three level exclusive AHCS.
Property 1. For a given ( ) , ( ) , ( ) and ( ) are as follows: is not an integer and ( ( ) is not an integer and ( ( ) Proof. Let  be the number of items included in each batch. When ( ) is not an integer and ( ( ) is not an integer and ( ( ) In this study, the committed minimum quantity and the MQC constraint are set to ( ) • ( ) and ( ) ≥ ( ) • ( ) , for each of all ( ) s, respectively. By setting it like this, the number of items included in each batch for ( ) is determined as a value between ( ) and ( ) while minimizing the number of batches needed for ( ) items, as shown in Property 1. Also ( ) satisfies the MQC constraint if ( ) is determined as a value between ( ( ) − ( ) ) and ( ( ) − ( ) ) as shown by Property 2.
Also, ( ) = ( ) according to Property 1. For (b) and (c) of Property 1, it can be proved similarly. ■

In the truck-loading example, the number of destinations where multiple deliveries occur can be decreased and the sequential batching process can be completed at an earlier level when the number of items loaded onto each truck is determined as a value between the minimum requirement for the TL freight charge and the maximum truck capacity. Additionally, setting the MQC constraint ( ) ≥ ( ) • ( ) enables more effective control over the DoH in the batching results because we can change the level at which the sequential batching process is completed and the number of batches formed at each level by reducing or increasing the MIBR and MABL ( ( ) and ( ) ), while minimizing the number of batches (i.e., ( ) = ( ) ). In the truck-loading example, the DoH can be defined as the ratio of destinations where multiple deliveries occur. As mentioned earlier, batches are created for item types included in clusters according to the sequence of the levels in a given AHCS. Therefore, the committed minimum quantity ( ( ) • ( ) ) and batching results ( ( ) , ( ) , ( ) and ( ) ( ) ) are not determined for all levels at once but are determined sequentially, only for the level at which batching is currently being performed, starting from the first level. Because of this distinct feature, it is called sequential batching with the MQC constraint. Batching at each level is not independent of batching at other levels, although the batching process is carried out sequentially from the first level to the last level, because the results of batching at one level affect the results at subsequent levels and are themselves influenced by the results at previous levels. Recursive equations of the dynamic programming (DP) can be formulated for the sequential batching problem.
However, the DP approach is not suitable to solve practical problems due to high computational complexity and very long computation time as mentioned by Bang et al. (2016) and Lim et al. (2017). In this study, we develop a sequential batching algorithm that makes batches for two successive levels sequentially from the first level using properties identified to find better solutions quickly.
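The role of Property 1 and of the MQC setting can be illustrated with a small sketch. It is one plausible reading of the three cases of Property 1 under stated assumptions, not a restatement of the property itself: the names `batch_bounds`, `q`, `mibr`, and `mabl` are ours, every batch is assumed to hold between `mibr` and `mabl` items, and the committed minimum quantity is taken as the number of batches times the MIBR, as described above.

```python
def batch_bounds(q, mibr, mabl):
    """Return (m, r_min, r_max) for q items and batch sizes in [mibr, mabl].

    m      number of batches formed (assumed reading of the MNB)
    r_min  fewest items left unbatched when m batches are formed
    r_max  most items that may be left unbatched while each of the
           m batches still holds at least mibr items
    """
    if q < mibr:
        return 0, q, q                 # not even one batch can be formed
    full = -(-q // mabl)               # ceil(q / mabl) with integers
    if full * mibr <= q:
        m, r_min = full, 0             # all q items fit into `full` batches
    else:
        m = q // mabl                  # one batch fewer, all filled to mabl
        r_min = q - m * mabl
    r_max = q - m * mibr
    return m, r_min, r_max


def satisfies_mqc(batched, m, mibr):
    """MQC check under the assumed setting: batched quantity >= m * mibr."""
    return batched >= m * mibr


# Hypothetical quantities with mibr = 15 and mabl = 20.
print(batch_bounds(44, 15, 20))  # (2, 4, 14)
print(batch_bounds(35, 15, 20))  # (2, 0, 5)
print(batch_bounds(10, 15, 20))  # (0, 10, 10)
```

For 44 items, any batched quantity between 44 − 14 = 30 and 44 − 4 = 40 passes `satisfies_mqc` with two batches, which mirrors the range stated for the QB above.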

Table 4
A mathematical solution for the example truck-loading problem According to Properties 1 and 2, we can obtain a mathematical solution of the example truck-loading problem. Table 4 presents the solution of the example truck-loading problem given in Table 2 Table 1. The total processing cost is the sum of the total TL and LTL freight charges. Moreover, ( ) is the number of truckloads for ( ) , and ( ) items are divided and loaded onto the ( ) trucks. Following properties 3 and 4 indicate whether we can find an optimal solution of the SBPM-NA at Level 1 and Level 2, respectively. Proof. Note that ( ) is the cluster with the minimum batch processing cost among clusters in Δ ( ) . The QB at Level 1 becomes the maximum and satisfies the MQC constraint by Property 2, while the number of batches for them is also the minimum by Property 1. Also, all remnants not batched at Level 1 are sent to the cluster of Level 2 with the minimum batch processing cost, and they are batched for the cluster without any remnants. Therefore, total batch processing cost becomes the minimum. ■ Note that it is impossible to determine whether the solution in Table 4 is optimal based on Property 4 because remnants not batched at Level 1 are not sent to the cluster with the minimum batch processing cost at Level 2 (i.e.,  Table 4 is to be an optimal solution according to Property 4, because cluster 2 ( ) has the minimum batch processing cost among all clusters at Level 2.
The following Property 5 provides conditions for termination of the sequential batching process after Level 2. Note that Property 5 does not ensure an optimal solution of the SBPM-NA. (1). Therefore, ( ) satisfies the MQC constraint by Property 2. Also, the conditions (2) and (3)
Proof. ( ) satisfies the MQC constraint by Property 2 and it is the maximum QB with the MNB since ( ) is the minimum quantity of remnants after batching by Properties 1 and 2. Also, ( ) is sent to the cluster where the minimum unit processing cost occurs. ■
The following Property 7 shows how to obtain a lower bound for the optimal total processing cost of the SBPM-NA. Here, ( ) , ( ) and ( ) are: is not integer and ( ( ) is not integer and ( ( )
Property 7. A lower bound for the optimal total processing cost for the SBPM-NA, called LB, is
Proof. By Property 1 through Property 3, the optimal total processing cost becomes ∑ when the sequential batching process is completed at Level 1. In this case, ( ) = ∅ and LB equals the optimal total processing cost. On the other hand, the optimal total processing cost becomes ∑ when the sequential batching process is completed at Level 2. Here, ( ) * is an optimal solution at Level 2. In this case, becomes a lower bound for the SBPM-NA. ■
Using the properties identified above, we developed a sequential batching algorithm that makes batches for two successive levels, starting from the first level. In this algorithm, for two successive levels n and n+1, we first determine both the QB for all ( ) s at level n (i.e., ( ) ) and the quantity of items not batched for ( ) at level n that are sent to the next level n+1 (i.e., ( ) ( ) and ( ) ). For all ( ) s, we tentatively set ( ) to ( ) − ( ) (i.e., the maximum quantity of items that can be batched). As a result, ( ) items can be sent to clusters included in Δ ( ) at level n+1. In this algorithm, we send all ( ) items to only one cluster at level n+1 where the minimum batch processing cost occurs. That means we tentatively set .
If all ( ) items can be batched for all ( ) s, the sequential batching process is completed at level n+1. However, if there exists ( ) such that ( ) > 0, we check whether some of the items tentatively batched at level n can be sent additionally so that ( ) = 0 without violating the MQC constraint. If that is possible, we send additional items to the clusters so that ( ) becomes zero with the additional items from level n. Otherwise, we do not send the items batched at level n additionally to the next level and continue the sequential batching process. Now, we describe the sequential batching algorithm in detail.

Procedure 1 (Sequential Batching Algorithm)
Step 1. Compute ( ) and ( ) for each of all ( ) s according to Property 1. If ( ) = 0 for all ( ) s, make ( ) batches with ( ) for each of all ( ) s according to Property 1 and terminate by Property 3.
Step 2. For each of all ( )   . Compute the lower bound according to Property 7 and compute the total processing cost. Terminate.
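The two-level step described before Procedure 1 can be illustrated with a short Python sketch. It is not the authors' C implementation: the function names, the data layout, and the batch-size rule are assumptions made for illustration only. In particular, the sketch assumes that every batch must hold between `b_min` and `b_max` items, and it models neither the MQC check nor the lower bound of Property 7. The demand figures, cluster names, and costs in the usage lines are made up.

```python
def split_quantity(q, b_min, b_max):
    """Return (num_batches, batched, remnant) for an integer quantity q.

    Assumption (not taken from the paper): each batch holds between
    b_min and b_max items; we batch as many items as possible and,
    among such solutions, use as few batches as possible.
    """
    if not (0 < b_min <= b_max) or q < 0:
        raise ValueError("need 0 < b_min <= b_max and q >= 0")
    k = -(-q // b_max)            # ceil(q / b_max): fewest batches that could hold q
    if k * b_min <= q:            # q can be spread over k admissible batches
        return k, q, 0
    k = q // b_min                # most batches that q can fill up to b_min
    batched = k * b_max           # fill each of these batches to capacity
    return k, batched, q - batched


def route_remnants(remnants, parents, batch_cost):
    """Send each remnant to the cheapest cluster at the next level.

    remnants:   {item: quantity left unbatched at level n}
    parents:    {item: [clusters at level n+1 that contain the item]}
    batch_cost: {cluster: batch processing cost}
    """
    inflow = {}
    for item, qty in remnants.items():
        if qty == 0:
            continue
        target = min(parents[item], key=lambda c: batch_cost[c])
        inflow[target] = inflow.get(target, 0) + qty
    return inflow


# Hypothetical data: three item types, two clusters at the next level.
demand = {"A": 120, "B": 110, "C": 30}
level1 = {item: split_quantity(q, 40, 50) for item, q in demand.items()}
remnants = {item: r for item, (_, _, r) in level1.items()}
inflow = route_remnants(
    remnants,
    parents={"A": ["X"], "B": ["X", "Y"], "C": ["Y"]},
    batch_cost={"X": 150, "Y": 180},
)
```

Because the clustering structure is non-exclusive, item type B belongs to two clusters here, and its remnant goes to the one with the lower batch processing cost. The quantities in `inflow` would then be split again at the next level with the batch-size limits that apply there.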

Computational Experiments
We performed computational experiments for evaluating the sequential batching algorithm. Problem instances are generated as follows.
(1) Three cases for the number of item types: 400, 600 and 800.
(2) Three cases for the mean of the QTB for each item type (μ): 400, 600 and 800.
(3) Three cases for the variation of the QTB (σ/μ): Low (σ/μ = 1.0), Medium (σ/μ = 1.5) and High (σ/μ = 2.0), where μ and σ are the two parameters of the normal distribution N(μ, σ²). Here, the size of variation becomes larger as σ/μ increases. Also, we excluded negative random values generated from the normal distribution.
Here, remember that we set the committed minimum quantity to ( ( ) × ( ) ), where ( ) is the MNB by Property 1 and ( ) is the MIBR. That is, the bigger the MIBR, the larger the committed minimum quantity. Therefore, the MQC constraint becomes tighter.
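The demand generation described above can be sketched in Python as follows. The function name, the redraw-on-negative rule, the rounding to whole items, and the seed are assumptions made for illustration; the text only states that the QTB follows a normal distribution and that negative values were excluded. The call uses a mean of 400 and a standard deviation of 400, the setting the paper reports for demand scenario S-1.

```python
import random


def generate_qtb(num_item_types, mu, sigma, seed=None):
    """Draw one quantity per item type from N(mu, sigma^2), redrawing negatives."""
    rng = random.Random(seed)
    quantities = []
    while len(quantities) < num_item_types:
        x = rng.gauss(mu, sigma)
        if x >= 0:                       # negative draws are discarded
            quantities.append(round(x))  # whole items: an assumption
    return quantities


# 400 item types with mean 400 and standard deviation 400.
qtb = generate_qtb(400, mu=400, sigma=400, seed=1)
```

Discarding negative draws truncates the distribution, so the sample mean ends up above μ, and the effect grows with σ/μ.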
The first three parameters (namely, the number of item types and the mean and variation of the QTB for each item type) define demand scenarios. We constructed 27 demand scenarios from these three parameters at three levels each. The other four parameters define the clustering structure (the number of levels and density of clusters) and batch configuration (the MABL and MQC tightness). For each of the 27 demand scenarios, we tested 81 combinations for the four parameters of clustering structure and batch configuration. For each of the 81 combinations, we generated five problem instances and solved them using the sequential batching algorithm. Note that the number of item types, QTB, and batch and unit processing costs are the same for all five problem instances of the same demand scenario, whereas the clustering structure and batch configuration are different.
The batch processing costs were generated from DU(100, 120) for Level 1, DU(150, 200) for Level 2, DU(250, 300) for Level 3, and DU(350, 400) for Level 4 when the MABL is small. Here, DU(a, b) represents the discrete uniform distribution with range parameters a and b. When the MABLs are medium and large, the batch processing costs were set to double and quadruple those for the small MABL, respectively. The unit processing costs at the last level were set to satisfy ( ) > ( ( ) ( ) ) for all ( ) s.
The sequential batching algorithm was coded in the C programming language and run on a PC with an Intel Core i7 processor to solve the problem instances. The algorithm solved all the problem instances within seconds. The average percentage gap is 2.07%, which means that the algorithm finds very good lower and upper bounds. Here, the percentage gap is ((UB − LB)/LB) × 100, where UB is the total processing cost obtained by the algorithm and LB is the lower bound by Property 7.
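The size of the experimental design and the gap measure can be checked with a few lines of Python. The identifiers are illustrative, and the scenario numbering is an inference: it assumes the number of item types varies slowest and the variation fastest, which matches the descriptions of S-1, S-14, and S-27 in this section but is not stated explicitly. Treating the 81 combinations as four parameters with three settings each is likewise inferred from the count.

```python
from itertools import product

ITEM_TYPES = (400, 600, 800)      # number of item types
QTB_MEAN = (400, 600, 800)        # mean of the QTB
QTB_VARIATION = (1.0, 1.5, 2.0)   # low, medium, high variation

# Assumed ordering: item types vary slowest, variation fastest.
SCENARIOS = {
    f"S-{i}": combo
    for i, combo in enumerate(
        product(ITEM_TYPES, QTB_MEAN, QTB_VARIATION), start=1
    )
}

COMBINATIONS_PER_SCENARIO = 3 ** 4   # four parameters, three settings each
INSTANCES_PER_COMBINATION = 5
TOTAL_INSTANCES = (
    len(SCENARIOS) * COMBINATIONS_PER_SCENARIO * INSTANCES_PER_COMBINATION
)


def percentage_gap(ub, lb):
    """Gap between the algorithm's cost (UB) and the lower bound (LB), in percent."""
    if lb <= 0:
        raise ValueError("lower bound must be positive")
    return (ub - lb) / lb * 100
```

Under these assumptions the design comprises 27 × 81 × 5 = 10,935 problem instances, and `SCENARIOS["S-14"]` is the all-medium setting `(600, 600, 1.5)`.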
For each demand scenario, we performed an analysis of variance (ANOVA) to determine whether the factors for the clustering structure and batch configuration have a statistically significant effect on the total processing cost. Table 5 gives the ANOVA results. The MABL is a statistically significant factor that affects the total processing cost for all 27 demand scenarios. However, the number of levels is not a statistically significant factor for the total processing cost in any demand scenario. As shown in the results for demand scenarios S-1, S-2, S-3, S-10, S-11, S-12, S-19, S-20, and S-21, the MQC tightness affects the total processing cost when the average demand is small, regardless of the number of item types and size of variation. When the number of item types is large, the MQC tightness also affects the total processing cost for all demand scenarios except S-27, as shown in the results for demand scenarios S-19 through S-26. When the number of item types is small, the density of clusters affects the total processing cost when both the average demand and size of variation are small, as shown in the results for demand scenario S-1. As the number of item types increases, the density of clusters affects the total processing cost in cases where the average demand or size of variation increases, as shown in the results for demand scenarios S-11, S-13, S-20, and S-22. When both the average demand and size of variation are small, interactions among the density of clusters, MABL, and the MQC tightness affect the total processing cost, as shown in the results for demand scenarios S-1, S-10, and S-19. Also, when the number of item types increases, the interaction between the MABL and MQC tightness affects the total processing cost for larger average demand or larger variation, as shown in the results for demand scenarios S-13, S-20, S-22, S-23, and S-25.
Building on the ANOVA results, we now draw some insights from detailed results for selected demand scenarios to help determine optimal parameters for the clustering structure and batch configuration. Tables 6-8 show test results for demand scenario S-1, where the number of item types, average demand, and size of variation are small. Results for the 3-level case are shown in Table 6, whereas those for the 4-level and 5-level cases are shown in Tables 7 and 8, respectively. All values in the fifth through the last column are the average values of test results for the five problem instances solved. In the tables, the first column gives the variation in the QTB. The QTB for each item type was generated from N(400, 400²) in demand scenario S-1. The second column gives the clustering structure, where the first value represents the number of levels and the remaining values represent the numbers of item types at level 1, clusters at level 2, and clusters at level 3, respectively. For example, the clustering structure 3-400-80-16 has three levels, 400 item types in the first level, 80 clusters in the second level, and 16 clusters in the third level. We set the portion of clusters (or item types) at a level that are included in two or more clusters at the next level to be between 15% and 45% to reflect non-exclusiveness of the AHCS. The fifth and sixth columns give the total processing cost and the percentage gap, respectively. The seventh column gives the average fill rate computed by dividing the average number of items included per batch by the MABL. For example, 49 items are included per batch on average when the fill rate is 0.98 and the MABL is 50. The eighth, ninth, tenth and eleventh columns represent the portions of the total QB, total QNB until the last level, and the QB at Levels 1 and 2, respectively. The last column represents the average level where the sequential batching process is completed.
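Two of the table conventions above are easy to make concrete. The following Python sketch, with hypothetical helper names, parses a clustering-structure label and computes the fill rate as defined in the text. The check that the number of sizes equals the number of levels is an assumption drawn from the single example 3-400-80-16.

```python
def parse_structure(label):
    """Split a label such as '3-400-80-16' into (levels, sizes per level)."""
    levels, *sizes = (int(part) for part in label.split("-"))
    if len(sizes) != levels:   # assumed convention: one size per level
        raise ValueError(f"expected {levels} sizes, got {len(sizes)}")
    return levels, sizes


def fill_rate(avg_items_per_batch, mabl):
    """Average number of items per batch divided by the MABL."""
    if mabl <= 0:
        raise ValueError("MABL must be positive")
    return avg_items_per_batch / mabl


levels, sizes = parse_structure("3-400-80-16")   # three levels: 400, 80, 16
rate = fill_rate(49, 50)                         # 49 items per batch, MABL of 50
```

With an MABL of 50, an average of 49 items per batch corresponds to a fill rate of 0.98.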
Results of the computational experiments for demand scenario S-1 show that dense cluster, small MABL, and tight MQC constraint are effective in reducing the total processing cost. Note that we assume the batch processing cost increases in proportion to the MABL. If economies of scale in the batch processing cost occur as the MABL grows, a larger MABL may reduce the total processing cost. Furthermore, a larger MABL reduces the number of batches and may result in reduced material handling costs. However, the heterogeneity of each cluster becomes higher, and the number of item types included in batches may also increase, when we decrease the number of clusters to make the clusters denser. This may increase the batch processing cost and total processing cost. Test results for the 4-level and 5-level cases compared with results for the 3-level case present some new findings. The small MABL and tight MQC constraint help reduce the total processing cost further when we increase the number of levels from three to four or five if the variation in the QTB is not large, although the number of levels is not a statistically significant factor for the total processing cost. Also, the QB increases slightly when we increase the number of levels. Tables 9-11 show test results for demand scenario S-14, where the number of item types, average demand, and size of variation are all at medium levels. Tables 9, 10 and 11 show results for the 3-level, 4-level and 5-level cases, respectively. For this demand scenario, dense cluster, small MABL, and tight MQC constraint are still effective in reducing the total processing cost, although the density of cluster is not statistically significant. However, increasing the number of levels no longer helps reduce the total processing cost in this demand scenario, unlike in the S-1 demand scenario where the number of item types, average demand, and size of variation are small.
Tables 12-14 show test results for demand scenario S-27, where the number of item types, average demand, and size of variation are large. For this demand scenario, dense cluster, small MABL, and tight MQC constraint are still effective in reducing the total processing cost, although the effect of the density of clusters and MQC tightness on the total processing cost becomes weaker. From the results of the analyses so far, we can conclude that the density of cluster, MABL, and MQC tightness should be determined simultaneously, not separately, considering interactions of these factors and economies of scale and scope in batch sizes. Also, it is important to set the number of levels considering the variation in the QTB, MABL, and MQC tightness. Main results for other aspects including the fill rate, the QB and QNB are as follows. First, the QB at lower (i.e., earlier) levels increases as the MQC tightness eases, whereas the total processing cost increases because the fill rate decreases as a result of the loose MQC. Here, notice that the DoH may decrease when the sequential batching process is completed at a lower level because clusters at a lower level contain fewer different item types. Second, the QNB increases as the MABL becomes larger, which results in increased total processing costs. Third, as the density of cluster becomes lower, the total processing cost increases because the QNB increases owing to fewer alternatives for the batch formation. Fourth, as the density of cluster becomes lower and the MABL becomes larger, the sequential batching process is completed at higher (i.e., later) levels, which causes an increase in the QB at higher levels. These tendencies become more apparent when the mean of the QTB is small and its variation is low. In contrast, these tendencies diminish for a larger mean and higher variation in the QTB.
Here, notice that the DoH may increase when the sequential batching process is completed at a higher level because clusters at a higher level contain more different item types.
The analysis results mentioned above can be interpreted in terms of truck-loading and input-lot formation in the wafer fab as follows. For truck-loading decisions made by sequential batching with the MQC constraint, increasing the density of a cluster by including more destinations in the cluster, setting the minimum quantity that should be loaded onto a truck as large as possible, and utilizing smaller trucks can reduce the overall cost of delivery. In particular, trucks with a smaller capacity offer more opportunities for reducing multiple deliveries because the sequential batching process is completed at a lower level in this case. Additionally, as the mean and variation of quantities to transport to destinations become smaller, increasing the density of clusters and utilizing small trucks have an advantage in reducing multiple deliveries to destinations. For the input-lot formation in the wafer fab, including more product types in a product line or family, setting the minimum number of wafers required in each lot to be larger, and setting the maximum number of wafers allowed in each lot to be smaller can reduce the overall production cost. In particular, a smaller lot size offers more opportunities for reducing lot-splits during the fabrication process. As the mean and variation in the number of wafers for product types to be released into the wafer fab become smaller, increasing the number of product types included in a product line or family and using small lot sizes provide an advantage in reducing lot-splits. Finally, a small truck capacity and a small lot size with a loose MQC constraint are effective in reducing the DoH, which results in fewer multiple deliveries and lot-splits.

Concluding Remarks
In this paper, we studied a sequential batching problem with an MQC constraint in N-level non-exclusive AHCSs. Here, batching priorities are assigned based on AHCSs and the MQC constraint is applied for more effective control over the DoH in the batching results. Sequential batching arises in many areas including input lot formations in semiconductor wafer fabs, presort loading of commercial bulk mail for discounted mailing fees, and determination of truckloads in VMI environments. We developed a sequential batching algorithm that makes batches for two successive levels sequentially from the first level, using properties identified to find better solutions and compute lower bounds. Computational experiments were performed to evaluate the sequential batching algorithm and to find some managerial insights. The algorithm solved all the problem instances quickly enough for solving large-sized practical problems. Furthermore, the average percentage gap was 2.07%, which means that the algorithm finds very good lower and upper bounds. We found some meaningful insights that can be properly interpreted depending on the application area. First, the MABL is a statistically significant factor for the total processing cost, whereas the number of levels is not significant. Second, dense cluster, small MABL, and tight MQC constraint are effective in reducing the total processing cost. Third, interactions among the density of clusters, MABL, and the MQC tightness have an effect on the total processing cost when both the average demand and size of the variation are small. The interaction between the MABL and MQC tightness affects the total processing cost for larger average demand or larger variation. Fourth, a small MABL and tight MQC help reduce the total processing cost for clustering structures with more (deeper) levels when the variation in the QTB is not large. Fifth, a small MABL with a loose MQC constraint seems to be useful for reducing the DoH in the batching results.
From these results, we suggest that the density of cluster, MABL, and MQC tightness should be determined simultaneously, not separately, considering interactions of these factors and economies of scale and scope in batch sizes. Also, it is important to set the number of levels considering the variation in the QTB, MABL, and MQC tightness. This research lays the groundwork for studying various real-world problems in which item sizes vary and, therefore, the number of items included in each batch may vary. Such problems are commonly encountered in the logistics industry, where heterogeneous items are loaded onto vehicles and transported to several destinations. Furthermore, the cost of batch processing may vary depending on the quantities and types of items included in batches. In flexible manufacturing environments, various items with different processing requirements are batched and processed together. It is necessary to conduct well-designed computational experiments using domain-specific field data to apply the main results of this study to these actual problems in manufacturing, logistics, and so forth. Results of this study can provide guidelines for computational experiments and their analysis. In addition, rather than deciding batch configuration and clustering structure independently, we should determine them together from an integrated perspective. We can combine the sequential batching algorithm with clustering techniques based on machine learning for such integrated problems.