Solving the one-warehouse N-retailers problem with stochastic demand: An inter-ratio policies approach

Article history: Received February 8, 2020; received in revised format May 23, 2020; accepted July 17, 2020; available online July 17, 2020.

In this paper, we consider a two-echelon supply chain in which one warehouse provides a single product to N retailers, using integer-ratio policies. The deterministic version of the problem has been widely studied. However, the assumption of known demand can lead to inaccurate and ineffective decisions. In this research, we tackle the stochastic version of the two-echelon inventory system by designing an extension of a well-known heuristic. Customer demands are assumed to follow a normal density function. A set of 240 random instances was generated and used to evaluate both the deterministic and stochastic solution approaches. Due to the nature of the objective function, evaluation was carried out via Monte Carlo simulation. For variable demand settings, computational experiments show that: i) using average demand to define the inventory policy implies an underestimation of the total cost and ii) the newly proposed method offers cost savings.

© 2021 by the authors; licensee Growing Science, Canada

The deterministic version of the problem, in which the demands of each retailer are known with certainty, has been widely studied. Schwarz (1973), for example, found that when all retailers are equal, the optimal choice is a single-cycle stationary policy. However, when retailers differ, such policies can be quite effective but are not necessarily optimal. Furthermore, the heuristic proposed in that work generates good solutions only for systems with fewer than ten retailers. As a result, new heuristics that improve performance can now be found in the literature. Roundy (1985) proposed two policies: an integer-ratio policy, and another based on powers of two, to minimize the average cost of the problem. Additionally, as an extension of these results, Abdul-Jalbar, Gutiérrez, & Sicilia (2006) analyzed the impact of the power-of-two constraint on inventory policies and consequently proposed a heuristic for finding integer-ratio policies that improve the objective function.
More recently, Abdul-Jalbar et al. (2010) proposed a heuristic aimed at balancing the costs of inventory replenishment and maintenance at each retailer. According to the authors, this method can be implemented on a spreadsheet, which makes it easier to apply to real systems. Likewise, computational experiments have shown that its performance is superior to that of other methods addressing the same problem. Some of the extensions proposed by the authors for the study of more complex supply chains have already been explored by other researchers (Zhao et al., 2016).
However, many inventory models proposed in the literature assume that conditions of certainty exist in the environment being modeled. This assumption can lead to an inaccurate representation of the actual inventory system, and thus to extremely inaccurate and ineffective decisions (Bean, Joubert, & Luhandjula, 2016; Y. Li & Liu, 2019; Rodado, Escobar, García-Cáceres, & Atencio, 2017). As a consequence, uncertainty of inventory parameters is now a well-established phenomenon (Guchhait, Maiti, & Maiti, 2014; Shekarian, Kazemi, Abdul-Rashid, & Olugu, 2017). According to Torkul, Yılmaz, Selvi, & Cesur (2016), this uncertainty can be managed by studying the variance of orders, lead times, and backorders. For stochastic demands, the nature of the uncertainty can be classified into two major groups: random and fuzzy (Fera, Fruggiero, Lambiase, Macchiaroli, & Miranda, 2017; Guchhait et al., 2014). Although a number of authors have studied the impact of variability in inventory problems for single-echelon systems (Disney, Maltz, Wang, & Warburton, 2016; Escuín, Polo, & Ciprés, 2017), the impact of demand variability has received little attention with respect to the supply chain coordination problem (Ercan & Hakan, 2013).
In this context, two questions arise: i) how does demand variability affect the quality of the solutions obtained under the deterministic approach for systems involving one warehouse and several retailers, and ii) can the performance of deterministic solutions be improved if demand variability is included in the analysis? The present paper answers these two questions and proposes a modification of the Abdul-Jalbar heuristic so that backorder costs and holding costs for the safety stock are included. The remainder of the paper is organized as follows: Section 2 presents the research problem and describes the deterministic approach; Section 3 presents the proposed extension to the case in which demands can be modeled using a normal probability density function; finally, Sections 4 and 5 present a numerical example and the results of computational experiments, respectively.

Problem statement and deterministic approach
Consider a system in which one warehouse supplies a single product to $N$ retailers. The aim is to find the policy that minimizes the total cost, avoiding the economic sacrifices that arise when the problem is divided into independent models (Schwarz, 1973). The demand rate is constant and known; the warehouse is the only provider to the retailers and can serve as a storage point. Additionally, there are no backorders or initial inventory, and shipments to the warehouse are made instantly. The model parameters are the demand rate of each retailer and the fixed ordering cost and unit holding cost of each facility, with three decision variables: the replenishment interval of the warehouse, $t_0$, the replenishment interval of retailer $j$, $t_j$, and the coordination constant $f_j$ that links them.

According to Schwarz (1973), in supply chains with this configuration, optimal replenishment policies require that the order quantities vary with time. Consequently, to promote real-world implementations, many authors have studied a simpler class of strategies called stationary and nested policies (Abdul-Jalbar et al., 2010). In stationary policies, each facility orders a constant batch at equally spaced time instants, whereas in nested policies all the retailers order at the same time as the warehouse (Abdul-Jalbar et al., 2006). Even though stationary and nested policies have managerial advantages, the nested nature may lead to very low effectiveness (Abdul-Jalbar et al., 2010). Hence, Abdul-Jalbar et al. (2010) designed integer-ratio policies in which $t_0 = f_j \, t_j$ for all $j = 1, \ldots, N$. In this context, $f_j$ is either an integer value when $t_j \le t_0$ or an integer ratio ($f_j = 1/n$; $n \in \mathbb{Z}^+$) when $t_j \ge t_0$.
For these policies, the total cost of supplying a retailer is given by Eq. (1). The coordination constant $f_j$ defines the number of times that retailer $j$ is replenished during the cycle time of the warehouse. Consequently, the first term of Eq. (1) is the fixed cost of ordering over the planning horizon, while the second term is the cost of holding the average inventory. Figure 1 represents a policy comprising one warehouse and 2 retailers, with coordination constants $f_1 > 1$ and $f_2 < 1$. Given this policy, the warehouse must hold inventory to supply retailer 1's orders. The total cost of the system, Eq. (2), is the cost of the warehouse (first term) plus the sum of the retailer costs (second term).
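The cost structure described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the paper's Eq. (1) and Eq. (2), which did not survive in this text: the symbols `k`, `h`, `d` (fixed ordering cost, unit holding cost, demand rate), the EOQ-style retailer cost, and the rule that the warehouse carries stock only for retailers with $f_j > 1$ are all illustrative choices.

```python
from typing import Sequence


def retailer_cost(k_j: float, h_j: float, d_j: float, f_j: float, t0: float) -> float:
    """Average cost per unit time of a retailer replenished every t_j = t0 / f_j."""
    t_j = t0 / f_j
    ordering = k_j / t_j            # fixed ordering cost per unit time
    holding = h_j * d_j * t_j / 2   # cost of the average cycle inventory
    return ordering + holding


def total_cost(k0: float, h0: float, k: Sequence[float], h: Sequence[float],
               d: Sequence[float], f: Sequence[float], t0: float) -> float:
    """Warehouse cost plus the sum of retailer costs (assumed functional form)."""
    warehouse_ordering = k0 / t0
    # Assumption: the warehouse holds stock only for retailers that are
    # replenished more than once per warehouse cycle (f_j > 1).
    warehouse_holding = sum(
        h0 * d_j * (t0 - t0 / f_j) / 2 for d_j, f_j in zip(d, f) if f_j > 1
    )
    retailers = sum(
        retailer_cost(k_j, h_j, d_j, f_j, t0)
        for k_j, h_j, d_j, f_j in zip(k, h, d, f)
    )
    return warehouse_ordering + warehouse_holding + retailers
```

With one retailer replenished twice per warehouse cycle, for example, the warehouse term includes the stock it keeps between the two retailer orders, which is the situation of retailer 1 in Figure 1.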

The Abdul-Jalbar solution approach
In order to minimize this deterministic cost function, Abdul-Jalbar et al. (2010) proposed a heuristic based on the ratio between ordering and holding costs for each retailer. This ratio is denoted $r_j$ and is given by Eq. (3). Three steps are required to obtain inventory policies. First (step 0), all coordination constants are set to $f_j = 1$ and the total cost is calculated. Second (step 1), the $r_j$ values are computed for each retailer. Since the objective is to obtain values of $r_j$ close to 1, the replenishment time must be increased or decreased for each retailer not meeting this condition. According to the authors, the best solutions are obtained for values of $r_j$ in the range [0.4; 1.2]. At each iteration of step 1, a marginal change in the values of $f_j$ is made until every $r_j$ falls within this range or the total cost increases from one iteration to the next. In either case, step 2 follows, in which only one value of $f_j$ is changed at a time. The retailer with the highest value of $r_j$ or $1/r_j$ is selected, because it is the farthest from the target ($r_j$ close to 1). The changes made to the $f_j$ value of this retailer are the same as in the previous step. This procedure is repeated until the total cost starts to increase. A more detailed overview of the pseudo-code of the Abdul-Jalbar et al. (2010) heuristic is presented below.
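Step 1: For every retailer $j$ with ratio $r_j < 0.4$, increase the value of the coordination constant $f_j$ to obtain the value $f_j^*$. If $f_j \ge 1$, increasing the value means setting $f_j^* = f_j + 1$. Otherwise, if $f_j < 1$, then $f_j = 1/n$ with $n$ an integer value, and increasing the value means setting $f_j^* = 1/(n - 1)$.
Step 2: If $r_j > 1$, decrease the value of $f_j$ as in Step 1 to obtain the value $f_j^*$. If $r_j < 1$, increase the value of $f_j$ as in Step 1 to obtain the value $f_j^*$. If the total cost decreases, go to Step 2; else go to Step 3.
Step 3: Stop. The policy is the one with the lowest cost found.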
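The moves used in the steps above can be illustrated with a short Python sketch. The increase rule follows the pseudo-code; the decrease rule is assumed here to be its mirror image, and the ratio is assumed to be the retailer's ordering cost divided by its cycle holding cost, since the original expression is not available in this text. The names `k_j`, `h_j`, `d_j`, and `t0` are illustrative.

```python
from fractions import Fraction


def increase_frequency(f: Fraction) -> Fraction:
    """Move the coordination constant one step up: 1/3 -> 1/2 -> 1 -> 2 -> 3."""
    if f >= 1:
        return f + 1
    n = 1 / f              # f = 1/n with n an integer >= 2
    return 1 / (n - 1)


def decrease_frequency(f: Fraction) -> Fraction:
    """Assumed mirror image of the increase rule: 3 -> 2 -> 1 -> 1/2 -> 1/3."""
    if f > 1:
        return f - 1
    n = 1 / f
    return 1 / (n + 1)


def cost_ratio(k_j: float, h_j: float, d_j: float, f_j: Fraction, t0: float) -> float:
    """Assumed ratio of ordering cost to holding cost for one retailer."""
    ordering = k_j * f_j / t0
    holding = h_j * d_j * t0 / (2 * f_j)
    return float(ordering / holding)
```

Exact fractions keep values such as 1/3 free of rounding error, so repeated increases and decreases return to the same coordination constant.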

Including variability
In the above context, including demand uncertainty requires modifying the total cost function. To preserve the coordination between the warehouse and each retailer, a periodic review policy of type $(R, S)$ (Liu & Song, 2011) is assumed. This means that inventory is reviewed every $R$ time units and the required units are ordered to reach an inventory of $S$ units, which are delivered after $L$ time units. For this particular case, due to the problem definition, the lead time is zero ($L = 0$). Likewise, it is assumed that demand follows a normal probability density function with known parameters. Research indicates that the normal approximation is robust when the coefficient of variation is lower than 0.5 (Tyworth, 2000; J. E. Tyworth & O'Neill, 1997).

The optimum service level is calculated with the expression used in newsvendor model policies (Liberopoulos, Tsikis, & Delikouras, 2010):

$$\alpha_0^* = \frac{p_0}{p_0 + h_0} \qquad (4)$$

$$\alpha_j^* = \frac{p_j}{p_j + h_j} \qquad (5)$$

where
$\alpha_0^*$ = optimum service level for the warehouse;
$\alpha_j^*$ = optimum service level for retailer $j$;
$p_0$ = penalty (backorder) cost for the warehouse;
$p_j$ = penalty (backorder) cost for retailer $j$;
$h_0$ = holding cost for the warehouse;
$h_j$ = holding cost for retailer $j$.

In addition, we define the penalty costs as

$$p_0 = \lambda h_0 \qquad (6)$$

$$p_j = \lambda h_j \qquad (7)$$

where $\lambda$ is the penalty ratio. Substituting Eq. (6) into Eq. (4) gives

$$\alpha_0^* = \frac{\lambda}{1 + \lambda} \qquad (8)$$

Similarly, substituting Eq. (7) into Eq. (5) gives the same expression as Eq. (8).

Finally, two new components of cost must be considered: the holding cost of the safety stock and the penalty cost. If demands are assumed to be normally distributed, the cost of the safety stock and the shortage cost can each be calculated from the service level and the standard deviation of demand, and adding them to the deterministic cost gives the new total cost function. The first term of this function represents the fixed costs of ordering for both the warehouse and the retailers; since this cost is a function of order frequency, it does not change with respect to the deterministic model. The second term represents the holding costs for the retailers and the warehouse. At the warehouse, these costs are a function of frequency because they are incurred only if the frequency is greater than 1. The remaining terms represent the cost of holding safety stock and the penalty costs, for both the warehouse and each of the retailers.
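The service-level calculation and the two new cost components can be sketched as follows. The service level uses the newsvendor ratio with a penalty ratio `lam`, which matches the values reported later (a ratio of 19 gives 95%). The safety stock and expected backorder formulas are the standard textbook expressions for normal demand with zero lead time and independent demand across time units; they are assumptions made for illustration, not the paper's own equations.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def optimal_service_level(lam: float) -> float:
    """Newsvendor ratio p / (p + h) with penalty cost p = lam * h."""
    return lam / (1 + lam)


def safety_factor(alpha: float) -> float:
    """Standard normal quantile z such that P(Z <= z) = alpha."""
    return STD_NORMAL.inv_cdf(alpha)


def unit_loss(z: float) -> float:
    """Standard normal loss function E[(Z - z)+]."""
    return STD_NORMAL.pdf(z) - z * (1 - STD_NORMAL.cdf(z))


def safety_stock(z: float, sigma: float, review_period: float) -> float:
    """Assumed safety stock over one review period (zero lead time)."""
    return z * sigma * sqrt(review_period)


def expected_backorders(z: float, sigma: float, review_period: float) -> float:
    """Assumed expected units short per review period."""
    return sigma * sqrt(review_period) * unit_loss(z)
```

A higher penalty ratio raises the safety factor, so the safety stock grows while the expected backorders shrink, which is the trade-off the new cost terms capture.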
As the warehouse cycle time $t_0$ and the coordination constants $f_j$ are the decision variables on which this heuristic is based, given policy coordination, the cost function is expressed in terms of these variables.
To include this information about demand in the heuristic proposed by Abdul-Jalbar et al. (2010), a modification of the ratio indicator is also proposed so that it accounts for the costs associated with demand uncertainty. In particular, the expected number of backorders, backorder costs, and the corresponding deviation are included.
Thus, a new ratio $r_j$ is calculated for each retailer. Taking the derivative of the total cost in Eq. (15) with respect to the warehouse cycle time $t_0$ and setting it equal to zero yields the first-order condition for $t_0$. As expected, in the stochastic case, $t_0$ depends not only on the quantities that appear in the deterministic case but also on the service levels. Additionally, while in the deterministic approach it is possible to obtain a closed-form expression for $t_0$ through the first-order conditions of the cost function, in the stochastic approach, as in classic stochastic inventory models, it is not possible to obtain a formula for the value of $t_0$. To obtain such a value, we use the Generalized Reduced Gradient algorithm, taking as a starting point the optimal $t_0$ for the deterministic problem. As the new cost function is non-convex, there is no guarantee of reaching the globally optimal solution.
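To illustrate why a numerical search is needed, the sketch below minimizes a toy one-dimensional cost with a term proportional to $\sqrt{t_0}$, which prevents a closed-form solution. It uses a golden-section search as a dependency-free stand-in, not the Generalized Reduced Gradient algorithm used in the paper, and the function `toy_cost` with its constants `K`, `H`, and `c` is hypothetical. Golden-section search only finds the minimum of a unimodal function, which holds for this toy cost but is not guaranteed for the paper's non-convex cost function.

```python
from math import sqrt

INV_PHI = (sqrt(5) - 1) / 2  # inverse golden ratio, about 0.618


def golden_section_minimize(func, lo: float, hi: float, tol: float = 1e-8) -> float:
    """Minimize a unimodal function on [lo, hi] by golden-section search."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc < fd:                      # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = func(c)
        else:                            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = func(d)
    return (a + b) / 2


def toy_cost(t0: float, K: float = 300.0, H: float = 80.0, c: float = 40.0) -> float:
    """Hypothetical cost: ordering + cycle holding + a safety-stock-like term."""
    return K / t0 + H * t0 / 2 + c * sqrt(t0)


# Deterministic optimum (c = 0) used to bracket the search, as a starting point.
t_det = sqrt(2 * 300.0 / 80.0)
t_star = golden_section_minimize(toy_cost, 0.1 * t_det, 2.0 * t_det)
```

In this toy setting the uncertainty term pulls the minimizer below the deterministic cycle time; the direction of the shift in the paper's model depends on its full cost function.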

Numerical example
This section presents an illustrative example of the results obtained using the proposed methodology, based on a 5-retailer scenario. Detailed parameter information is provided below. In this instance we use a penalty ratio of $\lambda = 2.3333$, which produces a service level of 0.70. In addition, the coefficient of variation used is 0.1. Thus, the standard deviations were obtained from the expression $\sigma_j = CV \cdot d_j$, where $CV$ is the coefficient of variation and $d_j$ is the mean demand of retailer $j$.
The instance was solved using both the deterministic and stochastic approaches. Table 2 shows the results obtained via the deterministic heuristic. As can be seen from this table, the cycle time of the warehouse is 2.28 years, with retailers' frequencies varying from 0.2 to 4. This means that retailers 2 and 4 and the warehouse replenish every 2.28 years, retailer 1 replenishes every 4.56 years (2 × 2.28), retailer 3 replenishes every 11.4 years (5 × 2.28), and retailer 5 replenishes every 0.57 years (2.28/4).

Table 3 shows the results obtained using the stochastic version of the heuristic. One major change visible in this table is the warehouse cycle time, which increases from 2.28 to 2.89 years. Consequently, some retailers change their order frequency while others keep it constant, but every replenishment time changes because the warehouse cycle time is greater. For example, the frequencies of retailers 1 and 4 are kept constant, but they replenish every 5.78 years and 2.89 years respectively, which are longer intervals than in the deterministic case. Retailers 2 and 3 replenish faster because their frequency values $f_j$ increased and their replenishment times $t_j$ are lower. Finally, although the frequency value for retailer 5 increased, this is not sufficient for it to replenish faster given the new warehouse cycle time $t_0^*$; thus, retailer 5 replenishes every 0.58 years, slightly longer than in the deterministic case.

Once the two supply policies were defined, a Monte Carlo simulation model was run to evaluate the total cost function for both solutions. Each iteration simulates costs over twenty-four years (288 months), including backorder, holding, and fixed costs. Fig. 3 displays a comparison between the average costs at each iteration of the simulation for both methods. As can be seen from this figure, the solution provided by the probabilistic version of the heuristic consistently outperformed the deterministic solution.
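The replenishment intervals quoted for the deterministic policy follow directly from dividing the warehouse cycle time by each retailer's frequency. The short sketch below reproduces that arithmetic; the function name is illustrative, and the frequency of 0.5 for retailer 1 is inferred from the stated interval of 2 × 2.28 years.

```python
def replenishment_intervals(t0: float, frequencies: list[float]) -> list[float]:
    """Replenishment interval of each retailer: t_j = t0 / f_j."""
    return [t0 / f for f in frequencies]


# Deterministic policy of the example: warehouse cycle of 2.28 years,
# frequencies of retailers 1 to 5.
deterministic = replenishment_intervals(2.28, [0.5, 1, 0.2, 1, 4])
rounded = [round(t, 2) for t in deterministic]
```

The same function applied to the stochastic cycle time of 2.89 years gives 5.78 years for a frequency of 0.5 and 2.89 years for a frequency of 1, as reported for retailers 1 and 4.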

Computational results
Computational experiments were designed to test two hypotheses regarding coordinated inventory systems between one warehouse and $N$ retailers: i) deterministic solutions underestimate the total cost of the policy and may lead to inaccurate decisions in situations of demand variability; and ii) it is possible to improve the performance of these solutions by adjusting the objective function and the ratio indicator for each retailer. In total, 240 random instances were generated by modifying the following three factors in order to measure their impact: the number of retailers, from 5 to 100 in steps of 5; the coefficient of variation, testing values of 0.04, 0.07, and 0.1; and backorder costs, testing penalty values of 2.333, 4, 9, and 19. These lambda values are those that produce service levels of 70%, 80%, 90%, and 95%. It was additionally assumed that demand followed a normal distribution (Andersson & Marklund, 2000; Berling & Marklund, 2014).
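The size of the experimental design can be checked by enumerating the factor levels. The sketch below assumes one instance per combination of levels, which is consistent with the total of 240 but is not stated explicitly; the variable names are illustrative.

```python
from itertools import product

retailer_counts = range(5, 101, 5)     # 5, 10, ..., 100
cvs = (0.04, 0.07, 0.10)               # coefficients of variation
penalty_ratios = (2.333, 4, 9, 19)     # lambda values

# Full factorial design: every combination of the three factors.
design = list(product(retailer_counts, cvs, penalty_ratios))

# Service level implied by each penalty ratio: lambda / (1 + lambda).
service_levels = [round(lam / (1 + lam), 2) for lam in penalty_ratios]
```

The 20 retailer counts, 3 coefficients of variation, and 4 penalty ratios multiply to the 240 instances reported.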

Underestimation of cost with deterministic policies
Instances were first solved using the deterministic approach; a confidence interval for the total cost of the policy was subsequently calculated via Monte Carlo simulation. The number of iterations was determined dynamically, so that the width of the confidence interval did not exceed 5% of the average of the indicator (Framinan & Perez-Gonzalez, 2015). Figure 4 compares the estimated annual cost (of the deterministic policy) with the average cost obtained in the simulation for each number of retailers; in this figure, the X axis shows the penalty value and coefficient of variation of each scenario. For the 5-retailer instance, for example, with a penalty factor of 19 (service level of 95%) and high deviation (coefficient of variation of 0.1), the estimated cost of the deterministic policy is 6023 monetary units, whereas the average simulated cost is greater than 11000 monetary units.
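One way to read the dynamic stopping rule is sketched below: keep adding simulation runs until the width of the confidence interval is at most 5% of the running average. This is an interpretation, not the procedure of Framinan & Perez-Gonzalez (2015); the 95% normal quantile, the minimum number of runs, and the toy cost sampler are all assumptions for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_until_precise(sample_cost, rel_width: float = 0.05, z: float = 1.96,
                           min_runs: int = 30, max_runs: int = 100_000):
    """Add runs until the confidence interval width is <= rel_width * mean."""
    costs = [sample_cost() for _ in range(min_runs)]
    while True:
        avg = mean(costs)
        half_width = z * stdev(costs) / sqrt(len(costs))
        if 2 * half_width <= rel_width * abs(avg) or len(costs) >= max_runs:
            return avg, half_width, len(costs)
        costs.append(sample_cost())


# Toy stand-in for one simulated policy cost (hypothetical distribution).
rng = random.Random(42)
avg, half_width, runs = simulate_until_precise(lambda: rng.gauss(1000.0, 200.0))
```

Noisier cost samples require more runs before the interval is narrow enough, so the number of iterations adapts to each instance.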

Fig. 4. Estimated and simulated average costs for each instance
Variance analysis was performed by adding a factor named technique with two levels: estimation and simulation. Based on the ANOVA of this experiment (experiment 1), it was concluded that statistically significant differences (P-value = 0.00) were present for main effects, two-way interactions, and three-way interactions (except for the three-way interaction Retailers-CV-Service Level). Assumptions of normality (Kolmogorov-Smirnov test, P-value = 0.00), homogeneity of variances (Levene test, P-value = 1) and independence (Runs test, P-value = 0.99) were all tested. Because the normality condition was not met, the data were separated according to their deviation in order to perform further tests. The Bonferroni and LSD tests gave the same results: the difference in means was significant both for the number of retailers and for the backorder penalty values (related to service levels). It is important to note that, because the policy is deterministic, the estimated costs are constant within groups of retailers, whereas the simulated costs differ within groups of retailers due to backorders.

Fig. 5. Profile graphs of experiment 1
An ANOVA was then performed using as the response variable the percentage difference between the estimated costs and the costs obtained from the Monte Carlo simulation (experiment 2). Assumptions of normality (Kolmogorov-Smirnov test, P-value = 0.00), homogeneity of variances (Levene test, P-value = 0.7) and independence (Runs test, P-value = 0.94) were tested. Again, statistically significant differences (each with P-value = 0.00) were obtained for main effects and two-way interactions. Because the normality condition was again not met, the data were separated according to their deviation in order to perform further tests. The Bonferroni and LSD tests gave the same results. The difference in means was significant for the coefficient of variation in the pairs low-high and medium-high, but not in the pair low-medium. This indicates that costs are not significantly affected when the coefficient of variation changes from 0.04 to 0.07. Additionally, a homogeneous group of means was found for service levels of 70% and 80%; that is, there is no statistically significant difference in costs when the service level changes from 70% to 80%. This could be explained by the backorder penalty not being large enough to produce a significant difference in costs.
Having said that, the profile graphs (see Fig. 6) show that the underestimation of cost is higher for instances in which the optimal service level is larger. This means that the higher the backorder cost of an instance (compared to the holding cost), the more damaging it is to use the average demand to determine the inventory policy. On the other hand, underestimation is greater for high demand variability but not for a large number of retailers.

Impact of the proposed heuristic
In the second phase of analysis, the same instances were solved using the proposed probabilistic heuristic. To compare the results with those presented in Section 5.1, the solutions obtained via this approach were simulated using common random numbers. Additionally, an experiment was designed that included the method factor (deterministic, probabilistic), using as the response variable the total cost of each policy (experiment 3). According to the ANOVA, main effects, two-way interactions, and three-way interactions (except the interaction Retailers-Coefficient of variation-Service Level) were significant (each with P-value = 0.00). Assumptions of normality (Kolmogorov-Smirnov test, P-value = 0.003), homogeneity of variances (Levene test, P-value = 0.37) and independence (Runs test, P-value = 0.29) were tested. Because the normality condition was not met, the data were separated according to their deviation in order to perform further tests. The Bonferroni and LSD tests gave the same results. The difference in means was significant both for the number of retailers and for the backorder penalty values. Regarding the latter, every pair of service levels presents a significant difference except the pair 70% and 80%. This could be explained by the penalty ratios for these service levels being too low to produce significant differences in costs.
Analysis of the resulting profile graphs (Fig. 7) reveals that the difference between the total costs obtained via the two methods is greater with increasing demand variability and backorder costs (related to service levels), starting from a service level of 80%. In every case, the stochastic method produced lower total costs than the deterministic method, regardless of the level of deviation or backorder factor. Hence, it is possible to conclude that our proposed extension of the Abdul-Jalbar heuristic (Abdul-Jalbar et al., 2006) provides savings in the total cost function. Finally, an analysis of variance was performed using as the response variable the cost difference between the deterministic and stochastic methods. According to this ANOVA, main effects and two-way interactions were significant (each with P-value = 0.00). Assumptions of normality (Kolmogorov-Smirnov test, P-value = 0.03), homogeneity of variances (Levene test, P-value = 0.78) and independence (Runs test, P-value = 0.4) were tested. Because the normality condition was not met, the data were separated according to their deviation in order to perform further tests. The Bonferroni and LSD tests gave the same results. The difference in means was significant both for the coefficient of variation at the high level and for the backorder penalty values related to the highest service level.

Concluding remarks
The formulation of single-cycle policies, specifically for coordination scenarios between one warehouse and $N$ retailers, allows decisions that minimize the cost of the two echelons without the sacrifices caused by problem decomposition. However, the approaches proposed by other authors address cases in which demand is known with certainty, which makes these methods difficult to apply to real-life industry problems. The present paper proposes a cost function that considers the coordination parameters of an inventory system comprising a warehouse and $N$ retailers in a scenario of stochastic demand. Because of the nature of the decision variable, a model with periodic review inventory policies was taken as the basis.
To solve the stochastic problem, an extension of the Abdul-Jalbar heuristic that considers backorder costs in decision-making is proposed. A set of 240 random instances was generated and used to evaluate both the deterministic and stochastic models. Due to the stochastic nature of the objective function, evaluation was carried out via Monte Carlo simulation. The number of iterations of the simulation was calculated dynamically in order to ensure the desired confidence interval size for the performance measures. It is important to state that, as we use coefficients of variation lower than 0.5, the use of the normal distribution is justified (Tyworth and O'Neil, 1996; Tyworth and Ganeshan, 2000).
An analysis of the results reveals that using the average demand to define the inventory policy implies an underestimation of the total cost. This underestimation exists for all instance sizes and, as expected, becomes greater as demand variability and backorder costs increase. Consequently, the policy identified by the Abdul-Jalbar heuristic does not provide realistic solutions in settings with variable demand. Finally, when the simulations performed using both methods are compared, it can be concluded that the newly proposed method offers savings for the system in question.
Future work will be aimed at analyzing the impact of variability in more complex supply chains. Similarly, simulation-optimization approaches may be used for the construction of solutions, a possibility that implies the development of efficient solution methods for the deterministic problem.