Quality grading of returns and the dynamics of remanufacturing

We consider a hybrid manufacturing/remanufacturing system where the returned products (cores) are classified into different quality grades. Each grade requires different remanufacturing operations and thus lead times. We examine the implications of the quality-grading scheme on the dynamic behavior of closed-loop supply chains, benchmarking this against a typical system where all the returns undergo the same remanufacturing process. Through control engineering techniques, we evaluate the Bullwhip and inventory performance of the supply chain by observing the step response of the orders and net stocks (the shock lens), analyzing the frequency behavior of these signals (the filter lens), and measuring their dynamics due to stochastic demand (the variance lens). Subsequently, we discuss the operational savings and additional costs derived from quality grading. We find that the pre-sorting mechanism allows for smoothing the supply chain operations; however, its impact on customer satisfaction is ambivalent. Indeed, we observe that the documented ‘lead-time paradox’ of the remanufacturing process in hybrid systems results here in a ‘quality paradox’: lower quality returns may increase the performance of inventories. This particularly affects low-frequency demands. Importantly, we analytically derive the optimal setting of the closed-loop pipeline estimation in order-up-to policies for avoiding long-term inventory drifts. This analysis reveals key potential benefits of information transparency for improving the operational performance, and thus the environmental and economic sustainability, of closed-loop supply chains.

Publisher's page: https://doi.org/10.1016/j.ijpe.2021.108129. This version is being made available in accordance with publisher policies. See http://orca.cf.ac.uk/policies.html for usage policies. Copyright and moral rights for publications made available in ORCA are retained by the copyright holders.

Introduction
Accelerating the transition from linear to circular economic models has become a strategic priority for modern societies (Geng et al., 2019). This aims to reduce the environmental footprints of current consumption levels and lifestyle patterns by managing resources more sustainably. Such concerns have resulted in policymakers implementing ambitious directives that focus on closing the loop of product lifecycles; see e.g. European Commission (2015, 2017) and United Nations (2018). Addressing these concerns, and thus increasing circularity levels, necessitates the development of new production paradigms that incorporate the collection and recovery of used products, such as those based on remanufacturing (Guide and van Wassenhove, 2009; Abbey and Guide, 2018; Pazoki and Samarghandi, 2020).
Indeed, remanufacturing, which can be understood as "the transformation of used products (referred to as cores), consisting of components and parts, into products that satisfy exactly the same quality and other standards as new products" (Guide and Jayaraman, 2000, p. 3780), needs to be a fundamental pillar of circular economies. As traditional manufacturing can be very resource-demanding and wasteful, remanufacturing has great environmental value. For example, Giutini and Gaudette (2003) claim that remanufacturing generally requires only about 15% of the energy used to manufacture the same product. Given the weight of the manufacturing sector in developed and developing countries, it is reasonable to assert that disseminating remanufacturing practices would lead to huge environmental benefits; see Parker et al. (2015). However, the implementation and operation of remanufacturing systems in practice are often arduous and costly, thus slowing down the much-desired shift towards a circular economy. A major reason behind this is that the knowledge of the operational dynamics of remanufacturing systems, and therefore the understanding of their management, is still relatively limited, as repeatedly noted in the literature (e.g. Wang and Disney, 2016; Braz et al., 2018; Cannella et al., 2021). For instance, van Wassenhove (2019, p. 2930) highlights that "we stepped into this field relatively late so the impact of that work is still modest". Also, given the additional complexity involved in closed-loop settings, an important portion of the results and findings for remanufacturing systems included in the literature is built on simplistic assumptions that often do not hold in practice, which, in the words of Guide and van Wassenhove (2009, p. 17), may lead to "elegant solutions addressing non-existent problems".
All in all, it may be concluded that the set of tools available for managers to effectively and efficiently integrate circular economic practices into their production and distribution systems is often insufficient.
One of the key aspects that make remanufacturing systems particularly difficult to manage is the wide range of uncertainties that affect their processes (Goltsos et al., 2019a). While production planning mechanisms for traditional manufacturing need to primarily accommodate demand uncertainty, those for remanufacturing also need to account for uncertainty in returns. What is more, this source of uncertainty commonly becomes the dominant force in remanufacturing systems, with the channels for collecting cores being in general much more unpredictable than those of raw materials (Zeballos et al., 2012; Jeihoonian et al., 2017; Goltsos et al., 2019b). Overall, Seitz (2007) highlights that the uncertainties in the collection of used products are one of the main barriers to achieving profitability in this industry, which underscores the strategic importance of this issue.
Importantly, the uncertainty characterizing the returned items is twofold, affecting both their number and their condition (Atasu et al., 2008; Souza, 2013; Agrawal et al., 2015). While the former has a similar, quantitative nature to demand uncertainty (as evidenced by their grouping together through the concept of net demand, i.e. demand minus returns; e.g. Kelle and Silver, 1989), the qualitative essence of the latter requires new considerations. Such quality uncertainty has often been ignored in the relevant literature, mainly due to the complexity of its analysis (Goltsos et al., 2019a). Nevertheless, it critically influences the performance of remanufacturing systems (van Wassenhove, 2019). Also, Abbey and Guide (2018, p. 375) underscore that discrepancies in the condition of cores have "significant implications for the nature of product design and type of strategy a firm should employ to meet customer demands".
In practice, quality uncertainty generally manifests itself through an increased variability in the time and cost required to process the cores. By way of example, Denizel and Ferguson (2010) note that the time required for restoring a used laptop could be up to three times higher when the item is in a bad condition than when it is in a good one. In light of this, quality grading, which refers to the categorization of returns into a finite number of quality grades, emerges as a reasonable, industrially prevalent solution for managing cores with different qualities (Ferguson et al., 2009; Zikopoulos, 2017; Sun et al., 2018). Interestingly, quality grading allows for processing the cores according to their condition, which prevents the quality uncertainty from translating into high process uncertainty that would be detrimental to the efficiency of the remanufacturing system (Goltsos et al., 2019a). Note that cores with different quality grades may be processed in different lines or cells with the aim of better managing the time and cost of remanufacturing (van Wassenhove and Zikopoulos, 2010).
Several studies demonstrate that quality-grading policies enable production and operations managers to reduce remanufacturing costs, including Aras et al. (2004), Behret and Korugan (2009), Ferguson et al. (2009), and Yanıkoglu and Denizel (2020). In addition, other publications have provided a greater understanding of the quality-grading effects in remanufacturing systems from other perspectives, such as network design and configuration (e.g. Radhi and Zhang, 2016; Jeihoonian et al., 2017; Masoudipour et al., 2017) or optimal lot sizing and scheduling (e.g. Panagiotidou et al., 2017; Zikopoulos, 2017; Sun et al., 2018). However, many important questions remain unexplored in the literature that addresses the value of quality grading in remanufacturing. A fundamental one concerns the understanding of the dynamics induced by quality-grading policies in remanufacturing systems and their closed-loop supply chains. This entails considering the efficiency of the system in the satisfaction of customers as well as the magnitude of the Bullwhip Effect in the wider supply chain, which is symptomatic of production and transportation inefficiencies, with the goal of recognizing the different implications of quality-grading policies.
Motivated by the different considerations so far, our aim is to explore the impact of quality-grading policies on the overall performance of remanufacturing systems. We do this by simultaneously looking at the smoothness of operation in the system and its capacity to satisfy customer demand in a cost-effective manner. Specifically, we investigate a hybrid manufacturing/remanufacturing system (HMRS) that receives cores with heterogeneous quality levels. We consider that the system installs a quality-grading mechanism that categorizes the returns into three quality grades (i.e. high quality, low quality, and beyond economical repair) as soon as they arrive at the production facilities. Later, high- and low-quality returns are restored in separate lines, each requiring a different processing lead time. We use as a benchmark a HMRS without quality grading, where all the products collected from the market pass through the same remanufacturing line.
It is important to highlight that our study focuses on a HMRS, in which both new and remanufactured products are used to satisfy the same customer demand. Both products are assumed to meet the same quality standards and have the same price (e.g. van der Laan et al., 1999b; Tang and Naim, 2004; Sarkar et al., 2019). HMRSs are a popular option in practice when the manufactured and remanufactured products are perfect substitutes, as in the spare parts industry (Souza, 2013) or the printing industry (van der Laan et al., 1999b). The study of other characteristics and/or topologies of remanufacturing systems, for instance those in which the new and remanufactured products are sold to different markets with different prices (see Debo et al., 2005; Goltsos et al., 2019a; Ullah and Sarkar, 2020), would also be of interest but is beyond the scope of this paper.
The understanding of the dynamic consequences of quality grading in HMRSs would allow for a better design of such systems to jointly reduce operating costs (by smoothing the supply chain operation) and increase the system throughput (by enhancing product availability). Such a comprehensive approach requires a systems methodology. In this sense, the closed-loop supply chain under consideration is modelled through control engineering techniques, a viable approach for analyzing the global behavior of supply chains that is well aligned with our research goals (see Dejonckheere et al., 2003; Spiegler et al., 2016; Ponte et al., 2019). In light of this, we analyze the response of the system in the time and frequency domains in a bid to obtain a deep understanding of its performance in a large spectrum of real-world contexts. Moreover, the control-theoretic analysis allows us to derive the optimal control of the HMRS.
Analyzing the value of quality grading from a supply chain dynamics viewpoint allows the deduction of relevant managerial implications. We observe that quality-grading policies enable an improvement in the operational performance of the HMRS, which strongly depends on: (i) the ratio between high- and low-quality returns; and (ii) the difference between the remanufacturing lead times. We reveal that information transparency also plays a key role in this problem. Hence, incurring additional inspection-related and layout costs induced by the quality-grading mechanism may, or may not, be worthwhile depending on the interplays between the relevant factors. Interestingly, we also show that the documented lead-time paradox of HMRSs, which we describe in the next section, results in a quality paradox that deteriorates the inventory performance of the system for high-quality returns in specific circumstances. In this sense, we find that increasing the quality of returns may have a detrimental impact on the supply chain capacity for efficiently satisfying customer needs, a counterintuitive effect that managers need to be aware of.
The remainder of the paper is organized as follows. Section 2 reviews the relevant literature for the purposes of our study. Section 3 details the model of the HMRS by presenting the main assumptions, the sequence of events, the adopted order policy, the block diagrams, and the transfer functions. Section 4 reports the stability and static gain analysis, and derives the optimal regulation for the order policy through the estimate of the work-in-progress (WIP) pipeline lead time. Section 5 discusses in detail the dynamics and performance of the HMRS with quality grading by adopting a three-lens analysis. Section 6 provides an overview of the key findings and the most relevant managerial implications of our work. Finally, Section 7 concludes and reflects on next steps derived from our study.

Review of the literature
This paper contributes to advancing the understanding of two main bodies of knowledge in the sustainable operations management literature. First, we add to the discipline that examines the dynamic behavior of closed-loop supply chains by investigating the effects of quality grading on the overall performance of such systems, which has not been considered by prior works. Nonetheless, some of these works have considered heterogeneity in the quality of cores, leading to relevant insights. From this perspective, in this section we first introduce this discipline and discuss the main findings of relevant papers, and then we refer specifically to those papers that analyze returns with different conditions. Second, our paper brings a new perspective to the studies that address the value of quality grading in remanufacturing by exploring how presorting the cores allows for the enhancement of the dynamics of the wider supply chain. In this sense, the last part of this section discusses the key findings, identifies the current gaps in this field, and presents our contribution.

The dynamics of circular economy supply chains
Transitioning our economy towards environmental sustainability prompts a change in the operation of supply chains. This goes from linear processes, strictly associated with forward materials flows (extract, make, use, dispose), to closed-loop variants that reflect the principles of circular economy by emphasizing the reverse materials flows (Guide and van Wassenhove, 2006; Ferguson and Souza, 2010; Genovese et al., 2017). Understanding the behavior of closed-loop supply chains and designing business structures that effectively integrate both flows of materials are fundamental catalysts for such a transition. Otherwise, companies may never be willing to incorporate reverse logistics operations into their supply chains in markets where competition is intense, unless forced to do so by legislation.
Accordingly, closed-loop supply chains have increasingly attracted the interest of many researchers, who explore their environmental and economic opportunities and challenges from different perspectives; see, for example, the review by Govindan et al. (2015). However, a consolidated area of research in traditional production and distribution systems, often labelled as supply chain dynamics, has still received relatively little attention in closed-loop settings, as discussed by Wang and Disney (2016), Braz et al. (2018), and Goltsos et al. (2019a). This discipline analyzes the time-varying behaviors that emerge from the interactions between the nodes of the supply chain, and their impact on internal (i.e. production efficiency) and external (i.e. customer service level) performance. This area of research is commonly investigated via analysis of the Bullwhip Effect (Lee et al., 1997), the phenomenon of amplification of the variability of orders and inventories in supply chains. The Bullwhip Effect is known to have strong economic implications on production, transportation, and inventory costs, and may strongly affect the satisfaction of customers (Metters, 1997; Disney and Lambrecht, 2008; Isaksson and Seifert, 2016).
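To make the notion of order-variability amplification concrete, the following minimal Python sketch computes a variance ratio that is commonly used as a Bullwhip indicator (variance of orders divided by variance of demand). The function names, the demand parameters, and the naive order-up-to rule are hypothetical choices made only for illustration; they are not the model analyzed in this paper.

```python
import random
import statistics


def bullwhip_ratio(orders, demand):
    """Variance of orders divided by variance of demand (>1 means amplification)."""
    return statistics.pvariance(orders) / statistics.pvariance(demand)


def naive_order_up_to(demand, lead_time):
    """Orders from an order-up-to rule whose target is (lead_time + 1) times the
    last observed demand. Illustrative policy only, not the paper's model."""
    orders = []
    previous = demand[0]
    for d in demand:
        # Replace what was sold, plus the change in the order-up-to level.
        orders.append(d + (lead_time + 1) * (d - previous))
        previous = d
    return orders


# Assumed i.i.d. normal demand; mean, spread, and seed are arbitrary.
rng = random.Random(0)
demand = [rng.gauss(100, 10) for _ in range(5000)]
orders = naive_order_up_to(demand, lead_time=2)
ratio = bullwhip_ratio(orders, demand)
print(f"Bullwhip ratio: {ratio:.2f}")
```

Under this deliberately reactive rule the ratio exceeds one, i.e. orders fluctuate more than demand, whereas passing demand through unchanged would give a ratio of exactly one.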
A paper by Tang and Naim (2004) is usually regarded as the first to explore the Bullwhip Effect in closed-loop supply chains. The researchers consider a HMRS and, interestingly, they observe that it can benefit from improved dynamics as compared to traditional, open-loop supply chains, especially if the information on the reverse materials flow is used to manage the forward flow. Specifically, they find that as the return rate increases, order variability generally decreases, which has also been noticed by later works, including Zhou and Disney (2006), Turrisi et al. (2013), Cannella et al. (2016), Dev et al. (2017), and Zhou et al. (2017). However, Hosoda et al. (2015) reveal that under certain scenarios HMRSs experience a higher order variability than traditional systems. Later, Hosoda and Disney (2018) and Ponte et al. (2019) show that the negative impact of the reverse flow on the Bullwhip Effect often occurs when the correlation between demand and returns is low, and hence the uncertainty on the quantity of cores is high. Recently, Ponte et al. (2020) show that order variability can be an increasing or decreasing function of the return rate, depending on the level of transparency in closed-loop supply chains.
The impact of closing the loop on inventory performance has also been investigated in the past, leading to what may be interpreted as contradicting findings. Zhou and Disney (2006) and Cannella et al. (2016) observe that increasing the return rate reduces inventory variability, thus helping to better manage the trade-off between service level and stock required; while Turrisi et al. (2013) and Ponte et al. (2020) find the opposite effect. The impact of the return volume on product availability thus seems to be very sensitive to the modelling assumptions. Interestingly, some works discover a lead-time paradox, according to which reducing remanufacturing times may have a negative effect on HMRSs. It was first reported by van der Laan et al. (1999a, p. 195), who observe that "a larger remanufacturing lead-time may sometimes result in a cost decrease", hence suggesting the existence of an optimal, non-zero, lead time. Similarly, Tang and Naim (2004) notice that long remanufacturing processes may help to improve inventory control. Hosoda and Disney (2018) show that the paradox tends to emerge when remanufacturing takes less time than manufacturing, given that in this case the orders issued to the manufacturing line cannot make the best use of the reverse flow information.

The dynamic effects of the variable quality of cores
As previously discussed, a relevant feature of most real-world closed-loop supply chains, as compared to traditional ones, is the "high degree of variability in the quality of the used products that serve as raw materials for the production process" (Guide and van Wassenhove, 2001, p. 144). That is, a shipment of used products received at the remanufacturer's facilities will typically include cores with significantly different conditions, with some requiring noticeably more effort (measured in terms of capacity, time, and/or cost) to bring them to the needed standards than others (Denizel and Ferguson, 2010). However, this important feature is clearly underexposed in the closed-loop supply chain dynamics literature.
Most prior studies assume a single remanufacturing lead time; e.g. Tang and Naim (2004), Zhou and Disney (2006), Turrisi et al. (2013), Cannella et al. (2016), and Ponte et al. (2019). This can be interpreted in two different ways. Firstly, these studies may implicitly assume that all returns are of the same condition, i.e. homogeneous quality, and thus require the same processing time. Alternatively, maybe more realistically in most practical contexts, they may consider that all cores are processed through the same remanufacturing process, even if they are collected in different conditions, i.e. heterogeneous quality. If this was the case in a practical scenario, the resultant remanufacturing lead time would be that for the "worst" quality cores. That is, the remanufacturing system needs to be able to deal with the cores with the lowest quality. This may be understood as a contextual interpretation of the biological Liebig's Law of the minimum, stating that performance is (often) not controlled by the total of resources, but by the most limiting one. One way or the other, these works do not actually consider the impact of accommodating and restoring cores with differing conditions in closed-loop supply chains.
In this sense, although prior literature offers some understanding of the dynamics of closed-loop supply chains, the relevant papers have barely taken into account the variable nature of the quality of returns. Only very few works have attempted to model heterogeneous quality conditions, which we consider next. Hosoda et al. (2015) and Hosoda and Disney (2018) consider this issue by modelling a random yield loss, which decreases the number of products in the remanufacturing line. In practical terms, this represents a binary classification of cores: some are assumed to be beyond economical repair (they exit the closed-loop system), and the rest are remanufactured. While this approach provides relevant insights (e.g. they observe that increasing the yield loss may have positive effects on system dynamics), all the remanufacturable cores are assumed to require the same, quality-independent, processing time. Thus, considering different quality levels and remanufacturing lead times emerges as a logical next step.
Taking a different approach, Zhou et al. (2017) study the quality issue by modelling three different types of restoring processes in the reverse flow of materials: refilling, remanufacturing, and refining. Cores undergo one of them depending on their condition, which results in three different levels of entrance in the forward flow of materials. They examine the dynamics of the supply chain by exploring various scenarios in which the three lead times (refilling, remanufacturing, and refining) are equal. They discover interesting multi-echelon effects in the supply chain, whereby what applies in one echelon does not always hold in the entire system. In contrast, we here consider the practical case of different processing time requirements within the same physical restoring process, remanufacturing, in a HMRS.
Recently, Dominguez et al. (2020) investigate the effects of variability in the remanufacturing lead time, which often occurs due to receiving returns in highly different conditions, on the Bullwhip Effect and inventory performance of closed-loop supply chains. They model a complex supply chain formed by a manufacturer, a remanufacturer, a distributor, and a retailer, where they find that remanufacturing lead-time variability dramatically contributes to aggravating the Bullwhip problem in such supply chains. Also, they demonstrate the benefits derived from information transparency to compensate for the negative dynamics induced by lead-time variability; however, exploring other, structural solutions for enhancing the behavior of the system when the quality of returns varies, such as those based on the quality grading of cores, is beyond the scope of their work.

The value of quality-grading policies for enhancing the dynamics of remanufacturing
As we have just discussed, previous works in the closed-loop supply chain dynamics literature have considered only to a certain extent the variable nature of the quality of the returned products. These works show that this variability has a significant impact on the performance of remanufacturing systems. However, they have not explored in detail how to appropriately accommodate and cope with this variability, and investigating these solutions becomes essential to facilitate the transition towards more sustainable production systems. In this regard, Goltsos et al. (2019a) claim that there are three main solutions to control uncertainty in the condition of cores: forecasting, incentivizing, and presorting. The first one is based on developing methods to estimate the quality of returns (see Goltsos et al., 2019b). Second, organizations may try to influence the quality of returns, for instance, by using buyback schemes or leasing (see Wei et al., 2015). This work is concerned with the third solution, which is based on categorizing the returned products in a finite number of quality grades.
While previous research efforts in the supply chain dynamics literature have not focused on the value of quality grading, studies in adjacent areas have provided insights that constitute relevant background for our work. The most closely related research works are those by Aras et al. (2004), Ferguson et al. (2009), and Yanıkoglu and Denizel (2020) whose main contributions are discussed below. Aras et al. (2004) formulate a model to investigate the conditions under which categorizing returns according to their quality results in cost savings in HMRSs (where manufactured and remanufactured products are perfect substitutes), assuming variability in the remanufacturing lead times. They attribute the cost savings to the prioritization of remanufacturing high-quality returns (as long as they are available). Importantly, they conclude that quality grading of returns is particularly effective when: (i) the difference in quality between cores is high; (ii) the ratio of the remanufacturing to the manufacturing cost is high; (iii) the ratio of the mean returns to the mean demand is high; and (iv) the demand rate is low (i.e. slow-moving products). Ferguson et al. (2009) study the remanufacturing operations of Pitney Bowes, a provider of hardware that facilitates mail management. Different from Aras et al.'s (2004) work, they explore the value of quality grading when both products are imperfect substitutes; therefore, remanufacturing necessitates a separate production plan. They suggest an optimal remanufacturing policy under deterministic demand and returns as well as capacity constraints by considering information about the quality of cores. Although the scenario they analyze is different, their results buttress the findings of Aras et al. (2004). Specifically, they report a net profit increase of 4% due to the use of quality grading (when compared to the no-grading system). 
Also, they identify two additional drivers of quality-grading policies in remanufacturing systems: (i) when the number of grades is, at most, five; and (ii) when the quality of cores is uniformly distributed across grades.
The previous studies assume that the categorization of cores in quality grades is perfect. In contrast, Yanıkoglu and Denizel (2020) examine the impact of presorting when inspection is not able to accurately categorize all the cores in the most appropriate grade. Using a robust optimization approach to tackle uncertainty, the authors observe that, while in general terms there is still economic value in the quality grading of cores, misclassifications provoke a clear reduction of this value. In a setting that is comparable to the one used by Ferguson et al. (2009), Yanıkoglu and Denizel (2020) report a mean increase in net profit of about 1% (compared to the no-grading system), which is significantly lower than the increase reported when the categorization is infallible.
These articles, as well as others that explore the issue of quality grading from different perspectives, such as Behret and Korugan (2009), Radhi and Zhang (2016) and Zikopoulos (2017), undoubtedly provide a managerially relevant understanding of the benefits of quality grading in remanufacturing systems. However, they have not evaluated so far the implications of quality grading from a supply chain dynamics perspective. In this sense, assessing the impact of quality grading on the stability of the supply chain operations through the lens of the Bullwhip Effect, which plays a pivotal role in many industries, enriches this body of knowledge.
At the same time, in line with previous considerations, we may argue that the dynamics of closed-loop supply chains are not yet well understood. Some important characteristics of such systems still need to be incorporated into the analysis, leading to valuable insights for the design of efficient remanufacturing systems in practice. One of these characteristics is the heterogeneity in the condition of the cores, resulting in variable processing times, which is commonly dealt with through quality-grading mechanisms. Given the critical importance of this problem in practice, this emerges as a meaningful gap in the literature, as noted by Braz et al. (2018) and Goltsos et al. (2019a). This gap will be addressed in this research work.

Supply chain model
The Automatic Pipeline Inventory and Order-Based Production Control System (APIOBPCS) archetype, developed by John et al. (1994), is a well-recognized framework for modelling the interrelationships between the flows of information and materials in traditional supply chains. This model, which employs control-theoretic techniques, has been widely used in the last two decades for investigating the dynamic behavior of supply chains, as it covers a wide variety of realistic supply chain implementations. A review of the applications of this archetype can be found in Lin et al. (2017). Tang and Naim (2004) extend this archetype to model a closed-loop supply chain based on a HMRS. To this end, they add a collection process, through which a portion of the sold products comes back to the supply chain after the consumption (or usage) time, and a remanufacturing process, which restores cores to an as-good-as-new state. This model is built on solid assumptions and has been well accepted in the academic field of closed-loop supply chain dynamics. Indeed, it has been used in many recent papers, such as Cannella et al. (2016), Zhou et al. (2017), and Ponte et al. (2019). For the sake of brevity, we do not discuss all the underlying assumptions of this supply chain model here. However, those interested in them can refer to Tang and Naim (2004) and John et al. (1994). The most important one, for the purposes of this research, is that all cores undergo the same remanufacturing process (with a predefined lead time), thus ignoring the fact that quality does vary in many real-world remanufacturing systems, as we discussed before.
In this paper, we extend this archetype to explore the opportunities afforded by inspecting the cores as they arrive and implementing a parallel remanufacturing structure for processing them according to their quality. This section details the HMRS under consideration. First, we describe the discrete-time sequence of events. Second, we explain how we have modelled the reverse flow of materials, including the quality-grading system. Third, we detail the inventory control mechanisms implemented for both the serviceable and the recoverable stocks. Finally, we represent the overall block diagram and derive the relevant transfer functions. Table 1 introduces the notation that we employ throughout this paper for the main variables and parameters of the HMRS. Following the common convention, we use lowercase letters for the variables in the time domain (e.g. x_t), and uppercase letters for the variables in the complex, Laplace domain (e.g. X(s)).

Sequence of events
The functioning of our closed-loop supply chain follows the sequence of events depicted in Fig. 1. It is important to note that this models the operation over time of a HMRS that manages the inventories according to periodic-review policies. These policies are common in practice (Dejonckheere et al., 2003), as they are generally less expensive to operate and easier to implement than continuous-review policies (Axsäter, 2003).
During each time period t, purchase orders are received from the customer and returns are collected. At the end of t, customer demand is met and, if necessary, a new backorder is created, which will be satisfied as soon as stock is available. In addition, the current state of the inventory is reviewed and future demand is forecasted. The inventory position includes both on-hand inventory (net stock) and on-order inventory (WIP), which in turn considers both the manufacturing and the remanufacturing processes. All this information is used at the beginning of the next period, t+1, to issue an order for manufacturing new products, and this process starts.
Moreover, at the beginning of t+1, the returns that have been collected during the preceding period are received by the remanufacturer and classified according to their condition into three grades: high-quality returns (HR), representing remanufacturable cores in a 'better' condition; low-quality returns (LR), those in a 'worse' condition; and beyond-economical-repair returns (BER), which cannot be remanufactured as this would entail excessive costs. Then, the remanufacturable returns, i.e. HR and LR cores, are pushed into the remanufacturing process, specifically into the line associated with their condition, while BER cores exit the HMRS for cannibalizing or recycling purposes, or are lost to landfill. Later in period t+1, the manufactured, new products and the remanufactured, as-good-as-new products are received, which increases the on-hand inventory that is available to serve the new demand of customers.

Modelling the quality-grading policy
As mentioned before, we assume that a fraction $\beta'$ of the sold products is collected after a consumption time $T_c$ and then classified into three quality grades. In line with previous studies (e.g. Behret and Korugan, 2009; van Wassenhove and Zikopoulos, 2010; Zeballos et al., 2012), using three quality grades may be a reasonable choice in practice. For example, ReCellular, a remanufacturer of cell phones studied by Souza et al. (2002), sorts the used phones at its grading station into three levels (superior, average, inferior), and the Pitney Bowes scenario explored by Ferguson et al. (2009) also suggests the differentiation of three grades after testing the returns (good, better, best). Note that, on the one hand, using only two grades (e.g. remanufacturable and non-remanufacturable) may be insufficient to significantly benefit from the quality-grading policy in terms of supply chain dynamics. On the other hand, employing many quality levels becomes rather expensive without generally yielding clear benefits; in particular, Ferguson et al. (2009) recommend not using more than five grades. Fig. 2(a) provides a conceptual representation of the quality-grading system that we have modelled in the HMRS under consideration. First, a fraction of the cores, defined by the average percentage $\gamma'_b$, is classified as BER; these cores are not remanufactured due to insufficient quality. The remaining ones are classified as HQ or LQ cores, whose average percentages are defined respectively by $\gamma'_h$ and $\gamma'_l$. Then, each of these two types of returns is processed in a different remanufacturing line, resulting in a parallel remanufacturing structure. By way of example, ReCellular has implemented a similar production system, as discussed by van Wassenhove and Zikopoulos (2010). The different condition of HQ and LQ cores results in different remanufacturing lead times: $T_{rh}$ for the line of HQ and $T_{rl}$ for the line of LQ. In our study, we assume that $T_{rh} < T_{rl}$. This is aligned with arguments put forward in the literature (e.g. Denizel and Ferguson, 2010; Masoudipour et al., 2017; Yanıkoglu and Denizel, 2020), and with industrial observations of remanufacturers in different sectors.
For the sake of convenience, in the mathematical analysis, we employ the average remanufacturable return yield $\beta \in [0, 1]$, which can be obtained as $\beta = \beta'(1 - \gamma'_b)$. Similarly, $\gamma_h$ and $\gamma_l$, with $\gamma_h, \gamma_l \in [0, 1]$ and $\gamma_h + \gamma_l = 1$, define the percentage of remanufacturable returns that can be classified as HQ and LQ, respectively. The mathematical representation, via a block diagram, of the reverse materials flow in the HMRS with the quality-grading system and the parallel remanufacturing lines can be seen in Fig. 2(b). Note that the first-order function $1/(1+Ts)$ represents an exponentially distributed lead time with average $T$; see e.g. Zhou et al. (2017).
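By way of illustration, the mapping from the raw grading percentages to the yields used in the analysis can be sketched as follows. This is a minimal sketch rather than part of the original model: it assumes that the three raw percentages sum to one, that the remanufacturable yield is the collected fraction net of BER cores, and that the HQ and LQ shares are renormalized over the remanufacturable cores. The function and argument names are hypothetical.

```python
def normalized_yields(beta_raw, gamma_h_raw, gamma_l_raw, gamma_b_raw):
    """Map raw grading proportions to (beta, gamma_h, gamma_l).

    Assumption (not stated explicitly in the text): the raw HQ, LQ and BER
    proportions are shares of the collected cores and therefore sum to one.
    """
    total = gamma_h_raw + gamma_l_raw + gamma_b_raw
    if abs(total - 1.0) > 1e-9:
        raise ValueError("raw grade proportions must sum to one")
    remanufacturable = gamma_h_raw + gamma_l_raw
    if remanufacturable <= 0.0:
        raise ValueError("at least some cores must be remanufacturable")
    # Share of sold products that returns in a remanufacturable condition.
    beta = beta_raw * remanufacturable
    # Shares of HQ and LQ cores among the remanufacturable returns.
    gamma_h = gamma_h_raw / remanufacturable
    gamma_l = gamma_l_raw / remanufacturable
    return beta, gamma_h, gamma_l


# Hypothetical example: every sold product is collected, 20% is BER.
beta, gamma_h, gamma_l = normalized_yields(1.0, 0.4, 0.4, 0.2)
print(beta, gamma_h, gamma_l)
```

By construction, the two normalized shares add up to one, which is the property used throughout the subsequent analysis.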

Inventory control policies
We consider that the HMRS utilizes a push system in the operation of the parallel remanufacturing lines. In this sense, the production process starts as soon as the cores are available and, once they have been remanufactured, the as-good-as-new products arrive to the serviceable inventory. The push policy is a common assumption in many other studies of inventory control for remanufacturing (see e.g. Inderfurth and van der Laan, 2001), as it is an easy way to implement the prioritization of remanufactured over new products to satisfy customer demand, which has both environmental and economic value. Hosoda and Disney (2018, p. 315) argue that "it is reasonable to assume that a remanufacturer is motivated to use the push policy in order to quickly recover any costs associated with collecting and processing returns and avoid the costs of holding returns as inventory".
Further, the manufacturing process operates according to a proportional order-up-to (POUT) model in the serviceable inventory. We select this policy as it is able to significantly outperform the widely used, conventional order-up-to model and is easy to implement in practice (Disney and Lambrecht, 2008; Zhou et al., 2017; Cannella et al., 2021). Tang and Naim (2004) consider three different types of closed-loop supply chain models, differing in their level of information sharing. We consider the most advanced one, which assumes the manufacturer has information on the remanufacturing WIP and uses it to order. In this sense, the manufacturing order is issued at the start of each period as the sum of three comparators, according to Equation (1). The first comparator considers the difference between the forecasted demand and the remanufactured cores (both HQ and LQ), representing a net demand. The second one considers the difference between the target and the actual net stock, and is regulated through a proportional controller with a time constant $T_i$, which represents the time to adjust the net stock. In line with previous works (see Lin et al., 2017), we consider a constant target net stock that symbolizes the safety stock, $tns_t = SS$. The third comparator applies the same rationale to the WIP; $T_w$ is the time constant of the controller that regulates the gap between the target and the actual WIP. The target WIP is obtained as the demand forecast multiplied by the estimated pipeline lead time, $T_p$, as per Equation (2). Note that $T_p$, defining the WIP policy, should consider both the manufacturing and remanufacturing processes, as the WIP uses the information of both the forward and the reverse pipelines (under the defined information sharing policy). Indeed, $T_p$ is a key decision parameter in the closed-loop system and will be explored later in this paper.
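The three comparators described above can be sketched in a few lines of Python. This is only an illustrative reading of the verbal description; the function and argument names are hypothetical and do not follow the notation of Table 1. The sketch keeps the rule linear, i.e. it does not forbid negative orders.

```python
def manufacturing_order(forecast, reman_output, net_stock, wip,
                        T_i, T_w, T_p, target_net_stock=0.0):
    """Proportional order-up-to rule as the sum of three comparators.

    forecast         -- demand forecast for the period
    reman_output     -- remanufactured cores completed (HQ plus LQ)
    net_stock        -- actual net stock of serviceable products
    wip              -- actual work in progress (forward plus reverse pipeline)
    T_i, T_w         -- time constants of the two proportional controllers
    T_p              -- estimated pipeline lead time
    target_net_stock -- constant safety stock target
    """
    net_demand = forecast - reman_output                 # first comparator
    stock_correction = (target_net_stock - net_stock) / T_i   # second comparator
    target_wip = T_p * forecast                          # forecast times lead time
    wip_correction = (target_wip - wip) / T_w            # third comparator
    return net_demand + stock_correction + wip_correction


# Hypothetical state: forecast of 100 units, 80 remanufactured units,
# a backlog of 8 units and a pipeline that is exactly on target.
print(manufacturing_order(100.0, 80.0, -8.0, 720.0, T_i=8.0, T_w=8.0, T_p=7.2))
```

When both the net stock and the WIP sit on their targets, the order collapses to the net demand, which is the intuition behind prioritizing remanufactured over new products.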

Block diagram representation and transfer functions
The block diagram representing the relationship between the different variables in the HMRS with quality grading and parallel remanufacturing processes is shown in Fig. 3. We use solid lines for the materials and information flows in the traditional forward operations, while the dashed lines refer to the additional reverse logistics ones. Importantly, the block diagram incorporates all available sources of information that define the dynamics of this closed-loop supply chain.
The modelling process via control-theoretic techniques informs the formulation of transfer functions, which express analytically the relationship between the relevant outputs and inputs in the Laplace domain.
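The key input triggering the dynamics of the HMRS is customer demand ($d_t$, $D(s)$). To analyze the response of the supply chain, consistently with previous works, we measure the order rate ($o_t$, $O(s)$) and the on-hand inventory ($ns_t$, $NS(s)$) as the key outputs. First, the relationship between the orders and demand is defined by

$$\frac{O(s)}{D(s)} = \frac{(T_m s + 1)\left[(T_a T_w + T_i T_w + T_i T_p)s + T_w\right]}{(T_a s + 1)(T_i T_m T_w s^2 + T_i T_w s + T_i T_m s + T_w)} - \beta\,\frac{(T_m s + 1)(a_{OD} s^2 + b_{OD} s + c_{OD})}{(T_c s + 1)(T_{rh} s + 1)(T_{rl} s + 1)(T_i T_m T_w s^2 + T_i T_w s + T_i T_m s + T_w)}, \qquad (3)$$

where $a_{OD} = \gamma_h T_i T_{rl}(T_w + T_{rh}) + \gamma_l T_i T_{rh}(T_w + T_{rl})$, $b_{OD} = \gamma_h(T_w T_{rl} + T_w T_i + T_i T_{rh}) + \gamma_l(T_w T_{rh} + T_w T_i + T_i T_{rl})$, and $c_{OD} = (\gamma_h + \gamma_l)T_w$. Second, the relationship between the net stock and demand is

$$\frac{NS(s)}{D(s)} = -\frac{T_i\left(T_a T_m T_w s^2 + T_m T_w s + T_a T_w s + T_a T_m s - T_p + T_m\right)}{(T_a s + 1)(T_i T_m T_w s^2 + T_i T_w s + T_i T_m s + T_w)} + \beta\,\frac{T_i\left(a_{NSD} s^2 + b_{NSD} s + c_{NSD}\right)}{(T_c s + 1)(T_{rh} s + 1)(T_{rl} s + 1)(T_i T_m T_w s^2 + T_i T_w s + T_i T_m s + T_w)}, \qquad (4)$$

where $a_{NSD} = T_m T_w(\gamma_h T_{rl} + \gamma_l T_{rh})$, $b_{NSD} = \gamma_h(T_m T_w + T_m T_{rl} - T_{rh} T_{rl}) + \gamma_l(T_m T_w + T_m T_{rh} - T_{rh} T_{rl})$, and $c_{NSD} = \gamma_h(T_m - T_{rh}) + \gamma_l(T_m - T_{rl})$.

In the following sections, we explore the behavior of the HMRS with quality grading via the analysis of the transfer functions given by Equations (3) and (4).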
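For readers who wish to explore these transfer functions numerically, the following Python sketch evaluates them at an arbitrary complex frequency. It is an illustrative reconstruction under stated assumptions: the coefficients of the reverse-flow numerators are derived from the model structure described in this section (first-order lags for all lead times, a push remanufacturing policy, and the POUT rule with full WIP information), and the function and argument names are hypothetical.

```python
def _loop(s, T_i, T_m, T_w):
    # Quadratic factor shared by the denominators of both transfer functions.
    return T_i * T_m * T_w * s**2 + T_i * (T_w + T_m) * s + T_w


def order_tf(s, T_a, T_i, T_w, T_m, T_c, T_rh, T_rl, T_p, beta, g_h):
    """O(s)/D(s) evaluated at the complex frequency s."""
    g_l = 1.0 - g_h
    a = g_h * T_i * T_rl * (T_w + T_rh) + g_l * T_i * T_rh * (T_w + T_rl)
    b = (g_h * (T_w * T_rl + T_w * T_i + T_i * T_rh)
         + g_l * (T_w * T_rh + T_w * T_i + T_i * T_rl))
    c = (g_h + g_l) * T_w
    loop = _loop(s, T_i, T_m, T_w)
    forward = ((T_m * s + 1) * ((T_a * T_w + T_i * T_w + T_i * T_p) * s + T_w)
               / ((T_a * s + 1) * loop))
    reverse = (beta * (T_m * s + 1) * (a * s**2 + b * s + c)
               / ((T_c * s + 1) * (T_rh * s + 1) * (T_rl * s + 1) * loop))
    return forward - reverse


def net_stock_tf(s, T_a, T_i, T_w, T_m, T_c, T_rh, T_rl, T_p, beta, g_h):
    """NS(s)/D(s) evaluated at the complex frequency s."""
    g_l = 1.0 - g_h
    # Assumed coefficients, derived from the model structure (see lead-in).
    a = T_m * T_w * (g_h * T_rl + g_l * T_rh)
    b = (g_h * (T_m * T_w + T_m * T_rl - T_rh * T_rl)
         + g_l * (T_m * T_w + T_m * T_rh - T_rh * T_rl))
    c = g_h * (T_m - T_rh) + g_l * (T_m - T_rl)
    loop = _loop(s, T_i, T_m, T_w)
    forward = (-T_i * (T_a * T_m * T_w * s**2
                       + (T_m * T_w + T_a * T_w + T_a * T_m) * s
                       - T_p + T_m)
               / ((T_a * s + 1) * loop))
    reverse = (beta * T_i * (a * s**2 + b * s + c)
               / ((T_c * s + 1) * (T_rh * s + 1) * (T_rl * s + 1) * loop))
    return forward + reverse


# Illustrative setting: all returns are LQ cores with T_rl = 7, and T_p is
# chosen so that the net stock shows no long-term offset.
params = dict(T_a=16.0, T_i=8.0, T_w=8.0, T_m=8.0, T_c=32.0,
              T_rh=1.0, T_rl=7.0, T_p=7.2, beta=0.8, g_h=0.0)
print(order_tf(1e-9, **params))          # static gain of the order rate
print(abs(order_tf(0.085j, **params)))   # amplification at 0.085 rad per period
```

Evaluating the functions close to $s = 0$ gives the static gains, while evaluating them on the imaginary axis gives the amplification of a sinusoidal demand of a given frequency.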

Stability and steady-state analysis
Verifying the stability condition is a prerequisite to the analysis of the time and frequency behavior of a control system. In this regard, the HMRS is stable as long as all the poles (i.e. the roots of the denominator polynomial of the relevant transfer functions) lie in the left half of the complex plane, that is, have negative real parts. The functions shown in Equations (3) and (4) reveal that the stability of the closed-loop supply chain depends on the parameters $T_a$, $T_c$, $T_i$, $T_m$, $T_{rl}$, $T_{rh}$, and $T_w$. Note that the only time parameter that does not determine the stability of the HMRS is $T_p$. Nor does stability depend on the yields $\beta$, $\gamma_h$, $\gamma_l$ or the safety stock $SS$.
Considering that $T_c$, $T_m$, $T_{rl}$, and $T_{rh}$ are physical, and hence nonnegative, lead times, it can be easily shown that the control system is stable under either of the following conditions: (i) $T_a, T_i, T_w > 0$; or (ii) $T_a, T_i > 0$, $T_w < -T_m$. Both follow from requiring $T_a > 0$ and that all the coefficients of the quadratic factor $T_i T_m T_w s^2 + T_i(T_w + T_m)s + T_w$ share the same sign. Pathway (ii) holds mathematically but, given that negative values of $T_w$ do not have a reasonable practical meaning, pathway (i) may be interpreted as the general stability criterion. Indeed, we can state that the HMRS model is stable for all the possible logical combinations of these parameters.
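The stability conditions can be checked numerically by computing the poles directly. The sketch below is illustrative only: it assumes that the denominator factors into first-order lags for $T_a$, $T_c$, $T_{rh}$ and $T_{rl}$ plus the quadratic factor $T_i T_m T_w s^2 + T_i(T_w + T_m)s + T_w$, and the function names are hypothetical.

```python
import cmath


def closed_loop_poles(T_a, T_c, T_i, T_m, T_rh, T_rl, T_w):
    """Return the poles implied by the factored denominator."""
    # Each first-order lag 1/(1 + T s) contributes a pole at -1/T
    # (a zero lead time contributes no pole).
    poles = [complex(-1.0 / T) for T in (T_a, T_c, T_rh, T_rl) if T != 0]
    a = T_i * T_m * T_w
    b = T_i * (T_w + T_m)
    c = T_w
    if a == 0:
        # Degenerate case (e.g. T_m = 0): the factor is first order.
        poles.append(complex(-c / b))
    else:
        root = cmath.sqrt(b * b - 4 * a * c)
        poles.append((-b + root) / (2 * a))
        poles.append((-b - root) / (2 * a))
    return poles


def is_stable(T_a, T_c, T_i, T_m, T_rh, T_rl, T_w):
    """True when every pole has a strictly negative real part."""
    return all(p.real < 0 for p in
               closed_loop_poles(T_a, T_c, T_i, T_m, T_rh, T_rl, T_w))


print(is_stable(16, 32, 8, 8, 1, 7, 8))    # pathway (i)
print(is_stable(16, 32, 8, 8, 1, 7, -10))  # pathway (ii): T_w < -T_m
print(is_stable(16, 32, 8, 8, 1, 7, -4))   # -T_m < T_w < 0
```

The last call illustrates the gap between the two pathways: a negative $T_w$ that is larger than $-T_m$ places the complex pair in the right half-plane.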
We now focus on the static (or DC) gain, which determines the output/input ratio of the system under steady-state conditions. For a given transfer function G(s), the static gain can be obtained by lim s→0 G(s). Table 2 shows the static gains of the transfer functions relating the two outputs to the demand input.
As can be expected, a long-term increase in the product demand of, say, 100 units results in a long-term increase of $100(1 - \beta)$ units in the orders for manufacturing new products, see Equation (5). That is, the HMRS does not need to increase the production order by the same magnitude as the demand, as happens in traditional supply chains. This occurs because a percentage (defined by $\beta$) of this demand increase will return to the HMRS after consumption and be remanufacturable, eventually turning into remanufactured, as-good-as-new products.
Fig. 3. Block diagram of the closed-loop supply chain model.
On the other hand, the impact of a change in demand on the long-term response of the serviceable inventory is of particular interest, as it allows us to appropriately adjust the WIP policy through the decision parameter $T_p$. Indeed, Equation (6) provides evidence that there is a potential offset of the steady-state value of the serviceable inventory when the HMRS faces demand variations. This potential offset deviates the actual inventory level from the target net stock, which results in a significant decrease in the inventory performance of supply chains; see Disney and Towill (2005). This offset can be avoided by setting the parameter $T_p$ through

$$T_p^* = (1 - \beta)T_m + \beta(\gamma_h T_{rh} + \gamma_l T_{rl}). \qquad (7)$$

Therefore, Equation (7) provides the value of the pipeline lead time $T_p^*$ needed to set the total target WIP while avoiding inventory offset. We underline that for the benchmark case of $T_{rh} = T_{rl}$ ($= T_r$), Equation (7) simplifies to $T_p^* = (1 - \beta)T_m + \beta T_r$, i.e. the one proposed by Tang and Naim (2004) for their HMRS without quality grading. In turn, this reduces to the well-known equation $T_p^* = T_m$ for open-loop supply chains, characterized by $\beta = 0$ (Lin et al., 2017). Looking at the relationship between Tang and Naim's (2004) solution and our Equation (7), we can define the equivalent remanufacturing lead time in HMRSs without a quality-grading policy as $T_r = \gamma_h T_{rh} + \gamma_l T_{rl}$.
Overall, Equation (7) represents an important result of this work. As highlighted by Zhou et al. (2017, p. 500), appropriately regulating the parameter $T_p$ provides the system with "the ability to ensure customer service levels through maintaining inventory at an appropriate level". From this perspective, Equation (7) highlights the importance of accurately estimating the relevant yield and lead times, which can be facilitated by information transparency mechanisms in the HMRS. Fig. 4 represents $T_p^*$ as a function of these relevant parameters in the HMRS, specifically, the average remanufacturable return yield ($\beta$) and the average HQ yield ($\gamma_h$), for two given scenarios of manufacturing and remanufacturing lead times ($T_m$, $T_{rh}$, $T_{rl}$). Notice in these graphs that, as per Equation (7), $T_p^*$ always varies between $T_m$ (for $\beta = 0$) and $T_r$ (for $\beta = 1$).
Understandably, since $\beta \in [0, 1]$, $T_p^*$ increases linearly with the manufacturing ($T_m$) and remanufacturing lead times ($T_{rh}$, $T_{rl}$). Because, as previously discussed, $T_{rh} < T_{rl}$, $T_p^*$ also increases with the ratio $\gamma_l/\gamma_h$. That is, the larger the percentage of HQ cores ($\gamma_h$) is, the lower the estimated pipeline lead time $T_p^*$ becomes. It is also interesting to note that if the equivalent remanufacturing lead time $T_r$ is higher than $T_m$, $T_p^*$ increases linearly with $\beta$, see Fig. 4(a); while if $T_r < T_m$, $T_p^*$ decreases linearly with $\beta$, see Fig. 4(b). Finally, if $T_r = T_m$, $T_p^*$ is insensitive to $\beta$, as Equation (7) results in $T_p^* = T_r = T_m$.
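The behavior of the offset-free pipeline lead time described above can be reproduced with a short Python sketch. It assumes the weighted-average form implied by the benchmark cases discussed in the text, i.e. $(1 - \beta)T_m + \beta T_r$ with $T_r = \gamma_h T_{rh} + \gamma_l T_{rl}$; the function name and the numerical values are illustrative.

```python
def optimal_pipeline_lead_time(T_m, T_rh, T_rl, beta, gamma_h):
    """Pipeline lead time that avoids a long-term net stock offset."""
    gamma_l = 1.0 - gamma_h
    # Equivalent remanufacturing lead time of the two parallel lines.
    T_r = gamma_h * T_rh + gamma_l * T_rl
    return (1.0 - beta) * T_m + beta * T_r


# Illustrative values: T_m = 8, T_rh = 1, T_rl = 7 and 80% remanufacturable returns.
for gamma_h in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(gamma_h, optimal_pipeline_lead_time(8.0, 1.0, 7.0, 0.8, gamma_h))

# Open-loop limit (no returns): the estimate collapses to T_m.
print(optimal_pipeline_lead_time(8.0, 1.0, 7.0, 0.0, 0.5))
```

The loop makes the monotonic effect of $\gamma_h$ visible: each additional share of HQ cores shortens the pipeline that the WIP target has to cover.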

The impact of quality grading on the production and inventory dynamics
We now focus on the performance of the HMRS by evaluating the dynamics of production orders and on-hand inventories. To this end, we employ the framework developed by Towill et al. (2007), which defines three complementary 'lenses' for looking at the Bullwhip behavior of supply chains. The shock lens considers the supply chain response to a unit step in demand, being the most appropriate for analyzing the behavior of the system against large, abrupt demand variations. The filter lens investigates the frequency response of the supply chain, which makes it especially suitable for understanding the effects of demands of different seasonal nature. It is also especially useful as every input (demand) signal can always be expressed as the sum of a series of waves with different frequencies and amplitudes. Finally, the variance lens explores the supply chain behavior when it faces stochastic demand by comparing the variability of the input and output signals. This approach thus fits well different types of stochastic, real-world demand patterns.

Table 2. Static gains of the relevant transfer functions in the closed-loop supply chain.
Transfer function | Static gain
$O(s)/D(s)$ | $1 - \beta$ (Equation (5))
$NS(s)/D(s)$ | $\frac{T_i}{T_w}\left[T_p - (1 - \beta)T_m - \beta(\gamma_h T_{rh} + \gamma_l T_{rl})\right]$ (Equation (6))
Note: As previously discussed, $\gamma_h + \gamma_l = 1$ (see Section 3.2).
To analyze the consequences of pre-sorting and processing cores according to their quality in HMRSs, we compare our supply chain model with the baseline scenario, where all the returns, regardless of their condition, are pushed into the same remanufacturing process. As per prior discussions (the longer remanufacturing line would be able to process HQ cores, but the shorter one would not be able to process LQ ones), it seems reasonable to assume that all the cores need to be processed in the longer (LQ) remanufacturing line.
In the dynamic analysis, we use the following combination of control parameters: $T_a = 16$, $T_i = 8$, $T_w = 8$. This can be interpreted as an effective 'trade-off setting' that considers the perspectives of order and inventory variability in traditional systems, see John et al. (1994). The regulation of control parameters in closed-loop settings is an interesting topic that requires detailed investigation in future works. We also assume that the system always uses the optimal value of the estimated lead time $T_p$ for avoiding the inventory offset, as per Equation (7). The effects of over- and underestimating $T_p$ are explored in Appendix A.
Also, we use here the following manufacturing and consumption lead times: T m = 8, T c = 32. As in practice remanufacturing may take less or more time than manufacturing (e.g. Zhou et al., 2017;Hosoda et al., 2018;Ponte et al., 2020), we examine three lead-time scenarios that represent different types of realistic settings. These scenarios, summarized in Table 3, allow us to investigate the impact of lead times in particular detail.
In scenario LT-I, the remanufacturing of cores takes less time than the manufacturing of new products. To illustrate it, we consider $T_{rh} = 1$, $T_{rl} = 7$. Based on industrial evidence, this may be interpreted as the most common case in the real world (e.g. Teunter et al., 2004). Scenario LT-II reflects the contrasting case, where we assume $T_{rh} = 9$, $T_{rl} = 15$. In practice, remanufacturing may take longer than manufacturing when disassembly is a particularly hard task, for example, when products are not designed to be remanufactured (see Hatcher et al., 2011), or when remanufacturing is much less automated than manufacturing (see Kurilova-Palisaitiene et al., 2018). Last, scenario LT-III replicates an industrial context where the remanufacturing of HQ cores takes less time than the manufacturing of new products, while the remanufacturing of LQ cores takes more time. In this case, we use $T_{rh} = 5$, $T_{rl} = 11$. Note that $T_{rl} - T_{rh} = 6$ in the three scenarios, which nullifies the effects of remanufacturing lead-time variations on the results. Also, note that in all cases the consumption lead time is the longest one, which typically happens in the real world.
Finally, in the following analysis, we examine different combinations of the average HQ and LQ yields, $\gamma_h$ and $\gamma_l$. To this end, we consider that 80% of the sold products return to the HMRS after consumption, i.e. $\beta = 0.8$. Imposing a high volume of returns allows us to observe in more detail the impact of quality grading.

Step response analysis, or the shock lens
First, we investigate the response of the order rate (of new products) and the net stock to a unit step in demand. This response, which offers rich information on the system dynamics, has been extensively shown to give key insights on the long-term behavior of supply chains, both in traditional open-loop settings (e.g. Dejonckheere et al., 2003) and in the emerging closed-loop ones (e.g. Zhou et al., 2017).
We consider five levels of the average HQ yield, $\gamma_h = \{0, 0.25, 0.5, 0.75, 1\}$; the average LQ yield is always $\gamma_l = 1 - \gamma_h$. Notice that the first and the last values of $\gamma_h$ reduce to systems where all the returns have the same quality: only LQ when $\gamma_h = 0$; only HQ when $\gamma_h = 1$. According to what we discussed before, $\gamma_h = 0$ defines the Liebig's Law-based, baseline system. Fig. 5(a), (c), and (e) show the step response of the manufacturing order rate in the three scenarios. Fig. 5(b), (d), and (f) display the same information for the net stock. Fig. 5(a), (c), and (e) provide evidence of how increasing the percentage of HQ returns improves the dynamics of the system. It becomes visible in the three scenarios through a significant reduction in the initial overshoot of the order rate (peak of the response), which is expected to smooth significantly the production requirements in the HMRS. This finding uncovers relevant considerations.
First, it clearly shows that processing all the returns in the same way (i.e. through the lead time required for LQ cores) misses the opportunity of enhancing the dynamics of the HMRS. This underscores the potential improvement derived from quality-grading policies from the perspective of Bullwhip-induced costs in closed-loop supply chains. This insight clearly applies to responses of the HMRS against abrupt demand changes, as shown in Fig. 5(a), (c), and 5(e); but we can also expect that quality grading will reduce the Bullwhip Effect of closed-loop supply chains for other types of demands, which will be explored in the following subsections. Logically, the potential improvement increases as the difference between both remanufacturing lead times grows. Nonetheless, it is important to underline that in order to realize this potential the HMRS needs to develop measures aimed at increasing the quality of the returns collected.
Interestingly, the analysis of the behavior of the net stock responses, in Fig. 5(b), (d), and (f), yields some counterintuitive results. Overall, we observe two contrasting dynamic effects of increasing the percentage of HQ cores. On the one hand (a negative effect), it seems to increase the size of the trough in the response. On the other hand (a positive effect), improving the average quality of cores smooths the net stock dynamics; in some cases, it even avoids the generation of a peak in the response. Note that this tends to reduce the settling time. From inspection of the graphs, we observe that the negative effect dominates when remanufacturing lead times are shorter than manufacturing lead times, see Fig. 5(b); while the positive effect becomes more relevant in the opposite case, as shown by Fig. 5(d).
This implies that improving the quality of returns may have undesirable consequences, in particular, a negative impact on the inventory performance of HMRSs. In line with the previous discussion, this tends to occur when remanufacturing returned products takes less time than manufacturing new ones. This interesting phenomenon, which may be termed as a 'quality paradox', stems from the previously described paradox of remanufacturing lead times in HMRSs (e.g. Inderfurth and van der Laan, 2001). Hosoda and Disney (2018, p. 322-323) state that "when the remanufacturing lead time is less than the manufacturing lead time […], the lead-time paradox can emerge". This explains why improving the quality of cores does not have a positive impact on the dynamics of the inventory in scenario LT-I. At the same time, Hosoda and Disney (2018, p. 322) underline that "once the relationship of T r ≥ T m is established, such negative effects [derived from the paradox] simply vanish". This perspective fits with our analysis for scenario LT-II. The quality paradox will be explored in the frequency and variance domains in the following subsections.
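The step responses discussed in this subsection can be approximated with a simple time-domain simulation. The following Python sketch is not the original implementation: it integrates the continuous-time model with a forward-Euler scheme, treats every lead time as a first-order lag, assumes a zero safety stock, and sets $T_p$ to the offset-free value. All variable names are hypothetical.

```python
def step_response(T_rh, T_rl, gamma_h, beta=0.8, T_a=16.0, T_i=8.0, T_w=8.0,
                  T_m=8.0, T_c=32.0, horizon=600.0, dt=0.02):
    """Order rate and net stock after a unit step in demand (Euler scheme)."""
    gamma_l = 1.0 - gamma_h
    T_p = (1.0 - beta) * T_m + beta * (gamma_h * T_rh + gamma_l * T_rl)
    # States: forecast, returns after consumption, output of the HQ and LQ
    # lines, manufacturing completions, net stock and total WIP.
    f = u = yh = yl = cm = ns = wip = 0.0
    d = 1.0  # unit step in demand
    orders, stocks = [], []
    for _ in range(int(round(horizon / dt))):
        # Order rule: net demand plus net stock and WIP corrections.
        o = (f - yh - yl) + (0.0 - ns) / T_i + (T_p * f - wip) / T_w
        orders.append(o)
        stocks.append(ns)
        # Simultaneous update so that every derivative uses the old state.
        f, u, yh, yl, cm, ns, wip = (
            f + dt * (d - f) / T_a,
            u + dt * (beta * d - u) / T_c,
            yh + dt * (gamma_h * u - yh) / T_rh,
            yl + dt * (gamma_l * u - yl) / T_rl,
            cm + dt * (o - cm) / T_m,
            ns + dt * (cm + yh + yl - d),
            wip + dt * ((o - cm) + (u - yh - yl)),
        )
    return orders, stocks


# Scenario with T_rh = 1 and T_rl = 7: compare the baseline with the HQ-only case.
for gamma_h in (0.0, 1.0):
    orders, stocks = step_response(1.0, 7.0, gamma_h)
    print(gamma_h, max(orders), min(stocks), orders[-1], stocks[-1])
```

The printed peak of the order rate and trough of the net stock correspond to the overshoot and trough discussed above, while the final values allow checking that the order rate settles at $1 - \beta$ and that the net stock returns to its target.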

Table 3. The three lead-time scenarios for our three-lens analysis.
Scenario | Definition
LT-I | $T_{rh} < T_{rl} < T_m < T_c$
LT-II | $T_m < T_{rh} < T_{rl} < T_c$
LT-III | $T_{rh} < T_m < T_{rl} < T_c$

Fig. 5. Step response of the order rate and the serviceable inventory for different returns qualities.
B. Ponte et al.

Frequency response analysis, or the filter lens
We now look at the frequency behavior of the closed-loop supply chain in the three lead-time scenarios. The frequency response is shown via Bode plots in Fig. 6(a), (c), and (e), for the order rate, and Fig. 6(b), (d), and (f), for the net stock. These plots represent the system's amplification of the input demand signal for a wide range of frequencies.
For this reason, Bode plots have been previously used to investigate the performance of supply chains facing seasonal demands; see e.g. Naim et al. (2017). Fig. 6(a), (c), and (e) provide a new perspective on the benefits derived from quality grading in terms of order smoothing in HMRSs. The three scenarios illustrate that increasing the quality of the returns decreases the size of the peak in the Bode plot, as well as the amplification for frequencies around it. Even though the improvement may seem relatively small in the Bode plot, this translates into a significant smoothing of supply chain dynamics. For example, the peak in the baseline system ($\gamma_l = 1$) in scenario LT-I is 2.15 dB; see Fig. 6(a). This corresponds to an amplification of the demand signal of approx. 28%. This amplification (i.e. Bullwhip Effect) can be significantly reduced, through the quality-grading system, down to approx. 5% (0.39 dB). The peak occurs for a frequency of approx. 0.085 rad/s, where s denotes the reference time unit; hence, the highest amplification occurs for a seasonality of approx. 74 periods ($2\pi/0.085 \approx 74$) in our HMRS.
Interestingly, quality grading can even eliminate the Bullwhip Effect in the closed-loop supply chain. For instance, for a seasonality of 30 periods (monthly, if periods are interpreted as days; see frequency = 0.209 rad/s in Fig. 6(e)), the HMRS suffers from an amplified variability of orders in scenario LT-III, i.e. amplification > 0 dB (Bullwhip > 1). However, by implementing a quality-grading mechanism, the amplification can be lowered below 0 dB, so that the demand variability is attenuated by the HMRS (Bullwhip < 1). Finally, we note that for very low or very high frequencies quality grading does not appreciably affect the dynamics of the HMRS.
Fig. 6(b), (d), and (f) offer a new way to observe the quality paradox in the inventory performance of HMRSs. For low frequencies (whose range depends on the relationship between the manufacturing and remanufacturing lead times), it can be seen that the no-grading baseline system outperforms the quality-grading HMRS in which all returns are HQ cores. Indeed, in view of these graphs, we cannot easily observe any significant positive impact of increasing the quality of returns in terms of inventory variability.
Interestingly, the paradox can be seen from a frequency viewpoint in the three scenarios. Nonetheless, it becomes especially meaningful in scenario LT-II, where the baseline HMRS may achieve a significantly better trade-off between service level and holding requirements. Having noted that, we underline that the paradox is only noticeable when the demand seasonality is characterized by long cycles. For example, it would emerge in daily demands with a strong seasonal component with a period of one year (365 days; freq. = 0.017 rad/day); but it would not emerge if the weekly pattern of the seasonality dominates (7 days; freq. = 0.898 rad/day).
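The conversions used in this subsection, between decibels and amplification and between seasonality periods and angular frequencies, are simple enough to sketch in Python. The helper names below are hypothetical; the formulas are the standard relations $A_{dB} = 20\log_{10}|A|$ and $\omega = 2\pi/P$.

```python
import math


def db_to_gain(a_db):
    """Amplification in absolute value for a Bode magnitude in decibels."""
    return 10.0 ** (a_db / 20.0)


def gain_to_db(gain):
    """Bode magnitude in decibels for an amplification in absolute value."""
    return 20.0 * math.log10(gain)


def period_to_omega(period):
    """Angular frequency (rad per time unit) of a cycle of the given length."""
    return 2.0 * math.pi / period


def omega_to_period(omega):
    """Length of the cycle (in time units) for a given angular frequency."""
    return 2.0 * math.pi / omega


# Peaks quoted for the first scenario, and the seasonalities used as examples.
print(db_to_gain(2.15), db_to_gain(0.39))
print(omega_to_period(0.085))
print(period_to_omega(30), period_to_omega(365), period_to_omega(7))
```

An amplification above one (a positive value in decibels) indicates that the demand variability is amplified at that frequency, whereas a value below one indicates attenuation.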

Stochastic analysis, or the variance lens
Last, we assume that the demand is an independent and identically distributed (i.i.d.) random variable that follows a normal distribution N(μ, σ 2 ). This allows us to observe in detail how the HMRS with quality grading reacts to purely stochastic, uncorrelated demands. For instance, we use a mean of μ = 100 and a standard deviation of σ = 30, which generates a reasonable coefficient of variation of customer demand equal to 30% (see Dejonckheere et al., 2003). We simulate the response of the HMRS over a time horizon of 20,000,000 periods. We select such a large number in order to achieve a robust result, and also due to the low experimental effort of running these simulations in MATLAB.
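A reduced version of this kind of simulation experiment can be sketched in Python. The sketch is not the MATLAB implementation used in the study: it uses a one-period Euler discretization of the first-order lags, a much shorter horizon, and a fixed random seed, so its figures are not expected to reproduce Table 4. Because the model is linear, it simulates deviations from the mean demand, so that $\mu$ plays no role in the variances. All names are hypothetical.

```python
import random


def _variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def variance_ratios(T_rh, T_rl, gamma_h, periods=50_000, seed=1, beta=0.8,
                    T_a=16.0, T_i=8.0, T_w=8.0, T_m=8.0, T_c=32.0, sigma=30.0):
    """Return (order variance / demand variance, net stock variance / demand variance)."""
    rng = random.Random(seed)
    gamma_l = 1.0 - gamma_h
    T_p = (1.0 - beta) * T_m + beta * (gamma_h * T_rh + gamma_l * T_rl)
    # All states are deviations from their steady-state values.
    f = u = yh = yl = cm = ns = wip = 0.0
    demand, orders, stocks = [], [], []
    for _ in range(periods):
        d = rng.gauss(0.0, sigma)  # i.i.d. normal demand deviation
        o = (f - yh - yl) - ns / T_i + (T_p * f - wip) / T_w
        demand.append(d)
        orders.append(o)
        stocks.append(ns)
        # Simultaneous one-period update of all states.
        f, u, yh, yl, cm, ns, wip = (
            f + (d - f) / T_a,
            u + (beta * d - u) / T_c,
            yh + (gamma_h * u - yh) / T_rh,
            yl + (gamma_l * u - yl) / T_rl,
            cm + (o - cm) / T_m,
            ns + cm + yh + yl - d,
            wip + (o - cm) + (u - yh - yl),
        )
    var_d = _variance(demand)
    return _variance(orders) / var_d, _variance(stocks) / var_d


# Scenario with T_rh = 1 and T_rl = 7: baseline versus HQ-only returns.
print(variance_ratios(1.0, 7.0, 0.0))
print(variance_ratios(1.0, 7.0, 1.0))
```

The first element of each pair is the usual Bullwhip ratio, and the second is the corresponding net stock amplification ratio; comparing the two calls gives a quick impression of how the yields affect both metrics under uncorrelated demand.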
The main results of this simulation study are presented in Table 4. This shows the variance of the production rate and net stock in the same three lead-time scenarios and for the five different combinations of the average HQ and LQ yields, γ h and γ l .
Overall, Table 4 reveals that, for i.i.d. demands, the quality-grading policy for the returns has a significant effect on the variability of the manufacturing orders. Specifically, the variance of the orders can be reduced by 37.01% in scenario LT-I, 25.40% in scenario LT-II, and 29.57% in scenario LT-III. This decrease in the volatility of production requirements can be expected to have a positive economic impact on the HMRS through a reduction in different sources of costs, such as those related to extra capacity or idle time and overtime working; see e.g. Disney and Lambrecht (2008) and Ponte et al. (2017). It can be highlighted that the highest Bullwhip reduction occurs when the remanufacturing lead times are lower than the manufacturing one (scenario LT-I). Overall, these results are consistent with those obtained in the step response and Bode plot analyses.
In contrast, for the parameter setting under consideration and this type of demand pattern, the quality-grading system only slightly affects the variability of the net stock. This metric, and hence inventory costs, is thus quite robust to variations in the quality of cores. It is interesting to note that we cannot observe the quality paradox in the first two scenarios, given that improving the quality of cores generates a small reduction in the net stock variance, up to 3.42% and 2.61% respectively. In contrast, in scenario LT-III, we can notice a flat U-shaped relationship, with the minimum achieved for $\gamma_h = 0.5$. This study is thus also aligned with the analysis of the previous lenses. Note that i.i.d. demands have high-frequency components, for which the impact of the quality of cores is marginal in the Bode plots, as discussed around Fig. 6. However, for stochastic low-frequency demands, we can confidently expect that improving the quality of cores will result in worsened inventory dynamics, in line with the discussion in the previous subsection.
In short, this section has provided a general understanding of the dynamics and economic performance of the HMRS with quality grading of returns by looking at the parameters related to the reverse materials flow, in particular, lead times and quality yields. To complement this analysis, an assessment of the impact of the WIP pipeline lead-time estimate, $T_p$, can be seen in Appendix A. This offers interesting insights on the impact of information transparency on the HMRS, showing that while overestimating the pipeline lead time triggers an avoidable increase of the order and inventory variabilities, underestimating it may contribute to the reduction of the Bullwhip Effect but would occur at the expense of an enormous investment in safety stock.

Discussion of findings and managerial implications
The analysis has produced valuable findings that contribute to the literature on closed-loop supply chains and have implications for managerial practice. As a general rule, our results indicate that a quality-grading policy enables practitioners to smooth the operation of HMRSs by processing the returned products in specific lines or cells according to their quality. This smoothing has a clear economic value. However, the impact of such a policy on the inventory performance of the system is ambivalent, with counterintuitive effects that need to be considered by supply chain managers. We have also observed that the benefits of categorizing the cores according to their quality and restoring them in parallel remanufacturing lines rely heavily on accurately estimating the relevant yields and lead times and using this information for production planning purposes.
In this section, we elaborate on the findings of our research study and reflect on their managerial implications. First, we discuss the main findings.
Fig. 6. Bode diagram of the order rate and the serviceable inventory for different returns qualities. Note. As is common practice in Bode representations, the vertical axis represents the supply chain's amplification of the demand signal in decibels, $A_{dB}$. Its relationship with the amplification in absolute value, $|A|$, is given by $A_{dB} = 20\log_{10}|A|$.

1. Implementing a quality-grading policy smooths the manufacturing order rate. In other words, categorizing cores according to their quality and processing them in separate remanufacturing lines allows for a reduction of the Bullwhip Effect in closed-loop supply chains based on HMRSs. This behavior has been observed in the three lead-time scenarios and the various demand patterns considered. The dynamic improvement is more significant the better the average condition of returns is, as well as when the difference between the processing lead times required by the various qualities is high, which is aligned with the drivers posed by Aras et al. (2004).

2. Consequences of quality-grading practices on inventory performance highly depend on the frequency of the demand signal. We have revealed that for low-frequency demands, e.g. long seasonality or high intermittency (see Babai et al., 2020), the inventory response of HMRSs benefits from lower quality returns (with longer remanufacturing lead times, which is a consequence of the lead-time paradox). In this sense, a target customer service level may be achieved with a lower investment in inventories when returns have lower qualities. In contrast, for medium- or high-frequency demands, the impact of quality grading on inventory performance is small. The step response analysis helps us interpret these results: quality grading increases the trough in the response (a negative effect) but also smooths the dynamics of the net stock, which often reduces the settling time (a positive effect). The strength of these effects depends on the relationship between the different manufacturing and remanufacturing lead times.
3. Leveraging benefits of quality-grading mechanisms requires an adequate establishment of a WIP policy based on information transparency. We have formulated a solution for estimating the pipeline lead time in HMRSs considering both the manufacturing and remanufacturing WIP. This is given by Equation (7), which adapts Tang and Naim's (2004) rule to scenarios with quality grading. Setting T p accurately protects the customer service level by avoiding long-term inventory offsets. To this end, it is fundamental to know the portion of used products that come back to the HMRSs in a remanufacturable condition, the average percentage of HQ and LQ cores, and the manufacturing and remanufacturing lead times. Otherwise, an overestimation of T p generates positive inventory offsets (high holding costs), while an underestimation of T p , although it may reduce the Bullwhip Effect, leads to negative inventory offsets (high stock-out probability).
These findings uncover relevant considerations for supply chain professionals that operate in circular economy systems or aim to close the loop and bring circularity into their traditional systems, which we now address.
I. Quality grading for improving the dynamics of closed-loop supply chains. Categorizing the incoming returns into several quality grades enables the development of more efficient circular economy systems. Specifically, the design of a parallel remanufacturing structure, with lines for cores in similar conditions, proves to be a useful instrument for mitigating the Bullwhip Effect in closed-loop supply chains. This structure would better inform forward scheduling, reducing the inefficiencies derived from operational variability. This perspective complements prior findings that have underlined the value of quality grading from other viewpoints (e.g. Aras et al., 2004; Ferguson et al., 2009; Sun et al., 2018), by showing that quality grading can also guide remanufacturing systems towards the advocated production smoothing.
The dynamic benefits, however, are naturally constrained by different aspects. In this regard, if (1) there is a significant variability in the quality of the cores, (2) the associated lead times are substantially different, and (3) the volume of high-quality returns is considerable, the supply chain can benefit from a higher reduction in Bullwhip-induced costs. However, if the condition of the cores is largely homogeneous, or the times required by the different grades are relatively similar, the (additional) investment in quality-grading mechanisms may not be justified. All in all, to leverage and accentuate the immediate benefits in terms of supply chain dynamics, managers need to design an appropriate quality-grading policy by: (1) differentiating the cores into grades with substantially different processing times; and (2) developing incentive policies for encouraging customers to return products in the best possible condition.
II. Quality-grading implementation requires a firm understanding of the nature of customer demand. While quality grading generally allows for taming the Bullwhip Effect, depending on the characteristics of market demand, it may also worsen the net stock variability of remanufacturing systems, thus increasing inventory-related costs. Therefore, prior to implementing any quality-grading policy, it is necessary to understand the behavior of market demand and take a decision based on the following guidelines: (1) in the presence of a high-frequency demand (e.g. weekly seasonality) or if there is no clear seasonality (e.g. stable, i.i.d. demand), the said policy can be adopted without lowering the inventory performance; while (2) in the presence of a low-frequency demand (e.g. strong yearly pattern), the Bullwhip reduction may come at the expense of worsening the serviceable inventory response.
Thus, when the closed-loop supply chain faces demands with regular fluctuations over long time horizons, such as in seasonal industries, a trade-off analysis between the expected Bullwhip smoothing and decreased inventory performance becomes necessary. In addition, our study suggests that it is also important to consider whether or not demand suffers from sudden and dramatic changes. Notably, for closed-loop supply chains operating in anomalous market environments, quality-grading policies may accentuate the stock-out risk. However, they may also reduce the subsequent holding costs by avoiding excessive stock variance (i.e. long cycles of big stock-outs followed by large inventories).
III. The value of information transparency and accurate forecasting for effective quality grading. In general terms, HMRSs with a quality-grading policy and parallel remanufacturing have the potential to outperform HMRSs with no grading. To realize this potential, managers need to be fully aware of the value of information visibility and the accuracy of shared data in closed-loop supply chains.
Usually, studies dealing with information sharing in supply chains investigate the role of transmitting reliable real-time data on customer demands upstream. Different collaboration frameworks based on sharing market demand information have been shown to notably improve the performance of traditional supply chains; see, for example, Trapero et al. (2012) and Dominguez et al. (2018). The role of information in closed-loop settings has been less explored so far. The few studies in this area, such as Dadhich et al. (2015) and Cannella et al. (2016), have focused on sharing real-time information on the amount of collected items, which is exploited for ordering purposes. However, in HMRSs with quality grading, information sharing should not be limited to the quantity of cores, but should also consider (1) the quality of the cores, and (2) the associated remanufacturing lead times. In fact, if such yields or lead times are not known or are erroneously estimated, the level of customer satisfaction can decrease significantly and/or the investment in inventories can increase substantially. Thus, to benefit from effective quality grading, managers need to design robust inspection or forecasting processes for accurately estimating the condition of cores and their remanufacturing lead times, and to incorporate this information properly into the ordering policy.

Conclusions and next steps
Research on new business structures for efficient closed-loop supply chain operation is gaining momentum in a bid to encourage organizations to adopt circular economy practices in their processes. Understanding the dynamic behavior of these supply chains is essential for facilitating and accelerating such transition towards a circular economy. One of the defining characteristics of said supply chains is the variable nature of the quality of the products collected from the market, which is particularly hard to manage for organizations. However, this characteristic has been underexposed in the literature. From this perspective, this research investigates the impact of quality-grading policies (i.e. categorizing the returns into a finite number of quality grades) on the performance of a HMRS. We approach this issue from the prism of supply chain dynamics, which allows us to simultaneously consider the production and inventory implications of the relevant business structures.
Our research shows the potential benefits for the closed-loop supply chain derived from remanufacturers adopting a quality-grading policy and processing the cores in separate lines or cells according to their quality. We observe that the order variance can be reduced by up to around 37%. In this sense, quality grading can lead to relevant production cost savings. Thus, we offer a different perspective on the operational benefits of quality grading, which complements the findings of previous studies. We also find that the operational value of quality grading, related to the well-known Bullwhip Effect, grows as the percentage of high-quality items increases. That is, the actual benefits captured by the closed-loop supply chain depend on the collection system's ability to gather returns in the best possible condition. In light of this, adopting policies for increasing the quality of returns from the market allows one to accentuate the benefits of quality grading in closed-loop supply chains.
Our analysis leads to different insights in terms of inventory behavior. Interestingly, we notice two effects of the quality-grading system. On the one hand, it tends to smooth the inventory response of the system (by avoiding the generation of positive peaks when facing step changes in demand). On the other hand, it tends to increase the size of the potential stock-out in the supply chain (i.e. the negative peak of the response). The strength of both effects depends on the relationship between the remanufacturing and the manufacturing lead times. As a result, while it may be intuitively appealing to expect that improving the quality of cores also has a positive impact on inventory performance, we reveal that the opposite is sometimes the case. We refer to this as the quality paradox, which complements the well-known lead-time paradox of the remanufacturing process in HMRSs. That is, improving the quality of returns may obstruct the cost-effective satisfaction of customers.
A noteworthy contribution of our research is that we provide a solution for appropriately calibrating the WIP policy in HMRSs with quality grading by accurately determining the overall pipeline lead time (considering both the manufacturing and remanufacturing processes). This solution is optimal from the perspective of the balance between service level and stock holding costs as it avoids a detrimental inventory offset. The target WIP should decrease as the percentage of high-quality cores grows. Meanwhile, it may increase or decrease as the return yield grows, depending on the relationship between the lead times. This perspective highlights the need for information transparency in the closed-loop supply chain; otherwise, the relevant actors will not be able to accurately estimate the yields and lead times, and the potential benefits of quality grading may not be achieved.
Finally, we underline that some interesting avenues for future research emerge from this work, as the same methods and similar models can be employed to explore other topologies and assumptions, including specific real-world environments. One of these avenues may be related to our assumption of linearity in the supply chain, which follows previous research efforts in this field. Adopting nonlinear models could allow scholars to better reflect some real-world scenarios, such as capacitated and lost-sales systems; see Disney et al. (2020). Breaking the common assumption in HMRSs of perfect substitution (between new and remanufactured products) could also lead to industrially relevant insights; see Goltsos et al. (2019a). On a related note, considering different markets for new and remanufactured products would better model many practical settings and may affect the value of quality grading; see Guide and van Wassenhove (2009). Lastly, we have assumed in this work that the categorization of products between the various grades is perfect. It may also be worth considering how the dynamics of the supply chain vary when the grading system is not 100% accurate, as well as exploring the dynamic consequences of using different numbers of quality grades.

Declaration of competing interest
We wish to confirm that there are no known conflicts of interest associated with this research work.

need for appropriately regulating the decision parameter T p in HMRSs with quality grading, which significantly differs from HMRSs with no grading, as discussed in Section 4. This analysis captures the value of information sharing for controlling the flow of materials in HMRSs with quality-grading policies. Note that Equation (7) reveals that accurately estimating the three different yields (i.e. β, γ h , γ l ) and the relevant lead times (i.e. T m , T rh , T rl ) is essential for establishing an appropriate WIP policy. To concentrate our efforts on the impact of T p , we only explore scenario LT-III (see Section 5), and we assume γ h = γ l = 0.5. In addition, we explore the issue through the three Bullwhip lenses. Fig. A.1 displays the unit step response of the order rate (a) and the serviceable inventory (b) for different estimations of the pipeline lead time. We use the optimal value of T p from Equation (7), T * p , which eliminates the long-term offset in the net stock, as well as two values higher than this one (representing overestimations of the pipeline lead time) and two values lower than this one (representing underestimations of this lead time).

Fig. A.1. Step response of the order rate and the serviceable inventory for different pipeline estimations.
Fig. A.1(b) confirms that T * p removes the long-term offset in the net stock. Higher values of T p result in positive inventory offsets, while lower values lead to negative offsets; in both cases, the actual inventory position deviates from the target level. In such cases the HMRS is not able to find the desired balance between service level and holding requirements. In light of this, when T * p is not used, the integral of time-weighted absolute error (ITAE), a useful metric in evaluating the inventory performance of supply chains, is infinite.
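The divergence of the ITAE under a persistent offset can be made explicit with a short worked argument. The sketch below assumes the standard continuous-time definition of the ITAE and uses e(t) as our own notation (not taken from the original) for the deviation of the net stock from its target after the unit step; a discrete-time sum would lead to the same conclusion.

```latex
\[
  \mathrm{ITAE} = \int_{0}^{\infty} t \,\lvert e(t) \rvert \, dt .
\]
% Assumption: the step response settles to a non-zero offset,
% e(t) \to e_\infty \neq 0 as t \to \infty.
% Then there exists a time T > 0 such that
% |e(t)| \ge |e_\infty| / 2 for all t \ge T, and therefore
\[
  \mathrm{ITAE} \;\ge\; \frac{\lvert e_\infty \rvert}{2} \int_{T}^{\infty} t \, dt \;=\; \infty .
\]
% Conversely, a finite ITAE requires e_\infty = 0, i.e. no long-term inventory offset.
```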
This allows us to perceive the benefits of using T * p in the POUT policy for managing the HMRS. At the same time, Fig. A.1(a) reveals that low values of T p may help to reduce the variability in the orders issued to the manufacturing line. Note that the peak of the response reduces as T p decreases. Having noted that, this benefit would be small in comparison with the damaging consequences of the inventory drift (either excess stock build-up or deterioration of customer service level) in most practical scenarios.
We now explore the frequency domain. Fig. A.2 displays the Bode diagrams of the order rate (a) and the serviceable inventory (b) for different estimations of the pipeline lead time. The former confirms that underestimating T p may help the HMRS to cope with the Bullwhip Effect. However, the latter provides evidence that this strategy would have a negative impact in terms of inventory stability, especially for low frequencies. Fig. A.2(b) shows that it is only when T * p is used that the amplification converges to 0 (i.e. -∞ dB) as frequency decreases. For high-frequency demands, the inventory variability amplification of the HMRS is not sensitive to T p ; however, the mean inventory is not necessarily well adjusted.
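To make the decibel scale of the Bode diagrams concrete, the following minimal sketch applies the conversion A_dB = 20 log_10 |A| stated in the note to Fig. 6. The helper name `amplification_db` is hypothetical and is not part of the original study; it only illustrates why an amplification of 0 corresponds to -∞ dB.

```python
import math


def amplification_db(amplitude_ratio: float) -> float:
    """Convert an amplification |A| into decibels via A_dB = 20 * log10(|A|).

    Hypothetical helper for illustration only. A ratio of exactly zero maps
    to -infinity dB, matching the limit discussed for low frequencies.
    """
    magnitude = abs(amplitude_ratio)
    if magnitude == 0.0:
        return float("-inf")
    return 20.0 * math.log10(magnitude)


# |A| = 1 means the demand signal is neither amplified nor attenuated (0 dB).
unit_gain_db = amplification_db(1.0)
# |A| < 1 means attenuation, hence a negative value in decibels.
half_gain_db = amplification_db(0.5)
# |A| = 0 means the fluctuation is completely filtered out.
zero_gain_db = amplification_db(0.0)
```

Under this convention, positive values in the order-rate diagram indicate Bullwhip at that frequency, while negative values indicate smoothing.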
Finally, we look through the variance lens. We measure the order and net stock variances when the HMRS faces an i.i.d. normally distributed demand, N(100, 30^2). We conduct the simulations in the same conditions as those described in Section 5.3. Given that T p strongly impacts the trade-off between service level and inventory required, we also measure the mean of these signals. The results can be seen in Table A.1. In accordance with our previous analysis, Table A.1 shows how underestimating T p has positive effects on the HMRS in terms of order smoothing. Interestingly, this may also contribute to (slightly) reducing the net stock variability. However, inspection of the table reveals that this improvement occurs at the expense of unbalancing the trade-off between service and inventory level. The 'Mean' rows reveal that a low T p tends to enormously decrease the customer service level; that is, a much higher safety stock would be necessary. On the other hand, overestimating T p overprotects the serviceable inventory at the same time as the variability increases.
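The measurements behind the variance lens can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation model of the study: the function `signal_stats`, the random seed, the sample size, and the exponentially smoothed order signal are all hypothetical stand-ins, and normalizing by the demand variance is one common way of reporting Bullwhip that may differ from the raw variances in Table A.1. Only the demand distribution, N(100, 30^2), is taken from the text.

```python
import random
import statistics


def signal_stats(signal, demand):
    """Return the mean, variance and variance ratio (relative to demand) of a signal.

    Hypothetical helper: the same function can be applied to an order series
    or to a net stock series produced by any simulation model.
    """
    variance = statistics.variance(signal)
    return {
        "mean": statistics.fmean(signal),
        "variance": variance,
        "variance_ratio": variance / statistics.variance(demand),
    }


# i.i.d. normally distributed demand, N(100, 30^2), as in the text.
rng = random.Random(0)  # fixed seed (assumption) for reproducibility
demand = [rng.gauss(100.0, 30.0) for _ in range(10_000)]

# Stand-in order signal (assumption): exponential smoothing of demand.
# This is NOT the POUT policy of the paper; it only provides a smoothed series.
alpha = 0.5
level = 100.0
orders = []
for d in demand:
    level = alpha * d + (1.0 - alpha) * level
    orders.append(level)

order_stats = signal_stats(orders, demand)
```

A variance ratio below 1 for the orders indicates smoothing, while a ratio above 1 indicates Bullwhip. Reporting the mean alongside the variance matters here because, as discussed above, a misestimated T p shifts the mean net stock even when its variability barely changes.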