Order Fulfillment Errors and Military Aircraft Readiness

Purpose – This paper aims to measure the effect of supply discrepancy reports (SDRs) on military aircraft readiness metrics, including aircraft availability, not mission capable supply (NMCS) hours, cannibalizations and mission-impaired capability awaiting parts (MICAP) hours. Design/methodology/approach – Monthly SDR, NMCS, aircraft cannibalizations and MICAP data from 2009 to 2018 are analyzed using linear regression and independent samples t-tests to examine whether discrepant shipments negatively impact aircraft readiness. Findings – Results of linear regression were significant in 4 of 12 analyses, suggesting that SDRs are a significant predictor of increased cannibalizations. Results of independent samples t-tests found MICAP hours were significantly higher on discrepant shipments compared to nondiscrepant shipments in all three analyses. Practical implications – This research will increase awareness of the extent to which SDRs degrade aircraft readiness, and provide an opportunity for United States Department of Defense (DoD) supply chain leaders to take action to improve order fulfillment performance in their organizations. Originality/value – Little research has been done investigating the impact of SDRs within the DoD, and to the best of the authors’ knowledge, no previous study has examined the effect of SDRs on military aircraft readiness metrics.


Introduction
The United States Department of Defense (DoD) manages nearly 5 million critical spare parts in support of over 2,300 weapon systems, and routinely fulfills over 25 million sales orders annually across the Army, Navy, Air Force and Marines (Government Accountability Office, 2016). These weapon systems rely on spare parts shipments to maintain mission readiness. Unfortunately, not all spare part orders are error-free. The objective of this paper is to gain insights into the weapon system readiness impacts of order fulfillment errors.
Military weapon system readiness is most commonly measured in terms of mission capable (MC) rate, which is defined as the proportion of time all unit-possessed weapon systems are capable of performing designated missions (Brooks, 2013). Weapon systems in a not mission capable (NMC) condition are further categorized based on the cause, either because of a spare part requirement (NMCS), a maintenance requirement (NMCM) or both (NMCB). Weapon system availability is another common readiness indicator closely related to MC rate and defined as the proportion of time a fleet is mission capable, regardless of whether systems are unit-possessed. This accounts for weapon systems off-station or undergoing depot-level maintenance. While MC rate is effective in measuring unit readiness, weapon system availability is typically used as an indicator of enterprise-level readiness (Meserve, 2007).
A symptom of supply errors is the prevalence of weapon system cannibalizations. Cannibalizations involve the deliberate removal of working components from one weapon system for use on another. According to Curtin (2001), cannibalizations are performed as a result of "pressures to meet readiness and operational needs" when parts are not available through the supply system.
As indicators of military readiness, the metrics discussed above are most likely to be impacted by errors in the order fulfillment process. Formally known in the DoD as supply discrepancies and documented using the supply discrepancy report (SDR), order fulfillment errors can delay repairs of critical assets and thus increase NMCS hours and cannibalizations, while reducing overall weapon system availability.
The United States Air Force (USAF) alone has reported over 20,000 SDRs on average over each of the past five years, or about 0.4% of the average annual orders (3.13 million) from 2014 to 2018. The majority of these errors originate directly from vendors and pass undetected through Defense Logistics Agency (DLA) distribution centers, while a small number originate from USAF repair depots. The most common discrepancies involve incorrect quantities, material received in damaged condition, incorrect items received and material with expired shelf life, among many others. In total, there are over 160 unique SDR discrepancy codes, each identifying a distinct error condition. Upon receipt of discrepant material, customers annotate the discrepancy code on the SDR, and submit to the shipping activity to determine the root cause of the errors, effect corrective actions and prevent recurrence (Defense Logistics Manual 4000.25, 2016).
Order fulfillment issues are not restricted to the military. A 2013 survey assessed performance among 500 companies on key operational metrics including percentage of orders fulfilled damage-free, complete and with correct documentation. Results found the average error rate in order fulfillment among the companies surveyed was approximately 0.5% (Tillman et al., 2013). Though small in percentage terms, the total frequency of errors can be severely disruptive to large organizations that fulfill orders on a mass scale. Research has consistently shown clear linkages between order fulfillment errors and negative operational outcomes, including reduced customer satisfaction, repurchasing behavior and profitability (Davis-Sramek et al., 2009; Anderson et al., 1994).
While the impact of order fulfillment errors is well understood in the private sector, there is a lack of studies that investigate the impact of order fulfillment errors on operations in military settings. Toward this end, this paper examines the impact of SDRs on commonly used metrics of aircraft readiness and fleet health within the USAF, including aircraft availability, not mission capable supply (NMCS) hours, cannibalizations and mission-impaired capability awaiting parts (MICAP) hours. Note that MICAP hours measure order fulfillment cycle time on high-priority shipments.

Literature review
The streams of research relevant to this study include the various outcomes associated with effective and ineffective order fulfillment practices, both in the private sector and the USAF. In the private sector, such outcomes include customer satisfaction, loyalty behaviors and profitability. In the USAF, the primary outcome of interest of order fulfillment is readiness, a condition in which weapon systems are mission capable and available to perform designated missions. The impact of poor order fulfillment, however, is not well understood, though previous research has examined the impact of these errors in terms of the financial cost.

Order fulfillment in the private sector
Errors in the order fulfillment process for consumer goods are known to result in decreased customer satisfaction, reduced loyalty and repurchase behaviors, and ultimately lower profitability (Davis-Sramek et al., 2009; Chandrashekaran et al., 2007; Hewett et al., 2002; Anderson et al., 1994; Reichheld and Sasser, 1990). Thus, in recent years, retailers have placed greater focus on ensuring customer needs are met and errors are minimized (Davis-Sramek et al., 2008). Such outcomes are valuable performance measures for private sector companies seeking to maximize profits, but less useful to nonprofit and government agencies such as the military.
According to Bowersox et al. (2002), customer satisfaction is achieved when the supplier's performance meets or exceeds expectations. Thus, customer satisfaction is a direct consequence of effective order fulfillment. Davis-Sramek et al. (2009) found that order fulfillment quality was a significant predictor of customer satisfaction, as well as commitment to continue to do business again in the future among retail customers of a large manufacturing company. Similarly, Hewett et al. (2002) and Chandrashekaran et al. (2007) found customer satisfaction was directly related to repurchase intentions and continued patronage in both large service organizations and industrial markets.
According to Reichheld and Sasser (1990), high customer satisfaction indicates loyalty, which will in turn lead to increased profitability because loyal customers ensure a steady stream of future revenue. Anderson et al. (1994) compiled customer satisfaction surveys of 77 firms across a wide variety of industries and compared the responses to each firm's end-of-year return on investment, a long-term measure of economic health. Results indicated that customer satisfaction, measured in the first half of the fiscal year, was a significant predictor of year-end economic returns. Similarly, Stank et al. (2003) surveyed customers of large 3PL providers and found that companies with higher service performance, measured by order accuracy and responsiveness, held greater market share than competitors, and were rated higher in customer satisfaction and loyalty. Griffis et al. (2012) found that order fulfillment, and satisfaction with online retailers, is a predictor of referral behavior among customers. These findings are significant because they suggest that order fulfillment service quality leads not only to customer satisfaction, but also greater purchasing behaviors and potential profitability.
Not surprisingly, the opposite is true of businesses that fail to meet customer expectations. Rao et al. (2011) found that retailers who failed to deliver upon order fulfillment promises experienced reduced future orders, as well as reduced dollar value of subsequent orders. Order fulfillment glitches were also associated with increased order anxiety, measured by proxy using the number of times a customer checked the order status online as an indicator of anxiety.
The negative outcomes discussed above are evidence of the operational impact associated with ineffective order fulfillment practices in the private sector. Our research, however, focuses on a public sector setting in which the "customer" has no choice from where to source. That being said, it is the authors' experience that fulfillment errors, such as supply discrepancies, create distrust for the supply system, which incentivizes deviant behavior such as spare parts hoarding.

United States Air Force order fulfillment
While order fulfillment in consumer markets has been studied exhaustively, few have examined the impact of ineffective order fulfillment in the USAF. Furthermore, the impact of discrepant shipments cannot be assessed using traditional measures of customer satisfaction, loyalty and profitability, because these metrics are not applicable to the USAF mission. The relevant research focuses on the financial costs associated with discrepant shipments. Bray (1990) quantifies two costs resulting from discrepant shipments: administrative costs and holding costs. Administrative costs include SDR processing, investigation and resolution, and are estimated to average US$519 per shipment (in 2017 dollars). Holding costs result from the storage and handling of discrepant items, and from the lost investment opportunity for money tied up in supplies. These costs are estimated at 3.22% of the contract value for a typical DLA item. However, the readiness impact of order fulfillment issues was not addressed, although it was recommended for analysis in future research. This study aims to help address this gap by quantifying the readiness degradation resulting from supply discrepancies in terms of aircraft availability, MICAP hours, NMCS hours and cannibalization rates.

Impact of supply on availability
Supply availability has a direct and clear theoretical connection to asset availability. For example, much work has been done on determining stock levels for spare parts (Sherbrooke, 1968; Muckstadt, 1973), where the key trade-off under consideration is the cost of buying additional spares versus reduced aircraft availability resulting from insufficient spare part support. In other words, spare parts are essential to carry out required maintenance, and thus ensure aircraft are available to perform assigned missions.

Hypotheses
Based on the theoretically well-understood operational impact of order fulfillment errors, we hypothesize that: SDRs will be associated with reduced operational readiness in the USAF; and SDRs will be associated with significantly higher order fulfillment cycle time on MICAP shipments.

Methodology and results
This study used linear regression and independent samples t-tests to describe the relationship between SDRs and relevant aircraft metrics across the USAF over a 10-year period. SDR data was analyzed at the aggregate level covering the entirety of the USAF from 2009 to 2018, as well as at the subordinate major command (MAJCOM) level, including Air Combat Command (ACC), Air Mobility Command (AMC) and Air Force Global Strike Command (AFGSC). These commands were selected because they are responsible for distinct mission areas and are composed of aircraft unique to these mission sets, including fighters, bombers and mobility aircraft. Although it is not possible to directly tie SDRs to individual weapon systems, we expect this approach will provide the greatest insight into how SDRs impact the distinct mission sets outlined above. While SDR data does not include mission area, this information is included in MICAP data and was therefore used as the basis of t-test analysis for comparison of MICAP hours between SDR and non-SDR groups. Specifically, fighter, bomber and mobility mission areas were assessed to most closely correspond with the MAJCOM data used for regression analysis.
Monthly SDR, NMCS, cannibalizations and MICAP data from 2009 to 2018 were obtained using DLA and USAF information systems. SDR data was obtained from WebSDR, a DLA transaction services system that provides comprehensive information surrounding each SDR occurrence, including submitter, shipper, national stock number(s) (NSN), discrepancy code(s), dollar value and unique document number, among others.
Aircraft data, including aircraft availability, NMCS hours and cannibalizations, were obtained using the Logistics, Installation, and Mission Support-Enterprise View (LIMS-EV), an SAP Business Intelligence system providing integrated USAF logistics data from over 60 standalone systems (Petcoff, 2010). Data was then exported to Microsoft Excel and underwent preprocessing to ensure the information was consistent with the SDR data and formatted properly for analysis. Additionally, SDR data was detrended by differencing monthly values to remove any trend or seasonality present in the data, consistent with standard practices for time series data. Because each MAJCOM possesses multiple unique aircraft, data from each aircraft type were aggregated to provide one metric each for aircraft availability, NMCS rates and aircraft cannibalizations for every month.
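The differencing step described above can be illustrated with a minimal sketch. It assumes simple first differencing (each month minus the previous month); the function name and the sample counts are hypothetical and are not drawn from the study's data.

```python
def first_difference(series):
    """Return month-over-month changes: series[t] - series[t-1]."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


# Hypothetical monthly SDR counts with an upward trend (illustrative only).
monthly_sdrs = [100, 104, 109, 111, 118, 121]
sdr_changes = first_difference(monthly_sdrs)
print(sdr_changes)  # [4, 5, 2, 7, 3]
```

Differencing shortens a series by one observation, so 120 monthly values would yield 119 changes. That would be consistent with the 117 residual degrees of freedom reported for the regressions, since a simple regression on 119 observations leaves 119 - 2 = 117.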
Finally, MICAP data obtained from LIMS-EV provided information surrounding each MICAP over the sample period, including cause code, document number, MICAP hours, source of supply and NSN, among others. Prior to analysis, MICAP document numbers were matched with SDR document numbers to determine which MICAPs had supply discrepancies reported.
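The matching step can be sketched as a set-membership check on document numbers. This is only an illustration under assumed field names (`document_number`, `micap_hours`, `has_sdr`) and made-up records; the actual LIMS-EV and WebSDR export layouts are not described in the paper.

```python
def flag_discrepant(micap_records, sdr_document_numbers):
    """Mark each MICAP record with whether an SDR shares its document number."""
    sdr_set = set(sdr_document_numbers)  # set lookup keeps matching fast
    return [
        dict(record, has_sdr=record["document_number"] in sdr_set)
        for record in micap_records
    ]


# Hypothetical records (document numbers and hours are invented).
micaps = [
    {"document_number": "DOC0001", "micap_hours": 36.0},
    {"document_number": "DOC0002", "micap_hours": 210.5},
    {"document_number": "DOC0003", "micap_hours": 12.0},
]
sdr_docs = ["DOC0002", "DOC9999"]

flagged = flag_discrepant(micaps, sdr_docs)
print([r["has_sdr"] for r in flagged])  # [False, True, False]
```

The resulting flag splits MICAP shipments into the SDR and non-SDR groups that the t-tests later compare.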

Descriptive statistics
We now provide some descriptive statistics regarding the number of MICAP incidences and the joint incidences of SDRs and MICAPs. Of the three mission areas sampled, fighter aircraft had the highest number of MICAP incidences over the 10-year period with 190,317. Mobility aircraft had the second highest number of MICAP incidences over this timeframe with 136,585. Bomber aircraft had the fewest MICAPs with 51,588, though it is important to note that no MICAPs were reported for the year 2009 because AFGSC was not established until August of that year, and MICAP data was not available in LIMS-EV until 2010 (see Table 1).
With regard to SDRs occurring on MICAP-initiated shipments, the overall frequency of SDRs was very low. Among the 378,490 MICAP incidences across fighter, mobility and bomber aircraft, a total of just 2,068 SDRs (0.55%) were reported (see Table 2).
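As a quick arithmetic check, the totals and the SDR rate quoted above can be reproduced from the per-mission-area counts given in the text. The variable names are illustrative.

```python
# MICAP incidences by mission area, as reported in the text.
micap_counts = {"fighter": 190_317, "mobility": 136_585, "bomber": 51_588}
sdrs_on_micaps = 2_068

total_micaps = sum(micap_counts.values())
sdr_rate_pct = 100 * sdrs_on_micaps / total_micaps

print(total_micaps)            # 378490
print(round(sdr_rate_pct, 2))  # 0.55
```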

Primary results
Simple linear regression equations were calculated both at the aggregate and for each command assessing the relationship between SDRs and the following metrics: aircraft availability, NMCS hours and cannibalizations. Additionally, t-tests were conducted to determine whether differences in MICAP hours existed between MICAP-initiated shipments that occurred with and without SDRs.
3.2.1 Aggregate. A simple linear regression was calculated to predict aircraft availability, NMCS hours and cannibalizations based on the number of reported SDRs. A significant regression equation was found for cannibalizations (F(1, 117) = 59.79, p < 0.0001), with an R² of 0.338 (0.333 adj.) (see Figure 1).
The USAF's predicted change in cannibalizations from one month to the next is equal to −2.75 + 0.34(SDRs). Thus, roughly one cannibalization occurs for every three SDRs. Inspection of the quantile-quantile (QQ) plot of the normal quantiles against the residuals suggests the residuals are normally distributed (see Figure 2). Further, there is no discernible pattern in the residual scatterplot, suggesting the variance is satisfactorily homogeneous (see Figure 3). Results of all three regression equations calculated at the aggregate level are shown in Table 3. No significant findings were observed for aircraft availability or NMCS hours.
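The quantities reported for each regression (slope, intercept, R-squared, adjusted R-squared and the F statistic) are related in a simple way for a one-predictor model. The sketch below is a plain ordinary least squares fit written from the textbook formulas, not the authors' software or procedure; the function names and the demonstration data are hypothetical.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - ss_resid / ss_total
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": r2,
        # Adjusted R-squared for one predictor: penalizes by (n - 1) / (n - 2).
        "adj_r2": 1 - (1 - r2) * (n - 1) / (n - 2),
        # F = explained mean square / residual mean square, with (1, n - 2) df.
        "f": (ss_total - ss_resid) / (ss_resid / (n - 2)),
        "df": (1, n - 2),
    }


def f_from_r2(r2, n):
    """F statistic implied by R-squared in a one-predictor regression."""
    return r2 / (1 - r2) * (n - 2)


# Hypothetical demonstration data (not the study's SDR or cannibalization data).
x_demo = [1, 2, 3, 4, 5]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_ols(x_demo, y_demo)
print(round(fit["slope"], 2), round(fit["intercept"], 2))  # 1.99 0.05

# Plugging in the reported aggregate R-squared with 119 observations.
print(round(f_from_r2(0.338, 119), 1))  # 59.7
```

With the reported R-squared of 0.338 and 117 residual degrees of freedom, the identity gives an F of about 59.7, close to the reported 59.79; the small gap is what rounding R-squared to three decimals would produce. The slope also explains the "one cannibalization for every three SDRs" reading, since 1 / 0.34 is approximately 2.9.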
3.2.2 ACC. A simple linear regression was calculated to predict aircraft availability, NMCS hours and cannibalizations based on the number of reported SDRs. A significant regression equation was found for cannibalizations (F(1, 117) = 7.40, p < 0.0075), with an R² of 0.059 (0.052 adj.) (see Figure 4).
ACC's predicted change in cannibalizations from one month to the next is equal to −0.91 + 0.11(SDRs). Approximately one cannibalization occurs for every ten SDRs. Review of the QQ plot of the normal quantiles against the residuals suggests the residuals are normally distributed (see Figure 5), and variance appears to be homogeneous (see Figure 6).
Results of all three regression equations calculated for ACC are shown in Table 4. No significant findings were observed for aircraft availability or NMCS hours, though aircraft [...] (see Figure 7).
AMC's predicted change in cannibalizations from one month to the next is equal to −0.19 + 0.11(SDRs). Thus, roughly one cannibalization occurs for every ten SDRs. Inspection of the QQ plot of the normal quantiles against the residuals suggests the residuals are normally distributed (see Figure 8), and the variance appears to be homogeneous (see Figure 9).
Results of all three regression analyses calculated for AMC are shown in Table 5. No significant findings were observed for aircraft availability or NMCS hours. These findings again support the hypothesis that SDRs are associated with increased cannibalizations.
3.2.4 AFGSC. A simple linear regression was calculated to predict aircraft availability, NMCS hours and cannibalizations based on the number of reported SDRs. A significant regression equation was found for cannibalizations (F(1, 117) = 6.51, p < 0.012), with an R² of 0.054 (0.046 adj.) (see Figure 10).
AFGSC's predicted change in cannibalizations from one month to the next is equal to 1.23 + 0.09(SDRs). Similar to ACC and AMC, approximately one cannibalization occurs for every ten SDRs. Review of the QQ plot of the normal quantiles against the residuals indicates the residuals are normally distributed (see Figure 11), and the variance is generally homogeneous (see Figure 12). Results of all three regression analyses calculated for AFGSC are shown in Table 6. No significant findings were observed for aircraft availability or NMCS hours. These findings offer some support to the hypothesis that SDRs are associated with reduced aircraft readiness; however, the magnitude of the effect was smallest among this sample.

Effect of supply discrepancy reports on mission-impaired capability awaiting parts hours
Following the regression analysis, independent samples t-tests were calculated using SPSS to determine whether the mean number of monthly MICAP hours among fighter, mobility and bomber mission areas was different between MICAP-initiated shipments with and without an SDR reported. Results indicated that the average MICAP hours per shipment were significantly greater for shipments occurring with an SDR across each of the three mission areas (see Table 7). This finding suggests that SDRs can increase the time taken to fulfill a MICAP-initiated order. Tests for normality and homogeneity of variance were also conducted for each test to determine whether the samples met the basic assumptions for analysis. As expected, the MICAP hours sampled were not normally distributed (p < 0.001), because of the presence of extreme outliers, which resulted in the distributions being skewed to the right. Results of Levene tests for equal variance found no difference in variance between the samples (p = 0.215). Although the normality assumption was violated, the t-test is considered robust against this assumption because the sampling distribution of the test statistic approaches normality with a sufficient sample size, according to the Central Limit Theorem (Edgell and Noon, 1984). This finding provides evidence in support of the hypothesis that SDRs are associated with increased order fulfillment cycle time on high-priority shipments.
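For readers unfamiliar with the test, the statistic behind an independent samples t-test with equal variances assumed can be sketched in a few lines. This is the standard pooled-variance formula, offered as an illustration rather than a reproduction of the SPSS analysis; the function name and the MICAP hour values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for an independent samples t-test assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df


# Hypothetical MICAP hours (illustrative only, not the study's data).
hours_with_sdr = [2.0, 4.0, 6.0]
hours_without_sdr = [1.0, 2.0, 3.0]

t_stat, dof = pooled_t_statistic(hours_with_sdr, hours_without_sdr)
print(round(t_stat, 3), dof)  # 1.549 4
```

A positive t value here means the first group (shipments with an SDR) has the larger mean. The p-value would then come from a t distribution with the returned degrees of freedom, which the Python standard library does not provide.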

Discussion
Order fulfillment errors are of significant concern to private sector firms because of the proven impact these discrepancies can have on operational outcomes. Little research, however, has been done to assess the impact of supply discrepancies in the USAF, a domain in which such errors can have significant operational consequences. Thus, the purpose of this study was to determine the magnitude of the relationship between SDRs and aircraft readiness metrics.

Results of linear regressions found that of the three metrics tested, SDRs had the greatest impact on aircraft cannibalizations. They have a significant positive relationship with the number of cannibalizations across all three commands studied and Air Force-wide. Given this finding, further analysis of other high-SDR commands such as Air Force Materiel Command (25% of total) and the Air National Guard (13% of total) would potentially yield greater insight into this phenomenon. Despite modest R-square values found at the command-level analyses, these results taken collectively provide strong evidence that the occurrence of an SDR increases the likelihood that an aircraft part will be cannibalized from another aircraft. USAF leaders should be particularly concerned, given the adverse impacts known to result from cannibalizing aircraft parts, including increased workload demands for maintenance personnel, unintended mechanical side effects and potentially reduced aircraft availability (Curtin, 2001). The practice of cannibalizing parts has become so commonplace over the years that inoperable aircraft in some maintenance units, known colloquially as hangar queens, are reserved solely as a source of parts (Larson, 2002). These actions disrupt the supply chain for repair parts by effectively hiding actual demand, reducing the accuracy of forecasts and making it more difficult to plan for future demand. Thus, a reduction in SDRs could improve readiness by lessening the unnecessary and potentially damaging practice of cannibalization, freeing maintenance personnel to devote resources to other aircraft in need of repair and helping to ensure all demands are captured by the supply chain for future planning. The prevalence of cannibalizations, while concerning, may help explain the lack of impact SDRs were found to have on aircraft availability and NMCS rates.
Given the significant increase in MICAP hours on SDR shipments, it is possible that cannibalizations are performed to some extent as a means of mitigating the impact of longer customer wait times, thereby masking any meaningful impact on aircraft availability or NMCS rates. The USAF closely monitors aircraft readiness metrics and compares performance across flying units, and it is well documented that its commanders have historically felt pressured to use cannibalizations as a means to improve operational metrics (Curtin, 2001). There is evidence, however, that this practice may be on the decline as the complexity of modern weapon system design, including the use of artificial intelligence and machine learning, has reduced the interchangeability of common components (Johnson, 2018). Still, the impact of these advances is likely minimal as most aircraft still operate on decades-old platforms and will for the foreseeable future.
Although not initially considered, it is possible that some SDRs negatively impact NMCM rates without affecting NMCS rates. A possible example could be parts received with both exterior packaging damage that is documented using an SDR, and interior material damage that was undocumented until later discovered by maintenance personnel when the aircraft was in an NMCM status. Other examples include parts not adhering to specifications, or parts in otherwise unserviceable condition not immediately recognized upon receipt. Such discrepancies could increase NMCM rather than NMCS conditions if the discrepant parts were receipted for and subsequently issued to maintenance personnel before the errors were recognized. Further analysis of this relationship, controlling for extraneous and potentially confounding variables, will be required to validate the degree to which SDRs truly affect aircraft availability.
Results also indicate that MICAP hours tend to increase with SDRs, supporting the hypothesis that order cycle time on high-priority shipments will increase with order fulfillment errors. This finding is meaningful because it suggests that SDRs are not merely administrative discrepancies easily rectified post-shipment, but serious errors that directly impact mission effectiveness. More concerning still is the discovery that such errors, while infrequent, are more likely to occur on MICAP-initiated shipments. Taken together, these findings provide strong evidence that SDRs directly impact USAF operations, and action should be taken to reduce order fulfillment errors throughout the supply chain.

Implications
Although minor, the relationship between SDRs and adverse aircraft metrics was found to be significant in 4 of the 12 regression analyses, all related to cannibalizations. This provides modest evidence in support of the stated hypothesis that SDRs would negatively impact military readiness. Despite low R-square values, there is significant practical value because results are consistent across all samples, demonstrating a clear connection between SDRs and cannibalizations. In fact, the low R-square values are unsurprising, considering there are likely several variables that contribute to the variability in aircraft availability, NMCS rates and cannibalizations. The four significant regression equations and the finding of increased MICAP hours on discrepant shipments among all three assessed mission areas suggest that a reduction in SDRs could provide modest improvement in aircraft readiness throughout the USAF.
Given that order fulfillment performance metrics are not traditionally tracked at the functional level within DoD distribution centers, results of this study give reason for leaders to place greater scrutiny on order fulfillment quality within their organizations. A recent study conducted within a major DoD distribution center found that performance measurement and employee feedback initiatives resulted in a 35% reduction in supply discrepancies over a 17-week period (Weber, 2018). Employees who were provided feedback regarding previous supply discrepancies they had committed were significantly less likely to commit future discrepancies. Thus, if similar initiatives were implemented across more USAF suppliers, aircraft metrics may show measurable and significant improvement. From 2012 to 2016, the most prevalent SDR type by far was shortages (37%), more than three times the incidence of the next highest discrepancy type, "other" (11%). Therefore, greater emphasis on accurate order quantities from suppliers could be a starting point for leaders seeking to reduce the prevalence of SDRs and bolster aircraft readiness.

Suggestions for future research
While this paper establishes the link between SDRs and readiness, it does not investigate the "why" for SDRs. For example, other discrepancies such as transportation discrepancies and product quality discrepancies could be investigated to assess their individual impact on aircraft metrics and be linked to SDRs. Another avenue for future research is to investigate factors specific to certain locations that may moderate the impact of SDRs, such as geographic distance from suppliers and types of aircraft assigned. Even seemingly innocuous factors including differences in requisitioned parts and demand patterns could influence the effect SDRs have on readiness metrics.
Although the present study focused only on aggregate SDR totals, future research could investigate the effect of discrepancy types on aircraft readiness, such as shortages, wrong material and misdirected shipments. This would be useful in determining which SDR types are most detrimental because some types, such as overages, are unlikely to result in negative operational impact. Finally, the discovery that SDRs occur on MICAP-initiated shipments at a rate 30% higher compared to non-MICAP-initiated shipments warrants significant attention. MICAP incidences are directly tied to NMCS conditions, and any delay or error in order fulfillment will, by definition, negatively impact readiness. Thus, it is even more crucial these orders are fulfilled error-free. This disparity may be due in part to higher reporting of SDRs on MICAP-initiated shipments because of the importance of the shipment. It is unclear, however, to what extent SDRs are underreported, so this explanation cannot be offered without further investigation.

Conclusion
In an attempt to address the lack of studies on the operational impact of order fulfillment errors on aircraft readiness, this paper analyzes over 95,000 SDRs over a 10-year period in the USAF for their impact on operational metrics, including aircraft availability, NMCS hours, cannibalizations and MICAP hours. We find evidence that order fulfillment errors, in the form of SDRs, significantly impact USAF cannibalization rates across the ACC, AMC and AFGSC MAJCOMs, and result in higher MICAP hours across fighter, bomber and mobility mission areas. Although the effect sizes we found were statistically small, this study provides an important step forward in understanding a little-studied but highly impactful problem.