Predicting success in United States Air Force pilot training using machine learning techniques

doi:10.1016/j.seps.2021.101121

Socio-Economic Planning Sciences

Volume 79, February 2022, 101121

https://doi.org/10.1016/j.seps.2021.101121

Highlights

• This research examines specialized undergraduate pilot training candidate data.
• Machine learning techniques are utilized to obtain insights on candidate success.
• Results show tree-based machine learning techniques outperform neural networks.
• This work informs future selection criteria for pilot training candidates.

Abstract

The chronic pilot shortage that has plagued the United States Air Force over the past three years poses a national-level problem that senior military members are working to overcome. Unfortunately, not all pilot candidates successfully complete the necessary training requirements to become fully qualified Air Force pilots, which wastes critical time and resources and only further exacerbates the pilot shortage problem. Therefore, it is important for the Air Force to carefully consider whom they select to attend pilot training. This research examines historical specialized undergraduate pilot training (SUPT) candidate data leveraging a variety of machine learning techniques to obtain insights on candidate success. Computational experimentation is performed to determine how selected machine learning techniques and their respective hyperparameters affect solution quality. Results reveal that the extremely randomized tree machine learning technique can achieve nearly 94% accuracy in predicting candidate success. Additional analysis indicates degree type and commissioning source are the most important features in determining candidate success. Ultimately, this research can inform the modification of future SUPT candidate selection criteria and other related Air Force personnel policies.

Introduction

United States Air Force (USAF) senior leaders have repeatedly expressed concerns about the service's pilot shortage. In 2017, then Secretary of the Air Force (SECAF), Heather Wilson, disclosed that the military branch was 2,000 pilots below sustainable manning levels. She further stated that this shortfall was poised to “break the force” [1]. The situation has not improved in the intervening years; if anything, it has grown more dire. As of early 2020, USAF senior leaders reported the shortfall had increased to 2,100 pilots [2]. This research seeks to mitigate this shortfall by reducing pilot candidate attrition during training. The goal is to determine what, if any, factors make a pilot candidate more likely to successfully complete their training, and ultimately to use this knowledge to influence future pilot selection processes.

The genesis of the current pilot shortfall can, in part, be traced back to the sequestration-driven federal budgets of 2013–2015. Unpredictable funding levels prohibited implementation of a forward-looking personnel policy and led to the limited contemporary population of junior pilots. The USAF depends on younger pilots to replace senior pilots. This allows senior pilots to serve in administrative staff positions wherein they gain headquarters (i.e., corporate-level) experience and are generally provided a reprieve from frequent deployments. The contemporary pilot shortfall has instead forced senior pilots to remain in the cockpits, reduced the pilot presence on headquarters' staffs, and increased fears that large numbers of senior pilots will leave the service due to the fatigue caused by their operational commitments. This has caused a ripple effect throughout the USAF, disrupting the operations of other career fields as they surge to fill the manning void left on headquarters’ staffs.

Correcting the pilot shortage requires a multi-faceted approach. In 2020, then United States Air Force Chief of Staff (CSAF) General David Goldfein acknowledged that there is no “silver bullet” that will solve the problem [2]. Concentrated efforts on several fronts are necessary to ameliorate the USAF pilot shortage: increasing recruiting, reducing training attrition, improving training timeline efficiency, and enhancing retention. The USAF has already initiated efforts on many of these fronts. Air Combat Command is revamping fighter pilot training in an attempt to streamline the current 40-month pipeline [3]. The Air Force Personnel Center has offered retention bonuses to senior pilots in exchange for a multi-year service commitment [4]. The United States Air Force Academy has sent an increasing number of its graduates into the pilot training pipeline [5]. Notably, Air Education and Training Command has made a concentrated effort to improve Specialized Undergraduate Pilot Training (SUPT), the initial skills training for pilots in the USAF [6]. However, efforts to reduce training attrition without sacrificing quality standards are less well-developed. Such is the focus of this study.

Given the extensive time and monetary commitments required to train a pilot [7], a candidate failure during training represents a significant loss to the USAF and is counterproductive in the attempt to reduce the current pilot shortfall. Therefore, in this study we examine how modern machine learning techniques may be used to build a classifier that helps to predict candidate success in SUPT. By doing so, we present a method to reduce SUPT attrition as a component of the larger endeavor to reduce the USAF pilot shortfall.

In adopting this perspective, our research extends the application of machine learning in human resource and educational tasks to the military environment. In the civilian sector, machine learning has been applied to hiring practices [8], job performance prediction [9], employee retention [10], and recruitment [11]; Refs. [12,13] provide thorough literature reviews in this regard. As is the case in numerous machine learning tasks, neural networks take a prominent role in human resource applications (e.g., Refs. [14,15]). Other modern techniques such as deep learning [16] and generative adversarial networks [17] are also beginning to take hold.

Educational data mining (EDM) is another closely related field that leverages machine learning on educational data to improve pedagogical practice [18,19]. It is broadly defined and encompasses analysis that empowers the decision making of myriad actors (e.g., teachers, course developers, and administrators). Relevant applications include the design of e-learning education systems [20,21], the analysis of pedagogical strategies [22], student performance prediction [23,24], and personalized feedback dissemination [25]. Ref. [26] reviewed contemporary EDM applications, finding that most machine learning algorithms can be leveraged; additionally, the authors provide a summary of how each algorithm is specifically employed. For further information on this field of study, we refer an interested reader to the comprehensive surveys in Refs. [27,28].

Despite these advancements, machine learning for human resource management or pedagogical applications in the military is much less developed; a limited number of examples exist in these areas leveraging operations research techniques of any variety (e.g., Refs. [29–32]). The relative dearth of research is noteworthy. Although related, military personnel and training systems are significantly different from their civilian counterparts; they are insular systems wherein all training and promotion occur endogenously.

Nevertheless, the DoD is increasingly looking to artificial intelligence and machine learning to increase the efficacy and efficiency of its processes [33]. With specific regard to recruiting, practitioners have recently applied machine learning to identify promising candidates for special operations forces [34]. Our research illustrates how such an approach can be extended to USAF pilot selection, another resource- and time-intensive training pipeline. The methods utilized herein are not necessarily meant to supplant human expertise, but rather to supplement it. That is, the machine learning algorithms examined provide decision support to humans tasked with selecting the USAF pilot candidates that have the highest probability of successfully graduating. This research also illustrates how insights from the models developed herein can be used for alternative military personnel tasks (e.g., training pipeline monitoring and human resource allocation). In so doing, our research takes initial steps towards adapting the varied insights drawn from machine learning in civilian human resource management and education to the military setting.

The remaining sections of this manuscript are structured as follows. In Section 2, we provide an overview of the USAF pilot training pipeline, the empirical basis for current application testing, and an overview of recent SUPT attrition statistics. In Section 3, we provide an empirical exploration of numerous machine learning techniques to identify which subset are most promising for the development of an effective classifier. In Section 4, we illustrate the efficacy of each technique, provide experimental results relating to hyperparameter selection, and determine the relative importance of the data features. Notably, we demonstrate counter-intuitive behavior regarding the superior performance of relatively simpler techniques over others generally considered more capable. In Section 5, we discuss the relevance of our quantitative findings to USAF senior policymakers, propose a composite pilot selection index, and enumerate potential policy recommendations. Finally, in Section 6, we provide closing remarks and directions of future inquiry.

Section snippets

Background

This section outlines the USAF pilot training process, discusses foundational research related to current application testing, and explores recent SUPT attrition statistics.

Distinguishing between classification algorithms

Classification methods applied to our setting should accommodate multiple qualitative constraints. Given the recent emphasis in the DoD on explainable AI [50], any machine learning method utilized herein must be readily interpretable; if not, the developed classification method risks being dismissed out of hand by policymakers. Furthermore, the classifier must exhibit a high degree of accuracy while effectively managing disproportionality in the data with respect to the number of successful SUPT
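To illustrate why class disproportion matters when judging a classifier, the following minimal sketch compares plain accuracy with balanced accuracy. The 90/10 split between graduates and non-graduates is a hypothetical figure chosen for illustration and is not taken from the SUPT data; the function names are likewise illustrative, and the sketch assumes both classes are present in the labels.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of all candidates classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; assumes both classes occur in y_true."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)  # recall on graduates (label 1)
    specificity = tn / (tn + fp)  # recall on non-graduates (label 0)
    return (sensitivity + specificity) / 2


# Hypothetical cohort: 90 graduates (1) and 10 non-graduates (0).
y_true = [1] * 90 + [0] * 10
# A trivial "classifier" that predicts success for every candidate.
y_always_pass = [1] * 100

acc = accuracy(y_true, y_always_pass)
bal = balanced_accuracy(y_true, y_always_pass)
print(f"accuracy={acc:.2f}, balanced accuracy={bal:.2f}")
# prints "accuracy=0.90, balanced accuracy=0.50"
```

Under these assumed numbers, a model that never identifies a single failure still scores 90% accuracy, which is why a metric that weights both classes equally is a useful companion measure on imbalanced data.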

Hyperparameter tuning effects and feature importance

In this section, we design and conduct computational experiments to examine how different algorithms and parameter settings impact performance measures for predicting SUPT candidate success. Experiments and analyses are conducted utilizing scikit-learn (Version 0.23.2) within the Python modeling environment [56], R (Version 4.0.2), and MATLAB R2020a on a Lenovo ThinkPad equipped with a 2.60 GHz Intel i7-9850H processor and 64 GB of RAM.

We conduct multiple mixed-factorial experiments on the
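As a rough sketch of how a factorial experiment over hyperparameters might be laid out, the snippet below enumerates every combination of factor levels. It assumes that "mixed-factorial" refers to factors with differing numbers of levels; the factor names and levels are hypothetical placeholders, since the snippet above does not list the settings actually tested.

```python
from itertools import product

# Hypothetical factors and levels (assumptions, not the paper's settings).
factors = {
    "n_estimators": [100, 500, 1000],  # 3 levels
    "max_depth": [5, 10, None],        # 3 levels (None = unlimited depth)
    "criterion": ["gini", "entropy"],  # 2 levels (categorical)
}


def full_factorial(factors):
    """Return one dict per design point, covering every level combination."""
    names = list(factors)
    level_lists = (factors[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


design = full_factorial(factors)
print(len(design))  # 3 * 3 * 2 = 18 design points
print(design[0])
```

Each design point is a keyword dictionary, so it could be passed to a classifier constructor that accepts those parameter names (for example, `ExtraTreesClassifier(**point)` in scikit-learn) and scored with cross-validation to fill in the response column of the experiment.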

USAF policymaking discussion

In order to inform real-world processes, herein we discuss the relevance of the previous section's analysis to USAF policymaking decisions. Furthermore, we propose a novel composite pilot selection index and provide a list of recommendations for USAF decision makers to consider for future SUPT candidate selections.

Conclusions

This research examines specialized undergraduate pilot training candidate data for the United States Air Force. The intent of this study is to determine which machine learning techniques can be utilized to accurately predict SUPT candidate success, and which features of the data are the most important delineators. Utilizing Python's scikit-learn module, a variety of machine learning techniques are explored. Computational experimentation is performed to determine how selected machine learning

Disclaimer

The views expressed in this article are those of the authors and do not reflect the official policy or position of the United States Air Force, United States Department of Defense, or United States Government.

Author statement

Phillip R. Jenkins: Conceptualization, Methodology, Software, Formal Analysis, Investigation, Data Curation, Visualization, Writing-Original draft preparation. William N. Caballero: Conceptualization, Methodology, Software, Formal Analysis, Investigation, Data Curation, Visualization, Writing-Original draft preparation. Raymond R. Hill: Writing-Reviewing and Editing, Validation, Resources.

Acknowledgments

This research is partially supported by the Air Force Office of Scientific Research (AFOSR) under the Dynamic Data and Information Processing (DDIP) portfolio. Likewise, the authors thank the editorial board and anonymous reviewers for their constructive comments that helped improve the content and presentation of this paper.

References (69)

C.F. Chien et al. Data mining to improve personnel selection and enhance human capital: a case study in high-technology industry. Expert Syst Appl (2008)
M.A. Valle et al. Job performance prediction in a call center using a naive Bayes classifier. Expert Syst Appl (2012)
D. Pessach et al. Employees recruitment: a prescriptive analytics approach via machine learning and mathematical programming. Decis Support Syst (2020)
S. Strohmeier et al. Domain driven data mining in human resource management: a review of current research. Expert Syst Appl (2013)
R.S. Sexton et al. Employee turnover: a neural network solution. Comput Oper Res (2005)
C. Romero et al. Educational data mining: a survey from 1995 to 2005. Expert Syst Appl (2007)
N.D. Bastian et al. Models and methods for workforce planning under uncertainty: optimizing US Army cyber branch readiness and manning. Omega (2020)
J.A. Schofield et al. Utilizing reliability modeling to analyze United States Air Force officer retention. Comput Ind Eng (2018)
B. Abbasi et al. Predicting solutions of large-scale optimization problems via machine learning: a case study in blood supply chain management. Comput Oper Res (2020)
G. Lesinski et al. Multi-objective evolutionary neural network to predict graduation success at the United States Military Academy. Procedia Computer Science (2018)
G. Lesinski et al. Application of an artificial neural network to predict graduation success at the United States Military Academy. Procedia Computer Science (2016)
S. Losey. Air Force leaders: ‘we’re going to break the force’
S. Losey. Air Force: no progress in closing pilot shortfall
S. Losey. Building a better fighter pilot: ACC seeks to slash training time
O. Pawlyk. The Air Force failed to close its pilot manning gap for the fifth year in a row
J. Svan. Air Force Academy sending more cadets to pilot training to stem shortage
O. Pawlyk. Here's how the Air Force hopes to train 1,500 new pilots a year
M.G. Mattock et al. The relative cost-effectiveness of retaining versus accessing Air Force pilots (2019)
Y. Zhao et al. Employee turnover prediction with machine learning: a reliable approach
H. Jantan et al. Intelligent techniques for decision support system in human resource management. Decis Support Syst (2010)
M.J. Somers. Application of two neural network paradigms to the study of voluntary employee turnover. J Appl Psychol (1999)
Y. Ma et al. A deep choice model for hiring outcome prediction in online labor markets. Int J Comput Commun Contr (2020)
D. Nemirovsky et al. Providing actionable feedback in hiring marketplaces using generative adversarial networks
C. Romero et al. Data mining in education. Wiley Interdisciplinary Reviews: Data Min Knowl Discov (2013)
R. Hübscher et al. Domain specific interactive data mining
M. Millecamp et al. DIY: learning analytics dashboards
S. Shen et al. Exploring induced pedagogical strategies through a Markov decision process framework: lessons learned. Journal of Educational Data Mining (2018)
A. Cano et al. Interpretable multiview early warning system adapted to underrepresented student populations. IEEE Transactions on Learning Technologies (2019)
V. Ramesh et al. Predicting student performance: a statistical and data mining approach. Int J Comput Appl (2013)
A. Pardo et al. Using learning analytics to scale the provision of personalised feedback. Br J Educ Technol (2019)
A.D. Kumar et al. Review on prediction algorithms in educational data mining. Int J Pure Appl Math (2018)
B. Bakhshinategh et al. Educational data mining applications and tasks: a survey of the last 10 years. Educ Inf Technol (2018)
C. Romero et al. Educational data mining and learning analytics: an updated survey. Wiley Interdisciplinary Reviews: Data Min Knowl Discov (2020)
N.C. Forrest et al. An Air Force pilot training recommendation system using advanced analytical methods. INFORMS Journal on Applied Analytics (2021)

Dr. Phillip R. Jenkins is an Assistant Professor of Operations Research in the Department of Operational Sciences at the Air Force Institute of Technology, Wright-Patterson Air Force Base, Ohio.

Dr. William N. Caballero is an Adjunct Assistant Professor of Operations Research in the Department of Operational Sciences at the Air Force Institute of Technology, Wright-Patterson Air Force Base, Ohio.

Dr. Raymond R. Hill is Professor of Operations Research in the Department of Operational Sciences at the Air Force Institute of Technology, Wright-Patterson Air Force Base, Ohio.

Published by Elsevier Ltd.
