Readiness Redefined, But Not Measured
Immediately after I completed my upgrade to F-15E multi-ship flight lead in 2017, my squadron deployed to the Middle East in support of Operation Inherent Resolve. A flight lead is trusted to command a four-ship of fighter jets, and as part of this upgrade training, I had spent the previous six months learning how to lead counter-air and contested air-to-ground missions against a peer threat. Put simply: I was trained to fight and defend against other fighter jets. However, when I deployed for combat, the focus was entirely on close air support, a mission set I had not recently practiced, against an enemy with no jets at all. My most recent training, built to fight a peer adversary, was not relevant.
Immediately following that deployment, I started the instructor upgrade, in which I taught the advanced flying skills emphasized in the flight lead upgrade, skills that I, once again, had not recently practiced during the six-month combat deployment. Twice in one year, the Air Force had me fly missions that were misaligned with my readiness. The Air Force’s leadership believed I was mission-ready because I had flown the “required” number of sorties each month to stay current. Yet, in each instance, the skills necessary for success had decayed at precisely the wrong time. And leadership had no way to determine or track that skill decay.
In this case, the Air Force fell short of answering Richard Betts’s three key questions: Ready for what? Ready for when? What needs to be ready? The Air Force could not determine what mission set I needed to be ready for: high-intensity conflict or uncontested close air support. It could not determine when I needed to be ready to fight each mission set: Should I have been training for close air support before the deployment instead of high-intensity counter-air? And it could not determine which skills needed to be upgraded and which could be allowed to decay in preparation for the deployment or the instructor upgrade.
The Air Force’s leadership understands that two decades of low-intensity air conflict in the Middle East have resulted in a force unprepared for major combat operations against a peer threat. In September 2023, Secretary Frank Kendall ordered a comprehensive review of readiness across the Air Force, stating that “every person and organization in the Department, starting today, needs to consider these questions: If asked to go to war today against a peer competitor, are we as ready as we could be? What can we change in each of our units and organizations to be more ready?”
The Air Force cannot currently measure the readiness of its individual pilots because it relies on one-size-fits-all training metrics. It asks squadron commanders to report readiness only for the squadron as a whole, and only on a subjective scale.
As a former fighter pilot who integrated analytics into fighter pilot training while on active duty, a current member of the Air National Guard, and a civilian working for a dual-use AI company, I have a personal and professional interest in this issue. My experience convinces me that the Air Force should implement the foundational technology necessary to determine pilot readiness based on individual performance, or what scholar Todd Harrison would describe as outcome-based readiness. While the Army has implemented individualized skill tracking in its virtual training, and the Navy has adopted individual sailor competency tracking across entire careers, the Air Force lags behind both of its sister services. To accurately determine the readiness of its pilots, the Air Force should create an Ascension-to-Retirement Competency Profile that tracks every pilot’s attainment of skills through training, and then the retention of those skills while serving in operational units. This would enable higher individual readiness levels, objective squadron readiness assessments, and more effective training.
The Air Force Cannot Measure Pilot Readiness
Two years ago, Air Force Chief of Staff Gen. Charles Brown and Marine Corps Commandant Gen. David Berger argued in these pages that the Department of Defense needed to redefine readiness: “The joint force requires a holistic, rigorous, and analytical framework to assess readiness properly.” However, their focus was on modernizing equipment, not training. In June 2023, Berger and then-Deputy Assistant Secretary of Defense for Force Readiness Kimberly Jackson proposed to redefine readiness at the strategic level by evaluating the holistic readiness of the joint force. They argued that “historically, the department has defined readiness by measuring personnel, training, equipment, and maintenance at the unit level to ensure each is ready for its assigned missions. This definition is useful but not entirely sufficient to fully understanding the military’s preparedness to execute its missions.”
While the Department of Defense attempts to evaluate readiness at the strategic level, and believes it can evaluate readiness at the unit level, the Air Force should additionally develop the capability to determine the readiness of its individual airmen. The Air Force cannot determine the readiness of its squadrons if it cannot determine the readiness of each pilot. Squadron leaders cannot decide who should fly which training and combat missions if they lack the foundational technology to track the readiness of each pilot. A fundamental axiom of analytics and AI is that “garbage in leads to garbage out,” and the metrics fed into the current training readiness model are inadequate: Squadron commanders do not have access to accurate, detailed, objective metrics on their pilots. Therefore, commanders report the readiness of their squadrons based on superficial metrics and an obsolete Ready Aircrew Program Tasking Memorandum, resulting in a subjective output for the most important military metric: warfighter readiness.
A Subjective Process
Each month, every squadron commander must report the readiness of the squadron in the Defense Readiness Reporting System, the centralized and authoritative web-based software that U.S. Code Title 10 requires all military branches to use to report unit-level readiness. The report is broken down into four parts, three of which are objective (personnel, equipment availability, and equipment readiness) and one of which is subjective (training). Personnel readiness is the ratio of available deployable personnel to all assigned personnel. Equipment availability is the ratio of the equipment in the squadron’s possession to the equipment it is authorized to possess. Equipment readiness is the ratio of mission-capable aircraft to the total number of authorized aircraft. These three metrics are objective numerical inputs, leading to objective numerical outputs. For example, an individual F-15E jet has 30-plus years of maintenance records that include scheduled inspections, depot maintenance, pilot debriefs, and jet modifications, all so that the Air Force can understand that specific jet down to its individual rivets.
Conversely, according to the Congressional Research Service’s The Fundamentals of Military Readiness, “The final assessed resource area — training — allows for the most subjectivity. Training readiness does not lend itself to quantifiable evaluation as easily as personnel and equipment readiness; it relies more heavily on the commander’s professional military judgment.” The readiness of its pilots, the Air Force’s most important resource, is thus the only resource area the service has opted not to quantify objectively. Training readiness is a commander’s subjective assessment of how well the entire unit performs certain mission essential tasks. The possible assessments are “trained” (T), “needs practice” (P), or “untrained” (U).
The Congressional Research Service report explains how this process works: “The methodology assigns a weight of 3 to each “T,” 2 to each “P,” and 1 to each “U.” These figures are summed and then divided by the product of 3 multiplied by the number of [mission essential tasks]. The resulting quotient is multiplied by 100 to produce a percentage, which is interpreted according to a published scale.” The four ratings (three objective, one subjective) are then combined into a 1-to-5 “C-scale” declaring how ready the squadron is to deploy (1 = ready, 5 = not prepared). This process is, quite obviously, not intuitive. In simple terms: The Defense Readiness Reporting System takes three high-quality quantitative metrics and one qualitative assessment, then attempts to combine them into a 1-to-5 scale for ease of use. This approach creates three problems. First, the methodology is statistically specious. Second, the data is subjective. And third, the reporting is not timely. As a result, the Air Force is not truly capturing readiness rates.
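To make the arithmetic concrete, here is a short worked example of that formula in Python, using hypothetical ratings for six mission essential tasks:

```python
# Worked example of the Congressional Research Service formula described above,
# using six hypothetical mission essential task (MET) ratings.
ratings = ["T", "P", "T", "U", "P", "T"]  # T = trained, P = needs practice, U = untrained

weights = {"T": 3, "P": 2, "U": 1}
score = sum(weights[r] for r in ratings)   # 3 + 2 + 3 + 1 + 2 + 3 = 14
max_score = 3 * len(ratings)               # 3 * 6 = 18
percentage = 100 * score / max_score       # ~77.8 percent
print(f"Training readiness: {percentage:.1f}%")
```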
Imprecise Reporting
The Defense Readiness Reporting System translates subjective assessments of skills into a ratio. Ratios assume equal spacing between numbers (50 percent is as far from 40 percent as it is from 60 percent) and a true zero. But in this case, “trained” and “needs practice” are not separated by a fixed mathematical distance (it is an ordinal scale), and “untrained” does not mean zero percent trained. Therefore, these metrics do not support a ratio. Worse, the Defense Readiness Reporting System recognizes this fallacy, but instead of correcting it, the report combines the three statistically valid ratios (personnel, equipment availability, and equipment readiness) with this fourth, invalid ratio into an ordinal scale to create the C-scale. This method degrades the quality of the data for the first three readiness ratios to match the qualitative training data. Put simply: The Defense Readiness Reporting System construes a subjective assessment as an authoritative objective metric.
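A small illustration shows the consequence. The underlying “true” proficiency values below are invented for demonstration; the point is that two squadrons in very different states can report an identical percentage:

```python
# The formula scores ordinal labels as if they were interval data.
labels = {"T": 3, "P": 2, "U": 1}
# Hypothetical underlying proficiencies (invented for illustration):
actual = {"T": 0.95, "P": 0.80, "U": 0.20}

squadron_a = ["P"] * 6                       # every task "needs practice"
squadron_b = ["T", "T", "T", "U", "U", "U"]  # half mastered, half untrained

for name, squadron in [("A", squadron_a), ("B", squadron_b)]:
    reported = 100 * sum(labels[r] for r in squadron) / (3 * len(squadron))
    true_avg = sum(actual[r] for r in squadron) / len(squadron)
    print(f"Squadron {name}: reported {reported:.0f}%, underlying proficiency {true_avg:.0%}")

# Both squadrons report 67 percent, yet one averages 80 percent proficiency
# and the other under 60 percent.
```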
Second, while the personnel, equipment availability, and equipment readiness ratings are all based on objective data, training is a subjective judgment made by the commander. This is because the Ready Aircrew Program Tasking Memorandum and aircrew tracking systems were designed as a one-size-fits-all solution to continuation training. The memorandum states that each inexperienced pilot must fly nine sorties and conduct three simulator events per month (with certain exceptions), along with currency requirements (e.g., the pilot must drop a simulated weapon every 60 days). After each sortie, the pilot fills out a training accomplishment report stating what the pilot accomplished on that sortie.
When filling out the Defense Readiness Reporting System report, the squadron commander uses those training accomplishment reports to see how many of the squadron’s pilots flew the required number of sorties that month, and how many did not. The commander can also see how many sorties of a specific mission set (for example, defensive counter-air) the squadron flew, and make a judgment call on the squadron’s capability. It is even possible for the commander to view which days specific pilots flew specific missions. As a result, the commander could decide to declare, on a per-pilot basis, who is mission-ready for each mission. However, unless a commander has intimate knowledge of every skill of every pilot, these determinations are, at best, a guess. The commander does not currently have access to the tools required to make either better subjective determinations of proficiency or entirely objective ones. With updated tools, commanders would gain insight into the outcome-based readiness of the squadron’s individual pilots, which is crucial to understanding whether the force is prepared to fight.
The third problem with the current system is that the commander fills out this report only once a month. A hypothetical example illustrates the problem: If Capt. Stephens flew defensive counter-air on Oct. 3, is he still mission-ready a month later? Does that subjective assessment change if the pilot in question is a Weapons School graduate rather than an inexperienced wingman? How does the commander compare the readiness of these two pilots? Currently, the Air Force’s only acknowledgment of skill decay is the basic recognition that fighter pilot skills do decay, and that they decay faster for inexperienced pilots than for experienced ones; inexperienced pilots are therefore required to fly one more sortie per month than experienced pilots. Because of these limitations, the commander has no option but to assert the readiness of his squadron as a whole, subjectively, on just one day of each month. The readiness of individual pilots, and therefore of the squadron as a whole, fluctuates daily, but the squadron can be called to war on any day, so the commander needs real-time insight into his pilots’ ability to execute every assigned mission essential task. Before advanced analytics and AI, this manner of declaring readiness was defensible, but now we can do better.
How to Fix It
The Air Force recognized the ability of analytics to assist in maintaining its aircraft and therefore wholeheartedly embraced predictive maintenance to increase equipment readiness. To revolutionize training readiness, the Air Force should move ahead with predictive training.
The Army, via the Synthetic Training Environment Experimental Learning for Readiness program, has proven that predicting the decay of individual skills for operational personnel is possible and useful. Likewise, the goal of the Navy’s Surface Warfare Combat Training Continuum program is to track the combat skills of operational sailors throughout their careers. In each case, the program objectively measures the readiness of individual warfighters, with the goal of doing so over years of training.
The Air Force should take the lessons learned from the Army and Navy. Specifically, once a pilot becomes mission-ready, artificial intelligence can use the pilot’s past performance to predict how quickly their knowledge and skills will decay, and then recommend how often the pilot should train and which knowledge and skills the pilot should practice during each training event. At the highest level, some inexperienced fighter pilots will require more than nine sorties per month, while others will require fewer. But the real power of an Ascension-to-Retirement Competency Profile is that squadrons will know exactly which skills have decayed for each pilot to the point that the pilot would be unable to achieve the mission outcomes necessary for victory. For instance, Capt. Stephens may not be ready for a defensive counter-air mission because his flight leadership skills have deteriorated, while Lt. Smith may not be ready for the same mission set because her defensive timeline knowledge has decayed. The squadron can then schedule those pilots for the appropriate training. Throughout the pilot’s career, these skill-decay curves will shift as the pilot becomes more proficient overall, and individual curves will diverge as the pilot masters some skills faster than others.
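To make the idea concrete, here is a minimal sketch of one common way to model skill decay: an exponential forgetting curve with a per-pilot, per-skill decay rate. The rates and the 80 percent proficiency threshold below are hypothetical, and none of this reflects an actual Air Force, Army, or Navy model:

```python
import math

def skill_level(days_since_practice: float, decay_rate: float) -> float:
    """Proficiency retained after a layoff, on a 0.0-1.0 scale."""
    return math.exp(-decay_rate * days_since_practice)

def days_until_below(threshold: float, decay_rate: float) -> float:
    """Days after the last practice event until proficiency drops below threshold."""
    return -math.log(threshold) / decay_rate

# Hypothetical per-pilot, per-skill decay rates learned from past performance:
decay_rates = {
    ("Capt. Stephens", "flight leadership"): 0.020,  # decays relatively quickly
    ("Lt. Smith", "defensive timeline"): 0.012,
}

for (pilot, skill), rate in decay_rates.items():
    deadline = days_until_below(0.80, rate)  # retrain before proficiency hits 80%
    print(f"{pilot}: practice '{skill}' within {deadline:.0f} days of last event")
```

A real system would fit these rates from graded sorties and simulator events rather than hard-coding them, and would update them as new observations arrive.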
This per-mission-set skill decay will become even more powerful when connected to the 19th Air Force Pilot Training Transformation initiative, which tracks skill growth in every competency from the student’s first day in undergraduate pilot training to the student’s final flight in fighter/bomber fundamentals. By combining the skill-growth curve for every competency in a pilot’s undergraduate training with the student’s performance in his assigned aircraft in the formal training unit and mission qualification training, AI will better plot and predict the decay of every skill and mission set at the individual level. These metrics will be continuous, so commanders will have a complete view of the readiness of every pilot in every mission set on every day. With individualized readiness training, pilots can maintain a level of readiness in alignment with AFFORGEN, the Air Force Force Generation model, which places squadrons in a 24-month deployment cycle: Readiness requirements can be increased as the squadron nears its “available to commit” phase and decreased in its “reset” phase. Commanders can determine the level of readiness they require from each assigned pilot depending on which phase of the cycle the squadron is in and the mission sets the squadron expects to fly in that phase. Squadrons will be able to individualize training and track readiness based on their needs, and the needs of the Air Force.
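As an illustration only, a career-spanning competency profile could be as simple as an append-only record of skill observations from every phase of training and operations. All names and fields here are hypothetical, not an actual Air Force data model:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class SkillEvent:
    """One observation of one skill: a graded sortie, simulator, or evaluation."""
    skill: str        # e.g., "defensive timeline"
    event_date: date
    score: float      # 0.0-1.0, from an instructor grade or automated measure
    source: str       # e.g., "UPT", "FTU", "MQT", "continuation training"

@dataclass
class CompetencyProfile:
    """Career-long, append-only record of a pilot's skill observations."""
    pilot_id: str
    events: list[SkillEvent] = field(default_factory=list)

    def history(self, skill: str) -> list[SkillEvent]:
        """All observations of one skill, oldest first, for curve fitting."""
        return sorted((e for e in self.events if e.skill == skill),
                      key=lambda e: e.event_date)
```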
The first iteration of this process will still involve mostly subjective data: The training data remains reliant on the instructor’s judgment of a student’s performance, and the operational data depends on what a pilot records after a flight. However, creating the tool to collect, analyze, and report on that data would already dramatically improve current training readiness reporting. In the near term, novel approaches to collecting and evaluating pilot performance via telemetric data, including from the Air Force’s Quick Reaction Instrument Package, will allow automated evaluators to determine pilot performance continuously and objectively. By combining automated evaluation with career-long competency tracking, the Air Force can reach Harrison’s vision of outcome-based readiness metrics with a feedback loop.
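That feedback loop can be simple in principle: each new objective observation refines the estimate of how fast a given skill decays. A minimal sketch, assuming the exponential model above and hypothetical telemetry-derived scores:

```python
import math

def fit_decay_rate(observations: list[tuple[float, float]]) -> float:
    """Least-squares fit of k in score = exp(-k * days), regressed through
    the origin, from (days_since_practice, score) pairs."""
    numerator = sum(days * -math.log(score) for days, score in observations)
    denominator = sum(days * days for days, _ in observations)
    return numerator / denominator

# Hypothetical automated evaluations at 10, 20, and 40 days since last practice:
observations = [(10, 0.85), (20, 0.72), (40, 0.55)]
print(f"Estimated decay rate: {fit_decay_rate(observations):.4f} per day")
```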
When the Air Force tracks individualized skill growth and decay curves over the course of a career, it can unlock the drivers of pilot performance. Discovering which students gain knowledge and skills most quickly, and which operational pilots increase their skills and flatten their decay curves the fastest, will allow the Air Force to determine what led to success and failure. In the F-15E formal training unit, we barely scratched the surface of this when we linked failures in the unit back to specific tasks early in the same students’ undergraduate pilot training, and then pushed those insights back to undergraduate pilot training to influence syllabus changes at the beginning of a pilot’s career. If officers in the Air Force had access to the data and resources described above, they could discover numerous actionable insights into pilot success, resulting in individualized training and predictive ascension testing.
Ready Pilots Win Air Wars
Individuals determine the outcome of combat, and individual readiness shapes the likelihood of that outcome. Yet the Air Force’s method of assessing the readiness of its pilots is flawed, which reduces the probability of victory in the next air war. To be prepared for major combat operations against a peer threat, the Air Force needs to understand the readiness of its pilots continuously, and at a more detailed level than is currently possible.
The Air Force needs to create an Ascension-to-Retirement Competency Profile for every pilot, connecting the pilot’s skill growth in undergraduate pilot training and the formal training unit to the pilot’s skill retention in operations, for every skill. Unit leaders need this to make accurate training and combat manning decisions, instructors need this to individualize training, and Air Force leaders need this to understand the readiness of their force. Furthermore, future AI and analytic breakthroughs in measuring and predicting pilot performance will only be actionable if this foundational technology is already in place. To increase readiness immediately, and to prepare for the future, the Air Force should create the analytic backbone to track training readiness continuously for every pilot.
Matthew Ross is a former F-15E evaluator pilot with over 1,500 flight hours, including over 350 combat hours, and a current member of the Air National Guard. As an instructor in the F-15E formal training unit, he implemented AI-based human performance processes into fighter pilot training. He is currently the Director of Government Solutions at Eduworks.
The views and opinions expressed in this article are those of the author and do not necessarily reflect the views or positions of the U.S. Air Force or the Eduworks Corporation.
Image: Mark Hybers