Levels of Evaluation and Implications for Assessing Training
One widely regarded approach to assessing and evaluating training on multiple criterion levels is Kirkpatrick's framework for Training Evaluation (Kirkpatrick, 1959, 1994), which provides a comprehensive and systematic structure for aligning assessment with desired outcomes. The Secure, Live, Virtual, and Constructive (LVC) Advanced Technology Demonstration (SLATE ATD) offered an opportunity to gather data on pilots' assessments of, and reactions to, the technology as a novel readiness construct. Given the aim to demonstrate, evaluate, analyze, and report the current technology readiness levels of LVC enabling technologies, a benefit of level 1 (reactions) evaluation is its potential to pinpoint missing content areas. The Capstone event brought together 16 live aircraft, four virtual cockpits, and numerous live and virtual air-to-air and surface threats. We detail the information exchanged, including pilots' experiences of and reactions to specific events and, more broadly, to LVC capabilities as they apply to future blended training environments. This paper highlights training effectiveness concerns raised by introducing LVC, such as hardware displays, software libraries, content authoring, and brief/debrief capabilities. We also consider proposed attributes such as LVC fidelity and its effects on training quality and realism (e.g., the level of detail displays can produce). The criterion space we examine here introduces proficiency-based training intended to provide equitable learning opportunities of sufficient scale and complexity for each pilot. Because the ATD was a technology demonstration rather than a training or training-research event, aircrew responses to the SLATE capabilities as demonstrated are viewed in light of developing tools to assess training system impact. General conclusions can be drawn that justify the construct-oriented approach to the reactions measure.
In support of the Kirkpatrick model, the process used to describe, explain, or otherwise hypothesize the causes of certain behaviors guides a variety of uses for the data. The logical system of inferences among these elements/dimensions (variables) and the primary outcomes of learning and behavior change makes complex relations transparent. Results included utility-type and affect-type reactions measures, providing data that inform decisions about the best conditions for learning. The convergence or divergence of desired outcomes and available resources clarifies the requirements for specialized product development efforts. Defining relevant criteria facilitates discussion of how to mitigate certain risks and sheds light on the observed and potential training value of the capabilities demonstrated. The ability to detect systematic trends or effects is central to the strategic success of technology integration in operational contexts. Expanding knowledge through experience, and revealing the actual conditions or stressors that exist, endorses the selected design approach and provides a convincing argument for the appropriateness of level 1 evaluation. In scoping the design, the balancing and synchronization of outcomes (trade-off decisions) demonstrate evidence of new learning. Lessons learned and opportunities are uncovered within our instrumentation. Pairing the Kirkpatrick Training Evaluation Model with new conceptualizations of the training criterion space puts renewed focus on enhancing conventional training outcomes.