A Case Study with Deloitte

Research Analytics Consulting • February 1, 2025

Challenge

Deloitte provides continuing education, with the requisite CPE credit, to its professionals. Its programs are all accredited by the AACSB and involve large-scale, high-stakes assessments. As such, the item and test analysis must be comprehensive, precise, and psychometrically rigorous. Additionally, annual reports must be provided for continuous quality improvement of the assessments associated with courses. Previous psychometric analysis for these reports used only Classical Test Theory (CTT) approaches. Although high reliability is desired for consistent measurement of learner knowledge and ability, relying on CTT alone can produce unstable statistical estimates, because its statistics are sample and test dependent. Modern test theory using item response theory (IRT) is a more appropriate and rigorous approach.
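To make the sample dependence concrete: the two workhorse CTT item statistics, difficulty (the proportion of examinees answering correctly) and discrimination (the corrected point-biserial correlation between an item and the rest of the test), are computed directly from one sample's responses, so they shift whenever the sample or test form changes. A minimal Python sketch of those two statistics; the response matrix below is hypothetical, not Deloitte data:

```python
from math import sqrt

def ctt_item_stats(responses):
    """Classical Test Theory item statistics for 0/1-scored responses.

    responses: list of rows, one per examinee, each a list of 0/1 item scores.
    Returns (difficulty, discrimination): difficulty is the proportion
    correct per item; discrimination is the corrected point-biserial
    correlation of each item with the total score excluding that item.
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    difficulty = [sum(row[j] for row in responses) / n for j in range(k)]

    def corr(xs, ys):
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs)
        vy = sum((y - my) ** 2 for y in ys)
        return cov / sqrt(vx * vy) if vx and vy else 0.0

    discrimination = []
    for j in range(k):
        item = [row[j] for row in responses]
        rest = [t - x for t, x in zip(totals, item)]  # total without item j
        discrimination.append(corr(item, rest))
    return difficulty, discrimination

# Hypothetical responses: 6 examinees x 3 items
resp = [[1, 1, 0],
        [1, 0, 0],
        [1, 1, 1],
        [0, 0, 0],
        [1, 1, 1],
        [1, 0, 1]]
p, rpb = ctt_item_stats(resp)
```

Running the same items through two different examinee samples will generally yield different difficulty and discrimination values, which is the instability that motivates supplementing CTT with IRT.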

Solution

In this project we have evaluated over 100 learning assessments used to measure the outcomes of continuing education courses for accounting professionals, applying both CTT and IRT to optimize those assessments. This allows us to quantify item performance on each final assessment using psychometric metrics. We have helped identify opportunities for improvement while ensuring compliance with ongoing board certification requirements. These quantitative analyses are followed by expert qualitative analyses that recommend next steps toward more reliable and valid measurement of knowledge of the learning objectives.

Impact

The quantitative analyses identify problematic items that can be revised to measure learner ability more validly with respect to the intended learning objectives. Using the results of the qualitative analysis, item writers can further adapt future assessments to improve reliability and better ascertain learner proficiency. Items flagged as problematic can also point to educational content that should be clarified for learners.

Methodology

We used IRT, CTT, differential item functioning (DIF) analyses, and modern data visualizations to quantify test item performance and identify opportunities for assessment improvement. This covered the full scope of each item, from the question stem to the response options, with visualization of the item characteristic curves to see how items performed across the range of examinee ability. Once items were flagged as problematic, a qualitative review was conducted, using a standardized approach to give item writers recommendations on how to write higher-quality items.
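For reference, the item characteristic curves visualized above follow directly from the IRT model. Under the two-parameter logistic (2PL) model, the probability of a correct response at ability theta is 1 / (1 + exp(-a * (theta - b))), where a is the item's discrimination and b its difficulty. A short Python sketch with hypothetical parameter values; in practice these would come from a mirt or FlexMIRT calibration:

```python
from math import exp

def icc_2pl(theta, a, b):
    """2PL item characteristic curve: P(correct | ability theta)
    for an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

# Hypothetical calibrated parameters (a, b) for two items
items = {
    "well_targeted": (1.6, 0.0),   # steep curve centered at average ability
    "weak_and_hard": (0.4, 1.5),   # flat curve shifted toward high ability
}

# Trace each curve over a coarse ability grid, as in an ICC plot
grid = [-3, -2, -1, 0, 1, 2, 3]
curves = {name: [icc_2pl(t, a, b) for t in grid]
          for name, (a, b) in items.items()}
```

Plotting these probabilities against theta reproduces the ICCs used in item review: a flat curve (low a) signals weak discrimination, and a curve centered far from the examinee ability range (extreme b) signals an item that is too easy or too hard for the pool.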

Results

This is an ongoing, multi-year project. In year one we evaluated the test and assessment process and wrote the initial base code so the results would be accurate and repeatable. In year two we repeated the analysis and worked to semi-automate the process because of the large number of assessments. We have also begun work on qualitative psychometric aspects of item writing, including rigorous evaluation of the assessment constructs in the context of the goals and objectives. This ensures the items measure what is intended and helps increase statistical validity and reliability. In the next several years we will continue automation using machine learning and artificial intelligence, and we will explore predictive validity to ensure the assessments are useful for important outcomes.

Tools Used:

Python for data extraction, cleaning, and pre-processing; R for statistical analysis (tidyverse, dplyr, CTT, difR, psych, car, flextable, knitr); FlexMIRT and R (mirt, ggmirt) for IRT analysis.
