Publication
Mitigating Bias in Item Retrieval for Enhancing Exam Assembly in Vocational Education Services
Alonso Palomino Garibay; Andreas Fischer; David Buschhüter; Roland Roller; Niels Pinkwart; Benjamin Paassen
In: Weizhu Chen; Yi Yang; Mohammad Kachuee; Xue-Yong Fu (eds.). Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track). Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), Albuquerque, New Mexico, Pages 183-193, ISBN 979-8-89176-194-0, Association for Computational Linguistics, 2025.
Abstract
In education, high-quality exams must cover broad specifications across diverse difficulty levels during the assembly and calibration of test items to effectively measure examinees' competence. However, balancing the trade-off of selecting relevant test items while fulfilling exam specifications without bias is challenging, particularly when manual item selection and exam assembly rely on a pre-validated item base. To address this limitation, we propose a new mixed-integer programming re-ranking approach to improve relevance while mitigating bias on an industry-grade exam assembly platform. We evaluate our approach by comparing it against nine bias mitigation re-ranking methods in 225 experiments on a real-world benchmark data set from vocational education services. Experimental results demonstrate a 17% relevance improvement with a 9% bias reduction when integrating sequential optimization techniques with improved contextual relevance augmentation and scoring using a large language model. Our approach bridges information retrieval and exam assembly, enhancing the human-in-the-loop exam assembly process while promoting unbiased exam design.
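
As a rough illustration of the kind of constrained re-ranking the abstract refers to, the sketch below selects a fixed number of test items by solving a small mixed-integer program with PuLP, maximizing relevance while capping how many items any one group (e.g., a difficulty level) may contribute. The relevance scores, group labels, exam length, and per-group cap are invented for the example and are not the paper's actual formulation or data.

```python
# Minimal MIP re-ranking sketch (assumed, not the paper's formulation):
# choose k items maximizing retrieval relevance, subject to an exam-length
# constraint and a per-group cap standing in for the bias/specification
# constraints described in the abstract.
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum, PULP_CBC_CMD

# Hypothetical candidate pool: (item_id, relevance score, group label)
candidates = [
    ("q1", 0.92, "easy"), ("q2", 0.88, "easy"), ("q3", 0.85, "easy"),
    ("q4", 0.80, "medium"), ("q5", 0.74, "medium"),
    ("q6", 0.70, "hard"), ("q7", 0.55, "hard"),
]
k = 4                # exam length
max_per_group = 2    # crude "fair share" cap per group

prob = LpProblem("mip_reranking_sketch", LpMaximize)
x = {i: LpVariable(f"x_{i}", cat=LpBinary) for i, _ in enumerate(candidates)}

# Objective: total relevance of the selected items.
prob += lpSum(score * x[i] for i, (_, score, _) in enumerate(candidates))

# Exam specification: select exactly k items.
prob += lpSum(x.values()) == k

# Bias mitigation: no single group may dominate the selection.
groups = {g for _, _, g in candidates}
for g in groups:
    prob += lpSum(x[i] for i, (_, _, gi) in enumerate(candidates) if gi == g) <= max_per_group

prob.solve(PULP_CBC_CMD(msg=False))
selected = [candidates[i][0] for i in x if x[i].value() == 1]
print("selected items:", selected)
```

With the toy data above, the solver picks the two highest-scoring items from each of the "easy" and "medium" groups; in practice the constraint set would encode the exam's full specification rather than a single per-group cap.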
