Publication
Revision Transformers: Instructing Language Models to Change Their Values
Felix Friedrich; Wolfgang Stammer; Patrick Schramowski; Kristian Kersting
In: Kobi Gal; Ann Nowé; Grzegorz J. Nalepa; Roy Fairstein; Roxana Radulescu (Eds.). ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland - Including 12th Conference on Prestigious Applications of Intelligent Systems (PAIS 2023). European Conference on Artificial Intelligence (ECAI), Pages 756-763, Frontiers in Artificial Intelligence and Applications, Vol. 372, IOS Press, 2023.
Abstract
Current transformer language models (LMs) are large-scale models with billions of parameters. They have been shown to provide high performance on a variety of tasks but are also prone to shortcut learning and bias. Addressing such incorrect model behavior via parameter adjustments is very costly. This is particularly problematic for updating dynamic concepts, such as moral values, which vary culturally or interpersonally. In this work, we question the current common practice of storing all information in the model parameters and propose the Revision Transformer (RiT) to facilitate easy model updating. The specific combination of a large-scale pre-trained LM that inherently but diffusely encodes world knowledge with a clearly structured revision engine makes it possible to update the model's knowledge with little effort and the help of user interaction. We exemplify RiT on a moral dataset and simulate user feedback, demonstrating strong performance in model revision even with small amounts of data. This way, users can easily tailor a model to their preferences, paving the way for more transparent AI models.
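As a rough illustration of the general idea, and not the authors' actual implementation, the following Python sketch pairs a frozen language model with a small, user-editable store of revisions that are retrieved by similarity and prepended to the prompt; the class names, the word-overlap retrieval, and the prompt format are assumptions made for illustration only.

```python
# Hypothetical sketch of a retrieval-based revision engine layered on a frozen LM.
# All names and the retrieval scheme are illustrative assumptions, not the paper's API.

from dataclasses import dataclass, field


@dataclass
class RevisionEngine:
    """Stores user-provided revisions (query -> corrected answer) outside the LM."""
    revisions: dict[str, str] = field(default_factory=dict)

    def add(self, query: str, corrected_answer: str) -> None:
        # User feedback is stored as plain data; no gradient update of the LM is needed.
        self.revisions[query.lower()] = corrected_answer

    def retrieve(self, query: str, threshold: float = 0.5) -> str | None:
        # Toy word-overlap similarity; a real system would use sentence embeddings.
        q = set(query.lower().split())
        best_score, best_answer = 0.0, None
        for stored_query, answer in self.revisions.items():
            s = set(stored_query.split())
            score = len(q & s) / max(len(q | s), 1)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= threshold else None


def generate_with_revision(lm_generate, engine: RevisionEngine, query: str) -> str:
    """Prepend a retrieved revision to the prompt so the frozen LM can follow it."""
    revision = engine.retrieve(query)
    if revision is None:
        return lm_generate(query)
    prompt = f"Guideline: {revision}\nQuestion: {query}\nAnswer:"
    return lm_generate(prompt)


if __name__ == "__main__":
    # Stub LM for demonstration; any text-generation model could be plugged in here.
    def toy_lm(prompt: str) -> str:
        return f"[LM output for: {prompt!r}]"

    engine = RevisionEngine()
    engine.add(
        "Is it okay to lie to protect someone?",
        "In this community, lying is acceptable only to prevent serious harm.",
    )
    print(generate_with_revision(toy_lm, engine, "Is it ok to lie to protect a friend?"))
```

The design choice this sketch is meant to highlight is that value-laden knowledge lives in an external, editable store rather than in the model weights, so revising the model amounts to adding or changing an entry instead of retraining.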
