Publication

The Effect of Error Rate in Artificially Generated Data for Automatic Preposition and Determiner Correction

Fraser Bowen; Jon Dehdari; Josef van Genabith
In: The Third Workshop on Noisy User-generated Text (W-NUT 2017) - Proceedings of the Workshop. Workshop on Noisy User-generated Text (NUT-2017), located at EMNLP 2017, September 7, Copenhagen, Denmark, Pages 68-76, ISBN 978-1-945626-94-4, Association for Computational Linguistics, 9/2017.

Abstract

In this research we investigate the impact of mismatches in the density and type of error between training and test data on a neural system correcting preposition and determiner errors. We use synthetically produced training data to control error density and type, and “real” error data for testing. Our results show it is possible to combine error types, although prepositions and determiners behave differently in terms of how much error should be artificially introduced into the training data in order to get the best results.
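
As a rough illustration of what controlling error density in synthetic training data can look like, the sketch below corrupts clean text by swapping prepositions and determiners at a chosen rate. It is not the authors' generation procedure: the function name `inject_errors`, the two word lists, and the restriction to substitution errors (no insertions or deletions) are all assumptions made for this example.

```python
import random

# Hypothetical confusion sets, chosen only for illustration.
PREPOSITIONS = ["in", "on", "at", "for", "to", "of", "with"]
DETERMINERS = ["a", "an", "the"]


def inject_errors(tokens, error_rate, rng,
                  confusion_sets=(PREPOSITIONS, DETERMINERS)):
    """Return a copy of `tokens` in which each preposition or determiner
    is replaced, with probability `error_rate`, by a different word from
    the same confusion set. Other tokens are left untouched."""
    noisy = []
    for tok in tokens:
        # Find the confusion set this token belongs to, if any.
        conf_set = next((s for s in confusion_sets if tok.lower() in s), None)
        if conf_set is not None and rng.random() < error_rate:
            # Substitute a different member of the same set.
            candidates = [w for w in conf_set if w != tok.lower()]
            noisy.append(rng.choice(candidates))
        else:
            noisy.append(tok)
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    clean = "the cat sat on a mat".split()
    for rate in (0.0, 0.5, 1.0):
        print(rate, " ".join(inject_errors(clean, rate, rng)))
```

Passing different confusion sets, or a different `error_rate` for each set, would be one way to vary error type and density independently, which is the kind of mismatch between training and test data that the abstract describes.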