Publication

JTaCo & SProUTomat: Automatic Evaluation and Testing of Multilingual Language Technology Resources and Components

Christian Bering, Ulrich Schäfer

In: Proceedings of the LREC-2006 Workshop on Quality Assurance and Quality Measurement for Language and Speech Resources. International Conference on Language Resources and Evaluation (LREC), Genoa, Italy, 2006.

Abstract

We describe JTaCo, a tool for automatic evaluation of language technology components against annotated corpora, and SProUTomat, a tool for building, testing and evaluating a complex general-purpose multilingual natural language text processor including its linguistic resources (lingware). The JTaCo tool can be used to define mappings between the markup of an annotated corpus and the markup produced by the natural language processor to be evaluated. JTaCo also generates detailed statistics and reports that help the user to inspect errors in the NLP output. SProUTomat embeds a batch version of JTaCo and runs it after compiling the complex NLP system and its multilingual resources. The resources are developed, maintained and extended in a distributed manner by multiple authors and projects, i.e., the source code stored in a version control system is modified frequently. The aim of JTaCo & SProUTomat is to warrant a high level of quality and overall stability of the system and its lingware resources.

qaqmlsr-sproutomat-final.pdf (PDF, 300 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence