Publication
MLT-DFKI at CLEF eHealth 2019: Multi-label Classification of ICD-10 Codes with BERT
Saadullah Amin; Günter Neumann; Katherine Dunfield; Anna Vechkaeva; Kathryn Annette Chapman; Morgan Kelly Wixted
In: CLEF 2019 Working Notes. Conference and Labs of the Evaluation Forum (CLEF-2019), 10th Conference and Labs of the Evaluation Forum, September 9-12, Lugano, Switzerland, CEUR-WS.org, 9/2019.
Abstract
With the adoption of electronic health record (EHR) systems,
hospitals and clinical institutes have access to large amounts
of heterogeneous patient data. Such data consists of structured
(insurance details, billing data, lab results etc.) and unstructured
(doctor notes, admission and discharge details, medication steps etc.)
documents, of which the latter is of great significance for applying natural
language processing (NLP) techniques. In parallel, recent advancements
in transfer learning for NLP have pushed the state of the art to new
limits on many language understanding tasks. Therefore, in this paper,
we present team DFKI-MLT's participation at CLEF eHealth 2019 Task 1 of
automatically assigning ICD-10 codes to non-technical summaries (NTSs)
of animal experiments, where we use various architectures in a multi-label
classification setting and demonstrate the effectiveness of transfer
learning with the pre-trained language representation model BERT
(Bidirectional Encoder Representations from Transformers) and its recent
variant BioBERT. We first translate task documents from German to
English using an automatic translation system and then use BioBERT, which
achieves an F1-micro of 73.02% on the submitted run, as
evaluated by the challenge.
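
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a micro-averaged F1 score is commonly computed in a multi-label setting. It is not the challenge's official evaluation script. The function name `micro_f1` and the example label sets are hypothetical and chosen only for illustration.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over documents with multiple labels each.

    gold, pred: lists of sets of labels, one set per document.
    Counts are pooled over all documents and labels before
    precision and recall are computed.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in gold
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical label sets for three documents (illustrative only).
gold = [{"II", "IX"}, {"IV"}, {"II"}]
pred = [{"II"}, {"IV", "IX"}, {"II"}]
score = micro_f1(gold, pred)
```

Because counts are pooled before averaging, frequent labels influence the micro-averaged score more than rare ones.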