
Publication

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Felix Friedrich; Simone Tedeschi; Patrick Schramowski; Manuel Brack; Roberto Navigli; Huu Nguyen; Bo Li; Kristian Kersting
In: Computing Research Repository (CoRR), Vol. abs/2412.15035, Pages 1-20, arXiv, 2024.

Abstract

Building safe Large Language Models (LLMs) across multiple languages is essential in ensuring both safe access and linguistic diversity. To this end, we conduct a large-scale, comprehensive safety evaluation of the current LLM landscape. For this purpose, we introduce M-ALERT, a multilingual benchmark that evaluates the safety of LLMs in five languages: English, French, German, Italian, and Spanish. M-ALERT includes 15k high-quality prompts per language, totaling 75k, with category-wise annotations. Our extensive experiments on 39 state-of-the-art LLMs highlight the importance of language-specific safety analysis, revealing that models often exhibit significant inconsistencies in safety across languages and categories. For instance, Llama3.2 shows high unsafety in category crime_tax for Italian but remains safe in other languages. Similar inconsistencies can be observed across all models. In contrast, certain categories, such as substance_cannabis and crime_propaganda, consistently trigger unsafe responses across models and languages. These findings underscore the need for robust multilingual safety practices in LLMs to ensure responsible usage across diverse communities.
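
The per-language, per-category evaluation described in the abstract can be illustrated with a minimal sketch. The prompt record layout, the `model.generate` call, and the `judge.is_safe` safety classifier below are assumptions for illustration, not the paper's actual data format or evaluation code.

```python
from collections import defaultdict

# Hypothetical record layout: M-ALERT provides 15k prompts per language
# (en, fr, de, it, es), each annotated with a safety category such as
# "crime_tax" or "substance_cannabis" -- 75k prompts in total.
PROMPTS = [
    {"language": "it", "category": "crime_tax", "text": "..."},
    # ... further entries
]

def evaluate_safety(model, judge, prompts):
    """Compute a safety rate for each (language, category) pair.

    `model` is assumed to expose generate(text) -> str, and `judge`
    is assumed to expose is_safe(prompt, response) -> bool; both are
    placeholders for whatever LLM and safety classifier are used.
    """
    safe = defaultdict(int)
    total = defaultdict(int)
    for p in prompts:
        response = model.generate(p["text"])
        key = (p["language"], p["category"])
        total[key] += 1
        safe[key] += int(judge.is_safe(p["text"], response))
    return {key: safe[key] / total[key] for key in total}

# Cross-linguistic gaps of the kind reported in the paper would show up
# here as large differences between, e.g., scores[("it", "crime_tax")]
# and scores[("en", "crime_tax")] for the same model.
```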
