Bias Measurement in Chat-optimized LLM Models for Spanish and English

Ligia Amparo Vergara Brunal, Diana Hristova, and Markus Schaal
This study develops and applies a method to evaluate social biases in large language models (LLMs) for both English and Spanish. The authors tested three state-of-the-art chat-optimized models on two datasets designed to expose stereotypical associations, comparing performance across languages and contexts.
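The paper's exact scoring pipeline is not reproduced here, but a minimal sketch of the kind of evaluation described above might look like the following. The `ask_model` function, the `BiasPrompt` fields, and the substring-matching heuristic are all illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class BiasPrompt:
    question: str            # question probing a stereotype
    language: str            # "en" or "es"
    stereotyped_answer: str  # the answer a biased model would choose
    unknown_answer: str      # the "cannot be determined" option

def evaluate(prompts: list[BiasPrompt],
             ask_model: Callable[[str], str]) -> dict[str, float]:
    """Per-language refusal rate and bias rate among answered prompts."""
    stats: dict[str, dict[str, int]] = {}
    for p in prompts:
        s = stats.setdefault(p.language, {"total": 0, "refused": 0, "biased": 0})
        s["total"] += 1
        reply = ask_model(p.question).lower()
        if p.unknown_answer.lower() in reply:
            s["refused"] += 1   # model declined to pick a person or group
        elif p.stereotyped_answer.lower() in reply:
            s["biased"] += 1    # model endorsed the stereotype
    results: dict[str, float] = {}
    for lang, s in stats.items():
        answered = s["total"] - s["refused"]
        results[f"{lang}_refusal_rate"] = s["refused"] / s["total"]
        results[f"{lang}_bias_rate"] = s["biased"] / answered if answered else 0.0
    return results
```

Under this framing, the study's two headline comparisons fall out directly: the refusal rate per language, and the bias rate among the prompts that were actually answered.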

Problem
As AI language models are increasingly used for critical decisions in areas such as healthcare and human resources, there is a risk that they could propagate harmful social biases. While bias in English-language AI has been studied extensively, there is a significant lack of research on how these biases manifest in other widely spoken languages, such as Spanish.

Outcome
- Models were generally worse at identifying and refusing to answer biased questions in Spanish than in English.
- However, when the models did answer a biased prompt, their responses were often fairer (less stereotypical) in Spanish.
- Models gave fairer answers when questions were direct and unambiguous rather than indirect or vague.
Keywords: LLM, bias, multilingual, Spanish, AI ethics, fairness