Large language models (LLMs), the technology behind chatbots, appear to “think” in English even when questions are asked in other languages, New Scientist reports, citing a study by researchers at the École Polytechnique Fédérale de Lausanne (EPFL). To find out which language LLMs actually use internally when processing queries, the team examined three versions of Meta's Llama 2 model. Because Llama 2 is open source, the researchers could inspect every step of query processing.
According to one of the researchers, they opened these models up and studied each of their layers. AI models consist of a stack of layers, each responsible for a specific stage of query processing: the input is first converted into tokens, and successive layers then build up context around each token to ultimately produce an answer.
The models were given three types of queries in Chinese, French, German and Russian. The first asked the model to repeat a given word, the second to translate from one non-English language to another, and the third to fill a one-word gap in a sentence, for example: “___ is used for sports such as football and basketball.”
By tracing the steps an LLM goes through to answer a query, the scientists found that the processing path through the layers almost always passes through what they call the English subspace. That is, if you ask a model to translate from Chinese to Russian, the query passes through the English subspace before the answer emerges in Russian, the scientist says, a strong sign that the models are using English to help themselves make sense of the query.
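The kind of layer-by-layer tracing described above resembles the “logit lens” technique: decoding each intermediate hidden state directly through the model's output vocabulary to see which tokens (and hence which language) it is closest to at every layer. The snippet below is a toy NumPy sketch of that idea only; the tiny vocabulary, random matrices and the `logit_lens` helper are all made up for illustration and are not the EPFL authors' code, which operated on the real hidden states of Llama 2.

```python
import numpy as np

# Toy "logit lens": decode each layer's hidden state through the
# unembedding matrix to see which vocabulary item it is closest to.
# All values here are random stand-ins; a real analysis would use the
# per-layer hidden states and output head of an open model like Llama 2.

rng = np.random.default_rng(0)

vocab = ["chat", "cat", "кот"]  # French, English, Russian words for "cat"
d_model, n_layers = 8, 4

# One unembedding vector per vocabulary item (the model's output head).
unembed = rng.normal(size=(len(vocab), d_model))

# Fake per-layer hidden states for a single token position.
hidden_states = rng.normal(size=(n_layers, d_model))

def logit_lens(hidden_states, unembed, vocab):
    """Return the top vocabulary item for each layer's hidden state."""
    logits = hidden_states @ unembed.T  # shape: (n_layers, vocab size)
    return [vocab[i] for i in logits.argmax(axis=1)]

per_layer_tokens = logit_lens(hidden_states, unembed, vocab)
print(per_layer_tokens)
```

In the study's setting, applying a probe like this to a Chinese-to-Russian translation query is what reveals English-associated tokens dominating the intermediate layers before the final layers settle on Russian.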
This has raised concerns among scientists that using English as an intermediary when a model analyzes language risks exporting the resulting limitations in worldview to linguistically and culturally distinct regions.
“If English becomes the primary language in which systems process requests, we are likely to lose concepts and nuances that can only be appreciated in other languages,” says Carissa Véliz of the University of Oxford.
There are also more fundamental risks in encoding Anglocentric values into generative AI that is used around the world, says Aliya Bhatia of the Center for Democracy and Technology in Washington, DC. “If a model is used to generate text in a language it is not trained in, this may result in culturally irrelevant hallucinations, and if the model is used to make asylum decisions for a community that does not fit into the Anglocentric imagination of a society, the model may stand between the individual and access to safety,” she says.