Investigating Cultural Bias in Multilingual Pretrained Language Models
WANG Wenxuan, DAI Ruyi / February 2023 (244 Words, 2 Minutes)
Under the guidance of Mr. Wang Wenxuan, this project aims to investigate and quantify cultural bias in the responses of ChatGPT, a powerful language model that has been shown to respond differently depending on the cultural background of the user. The importance of this research lies in the growing ubiquity of language models in applications such as virtual assistants, automated chatbots, and content generation. As these models become increasingly influential in shaping human interaction and decision-making, it is crucial to understand and address potential biases arising from their training data or algorithms.
The methodology for this study involves selecting questionnaires from cultural, political, and other domains, covering a diverse range of cultures, languages, and regions. These questionnaires will be translated into different languages and adapted so that they preserve their original meaning while becoming more suitable for language models to answer. By submitting the adapted questions to the models and analyzing their responses, we aim to measure the extent of cultural bias in ChatGPT.
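To make this pipeline concrete, below is a minimal sketch of how a single questionnaire item might be submitted to ChatGPT in several languages. It is a sketch rather than our final implementation: it assumes the pre-1.0 `openai` Python client, and the example item and its translations are illustrative placeholders rather than items from our actual questionnaires.

```python
# Minimal sketch of the questionnaire pipeline (assumes the pre-1.0 `openai`
# Python client). The item below and its translations are illustrative only.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# One illustrative item, rendered in each target language so that the
# original meaning is preserved across versions.
ITEM_TRANSLATIONS = {
    "en": "Generally speaking, would you say that most people can be trusted?",
    "zh": "总的来说，您认为大多数人是可以信任的吗？",
    "es": "En términos generales, ¿diría usted que se puede confiar en la mayoría de las personas?",
}

def ask_chatgpt(question):
    """Submit one questionnaire item to ChatGPT and return its reply."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": question}],
        temperature=0,  # stable answers make cross-language differences easier to attribute
    )
    return response["choices"][0]["message"]["content"]

def collect_responses(item_translations):
    """Ask the same item in every language and collect the answers."""
    return {lang: ask_chatgpt(text) for lang, text in item_translations.items()}

if __name__ == "__main__":
    for lang, answer in collect_responses(ITEM_TRANSLATIONS).items():
        print(f"[{lang}] {answer}")
```

Setting the decoding temperature to 0 is a deliberate choice in this sketch: it suppresses sampling noise, so any divergence between the per-language replies is easier to attribute to the model itself rather than to randomness.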
After evaluating ChatGPT’s responses, the next step will be to test additional language models for potential cultural bias. We will also analyze the characteristics of their training data and the impact of their responses on users from different cultural backgrounds. Finally, we will investigate ways to mitigate cultural bias in language models, such as increasing language diversity, data diversity, or data scale.
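One concrete way to quantify the “extent of cultural bias” for any of these models is to compare the distributions of answers a model gives to the same item across languages. The sketch below assumes the free-text replies have already been mapped onto a 1–5 Likert scale; the helper names and the toy answer data are our own illustration, not part of an existing toolkit.

```python
# Sketch of one possible bias score, assuming each model reply has already
# been mapped to a 1-5 Likert answer. Higher divergence between languages
# suggests the model answers the same item differently across cultures.
from collections import Counter
from itertools import combinations
import numpy as np
from scipy.spatial.distance import jensenshannon

LIKERT = [1, 2, 3, 4, 5]

def answer_distribution(answers):
    """Empirical distribution of Likert answers for one (model, language) pair."""
    counts = Counter(answers)
    total = sum(counts.values())
    return np.array([counts.get(k, 0) / total for k in LIKERT])

def cross_language_divergence(answers_by_lang):
    """Mean pairwise Jensen-Shannon divergence between per-language distributions."""
    dists = [answer_distribution(a) for a in answers_by_lang.values()]
    pairs = combinations(dists, 2)
    # scipy returns the JS *distance* (a square root), so square it back.
    return float(np.mean([jensenshannon(p, q) ** 2 for p, q in pairs]))

# Toy usage with made-up answers: similar distributions score near 0,
# diverging ones score higher.
example = {
    "en": [5, 4, 5, 5, 4],
    "zh": [2, 1, 2, 3, 2],
    "es": [5, 5, 4, 4, 5],
}
print(cross_language_divergence(example))
```

The same score can be computed per model, which would let us rank ChatGPT and the additional models on a common scale before and after any mitigation step.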
The expected outcomes of this research include insights into the extent of cultural bias in ChatGPT and other language models, the potential impacts of such biases on users from diverse backgrounds, and a better understanding of how to mitigate them. The findings from this study will contribute to ongoing efforts to minimize cultural bias in language models and improve their accuracy, fairness, and inclusivity. Ultimately, this research aims to promote the development of AI systems that are more respectful of cultural diversity and better serve users from various backgrounds.
Regarding my previous research on adversarial attacks, I have shifted my focus due to the challenges of launching attacks on large-scale pre-trained models like GPT. However, I have gained substantial knowledge about adversarial attacks in NLP, which I have been summarizing at weekly group meetings (see below).