Karolina Stańczak

Mila, McGill University, Montreal, Canada.


I am a Postdoctoral Researcher at the Quebec AI Institute (Mila) and the McGill University School of Computer Science. I earned my PhD from the Department of Computer Science at the University of Copenhagen, where I was supervised by Isabelle Augenstein and co-supervised by Ryan Cotterell at ETH Zurich. My thesis, titled A Multilingual Perspective on Probing Gender Bias, received the SCIENCE Faculty's PhD Award for advancing innovative techniques to detect gender bias in both natural language and language models.

My research interests encompass interpretability, multilinguality, and safety of large language models, with a focus on developing responsible and transparent AI systems in diverse, multicultural contexts.

Before starting my PhD, I completed an MSc in Statistics at Humboldt University of Berlin. Before that, I obtained a Bachelor of Science in Economics, also at Humboldt University of Berlin.

Prior to starting my PhD, I also worked as a data science consultant at the Deloitte Analytics Institute.

You can find me on: Twitter, GitHub, LinkedIn.

news

Nov 11, 2024 I attended EMNLP 2024, where I presented three publications: Benchmarking Vision Language Models for Cultural Understanding, Social Bias Probing: Fairness Benchmarking for Language Models, and The Causal Influence of Grammatical Gender on Distributional Semantics.
Sep 20, 2024 I was honored to receive the SCIENCE Faculty's PhD Award from the University of Copenhagen.

selected publications

  1. EMNLP 2024
    Benchmarking Vision Language Models for Cultural Understanding
    Shravan Nayak, Kanishk Jain, Rabiul Awal, and 5 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024
  2. TACL
    The Causal Influence of Grammatical Gender on Distributional Semantics
    Karolina Stańczak, Kevin Du, Adina Williams, and 2 more authors
    Transactions of the Association for Computational Linguistics, Nov 2024
  3. NAACL 2022
    Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models
    Karolina Stańczak, Edoardo Ponti, Lucas Torroba Hennigen, and 2 more authors
    In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Jul 2022
  4. EMNLP 2024
    Social Bias Probing: Fairness Benchmarking for Language Models
    Marta Marchiori Manerba*, Karolina Stańczak*, Riccardo Guidotti, and 1 more author
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024