LLMs for Counterfactuals (EMNLP’24)


What minimal changes to this text would cause the text classifier to change its prediction?

Counterfactual texts, i.e. minimal changes to inputs that alter a model’s predictions, are an important technique in Explainable AI (XAI) for understanding model behaviour.


Bach and Paul evaluated how well open-source and closed-source LLMs (GPT-4, GPT-3.5, LLAMA-2, Mistral) generate counterfactual texts across several tasks (sentiment analysis, natural language inference, and hate speech detection).

Results in a nutshell:

LLMs excel at generating fluent counterfactuals, but often make excessive changes.
Generating counterfactuals for sentiment analysis is easier than for natural language inference and hate speech detection, where label reversal is less reliable.
Human-generated counterfactuals outperform LLM-generated counterfactuals for data augmentation.
Using LLMs to automatically assess the quality of generated counterfactuals is prone to bias: GPT-4 in particular tends to favour its own results.
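
To make "label reversal" and "excessive changes" concrete, here is a minimal sketch of how a single counterfactual candidate could be checked. It is an illustration under stated assumptions, not the evaluation code from the paper: `toy_sentiment_classifier` is a hypothetical keyword-based stand-in for a real classifier, and the normalized token-level edit distance is just one plausible way to quantify how much of the text was changed.

```python
from typing import Callable, List, Tuple


def token_edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def toy_sentiment_classifier(text: str) -> str:
    """Hypothetical stand-in classifier: counts a few sentiment keywords."""
    positive = {"great", "good", "wonderful"}
    negative = {"terrible", "bad", "awful"}
    tokens = text.lower().split()
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return "positive" if score > 0 else "negative"


def evaluate_counterfactual(
    original: str,
    candidate: str,
    classify: Callable[[str], str],
) -> Tuple[bool, float]:
    """Return (label flipped?, edit distance normalized by original length)."""
    orig_tokens = original.split()
    cand_tokens = candidate.split()
    flipped = classify(original) != classify(candidate)
    distance = token_edit_distance(orig_tokens, cand_tokens) / max(len(orig_tokens), 1)
    return flipped, distance


if __name__ == "__main__":
    original = "the movie was great"
    # A minimal edit: one token changed, label flips.
    print(evaluate_counterfactual(original, "the movie was terrible",
                                  toy_sentiment_classifier))
    # A full rewrite: label flips too, but with far more changes.
    print(evaluate_counterfactual(original, "i found this film really awful",
                                  toy_sentiment_classifier))
```

Both candidates flip the toy classifier's label, but the second one rewrites the whole sentence. A counterfactual of that kind is fluent and valid yet not minimal, which is the failure mode the first result above describes.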


Reference

(Nguyen et al., 2024)

Van Bach Nguyen, Paul Youssef, Christin Seifert, and Jörg Schlötterer. LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study. Findings of the Association for Computational Linguistics: EMNLP 2024. 2024.

BibTeX

@inproceedings{Nguyen2024_emnlp_llms-for-generating-counterfactuals,
  author = {Nguyen, Van Bach and Youssef, Paul and Seifert, Christin and Schl{\"o}tterer, J{\"o}rg},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2024},
  title = {{LLM}s for Generating and Evaluating Counterfactuals: A Comprehensive Study},
  year = {2024},
  address = {Miami, Florida, USA},
  editor = {Al-Onaizan, Yaser and Bansal, Mohit and Chen, Yun-Nung},
  month = nov,
  pages = {14809--14824},
  publisher = {Association for Computational Linguistics},
  code = {https://github.com/aix-group/llms-for-cfs/},
  url = {https://aclanthology.org/2024.findings-emnlp.870}
}
