Automatic Annotation of Attribution of a Mind to AI

Description: Humans readily attribute mental states to their interaction partners. This ability, known as Theory of Mind (ToM), is reflected in the terms and language people use to explain behaviour (Malle 2012). While beneficial in human interactions, ToM can have negative consequences when users interact with AI agents: it may lead them to overestimate the AI's intelligence, intentionality and trustworthiness (Watson 2020, Salles et al. 2020).

The goal of this project is to create an automatic annotation tool that detects utterances that attribute a mind to AI systems. The annotation can be learned from the coding scheme and the examples provided by Malle (Malle 2014, Malle undated). The tool can then be tested on AI research papers or Twitter conversations to see whether the AI research community falls short in preventing the negative effects of ToM. The expected outcomes are the automatic annotator and an analysis of mind attribution in AI research.
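To make the annotation task concrete, here is a minimal rule-based sketch in Python. It is not the project's method and does not implement the F.Ex coding scheme: the word lists `AI_TERMS` and `MENTAL_STEMS`, the function names, and the rule itself (an AI term followed within a few words by a mental-state verb) are assumptions made only for illustration. A learned annotator would presumably replace this rule with a classifier trained on coded examples.

```python
import re

# Hypothetical, hand-picked lexicons for illustration only.
# They are NOT taken from the F.Ex coding scheme.
AI_TERMS = ("ai", "model", "network", "agent", "algorithm", "robot")
MENTAL_STEMS = ("believ", "know", "want", "think", "understand", "intend", "decid", "feel")

AI_PATTERN = r"\b(?:" + "|".join(AI_TERMS) + r")s?\b"
# Crude stem matching: any word that starts with one of the stems.
VERB_PATTERN = r"\b(?:" + "|".join(MENTAL_STEMS) + r")\w*\b"

# An AI term followed by a mental-state verb, with at most three words between them.
MIND_ATTRIBUTION = re.compile(
    AI_PATTERN + r"(?:\W+\w+){0,3}?\W+" + VERB_PATTERN,
    re.IGNORECASE,
)


def attributes_mind(utterance: str) -> bool:
    """Return True if the utterance matches the toy mind-attribution rule."""
    return MIND_ATTRIBUTION.search(utterance) is not None


def annotate(utterances):
    """Pair each utterance with the toy rule's binary label."""
    return [(u, attributes_mind(u)) for u in utterances]


if __name__ == "__main__":
    examples = [
        "The agent believes the door is locked.",
        "The model was trained on ten thousand images.",
        "I think the paper is long.",
    ]
    for text, label in annotate(examples):
        print(label, text)
```

Such a rule over-matches (for example, "knowledge" starts with the stem "know") and misses irregular forms such as "thought", which is one reason a baseline like this would mainly serve as a point of comparison for a learned annotator.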

Supervisor: Susanne Hindennach

Distribution: 30% Literature, 50% Implementation, 20% Analysis

Requirements: Interest in Theory of Mind and philosophy/linguistics, information retrieval, NLP

Literature:

Malle, Bertram F. 2012. Folk Theory of Mind: Conceptual Foundations of Human Social Cognition. The New Unconscious.

Malle, Bertram F. 2014. F.Ex: A Coding Scheme for Folk Explanations of Behavior, version 4.5.7. https://research.clps.brown.edu/SocCogSci/Coding/Fex%204.5.7%20(2014).pdf

Malle, Bertram F. Undated. Naturally Occurring Explanations. https://research.clps.brown.edu/SocCogSci/Coding/natural.html

Salles, Arleen, Kathinka Evers, and Michele Farisco. 2020. Anthropomorphism in AI. AJOB Neuroscience, 11(2), pp. 88-95.

Watson, David. 2020. The Rhetoric and Reality of Anthropomorphism in Artificial Intelligence. Minds and Machines, 29, pp. 45-65.