Understanding Code-Mixing in Controlled Dialogues

To enable naturalistic conversational agents for multilingual users, dialogue systems need to be extended to converse with bilinguals, potentially using multiple languages in an utterance (i.e. code-mixing). Yet little is known about human preferences for code-mixing in the context of a dialogue. To fill this gap and to study preferred code-mixing styles, we incorporate linguistically-motivated strategies of code-mixing into a rule-based goal-oriented dialogue system.

We collect a corpus of 587 human–computer text conversations between our dialogue system and fluent Spanish–English bilinguals. From this new corpus, we analyze the amount of elicited code-mixing, types of code-mixing strategies people use, and whether they entrain to the system’s code-mixing. Based on these exploratory findings, we give recommendations for future code-mixing dialogue systems.

People

Emily Ahn

MLT '19 -> PhD Student, UW

Alissa Ostapenko

MLT '22 -> Amazon

Understanding Code-Mixing in Controlled Dialogues

People

Emily Ahn

Alissa Ostapenko

Related Papers