Language Use in Written Dialog: Studying the Enron Corpus

old_uid10129
titleLanguage Use in Written Dialog: Studying the Enron Corpus
start_date2015/11/03
schedule11h30
onlineno
summaryWhile there have been many studies based on the Enron corpus, surprisingly few have treated the data as what it is: written dialog. I will present a series of studies performed at Columbia University that aim to understand how linguistic choices in dialog are affected by aspects of the communicative setting such as power, gender, and the underlying social network. Specifically, we have investigated how power relations affect linguistic choices, both lexical choices and choices of dialog acts. We see pervasive differences between the language used by people in power and by people without power, and these differences allow us to predict who has power in a dialog. We have then asked how this power-related behavior changes when we incorporate the gender of the discourse participants into the analysis. We have found profound differences in language use between men and women in power, and also between female and non-female gender environments (the gender environment reflects the gender of all discourse participants). We are also investigating how the social networks that pre-exist a particular dialog relate to power relations. We find that taking the content of emails into account yields better predictions about power relations than using meta-data alone (as has often been done in the literature). Finally, I will report on ongoing work to distinguish personal email from professional email, where we find that the social network helps us identify personal email.
responsiblesGrau