PhD thesis submitted in fulfilment of the requirements for the title of Doctor of Philosophy, School of Computing Science, College of Science and Engineering, University of Glasgow, UK.
Review by Marina Santini
The PhD thesis Role of emotion in information retrieval by Yashar Moshfeghi starts filling a gap in an area of Information Retrieval (IR) and Information Studies (IS) that is still underinvestigated: the role played by emotion in searchers’ behaviour.
Although it is crystal clear that searchers use emotionally-rich documents from the internet to satisfy their needs — from blogs to tweets — the influence that emotion might have in information retrieval and seeking (IR&S) is insufficiently studied and often overlooked by IR practitioners.
Yashar Moshfeghi puts forward a daring claim: “human emotion is a central motivation (either directly or indirectly) behind IR&S behaviour” (p. 4). In order to investigate and structure searchers’ IR&S behaviour, the author defines three concepts, namely emotion need, emotion object and emotion relevance, and presents a conceptual map that utilizes these concepts in IR tasks and scenarios. In the proposed conceptualization, emotion need is a desire to be in a particular emotional state; emotion object is an emotion representation of a document; and emotion relevance is the relation between the emotion aspect of documents and emotion needs (see pp. 148-149).
The thesis contains nine chapters, divided into four parts.
In Part I — Chapters 1-3 — the author presents the motivation, his claims and research objectives, a general IR background and an overview of previous research in emotion in IR.
The author argues that “it is important to understand the emotion aspect of documents to better understand and satisfy the searcher’s needs” (p. 4). He also claims that emotion need is prior to information need, and that information-seeking behaviour is a way to alleviate emotion needs. He suggests that, in a text, facts (interpreted in terms of topicality) should be differentiated from emotion.
In this Part a definition of emotion is proposed. Although the terms “affect”, “sentiment”, “emotion” and “mood” are often used interchangeably, the author differentiates the terms in the following way (p. 28):
• affect is the general term under which other concepts (e.g. sentiment, emotion, mood) are encompassed
• sentiment is characterized by a positive or negative feeling
• emotion is a cognitively elaborated sentiment
• mood is a general pleasant or unpleasant feeling
In particular, “[t]he difference between emotion and mood is that emotion has an immediate object or identifiable action or event whereas mood does not; and that emotion tends to be short and occur in bursts, whereas mood tends to persist longer and be flat” (p. 28). [cf. also the definitions proposed by Alexander Osherenko http://www.forum.santini.se/2012/11/thesis-review-opinion-mining-and-lexical-affect-sensing/]
Emotion is considered one of the essential aspects of the communication process, since it has a significant role in virtually every aspect of human personal and social life. This argument is supported with evidence from psychology and sociology (Chapter 3).
Part II includes a single chapter, Chapter 4, which contains the main theoretical tenets of the thesis, i.e. the concepts of emotion need, emotion object and emotion relevance.
Interestingly, the author claims that “emotion need is more fundamental than an information need in the sense that if an information need exists it implies that there is an underlying emotion need to satisfy this information need. The whole IR&S behaviour is thus driven by an emotion need. However, the converse may not necessary be true, e.g., a user could want to be happy/sad/angry but without having a well-defined I[nformation]N[eed]. Thus, whenever information need is discussed, an emotion need is pre-existent. In the case when the emotion need of the searcher is to diminish the negative feelings associated with a lack of knowledge (i.e., an IN), the emotion need would be satisfied if the IN associated with it is resolved. For example, if a searcher’s IN is to know about topic x, the searcher must believe that information about x has been acquired, in order for their emotion need to be satisfied. Thus, the emotion need will not be resolved unless the underlying information need is resolved, since in this context, the information need is the dominant one.” (p. 55).
Less convincing is the definition of emotion object as a “third view” independent both from the text creator’s emotion and the readers’ emotion. The proposed “third view” is based on the assumption that “emotions can be assigned to a document and considered as attributes of that document” (p. 59) by virtue of “common-sense knowledge”, i.e. cultural knowledge. “Common-sense represents the norms of the belief of a society with respect to concepts, actions, named entities, etc. For example, stealing is bad, helping is good. This common knowledge is the glue that allows observers to understand the authors. For instance, according to common-sense, a tsunami is a natural disaster that is potentially destructive: it is a sad and unpleasant event. The closer one’s belief system is to the norm of the society, the more similar their experienced emotion to the ones based on common-sense is. An observer’s emotional experience, even if is different, is complementary rather than contradictory information.” (p. 59).
Finally, emotion relevance is a “relation between the emotion aspect of a document and the emotion need of an individual” (p. 59).
Part III describes the practical work of this thesis:
Chapter 5 (Text-based Emotion Extraction System) presents a comparative study of text-based emotion extraction techniques. The study shows that the author’s implementation of an emotion extraction method, called OCC1, is more accurate than the other techniques in the comparison in terms of precision and F-measure. The effect of various base lists on the performance of OCC1 was also analysed.
Chapter 6 and Chapter 7 (Movie Recommender Systems)
The results of the study in Chapter 6 indicate that incorporating emotion features improves the accuracy of rating prediction in Collaborative Filtering. However, the proposed approach suffered from some limitations, the most serious being data sparsity, cold start problems and scalability. These limitations are addressed in Chapter 7, which further investigates the role of emotion features in a more elaborate Collaborative Filtering system designed to overcome such issues. Results showed that emotional features consistently play a role in improving the recommendation quality.
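To give a feel for what “incorporating emotion features” into rating prediction can mean, here is a minimal sketch of my own; it is not the model used in the thesis. It assumes that each movie already has an emotion vector (for example, produced by an emotion extractor run over its reviews) and predicts a user’s rating for an unseen movie as a similarity-weighted average of that user’s ratings for other movies. The names `predict_rating`, `user_ratings` and `item_emotions` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_rating(user_ratings, item_emotions, target):
    """Predict one user's rating for `target` from emotion similarity.

    user_ratings:  {item_id: rating} for a single user (hypothetical layout)
    item_emotions: {item_id: emotion vector}, assumed to be precomputed
    """
    others = {i: r for i, r in user_ratings.items() if i != target}
    if not others:
        return None  # nothing to base a prediction on

    weighted_sum = 0.0
    weight_total = 0.0
    for item, rating in others.items():
        sim = cosine(item_emotions[item], item_emotions[target])
        weighted_sum += sim * rating
        weight_total += abs(sim)

    if weight_total == 0.0:
        # No emotional overlap at all: fall back to the user's mean rating.
        return sum(others.values()) / len(others)
    return weighted_sum / weight_total
```

Because similarity here is computed from document content rather than from co-ratings, a movie with no ratings at all can still receive a prediction, which is one intuitive reason why content-derived features are often tried against sparsity and cold start problems.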
Chapter 8 (News Retrieval System) discusses the effectiveness of using the emotion representation of documents (news articles) to diversify ranking results in order to better cover relevant subtopics. The author proposes to use emotional features to enhance the diversity of the retrieved results since they offer a new way to diversify information, based on emotion, and explore another dimension of emotional relevance. Results show that the diversification of the retrieved results improved the effectiveness of the system. They also show that some topics gain more from emotion-based diversification than others. “Although the overall improvements were marginal, the results were encouraging.” (p. 161).
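The idea of diversifying a ranking by emotion can be illustrated with a small sketch. I use a generic greedy re-ranking in the style of Maximal Marginal Relevance here, which is my own choice for illustration and not necessarily the diversification method used in the thesis. The function name `diversify`, the `trade_off` parameter and the input layout (a relevance score plus an emotion vector per document) are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def diversify(candidates, k, trade_off=0.7):
    """Greedy MMR-style re-ranking over emotion vectors.

    candidates: list of (doc_id, relevance, emotion_vector) tuples (hypothetical layout)
    k:          number of documents to return
    trade_off:  1.0 ranks purely by relevance; lower values reward emotional novelty
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def marginal_score(item):
            _, relevance, emotion = item
            # Redundancy is the highest emotional similarity to anything already picked.
            redundancy = max((cosine(emotion, s[2]) for s in selected), default=0.0)
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(remaining, key=marginal_score)
        selected.append(best)
        remaining.remove(best)
    return [doc_id for doc_id, _, _ in selected]
```

With two highly relevant articles that share the same emotional profile and a third, less relevant one with a different profile, a `trade_off` below 1.0 lets the third article overtake the emotional duplicate, so the top of the list covers more than one emotional angle on the topic.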
Part IV contains a single chapter, Chapter 9, that lists and summarizes the contributions of the thesis.
The thesis Role of emotion in information retrieval is a considerable step forward towards more “human-like” IR&S. According to traditional IR, searchers’ needs are limited to information needs. In this thesis we get a more sophisticated interpretation of searchers’ needs that includes emotion as both a primary and a secondary factor in IR&S behaviour. What is more, the author shows that emotion need is a central need in the searcher’s whole need system. Sensibly, Yashar Moshfeghi re-interprets and adapts theories from psychology and sociology to broaden the perspective on searchers’ behaviour. The emotion-aware applications presented in the thesis (emotion extractor, recommender systems, news retrieval system) show that the inclusion of emotion does indeed help to improve the effectiveness of IR systems.
The thesis is a good opening for an area of research, the impact of emotion on IR systems, where much is still undiscovered.
One trail that should be followed in coming research projects is the relation between genre and emotion. In this thesis the author deals only with movie genre, because it can be easily extracted from the IMDb website and associated with the films in the test collections (see Chapters 6 and 7). Notably, the author states “Among semantic spaces, in terms of the number of features, genre space is the closest one to emotion spaces. Movie review emotion and genre spaces have the same improvement in terms of MAP for the 500K dataset.” (p. 115). The natural question is then: why has document genre not been taken into account together with document emotion in the other two text-based applications, i.e. the emotion extractor and the news retrieval system? In particular, I suspect that the modest improvement in the effectiveness of the news retrieval system would have been higher if the genre of the article had been considered. This is because there is often a tight connection between genre and emotion. Some genres are more “emotional” than others (namely all the subjective genres, such as EDITORIALS or COLUMNS), while others are less emotional and more factual (for instance, FEATURE ARTICLES or NEWSWIRES). Therefore, one suggestion for improving the diversification of retrieved news would be to include genre features as well. First, because genre and emotion reinforce each other in many cases; second, because genre conveys both the document’s information structure and the emotional involvement of the document’s creator, the latter being a boosting factor in emotion profiling and identification. In his definition of “emotion object” as the document itself in terms of topicality, the author shuts out, in my opinion, two influential emotional devices, i.e. the subjectivity/objectivity information provided by genre conventions and the author’s rhetorical power of persuasiveness.
Well … discussion is now open on the new frontiers in holistic IR.