Opinion Retrieval and Ranking: the creeping and ineluctable force of Genre

Last Updated: 27 May 2013

Two fundamental principles that contribute to the definition and characterization of the concept of genre are conventions and expectations. Simply put, in textual (written or spoken) communication, genres are labels that denote different types of text. For instance, on the web the home page genre is different from the blog genre; in a company, the minutes genre is different from the white paper genre; in the press, the leader genre is different from the letter to the editor genre…

Genres have the power to shape information following rhetorical and discourse patterns that have become conventionalized. Genre conventions are implemented by the writer(s). When acknowledged, genre conventions raise predictable expectations in the readers or, more generally, in those who “process” a text… Although I am oversimplifying here, broadly speaking I believe that genre conventions and the resulting expectations could be profitably exploited for opinion retrieval/ranking and for many sentiment-related applications. I started a discussion on this topic some years ago with many members of the Language Technology (LT) and Information Retrieval (IR) communities around the world. I now wish to add a small piece to it, with some reflections triggered by a recent novel.

I am currently enjoying reading Sweet Tooth, by Ian McEwan, and I am struck by how well the writer describes how Serena Frome (the main character) acquires a tacit genre awareness ahead of an interview with MI5 (you can find more details on the plot here). Serena is “prepared” for the interview by her mentor/lover, Tony.

In this excerpt, the narration is set in 1972. The word “genre” is not mentioned, but Serena suddenly discovers the world of press genres and the word “leader”. (Italics belong to the original; bold is added by me for emphasis. Page numbers refer to the paperback edition, Vintage Books, London.)

“He [Tony] insisted I read a newspaper every day, by which of course he meant The Times, which in those days was still the august paper of record. I hadn’t bothered much with the press before, and I had never even heard of a leader. Apparently, this was the ‘living heart’ of a newspaper. At first glance, the prose resembled a chess problem. So I was hooked. I admired those orotund and lordly pronouncements on matters of public concern. The judgements were somewhat opaque and never above a reference to Tacitus or Virgil. So mature! I thought any of these anonymous writers was fit to be World President.

And what were the concerns of the day? In the leaders, grand subordinate clauses orbited elliptically about their starry main verbs, but the letters pages no one was in any doubt. The planets were out of kilter and the letter writers knew in their anxious hearts that the country was sinking into despair and rage and desperate self-harm. The United Kingdom had succumbed, one letter announced, to a frenzy of akrasia  – which was, Tony reminded me, the Greek word for acting against one’s better judgement. (Had I not read Plato’s Protagoras?). A useful word. I stored it away. But there was no better judgement, nothing to act against. Everybody has gone mad, so everyone said. The archaic word “strife” was in heavy use in those rackety days, with inflation provoking strikes, pay settlements driving inflation, thick-headed, two-bottle-lunch management, bloody-minded unions with insurrectionary ambitions, weak government, energy crises and power cuts, skinheads, filthy streets, the Troubles, nukes. Decadence, decay, decline, dull inefficiency and apocalypse […]” pp. 26-27

During the interview, Serena “adopts” and echoes the leader genre:

“I spoke the language of a Times leader, echoing patrician, thoughtful-sounding opinions that could hardly be opposed.” pp. 40-41.

The description of Serena’s genre discovery shows very clearly that both leaders and letters (to the editor) are strongly opinionated genres. Opinionated, yes, but in different ways, since they follow different genre conventions. Serena becomes aware of the differences, so in her interview she chooses to echo one genre (the leader) and not the other (the letters). What are the differences?

(See the bold text in the citations)

In leaders, opinions are expressed in a persuasive style anchored to (pseudo-) logical reasoning. The reasoning is expressed in “grand subordinate clauses orbited elliptically about their starry main verbs”. So syntax seems to carry more weight than other linguistic elements, and the stress on constructing the reasoning through well-built and complex syntax makes the leader a pseudo-objective text (a leader cannot be “truly” objective like a scientific report based on empirical evidence).

By contrast, in the letters no effort at objectivity is made. Letters are not even persuasive; they are openly emotional. Syntax becomes irrelevant. Persuasiveness gives way to emotional involvement, which is delegated to the lexicon, namely adjectives (e.g. thick-headed, bloody-minded), nouns (e.g. apocalypse, decadence) and verbs (e.g. succumbed).

I do not see any practical difficulty in conveying this genre awareness to an intelligent information system. Syntactic parsing and other NLP/LT tools are progressing rapidly, and emotional lexica are becoming more and more refined… maybe the right time for a closer and more stable cooperation between LT and IR is approaching…
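As a rough illustration of the two cues described above (subordination for leaders, emotional vocabulary for letters), here is a minimal Python sketch. Everything in it is an assumption made for the sake of the example: the two tiny word lists, the function names, and the simple comparison of rates. Counting subordinating words is only a crude stand-in for real syntactic parsing, and a real system would rely on a proper parser and a curated emotion lexicon.

```python
import re

# Hypothetical, hand-picked word lists, for illustration only.
# A real system would use a syntactic parser and a full emotion lexicon.
SUBORDINATORS = {
    "although", "because", "whereas", "while", "which",
    "that", "since", "unless", "whether", "if",
}
EMOTION_WORDS = {
    "apocalypse", "decadence", "decay", "decline", "despair", "rage",
    "mad", "filthy", "succumbed", "thick-headed", "bloody-minded",
}


def tokenize(text):
    """Lowercase the text and split it into words, keeping hyphenated words whole."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def genre_cues(text):
    """Return the share of subordinating words and of emotion words in the text."""
    tokens = tokenize(text)
    if not tokens:
        return {"subordinator_rate": 0.0, "emotion_rate": 0.0}
    total = len(tokens)
    return {
        "subordinator_rate": sum(t in SUBORDINATORS for t in tokens) / total,
        "emotion_rate": sum(t in EMOTION_WORDS for t in tokens) / total,
    }


def guess_genre(text):
    """Crude guess: more subordination suggests a leader, more emotion a letter."""
    cues = genre_cues(text)
    if cues["subordinator_rate"] > cues["emotion_rate"]:
        return "leader-like"
    if cues["emotion_rate"] > cues["subordinator_rate"]:
        return "letter-like"
    return "undetermined"


# Two invented sentences, written to imitate each style.
leader_like = (
    "Although the government maintains that inflation will fall, it remains "
    "unclear whether the settlement, which was reached in haste, can hold."
)
letter_like = (
    "The country is sinking into despair and rage. "
    "Decadence, decay, decline and apocalypse!"
)

print(guess_genre(leader_like))
print(guess_genre(letter_like))
```

In an opinion retrieval setting, cues of this kind could be stored alongside topic and sentiment scores, so that a ranking function can prefer the genre that best matches what the user expects to read.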

In conclusion, I would say: understanding the textual/linguistic conventions that regulate genre differences seems to be one of the keys to assessing which text better satisfies our expectations and information needs. Content/topic, genre, and sentiment are three close friends that always hang around together, but they differ in nature, and so do their specific features.

PS: It is a truism, but let me repeat it: literature is an irreplaceable magnifier

 

7 comments for “Opinion Retrieval and Ranking: the creeping and ineluctable force of Genre”

  1. Francisco J. Valverde
    26 May, 2013 at 10:19

    “and to this she must add the improvement of her mind by extensive reading”. Pseudo-quote, of course. 😉

    I always enjoy your posts and this has given me some food for thought: I am around people who are doing opinion mining and they do not seem to be the least bit concerned about the conventions of the genre they are mining!

  2. 27 May, 2013 at 15:40

    From Systemic Functional Linguistics, Text Analysis, stylistics (LinkedIn group)

    Sukhdev Singh • Genres are distinguished by their stage by stage or step by step culturally recognised structures with a purpose. For example, the stages in the letter may not be the same as in an application or a technical report. Similarly, the textual structure in terms of what information and at what stage should occur would determine the difference between a blog and a home page although their words may be similar, for they may belong to the same register i.e. website etc.

  3. 27 May, 2013 at 15:45

    From Critical Discourse Analysis (LinkedIn Group)

    Katherine W Hirsh • Thanks @Marina, this was quite an interesting piece and I very much enjoyed seeing Ian McEwan’s work used to illustrate genre differences.

    *————————–
    William Marcellino, Ph.D. • There’s software developed at Carnegie Mellon University by Dave Kaufer and Suguru Ishizaki that does this. DocuScope identifies the textual features by which human readers recognize genre, aggregating the representational choices (style at the lexical and phrasal level) in a text. Basically, the aggregation of micro-features adds up to genre at the whole text level, and those micro-text features can be counted and statistically analyzed.

    Jeff Collins’ dissertation typed all the genres in the BROWN Corpus, and I used it in mine to type the register Marine officers use in public speech.

    You can download it for free from CMU with a non-commercial license.

    *———————

  4. 27 May, 2013 at 15:50

    From Corpus Linguistics (LinkedIn Group)

    Наталья Завьялова • Awesome material! Thank you for sharing it with us.

  5. Ismael Arinas
    1 June, 2013 at 21:50

    Dear Marina,
    Genre awareness in NLP is missing in Ontologies and other applications. This lack may derive from the fact that usually users tend to look for solutions that can be applied to any type of text and based only on vocabulary.
    If we can develop queries that are informed by the internal rhetorical structure of a genre, we can search for prototypical features (subjective language, specific knowledge, etc.) in those moves that exhibit them. Genre awareness accounts for differences in how the same knowledge appears framed differently in genres with different purposes, audiences, and/or conventions.

  6. 3 June, 2013 at 05:58

    Thanks Ismael. I think you are right…
