Towards a Computational Theory of Digital Genre (I): Working Definition of Genres for Computational Purposes

by Marina Santini – Last Updated: 29 Oct 2012

1. What is a (textual) genre?
• A genre is a class of texts with similar communicative, textual and linguistic features.

2. What characterizes a genre?
A genre:
• Must have a name
• Must be recognized within a community
• Must be produced or retrieved during a task
• Must have conventions
• Must raise expectations
• Can change over time. It is a cultural artifact (culture here includes society, media, technology, etc.)

3. What characterizes a digital genre?
• The same characteristics listed above.
• A digital genre is any kind of genre that has a digital form, such as emails, chats, online academic papers, online newspaper articles, blogs…
• A digital genre can be any paper genre converted into a digital form OR a class of texts that does not have any counterpart in the paper world, such as home pages, About Us web pages, FAQs, webzine articles, personal blogs, corporate weblogs …

4. Genre Characterization: Ex: Recent but fully acknowledged genres

• Name: a genre name must indicate a class, a family (for genre name formation, see Görlach, 2004). Recent digital genres: blogs, tweets, chatlogs, etc.
• Community: a genre is not something individual. A genre is a textual form that is used and recognized by a community (cf. personal style). Ex: blogs → bloggers and blog readers; academic home pages → academics; etc.
• Task: a genre meets a RECURRENT communication need. Ex: a personal home page genre tells us something about a person; a technical blog genre is informative about some specific technology; etc.
• Conventions: ex: a personal blog genre is made of posts organized in reverse chronological order where a blogger communicates personal and subjective views on some facts.
• Expectations: when reading a personal blog, readers expect to read something personal (personal facts or personal opinions) and expect the technical possibility to leave a comment, if they wish to do so.
• A genre is a cultural artifact: a genre might evolve over time (see: Weblogs: a history and perspective by Rebecca Blood, 2000) or might disappear if society changes (ex: chansons de geste). New genres emerge with new media, new technologies, new information needs.

5. Genre Characterization – Ex: A novel and fully emerged genre, the query log genre
• Name: in line with other digital genres (ex: web log → blog)
• Community: internet users, IR practitioners
• Task: information needs specified in a search engine
• Conventions: short texts written in ”keywordese”
• Expectations: to find relevant information
• Cultural artifact: a product of our media-based, internet-based society OR a subproduct of search engines
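For computational purposes, the six characteristics can be stored as a simple record. The following is only a minimal sketch: the class name `GenreProfile`, its field names and the completeness check are assumptions introduced for illustration, and the values simply restate the query log bullets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenreProfile:
    """Hypothetical record for the six genre characteristics."""
    name: str
    community: tuple
    task: str
    conventions: str
    expectations: str
    cultural_artifact: str

    def is_fully_characterized(self) -> bool:
        # This sketch treats a profile as complete only when
        # every one of the six fields is non-empty.
        return all(bool(value) for value in vars(self).values())


# The query log genre, filled in from the characterization above.
query_log = GenreProfile(
    name="query log",
    community=("internet users", "IR practitioners"),
    task="information needs specified in a search engine",
    conventions="short texts written in keywordese",
    expectations="to find relevant information",
    cultural_artifact="a product of our media-based, internet-based society",
)

print(query_log.is_fully_characterized())
```

A record like this makes it easy to compare genres field by field, but it says nothing about how the values are obtained; that still requires analysis of the texts and of the community that uses them.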

6. The query log genre: Linguistic and Textual Conventions
• Length: short text (a query log can be seen as a corpus of very short texts, shorter than tweets, mobile text messages, chat logs, etc.)
• Sublanguage/Jargon: ”keywordese”
• Register: neutral
• Morphology: LITTLE
• Syntax: OCCASIONALLY (usually no articles, no prepositions, no subclauses, etc.)
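The conventions listed above (short length, few articles and prepositions) can be approximated with a rough heuristic. The sketch below is an illustration under stated assumptions: the tiny function-word list and the thresholds `max_tokens` and `max_ratio` are arbitrary choices made here, not values proposed in this post.

```python
# Small illustrative list of articles and prepositions (an assumption;
# a real function-word list would be much larger).
FUNCTION_WORDS = {
    "a", "an", "the", "of", "in", "on", "at", "to", "for", "with", "by", "from",
}


def keywordese_profile(query: str) -> dict:
    """Count tokens and function words in a whitespace-tokenized query."""
    tokens = query.lower().split()
    n_function = sum(1 for token in tokens if token in FUNCTION_WORDS)
    ratio = n_function / len(tokens) if tokens else 0.0
    return {
        "length": len(tokens),
        "function_words": n_function,
        "function_word_ratio": ratio,
    }


def looks_like_keywordese(query: str, max_tokens: int = 6,
                          max_ratio: float = 0.2) -> bool:
    """Return True for short, non-empty queries with few function words."""
    profile = keywordese_profile(query)
    return (0 < profile["length"] <= max_tokens
            and profile["function_word_ratio"] <= max_ratio)


print(looks_like_keywordese("cheap flights stockholm"))
print(looks_like_keywordese("what is the best way to travel to stockholm in the summer"))
```

The heuristic only looks at surface form, so it cannot tell a genuine query from any other short string of content words.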

7. Query Log Genre: The Benefits
• Expressed in a ”lean” sublanguage, the keywordese:
• reduced morphology
• reduced syntax
• short texts
• Mostly Nouns and Verbs
• Reduced size: compare a 2-year collection of emails vs a 2-year collection of query logs
• = REDUCED SIZE, REDUCED PRE-PROCESSING; NO DATA CLEANING!

8. Query Log Genre: Expectations
• short texts written by users to find relevant information through a search engine.
• The texts (queries) must express information needs a.k.a. users’ intents.
• It is good practice to be cautious with the interpretation of users’ intents. However, if we mine query logs with a simple quantitative approach, it is possible to extract recurrent information needs and build upon them.
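A simple quantitative approach of this kind can be as plain as counting how often the same query recurs. The sketch below is a hedged illustration: the function names, the normalization step (lowercasing and sorting the tokens) and the sample log are assumptions made here, and a recurrent query is treated only as a candidate information need, not as an interpreted intent.

```python
from collections import Counter


def normalize(query: str) -> str:
    # Lowercase and sort the tokens so that word order does not matter.
    return " ".join(sorted(query.lower().split()))


def recurrent_needs(queries, min_count=2):
    """Return (normalized query, count) pairs seen at least min_count times."""
    counts = Counter(normalize(q) for q in queries if q.strip())
    return [(q, n) for q, n in counts.most_common() if n >= min_count]


# Invented sample log, for illustration only.
sample_log = [
    "cheap flights stockholm",
    "Stockholm cheap flights",
    "weather stockholm",
    "flights cheap stockholm",
    "weather stockholm",
    "python dataclass",
]

for query, count in recurrent_needs(sample_log):
    print(count, query)
```

Sorting the tokens merges queries that differ only in word order, which is a strong simplification; whether it is appropriate depends on the log being mined.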

9. Why is a classification by genre beneficial for a computational approach?
• The main benefit is the contextualization of information! A genre is a CONTEXT carrier because it is based on recurrent conventions and predictable expectations. A genre provides the communicative context and the communicative purpose for which a text has been produced. The concept of genre is both a semantic and a pragmatic concept (i.e. it includes the semantic meaning + the situational/communicative context).
• Complexity reduction & e-Learning: a text receives its identity through belonging to a certain genre, and this identity reduces the cognitive effort.
• Information understanding & Forensic Linguistics: genre competence increases self-protection against digital crimes (such as phishing, hoaxes, cyberbullying and threats) because it can help spot genre anomalies and consequently malicious intentions.
• Findability & Information Retrieval: the membership of a document in a genre tells us something about the communicative context in which the document has been produced. From the communicative context, we can derive, infer or assess the relevance of this document to our information needs.
• Predictivity and Automatic Summarization + Text Summarization: because a genre is based on recurrent conventions and predictable expectations, it is possible to identify where the most important and relevant information is located within a document.

10. Genre is ubiquitous
• Language does not exist in the abstract.
• Language use changes with the situation, purpose, audience, emotional state, etc.
• We might express the same meaning with different words according to different communicative contexts, using different genres according to the task, the audience, the purpose, etc.

I would appreciate your comments, thoughts and objections on this view of genre for computational purposes. Thanks in advance, Marina
***End of the post***

To be continued in the post: Towards a Computational Theory of Digital Genre (II): The Fuzzy boundaries of genre classes

——
Changes Log: 29 Oct 2012
——

10 comments for “Towards a Computational Theory of Digital Genre (I): Working Definition of Genres for Computational Purposes”

  1. 6 November, 2012 at 12:40

    I wonder where your genre definition considers humans, for example, the author of a genre piece. Is it in the genre community?

    Without diving deep in the theory, you might be interested in the theory of Bakhtin http://en.wikipedia.org/wiki/Mikhail_Bakhtin

  2. 6 November, 2012 at 12:55

    Alexander,
    I will reply to you in my next post about computational genre theory. I am glad you are starting to be intrigued by the concept of genre 🙂

    • 6 November, 2012 at 14:53

      Bakhtin was sooo important to the linguists I know that they spoke only about him. And I was nearby, so it was hard not to hear what they said… 🙂

  3. Christophe Clugston
    13 November, 2012 at 18:19

    Following Shepherd and Watters' (1998) evolution of cyber/digital genres, I wonder what you think of Swales' (1990) use of Purpose. I am attempting to use the idea of purpose (academic writing in process) as the overarching classifying device for web (digital/cyber) genre, with structure following the purpose and then driving the content and style. I would welcome your thoughts on this subject.

  4. 13 November, 2012 at 20:08

    Hi Christophe,

    as Swales (1990: 46) says “it might be objected that [purpose] is [a] somewhat less overt and demonstrable feature than, say, form and therefore serves less well as a primary criterion”.
    What taxonomy of purposes would you like to use for your genre classification? If you think of rhetorical purposes, such as instructional purpose, narrative purpose, persuasive purpose, argumentative purpose etc., I would agree with you. Apparently they are universal and you can find them in genres in all media and in all times. However, it depends on the type of genre classification you are aiming at. These rhetorical purposes are super-categories, each encompassing many genres. For instance, there are many different narrative genres defined for different communities and audiences, such as: fairy tales, newspaper stories, short stories, medical records, etc. Above all, in many genres, these purposes are all there in different proportions, for example the film review genre has a narrative section and an argumentative-persuasive section… But you might have something different in mind…

  5. Christophe Clugston
    14 November, 2012 at 13:14

    Hello
    Thanks for your quick response. To clarify: I am looking at labeling a new cyber/digital genre (a variant/extant genre, in Shepherd and Watters' definition). Approached through Genre Hierarchy or Chains, it is found within Netvertising. I feel that the entire macro genre of Netvertising can be defined by its purpose (to sell). The contents, then, determine the domain or the topic. In a greater sense, I feel that when looking at genre through a tripartite lens of form, content, and function, the individual lens which has the most power (the salient one) is clearly purpose (in Netvertising). Each end user looks on the internet for a purpose (they do not search for form, nor do they actually look for content, as that will be determined by the purpose: how to build a house, which car is the best buy, what is the best travel destination for 10 thousand Euros, etc.). This is a rather brief overview of what I am currently writing about. I would welcome a more involved discussion of what I am attempting, and your feedback, if that is possible. This could be done through email, if you would like. Regards, Christophe

  6. 16 November, 2012 at 16:58

    Hi Christophe,

    I would not say that the purpose is “to sell”. I would say that the purpose is “to persuade to buy”. The content is something different from the topic.
    The content or the core purpose in this case would be “persuasion”, while the topic can be variable: a car, a house, a service, etc. So the persuasion trait remains and the topic changes. Think also about the similarity of selling websites to some politicians' websites: they both try to persuade an audience. They can be very similar in their “marketing” language or techniques, but again the topic or, if you like, the object of this marketing is completely different and directed to different target audiences.

    Please read the last review published in this blog (i.e. Emotion and IR) and try to understand which angle you want to take for genre classification. Users’ needs are not always triggered by specific purposes… I would suggest that you send me some of your papers or experiments, and I will give a more specific opinion.

  7. Christophe Clugston
    21 November, 2012 at 02:04

    Hello

    I have tried to send you an email. I think it better to move to a less public discussion of the subject. Please let me know whether or not you receive my email.
    Regards,
    Christophe Clugston

  8. christophe clugston
    11 June, 2013 at 10:54

    Just a quick update. I am using Longacre’s definition, which puts advertising not in persuasion but in the hortatory classification. For more about this, see his work on analyzing a fund-raising letter (1992). And for the question about the taxonomical hierarchy: I am following a model inspired by Lee and Steen’s work. For the path from the Supra Genre (I use this term and not Super Genre) of Advertising to the Sub-Genre that I am proposing (LSWA), it makes a great deal of sense.

  9. 11 June, 2013 at 10:56

    Thanks for the update, Chris. Much appreciated!
