The WebGenre Blog: The power of genre applied to digital information. By Marina Santini » Entries tagged with "automatic genre identification"

Spreading the Word about (Web)Genre Research

What is genre? Why is it useful to master genre conventions? Can we classify document genres automatically? Around the world, many researchers and scholars from a wide range of disciplines are trying to answer these and many other questions. Aristotle suggested the first genre classification scheme by dividing literature into Tragedy, Comedy and Lyrics (well, I am oversimplifying…). Aristotle smoothly classified all the knowledge of his time, so arguably classifying genres … Read entire article »

Filed under: discussions, reading suggestions, references, reflections

Seminar – Towards Contextualized Information: How Automatic Genre Identification Can Help

Seminar Series, Laboratory for Cognition, Interaction and Language Technology (CILTLab), Linköping University, Linköping, Sweden, Tuesday 28 August 2012. Abstract: Genre is one of the textual dimensions that can be used to reconstruct the communicative context needed to assess the value of information with respect to a purpose (business, learning, finding, monitoring, predicting, etc.). When we know the genre of a text, we can surmise the context in which the text was created and the purpose it was meant to serve. We can therefore decide more confidently whether a text contains the information we are looking for. For example, factual texts might have more credibility than opinionated texts. In this respect, genres such as press conferences, declarations or announcements by a White House spokesman might be more reliable than subjective genres, e.g. newspaper editorials or op-ed articles. On the … Read entire article »

Filed under: abstracts, announcements, seminars

White Paper: Automatic Genre Identification – Testing with Noise

Automatic Genre Identification – Testing with Noise by Efstathios Stamatatos, Serge Sharoff, Marina Santini – Copyright © 2012, All rights reserved. Citation: Stamatatos E., Sharoff S., Santini M. (2012). Automatic Genre Identification – Testing with Noise. [White paper]. Retrieved from The genre collections used in the experiments are available here. The reference list is here. In the experiments described below, genre classes coming from three genre collections have been used: Santinis7 (Santini, 2007), KI-04 (Meyer zu Eissen and Stein, 2004), and HGC (Stubbe and Ringlstetter, 2007). These genre collections were created by different people, at different universities, for different purposes, with different criteria and different notions of what genre is. Since genre is a complex concept and genre classes can be characterized in different ways, we assume that having an AGI algorithm … Read entire article »

Filed under: collaborative blogging, computational models, featured, signed posts, white papers

Overview: Automatic web Genre Identification (AGI)

Genre is a fundamental component of human communication, but the definition of genre is vague, as genre classes can indicate a text type, a discourse practice, a rhetorical strategy, a cognitive class, or any textual category. In this post I provide a short overview of previous and current approaches to Automatic web Genre Identification (AGI). In its early stages, AGI built upon the seminal work of Douglas Biber (Biber, 1988). Although Biber did not perform any AGI, he explored linguistic variation within different genres (focusing on the difference between spoken and written language) using statistical approaches based on computable features output by Biber’s tagger, such as the number of that-deletions or verbs in the past tense. … Read entire article »

Filed under: overviews

Reading Suggestion: Adjectives and adverbs as indicators of affective language for automatic genre detection (2008)

Rittman, Robert and Nina Wacholder. (2008). Adjectives and adverbs as indicators of affective language for automatic genre detection. Proceedings of AISB 2008 Convention, Symposium on Affective Language. Aberdeen, Scotland, April 1-2, 2008. Abstract. We report the results of a systematic study of the feasibility of automatically classifying documents by genre using adjectives and adverbs as indicators of affective language. In addition to the class of adjectives and adverbs, we focus on two specific subsets of adjectives and adverbs: (1) trait adjectives, used by psychologists to assess human personality traits, and (2) speaker-oriented adverbs, studied by linguists as markers of narrator attitude. We report the results of our machine learning experiments using Accuracy Gain, a measure more rigorous than the standard measure of Accuracy. We find that it is possible to classify … Read entire article »

Filed under: reading suggestions, references