The WebGenre Blog: The power of genre applied to digital information. By Marina Santini » Entries tagged with "AGi"

Spreading the Word about (Web)Genre Research

What is genre? Why is it useful to master genre conventions? Can we classify document genres automatically? Around the world, many researchers and scholars from a wide range of disciplines are trying to answer these and many other questions. Aristotle suggested the first genre classification scheme by dividing literature into Tragedy, Comedy and Lyrics (well, I am oversimplifying…). Aristotle smoothly classified all the knowledge of his time, so arguably classifying genres … Read entire article »

Filed under: discussions, reading suggestions, references, reflections

Seminar – Towards Contextualized Information: How Automatic Genre Identification Can Help

Seminar Series, Laboratory for Cognition, Interaction and Language Technology (CILTLab), Linköping University, Linköping, Sweden, Tuesday 28 August 2012

Abstract: Genre is one of the textual dimensions that can be used to reconstruct the communicative context needed to assess the value of information with respect to a purpose (business, learning, finding, monitoring, predicting, etc.). When we know the genre of a text, we can surmise the CONTEXT where a text has been created and for which purpose. Therefore we can more confidently decide whether a text contains the information we are looking for. For example, factual texts might have more credibility than opinionated texts. In this respect, genres such as press conferences, declarations or announcements by a White House spokesman might be more reliable than subjective genres, e.g. newspapers’ editorials or op-ed articles. On the … Read entire article »

Filed under: abstracts, announcements, seminars

AGI: Structured and Unstructured Noise

How would you handle automatic text classification in noisy conditions? This is what has been done, to my knowledge, in Automatic web Genre Identification (AGI). By noise here I refer to two different disturbing factors*: 1) the training sample and the test sample come from different sources/annotators; 2) the test set contains genre classes that are not present in the training set. These two types of noise reflect the following real-world conditions when working with genre: 1) since genre is a complex notion that has been interpreted in different ways, the identification of the same genre class can vary depending on the research agenda or individual preferences; 2) we cannot possibly conceive a genre classifier that performs well if we include all existing genres either on the web or in … Read entire article »

Filed under: dialectic, discussions, overviews
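
The two noise conditions named in the post above can be reproduced in a toy experiment. The sketch below is only an illustration, not a method taken from the AGI literature: the bag-of-words features, the cosine-to-centroid classifier, the 0.2 rejection threshold, the function names and the tiny example texts are all assumptions made up for this example. It trains on one "source", tests on differently worded texts (noise type 1), and includes a test genre that never occurs in training (noise type 2), which the classifier can only handle by answering "unknown".

```python
from collections import Counter
import math

def bag_of_words(text):
    """Lowercased whitespace tokens with counts (a deliberately crude feature set)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def train_centroids(labelled_docs):
    """Sum the word counts of all training texts of each genre."""
    centroids = {}
    for text, genre in labelled_docs:
        centroids.setdefault(genre, Counter()).update(bag_of_words(text))
    return centroids

def classify(text, centroids, threshold=0.2):
    """Return the most similar genre, or 'unknown' below the (assumed) threshold."""
    vec = bag_of_words(text)
    genre, score = max(((g, cosine(vec, c)) for g, c in centroids.items()),
                       key=lambda pair: pair[1])
    return genre if score >= threshold else "unknown"

# Invented training texts from one hypothetical source.
TRAIN = [
    ("question answer how do i reset my password answer click reset", "faq"),
    ("question answer what is the return policy answer thirty days", "faq"),
    ("i loved this product five stars would buy again", "review"),
    ("terrible product one star would not buy again", "review"),
]

# Invented test texts: different wording (noise 1) and an unseen genre (noise 2).
TEST = [
    ("frequently asked question how do i change my password", "faq"),
    ("great product four stars i would buy it again", "review"),
    ("mix flour and sugar then bake for twenty minutes", "recipe"),
]

centroids = train_centroids(TRAIN)
for text, gold in TEST:
    print(gold, "->", classify(text, centroids))
```

Without the rejection threshold, the recipe text would be forced into one of the two known genres, which is exactly the second kind of noise. A real evaluation would of course need far larger corpora and a tuned threshold.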

Overview: Automatic web Genre Identification (AGI)

Genre is a fundamental component of human communication, but the definition of genre is vague, as genre classes can indicate a text type, a discourse practice, a rhetorical strategy, a cognitive class, or any textual category. In this post I provide a short overview of previous and current approaches to Automatic web Genre Identification (AGI). In its early stage, AGI builds upon the seminal work of Douglas Biber (Biber, 1988). Although Biber did not perform any AGI, he explored the linguistic variation (focusing on the difference between spoken and written language) within different genres using statistical approaches based on computable features output by Biber’s tagger, such as the number of that-deletions or verbs in the past tense. … Read entire article »

Filed under: overviews

CLT seminar (University of Gothenburg): 2011-06-16, 10:15 – 12:00

Marina Santini – Computational Models for Automatic Web Genre Identification
http://www.clt.gu.se/seminar/2011-06-16/clt-seminar-marina-santini
Date: 2011-06-16 10:15 – 12:00
Where: L308, Lennart Torstenssonsgatan 8

Broadly speaking, “genre” is a classification concept. A genre is a recurring and recognized pattern of communication that has a specific name. The web hosts many recognised genres, such as FAQs, press releases, product descriptions, instructions, guides, e-magazines, blogs, professional profiles, how-tos, web ads and reviews. Each of these genres serves a number of communicative and social purposes and carries additional contextual information that helps the reader interpret the content. Can web genres be identified and detected automatically? Which computational models have been tried out so far in automatic genre identification research? How well do they perform? In this talk, I will present and discuss the latest findings in automatic genre identification and suggest viable … Read entire article »

Filed under: announcements