
The WebGenre Blog: The power of genre applied to digital information. By Marina Santini » Entries tagged with "domain"

Dissemination: A cross-domain analysis of task and genre effects on perceptions of usefulness (2012)

A cross-domain analysis of task and genre effects on perceptions of usefulness, by Luanne Freund, University of British Columbia, Vancouver, Canada. Information Processing & Management, In Press, available online 30 October 2012. Abstract: Search systems are limited by their inability to distinguish between information that is on topic and information that is useful, i.e. suitable and applicable to the tasks at hand. This paper presents the results of two studies that examine a possible approach to identifying more useful documents through the relationships between searchers’ tasks and the document genres in the collection. A questionnaire and an experimental user study conducted in two domains provide evidence that perceptions of usefulness are dependent upon information task type, document genre, and the relationship between these two factors. Expertise is also found to have an effect on … Read entire article »

Filed under: dissemination

Dissemination: Cross-Genre and Cross-Domain Detection of Semantic Uncertainty (2012)

Cross-Genre and Cross-Domain Detection of Semantic Uncertainty, by György Szarvas, Veronika Vincze, Richárd Farkas, György Móra, Iryna Gurevych. Computational Linguistics, June 2012, Vol. 38, No. 2, Pages 335-367. Uncertainty is an important linguistic phenomenon that is relevant in various Natural Language Processing applications, in diverse genres from medical to community generated, newswire or scientific discourse, and domains from science to humanities. The semantic uncertainty of a proposition can be identified in most cases by using a finite dictionary (i.e., lexical cues) and the key steps of uncertainty detection in an application include the steps of locating the (genre- and domain-specific) lexical cues, disambiguating them, and linking them with the units of interest for the particular application (e.g., identified events in information extraction). In this study, we focus on the genre and domain differences of … Read entire article »

Filed under: dissemination

Contextify: How to Contextualize Information

Marina Santini. Copyright © 2012. Work in progress: Contextify is a metadata tagger that performs text and content enrichment. Contextify enriches information through text classification and content markup. How can we capture context from a text? I would start with genre, sublanguage, and domain, i.e. three textual dimensions that say something about the communicative context in which a text has been issued. A “weird” word like “Spweet” is not a typo if it belongs to a Twitter micropost (genre and sublanguage: tweet spam). A “normal” word like “mouse” is a specialized term if it belongs to the computer domain. Other examples: surfing (sport, internet communication), agile (ordinary word, software), sentence (law, grammar), appeal (ordinary language: “appeal for help”; legal sublanguage: “to lodge an appeal”; genre: newspaper, court act), etc. Context helps disambiguate words and assess the … Read entire article »
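To make the idea concrete, here is a minimal sketch of how a word's reading might depend on its context. It is an illustration only, not Contextify's actual implementation: the `LEXICON` table, the `contextualize` function, the context labels, and the `"*"` fallback key are all hypothetical, and the entries simply mirror the examples in the excerpt above.

```python
from typing import Optional

# Hypothetical toy lexicon: word -> {context label -> reading}.
# The "*" key holds the reading used when no listed context applies.
# Entries mirror the examples in the post; nothing here is Contextify's real data.
LEXICON = {
    "spweet": {"twitter": "tweet spam", "*": "possible typo"},
    "mouse": {"computer": "specialized term", "*": "ordinary word"},
    "agile": {"software": "specialized term", "*": "ordinary word"},
    "sentence": {"law": "legal term", "grammar": "grammatical term", "*": "ordinary word"},
}


def contextualize(word: str, context: Optional[str] = None) -> str:
    """Return the reading of `word` in `context`, falling back to its default reading."""
    readings = LEXICON.get(word.lower())
    if readings is None:
        return "unknown"  # word not covered by the toy lexicon
    return readings.get(context, readings["*"])


if __name__ == "__main__":
    print(contextualize("Spweet", "twitter"))
    print(contextualize("Spweet"))
    print(contextualize("mouse", "computer"))
```

A real tagger would presumably infer the context label (genre, sublanguage, domain) from the text itself through classification rather than receive it as an argument; the lookup above only shows why that label matters once it is known.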

Filed under: discussions, dissemination, featured

The Path Forward: From Big Unstructured Data to Contextualized Information

How can we convert massive quantities of unstructured data to structured information? What kind of “structure” do we need for a reliable interpretation of this undomesticated data? I suggest thinking of a text-analytic framework based on “context”. Search keywords, events, entities, sentiments, attitudes, polarities, opinions, etc. carry a different weight and require a different assessment depending on the kind of text, the situational context, the field of discussion, and the authority of the source, as well as on the purpose of use. For example, for official use, factual texts might have more credibility than opinionated texts. In this respect, press conferences, declarations or announcements by a White House spokesman might be more reliable than newspaper speculation or op-ed articles. Conversely, if we want to test the pulse and … Read entire article »

Filed under: dialectic, discussions