The WebGenre Blog: The power of genre applied to digital information. By Marina Santini » Entries tagged with "query logs"

Presentation: How Emotional Are Users’ Needs? Emotion in Query Logs

According to recent IR research, searchers’ behaviour is not limited to traditional informational, navigational and transactional needs. A novel hypothesis is that seeking behaviour is driven by emotion. These experiments are part of SearchInFocus, a study centred on search. How Emotional Are Users’ Needs? Emotion in Query Logs from Marina Santini … Read entire article »

Filed under: featured, slides

Contextify: How to Contextualize Information

Marina Santini. Copyright © 2012. Work in progress: Contextify is a metadata tagger that performs text and content enrichment. Contextify enriches information through text classification and content markup. How can we capture context from a text? I would start with genre, sublanguage, and domain, i.e. three textual dimensions that say something about the communicative context in which a text has been issued:

A “weird” word like “Spweet” is not a typo if it belongs to a Twitter micropost (genre and sublanguage: tweet spam).
A “normal” word like “mouse” is a specialized term if it belongs to the computer domain.

Other examples: surfing (sport, internet communication), agile (ordinary word, software), sentence (law, grammar), appeal (ordinary language: “appeal for help”, or legal sublanguage: “to lodge an appeal”; genre: newspaper, court act), etc. Context helps disambiguate words and assess the … Read entire article »

Filed under: discussions, dissemination, featured

Mining Query Logs: Query Disambiguation & Understanding through a KB

Marina Santini. Copyright © 2012. Work in progress. Talking about query logs, Karlgren (2010) points out: “There are several reasons to be cautious in drawing too far-reaching conclusions: we cannot say for sure what the users were after; [...]”. However, some linguistic problems can be sorted out, for example those related to sublanguage, terminology, multi-word expressions, etc. Interestingly, the use of different sublanguages has been studied by Karin Friberg Heppin in her PhD thesis: Resolving Power of Search Keys in MedEval. A Swedish Medical Test collection with User Groups: Doctors and Patients. Karin highlights how patients (laymen) and doctors (experts) use different vocabulary (or terminology) to indicate the same concept. For example, patients might use the word “painkiller”, while doctors may prefer the word “analgesic” to refer to the same treatment. Different sublanguages … Read entire article »

Filed under: discussions, featured, reflections

Applying Findability to Mine Query Logs for BI: Preliminaries

Marina Santini. Copyright © 2012. Thanks for sharing pointers and for giving hints on the question: “Can anyone suggest references about mining query logs for BI and CEM?” Please feel free to add comments to the blog post if more suggestions come to mind. The question of this week is: “How can I profitably use query logs to make better business decisions and predict future trends?” Citing Rud, Olivia (2009), Business Intelligence Success Factors: Tools for Aligning Your Business in the Global Economy, Hoboken, N.J.: Wiley & Sons, ISBN 978-0-470-39240-9, Wikipedia states: “Business intelligence (BI) is defined as the ability for an organization to take all its capabilities and convert them into knowledge, ultimately, getting the right information to the right people, at the right time, via the right channel. This produces large amounts … Read entire article »

Filed under: discussions, featured, reflections, requests

Mining query logs for BI and CEM

Can anyone suggest references about mining query logs for BI and CEM? … Read entire article »

Filed under: discussions, featured, requests

Abstract: Conventions and Mutual Expectations

Conventions and Mutual Expectations – Understanding Sources for Web Genres, by Jussi Karlgren. In: Genres on the Web: Computational Models and Empirical Studies, edited by Alexander Mehler, Serge Sharoff and Marina Santini. Text, Speech and Language Technology, Volume 42, 2011. DOI: 10.1007/978-90-481-9178-9.

Abstract: Genres can be understood in many different ways. They are often perceived as a primarily sociological construction, or, alternatively, as a stylostatistically observable objective characteristic of texts. The latter view is more common in the research field of information and language technology. These two views can be quite compatible and can inform each other; this present investigation discusses knowledge sources for studying genre variation and change by observing reader and author behaviour rather than performing analyses on the information objects themselves. … Read entire article »

Filed under: abstracts