Marina Santini – Computational Models for Automatic Web Genre Identification
Broadly speaking, “genre” is a classification concept. A genre is a recurring and recognized pattern of communication that has a specific name. The web hosts many recognized genres, such as FAQs, press releases, product descriptions, instructions, guides, e-magazines, blogs, professional profiles, how-tos, web ads and reviews. Each of these genres serves a number of communicative and social purposes and carries additional contextual information that helps the reader interpret the content. Can web genres be identified and detected automatically? Which computational models have been tried so far in automatic genre identification research? How well do they perform? In this talk, I will present and discuss the latest findings in automatic genre identification and suggest viable future directions.