The WebGenre Blog: The power of genre applied to digital information. By Marina Santini » Archive

Book Review: Building and Using Comparable Corpora (Springer, 2013)

I would like to recommend “Building and Using Comparable Corpora” (edited by S. Sharoff, R. Rapp, P. Zweigenbaum and P. Fung) to those who are working with or are interested in multilingual and monolingual comparable corpora. The volume is an edited collection of articles covering many topics related to the compilation, measurement and use of comparable corpora. It is divided into two parts and includes 17 articles. I found this volume useful and inspiring for my research. The volume is comprehensive and still up-to-date, although it collects extended papers from a BUCC (Building and Using Comparable Corpora) workshop held in 2011, or articles written between 2011 and 2012. The book starts with an informative overview (article 1), where issues are presented neatly and where the definitions of the different types of corpora … Read entire article »

Filed under: reading suggestions, references, reviews

Book Review: The Personal Weblog (2016)

Draft version.

AUTHOR(S): Schildhauer, Peter
TITLE: The Personal Weblog
SUBTITLE: A Linguistic History
SERIES: Hallesche Sprach- und Textforschung. Language and Text Studies. Recherches linguistiques et textuelles – Band 14
YEAR: 2016
PUBLISHER: Peter Lang AG
ISBN13: 9783631662748
ANNOUNCED IN: http://linguistlist.org/issues/27/27-2198.html

Introduction

“The Personal Weblog: A Linguistic History” is a monograph that describes and interprets the evolution of the personal weblog genre. The study of the personal weblog is corpus-based. The corpus was created using material from The Internet Archive. The volume is written in English. It is based on the author’s PhD thesis (p. 17), originally written in German. This book is recommended to all those interested in genre analysis, genre evolution, genre classification, and blog genre analysis.

Summary

The volume “The Personal Weblog: A Linguistic History” has 308 pages. It includes Acknowledgements, Contents Overview, Table … Read entire article »

Filed under: featured, reviews

Book Review: Fundamentals of Predictive Text Mining 2nd Ed. (2015)

Book Review: Weiss S. M., Indurkhya N. and Zhang T. (2015). Fundamentals of Predictive Text Mining. Springer-Verlag, London. Second Edition. Informer website, Winter 2016 Issue, Book Review. The volume “Fundamentals of Predictive Text Mining”, 2nd ed., has nine chapters, a table of contents, a list of references, a Subject Index and an Author Index. The book also includes a Preface written by the three authors. Summary. Abbreviations: ML = Machine Learning; NLP = Natural Language Processing; IR = Information Retrieval. 1) In Chapter 1, “Overview of … Read entire article »

Filed under: featured, reading suggestions, reviews

Book Review: Sequences in Language and Text (2015)

Book review by Marina Santini in publication on the LinguistList – http://linguistlist.org/issues/27/27-1505.html – Book announced at http://linguistlist.org/issues/26/26-2205.html

EDITOR: George K. Mikros
EDITOR: Ján Mačutek
TITLE: Sequences in Language and Text
SERIES TITLE: Quantitative Linguistics [QL] 69
PUBLISHER: De Gruyter Mouton
YEAR: 2015
REVIEWER: Marina Santini, Uppsala University
Reviews Editor: Helen Aristar-Dry

SUMMARY

The volume “Sequences in Language and Text” is an edited collection of 14 chapters. The book also includes: a Foreword by the editors G. Mikros and J. Mačutek, a Subject Index and … Read entire article »

Filed under: reviews

Thesis Review: Resolving Power of Search Keys

Heppin, Karin Friberg (2010). Resolving Power of Search Keys in MedEval a Swedish Medical Text Collection with User Groups: Doctors and Patients. PhD thesis, Gothenburg University, Sweden. Thesis: http://www2.gslt.hum.gu.se/dissertations/friberg.pdf Errata: http://www2.gslt.hum.gu.se/dissertations/friberg.pdf Opponent: Stefan Schulz; Defence Presentation: http://user.meduni-graz.at/stefan.schulz/presentations/2010_Gothenburg_Defence.pptx The thesis “Resolving Power of Search Keys in MedEval a Swedish Medical Text Collection with User Groups: Doctors and Patients” opens with crucial questions in Information Retrieval (IR). The general question is: 1. What types of search keys are effective when searching for information in a collection of documents? Language-specific questions concern how to handle compounds, since around 10% of words in Swedish running texts are compounds. The important questions are then: 2. What is the best way to treat compounds? 3. When is it beneficial to use individual compound constituents as search keys and when does it ruin a search? The thesis … Read entire article »

Filed under: reviews

Thesis Review: Cross-Language Ontology Learning

Hjelm, Hans (2009) Cross-language Ontology Learning. Incorporating and Exploiting Cross-language Data in the Ontology Learning Process. Academic dissertation for the Degree of Doctor of Philosophy in Computational Linguistics at Stockholm University, 2009. Permalink: http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-8414 Review by Marina Santini. The PhD thesis “Cross-language Ontology Learning” presents a framework for automating cross-language ontology creation systems and suggests a setting in which cross-language data can be profitably integrated. The high-level task is to computerize the acquisition of semantic knowledge. In Information Science, an ontology is “a way of representing knowledge or structuring the terminology within a domain” (p. 14). The thesis focuses on the learning of domain ontologies and limits its scope to studying is-a hierarchies. Ontology learning is the automated acquisition of a domain terminology from raw natural language texts. That is, “given a collection of … Read entire article »

Filed under: dissemination, reviews

Thesis Review: Emotion in Information Retrieval

Moshfeghi, Yashar (2012) Role of emotion in information retrieval PhD thesis Submitted in fulfilment of the requirements for the title of Doctor of Philosophy School of Computing Science, College of Science and Engineering, University of Glasgow, UK. Thesis Download (A copy can be downloaded for personal non-commercial research or study, without prior permission or charge) Amazon UK — Amazon USA Review by Marina Santini The PhD thesis Role of emotion in information retrieval by Yashar Moshfeghi starts filling a gap in an area of Information Retrieval (IR) and Information Studies (IS) that is still underinvestigated: the role played by emotion in searchers’ behaviour. Although it is crystal clear that searchers use emotionally-rich documents from the internet to satisfy their needs — from blogs to tweets — the influence that emotion might have in information retrieval and … Read entire article »

Filed under: reviews

Thesis Review: Opinion mining and lexical affect sensing

Alexander Osherenko, Opinion mining and lexical affect sensing. Computer-aided analysis of opinions and emotions in texts. PhD thesis published by SVH, 2010 (p. 255) Amazon: http://www.amazon.com/Opinion-mining-lexical-affect-sensing/dp/383812488X; Free download: http://opus.bibliothek.uni-augsburg.de/opus4/frontdoor/index/index/docId/1469 Reviewed by Marina Santini. The PhD thesis “Opinion mining and lexical affect sensing” by Alexander Osherenko contains nine chapters, six appendices and an index. The thesis presents approaches to emotion recognition from texts belonging to several emotional corpora. Four approaches are tested and discussed in detail, namely (1) the statistical approach, based on data mining techniques; (2) the semantic approach, which leverages semantic features and grammatical interdependencies; (3) the hybrid approach, which combines the statistical approach and the semantic approach; and finally (4) the multimodal fusion, which builds upon the linguistic modality and the acoustic modality. … Read entire article »

Filed under: reviews

Meetup Report: Big Data & Predictive Modeling – What’s happening in Sthlm?

On Thursday, September 6, 2012 the first meetup on BIG DATA & PREDICTIVE MODELING – WHAT’S HAPPENING IN STHLM? was held at the Klarna Headquarters in Stockholm. The event was very successful and (according to the organizer) unexpectedly crowded, with about 90 attendees: passionate practitioners and, more generally, people interested in big data (like myself). Although I could not attend the socializing slots before the event and, above all, after it at the bar, it was a very informative and enjoyable meeting and I hope that similar events will be held in the future. … Read entire article »

Filed under: reports, reviews

Review: Creating Corpora With Active Learning

PhD thesis reviewed by Marina Santini. Fredrik Olsson, Bootstrapping Named Entity Annotation by Means of Active Machine Learning: A Method for Creating Corpora. Doctoral thesis, University of Gothenburg, 2008. Download thesis from this page: http://soda.swedish-ict.se/3518/ The PhD thesis “Bootstrapping Named Entity Annotation by Means of Active Machine Learning: A Method for Creating Corpora” by Fredrik Olsson contains 13 chapters and an appendix with the base learner parameter settings. The Introduction unfolds the problem and the argument, and the remaining 12 chapters describe the Background (Part I, Chapters 2-5), present the BootMark method (Part II, Chapter 6), test the proposed method (Part III, Chapters 7-12) and summarize findings, experience, and viable future directions (Part IV, Chapter 13). The thesis describes a bootstrapping method for named-entity recognition based on active learning (BootMark). The … Read entire article »

Filed under: reviews