Category: dissemination

Reblogging: Practical advice for machine learning

Practical advice for machine learning: bias, variance and what to do next By Mikael Huss at Follow the data (http://followthedata.wordpress.com/about/) The online machine learning course given by Andrew Ng in 2011 (available here among many other places, including YouTube) is highly recommended in its…

Impact of Sociolinguistics in Opinion Mining Systems

Signed post by Alexander Osherenko, Socioware Development, osherenko@socioware.de Full paper: Considering Impact of Sociolinguistic Findings in Believable Opinion Mining Systems Proceedings of The Fifth International Conference On Cognitive Science. 2012. Kaliningrad, Russia (http://www.informatik.uni-augsburg.de/~osherenk/final_kalinigrad.pdf) Opinions are frequent means of communication in…

Contextify: How to Contextualize Information

Marina Santini. Copyright © 2012 Work in progress: Contextify is a metadata tagger that performs text and content enrichment. Contextify enriches information through text classification and content markup. How can we capture context from a text? I would start with…

Reblogging: Informer, Spring Issue

Informer Newsletter of the BCS Information Retrieval Specialist Group Spring 2012 Table of Contents Editorial: By Udo Kruschwitz on April 28, 2012 Conference Review: ECIR 2012 Industry Day: By Franco Maria Nardini on April 26, 2012 Book Review: Search Analytics…

Online Course: Social Network Analysis

An interesting online course is offered for free by Coursera through the University of Michigan in September 2012: Social Network Analysis “This course will use social network analysis, both its theory and computational tools, to make sense of the social and…

Dissemination: Acknowledgement Search Engine and Next Generation Search Engines

1) AckSeer is a beta search engine that automatically identifies, extracts and indexes acknowledgements from papers. In addition, acknowledged entities are extracted from within the acknowledgement passages. Currently, AckSeer indexes acknowledgements from more than 500,000 papers in CiteSeerX. These acknowledgements contain…

Dissemination: Web Corpora Available

1) Common Crawl web corpus: WebDataCommons is offering 3.2 billion quads of current RDFa, Microdata and Microformat data extracted from 65.4 million websites. Two Common Crawl web corpora are available: one corpus consisting of 2.5 billion HTML pages dating from…