Clade – a freely available, open source taxonomy and autoclassification tool by Charlie Hull at Flax (http://www.flax.co.uk/blog/) One way to manage digital information is to classify it into a series of categories or a hierarchical taxonomy, and traditionally this was done…
Category: dissemination
Reblogging: Practical advice for machine learning
Practical advice for machine learning: bias, variance and what to do next By Mikael Huss at Follow the data (http://followthedata.wordpress.com/about/) The online machine learning course given by Andrew Ng in 2011 (available here among many other places, including YouTube) is highly recommended in its…
Impact of Sociolinguistics in Opinion Mining Systems
Signed post by Alexander Osherenko, Socioware Development, osherenko@socioware.de Full paper: Considering Impact of Sociolinguistic Findings in Believable Opinion Mining Systems Proceedings of The Fifth International Conference On Cognitive Science. 2012. Kaliningrad, Russia (http://www.informatik.uni-augsburg.de/~osherenk/final_kalinigrad.pdf) Opinions are frequent means of communication in…
Contextify: How to Contextualize Information
Marina Santini. Copyright © 2012 Work in progress: Contextify is a metadata tagger that performs text and content enrichment. Contextify enriches information through text classification and content markup. How can we capture context from a text? I would start with…
Reblogging: A little tutorial on mapreduce
By Joel Westerberg at Follow the data This is a short tutorial to explain the concept of map/reduce. This tutorial can be executed on a Unix system, like Linux or OS X. We’ll first process the data sequentially and then…
Free Online Course: Agile development method for Software as a Service (SaaS) using Ruby on Rails
Software Engineering for SaaS https://www.coursera.org/course/saas Start: May 18 2012 This course teaches the engineering fundamentals for long-lived software using the highly-productive Agile development method for Software as a Service (SaaS) using Ruby on Rails. Twelve principles underlie the Agile Manifesto,…
Reblogging: Informer, Spring Issue
Informer Newsletter of the BCS Information Retrieval Specialist Group Spring 2012 Table of Contents Editorial: By Udo Kruschwitz on April 28, 2012 Conference Review: ECIR 2012 Industry Day: By Franco Maria Nardini on April 26, 2012 Book Review: Search Analytics…
Online Course: Social Network Analysis
An interesting online course is offered for free by Coursera through the University of Michigan in September 2012: Social Network Analysis “This course will use social network analysis, both its theory and computational tools, to make sense of the social and…
Dissemination: Acknowledgement Search Engine and Next Generation Search Engines
1) AckSeer is a beta automatic acknowledgment indexing search engine that explores automatic identification, entity extraction and indexing of acknowledgements from papers. In addition, acknowledged entities are extracted within the acknowledgment passages. Currently, AckSeer indexes acknowledgments from more than 500,000 papers in CiteSeerX. These acknowledgements contain…
Dissemination: Web Corpora Available
1) Common Crawl web corpus: WebDataCommons is offering 3.2 billion quads of current RDFa, Microdata and Microformat data extracted from 65.4 million websites. Two Common Crawl web corpora are available: one corpus consisting of 2.5 billion HTML pages dating from…