
Headline

Summary: Looking for Corpora…

Dear All, In this post I collect all the suggestions I received in response to the following request: “Looking for Corpora in….” http://www.forum.santini.se/2014/03/looking-for-corpora-to-explore-cross-linguality/ Big thanks to (I hope I have not forgotten anybody): Johannes Heinecke, Dominika Rogozinska, Mohamed-Zakaria KURDI, Bartosz Ziólko, Olga Whelan, Margarita Borreguero, Zuloaga, Ayesha Zafar, Will Snellen, Katherine (Katie) Skees Hund, Anna Matyszczyk, Massinissa Ahmim, Marcin Feder, Maria Pia Montoro, Lawrence Niculescu, Jesus Vilares, Ewa Gwiazdecka, Jack Bowers, Taner Sezer, Yvonne Adesam, Kadri Muischnek, Anne Tamm, Ralf Steinberger, Ricardo Campos, Edyta Jurkiewicz-Rohrbacher, Pat, Sara Castagnoli. Suggestions were sent through mailing lists: Corpora List (corpora@uib.no), BCS IRSG (IR@jiscmail.ac.uk); and through LinkedIn groups: Corpus Linguistics, Computational Linguistics, Natural Language Processing, Applied Linguistics, Terminology Services. I hope this list of corpora is useful for everybody working with multi- and cross-linguality. Please do not hesitate to contact me if you wish to contribute … Read entire article »

Latest

Looking for Corpora to explore Cross-Linguality

Dear All, I am looking for corpora of any genre in the following languages: English, Swedish, Polish, Italian, Finnish, Estonian, and Hungarian. I am already aware of a number of corpora (several posts in this blog are dedicated to the dissemination of corpora-related information). These corpora are mostly in English. I would now like to focus on: 1) additional languages and 2) additional genres, such as search query logs, TV scripts, emails, tweets, WhatsApp messages, etc. All genres are welcome! The only requirement is that corpora must be free and publicly available, so that everybody can replicate or extend experiments using the same corpora/datasets. The purpose of the experiments is to explore cross-linguality in different settings. Please, read the use … Read entire article »

Lecture 3: Structuring the Unstructured via Sentiment Analysis

Lecture 3: Structuring Unstructured Texts Through Sentiment Analysis from Marina Santini … Read entire article »

Lecture 2: From Semantics to Semantic-Oriented Applications

From the “Natural Language Processing” LinkedIn group: John Kontos, Professor of Artificial Intelligence: “I wonder whether translating into formal logic is nothing more than transliteration which simply isolates the part of the text that can be reasoned upon using the simple inference mechanism of formal logic. The real problem I think lies with the part of text that CANNOT be translated on the one hand and the one that changes its meaning due to civilization advances. My own proposal is to leave NL text alone and try building inference mechanisms for the UNTRANSLATED text depending on the task requirements. All the best John” … Read entire article »

Lecture 1: Semantic Analysis in Language Technology – Introduction


Lecture 1: Semantic Analysis in Language Technology from Marina Santini Quick overview on basic concepts of semantic analysis, lexical semantics, computational lexical semantics, computational semantics, formal semantics, representation of meaning… … Read entire article »

Course: Semantic Analysis in Language Technology

Uppsala University: Department of Linguistics and Philology
Semantic Analysis in Language Technology (2013)
Credits: 7,5 hp
Syllabus: 5LN456
Teacher: Marina Santini
The course website will be updated regularly during the teaching session with additional material. Last Updated: 23 October 2013
Course website: http://stp.lingfil.uu.se/~santinim/sais/sais_fall2013.htm
Nov, 12 (Tue) 10-12 9-2042 (Turing) Course introduction [OH]. J&M 17–18
Nov, 14 (Thu) 10-12 9-2042 (Turing) Introduction to essay assignment (EA) [OH].
Nov, 19 (Tue) 10-12 9-2042 (Turing) IE/PAS, PAS assignment [OH] Johansson and Nugues 2008, J&M 20.9
Nov, 21 (Thu) 10-12 9-2042 (Turing) EA and PAS supervision –
Nov, 26 (Tue) 10-12 9-2042 (Turing) Sentiment analysis BL 1–4
Nov, 28 (Thu) 10-12 9-2042 (Turing) Sentiment analysis BL 5–7
Dec, 03 (Tue) 10-12 9-2042 (Turing) Supervision –
Dec, 06 (Thu) Deadline EA, step 1
Dec, 10 (Tue) 10-12 9-2042 (Turing) EA presentations –
Dec, 12 (Thu) 10-12 9-2042 (Turing) WSD [OH] J&M 19–20.
Dec, 17 (Tue) 10-12 9-2042 (Turing) WSD. Deadline EA, feedback to another group (link to submitted essays below) –
Jan, 20 (Mon) 2014-01-20: Deadline, all assignments
Intended learning outcomes
In order to pass the course, a student must be able to: describe systems that perform the following tasks, apply them to authentic linguistic data, and evaluate the results: disambiguate … Read entire article »

Lecture 7: Learning from Massive Datasets

Lecture 7: Learning from Massive Datasets from Marina Santini In this lecture we explore how big datasets can be used with the Weka workbench and what other issues are currently under discussion in the real world, for example: big data applications, predictive linguistic analysis, new platforms and new programming languages. … Read entire article »

Cloud & Big Data Day

On 24th Sept 2013, I attended the CLOUD & BIG DATA DAY in Stockholm (Kista) organized by SICS and EIT ICT Labs. Cloud & Big Data Day is part of SICS Software Week, which takes place every year. The specific purpose of the Cloud & Big Data Day was to “feature leading international and Swedish experts from industry and academia, who present the cutting edge of cloud computing technologies. The intended audience is professionals in IT and its applications for all areas in industry and academia”. The presentations were all interesting and covered a wide range of projects and applications centered on BIG DATA: from how to harness petabytes of data at Spotify, to big cellular … Read entire article »

Lecture 6: Ensemble Methods

Lecture 6: Ensemble Methods from Marina Santini What is an “ensemble learner”? How can we combine different base learners into an ensemble in order to improve the overall classification performance? In this lecture, these questions are addressed. … Read entire article »
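
As a rough illustration of combining base learners, here is a minimal Python sketch of majority voting, one of the simplest ensemble schemes. It is not taken from the lecture slides: the three threshold-based base learners, the labels "pos"/"neg", and the function names are all hypothetical and chosen only to keep the example small.

```python
from collections import Counter
from typing import Callable, Sequence

# A base learner is modelled here as any function from a score to a label.
Classifier = Callable[[float], str]


def majority_vote(learners: Sequence[Classifier], x: float) -> str:
    """Return the label predicted by the largest number of base learners."""
    votes = Counter(learner(x) for learner in learners)
    return votes.most_common(1)[0][0]


# Three hypothetical base learners that disagree near their thresholds.
base_learners = [
    lambda x: "pos" if x > 0.3 else "neg",
    lambda x: "pos" if x > 0.5 else "neg",
    lambda x: "pos" if x > 0.7 else "neg",
]


def ensemble_predict(x: float) -> str:
    """Classify x with the majority vote of the three base learners."""
    return majority_vote(base_learners, x)


if __name__ == "__main__":
    for value in (0.2, 0.4, 0.6, 0.8):
        print(value, ensemble_predict(value))
```

With an odd number of learners and two labels there is always a strict majority, which is why the sketch does not handle ties. Schemes such as bagging or boosting differ mainly in how the base learners are trained and weighted before their predictions are combined.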

Lecture 5: Structured Prediction

Structured prediction or structured learning refers to supervised machine learning techniques that involve predicting structured objects, rather than single labels or real values. For example, the problem of translating a natural language sentence into a syntactic representation such as a parse tree can be seen as a structured prediction problem in which the structured output domain is the set of all possible parse trees. … Read entire article »
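
To make the idea of a structured output concrete, here is a minimal Python sketch that predicts a whole tag sequence for a sentence with the Viterbi algorithm. A tag sequence is used instead of a parse tree only because it is a smaller structured output domain; the tag set, the toy sentence, and every probability below are invented for illustration and do not come from the lecture.

```python
def viterbi(words, tags, start, trans, emit):
    """Return the highest-scoring tag sequence for words under a toy HMM."""
    # best[i][t]: score of the best tag sequence for words[:i + 1] ending in t
    best = [{t: start[t] * emit[t].get(words[0], 0.0) for t in tags}]
    back = []  # back[i][t]: best previous tag when words[i + 1] is tagged t
    for word in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] * trans[p][t])
            scores[t] = best[-1][prev] * trans[prev][t] * emit[t].get(word, 0.0)
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the sequence.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical model parameters (all numbers are made up).
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.1, "VERB": 0.8},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 1.0},
    "NOUN": {"dog": 0.6, "barks": 0.4},
    "VERB": {"dog": 0.3, "barks": 0.7},
}

if __name__ == "__main__":
    print(viterbi(["the", "dog", "barks"], TAGS, START, TRANS, EMIT))
```

The point of the sketch is that the prediction is a single joint object (the whole sequence), scored as a unit, rather than one independent label per word.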