The WebGenre Blog: The power of genre applied to digital information. By Marina Santini » Entries tagged with "language technology"

Book Review: Fundamentals of Predictive Text Mining 2nd Ed. (2015)

Book Review: Weiss S. M., Indurkhya N. and Zhang T. (2015). Fundamentals of Predictive Text Mining. Springer-Verlag, London. Second Edition. Informer website, Winter 2016 Issue, Book Review. The volume “Fundamentals of Predictive Text Mining”, 2nd ed., has nine chapters, a table of contents, a list of references, a Subject Index and an Author Index. The book also includes a Preface written by the three authors. Summary. Abbreviations: ML = Machine Learning; NLP = Natural Language Processing; IR = Information Retrieval. 1) In Chapter 1, “Overview of … Read entire article »

Filed under: featured, reading suggestions, reviews

Lecture: Word Senses

Outline: word senses, lexical semantics, homonymy, polysemy, metonymy, meronymy, antonymy, synonymy, hyponymy, hypernymy, WordNet, MeSH, BabelNet, lemma, wordform, zeugma test, Senseval, selectional restrictions, membership meronymy, part-whole meronymy, semantic analysis, language technology. Lecture: Word Senses from Marina Santini … Read entire article »

Filed under: featured, lectures, slides

Lecture 5: Interval Estimation (ML4LT)

Topics: inferential statistics, statistical inference, language technology, interval estimation, confidence interval, standard error, confidence level, z critical value, confidence interval for proportion, confidence interval for the mean, multiplier. Lecture 5: Interval Estimation from Marina Santini … Read entire article »
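As a rough illustration of how several of the listed topics fit together (standard error, z critical value, confidence level, confidence interval for a proportion), here is a minimal Python sketch of the standard normal-approximation interval. It is not taken from the lecture slides: the function name and the example counts (40 correct items out of 100 sampled) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion.

    Hypothetical helper for illustration; not from the lecture material.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    # z critical value (the "multiplier") for the chosen confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # standard error of the sample proportion
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Assumed example data: 40 of 100 sampled items judged correct.
low, high = proportion_confidence_interval(40, 100)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The approximation is usually considered reasonable only when the sample is large enough and the proportion is not too close to 0 or 1.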

Filed under: lectures

Lecture 2: Basic Concepts in Machine Learning for Language Technology

Machine Learning for Language Technology 2014 – Course Schedule … Read entire article »

Filed under: featured, lectures

Lecture 3: Structuring the Unstructured via Sentiment Analysis

Lecture 3: Structuring Unstructured Texts Through Sentiment Analysis from Marina Santini … Read entire article »

Filed under: lectures

Course: Semantic Analysis in Language Technology

Uppsala University: Department of Linguistics and Philology
Semantic Analysis in Language Technology (2013)
Credits: 7,5 hp
Syllabus: 5LN456
Teacher: Marina Santini

The course website will be updated regularly during the teaching session with additional material.
Last Updated: 23 October 2013
Course website: http://stp.lingfil.uu.se/~santinim/sais/sais_fall2013.htm

Nov, 12 (Tue) 10-12 9-2042 (Turing) Course introduction [OH]. J&M 17–18
Nov, 14 (Thu) 10-12 9-2042 (Turing) Introduction to essay assignment (EA) [OH].
Nov, 19 (Tue) 10-12 9-2042 (Turing) IE/PAS, PAS assignment [OH] Johansson and Nugues 2008, J&M 20.9
Nov, 21 (Thu) 10-12 9-2042 (Turing) EA and PAS supervision –
Nov, 26 (Tue) 10-12 9-2042 (Turing) Sentiment analysis BL 1–4
Nov, 28 (Thu) 10-12 9-2042 (Turing) Sentiment analysis BL 5–7
Dec, 03 (Tue) 10-12 9-2042 (Turing) Supervision –
Dec, 06 (Thu) Deadline EA, step 1
Dec, 10 (Tue) 10-12 9-2042 (Turing) EA presentations –
Dec, 12 (Thu) 10-12 9-2042 (Turing) WSD [OH] J&M 19–20.
Dec, 17 (Tue) 10-12 9-2042 (Turing) WSD. Deadline EA, feedback to another group (link to submitted essays below) –
Jan, 20 (Mon) 2014-01-20: Deadline, all assignments

Intended learning outcomes
In order to pass the course, a student must be able to: describe systems that perform the following tasks, apply them to authentic linguistic data, and evaluate the results: disambiguate instances of polysemous lemmas [word sense disambiguation, WSD]; use semantic analysis in the context of information extraction … Read entire article »

Filed under: announcements, lectures

Lecture 1: Introduction – Machine Learning for Language Technology

What Is Machine Learning? Machine learning is programming computers to optimize a performance criterion using example data or past experience. We have a model defined up to some parameters, and learning is the execution of a computer program to optimize the parameters of the model using the training data or past experience. The model may be predictive to make predictions in the future, or descriptive to gain knowledge from data, or both. Machine learning uses the theory of statistics in building mathematical models, because the core task is making inference from a sample. (Alpaydin, 2010) In this lecture, we discuss supervised learning starting from the simplest case. We introduce the concepts of: Margin, Noise, and Bias. … Read entire article »
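To make the quoted definition concrete, here is a minimal Python sketch of "a model defined up to some parameters" whose parameters are optimized on training data. The model (a straight line y = w*x + b), the least-squares criterion, the function names and the toy data are all assumptions made for illustration; they are not taken from the lecture or from Alpaydin (2010).

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope w and intercept b for y = w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


def mean_squared_error(w, b, xs, ys):
    """Performance criterion: average squared prediction error."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy training data (assumed): generated exactly by y = 2x + 1.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

# "Learning" here means choosing w and b to minimize the criterion.
w, b = fit_line(train_x, train_y)
error = mean_squared_error(w, b, train_x, train_y)

# The fitted model is predictive: it can be applied to an unseen input.
prediction = w * 5.0 + b
print(w, b, error, prediction)
```

Real training data is noisy, so the error is normally non-zero, which is where the noise and bias concepts mentioned in the lecture come in.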

Filed under: lectures

Towards a Safer Web (with Language Technology)

Last Updated: 25 June 2013. On 18 June 2013, I attended an interesting conference on cybersecurity. The conference was held in one of the conference rooms at the Police Academy in Rome*. The title of the conference was “Critical Infrastructure Protection – Telecommunications”** and Italian was the working language. The conference was organized by the I.C.S.A. Foundation (Intelligence Culture and Strategic Analysis) (http://www.fondazioneicsa.it/?lang=3). Those who can understand Italian can read a press release here: http://www.fondazioneicsa.it/UserFiles/File/convegno_polizia.pdf As you can imagine, there were many people working for the Police and Defence Departments, but also people coming from industry and academia. I attended this conference because, in my opinion, Language Technology (LT) can help cybersecurity in many ways. We are currently thinking of an LT project, SafeWEB, whose aim is to detect threatening, mischievous and treacherous … Read entire article »

Filed under: discussions, dissemination, reports

Opinion Retrieval and Ranking: the creeping and ineluctable force of Genre

Last Updated: 27 May 2013. Two fundamental principles that contribute to the definition and characterization of the concept of genre are conventions and expectations. Simply put, in textual (written or spoken) communication, genres are words that connote different types of text. For instance, on the web the home page genre is different from the blog genre; in a company, the minutes genre is different from the white paper genre; in the press the leader genre is different from the letter to the editor genre… Genres have the power of shaping information following rhetorical and discourse patterns that have become conventionalized. Genre conventions are implemented by the writer(s). When acknowledged, genre conventions raise predictable expectations in the readers or more generally in those who “process” a text… Although I am oversimplifying here, broadly speaking … Read entire article »

Filed under: discussions, quotes, reflections

Report: Language in the Digital Age – META-NORD National Workshop

Report: Language in the Digital Age – META-NORD National Workshop, by Marina Santini. Held in Stockholm, Sweden, 23 Nov 2012. Download program and presentations here. I was very happy to attend the workshop “Language in the Digital Age” last week in Stockholm. It was informative and inspiring. The workshop’s venue – Stacken at Nalen’s (a building from the end of the 19th century) – is a fascinating example of architectural re-use. Stacken (literally meaning “The Stack”, but probably a nickname to refer to the boxing ring) was formerly the boxing gym of the still existing Narva Boxningsklubb. Now Stacken is a cosy conference/banquet room decorated with four thin columns that add status and elegance to events (http://www.cityfinder.se/sv/node/1453#/10). The speakers and the audience (about 50 people) represented a wide range of interests, from the linguistic needs of the … Read entire article »

Filed under: reflections, reports, seminars