The WebGenre Blog: The power of genre applied to digital information. By Marina Santini » Archive

ML4LT: Machine Learning for Language Technology – A Gentle Introduction

Last updated: 27 Feb 2017. Log: Debriefing available (Jan 2016).

Marina Santini’s contact details: marinasantini dot ms at g-m-a-i-l

ML4LT is an online, self-paced introductory course in Machine Learning for Language Technology, designed for linguists and for undergraduate students in Computational Linguistics. The course comprises 10 lectures, both theoretical and practical. The practical part relies on the Weka Machine Learning Workbench (free software); see Lab1 for installation.

The content of this page is based on selected material from the course “ML4LT: Machine Learning for Language Technology 2016, Undergraduate Students”, Uppsala University. I will update this page regularly with links, videos, labs, assignments and literature, so keep an eye on the “last updated” date when visiting. The course and the linked material will be updated and upgraded … Read entire article »

Filed under: featured, lectures, slides

Lecture: Semantic Word Clouds

Topics: folksonomy, social tagging, tag clouds, automatic folksonomy construction, word clouds, Wordle, context-preserving word cloud visualisation, CPEWCV, seam carving, inflate and push, star forest, cycle cover, quantitative metrics, realized adjacencies, distortion, area utilization, compactness, aspect ratio, running time, semantics in language technology.

Lecture: Semantic Word Clouds from Marina Santini … Read entire article »

Filed under: featured, lectures, slides

Lecture: Ontologies and the Semantic Web

Topics: Semantic Web, Web 3.0, shared understanding, shared semantic annotation, tree of Porphyry, ontology, WordNet, MeSH, RDF, IRI, description logics, DLs, OWL, WebProtege, domain-specific, SPARQL, tags, ontology learning, classes, relations, axioms, instances, semantics in language technology.

Lecture: Ontologies and the Semantic Web from Marina Santini … Read entire article »

Filed under: featured, lectures, slides

Lecture: Summarization

Topics: abstracting, extractive summarization, abstractive summarization, summarization in question answering, single vs. multiple documents, query-focused summarization, snippets, unsupervised content selection, topic signature-based content selection, ROUGE (Recall-Oriented Understudy for Gisting Evaluation), semantics in language technology.

Lecture: Summarization from Marina Santini … Read entire article »

Filed under: featured, lectures, slides

Lecture: Relation Extraction

Topics: databases of relations, knowledge graph, DBpedia, Freebase, ACE, relation extractors, hand-written patterns, supervised machine learning, semi-supervised learning, bootstrapping, distant supervision, unsupervised learning from the web, semantic analysis in language technology.

Relation Extraction from Marina Santini … Read entire article »

Filed under: featured, lectures, slides

Lecture: Question Answering

Topics: IBM’s Watson, Apple’s Siri, WolframAlpha, factoid questions, complex questions, narrative questions, IR-based approaches, knowledge-based approaches, hybrid approaches, IR-based question answering, answer type taxonomy, passage retrieval, mean reciprocal rank, MRR, semantic analysis in language technology.

Lecture: Question Answering from Marina Santini … Read entire article »
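
As a small illustration of one topic listed above, mean reciprocal rank (MRR) can be sketched in a few lines of Python. This is not taken from the lecture slides: the function names and the toy questions are hypothetical, and the sketch assumes each question comes with a ranked list of candidate answers and a set of accepted answers.

```python
from typing import Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], gold: Set[str]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is correct."""
    for rank, answer in enumerate(ranked, start=1):
        if answer in gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over all questions (0.0 for an empty set)."""
    scores = [reciprocal_rank(ranked, gold) for ranked, gold in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (hypothetical): three questions with ranked system answers.
runs = [
    (["Paris", "Lyon"], {"Paris"}),          # correct at rank 1
    (["1912", "1905", "1915"], {"1915"}),    # correct at rank 3
    (["Mars"], {"Venus"}),                   # no correct answer returned
]
mrr = mean_reciprocal_rank(runs)
```

Here a question with no correct answer in its list contributes 0, which is one common convention; other evaluation setups may handle such questions differently.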

Filed under: featured, lectures, slides

Lecture: IE – Named Entity Recognition (NER)

Topics: Information Extraction, Named Entity Recognition, NER, text analytics, text mining, e-discovery, unstructured data, structured data, calendaring, standard evaluation per entity, standard evaluation per token, sequence classifier, sequence labeling, word shapes, semantic analysis in language technology.

IE: Named Entity Recognition (NER) from Marina Santini … Read entire article »

Filed under: featured, lectures, slides

Lecture: Vector/Distributional Semantics

Topics: term-context matrix, distributional models, Zellig Harris, John Rupert Firth, PMI, Pointwise Mutual Information, PPMI, Positive Pointwise Mutual Information, joint probability, marginals, smoothing, cosine metric, cosine similarity measure, dot product, vectors.

Lecture: Vector Semantics (aka Distributional Semantics) from Marina Santini … Read entire article »
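
To make two of the listed topics concrete, the following is a minimal sketch of PPMI weighting over a term-context count matrix, followed by cosine similarity between the resulting word vectors. It is an illustration under stated assumptions rather than the lecture's own code: the function names and the toy counts are hypothetical, PMI is computed as log2 of the joint probability over the product of the marginals, and no smoothing is applied.

```python
import math
from typing import List

Matrix = List[List[float]]


def ppmi_matrix(counts: Matrix) -> Matrix:
    """Turn raw co-occurrence counts into PPMI weights (rows = words, columns = contexts)."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]          # word marginals
    col_sums = [sum(col) for col in zip(*counts)]    # context marginals
    weighted = []
    for i, row in enumerate(counts):
        out = []
        for j, c in enumerate(row):
            if c == 0:
                out.append(0.0)  # PMI would be -infinity; PPMI clips it to 0
                continue
            # P(w,c) / (P(w) * P(c)) simplifies to c * total / (row_sum * col_sum)
            pmi = math.log2((c * total) / (row_sums[i] * col_sums[j]))
            out.append(max(pmi, 0.0))
        weighted.append(out)
    return weighted


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity: dot product divided by the product of the vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy counts (hypothetical): 3 words x 3 contexts.
counts = [
    [4.0, 0.0, 1.0],
    [3.0, 1.0, 0.0],
    [0.0, 5.0, 2.0],
]
weights = ppmi_matrix(counts)
similarity = cosine(weights[0], weights[1])
```

Because PPMI values are never negative, the cosine between two PPMI vectors falls between 0 and 1.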

Filed under: featured, lectures, slides

Lecture: Word Sense Disambiguation

Topics: word sense disambiguation, WSD, thesaurus-based methods, dictionary-based methods, supervised methods, Lesk algorithm, Michael Lesk, simplified Lesk, corpus Lesk, graph-based methods, word similarity, word relatedness, path-based similarity, information content, surprisal, Resnik method, Lin method, eLesk, extended Lesk, SemCor, collocational features, bag-of-words features, the window, lexical semantics, computational semantics, semantic analysis in language technology.

Lecture: Word Sense Disambiguation from Marina Santini … Read entire article »

Filed under: featured, lectures, slides

Lecture: Word Senses

Outline: word senses, lexical semantics, homonymy, polysemy, metonymy, meronymy, antonymy, synonymy, hyponymy, hypernymy, WordNet, MeSH, BabelNet, lemma, wordform, zeugma test, Senseval, selectional restrictions, membership meronymy, part-whole meronymy, semantic analysis, language technology.

Lecture: Word Senses from Marina Santini … Read entire article »

Filed under: featured, lectures, slides