The WebGenre Blog: The power of genre applied to digital information. By Marina Santini » DaisyKB





DaisyKB is an open-source, machine-readable, multilingual lexical knowledge base designed for a broad range of end-users (language students, teachers, linguists, language technologists, etc.).

DaisyKB is a container where chunks of language in use are stored. DaisyKB relies on the assumption that words are used in “chunks” and not individually. A language chunk is made of several words that are customarily used together, such as “in my opinion,” “to make a long story short,” “How are you?” or “Know what I mean?”

Therefore, the main tenet underpinning DaisyKB is that single words are often not enough to understand the meaning of a sentence, an utterance, or a text. Language chunks, by contrast, show the behaviours and the different senses of word units; that is, they reveal the context, or situation, in which language is used. Since the use of natural languages varies according to context, context is the key to disambiguating meaning and to interpreting a spoken or written text correctly.

DaisyKB incorporates two types of context: 1) the pragmatic context, represented by situational categories such as genre, domain, register, style, frequent queries, etc., and 2) the linear co-text, represented by frequently co-occurring words, idioms, collocations, syntactic patterns, etc.
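To make the two types of context more concrete, here is a minimal sketch of how a single chunk entry might be represented. It is not DaisyKB's actual format: the names `PragmaticContext`, `ChunkEntry`, and `find_chunks`, the field layout, and the sample genre and register values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PragmaticContext:
    # Situational categories named in the text; the values used below are illustrative.
    genre: str = ""
    domain: str = ""
    register: str = ""


@dataclass
class ChunkEntry:
    # Hypothetical record layout, not the real DaisyKB schema.
    chunk: str                      # the multi-word unit as it appears in use
    language: str                   # language code, e.g. "en"
    pragmatic: PragmaticContext = field(default_factory=PragmaticContext)
    co_text: list[str] = field(default_factory=list)  # frequently co-occurring words


def find_chunks(text: str, entries: list[ChunkEntry]) -> list[ChunkEntry]:
    """Return entries whose chunk occurs in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return [e for e in entries if e.chunk.lower() in lowered]


entries = [
    ChunkEntry(
        "in my opinion",
        "en",
        PragmaticContext(genre="editorial", register="neutral"),
        ["I", "think"],
    ),
    ChunkEntry(
        "to make a long story short",
        "en",
        PragmaticContext(genre="conversation", register="informal"),
    ),
]

matches = find_chunks("In my opinion, the plan is sound.", entries)
print([m.chunk for m in matches])
```

The plain substring match is only a stand-in for lookup; a real knowledge base would presumably need tokenisation and handling of inflected or discontinuous chunks.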

Discussion Group

  • Marina Santini (Sweden)
  • Jussi Karlgren (Sweden)
  • Lars Borin (Sweden)
  • Lars Arhenberg (Sweden)
  • Rickard Domey (Sweden)
