ILK: Induction of Linguistic Knowledge Research Group - Demos
  
ILK Demos
Morpho-phonology, text-to-speech
  • NeXTeNS: Open Source Text-to-Speech for Dutch

    NeXTeNS stands for "Nederlandse Extensie voor Tekst naar Spraak", or "Dutch Extension for Text to Speech". The project aims to develop a modern, clean, multi-platform, open source text-to-speech system for Dutch that is freely available for research and education.

  • MBLEM and MBMA: Memory-Based Lemmatization and Morphological Analysis

    MBLEM is a lemmatizer for English, German, and Dutch. It converts inflected wordforms to their lemmas. MBMA, a morphological analyzer for Dutch, performs basic segmentation, undoes spelling changes, determines inflectional features, and assigns POS tags.
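
To illustrate the general idea of memory-based lemmatization, here is a minimal sketch. It is not MBLEM's actual implementation: the features (the last three letters), the rule format, the overlap metric, and all function names are assumptions chosen for illustration. The sketch stores training examples in memory and lemmatizes a new wordform by applying the rewrite rule of the most similar stored example.

```python
def suffix_features(word, n=3):
    """Last n letters of the word, left-padded with '_' for short words."""
    return tuple(word.rjust(n, "_")[-n:])

def make_rule(wordform, lemma):
    """Describe wordform -> lemma as (number of chars to strip, suffix to add)."""
    common = 0
    while common < min(len(wordform), len(lemma)) and wordform[common] == lemma[common]:
        common += 1
    return (len(wordform) - common, lemma[common:])

def apply_rule(word, rule):
    """Strip the given number of final characters, then append the new suffix."""
    strip, add = rule
    return word[: len(word) - strip] + add

def lemmatize(word, memory):
    """Apply the rule of the stored example whose suffix overlaps most with the word."""
    feats = suffix_features(word)

    def overlap(example):
        return sum(a == b for a, b in zip(feats, example[0]))

    _, rule = max(memory, key=overlap)  # ties go to the earliest stored example
    return apply_rule(word, rule)

# Hypothetical toy training data (wordform, lemma); a real system uses a large lexicon.
training_pairs = [("walked", "walk"), ("cats", "cat"), ("running", "run"), ("tries", "try")]
memory = [(suffix_features(w), make_rule(w, l)) for w, l in training_pairs]

print(lemmatize("jumped", memory))
print(lemmatize("hats", memory))
```

With so few stored examples the sketch generalizes poorly; it only shows how classification by similarity to remembered examples can yield a lemma for an unseen wordform.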

Lexical semantics and collocations
  • MBWSD-D: Dutch Word Sense Disambiguation

    This Dutch Word Sense Disambiguation demo, trained on the Schrooten and Vermeer corpus, assigns sense tags to polysemous words on the basis of the context they occur in. The demo is limited to a small set of frequent words.

  • Frasometer: n-gram strength / surprise visualisation in text

    This demo visualises the degree of surprise in Dutch text: which words are expected to occur in each other's neighbourhood, and which are not. The underlying statistics are log-likelihood weights of word n-grams, with n ranging from 2 to 25, computed on a 120-million-word corpus of Dutch newspaper text.

  • Word Salad Generator

    WOPR, a memory-based language model, can be (ab)used to generate "sentences" by letting it repeatedly predict the next word from its own output. Try our Dutch, Swedish, and English word salad generators.
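
The generation loop behind a word salad can be sketched as follows. This is a toy stand-in, not WOPR: it assumes a simple bigram table instead of a memory-based classifier, and the function names and the tiny corpus are hypothetical. The point it illustrates is the feedback loop, in which each predicted word becomes the context for the next prediction.

```python
import random
from collections import defaultdict

def train_bigrams(tokens):
    """Map each word to the list of words observed directly after it."""
    followers = defaultdict(list)
    for prev, nxt in zip(tokens, tokens[1:]):
        followers[prev].append(nxt)
    return followers

def generate(followers, seed_word, length, rng):
    """Generate words by repeatedly predicting from the model's own last output."""
    output = [seed_word]
    while len(output) < length:
        candidates = followers.get(output[-1])
        if not candidates:  # dead end: the last word was never seen with a follower
            break
        output.append(rng.choice(candidates))
    return output

# Hypothetical toy corpus; the demos are trained on far larger text collections.
corpus = "the cat sat on the mat and the dog sat on the cat".split()
model = train_bigrams(corpus)
salad = generate(model, "the", 8, random.Random(0))
print(" ".join(salad))
```

Because every step conditions only on the previous output word, the result is locally plausible but globally incoherent, which is what makes it a "salad".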

PoS tagging and parsing
  • CGN tagger-lemmatizer

    Tagger-lemmatizer developed for the Spoken Dutch Corpus. Two demos are available: one purely trained on transcribed speech, and one adapted to work with written Dutch. The tagger is based on MBT; the lemmatizer is based on MBLEM. The tagger for written Dutch also accepts ASCII text files as input, producing tagged and lemmatized files as output.

Information retrieval, extraction, and recommendation
  • ILK Expert Ranker

    The Expert Ranker web demo ranks the members of the ILK workgroup by their expertise on the topic of the query text you enter.

  • Factoid Memory Machine (Currently down)

    Search for factoids in Dutch newspapers from 1985-1998; the demo answers "who", "where", and "what" questions (in Dutch). Based on shallow parsing and named-entity recognition.

Antal.vdnBosch@uvt.nl