Morpho-phonology, text-to-speech
-
NeXTeNS: Open Source Text-to-Speech for Dutch
NeXTeNS stands for 'Nederlandse Extensie voor Tekst naar Spraak', or
'Dutch Extension for Text to Speech'. The project aims to develop a
modern, clean, multi-platform, open source text-to-speech system for
Dutch that is freely available for research and education purposes.
-
MBLEM and MBMA: Memory-Based Lemmatization and Morphological Analysis
MBLEM is a lemmatizer for
English, German, and Dutch. It converts inflected wordforms to their
lemmas. MBMA, a morphological analyzer for Dutch, performs basic
segmentation, redoes spelling changes, identifies inflectional
features, and assigns POS tags.
Lexical semantics and collocations
-
MBWSD-D: Dutch Word Sense Disambiguation
This Dutch Word Sense Disambiguation demo, trained on the Schrooten and Vermeer corpus, assigns sense tags to polysemous words on the basis of the context they occur in. The demo is limited to a small set of frequent words.
-
Frasometer: n-gram strength / surprise visualisation in text
This demo visualises the degree of surprise in Dutch text: which words
are expected to occur in each other's neighbourhood, and which are not.
The underlying statistics are log-likelihood weights of word n-grams,
with n ranging from 2 to 25, computed on a 120-million-word corpus of
Dutch newspaper text.
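To make the idea of a log-likelihood weight concrete, here is a minimal sketch for the simplest case, n = 2. It assumes the weight is Dunning's log-likelihood ratio (G2) computed from a 2x2 contingency table of bigram counts; the page does not say which formula Frasometer uses, and the function name `bigram_log_likelihood` is hypothetical.

```python
import math
from collections import Counter


def bigram_log_likelihood(tokens):
    """Return a G2 association score for every bigram in `tokens`.

    Assumption: Dunning's log-likelihood ratio over a 2x2 contingency
    table. This is an illustration, not Frasometer's actual code.
    """
    pairs = list(zip(tokens, tokens[1:]))
    n = len(pairs)                       # number of bigram positions
    pair_counts = Counter(pairs)
    left = Counter(a for a, _ in pairs)   # how often a word starts a bigram
    right = Counter(b for _, b in pairs)  # how often a word ends a bigram

    scores = {}
    for (a, b), k11 in pair_counts.items():
        k12 = left[a] - k11           # a followed by something other than b
        k21 = right[b] - k11          # b preceded by something other than a
        k22 = n - k11 - k12 - k21     # neither a first nor b second
        rows = (k11 + k12, k21 + k22)
        cols = (k11 + k21, k12 + k22)
        g2 = 0.0
        for i, row in enumerate(((k11, k12), (k21, k22))):
            for j, observed in enumerate(row):
                if observed > 0:      # empty cells contribute nothing
                    expected = rows[i] * cols[j] / n
                    g2 += observed * math.log(observed / expected)
        scores[(a, b)] = 2.0 * g2
    return scores


scores = bigram_log_likelihood("a b a b a b c d".split())
# Higher scores mean the two words co-occur more than chance predicts.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

A low score marks a word pair whose co-occurrence is close to what chance would predict. How scores for longer n-grams are computed and combined into a visualisation is not described here, so the sketch stops at bigrams.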
-
Word Salad Generator
WOPR, a memory-based language model, can be (ab)used to generate "sentences" by letting it repeatedly predict the next word from its own output. Try our Dutch, Swedish, and English word salad generators.
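The generation loop described above can be sketched as follows. This is not WOPR: a plain frequency table over two-word contexts stands in for the memory-based predictor, and the names `train_next_word_table` and `generate_word_salad` are hypothetical. The sketch only illustrates feeding each predicted word back in as context for the next prediction.

```python
import random
from collections import Counter, defaultdict


def train_next_word_table(tokens, context_size=2):
    """Count which word follows each context of `context_size` words.

    Assumption: a simple count table replaces WOPR's memory-based model.
    """
    table = defaultdict(Counter)
    for i in range(len(tokens) - context_size):
        context = tuple(tokens[i:i + context_size])
        table[context][tokens[i + context_size]] += 1
    return table


def generate_word_salad(table, seed_words, length, rng):
    """Extend `seed_words` by sampling the next word from the table,
    using the model's own output as the context for each new step."""
    context_size = len(seed_words)
    output = list(seed_words)
    while len(output) < length:
        context = tuple(output[-context_size:])
        candidates = table.get(context)
        if not candidates:            # unseen context: stop generating
            break
        words = list(candidates.keys())
        weights = list(candidates.values())
        output.append(rng.choices(words, weights=weights, k=1)[0])
    return output


corpus = "the cat sat on the mat and the cat ran".split()
table = train_next_word_table(corpus)
print(" ".join(generate_word_salad(table, ("the", "cat"), 8, random.Random(0))))
```

Because each step sees only the last two words, the output is locally plausible but globally incoherent, which is where the "word salad" name comes from.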
PoS tagging and parsing
-
CGN tagger-lemmatizer
Tagger-lemmatizer developed for the Spoken Dutch
Corpus. Two demos are available: one trained purely on transcribed
speech, and one adapted to work with written Dutch. The tagger is
based on MBT; the lemmatizer is based on MBLEM. The tagger for
written Dutch also accepts ASCII text files as input, producing tagged
and lemmatized files as output.
Information retrieval, extraction, and recommendation
-
ILK Expert Ranker
The Expert Ranker web demo ranks the members of the ILK workgroup by their expertise on the topic of the query text you enter.
-
Factoid Memory Machine (Currently down)
Search for factoids in Dutch newspapers from 1985-1998; the demo
answers "who", "where", and "what" questions (in Dutch). Based on
shallow parsing and named-entity recognition.