WOPR: Memory-based word prediction and language modeling
WOPR is a wrapper around the k-nearest neighbor classifier in TiMBL,
offering word prediction and language modeling functionality. Trained on a
text corpus, WOPR can predict missing words, report perplexities at
the word level and the text level, and generate spelling correction
hypotheses. Read more about WOPR in the reference guide, which offers installation instructions, walkthroughs, basic performance measurements (including a comparison to SRILM), and an overview of options.
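To give a feel for what memory-based word prediction means, here is a minimal Python sketch. It is not WOPR or TiMBL code: the function names (`make_instances`, `overlap_distance`, `predict`), the toy corpus, and the unweighted overlap metric are all assumptions made for illustration. It stores every (left context, next word) pair from the training text and predicts by letting the nearest stored contexts vote.

```python
from collections import Counter

def make_instances(tokens, n=2):
    """Turn a token list into (context, target) pairs with n words of left context."""
    padded = ["_"] * n + list(tokens)
    return [(tuple(padded[i:i + n]), padded[i + n]) for i in range(len(tokens))]

def overlap_distance(a, b):
    """Count the context positions where two contexts differ (hypothetical metric)."""
    return sum(x != y for x, y in zip(a, b))

def predict(memory, context, k=1):
    """Return a next-word distribution voted by the k nearest distance levels."""
    context = tuple(context)
    levels = sorted({overlap_distance(context, c) for c, _ in memory})
    allowed = set(levels[:k])
    votes = Counter(t for c, t in memory if overlap_distance(context, c) in allowed)
    total = sum(votes.values())
    return {word: count / total for word, count in votes.items()}

corpus = "the cat sat on the mat and the cat ate the fish".split()
memory = make_instances(corpus, n=2)

# A context seen in training: exact matches vote.
dist = predict(memory, ("the", "cat"))
print(dist)

# An unseen context: the closest partial matches vote instead.
fallback = predict(memory, ("a", "the"))
print(max(fallback, key=fallback.get))
```

The point of the sketch is the fallback behaviour: when no stored context matches exactly, the prediction comes from the most similar contexts rather than from a separate smoothing step. A real classifier would typically also weight the context positions, which this sketch does not do.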
The name WOPR is a blatant cultural reference to the
mainframe computer WOPR ("War Operation Plan
Response"), a key player in the 1983 US movie WarGames. Through a
hacked dial-up phone connection, WOPR plays games with a teenager, played by a young
Matthew Broderick, almost causing a full-scale nuclear war. Image from
Wikipedia.
Features
- Generates language models
- Tests language models on new text, reporting text-level perplexities, prediction distributions, and word-level entropies and perplexities
- Optionally exports ARPA-formatted language model files
- Optionally filters its output for spelling correction candidates
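As an illustration of the word-level and text-level figures mentioned in the list above, the following Python sketch computes them from a list of per-word probabilities using the standard textbook definitions. The function names and the example probabilities are assumptions made for illustration; WOPR's own output may differ in details such as the treatment of unknown words.

```python
import math

def word_entropies(probs):
    """Word-level entropy in bits: -log2 of the probability assigned to each word."""
    return [-math.log2(p) for p in probs]

def word_perplexities(probs):
    """Word-level perplexity: 2 to the power of the word-level entropy, i.e. 1/p."""
    return [2 ** h for h in word_entropies(probs)]

def text_perplexity(probs):
    """Text-level perplexity: 2 to the power of the mean word-level entropy."""
    entropies = word_entropies(probs)
    return 2 ** (sum(entropies) / len(entropies))

# Hypothetical probabilities a model assigned to the four words of a test sentence.
probs = [0.5, 0.25, 0.125, 0.125]
print(word_entropies(probs))
print(word_perplexities(probs))
print(text_perplexity(probs))
```

A lower text-level perplexity means the model was, on average, less surprised by the test text.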
WOPR is free software; you can
redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free
Software Foundation.
Written by
WOPR is written by Peter Berck, with input from Antal van den
Bosch and Ko van der Sloot.
Archived versions
You can find archived versions of WOPR in our public software repository:
Download and installation
To install WOPR, first make sure that TiMBL (version
6.4 or higher) is installed. After unpacking the tarball, './configure
--prefix=<install_dir>; make; make install' should work, provided
you have write permission in install_dir and TiMBL is
installed there as well.
For documentation on WOPR, see the
following PDF document. This document
supersedes the old wiki documentation (still
available here). For
information on using WOPR
in Memory-based Machine
Translation, follow
this HOWTO.
WOPR has been tested on:
- Intel platforms running several versions of Linux
- AMD64 platforms running Gentoo Linux
- Mac OS X
WOPR incorporated
WOPR is used in MBMT, our memory-based machine translation software.
Sponsor
WOPR is developed as part of the Implicit Linguistics project, funded
by NWO, the Netherlands Organisation
for Scientific Research.
References
For more information and background on WOPR, see
Word Salad Generator (Demo)
WOPR can also be used to generate sentences, by letting it predict the next word on its own output. For word salads in several languages and genres, click:
English, in the style of Jane Austen's Emma
English movie subtitles
Dutch newspaper text
English+Dutch movie subtitles
Dutch Wikipedia, 4-gram model
Dutch Wikipedia, 6-gram model
Swedish European Parliament speeches and debates
It can take up to thirty seconds before you get a reply the first time.
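The idea behind the word salad generator, feeding each predicted word back in as context for the next prediction, can be sketched in a few lines of Python. This is a toy stand-in, not the demo's implementation: it uses a plain bigram table instead of WOPR's memory-based classifier, and the names `train_bigrams` and `generate`, the toy corpus, and the random seed are assumptions made for illustration.

```python
import random
from collections import defaultdict

def train_bigrams(tokens):
    """Map each word to the list of words that followed it in the training text."""
    successors = defaultdict(list)
    for prev, nxt in zip(tokens, tokens[1:]):
        successors[prev].append(nxt)
    return successors

def generate(successors, start, length=10, seed=0):
    """Generate text by repeatedly predicting the next word from the model's own output."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length:
        candidates = successors.get(out[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        out.append(rng.choice(candidates))
    return out

tokens = "the cat sat on the mat and the cat ate the fish".split()
model = train_bigrams(tokens)
salad = generate(model, "the", length=8, seed=1)
print(" ".join(salad))
```

Because each step only sees a short context, the output is locally plausible but globally incoherent, which is where the "salad" comes from. Longer contexts, as in the 4-gram and 6-gram models listed above, tend to produce longer coherent stretches.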