WOPR: Memory-based word prediction and language modeling
  

WOPR is a wrapper around the k-nearest neighbor classifier in TiMBL, offering word prediction and language modeling functionality. Trained on a text corpus, WOPR can predict missing words, report perplexities at the word level and the text level, and generate spelling correction hypotheses. Read more about WOPR in the reference guide, which offers installation instructions, walkthroughs, basic performance measurements (including a comparison to SRILM), and an overview of options.

The WOPR name is a blatant cultural reference to the mainframe computer WOPR ("War Operation Plan Response"), which plays a key role in the 1983 US movie WarGames. Through a hacked phone dial-up connection, WOPR enjoys playing games with a teenager played by a young Matthew Broderick, almost causing a full-scale nuclear war.

Features

  • Generates language models
  • Tests language models on new text, reporting prediction distributions and word- and text-level entropies and perplexities
  • Optionally exports ARPA-formatted language model files
  • Optionally filters its output for spelling correction candidates
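The word- and text-level perplexities WOPR reports follow the standard information-theoretic definitions. A minimal sketch (not WOPR's actual code), assuming the per-word prediction probabilities are already available:

```python
import math

def word_entropies(probs):
    """Per-word entropy in bits: -log2 of each word's predicted probability."""
    return [-math.log2(p) for p in probs]

def perplexity(probs):
    """Text-level perplexity: 2 raised to the mean per-word entropy,
    i.e. the inverse geometric mean of the probabilities."""
    return 2 ** (sum(word_entropies(probs)) / len(probs))

# Four words, each predicted with probability 1/4:
# mean entropy is 2 bits, so perplexity is 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # → 4.0
```

A model that assigns each word probability 1/k thus scores perplexity k, which is why perplexity is often read as an effective branching factor.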

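The optional ARPA export writes the standard back-off language model format read by tools such as SRILM. An illustrative fragment (the counts and log10 probabilities below are made up):

```
\data\
ngram 1=3
ngram 2=2

\1-grams:
-0.4771	the	-0.3010
-0.4771	cat
-0.4771	</s>

\2-grams:
-0.3010	the cat
-0.3010	cat </s>

\end\
```

Each line holds a log10 probability, the n-gram itself, and, where applicable, a back-off weight.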
WOPR is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation.

Written by

WOPR is written by Peter Berck, with input from Antal van den Bosch and Ko van der Sloot.
Archived versions

You can find archived versions of WOPR in our public software repository:

Download and installation

To install, first make sure you have installed TiMBL (version 6.4 or higher). After unpacking the tarball, './configure --prefix=<install_dir>; make; make install' should work, provided you have write permission in <install_dir> and TiMBL is installed there as well.

For documentation on WOPR, see the following PDF document. It supersedes the old wiki documentation (still available here). For information on using WOPR in memory-based machine translation, follow this HOWTO.

WOPR has been tested on

  • Intel platforms running several versions of Linux
  • AMD64 platforms running Gentoo Linux
  • Mac OS X platforms


WOPR incorporated

WOPR is used in MBMT, our memory-based machine translation software.


Sponsor

WOPR is developed as part of the Implicit Linguistics project, funded by NWO, the Netherlands Organisation for Scientific Research.


References

For more information and background on WOPR, see


Word Salad Generator (Demo)

WOPR can also be used to generate sentences, by feeding its own output back as input and repeatedly predicting the next word. For word salads in several languages and genres, click:

  • English, in the style of Jane Austen's Emma
  • English movie subtitles
  • Dutch newspaper text
  • English+Dutch movie subtitles
  • Dutch Wikipedia, 4-gram model
  • Dutch Wikipedia, 6-gram model
  • Swedish European Parliament speeches and debates

It can take up to thirty seconds before you get a reply the first time.
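The generate-on-own-output loop behind the demo can be sketched as follows. This is a toy illustration only: a hypothetical bigram lookup table stands in for TiMBL's k-nearest neighbor classifier, and the model contents are made up:

```python
import random

def generate(model, start, length, seed=0):
    """Feed each predicted word back in as the next context,
    as in WOPR's word-salad demo (toy bigram stand-in for k-NN)."""
    rng = random.Random(seed)  # fixed seed for reproducible output
    words = [start]
    for _ in range(length):
        candidates = model.get(words[-1])
        if not candidates:
            break  # no continuation known for this context
        words.append(rng.choice(candidates))
    return " ".join(words)

# Hypothetical toy model mapping each word to its possible successors.
model = {
    "the": ["cat", "dog"],
    "cat": ["sat"],
    "dog": ["sat"],
    "sat": ["the"],  # loops back, so generation can continue
}
print(generate(model, "the", 5))
```

WOPR's real generator conditions on a longer n-gram context (the demo links above use 4- and 6-gram models), but the feedback loop is the same: the freshly predicted word becomes part of the context for the next prediction.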

Antal.vdnBosch@uvt.nl