MBT: Memory-Based Tagger
  
MBT: Memory-based tagger generation and tagging

MBT is a memory-based tagger-generator and tagger in one. The tagger-generator part can generate a sequence tagger on the basis of a training set of tagged sequences; the tagger part can tag new sequences. MBT can, for instance, be used to generate part-of-speech taggers or chunkers for natural language processing. It has also been used for named-entity recognition, information extraction in domain-specific texts, and disfluency chunking in transcribed speech.

Features
  • Tagger generation: tagged text in, tagger out
  • Optional feedback loop: feed previous tag decision back to input of next decision
  • Easily customizable feature representation
  • Allows user-provided features
  • Automatic generation of separate sub-taggers for known words and unknown words
  • Can make use of the full set of algorithmic parameters of TiMBL
  • NEW: server mode is now available through a separate package, MbtServer
  • Debian, Ubuntu, RPM, and Fink packages available
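
To make the feedback loop and the known/unknown word split listed above more concrete, here is a minimal Python sketch of a memory-based tagger. It is not MBT's implementation and does not use TiMBL: the function names, the three-feature context, the three-character suffix for unknown words, and the plain overlap-based nearest-neighbour lookup are all assumptions chosen for illustration.

```python
from collections import Counter


def known_features(words, i, prev_tag):
    """Context for a known word: previous tag (feedback), focus word, next word."""
    nxt = words[i + 1] if i + 1 < len(words) else "</s>"
    return (prev_tag, words[i], nxt)


def unknown_features(words, i, prev_tag):
    """Context for an unknown word: the focus word is replaced by its last 3 characters."""
    nxt = words[i + 1] if i + 1 < len(words) else "</s>"
    return (prev_tag, words[i][-3:], nxt)


def train(tagged_sentences):
    """Tagged text in, tagger out: store every training position in two memories."""
    model = {"known": [], "unknown": [], "lexicon": set()}
    for sentence in tagged_sentences:
        words = [word for word, _ in sentence]
        prev_tag = "<s>"
        for i, (word, tag) in enumerate(sentence):
            model["known"].append((known_features(words, i, prev_tag), tag))
            model["unknown"].append((unknown_features(words, i, prev_tag), tag))
            model["lexicon"].add(word)
            prev_tag = tag
    return model


def nearest_tag(memory, query):
    """Return the majority tag among the stored instances with the highest feature overlap."""
    best, votes = -1, Counter()
    for feats, tag in memory:
        score = sum(a == b for a, b in zip(feats, query))
        if score > best:
            best, votes = score, Counter([tag])
        elif score == best:
            votes[tag] += 1
    return votes.most_common(1)[0][0]


def tag_sentence(model, words):
    """Tag left to right, feeding each tag decision back into the next decision."""
    tags, prev_tag = [], "<s>"
    for i, word in enumerate(words):
        if word in model["lexicon"]:
            tag = nearest_tag(model["known"], known_features(words, i, prev_tag))
        else:
            tag = nearest_tag(model["unknown"], unknown_features(words, i, prev_tag))
        tags.append(tag)
        prev_tag = tag
    return tags


TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
model = train(TRAIN)
print(tag_sentence(model, ["the", "cat", "barks"]))
print(tag_sentence(model, ["a", "frog", "sleeps"]))  # "frog" is an unknown word
```

In this sketch the unknown word "frog" is handled by the second memory, which matches on the previous tag, the word ending, and the following word rather than on the word itself. A real tagger would use richer, configurable feature patterns and a tuned classifier.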
Documents and reference

MBT download and links

Consult these installation instructions for details on how to install this software if you are using a Debian, Ubuntu, or Fedora-based system. If you want to build the code from source yourself, download

MBT server functionality is now in a separate package, MbtServer. If previously you used MBT in server mode, you will now need to install and run MbtServer:

MBT requires installed versions of TiMBL (version 6.4 or higher) and TimblServer (version 1.3 or higher). See:

  • TiMBL: Tilburg Memory-Based Learner
  • TimblServer: TiMBL wrapper, adding server functionality to TiMBL
  • MBSP: demo of memory-based English shallow parsing, including Mbt
  • Frog: Dutch morpho-syntactic processing, including Mbt for PoS tagging
  • CGN Tagger-Lemmatizer: demo of Dutch PoS tagging with the CGN tag set, including Mbt
  • Kiswahili PoS tagger: demo using Mbt, at aflat.org

Further information

Walter Daelemans
CLiPS, Computational Linguistics and Psycholinguistics Research Center
University of Antwerp

Antal van den Bosch
ILK, Induction of Linguistic Knowledge Research Group
Tilburg University

Archived versions

You can find archived versions of Mbt in our public software repository:

http://software.ticc.uvt.nl/
