TiMBL - Tilburg Memory-Based Learner
-
TiMBL
The TiMBL software package is a fast, tree-based implementation of k-nearest neighbor classification. The package includes the IB1, IB2, TRIBL, TRIBL2, and IGTree algorithms, and offers various weighting metrics. Server functionality is offered by the separate TimblServer wrapper.
-
Dimbl
TiMBL wrapper that performs parallel k-NN classification on multi-CPU machines.
-
paramsearch
Wrapped progressive sampling search for automatic optimization of algorithmic parameters, for TiMBL and other machine learning algorithms.
-
python-timbl
Python language bindings for TiMBL.
-
rtimbl
Ruby language bindings for TiMBL.
-
knngraph
Visualizes nearest neighbors in a TiMBL instance base.
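As an illustration of the k-nearest neighbor classification that TiMBL implements, here is a minimal Python sketch. It is not TiMBL's code or API: it uses a plain linear scan rather than a tree-based index, an unweighted overlap distance rather than any of TiMBL's weighting metrics, and the function names and the toy instance base are hypothetical.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def knn_classify(instance_base, query, k=1):
    """Return the majority class among the k stored instances nearest to query."""
    # sorted() is stable, so ties keep the order of the instance base.
    ranked = sorted(instance_base, key=lambda item: overlap_distance(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy instance base: (feature tuple, class label).
instance_base = [
    (("the", "DT", "cat"), "NP"),
    (("a", "DT", "dog"), "NP"),
    (("runs", "VB", "fast"), "VP"),
    (("eats", "VB", "fish"), "VP"),
]

print(knn_classify(instance_base, ("the", "DT", "dog"), k=3))
```

All training instances are kept in memory and classification is deferred until a query arrives, which is the memory-based idea; a real system would avoid scanning every stored instance for each query.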
Tools
-
CLAM
Computational Linguistics Application Mediator, which turns (legacy) NLP software into RESTful web services and web applications. Written by Maarten van Gompel.
-
ewnpy
A command line interface to the Dutch Eurowordnet. Written by Erwin Marsi.
-
chunklink
Converts Penn Treebank II files into a one-word-per-line format containing (at least) the same information as the original files. This script was used to generate the data for the CoNLL-2000 Shared Task. Written by Sabine Buchholz.
-
FoLiA
Format for Linguistic Annotation. A rich XML format supporting a wide variety of linguistic annotations, using an extensible and universal paradigm. Written by Maarten van Gompel.
-
PyNLPl
Python Natural Language Processing Library (PyNLPl, pronounced "pineapple"). A collection of Python modules for a wide variety of NLP tasks. Written by Maarten van Gompel.
-
suffixtree
C++ package implementing the suffix tree datatype. Written by Menno van Zaanen.
-
sarrays
C++ package implementing the suffix array datatype. Also prints ngrams and skipgrams. Written by Herman Stehouwer.
Packaged
TiMBL and TimblServer, MBT and MbtServer, and Ucto have been packaged for Debian, Ubuntu, and Fedora.
Consult this page for further instructions:
Generic NLP software
-
Mbt
A customizable tagger-generator and tagger combined in one. Based on a tagged corpus, Mbtg generates a tagger, for instance for part-of-speech tagging or named-entity recognition. Mbt processes text from left to right and uses a feedback loop to take its own previous decisions into account.
-
MBMT and PBMBMT
Memory-based machine translation based on trigrams (MBMT, written by Antal van den Bosch and Peter Berck) or phrases (PBMBMT, written by Maarten van Gompel).
-
ABL
The Alignment-Based Learning grammatical inference system by Menno van Zaanen.
-
DEMOCRAT
Deciding between Multiple Outputs Created by Automatic Translation, a consensus-driven machine translation system by Menno van Zaanen and Harold Somers.
-
WOPR
Memory-based language modeling, written by Peter Berck.
-
Ucto
Generic tokenizer and sentence splitter, written by Maarten van Gompel and Ko van der Sloot.
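The left-to-right feedback loop described for Mbt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a dictionary lookup stands in for the memory-based classifier, the feature set (previous tag plus current word) is far simpler than what a generated tagger would use, and the function name, the `memory` table, and the default tag are all hypothetical.

```python
def tag_left_to_right(words, memory, default_tag="N"):
    """Tag words one by one, feeding each predicted tag into the next decision."""
    tags = []
    prev_tag = "<s>"  # marks the start of the sentence
    for word in words:
        # Prefer an entry matching both the previous decision and the word,
        # then fall back to the word alone, then to a default tag.
        tag = memory.get((prev_tag, word)) or memory.get((None, word)) or default_tag
        tags.append(tag)
        prev_tag = tag  # feedback: the tagger's own output becomes context
    return tags


# Hypothetical lookup table: (previous tag or None, word) -> tag.
memory = {
    (None, "the"): "DT",
    (None, "can"): "MD",
    ("DT", "can"): "NN",
    (None, "rusts"): "VBZ",
}

print(tag_left_to_right(["the", "can", "rusts"], memory))
```

Because the previous decision is part of the lookup key, "can" is tagged differently after a determiner than at the start of a sentence; the same dependency means that an early mistake can influence later tags.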
Dutch language and speech technology
-
Frog
Frog (formerly called Tadpole) is a modular system integrating a tagger, lemmatizer, morphological analyzer, and dependency parser based on TiMBL and MBT. Read about Frog in this paper.
-
NeXTeNS
A multi-platform, open source text-to-speech system for Dutch.
Generic data mining software