TreeTalk: Memory-Based Grapheme-Phoneme Conversion Demo
The TreeTalk demo converts Dutch or English words to their
phonetic transcription in the SAMPA (Dutch) or DISC (English) phonetic
alphabet, and also generates speech audio. This speech audio is
synthesized by the MBROLA speech synthesizer
from the Circuit Theory and Signal
Processing Lab at the Faculté Polytechnique de Mons.
Primary word stress is indicated
for each word by an apostrophe (') preceding the stressed
syllable. Words may be at most 50 characters long.
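
The input limit and the stress notation described above can be sketched in a few lines of Python. This is only an illustration, not TreeTalk's own code: the helper names `is_acceptable_word` and `split_at_primary_stress` are hypothetical, the sample transcription is invented rather than real demo output, and the sketch assumes the apostrophe appears in a transcription only as the primary stress mark.

```python
from typing import Tuple

MAX_WORD_LENGTH = 50  # word length limit stated on this page
STRESS_MARK = "'"     # apostrophe placed before the stressed syllable


def is_acceptable_word(word: str) -> bool:
    """Return True if the word is non-empty and within the length limit."""
    return 0 < len(word) <= MAX_WORD_LENGTH


def split_at_primary_stress(transcription: str) -> Tuple[str, str]:
    """Split a transcription at its primary stress mark.

    Returns the part before the mark and the part that starts at the
    stressed syllable (the mark itself is dropped). Raises ValueError
    unless the transcription contains exactly one stress mark.
    """
    if transcription.count(STRESS_MARK) != 1:
        raise ValueError("expected exactly one primary stress mark")
    before, _, stressed_onward = transcription.partition(STRESS_MARK)
    return before, stressed_onward


if __name__ == "__main__":
    # Invented example string, not actual TreeTalk output.
    print(is_acceptable_word("banaan"))
    print(split_at_primary_stress("ba'nan"))
```

A client of the demo could run a check like `is_acceptable_word` before submitting a word, and use the split to locate the stressed syllable in the returned transcription.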
TreeTalk is currently in development within the PhD project of Bertjan
Busser. A short paper about the Dutch version of TreeTalk is
available (first in list below).
Work in progress: we are working on predicting suprasegmental prosodic
information, so that TreeTalk will be able to predict prosodic breaks,
boundary tones, and sentence accents. We also plan to incorporate
abbreviation expansion and more intelligent tokenization.
Previous work on grapheme-phoneme conversion and earlier descriptions of
TreeTalk by the ILK
group can be found in the following publications:
- G.J. Busser (1998). TreeTalk-D: A Machine Learning approach to Dutch
  word pronunciation. In P. Sojka, V. Matousek, K. Pala, and I. Kopecek
  (Eds.), Proceedings TSD Conference, Masaryk University, Czech Republic,
  pp. 3-8.
- W. Daelemans and A. van den Bosch (1993). Tabtalk: Reusability in
  Data-Oriented Grapheme-to-Phoneme Conversion. In Proceedings of
  Eurospeech, Berlin, pp. 1459-1466.
- A. van den Bosch and W. Daelemans (1993). Data-Oriented Methods for
  Grapheme-to-Phoneme Conversion. In Proceedings of the Sixth Conference
  of the European Chapter of the ACL, ACL, pp. 45-53.
- A. van den Bosch, A. Content, W. Daelemans, and B. de Gelder (1994).
  Measuring the Complexity of Writing Systems. Journal of Quantitative
  Linguistics, 1(3), pp. 178-188.
- W. Daelemans and A. van den Bosch (1996). Language-Independent
  Data-Oriented Grapheme-to-Phoneme Conversion. In J. van Santen,
  R. Sproat, J. Olive, and J. Hirschberg (Eds.), Progress in Speech
  Synthesis. New York: Springer Verlag, pp. 77-90.
- A. van den Bosch (1997). Learning to pronounce written words. A study in
  inductive language learning. Ph.D. Thesis, Universiteit Maastricht, The
  Netherlands. Cadier en Keer: Phidippides.

Copyright © 1998 ILK
Research Group, Tilburg University. All rights reserved.
Last update: 14 December 1999