The focus of the ROLAQUAD research project (2004-2008) was basic research in the context of a multiple-turn question-answering dialogue application. Team members of ROLAQUAD published work on the following topics:

  • Constraint satisfaction inference for structured output learning, applied to, among other tasks,
    • Open and closed-domain entity recognition;
    • Multi-label document classification;
    • Dependency parsing;
    • Machine translation.
  • Taxonomic knowledge extraction from semi-structured encyclopedic medical texts;
  • Question-answering in a closed domain through semantic tagging of medical concepts and relations;
  • Pragma-semantic tagging of dialogue acts.

Software

Apart from project-internal modules for question answering and semantic tagging, the ROLAQUAD project spun off or contributed to the following open source software releases:

Publications

In reverse chronological order. (See also: all ILK publications)
  • Canisius, S., and Van den Bosch, A. (2007). Recompiling a knowledge-based dependency parser into memory. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-2007), Borovets, Bulgaria, pp. 104-108. [pdf]
  • Van den Bosch, A., Busser, G.J., Canisius, S., and Daelemans, W. (2007). An efficient memory-based morpho-syntactic tagger and parser for Dutch. In P. Dirix, I. Schuurman, V. Vandeghinste, and F. Van Eynde (Eds.), Computational Linguistics in the Netherlands: Selected Papers from the Seventeenth CLIN Meeting, Leuven, Belgium, pp. 99-114. [preprint pdf]
  • Lendvai, P., and Geertzen, J. (2007). Token-based chunking of turn-internal dialogue act sequences. In Proceedings of the 8th SIGDIAL Workshop on Discourse and Dialogue, Antwerp, Belgium, pp. 174-181. [pdf]
  • Spitters, M., De Boni, M., Zavrel, J., and Bonnema, R. (2007). Learning to compose effective strategies from a library of dialogue components. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, pp. 792-799. [pdf]
  • Canisius, S., and Sporleder, C. (2007). Bootstrapping information extraction from field books. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague, Czech Republic, pp. 827-836. [pdf, bib]
  • Canisius, S., and Tjong Kim Sang, E. (2007). A constraint satisfaction approach to dependency parsing. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague, Czech Republic, pp. 1124-1128. [pdf, bib]
  • Canisius, S., and Sporleder, C. (2007). Learning to segment and label semi-structured documents with little or no supervision. In P. Adriaans, M. van Someren, and S. Katrenko (Eds.), Proceedings of the 18th BENELEARN Conference, Amsterdam, The Netherlands. [pdf]
  • Canisius, S., Bogers, T., Van den Bosch, A., Geertzen, J., and Tjong Kim Sang, E. (2006). Dependency parsing by inference over high-recall dependency predictions. In Proceedings of the Tenth Conference on Computational Natural Language Learning, CoNLL-X, June 2006, New York City, NY. [pdf]
  • Van den Bosch, A., and Canisius, S. (2006). Improved morpho-phonological sequence processing with constraint satisfaction inference. In Proceedings of the Eighth Meeting of the ACL Special Interest Group in Computational Phonology, SIGPHON '06, June 2006, New York City, NY. [pdf]
  • Canisius, S., Van den Bosch, A., and Daelemans, W. (2006). Constraint satisfaction inference: Non-probabilistic global inference for sequence labelling. In Proceedings of the EACL 2006 Workshop on Learning Structured Information in Natural Language Applications, Trento, April 2006. [pdf]
  • Lendvai, P. (2005). Conceptual taxonomy identification in medical documents. In Proceedings of The Second International Workshop on Knowledge Discovery and Ontologies (KDO-2005), held within ECML/PKDD, Porto, Portugal, 2005, pp. 31-38. [pdf]
  • Lendvai, P. (2005). Taxonómia felismerése dokumentumszerkezetből. In Proceedings of Computational Linguistics in Hungary Conference (Magyar Számítógépes Nyelvészeti Konferencia, MSZNY-2005), Szeged, Hungary, 2005, pp. 88-95. [pdf]
  • Van den Bosch, A. and Lendvai, P. (2005). Robust ASR lattice representation types in pragma-semantic processing of spoken input. In Proceedings of the AAAI Spoken Language Understanding Workshop, SLU-2005, July 9, 2005, Pittsburgh, PA, pp. 15-22. [pdf]
  • Van den Bosch, A. (2005). Memory-based understanding of user utterances in a spoken dialogue system: Effects of feature selection and co-learning. In Workshop Proceedings of the 6th International Conference on Case-Based Reasoning, Chicago, IL, pp. 85-94. [pdf]
  • Van den Bosch, A., and Daelemans, W. (2005). Improving sequence segmentation learning by predicting trigrams. In Proceedings of the Ninth Conference on Natural Language Learning, CoNLL-2005, June 29-30, 2005, Ann Arbor, MI, pp. 80-87. [pdf]
  • Canisius, S., Van den Bosch, A., and Daelemans, W. (2005). Rule meta-learning for trigram-based sequence processing. In J. Cussens and C. Nedellec (Eds.), Proceedings of the Fourth Learning Language in Logic Workshop, pp. 3-10, Bonn, August 2005. [pdf]
  • Tjong Kim Sang, E., Canisius, S., Van den Bosch, A., and Bogers, T. (2005). Applying spelling error correction techniques for improving semantic role labelling. In Proceedings of the Ninth Conference on Natural Language Learning, CoNLL-2005, June 29-30, 2005, Ann Arbor, MI. [pdf]
  • Lendvai, P., Van den Bosch, A., Krahmer, E., and Canisius, S. (2004). Memory-based Robust Interpretation of Recognised Speech. In: Proceedings of SPECOM '04, 9th International Conference "Speech and Computer", St. Petersburg, Russia, pp. 415-422. [pdf]
  • Canisius, S., and Van den Bosch, A. (2004). A memory-based shallow parser for spoken Dutch. In Decadt, B., De Pauw, G. and Hoste, V. (Eds.), Selected papers from the Thirteenth Computational Linguistics in the Netherlands Meeting, Antwerp, Belgium, pp. 31-45.

 

The idea central to ROLAQUAD was that a basic QA module could be developed rapidly by directly tagging both the questions and the background texts (medical encyclopedic texts) with semantic labels at the word and sentence level. This was realized in the IMIX demonstrator, a spoken medical QA dialogue system that also integrated speech recognition, dialogue management, natural language generation, and speech synthesis, as well as two other QA modules, Joost (University of Groningen) and Factmine (University of Amsterdam).

ROLAQUAD was part of the NWO IMIX (Interactive Multimodal Information Extraction) programme, and of the ILK Research Group of the Faculty of Humanities (until January 2007: Faculty of Arts) of Tilburg University.

ROLAQUAD's industrial partner was Textkernel B.V. Textkernel contributed expertise and software for information extraction, text classification, and server-based annotation.

Team members:

ROLAQUAD would like to thank Emiel Krahmer, Erwin Marsi, Erik Tjong Kim Sang, and all other IMIX project partners; Bertjan Busser, Toine Bogers, Jeroen Geertzen, and all other ILK members; Martijn Spitters, Remko Bonnema, Eduard Hovy, and Caroline Sporleder for their help, contributions, and suggestions along the way. Many thanks also to student assistants Ralph Claassens, Eva Creyghton, and Corina Koolen who annotated the medical encyclopedic texts.

Older related publications by members of the group:

  • Lendvai, P. (2004). Extracting Information from Spoken User Input. A Machine Learning Approach. Ph.D. thesis, Tilburg University, 2004.
  • Lendvai, P. (2003). Learning to Identify Fragmented Words in Spoken Discourse. In: Proceedings of EACL-03 Student Research Workshop. Budapest, 2003. pages 25-32. [pdf, slides]
  • Lendvai, P., Van den Bosch, A., and Krahmer, E. (2003). Memory-based disfluency chunking. In R. Eklund (Ed.), Proceedings of DISS'03, Disfluency in Spontaneous Speech Workshop, Göteborg University, Sweden, 2003. pages 63-66. [pdf]
  • Lendvai, P., Van den Bosch, A., and Krahmer, E. (2003). Machine Learning for Shallow Interpretation of User Utterances in Spoken Dialogue Systems. In Proceedings of EACL-03 Workshop on Dialogue Systems: interaction, adaptation and styles of management. Budapest, 2003, pp. 69-78.
    [pdf] (note - this version corrects the published paper!)
  • Lendvai, P., and Maruster, L. (2003). Process discovery for evaluating dialogue strategies. In: Proc. of ISCA Workshop on Error Handling in Spoken Dialogue Systems. Chateau d'Oex-Vaud, Switzerland, 2003. pages 119-122. [pdf]
  • Lendvai, P., Van den Bosch, A., Krahmer, E., and Swerts, M. (2002). Improving machine-learned detection of miscommunications in human-machine dialogues through informed data splitting. In: Proceedings of the ESSLLI 2002 Workshop on Machine Learning Approaches in Computational Linguistics, Trento, Italy, August 2002. [postscript]
  • Lendvai, P., Van den Bosch, A., Krahmer, E., and Swerts, M. (2002). Multi-feature error detection in spoken dialogue systems. In: Proceedings of the 12th Computational Linguistics in The Netherlands meeting, Twente, Netherlands, November 2001. [postscript]
  • Van den Bosch, A., Krahmer, E., and Swerts, M. (2001). Detecting problematic turns in human-machine interactions: Rule-induction versus memory-based learning approaches. In Proceedings of the 39th Meeting of the Association for Computational Linguistics (ACL'01). New Brunswick, NJ: ACL, pp. 499-506. [postscript]


© 2004-2007 Tilburg University, Antal.vdnBosch@uvt.nl