Caroline Sporleder
This page is out of date. Please refer to my current home page at Saarland University: http://www.coli.uni-sb.de/~csporled/
I'm currently a postdoc in the MITCH (Mining for
Information in Texts from the Cultural
Heritage) project. MITCH is a joint project of the ILK research group at
the University of Tilburg and Naturalis, the Dutch National Museum of
Natural History, in Leiden.
The project is funded by NWO as part of the CATCH (Continuous Access to
Cultural Heritage) programme.
Before coming to Tilburg, I was at the University
of Edinburgh, where I obtained a PhD in Informatics. My thesis
was concerned with the automatic construction of lexical inheritance
hierarchies using machine learning techniques. After my PhD I stayed in
Edinburgh for a 2-year postdoc project on automatic
discourse processing.
My first degree was an MA in Linguistics/Computational Linguistics, English
and History from the University of
Bielefeld, where I was affiliated with the Computational
Linguistics and Spoken Language Group.
Publications:
Journal Papers
- Caroline Sporleder and Alex Lascarides. "Using Automatically
Labelled Examples to Classify Rhetorical Relations: An Assessment", to appear
in Natural Language Engineering.
[pdf] (preprint)
- Caroline Sporleder. "Manually vs. Automatically Labelled Data in Discourse Relation Classification. Effects of Example and Feature Selection", to appear in LDV Forum.
[pdf] (preprint)
- Caroline Sporleder and Mirella Lapata. "Broad Coverage
Paragraph Segmentation across Languages and Domains", ACM
Transactions on Speech and Language Processing, 3:2, 1-35, July 2006.
[pdf] (preprint)
Conference & Workshop Papers
- Sander Canisius and Caroline Sporleder. "Bootstrapping Information Extraction from Field Books", Proceedings of EMNLP-CoNLL 2007, Prague, Czech Republic, June 28-30, 2007.
[pdf]
- Iris Hendrickx, Roser Morante, Caroline Sporleder, and Antal van den Bosch. "Machine learning of semantic relations with shallow features and almost no data", Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval 2007), Prague, Czech Republic, June 23-24, 2007.
[pdf]
- Sander Canisius and Caroline Sporleder. "Learning to Segment and Label Semi-Structured Documents with Little or No Supervision", Proceedings of Benelearn 2007, Amsterdam, The Netherlands, May 14-15, 2007.
[pdf]
- Antal van den Bosch, Caroline Sporleder, Marieke van Erp, and Steve Hunt. "Automatic Techniques for Generating and Correcting Cultural Heritage Collection Metadata", Digital Humanities 2007, Urbana-Champaign, USA, June 4-8, 2007.
[html] (reviewed abstract)
- Caroline Sporleder, Marieke van Erp, Tijn Porcelijn and Antal van den Bosch. "Correcting 'Wrong-Column' Errors in Text Databases", Proceedings of the Annual Machine Learning Conference of Belgium and The Netherlands (Benelearn-06), pp. 49-56, Ghent, Belgium, 2006.
[pdf]
- Caroline Sporleder, Marieke van Erp, Tijn Porcelijn and Antal van den Bosch. "Spotting the 'Odd-one-out': Data-Driven Error Detection and Correction in Textual Databases", Proceedings of the EACL 2006 Workshop on Adaptive Text Extraction and Mining (ATEM-06), pp. 41-48, Trento, Italy, 2006.
[pdf]
- Caroline Sporleder, Marieke van Erp, Tijn Porcelijn, Antal van den Bosch and Pim Arntzen. "Identifying Named Entities in Text Databases from the Natural History Domain", Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC-06), pp. 1742-1745, Genoa, Italy, 2006.
[pdf]
- Caroline Sporleder and Mirella Lapata. "Discourse Chunking and its Application to Sentence Compression", Proceedings of the 2005 Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP-05), Vancouver, Canada, 2005.
[pdf]
- Caroline Sporleder and Alex Lascarides. "Exploiting Linguistic Cues
to Classify Rhetorical Relations", Proceedings of Recent
Advances in Natural Language Processing (RANLP-05), pp. 532-539, Borovets, Bulgaria, 2005. (RANLP-2005 Young Researcher Award)
[pdf]
- Caroline Sporleder and Mirella Lapata. "Automatic Paragraph Identification: A Study across Languages and Domains", Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP-04), pp. 72-79, Barcelona, Spain, 2004.
[pdf]
- Caroline Sporleder and Alex Lascarides. "Combining Hierarchical Clustering and Machine Learning to Predict High-Level Discourse Structure", Proceedings of the 20th International Conference on Computational Linguistics (COLING-04), pp. 43-49, Geneva, Switzerland, 2004.
[ps]
- Caroline Sporleder. "Combining Machine Learning and Set-Theory to Infer Inheritance Hierarchies", Beiträge zur 7. Konferenz zur Verarbeitung Natürlicher Sprache (KONVENS-04), pp. 193-200, Vienna, Austria, 2004.
[ps]
- Caroline Sporleder. "Learning Lexical Inheritance Hierarchies
with Maximum Entropy Models", Workshop on Machine Learning
Approaches in Computational Linguistics, ESSLLI 2002,
Trento, Italy, 5-16 August 2002.
[ps]
- Caroline Sporleder. "Machine Learning of Lexical Inheritance Hierarchies:
Linguistic Plausibility vs. Minimal Redundancy", Proceedings of
the Student Research Workshop at the 40th ACL,
Philadelphia, USA, 6-12 July 2002.
- Caroline Sporleder. "A Galois Lattice based Approach to Lexical
Inheritance Learning", ECAI 2002 Workshop on Machine Learning
and Natural Language Processing for Ontology Engineering (OLT2002),
Lyon, France, July 22-23, 2002.
- Caroline Sporleder. "Some Experiments on Lexical
Inheritance Hierarchy Learning", Proceedings of TaCoS 2002, Potsdam, Germany, 6-9 June 2002.
- Harald Lüngen and Caroline Sporleder. "Automatic Induction of Lexical Inheritance Hierarchies",
Multilingual Corpora: Codierung, Struktur, Analyse. 11. Jahrestagung der Gesellschaft für Linguistische Datenverarbeitung (GLDV-99), pp. 42-52, Frankfurt a.M., Germany, 1999.
[ps]
Technical Reports
- Caroline Sporleder, Marieke van Erp, Tijn Porcelijn, Antal van den Bosch, Pim Arntzen and Erik van Nieukerken. Cleaning and Enriching Research Data on Reptiles and Amphibians. The MITCH Pilot Project and "nulmeting". Technical Report, ILK 06-01, Tilburg University, 2006.
[pdf]
Theses
- Caroline Sporleder. Discovering
Lexical Generalisations. A Supervised Machine Learning Approach to
Inheritance Hierarchy Construction. PhD Thesis, School of
Informatics, University of Edinburgh, 2004.
[ps]
- Caroline Sporleder. Learning Lexical Generalisations. An
Operational Evaluation of Current Machine Learning Methods. MA
Thesis, Universität Bielefeld, 1999.
[ps]
- Caroline Sporleder. Interfacing Natural Language Generation
and Speech Synthesis: A Topic-Comment Mark-Up for ILEX 2.0. MSc
Thesis, University of Edinburgh, 1997.
Address:
ILK/Computational Linguistics
Room Y333
Faculty of Arts
Tilburg University
PO Box 90153
5000 LE Tilburg
The Netherlands
email: csporled AT uvt.nl
and
Naturalis
Room C04.32
Darwinweg 2
2333 CR Leiden
The Netherlands