Program

Room YZ4, Building Y

9.30 - 10.30 William Cohen (Center for Automated Learning and Discovery, School of Computer Science, Carnegie Mellon University, Pittsburgh PA)

Sequential learning methods for partitioning problems

10.30 - 10.50 coffee break
10.50 - 11.40 Hendrik Blockeel (Department of Computer Science, K.U. Leuven, Belgium)

Experiment databases: A novel methodology for experimental research

11.40 - 12.30 Maarten van Someren (Department of Social Sciences Informatics, University of Amsterdam, Netherlands)

Bias-variance analysis: What is it and why is it useful?

12.30 - 13.30 walking lunch
14.00 - 15.00 Ph.D. thesis defense of Iris Hendrickx

Local classification and global estimation: Explorations of the k-nearest neighbor algorithm

Location: Aula, Building A

Registration

Attendance is free of charge. Please register with Piroska Lendvai (p.lendvai@uvt.nl). Inquiries: Piroska Lendvai, tel. +31 (0)13 466 8260.

Travel

For travel directions to Tilburg University, see this page. The mini-symposium is held in room YZ4, on the ground floor of building Y, except for the Ph.D. thesis defense, which takes place in the Aula, in building A. See the campus map.

Invitation

The ILK Research Group and the Language and Information Sciences Department of the Faculty of Arts of Tilburg University kindly invite you to participate in a Machine Learning Mini-Symposium, organised on the occasion of the Ph.D. thesis defense of Iris Hendrickx, whose thesis is entitled "Local classification and global estimation: Explorations of the k-nearest neighbor algorithm".

Speakers are William Cohen (CMU), Hendrik Blockeel (KUL), and Maarten van Someren (UvA).

Abstracts

William Cohen
Sequential learning methods for partitioning problems

One interesting special case of statistical relational learning is sequential learning, in which the goal is to learn a sequentially correlated set of decisions. Sequential learning has been used on a diverse set of tasks, including gene finding, noun-phrase chunking, named entity recognition, and document analysis. I will review two well-studied approaches to sequential learning, conditional random fields (CRFs) and maximum-entropy Markov models (MEMMs), and then describe a new sequential learning scheme called "stacked sequential learning".

Stacked sequential learning is a meta-learning algorithm in which an arbitrary base learner is augmented so as to make it aware of the labels of nearby examples. I will present experimental results on several "sequential partitioning problems", which are characterized by long runs of identical labels, and show that on such problems MEMMs are unstable, while CRFs and sequential stacking are not. I will also show that on these problems sequential stacking usually improves the performance of non-sequential base learners; that it often improves the performance of CRFs; and that a sequentially stacked maximum-entropy learner often outperforms CRFs.
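
To make the two-stage idea concrete, here is a minimal Python sketch of sequential stacking. It is an illustration rather than Cohen's implementation: it assumes a scikit-learn-style base learner, a single sequence with a 2-D feature array X, and numeric labels; the names _extend, fit_stacked, and predict_stacked are invented for this example.

    import numpy as np
    from sklearn.base import clone
    from sklearn.model_selection import cross_val_predict

    def _extend(X, y_hat, window):
        # Append, for each example, the (predicted) labels of the `window`
        # examples on either side of it in the sequence; edges are clamped.
        n = len(y_hat)
        cols = []
        for offset in range(-window, window + 1):
            if offset == 0:
                continue
            idx = np.clip(np.arange(n) + offset, 0, n - 1)
            cols.append(y_hat[idx].reshape(-1, 1))
        return np.hstack([X] + cols)

    def fit_stacked(base, X, y, window=1, cv=5):
        # Stage 1: out-of-fold predictions, so stage 2 is trained on
        # realistic rather than overfit neighbour labels.
        y_hat = cross_val_predict(clone(base), X, y, cv=cv)
        stage1 = clone(base).fit(X, y)
        # Stage 2: the same learner, on features extended with the
        # predicted labels of nearby examples.
        stage2 = clone(base).fit(_extend(X, y_hat, window), y)
        return stage1, stage2

    def predict_stacked(stage1, stage2, X, window=1):
        return stage2.predict(_extend(X, stage1.predict(X), window))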

Hendrik Blockeel
Experiment databases: A novel methodology for experimental research

Data mining and machine learning are to some extent experimental sciences: much insight into the behaviour of algorithms is obtained by implementing them and studying how they behave when run on datasets. Performing experiments is a non-trivial task in this area: the performance of an algorithm on a dataset can be characterized in many different ways and is influenced by many parameters of both the algorithm and the dataset. As a result, experiments need to be set up with care, and results need to be interpreted with caution.

In this talk we will discuss the concept of "experiment databases" as the basis of a new and improved experimental methodology for machine learning and data mining. The basic idea behind experiment databases is that, instead of setting up experiments to answer specific research questions, large sets of experiments are performed automatically and stored in a database, and specific research questions are then answered by querying that database.
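
As a toy illustration of the idea, an experiment database can be as simple as one table of runs; the schema, rows, and query below are invented for this sketch (using SQLite from Python), not taken from the talk.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE experiments (
        algorithm   TEXT,   -- learner, e.g. 'k-NN'
        param_name  TEXT,   -- one varied parameter per row, for simplicity
        param_value REAL,
        dataset     TEXT,
        accuracy    REAL
    );
    """)

    # In a real experiment database these rows would be generated
    # automatically by large batches of runs; the values here are
    # placeholders invented for the illustration.
    conn.executemany(
        "INSERT INTO experiments VALUES (?, ?, ?, ?, ?)",
        [
            ("k-NN", "k", 1, "letter", 0.956),
            ("k-NN", "k", 5, "letter", 0.953),
            ("C4.5", "min_leaf", 2, "letter", 0.880),
        ],
    )

    # A research question becomes a query: how does k affect k-NN accuracy?
    for k, acc in conn.execute(
        "SELECT param_value, AVG(accuracy) FROM experiments"
        " WHERE algorithm = 'k-NN' AND param_name = 'k'"
        " GROUP BY param_value ORDER BY param_value"
    ):
        print(f"k = {k:g}: mean accuracy {acc:.3f}")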

The proposed methodology has numerous advantages over the classical one. However, in order to exploit these optimally, several research challenges need to be addressed. We will discuss these challenges as well as the potential impact of the proposed methodology.

Maarten van Someren
Bias-variance analysis: What is it and why is it useful?

Bias-variance analysis decomposes prediction errors into components that have different causes. This talk will summarise the concept, show why it is important using an analysis of different solutions to a real-world learning problem, and conclude with guidelines on how to use bias-variance analysis in data mining and machine learning.
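
For squared loss the decomposition reads: expected error = noise + bias^2 + variance. The bias and variance at a test point can be estimated by resampling many training sets from the same distribution, as in the Python sketch below; the sine target, noise level, and deliberately too-simple linear fit are invented for this illustration, not material from the talk.

    import numpy as np

    rng = np.random.default_rng(0)
    true_f = np.sin                  # the (here known) target function
    x0, noise_sd, runs = 1.5, 0.3, 500

    # Repeatedly draw a fresh training set, fit a model, and record its
    # prediction at the fixed test point x0.
    preds = np.empty(runs)
    for i in range(runs):
        x = rng.uniform(0.0, np.pi, 30)
        y = true_f(x) + rng.normal(0.0, noise_sd, 30)
        coeffs = np.polyfit(x, y, 1)     # degree-1 fit: high bias
        preds[i] = np.polyval(coeffs, x0)

    # Components of the expected squared error at x0 (noise aside):
    bias_sq = (preds.mean() - true_f(x0)) ** 2   # systematic error
    variance = preds.var()                       # sensitivity to the sample
    print(f"bias^2 = {bias_sq:.4f}  variance = {variance:.4f}")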


© 2005 Tilburg University, Antal.vdnBosch@uvt.nl | Last update: Wed Nov 2 2005