One of the most difficult problems in artificial intelligence, and in cognitive science in general, is the so-called symbol grounding problem. This problem is concerned with the question ``how do seemingly meaningless symbols acquire a meaning in relation to the real world?'' Every robot that reasons about its environment or uses language has to deal with the symbol grounding problem.
Finding a consistent symbolic representation has proven to be very difficult. In early robot applications the meaning of a symbol, for instance the colour `red', was assigned by the programmer. Such a meaning was given by a rule stating that, e.g., if the robot observes a particular light frequency, this observation means `red'. But detecting the colour red under different lighting conditions does not yield a single frequency. Humans, nevertheless, are well capable of categorising red; so far, robots are not. It is impractical, if not impossible, to program the grounded meaning of a symbol such that a robot can deal with this meaning in all possible real-world situations. And even if this could be done, such an implementation would soon be out of date: many meanings are continuously subject to change and often depend on the experience of the observer. Hence it is more interesting to design a robot that can itself construct meaningful symbolic representations of its observations. Such a robot is developed in this thesis.
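The brittleness of hand-coded meanings can be illustrated with a small sketch. The threshold rule and RGB values below are illustrative assumptions, not taken from the thesis:

```python
def is_red_fixed_rule(rgb):
    """Hand-coded rule: 'red' iff the red channel dominates above a fixed
    threshold. (Illustrative assumption, not the thesis's robot code.)"""
    r, g, b = rgb
    return r > 200 and g < 100 and b < 100

# The same red object observed under different illumination:
red_daylight = (230, 60, 50)
red_dim = (120, 30, 25)  # dim light scales all channels down

print(is_red_fixed_rule(red_daylight))  # recognised as red
print(is_red_fixed_rule(red_dim))       # missed: the fixed rule is brittle
```

The rule works only in the lighting conditions the programmer anticipated; under dim light the very same object falls outside the hard-coded thresholds.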
In the introduction of this PhD thesis, the symbol grounding problem is introduced and a theoretical framework is presented with which this problem may be solved. Semiotic theory is used as a starting point, and the design of the implementation is inspired by the behaviour-oriented approach to AI. Three research questions are formulated at the end of this chapter, which are answered in the rest of the thesis: (1) Can the symbol grounding problem be solved within the given experimental set-up? And if so, how is this accomplished? (2) What are the important types of non-linguistic information that agents should share when developing a coherent communication system? Two types of non-linguistic information are investigated: the first concerns joint attention established prior to the linguistic communication; the second concerns the feedback the robots may get from the effect of their communication. And (3) what is the influence of the physical conditions and interaction of the robots on developing a grounded lexicon?
The research is done using two LEGO robots that were developed at the Artificial Intelligence Laboratory of the Free University of Brussels. The robots have a sensorimotor interface with which they can observe and act. They do so in an environment containing four light sources, about which the robots try to develop a shared lexicon. The robots are programmed in the Process Description Language (PDL), a programming language with which behaviour-oriented control can be implemented. The robots, their environment and the programming language are described in chapter 2.
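The flavour of behaviour-oriented control can be sketched as follows: parallel processes each propose additive changes to shared quantities (sensor and actuator variables), and all proposed changes are summed and applied once per control cycle. This is a rough Python sketch of the idea, not PDL itself; the process names and numbers are illustrative assumptions:

```python
def run_cycle(quantities, processes):
    """One control cycle: collect additive influences from all processes,
    then apply the summed deltas to the shared quantities."""
    deltas = {}
    for proc in processes:
        for q, d in proc(quantities).items():
            deltas[q] = deltas.get(q, 0.0) + d
    for q, d in deltas.items():
        quantities[q] = quantities.get(q, 0.0) + d
    return quantities

# Two concurrent behaviours influencing the same actuator quantities:
def forward_drive(q):
    # Constant forward impulse on both motors.
    return {"left_motor": 0.2, "right_motor": 0.2}

def phototaxis(q):
    # Steer toward the brighter side by differential motor influence.
    diff = q.get("light_right", 0.0) - q.get("light_left", 0.0)
    return {"left_motor": 0.1 * diff, "right_motor": -0.1 * diff}

state = {"light_left": 1.0, "light_right": 3.0,
         "left_motor": 0.0, "right_motor": 0.0}
run_cycle(state, [forward_drive, phototaxis])
```

Because influences are summed rather than selected, no single behaviour owns an actuator; emergent steering arises from their superposition, which is the central idea of the behaviour-oriented approach.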
The symbol grounding problem is solved by means of language games. At the beginning of each experiment the robots have no representations of meaning, nor do they have word-meaning associations in their lexicons. In a language game two robots, a speaker and a hearer, come together and observe their surroundings. This observation is segmented such that the robots find sensings that relate to the light sources. Next, the speaker selects one segment as the topic of the language game and tries to find one or more categories relating to this segment. If it fails, the speaker expands its memory of categories so that it might succeed in the future. The hearer does the same for those segments it considers a possible topic; which segments these are depends on the type of language game being played. Four different language games are investigated in this thesis. If both robots have thus acquired a categorisation (or meaning), the speaker searches its lexicon for a word-meaning association that matches the meaning. The word-form found is conveyed to the hearer. In turn, the hearer looks in its lexicon for word-meaning associations that match this word-form. Depending on the matching meaning, the hearer selects its topic. The language game is successful when both robots have thus identified the same topic. It is argued that the symbol grounding problem is solved in a particular situation when the language game is successful. If the language game fails, the lexicon is expanded so that the robots may be successful in the future. Furthermore, word-meaning associations are strengthened or weakened depending on their effectiveness in the game. In this way the lexicon is constructed and organised such that the robots can communicate with each other effectively. The model of the language games is explained in chapter 3.
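The flow of a language game can be sketched as a toy simulation. Here segmentation and categorisation are abstracted into four ready-made meanings (the light sources), joint attention on the topic is simply assumed, and the class names, scores and update rule are illustrative assumptions rather than the thesis's model:

```python
import random

class Agent:
    """Toy agent holding a lexicon of scored word-meaning associations."""

    def __init__(self):
        self.lexicon = {}  # (word, meaning) -> association score in [0, 1]

    def speak(self, meaning):
        """Return the best-scoring word for a meaning, inventing one if needed."""
        words = [(w, s) for (w, m), s in self.lexicon.items() if m == meaning]
        if not words:
            word = "w%04d" % random.randrange(10000)
            self.lexicon[(word, meaning)] = 0.5
            return word
        return max(words, key=lambda ws: ws[1])[0]

    def interpret(self, word):
        """Return the best-scoring meaning for a word, or None if unknown."""
        meanings = [(m, s) for (w, m), s in self.lexicon.items() if w == word]
        return max(meanings, key=lambda ms: ms[1])[0] if meanings else None

    def adjust(self, word, meaning, delta):
        """Strengthen or weaken an association, clamping the score to [0, 1]."""
        score = self.lexicon.get((word, meaning), 0.5)
        self.lexicon[(word, meaning)] = min(1.0, max(0.0, score + delta))

def language_game(speaker, hearer, meanings):
    topic = random.choice(meanings)    # joint attention on the topic is assumed
    word = speaker.speak(topic)
    guess = hearer.interpret(word)
    if guess == topic:                 # success: strengthen both associations
        speaker.adjust(word, topic, +0.1)
        hearer.adjust(word, topic, +0.1)
        return True
    speaker.adjust(word, topic, -0.1)  # failure: weaken the speaker's choice...
    hearer.lexicon.setdefault((word, topic), 0.5)  # ...and let the hearer adopt it
    return False
```

Repeated games drive the two lexicons toward a shared, effective vocabulary; the earliest games necessarily fail because the hearer's lexicon is still empty, and it is exactly the expansion and score adjustment after each game that makes later games succeed.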
In chapter 4 the first experimental results are presented. Although the robots succeed in solving the symbol grounding problem to some extent, a few problems were observed. To investigate these problems, a few methods and parameters of the experiment from chapter 4 are varied to see what their impact is. In addition, experiments are done to compare all four language games. The results of these experiments are presented in chapter 5. Improvements observed in chapter 5 are combined in three experiments that give the best results. These three experiments involve two different language games in which the successful combinations of joint attention and feedback are investigated; they are presented in chapter 6. Each set of experiments in these three chapters is followed by a brief discussion.
Finally, chapter 7 contains an extensive discussion of the results, and conclusions are drawn. The most important conclusion is that the symbol grounding problem is solved in the given experimental set-up, although some assumptions are made to overcome a few technical problems. The most important assumption is that the robots are technically capable of establishing joint attention on a referent without using linguistic information. The establishment of joint attention, used both for prior topic information and for feedback, is indispensable for the success of the experiments. An interesting finding is that even though a referent cannot be categorised uniquely and a word-form may have several meanings, these word-forms mostly refer to a single referent. The results further showed that the physical conditions of the experiments, as expected, do influence the success. The end of chapter 7 discusses a few possible future experiments.