Computational Models of Human Language Acquisition
My primary research interest is the development of computational models of human language acquisition. Computational modeling is an effective tool for studying human cognition: whereas linguistic and psychological theories often give a high-level explanation of experimental data, computational models provide a detailed account of the underlying mechanisms for the cognitive task at hand. Moreover, the behaviour of a model can be directly compared to that of humans through computational simulation. One area of cognitive science that has benefited extensively from computational modeling is the study of natural language acquisition and use. However, developing computational algorithms that capture the complex structure of natural languages is an open problem. In particular, learning the abstract properties of language from usage data alone, without built-in knowledge of language structure, remains a challenge. I am particularly interested in applying appropriate machine learning techniques to the acquisition of general knowledge of language from usage data.
In my dissertation, I proposed a Bayesian, usage-based framework for modeling various aspects of early verb learning. The general constructions of a language (such as the transitive and intransitive) are viewed as probability distributions over syntactic and semantic features, e.g., the semantic properties of the verb and its arguments, and their relative word order in an utterance. Constructions are learned by clustering similar verb usages. Language use, in turn, is modeled as a Bayesian prediction problem, in which the missing features of a usage are predicted from its observed features and the acquired constructions (e.g., in sentence production, the best syntactic pattern for an utterance is predicted from the available semantic information).
The model can successfully learn the common constructions of language, and its behaviour shows similarities to actual child data, both in sentence production and comprehension.
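The prediction step described above can be sketched in a few lines of code. The sketch below is a minimal illustration, not the dissertation model: the construction names, feature names ("num_args", "pattern", "verb_sem"), and all probabilities are hypothetical, and each construction is reduced to a prior weight plus independent categorical distributions per feature.

```python
from collections import defaultdict

# Each construction summarises a cluster of verb usages as a prior weight
# and a categorical distribution per feature (all values illustrative).
constructions = {
    "transitive": {
        "prior": 0.6,
        "features": {
            "num_args": {"2": 0.9, "1": 0.1},
            "pattern":  {"arg1 verb arg2": 0.85, "arg1 verb": 0.15},
            "verb_sem": {"cause-motion": 0.7, "motion": 0.3},
        },
    },
    "intransitive": {
        "prior": 0.4,
        "features": {
            "num_args": {"1": 0.9, "2": 0.1},
            "pattern":  {"arg1 verb": 0.9, "arg1 verb arg2": 0.1},
            "verb_sem": {"motion": 0.8, "cause-motion": 0.2},
        },
    },
}

def predict(observed, target):
    """Predict the value of the missing `target` feature, summing over
    constructions: score(v) = sum_k P(k) * P(observed|k) * P(v|k)."""
    scores = defaultdict(float)
    for c in constructions.values():
        weight = c["prior"]
        for feat, val in observed.items():
            weight *= c["features"][feat].get(val, 1e-6)
        for val, p in c["features"][target].items():
            scores[val] += weight * p
    return max(scores, key=scores.get)

# Sentence production: predict a syntactic pattern from semantic features.
print(predict({"verb_sem": "cause-motion", "num_args": "2"}, "pattern"))
# → arg1 verb arg2
```

The same function runs "in reverse" for comprehension: observing the pattern and predicting the missing semantic features, which is what makes a single set of learned constructions usable for both production and comprehension.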
The probabilistic approach of our model has proved promising for capturing different aspects of child language acquisition: abstract patterns of language emerge from instances of input data without reliance on extensive innate knowledge; the acquired knowledge of language is robust yet flexible; and many general patterns of behaviour observed in children can be simulated and explained within our model. These properties establish probabilistic modeling as an appropriate framework for investigating a range of phenomena in the domain of language, and suggest many new research directions, a number of which I outline in the following sections.
Word Learning and Concept Formation
Learning the meanings of words is one of the first steps in learning a language. However, establishing the mapping between words and their correct meanings is a non-trivial task, and the actual mechanisms children use have been the subject of much debate. Many factors add complexity to this process, including noise, ambiguity, and referential uncertainty, i.e., the fact that the child may perceive many aspects of a scene that are unrelated to the accompanying utterance. Existing computational models of word learning are mostly based on the idea of cross-situational learning: detecting the meaning elements that are common across all the situational contexts in which a word is used. However, it has been argued that the meanings of many words, especially verbs, cannot be learned without relying on syntactic cues. At the same time, learning the syntax of a language itself relies on knowing the meanings of (at least some) verbs.
I am interested in examining different aspects of word learning in a probabilistic framework. A computational model similar to the one proposed in my dissertation can capture the mutual influence of word and syntax learning: the basic constructions of a language can be acquired from a limited number of simple verbs whose meanings can be inferred from unambiguous contexts; the acquired constructions, in turn, can guide learning the meanings of ambiguous words. Such a model can also be used to study the role of language in the development of conceptual structure, as suggested by recent experimental findings. For example, speakers of different languages appear to have slightly different representations of spatial relations. Such effects can be studied in the proposed probabilistic framework for word learning, in which natural language sentences serve as a cue for grouping the related meaning elements that form a concept.
Unified Models of Language Acquisition and Processing
Human language processing is a well-studied problem. Many computational models have been proposed that simulate natural language processing and explain the consistent patterns observed in human experimental data. However, few models have attempted to integrate language acquisition and processing into a unified framework. Instead, acquisition and processing have mostly been studied as isolated problems, a setting that is highly unrealistic.
The Bayesian model proposed in my dissertation is only a first step toward a unified model of language acquisition and use. The general constructions that the model learns over time have proved highly useful in a variety of language tasks, including limited sentence comprehension. However, a comprehensive model of sentence processing that uses the acquired knowledge of constructions as well as verb-based knowledge is yet to be developed. Our novel view of constructions as probability distributions over both syntactic and semantic features makes it possible to draw on lexical and syntactic valency, as well as semantic plausibility, in guiding sentence comprehension and handling ambiguity. Moreover, the probabilistic nature of constructions can account for the statistical effects observed in natural language processing.
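One way such a model might handle ambiguity can be sketched as scoring competing interpretations under the learned constructions, so that syntactic fit and semantic plausibility are weighed jointly. The patterns, semantic labels, and probabilities below are illustrative stand-ins, not the dissertation model's actual representations.

```python
# Each construction pairs a prior with distributions over syntactic
# patterns and coarse semantic analyses (all values hypothetical).
constructions = [
    {"prior": 0.6,  # transitive-like
     "p_pattern":   {"NP V NP": 0.9, "NP V": 0.1},
     "p_semantics": {"agent-acts-on-patient": 0.8, "agent-acts": 0.2}},
    {"prior": 0.4,  # intransitive-like
     "p_pattern":   {"NP V": 0.9, "NP V NP": 0.1},
     "p_semantics": {"agent-acts": 0.9, "agent-acts-on-patient": 0.1}},
]

def score(reading, constructions):
    """Score an interpretation by summing, over constructions, the
    product of prior, syntactic fit, and semantic plausibility."""
    return sum(
        c["prior"]
        * c["p_pattern"].get(reading["pattern"], 1e-6)
        * c["p_semantics"].get(reading["semantics"], 1e-6)
        for c in constructions
    )

# Two competing readings of the same surface pattern; the jointly
# syntactic-semantic score resolves the ambiguity.
readings = [
    {"pattern": "NP V NP", "semantics": "agent-acts-on-patient"},
    {"pattern": "NP V NP", "semantics": "agent-acts"},
]
best = max(readings, key=lambda r: score(r, constructions))
print(best["semantics"])  # → agent-acts-on-patient
```

Because the same scores are graded probabilities rather than categorical decisions, frequency and plausibility effects from the processing literature fall out of the ranking rather than requiring a separate mechanism.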