From February to May 2010 I stayed in Paris at the Sony Computer Science Laboratory (Prof. Luc Steels) and worked there with Michael Spranger on the emergence of a vocabulary for describing postures. In simulated language games a population of agents comes up with a conceptualisation of the posture space – we will present our work in September at the CSDL/ESLP conference in San Diego:
Posture verbs (e.g., sit, stand, lie) describe a human agent's bodily posture, often while indicating position or location with respect to a landmark ("He sits on a chair"). In many languages, however, the use of posture verbs is extended to the location of objects and even to metaphorical meanings. While posture verbs obviously carry an embodied meaning, i.e., they can be grounded in bodily experiences, it remains completely open how this grounded meaning can be utilised in communication to describe the location of an object. Besides motor features, posture verbs also denote visual properties of objects, such as orientation (e.g., "a book lies on the table" or "a book stands on the shelf").
We are interested in how such a flexible conceptualisation can emerge, one that is not only grounded in direct bodily experiences but also includes visual properties. This seems crucial for extending posture verbs from describing a person's posture to describing an object's location and orientation. We approach this question by analysing the emergence of an artificial language in a population of robots. In our experiments, two agents from a population of robots play language action games to self-organise a shared vocabulary on the meaning of utterances about bodily postures. The agents take turns in requesting another agent to perform a posture and afterwards judging whether the performed posture was the intended one.
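The core dynamics of such a game can be sketched as a minimal naming-game loop with lateral-inhibition scoring. Everything below is a hypothetical illustration: the posture categories, the score-update rule, and the repair step (the speaker demonstrating the intended posture after a failed game) are placeholder assumptions, not the actual implementation:

```python
import random

random.seed(0)  # fixed seed for reproducibility

POSTURES = ["sit", "stand", "lie"]  # placeholder posture categories

class Agent:
    def __init__(self):
        self.lexicon = {}  # word -> {posture: association score}

    def word_for(self, posture):
        # speaker: best-scoring word for the posture, inventing one if needed
        candidates = [(w, m[posture]) for w, m in self.lexicon.items()
                      if posture in m]
        if not candidates:
            word = "w%04d" % random.randrange(10000)
            self.lexicon[word] = {posture: 0.5}
            return word
        return max(candidates, key=lambda c: c[1])[0]

    def interpret(self, word):
        # hearer: best posture for the word, or a random guess if unknown
        meanings = self.lexicon.get(word)
        if not meanings:
            return random.choice(POSTURES)
        return max(meanings, key=meanings.get)

    def update(self, word, posture, success, delta=0.1):
        # lateral inhibition: reinforce on success, punish on failure,
        # and inhibit competing meanings of a successful word
        m = self.lexicon.setdefault(word, {})
        score = m.get(posture, 0.5)
        m[posture] = min(1.0, score + delta) if success else max(0.0, score - delta)
        if success:
            for other in m:
                if other != posture:
                    m[other] = max(0.0, m[other] - delta)

def play_game(speaker, hearer):
    intended = random.choice(POSTURES)   # speaker picks a posture
    word = speaker.word_for(intended)    # ...and names it
    performed = hearer.interpret(word)   # hearer performs its interpretation
    success = performed == intended      # speaker judges the outcome
    speaker.update(word, intended, success)
    hearer.update(word, performed, success)
    if not success:
        # assumed repair: the speaker demonstrates the intended posture,
        # so the hearer can adopt the word for it
        hearer.lexicon.setdefault(word, {}).setdefault(intended, 0.5)
    return success

population = [Agent() for _ in range(5)]
for _ in range(2000):
    play_game(*random.sample(population, 2))
```

After a few hundred games per agent, competing word–posture associations are pruned away and communicative success approaches one; this score dynamic is what "self-organise a shared vocabulary" amounts to at the lexical level.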
While the task is demanding in itself due to its situated nature, the process of alignment becomes even more complicated in the action games. The perceptions of hearer and speaker are completely different: the speaker sees the hearer performing the posture, while the hearer has to control its own movements. In initial experiments the agents were able to come up, in a bottom-up fashion, with a mapping between given clusters of postures in a motor space and matching clusters in a perceptual space. But these conceptual systems were separate and given in advance. We therefore introduced a biologically inspired internal body model, implemented as a recurrent neural network. On the one hand, the body model is applied in the motor control of the acting robot, exploiting the network's inverse kinematics function. On the other hand, it is used to track the movements of the other robot during observation.
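The dual role of the body model, controlling one's own limb and predicting the consequences of observed joint movements, can be illustrated with a toy forward/inverse kinematics pair. This is not the recurrent network used in the experiments; it is a minimal analytic stand-in for a planar two-joint arm with assumed unit link lengths:

```python
import math

L1, L2 = 1.0, 1.0  # assumed link lengths of a planar two-joint arm

def forward(theta1, theta2):
    """Forward model: joint angles -> end-effector position (used when
    observing, to map estimated joint states onto visual features)."""
    x = L1 * math.cos(theta1) + L2 * math.cos(theta1 + theta2)
    y = L1 * math.sin(theta1) + L2 * math.sin(theta1 + theta2)
    return x, y

def inverse(x, y):
    """Inverse model: target position -> joint angles (used in motor
    control, to reach a requested posture); one of the two elbow solutions."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    theta2 = math.acos(max(-1.0, min(1.0, c2)))
    theta1 = math.atan2(y, x) - math.atan2(L2 * math.sin(theta2),
                                           L1 + L2 * math.cos(theta2))
    return theta1, theta2
```

A round trip through both models recovers the original pose for reachable configurations; this consistency between acting and observing is exactly what the recurrent body model provides within a single representation.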
The body model facilitates the mapping between the motor space and the visual feature space, as it links action and perception and therefore allows the process of conceptualisation to be influenced by the language games. The agents converge on a set of lexicalised postures.
These experiments are a first step, showing how bottom-up influences (embodied experiences, as in motor control) and top-down influences (successful communication) can interact in conceptualisation. Posture verbs can carry more meaning than the embodied experience as such. Here, we have shown how an enriched embodied representation that includes visual features can emerge and become aligned through communication.