Researchers from Stanford University have recently introduced a new computational framework called Deep Evolutionary Reinforcement Learning (DERL). It allows AI agents to evolve morphologies and learn challenging locomotion and manipulation tasks in complex environments using only low-level egocentric sensory information.
In evolutionary biology, the Baldwin effect holds that behaviors originally learned over a lifetime in early generations of an evolutionary process gradually become instinctual and may even be genetically transmitted to succeeding generations.
Studies of learning and evolution in complex environments have found that many aspects of animal intelligence are deeply embodied in evolved morphologies. However, it has been challenging to demonstrate the Baldwin effect in morphological evolution, whether in living organisms or in computer simulations.
DERL is the first demonstration of a Darwinian Baldwin effect via morphological learning. Fei-Fei Li, one of the co-authors of the paper ‘Embodied Intelligence via Learning and Evolution,’ describes it as an essential trick of nature for animal evolution, now shown in AI agents. The researchers identify two significant challenges in creating embodied AI agents: the combinatorially large number of possible morphologies and the computational time required to evaluate fitness through lifetime learning.
Earlier studies focused on evolving agents in limited morphological search spaces or on finding optimal parameters for a fixed, hand-designed morphology. DERL, by contrast, makes it possible to scale the creation of embodied agents simultaneously across three types of complexity: environmental, morphological, and control. To overcome the limited expressiveness of previous morphological search spaces, the team developed UNIMAL (UNIversal aniMAL), a design space that enables highly expressive, useful, and controllable morphologies. The researchers then analyze the resulting embodied agents in three environments: hills, steps, and rubble.
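The interplay between an outer evolutionary search over morphologies and an inner loop of lifetime learning can be illustrated with a toy sketch. This is not the authors' implementation: the morphology is a hypothetical list of floats, `lifetime_learning` is a made-up stand-in for reinforcement learning, and the tournament selection with removal of the oldest individual is an assumption chosen for illustration.

```python
import random


def lifetime_learning(morphology, steps=50, lr=0.1):
    """Toy inner loop: a scalar 'skill' improves over the agent's lifetime.

    The morphology (hypothetical: a list of floats) only sets how quickly
    skill can grow. Returns the skill value after every learning step.
    """
    learnability = 1.0 / (1.0 + sum((g - 0.5) ** 2 for g in morphology))
    skill, history = 0.0, []
    for _ in range(steps):
        skill += lr * learnability * (1.0 - skill)
        history.append(skill)
    return history


def fitness(morphology):
    """Fitness is the skill reached at the end of lifetime learning."""
    return lifetime_learning(morphology)[-1]


def mutate(morphology, rng, scale=0.1):
    """Copy the parent and perturb one randomly chosen gene."""
    child = list(morphology)
    i = rng.randrange(len(child))
    child[i] += rng.gauss(0.0, scale)
    return child


def evolve(pop_size=16, genes=4, tournaments=200, k=4, seed=0):
    """Outer loop: tournament selection with a fixed-size population."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 2.0) for _ in range(genes)]
                  for _ in range(pop_size)]
    scored = [(fitness(m), m) for m in population]
    for _ in range(tournaments):
        contenders = rng.sample(scored, k)
        _, parent = max(contenders, key=lambda fm: fm[0])
        child = mutate(parent, rng)
        scored.append((fitness(child), child))  # evaluate via learning
        scored.pop(0)                           # drop the oldest individual
    return scored
```

Calling `evolve()` returns the final population as `(fitness, morphology)` pairs. Even in this toy, every fitness evaluation requires a full run of lifetime learning, which is the cost the researchers describe as a central challenge.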
DERL made large-scale simulations possible, which yielded insights into how learning, evolution, and environmental complexity interact to generate intelligent morphologies. First, the researchers find that environmental complexity fosters the evolution of morphological intelligence, quantified by the ability of a morphology to facilitate the learning of novel tasks. Second, they find that evolution rapidly selects morphologies that learn faster, which enables behaviors learned late in the lifetime of early ancestors to be expressed early in the lifetime of their descendants. This result demonstrates a long-conjectured morphological Baldwin effect in agents that learn and evolve in complex environments.
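One simple way to make "learned late in ancestors, expressed early in descendants" concrete is to measure how many learning steps each generation of a lineage needs to reach a fixed performance level. The sketch below is an illustrative assumption rather than the metric used in the paper; the function names, the threshold, and the learning curves are all hypothetical.

```python
def steps_to_threshold(learning_curve, threshold):
    """Return the first 1-based step at which reward reaches the threshold.

    Returns None if the curve never reaches it.
    """
    for step, reward in enumerate(learning_curve, start=1):
        if reward >= threshold:
            return step
    return None


def shows_baldwin_trend(lineage_curves, threshold):
    """Check whether descendants reach the threshold earlier than ancestors.

    lineage_curves is ordered from earliest ancestor to latest descendant.
    The trend holds if the time to threshold never increases along the
    lineage and the last descendant is strictly faster than the first
    ancestor.
    """
    times = [steps_to_threshold(c, threshold) for c in lineage_curves]
    if any(t is None for t in times):
        return False
    non_increasing = all(b <= a for a, b in zip(times, times[1:]))
    return non_increasing and times[-1] < times[0]


# Hypothetical reward curves for three generations of one lineage.
lineage = [
    [0.1, 0.2, 0.4, 0.6, 0.8],  # earliest ancestor
    [0.2, 0.5, 0.8, 0.9, 0.9],  # intermediate descendant
    [0.5, 0.8, 0.9, 0.9, 0.9],  # latest descendant
]
trend = shows_baldwin_trend(lineage, threshold=0.8)
```

With these made-up curves, the time to reach the threshold shrinks from generation to generation, which is the pattern a morphological Baldwin effect would predict.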
The study suggests a mechanistic basis for both the Baldwin effect and the emergence of morphological intelligence: evolution favors morphologies that are more physically stable and energy-efficient, and such morphologies in turn facilitate efficient learning and control.
The Baldwinian transfer of intelligence from phenotype to genotype has been conjectured to free up phenotypic learning resources for more complex behaviors in animals, including the emergence of language and imitation in humans. This suggests that large-scale simulations of learning and evolution can speed up reinforcement learning through the emergence of morphological intelligence. Likewise, the researchers believe that large-scale explorations of learning and evolution in other contexts may yield rapidly learnable intelligent behaviors in RL agents, along with new engineering advances to instantiate them in machines.