


Recent accomplishments

We have demonstrated a new capability, autonomous skill development, on our SAIL robot in an imitation learning mode, in which a human teacher guides the robot's actions online.  For example, for the first time in the field, a human teacher can teach a "robot baby" to navigate along a corridor and to turn properly using its vision, simply by taking it for a walk and pushing it along the corridor, without in-lab pre-training.  This learning capability is made possible by a new kind of program, called a developmental program, which automatically derives its internal representation, including the features to be used at any given time, from the sensory signals that the robot receives in real time.  The fundamental difference between our SAIL developmental program and traditional programs is that it enables a robot to learn tasks that its programmer did not know about at the time of programming.  Consequently, the robot is much better able to learn new tasks in unknown, complex environments, and the human programming effort required for sophisticated intelligent robots is drastically reduced.

Autonomous skill development has also been tested successfully in a second learning mode, reinforcement learning, through computer simulation using real corridor navigation video sequences.  In this mode, the human teacher needs only to encourage or discourage the robot while it explores and practices on its own.  We modified the Q-learning algorithm and integrated it with our Hierarchical Discriminant Regression (HDR) method to handle the high-dimensional input, the large number of states, and the need for real-time response.  Furthermore, the imitation and reinforcement learning modes can be interleaved in any order and for any duration; the human teacher decides which to use according to the robot's performance at that moment.  With this capability, a human can let the robot explore autonomously in the real physical world while giving instructions from time to time, either through hand-in-hand teaching (imitation learning mode) or through encouragement and discouragement (reinforcement learning mode).
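The interleaving of the two learning modes can be illustrated with a minimal sketch. It is not the SAIL implementation: it uses a plain tabular Q-table in place of HDR, a hypothetical one-dimensional corridor of a few cells, and an assumed teacher who rewards forward motion (+1) and discourages anything else (-1). In imitation episodes the teacher's action is simply forced; in reinforcement episodes the robot chooses its own action. All names and parameter values are illustrative assumptions.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard one-step Q-learning update on a dict-of-lists table."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def train(n_cells=6, episodes=200, imitation_every=5, seed=0):
    """Interleave imitation and reinforcement episodes on a toy corridor.

    Assumptions (not from the original project): cells 0..n_cells-1,
    action 0 = step back, action 1 = step forward, goal = last cell.
    """
    rng = random.Random(seed)
    goal = n_cells - 1
    q = {s: [0.0, 0.0] for s in range(n_cells)}
    for episode in range(episodes):
        state = 0
        imitation = (episode % imitation_every == 0)
        for _ in range(4 * n_cells):
            if imitation:
                action = 1  # teacher pushes the robot toward the goal
            elif rng.random() < 0.2:
                action = rng.randrange(2)  # occasional exploration
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            step = 1 if action == 1 else -1
            next_state = min(max(state + step, 0), goal)
            # Teacher feedback: encourage progress, discourage everything else.
            reward = 1.0 if next_state > state else -1.0
            q_update(q, state, action, reward, next_state)
            state = next_state
            if state == goal:
                break
    return q
```

After training, the forward action has the higher value in every non-goal cell. The same update rule is used in both modes; only the source of the action differs, which is what lets the modes be mixed freely.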

At a longer time scale, the deliberative layer builds an environmental model and makes plans. One of the most difficult challenges here is to enable the robot to learn to act reliably in confusing or perceptually aliased situations (e.g., in an office environment, two corridors or intersections can look very much alike). We have investigated a new approach to acting in perceptually aliased environments by building multi-scale hierarchical spatial models. In these models, higher levels of the hierarchy represent more "abstract" concepts, such as corridors or intersections, whereas lower levels represent regions within a corridor. We have implemented a novel solution using the framework of Hierarchical Hidden Markov Models (HHMMs). In an experimental study of indoor robot navigation, we have shown faster learning by reusing submodels, a better fit of the model to the training data, better localization of the robot, and the ability to infer the topological structure of the environment.  We have also implemented a planning system using HHMM models, with which the robot can find its way to a destination location.
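To make perceptual aliasing concrete, here is a small sketch of Bayesian localization with a flat hidden Markov model. It is a deliberate simplification: it has a single level rather than the hierarchy of an HHMM, and the corridor layout, the observation labels, and the probabilities `p_move` and `p_hit` are all assumed for illustration.

```python
def normalize(belief):
    total = sum(belief)
    return [b / total for b in belief]


def predict(belief, p_move=0.9):
    """Motion step: the robot advances one cell with probability p_move,
    otherwise stays put. The last cell is a dead end."""
    n = len(belief)
    new = [0.0] * n
    for i, b in enumerate(belief):
        j = min(i + 1, n - 1)
        new[j] += p_move * b
        new[i] += (1.0 - p_move) * b
    return new


def correct(belief, observation, world, p_hit=0.9):
    """Sensing step: weight each cell by how well it explains the observation."""
    weighted = [
        b * (p_hit if cell == observation else 1.0 - p_hit)
        for b, cell in zip(belief, world)
    ]
    return normalize(weighted)


def localize(world, observations):
    """Return the belief after each observation, starting from a uniform prior.
    The robot is assumed to attempt one forward move between observations."""
    belief = [1.0 / len(world)] * len(world)
    history = []
    for k, obs in enumerate(observations):
        if k > 0:
            belief = predict(belief)
        belief = correct(belief, obs, world)
        history.append(belief)
    return history


# Hypothetical corridor: cells 0 and 2 look identical (both show a door).
WORLD = ["door", "wall", "door", "wall", "wall"]
beliefs = localize(WORLD, ["door", "wall", "door"])
```

After the first "door" observation the belief is split evenly between cells 0 and 2, which is exactly the aliasing problem. Only after the sequence door, wall, door does the belief concentrate on cell 2. A hierarchical model would, in addition, share submodels across similar regions, which this flat sketch does not attempt.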

We have investigated another novel approach to acting in perceptually aliased environments, based on remembering previous observations and actions.  We have extended Q-learning with a hierarchical short-term memory method that rapidly brings to bear past experience that is appropriate to the grain-size of the decisions being considered.  At higher levels in the hierarchy, the agent abstracts over lower-level details and looks back over a variable number of high-level decisions in time.  We formalized this idea in a framework called Hierarchical Short-Term Memory (HSM). We have shown that this framework outperforms several related reinforcement learning techniques on a challenging simulated corridor navigation task.

Research in the servo control layer focuses on the following subtopics: path tracking for non-holonomic mobile robots, control of mobile manipulators, interaction between humans and mobile manipulators, and formation control of multiple autonomous mobile robots. The goal of this research is to develop tracking controllers in a perceptive frame that can deal with uncertain obstacles while ensuring the stability of the controller.  Tracking controllers for simple paths such as straight lines and circles, as well as for more complex paths, have been developed and implemented on a Nomadic XR4000 mobile robot. Sensor-based obstacle avoidance has also been tested. Equipped with these controllers, the robot can execute commands from the deliberative and interactive layers in the presence of obstacles.
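As a rough illustration of path tracking under a non-holonomic constraint, the sketch below simulates a unicycle-type robot converging onto the straight line y = 0. It uses a textbook linear feedback on cross-track error and heading error in ordinary time, not the perceptive-frame controllers described above, and it is not tied to the XR4000. The gains, speed, and step size are assumed values.

```python
import math


def track_line(x=0.0, y=1.0, theta=0.0, v=0.5,
               k_y=1.0, k_theta=2.0, dt=0.01, steps=4000):
    """Drive a unicycle model toward the line y = 0 with heading 0.

    The robot cannot move sideways; it can only change its heading rate
    (omega) while moving forward at constant speed v. Euler integration.
    """
    for _ in range(steps):
        # Steer against both the lateral offset and the heading error.
        omega = -k_y * y - k_theta * theta
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += omega * dt
    return x, y, theta


final_x, final_y, final_theta = track_line()
```

With these gains the linearized error dynamics have the characteristic polynomial s^2 + k_theta s + k_y v, whose roots are both negative, so a robot starting one meter off the line settles onto it while continuing to advance along x.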

Other work at the servo control layer addresses the interaction between a human and a mobile manipulator.  If the control interaction goes through the Internet, we need to overcome the uncertain time delay inherent in Internet communication.  Real-time control of the mobile robot over the Internet has been tested.

 


For problems or questions regarding this web site, contact the webmaster.
Last updated: February 21, 2002.