Festo’s Bionic Handling Assistant (BHA) was revealed to the public in 2010 and got quite some media coverage. Its bionic features make it especially cool, yet at the same time they make control especially hard as we already covered in our last article: Simulation of the Bionic Handling Assistant.
The structure and functioning of the Bionic Handling Assistant were inspired by an elephant’s trunk. It is manufactured with rapid-prototyping methods by means of 3D printing. Its main material is polyamide, which makes the entire robot deformable and very lightweight: essentially, the robot consists of plastic and a lot of air. It is actuated by thirteen pneumatic bellow actuators (chambers): three in parallel for each of the three main segments and for the wrist, plus one for the gripper. The chambers are inflated and deflated to move the robot, which allows the entire robot to bend and stretch.
We showed in our last article how we managed to get a pretty good approximation for the position of the gripper based on sensing the actuator lengths with the cable potentiometers. We showed that, based on this model, we can simulate the robot well and really fast.
But if you want to use the gripper to pick up and manipulate stuff, you need to invest a lot more. In fact, we looked at how infants approach similar problems when learning to move their limbs.
Challenge: inverse kinematics for the BHA
A lot has happened since this last article. A couple of research papers, a trade fair, and a Ph.D. thesis later, we want to focus on the control of the BHA’s end-effector in this post. There are a lot of basic things that one needs to think about when starting to operate such a unique and non-standard robot platform. Yet, in the end, one wants to use the BHA’s gripper to pick up and manipulate something. The BHA can be moved by controlling the lengths of the nine actuators in the main segments (difficult enough stuff for another article, see [Neumann et al., 2013], [Nordmann et al., 2012]). Hence, to pick up an object at a certain position in space with the gripper, you need to know:
Which motor commands (actuator lengths) must be applied to move the gripper to the object position?
This problem is known as inverse kinematics: finding a posture for the robot that brings the effector to some desired coordinate. For standard robots with revolute joints this problem is solved, although in practice it can still be annoying. The typical approach is to start with the (exactly) known forward kinematics, do some fancy math on it (matrix inverses of the kinematics’ derivatives), and approach the goal step by step. So since we have a nice approximation of the BHA’s forward kinematics, we should be done, right?
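To make the "typical approach" concrete, here is a minimal sketch for a hypothetical two-link planar arm with revolute joints (not the BHA). The link lengths, the step size, and the names `forward`, `jacobian`, and `solve_ik` are assumptions chosen for illustration. It inverts the 2x2 derivative matrix of the forward kinematics by hand and walks toward the goal in damped steps.

```python
import math

L1, L2 = 1.0, 1.0  # assumed link lengths of a hypothetical two-link arm

def forward(q):
    """Forward kinematics: joint angles (a, b) -> effector position (x, y)."""
    a, b = q
    return (L1 * math.cos(a) + L2 * math.cos(a + b),
            L1 * math.sin(a) + L2 * math.sin(a + b))

def jacobian(q):
    """Derivative of the effector position with respect to the joint angles."""
    a, b = q
    return ((-L1 * math.sin(a) - L2 * math.sin(a + b), -L2 * math.sin(a + b)),
            (L1 * math.cos(a) + L2 * math.cos(a + b), L2 * math.cos(a + b)))

def solve_ik(goal, q, step=0.5, iterations=100):
    """Approach the goal step by step using the inverse of the Jacobian."""
    for _ in range(iterations):
        x, y = forward(q)
        ex, ey = goal[0] - x, goal[1] - y          # remaining position error
        (j11, j12), (j21, j22) = jacobian(q)
        det = j11 * j22 - j12 * j21
        if abs(det) < 1e-9:                        # singular posture: give up
            break
        dq1 = (j22 * ex - j12 * ey) / det          # inverse Jacobian times error
        dq2 = (-j21 * ex + j11 * ey) / det
        q = (q[0] + step * dq1, q[1] + step * dq2)
    return q

q = solve_ik(goal=(1.0, 1.0), q=(0.3, 0.5))
print(forward(q))
```

Note that every step relies on an exact `forward` and `jacobian`, and on the arm actually executing the commanded angles. Those are precisely the assumptions that break down on the BHA.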
For the BHA, this approach is not suitable for a couple of reasons:
- The forward kinematics are not exactly known. The simulation’s (average) misprediction of 1 cm would directly impair the control accuracy. We could cope with this if it were the only, or even the most significant, problem.
- What is substantially worse is that the limits within which the actuator lengths can be changed are very narrow and have strong interdependencies which are essentially unknown. One never knows for certain in advance whether a motor command (a set of lengths for all actuators) can be achieved at all. We actually have a data-driven approximation for these limits, but the impact of even minor errors is huge: the effector position predicted by a model with just slightly wrong limits can easily deviate by 10 cm from that of the physical BHA.
- Both issues together are really hard, but at least in principle solvable by the widely used feedback control schemes that approach the right solution step by step. Yet, the BHA exhibits strong sensory noise and considerable delays in the actuation. Standard feedback control on such hardware can only be done for very slow movements (or „low gains“ in control slang). And „slow“ here means painfully slow on the BHA. Control theory describes some approaches to circumvent this problem, but these require even more analytic knowledge that we don’t have.
- The final, and even more show-stopping, reason is that the BHA changes over time, due to wear-out effects and the visco-elasticity of the elastic material. Even if it were possible to know all the necessary stuff about the BHA analytically and exactly, it would be void in the very next moment. In particular the actuation limits, which cause the largest prediction errors, change a lot even on short and medium time scales.
A logical approach (not only for us) is to fill these huge gaps of knowledge with artificial learning. Motor learning is indeed a widely investigated topic from both a machine learning and an actual control perspective. A nearly endless number of approaches has been suggested, see [Nguyen and Peters, 2011] for a recent review. So let’s apply them, and finally we’re done.
Learning is not the problem. The problem is that data for learning has to be generated by exploration. Enough data! The standard approaches to motor learning require full knowledge about the space of possible motor commands. That means that essentially all motor commands need to be explored exhaustively on the robot. Of course there is not a „countable“ number of actuator lengths, but one could, for instance, try out ten different lengths per actuator and then go through all combinations. How difficult can it be? Well, ten different lengths for nine actuators gives one billion combinations! Often this is done in practice by „motor babbling“: try out a lot (!) of random actions. This superficially conceals the demand to explore really all actions, but changes nothing about the exhaustive character. Either way, there are way too many actions to be explored in the lifetime of a robot or any learning agent. The problem is even more drastic when the system changes over time, like the BHA does. If one cannot even gather the necessary exploratory data once, then how to react to ongoing changes? Try (and fail) to explore everything again? Pointless. This approach does not scale.
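The arithmetic behind "one billion combinations" is easy to check. The sketch below also turns it into robot time, assuming (purely for illustration) that the robot could try one motor command per second around the clock; the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def grid_size(levels, actuators):
    """Number of combinations when each actuator takes one of `levels` lengths."""
    return levels ** actuators

def years_to_explore(levels, actuators, seconds_per_command=1.0):
    """Time to try every combination once, at an assumed rate per command."""
    return grid_size(levels, actuators) * seconds_per_command / SECONDS_PER_YEAR

print(grid_size(10, 9))                   # 1000000000
print(round(years_to_explore(10, 9), 1))  # 31.7
```

Even under this optimistic rate, a single exhaustive sweep of the nine main actuators would take more than three decades, and it would be outdated long before it finished.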
What babies can tell us
This is when infants come into play.
Learning to coordinate many degrees of freedom on a changing body is indeed challenging. Yet, the problem is not unique to technical artifacts. On the contrary, we humans face the very same problem when we are born and not even capable of the simplest coordinated movements. If only one could mimic the tremendous efficiency of human learning, it should be possible to successfully learn on platforms like the BHA as well.
Our human body possesses more than 600 skeletal muscles which we need to coordinate for purposeful actions. It is completely hopeless to just explore all ways to move 600 muscles, or even to track the ongoing change induced by the rapid growth of our bodies. So what is the trick that organizes our own sensorimotor learning? The movements of newborns do not appear to follow any coordinated pattern at first sight. In fact, it was believed for a long time that during our first months of life we only perform (essentially) random movements, a belief that dates back to the earliest and most seminal developmental theories such as Piaget’s.
Random movements essentially correspond to an exhaustive exploration. Hence they provide neither a real explanation for the success of human learning, nor guidelines for artificial learning on robots like the BHA. Not much hope to solve the challenge so far …
… until a remarkable infant study came up. In 1982, Claes von Hofsten investigated the way newborns react to flashy objects close in front of them. He found that the newborns’ movements are indeed not very coordinated, but at the same time far, far away from being random. The newborns, as young as two days after birth, tried to reach for the objects!
Newborns aren’t random, but already move in goal-directed ways!
This finding is remarkable, because infants consistently fail to get the hand to the object. Why would they do it if they fail? It is even more remarkable for two other reasons:
- It has been overlooked for three decades by developmental psychology after Piaget’s seminal work. Mostly because infants only try, but fail in the end.
- It has been overlooked for another three decades by robotics and machine learning after it was found. Was there nothing to learn from this study?
In the early days of my Ph.D. work I stumbled across von Hofsten’s study. I was both intrigued by the question of what these early goal-directed movements are good for, and increasingly annoyed by repeated claims of how awesome (and, absurdly, even „biologically plausible“) random motor babbling is. Today, there are some answers. Goal babbling, our shorthand for goal-directed exploration from the very beginning, is indeed different from exhaustive exploration: this kind of exploration rapidly seeks solutions for behavioral goals and sticks to them if appropriate.
Most motor tasks, like the BHA’s inverse kinematics, contain huge redundancy: any possible goal (e.g. an end-effector position) can be achieved in infinitely many ways. While exhaustive or random exploration searches all of them, goal babbling seeks only a few of them, just enough to solve the task. Hence, „enough“ data can be generated more efficiently. This is complemented by another finding from developmental psychology: it seems that infants only learn one single solution in the very beginning, which is applied even without feedback or corrections. This is not as versatile as all our modern control stuff, but it can be efficiently explored with goal babbling.
We mimicked this combination of goal babbling and the representation of just a single solution in simulation, at least to start with [Rolf et al., 2011]. It turns out:
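The idea can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: a hypothetical redundant planar arm with five joints, a handful of goal positions, and a deliberately crude learner that simply reuses the stored command whose observed outcome was closest to the goal. This nearest-neighbour learner is a stand-in and does not reproduce the learner or the weighting schemes of the published method. What it does show is the exploration pattern: the robot always tries to reach a goal with its current best guess plus a little noise, and learns from whatever actually happened.

```python
import math
import random

N_JOINTS = 5              # assumed: a redundant planar arm, not the BHA
LINK = 1.0 / N_JOINTS     # equal link lengths, total arm length 1.0
HOME = [0.0] * N_JOINTS   # stretched-out starting posture

def forward(q):
    """Effector position of the planar arm for joint angles q."""
    angle = x = y = 0.0
    for qi in q:
        angle += qi
        x += LINK * math.cos(angle)
        y += LINK * math.sin(angle)
    return (x, y)

def dist(p, g):
    return math.hypot(p[0] - g[0], p[1] - g[1])

def nearest_command(samples, goal):
    """Inverse estimate: the command whose observed outcome was closest to the goal."""
    return min(samples, key=lambda s: dist(s[1], goal))[0]

def mean_error(samples, goals):
    """Average reaching error when the inverse estimate is used without noise."""
    return sum(dist(forward(nearest_command(samples, g)), g) for g in goals) / len(goals)

def goal_babbling(goals, rounds=50, noise=0.05, seed=0):
    rng = random.Random(seed)
    samples = [(HOME, forward(HOME))]             # all the robot knows at first
    for _ in range(rounds):
        for goal in goals:
            guess = nearest_command(samples, goal)               # try to reach the goal
            command = [qi + rng.gauss(0.0, noise) for qi in guess]  # plus exploratory noise
            samples.append((command, forward(command)))          # learn from the outcome
    return samples

goals = [(0.7, 0.1 * k) for k in range(-4, 5)]
print(mean_error([(HOME, forward(HOME))], goals))   # error before any exploration
print(mean_error(goal_babbling(goals), goals))      # error after 450 movements
```

The sketch never enumerates the five-dimensional space of joint angles. It only collects samples near the goals it cares about, and it ends up with one solution per goal rather than all of them.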
- It scales! We found that this approach does not even take more time for 50 degrees of freedom than it takes for two degrees of freedom. This is remarkable compared to exhaustive exploration, where the cost (or time) needed to gather data increases exponentially with the dimension.
- It is fast, actually comparable to the speed of human learning. Useful results require only a few hundred movements, which is also what we humans need [Sailer and Flanagan, 2005] when we solve previously unknown sensorimotor tasks.
Both aspects are very good news, and a nice example of how biology inspires solutions to previously intractable problems. Time to combine award-winning, bio-inspired hardware with award-winning, bio-inspired learning.
Time for practice!
Learning and Exploitation on the BHA
When we started with goal babbling on the BHA, we made some rapid progress in the beginning. Yet, we had to learn a lot about the platform itself (for instance the narrow and changing actuation limits), and had quite some remaining problems with the underlying (pressure and length) control. As we increasingly mastered the basic control stuff, goal babbling matured as well …
… and worked nicely!
The first working version was to just learn left / right movements of the effector in space. Not very useful, but awesome stuff for live-demonstrations because it only takes around two minutes of goal babbling to learn this task from scratch, even though nine actuators need to be coordinated. We quickly discovered how nicely it worked when one of the BHA’s actuators was physically broken. The actuator could not inflate anymore and, even worse, was still passively moving like a spring in the middle robot segment. Learning didn’t even care and just learned to use these passive movements together with the remaining, functioning actuators. We did not plan to do such experiments, we were just hit by reality. True story.
A likewise positive aspect came up when, let’s say, „someone“ started to touch and push the robot while it was exploring and learning. Imagine a Ph.D. student’s nervous face when a professor (oops) starts to interfere with the robot … The nervousness did not last for more than a few seconds. It was awesome to see that one could just teach the robot on the fly. We had a visualization of the robot’s current goal, and could physically push it in the right direction. Learning then was so quick that there wasn’t even resistance from the controller, provided one pushed in the right direction. This ability is only possible with goals, very rapid ongoing learning, and a very compliant, lightweight robot.
Although all the learning and exploration was done physically on the robot, the kinematics simulation has become an important tool for the learning experiments. The mere ability to visualize the robot in relation to its current, self-generated goals was crucial to understand what learning currently does, and how well it performs. This was very important and informative when we started to work on the 3D control of the effector (not just left/right). Similar to our very first, pure simulation experiments, we could estimate how accurate the controller already is and see a lot of its properties. This was most important for understanding how significant and difficult the narrow and changing actuation limits are.
In the end, learning could deal with all the problems mentioned above. Right after learning, our controller reached accuracies of around 2-3 cm error. Not perfect, but almost enough, since the BHA’s flexible fin-gripper does not require millimeter accuracy to pick up an object. An important aspect of the learned controller is that it is feed-forward. It does not have to wait for delayed sensory feedback to perform a movement, but directly „knows“ how to go to the goal, which allows quite fast movements. This comes at the cost of residual errors. If needed, however, we still managed to incorporate feedback and become more accurate. The very last sequence of the video shows how the controller slowly approaches and grasps the cup. This is done with an additional feedback controller on top of the learning. And yes, this is the low speed possible when feedback comes into play. However, things become much more accurate: we reached accuracies of around 6-8 mm, where about 5 mm is the absolute baseline at which the BHA is controllable at all.
It is worth highlighting that all this stuff does not just work in the video. We made some very time-consuming experiments (more than 30 hours of robot time) that show the consistent success of learning, which is backed up by simulation experiments that explicitly show that goal babbling deals with the changing body very effectively. What is even more conclusive is that we made a lot of live demonstrations in our lab, in all of which learning worked nicely, and actually showed live learning at the Automatica 2012 trade fair in Munich. At the fair, the robot, the learning, and the controller ran for four days, eight hours each.
Festo’s Bionic Handling Assistant is a fancy robot indeed. Its elephant’s trunk-inspired look and nature and its intrinsic safety for interaction are striking features. However, facing a completely new kinematic structure, new materials and a new actuation scheme, and starting without any kinematic model at all, we have come a long way in controlling it.
We started with a robot without any model and just really basic pressure control. Bit by bit we added a forward model, a simulation and prediction tool, and length control. The key to success, though, was to use machine learning techniques and especially the infant-inspired Goal Babbling, which turned out to be able to learn the control of the robot really fast.
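The trade-off between the two modes can be sketched with a one-dimensional toy. The plant, the imperfect "learned" inverse, the gain, and the way the correction is applied (shifting the goal that is fed into the learned controller) are all assumptions for illustration, not a description of the controller used on the robot.

```python
import math

def plant(u):
    """Hypothetical 1-D robot: motor command -> effector position."""
    return 2.0 * u + 0.3 * math.sin(u)

def learned_inverse(x):
    """Imperfect learned controller: it misses the sine term of the plant."""
    return x / 2.0

def feed_forward(goal):
    """One shot, no waiting for sensors: fast, but with a residual error."""
    return plant(learned_inverse(goal))

def with_feedback(goal, gain=0.2, steps=50):
    """Low-gain feedback on top: many small corrections, slow but accurate."""
    correction = 0.0
    position = plant(learned_inverse(goal))
    for _ in range(steps):
        correction += gain * (goal - position)   # accumulate the observed error
        position = plant(learned_inverse(goal + correction))
    return position

goal = 1.0
print(abs(feed_forward(goal) - goal))    # residual error of the single movement
print(abs(with_feedback(goal) - goal))   # much smaller after 50 feedback steps
```

The single feed-forward movement lands close to the goal immediately, while the feedback loop needs many observe-and-correct cycles to remove the rest of the error. On hardware with noisy and delayed sensing, each of those cycles has to be slow.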
So now that we can move it properly to grasp and manipulate objects: Which task would you like to see it doing?
- Dr.-Ing. Matthias Rolf, CoR-Lab – Bielefeld University, firstname.lastname@example.org
- Prof. Jochen Steil, CoR-Lab – Bielefeld University, email@example.com
Matthias Rolf is a researcher at the Research Institute for Cognition and Robotics at Bielefeld University, Germany. His main research field is motor learning in developmental robotics.