New research teaches robots to anticipate what’s coming rather than focusing on what’s right in front of them.
What’s new: Santhosh K. Ramakrishnan and colleagues at Facebook and the University of Texas at Austin developed Occupancy Anticipation (OA), a navigation system that predicts unseen obstacles in addition to observing those in its field of view. For instance, seeing the corner of a bed, the model evaluates whether a clear path around the bed is likely to exist. The system won the Habitat 2020 PointNav Challenge, which tests a robot’s ability to navigate a complex environment autonomously using only sensory input.
Key insight: The PointNav Challenge supplies a robot with an indoor destination (such as “two meters west and four meters north”). The route is often blocked by obstacles, like furniture, that lie outside the robot’s line of sight. Knowing where these obstacles are would enable the robot to generate an efficient route to the destination. The next best thing is predicting their presence.
How it works: OA receives inputs from the robot’s depth sensor, front-facing camera, and state (its position and whether its wheels are turned and moving). It learns to minimize both the remaining distance to the destination and the length of the path it takes to get there. The system incorporates a version of Active Neural SLAM (ANS), which won last year’s PointNav Challenge, modified to let OA take advantage of its predictive capabilities.
- Based on input from the depth sensor, an image processing network draws an aerial map of known obstacles. A U-Net extracts features from the map and camera image to predict whether an unseen obstacle lies out of view. For instance, a wall’s edges may be out of view, but the model can estimate, based on past experience, how far away the next door or corner is likely to be.
- On its own, ANS would search for the shortest path to the destination by exploring intermediate locations that help the robot view more of the environment, but OA prefers intermediate locations that help the system predict hidden obstacles. This strategy can decrease the amount of exploration necessary. For instance, if the robot can predict a table’s edges, it doesn’t need to circle the table to confirm that no shortcut passes through it.
- Once OA chooses an intermediate location, it drives the robot there, adding the obstacles it observes to its map along the way. It repeats the process until the robot reaches its destination.
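The effect of anticipating hidden obstacles on route planning can be illustrated with a toy grid. The sketch below is not the authors’ system: a breadth-first search stands in for the planner, and a hard-coded table of probabilities stands in for the U-Net’s predictions. The names `plan`, `apply_anticipation`, and the 0.5 threshold are assumptions made for illustration.

```python
from collections import deque

FREE, OCCUPIED, UNKNOWN = 0, 1, -1

def plan(grid, start, goal):
    """Breadth-first search over every cell not marked OCCUPIED.
    UNKNOWN cells are treated as passable. Returns a list of cells or None."""
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != OCCUPIED and (nr, nc) not in parents:
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

def apply_anticipation(observed, predicted_prob, threshold=0.5):
    """Mark UNKNOWN cells as OCCUPIED when the (hypothetical) predictor
    assigns them an occupancy probability at or above the threshold."""
    merged = [row[:] for row in observed]
    for r, row in enumerate(observed):
        for c, value in enumerate(row):
            if value == UNKNOWN and predicted_prob[r][c] >= threshold:
                merged[r][c] = OCCUPIED
    return merged

# The robot has seen one corner of a bed at (1, 2); the cells above and
# below it are out of view.
observed = [
    [FREE, FREE, UNKNOWN,  FREE, FREE],
    [FREE, FREE, OCCUPIED, FREE, FREE],
    [FREE, FREE, UNKNOWN,  FREE, FREE],
]
# Hard-coded stand-in for a learned prediction: the bed probably extends
# upward to (0, 2) and probably does not extend down to (2, 2).
predicted_prob = [
    [0.0, 0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.0],
]

start, goal = (0, 0), (0, 4)
optimistic_path = plan(observed, start, goal)
anticipated_path = plan(apply_anticipation(observed, predicted_prob), start, goal)
print("observed only:", optimistic_path)
print("with anticipation:", anticipated_path)
```

Planning on the observed map alone sends the robot straight through the unseen cell at (0, 2), where it would likely hit the rest of the bed and have to replan. With the anticipated map, the planner commits to the longer detour through (2, 2) from the outset.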
Results: The PointNav Challenge ranks methods according to the metric known as success weighted by path length (SPL), which takes a value between 0 and 1, higher being better. SPL averages success over episodes but discounts each success by how much longer the robot’s path was than the shortest possible path. OA achieved 0.21 SPL to wallop the second-place ego-localization, which achieved 0.15 SPL.
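For readers who want to see how the metric behaves, here is a minimal sketch of SPL following its commonly cited definition: each episode contributes its success flag times the shortest path length divided by the larger of the shortest path length and the length of the path taken, and the contributions are averaged. The function name `spl` and the episode values are made up for illustration; they are not challenge data.

```python
def spl(episodes):
    """Success weighted by path length.

    episodes: iterable of (success, shortest_length, taken_length) tuples,
    where success is a bool and the lengths are positive numbers.
    """
    total = 0.0
    count = 0
    for success, shortest, taken in episodes:
        count += 1
        if success:
            # A success along the shortest path scores 1; longer paths score less.
            total += shortest / max(taken, shortest)
    return total / count if count else 0.0

# Made-up episodes: one optimal success, one success along a path twice
# as long as necessary, and one failure.
example_episodes = [
    (True, 10.0, 10.0),
    (True, 10.0, 20.0),
    (False, 10.0, 5.0),
]
example_score = spl(example_episodes)
print(example_score)  # (1.0 + 0.5 + 0.0) / 3 = 0.5
```

Because failures score zero and roundabout successes score less than 1, a system can raise its SPL either by reaching the destination more often or by taking more direct routes.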
Why it matters: Reinforcement learning agents must balance exploring with sticking to what they already know works. Exploration can reveal shortcuts, but it can also waste time. OA offers an elegant solution, since an agent can bypass areas where it predicts unseen obstacles.
We’re thinking: The way Nova drops toys around the Ng residence, even PointNav champs wouldn’t stand a chance.