Meta FAIR ships GRASP as gradient-based planning bypasses the sequential rollout trap
By lifting the dynamics constraint and optimizing trajectories in parallel, the new architecture keeps planners built on high-dimensional world models from getting stuck in local minima during long-horizon physical tasks.
In contact-rich manipulation, the bottleneck is no longer predicting what the part will do next. It is computing the sequence of joint torques required to get there without the optimization breaking down. A joint research team from Meta FAIR and UC Berkeley has published GRASP, a gradient-based planner that stops treating learned world models as sequential simulators and instead optimizes the entire physical trajectory in parallel.
The traditional approach to planning with a world model is strictly sequential: roll the model forward step by step and backpropagate the terminal error through the whole rollout. But as the horizon stretches past a few seconds of physical time, the computation graph deepens. The gradient with respect to an early action is a product of per-step Jacobians, so its conditioning degrades exponentially with the horizon and the gradients vanish or explode. Furthermore, many tasks are non-greedy: the arm has to move away from the goal before it can reach it. If a manipulator needs to back away from a fixture to reposition its grip, a sequential planner will fight the temporary increase in distance to the goal and trap the arm in a local minimum.
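To make the horizon problem concrete, here is a minimal sketch, not taken from the GRASP work. It assumes a hypothetical linear toy model s_next = A s + B a in place of a learned world model. For that model, the sensitivity of the final state to the first action is the matrix A^(T-1) B, so with contractive dynamics the gradient signal reaching early actions shrinks geometrically as the horizon T grows.

```python
import numpy as np

def first_action_gradient_norm(A, B, horizon):
    """Frobenius norm of d s_T / d a_0 for s_{t+1} = A s_t + B a_t.

    Backpropagating through the rollout multiplies one Jacobian (A) per step,
    so the sensitivity of the final state to the first action is A^(T-1) B.
    """
    J = B.copy()
    for _ in range(horizon - 1):
        J = A @ J
    return np.linalg.norm(J)

# Hypothetical contractive dynamics, chosen only for illustration.
A = 0.9 * np.eye(2)
B = np.eye(2)

norms = {T: first_action_gradient_norm(A, B, T) for T in (1, 10, 50, 100)}
for T, n in norms.items():
    print(f"horizon {T:3d}: gradient norm {n:.2e}")
```

Scaling A above 1.0 instead makes the same product blow up, which is the exploding side of the same conditioning problem.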
GRASP bypasses this by lifting the dynamics constraint. Instead of forcing every intermediate state to strictly obey the physics model during optimization, the planner treats the dynamics as a soft constraint. It optimizes actions and states simultaneously across the entire time horizon. This allows the system to temporarily explore physically impossible midpoints to find a valid, non-monotonic path around an obstacle, before pulling the trajectory back into dynamic feasibility. Crucially, the researchers, including Yann LeCun and Amir Bar, reshaped the gradients to avoid the adversarial brittleness inherent to high-dimensional vision models, bypassing the “dimpled manifold” problem where orthogonal perturbations typically shatter the control policy.
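The lifted formulation can be sketched on a toy problem. This illustrates the general idea only; it is not GRASP's actual algorithm, and it leaves out the gradient reshaping entirely. The dynamics s_next = s + a, the penalty weight lam, the step size, and all function names are assumptions made for this example. States and actions are both free variables, the dynamics enter the loss as a quadratic penalty, and every time step is updated in parallel with no rollout inside the optimization loop.

```python
import numpy as np

def step(s, a):
    """Toy stand-in for a learned world model (assumption): s_next = s + a."""
    return s + a

def plan_lifted(start, goal, horizon=10, lam=1.0, lr=0.05, iters=2000):
    """Jointly optimize states s[0..T] and actions a[0..T-1] by gradient descent on
    ||s[T] - goal||^2 + lam * sum_t ||s[t+1] - step(s[t], a[t])||^2.
    Gradients are hand-derived for the linear toy model; a learned model
    would use automatic differentiation instead.
    """
    s = np.linspace(start, goal, horizon + 1)  # straight-line guess, not yet feasible
    a = np.zeros((horizon, start.shape[0]))
    for _ in range(iters):
        r = s[1:] - step(s[:-1], a)            # dynamics residuals, one per time step
        grad_a = -2.0 * lam * r
        grad_s = np.zeros_like(s)
        grad_s[1:] += 2.0 * lam * r            # residual t depends on s[t+1] with +1
        grad_s[:-1] -= 2.0 * lam * r           # residual t depends on s[t] with -1
        grad_s[-1] += 2.0 * (s[-1] - goal)     # terminal cost
        grad_s[0] = 0.0                        # the start state is fixed
        s -= lr * grad_s                       # all time steps updated at once
        a -= lr * grad_a
    return s, a

def rollout(start, actions):
    """Replay the planned actions through the toy dynamics, step by step."""
    s = start
    for a in actions:
        s = step(s, a)
    return s

start = np.array([0.0, 0.0])
goal = np.array([1.0, -2.0])
states, actions = plan_lifted(start, goal)
final_state = rollout(start, actions)
max_residual = np.abs(states[1:] - step(states[:-1], actions)).max()
print("final state from rollout:", final_state)
print("largest dynamics violation:", max_residual)
```

The initial guess violates the dynamics at every step, because the states move while the actions are zero. The penalty then pulls states and actions into agreement, so replaying the optimized actions through the model lands near the goal.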
The immediate winners are deployment sites reliant on high-dimensional visual spaces for long-horizon tasks, specifically unstructured bin picking and multi-step assembly where the environment cannot be perfectly fixtured. The losers are traditional backpropagation-through-time architectures, which remain mathematically fragile when tasked with any motion profile that requires a temporary retreat to advance.
The architecture forecloses the assumption that simply scaling a predictive world model automatically yields a capable physical controller. What it opens is a practical method for translating the latent-space reasoning of massive generative models into the physical reality of a factory floor, where the only metric that matters is whether the end-effector reaches the target on time.
