
UR5 performing a bottle-into-socket insertion in simulation. The control policy is trained inside a differentiable physics simulator and produces smooth, sub-millimetre-accurate trajectories that adapt in real time to moving targets such as bottles arriving on a conveyor.
Differentiable-Physics Motion Planning is an AI training method for industrial robot arms that learns smooth, sub-millimetre-accurate motion trajectories directly from a physics simulator with full gradient backpropagation. Instead of the trial-and-error search used by reinforcement learning, it exploits the gradients of the simulator itself, giving stable, high-speed training — roughly one to two orders of magnitude fewer steps than RL — and a compact neural controller that computes the next joint action in 0.3 ms, fast enough to track moving targets without manual part feeding. The method was validated in simulation on a bottle-into-socket insertion task with a Universal Robots UR5 arm, in both static- and moving-target scenarios.
Training runs inside a fully differentiable rigid-body simulator (Nimble Physics) in which every step — kinematics, dynamics, contact and joint limits — is differentiable with respect to the controller's parameters. The controller is a small feed-forward neural network (one hidden layer of 256 units) that takes the current joint state, the target position and velocity, and a normalised time index, and outputs joint actions. The simulator unrolls the resulting trajectory (typically 30 steps at 10 ms per step), and a single composite, differentiable loss combines a smoothness term (sum of squared joint velocities), start- and goal-state errors (Cartesian distance plus orientation alignment), and non-penetration constraints. Gradients are backpropagated through the entire unrolled simulation and the weights are updated with the Adam optimiser. The fastest feasible motion is found by iteratively shrinking the trajectory length until the constraints can no longer be satisfied, and state noise is injected during training to improve robustness and prepare for sim-to-real transfer.
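As a rough illustration of the training loop described above, the following Python sketch unrolls a toy one-dimensional point-mass simulation for 30 steps of 10 ms, evaluates a composite loss, and backpropagates through every step by hand before applying an Adam update. It is not the Nimble Physics setup or the network from the paper. The two-gain controller (`kp`, `kd`), the loss weights, the `WALL` obstacle standing in for the non-penetration constraint, and all function names are assumptions chosen to keep the example dependency-free. The start-state error and state noise are omitted. Only the overall structure (unroll, composite loss, reverse pass, Adam step) mirrors the text.

```python
import math

DT = 0.01            # step length in seconds (10 ms, as in the text)
STEPS = 30           # unrolled trajectory length
GOAL = 0.10          # hypothetical 1-D target position in metres
WALL = GOAL + 0.01   # hypothetical obstacle just beyond the target
W_GOAL, W_SMOOTH, W_PEN = 100.0, 1e-3, 1e3   # assumed loss weights


def rollout(kp, kd, steps=STEPS):
    """Unroll the toy simulator and return positions and velocities."""
    xs, vs = [0.0], [0.0]
    for _ in range(steps):
        a = kp * (GOAL - xs[-1]) - kd * vs[-1]   # toy controller output
        v = vs[-1] + DT * a                      # semi-implicit Euler step
        xs.append(xs[-1] + DT * v)
        vs.append(v)
    return xs, vs


def loss_and_grad(kp, kd, steps=STEPS):
    """Composite loss and its gradient w.r.t. (kp, kd) via a reverse pass."""
    xs, vs = rollout(kp, kd, steps)
    pen = [max(0.0, x - WALL) for x in xs]       # penetration depth per step
    loss = (W_GOAL * (xs[-1] - GOAL) ** 2              # goal-state error
            + W_SMOOTH * sum(v * v for v in vs[1:])    # smoothness term
            + W_PEN * sum(p * p for p in pen[1:]))     # non-penetration term
    # Adjoints of the final state: direct loss contributions only.
    gx = 2 * W_GOAL * (xs[-1] - GOAL) + 2 * W_PEN * pen[-1]
    gv = 2 * W_SMOOTH * vs[-1]
    g_kp = g_kd = 0.0
    for t in range(steps - 1, -1, -1):           # walk the unroll backwards
        gv_total = gv + DT * gx                  # v[t+1] also feeds x[t+1]
        ga = DT * gv_total                       # adjoint of the action a[t]
        g_kp += ga * (GOAL - xs[t])
        g_kd += ga * (-vs[t])
        gx = gx - kp * ga                        # x[t] feeds x[t+1] and a[t]
        gv = gv_total - kd * ga                  # v[t] feeds v[t+1] and a[t]
        if t >= 1:                               # direct loss terms at step t
            gx += 2 * W_PEN * pen[t]
            gv += 2 * W_SMOOTH * vs[t]
    return loss, (g_kp, g_kd)


def train(kp=10.0, kd=1.0, iters=300, lr=0.2):
    """Minimise the loss with a hand-written Adam update."""
    params = [kp, kd]
    m, v = [0.0, 0.0], [0.0, 0.0]
    b1, b2, eps = 0.9, 0.999, 1e-8
    losses = []
    for i in range(1, iters + 1):
        loss, grad = loss_and_grad(*params)
        losses.append(loss)
        for j, g in enumerate(grad):
            m[j] = b1 * m[j] + (1 - b1) * g
            v[j] = b2 * v[j] + (1 - b2) * g * g
            m_hat = m[j] / (1 - b1 ** i)
            v_hat = v[j] / (1 - b2 ** i)
            params[j] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params, losses


if __name__ == "__main__":
    params, losses = train()
    print("first loss:", losses[0], "last loss:", losses[-1])
```

In a real differentiable simulator the reverse pass is produced by automatic differentiation rather than written by hand. The manual version is used here only to make the flow of gradients through the unrolled steps visible.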
Training dynamics in both scenarios: loss falls quickly and stays stable.
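The search for the fastest feasible motion, in which the trajectory length is shrunk until the constraints can no longer be satisfied, can be sketched as a simple loop. In the minimal Python sketch below, `toy_residual` is a hypothetical stand-in for "train a controller at this horizon and report the remaining goal error". It uses a closed-form bang-bang reach for a point mass with an assumed acceleration limit instead of any actual training. The names, the 0.1 mm tolerance, the 0.1 m distance and the acceleration limit are all illustrative assumptions.

```python
DT = 0.01        # step length in seconds (10 ms, as in the text)
DISTANCE = 0.10  # hypothetical distance to the target in metres
A_MAX = 12.0     # hypothetical acceleration limit in m/s^2


def toy_residual(steps):
    """Stand-in for training at a given horizon: remaining goal error in metres.

    Assumes a bang-bang profile (accelerate, optionally coast one step,
    decelerate to rest) integrated with semi-implicit Euler.
    """
    half = steps // 2
    units = half * half + (half if steps % 2 else 0)
    reach = A_MAX * DT * DT * units
    return max(0.0, DISTANCE - reach)


def shortest_feasible_horizon(residual_fn, start_steps=30, min_steps=1, tol=1e-4):
    """Shrink the horizon one step at a time until the residual exceeds tol.

    Returns the last horizon that met the tolerance, or None if even the
    starting horizon was infeasible.
    """
    best = None
    steps = start_steps
    while steps >= min_steps:
        if residual_fn(steps) > tol:
            break                 # constraints can no longer be satisfied
        best = steps
        steps -= 1
    return best


if __name__ == "__main__":
    print("shortest feasible horizon:", shortest_feasible_horizon(toy_residual))
```

In practice each call to the residual function would involve a full training run, so a coarser step size or warm-starting each run from the previous controller may be worth considering. Neither is claimed here to be what the method does.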
Key features:
The method operates in the following stages:
This method is intended for:
Within the AIMS5.0 context, the method targets high-volume production lines that still rely on manual feeding because conventional automation is too rigid to handle variability in object pose, conveyor speed and target geometry. By training accurate, smooth, real-time motion controllers quickly — and re-training them quickly when products change — it supports the flexible, high-mix automation envisioned for Industry 5.0.
If you use this method in research or development, please cite the corresponding paper.
@article{leja2025differentiable,
title={Differentiable Physics Training Method for Robot Motion Planning},
author={Leja, Laura and Strazds, Guntis Vilnis and Chronis, Christos and Varlamis, Iraklis and Freivalds, Kārlis},
journal={IFAC-PapersOnLine},
volume={59},
number={27},
pages={220--225},
year={2025},
publisher={Elsevier}
}
Developed within AIMS5.0 — Advancing Integrated Manufacturing Systems (Industry 5.0), supported by the Chips Joint Undertaking and its members, with top-up funding from National Funding Authorities of participating countries.