On the stability of gradient descent with second order dynamics for time-varying cost functions
Transactions on Machine Learning Research, 2025
This work explores the use of end-to-end differentiable simulators, combined with a gradient-based sampling approach, the Metropolis-Adjusted Langevin Algorithm (MALA), for falsification and repair of data-driven policies in autonomous systems. We provide end-to-end differentiable simulators written in JAX for manipulation and self-driving tasks, and validate our sampling-based algorithms for failure discovery and repair, achieving higher failure-discovery and repair rates than baselines such as Learning to Collide (L2C), REINFORCE, vanilla gradient-based optimization, and black-box sampling.
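The core sampler named above, MALA, proposes moves along the gradient of the log target density and then applies a Metropolis accept/reject correction. The following is a minimal, self-contained sketch (in plain numpy rather than the paper's JAX simulators; the toy Gaussian target and all function names are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def mala_sample(log_prob, grad_log_prob, x0, step=0.1, n_steps=2000, rng=None):
    """Metropolis-Adjusted Langevin Algorithm (MALA).

    Proposes x' = x + step * grad_log_prob(x) + sqrt(2*step) * noise,
    then applies a Metropolis accept/reject correction so that the
    chain targets the density exp(log_prob) exactly.
    """
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    samples, accepts = [], 0
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        prop = x + step * grad_log_prob(x) + np.sqrt(2 * step) * noise
        # Correction for the asymmetric (gradient-shifted) proposal:
        # log q(prop | x) and log q(x | prop) under the Gaussian proposal.
        fwd = -np.sum((prop - x - step * grad_log_prob(x)) ** 2) / (4 * step)
        rev = -np.sum((x - prop - step * grad_log_prob(prop)) ** 2) / (4 * step)
        log_alpha = log_prob(prop) - log_prob(x) + rev - fwd
        if np.log(rng.random()) < log_alpha:
            x, accepts = prop, accepts + 1
        samples.append(x)
    return np.array(samples), accepts / n_steps

# Toy target: a standard 2-D Gaussian, standing in for a distribution over
# simulation parameters whose log-density (and gradient) would come from a
# differentiable simulator in the paper's setting.
log_p = lambda x: -0.5 * np.sum(x ** 2)
grad_log_p = lambda x: -x

samples, acc_rate = mala_sample(log_p, grad_log_p, x0=np.ones(2))
```

In the falsification setting, `log_prob` would instead score how close a simulated rollout comes to failure, with its gradient supplied by the differentiable simulator.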
Recommended citation: T.E. Gibson, S. Acharya, A. Parashar, J.E. Gaudio, A.M. Annaswamy (2025). "On the stability of gradient descent with second order dynamics for time-varying cost functions." Transactions on Machine Learning Research.
Download Paper
