Ph.D. candidate in the Autonomous Learning Group at the Max Planck Institute for Intelligent Systems, Tuebingen.
Our paper "Extracting Strong Policies for Robotics Tasks from Zero-Order Trajectory Optimizers" was accepted at ICLR 2021.
This work builds on our previous paper, iCEM. We add a policy extraction scheme that learns a policy from the optimal trajectories generated by the zero-order CEM optimizer.
Check out the paper for more details.