Planning with Diffusion for Flexible Behavior Synthesis

Janner*   Du*   Tenenbaum   Levine
ICML 2022 (long talk)   Paper  Code  BibTex
*equal contribution

Planning as denoising

Diffuser is a denoising diffusion probabilistic model that plans by iteratively refining randomly sampled noise. The denoising process lends itself to flexible conditioning, by either using gradients of an objective function to bias plans toward high-reward regions or conditioning the plan to reach a specified goal.
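A minimal sketch of gradient-guided denoising, to make the loop above concrete. Everything here is an illustrative assumption rather than the released implementation: `toy_denoiser_mean` is a hypothetical stand-in for the learned model, the objective is a simple quadratic, and `guide_scale` and the noise schedule are arbitrary choices.

```python
import numpy as np

def toy_denoiser_mean(plan, step):
    # Hypothetical stand-in for a learned denoiser: shrinks the plan toward zero.
    return 0.9 * plan

def reward_grad(plan, target):
    # Gradient of the assumed objective J(plan) = -0.5 * ||plan - target||^2.
    return target - plan

def sample_plan(horizon, dim, n_steps=50, guide_scale=0.1, target=None, seed=0):
    rng = np.random.default_rng(seed)
    # Start from pure noise with one row per timestep of the plan.
    plan = rng.standard_normal((horizon, dim))
    for step in reversed(range(n_steps)):
        mean = toy_denoiser_mean(plan, step)
        if target is not None:
            # Guidance: nudge the denoised mean along the objective's gradient.
            mean = mean + guide_scale * reward_grad(mean, target)
        # Assumed schedule: noise shrinks linearly and vanishes at the last step.
        sigma = 0.05 * step / n_steps
        plan = mean + sigma * rng.standard_normal(plan.shape)
    return plan

target = np.ones((32, 2))
unguided = sample_plan(32, 2)
guided = sample_plan(32, 2, target=target)
```

With the same seed, the guided sample ends closer to `target` than the unguided one, which is the effect the gradient term is meant to have.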




Variable-length planning

Diffuser's planning horizon is determined by the size of the random noise used to initialize the denoising process.
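The following sketch illustrates why the noise shape can set the horizon, assuming the denoiser only applies local operations along the time axis. The function `smooth_along_time` is a hypothetical placeholder for such a model, not the architecture used in the paper's code.

```python
import numpy as np

def smooth_along_time(plan, kernel=(0.25, 0.5, 0.25)):
    # Hypothetical stand-in denoiser: a local convolution over the time axis,
    # so the same function applies to a plan of any length.
    k = np.asarray(kernel)
    padded = np.pad(plan, ((1, 1), (0, 0)), mode="edge")
    columns = [np.convolve(padded[:, d], k, mode="valid") for d in range(plan.shape[1])]
    return np.stack(columns, axis=1)

def sample_plan(horizon, dim=2, n_steps=20, seed=0):
    rng = np.random.default_rng(seed)
    # The shape of the initial noise is the only place the horizon appears.
    plan = rng.standard_normal((horizon, dim))
    for _ in range(n_steps):
        plan = smooth_along_time(plan)
    return plan

short_plan = sample_plan(horizon=16)
long_plan = sample_plan(horizon=128)
```

Both calls use the same denoising function; only the length of the initial noise differs.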



Flexible behavior synthesis

Diffuser acts as an unconditional prior over possible behaviors. We can plan for new test-time tasks by guiding its sampled plans with reward functions or constraints. All of the plans below are executed by a single model.
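One way to impose test-time constraints on a sampled plan is to overwrite the constrained timesteps after every denoising step, in the style of inpainting. The sketch below assumes that approach; `toy_denoiser_mean`, `apply_constraints`, and the noise schedule are hypothetical names and choices made for illustration, not the project's API.

```python
import numpy as np

def toy_denoiser_mean(plan):
    # Hypothetical stand-in for a learned prior: average each state with its neighbors.
    padded = np.pad(plan, ((1, 1), (0, 0)), mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

def apply_constraints(plan, constraints):
    # constraints maps a timestep index to the state required at that timestep.
    for t, state in constraints.items():
        plan[t] = state
    return plan

def sample_constrained_plan(horizon, dim, constraints, n_steps=200, seed=0):
    rng = np.random.default_rng(seed)
    plan = apply_constraints(rng.standard_normal((horizon, dim)), constraints)
    for step in reversed(range(n_steps)):
        plan = toy_denoiser_mean(plan)
        # Assumed schedule: noise shrinks linearly and vanishes at the last step.
        sigma = 0.05 * step / n_steps
        plan = plan + sigma * rng.standard_normal(plan.shape)
        # Re-impose the constraints so every intermediate plan satisfies them.
        plan = apply_constraints(plan, constraints)
    return plan

start, goal = np.zeros(2), np.ones(2)
plan = sample_constrained_plan(horizon=16, dim=2, constraints={0: start, -1: goal})
```

The prior is untouched here: the constraint dictionary is supplied only at sampling time, so the same model can serve different start and goal states.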


Unconditional stacking: Maximize the height of a block tower, with no further constraints.


Conditional stacking: Stack towers subject to test-time constraints.
