Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
International Conference on Learning Representations, 2018.
Gradient-based optimization is the foundation of deep learning and reinforcement learning. Even when the mechanism being optimized is unknown or not differentiable, optimization using high-variance or biased gradient estimates is still often the best strategy. We introduce a general framework for learning low-variance, unbiased gradient e...