Representational Aspects of Depth and Conditioning in Normalizing Flows

International Conference on Machine Learning (ICML), Vol. 139 (2021)

Abstract
Normalizing flows are among the most popular paradigms in generative modeling, especially for images, primarily because we can efficiently evaluate the likelihood of a data point. This is desirable both for evaluating the fit of a model and for ease of training, as maximizing the likelihood can be done by gradient descent. However, training normalizing flows comes with difficulties as well: models which produce good samples typically need to be extremely deep, which brings accompanying vanishing/exploding gradient problems. A closely related problem is that they are often poorly conditioned: since they are parametrized as invertible maps from $\mathbb{R}^d$ to $\mathbb{R}^d$, and typical training data like images is intuitively lower-dimensional, the learned maps often have Jacobians that are close to singular.

In our paper, we tackle representational aspects of depth and conditioning in normalizing flows, both for general invertible architectures and for a particular common architecture, affine couplings. We prove that $\Theta(1)$ affine coupling layers suffice to exactly represent a permutation or $1 \times 1$ convolution, as used in GLOW, showing that representationally the choice of partition is not a bottleneck for depth. We also show that shallow affine coupling networks are universal approximators in Wasserstein distance if ill-conditioning is allowed, and we experimentally investigate related phenomena involving padding. Finally, we show a depth lower bound for general flow architectures with few neurons per layer and bounded Lipschitz constant.
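For context, the computational appeal mentioned above (exact likelihood evaluation via the change-of-variables formula) and the affine coupling layers the paper analyzes can be illustrated with a minimal sketch. This is not the paper's construction: the scale and shift networks `s` and `t` below are hypothetical toy linear maps, and the partition and dimensions are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 4              # data dimension; the flow is an invertible map R^d -> R^d
k = d // 2         # size of the partition that passes through unchanged

# Hypothetical toy scale/shift networks (fixed random linear maps), used only
# for illustration; in practice these are learned neural networks.
Ws = rng.normal(size=(k, d - k))
Wt = rng.normal(size=(k, d - k))
s = lambda x1: np.tanh(x1 @ Ws)   # log-scale applied to the second partition
t = lambda x1: x1 @ Wt            # shift applied to the second partition

def coupling_forward(x):
    """One affine coupling layer: y1 = x1, y2 = x2 * exp(s(x1)) + t(x1).
    The Jacobian is triangular, so log|det J| = sum(s(x1)) is cheap."""
    x1, x2 = x[:k], x[k:]
    log_scale = s(x1)
    y = np.concatenate([x1, x2 * np.exp(log_scale) + t(x1)])
    return y, log_scale.sum()

def coupling_inverse(y):
    """Exact inverse: x2 = (y2 - t(y1)) * exp(-s(y1))."""
    y1, y2 = y[:k], y[k:]
    return np.concatenate([y1, (y2 - t(y1)) * np.exp(-s(y1))])

x = rng.normal(size=d)
z, logdet = coupling_forward(x)
assert np.allclose(coupling_inverse(z), x)

# Change of variables with a standard Gaussian base density p_Z:
# log p_X(x) = log p_Z(f(x)) + log|det J_f(x)|
log_pz = -0.5 * (z @ z) - 0.5 * d * np.log(2 * np.pi)
print("log-likelihood of x:", log_pz + logdet)
```

Stacking many such layers (with the partition permuted between layers) yields the deep, possibly ill-conditioned maps whose depth and conditioning the paper studies.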
Keywords
normalizing flows, depth, representational aspects, conditioning