DisCo: Physics-Based Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems

Rupe Adam, Kumar Nalini, Epifanov Vladislav, Kashinath Karthik, Pavlyk Oleksandr, Schlimbach Frank, Patwary Mostofa, Maidanov Sergey, Lee Victor, Prabhat, Crutchfield James P.

arXiv (2019)

Abstract
Extracting actionable insight from complex unlabeled scientific data is an open challenge and key to unlocking data-driven discovery in science. As a complement and alternative to supervised machine learning, unsupervised physics-based methods built on behavior-driven theories hold great promise. Due to computational limitations, practical application to real-world domain science problems has lagged far behind theoretical development. We present our first step towards bridging this divide: DisCo, a high-performance distributed workflow for the behavior-driven local causal state theory. DisCo provides a scalable unsupervised physics-based representation learning method that decomposes spatiotemporal systems into their structurally relevant components, which are captured by the latent local causal state variables. Complex spatiotemporal systems are generally highly structured and organize around a lower-dimensional skeleton of coherent structures, and in several firsts we demonstrate the efficacy of DisCo in capturing such structures from observational and simulated scientific data. To the best of our knowledge, DisCo is also the first application software developed entirely in Python to scale to over 1000 machine nodes, delivering good performance while preserving domain scientists' productivity. We developed scalable, performant methods optimized for Intel many-core processors that will be upstreamed to open-source Python library packages. Our capstone experiment, using the newly developed DisCo workflow and libraries, performs unsupervised spacetime segmentation analysis of CAM5.1 climate simulation data, processing an unprecedented 89.5 TB in 6.6 minutes end-to-end on 1024 Intel Haswell nodes of the Cori supercomputer, with 91% weak-scaling and 64% strong-scaling efficiency.
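To put the headline figures in perspective, the sketch below converts the quoted data volume, runtime, and node count into an average throughput, which comes to roughly 226 GB/s in aggregate. It also writes out the standard textbook definitions of strong- and weak-scaling efficiency. This is not code from the paper: the function names are hypothetical, 1 TB is assumed to be 1000 GB, and the abstract does not give the baseline node counts or runtimes behind the 91% and 64% figures, so those two functions are shown as definitions only.

```python
# Figures quoted in the abstract.
DATA_TB = 89.5       # data processed, terabytes
RUNTIME_MIN = 6.6    # end-to-end wall time, minutes
NODES = 1024         # Intel Haswell nodes on Cori


def aggregate_throughput_gb_s(data_tb, runtime_min):
    """Average end-to-end throughput in GB/s (assumes 1 TB = 1000 GB)."""
    return data_tb * 1000.0 / (runtime_min * 60.0)


def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Fixed total problem size: achieved speedup over ideal speedup."""
    return (t_base * n_base) / (t_n * n)


def weak_scaling_efficiency(t_base, t_n):
    """Fixed problem size per node: ideally the runtime stays constant."""
    return t_base / t_n


total = aggregate_throughput_gb_s(DATA_TB, RUNTIME_MIN)
per_node = total / NODES
print(f"aggregate: {total:.1f} GB/s, per node: {per_node * 1000:.0f} MB/s")
```

These are averages over the whole end-to-end run, so they describe the workflow as a whole rather than the peak rate of any single stage.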
Keywords
scalable unsupervised physics-based representation learning method, spatiotemporal systems, structurally relevant components, latent local causal state variables, DisCo, physically meaningful coherent structures, observational simulated scientific data, application software, machine nodes, CAM5.1 climate simulation data, coherent spatiotemporal structures, physics-based unsupervised discovery, complex unlabeled scientific data, data-driven discovery, unsupervised physics-based methods, behavior-driven theories, real-world domain science problems, theoretical development, modern supercomputers, high-performance distributed workflow, behavior-driven local causal state theory, unsupervised spacetime segmentation analysis, time 6.6 min