FlashProfile: Interactive Synthesis of Syntactic Profiles.

arXiv: Learning (2017)

Abstract
We address the problem of learning comprehensive syntactic profiles for a set of strings. Real-world datasets, typically curated from multiple sources, often contain data in various formats. Thus any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify the various formats is infeasible in standard big-data scenarios. We present a technique for generating comprehensive syntactic profiles in terms of user-defined patterns that also allows for interactive refinement. We define a syntactic profile as a set of succinct patterns that describe the entire dataset. Our approach efficiently learns such profiles, and allows refinement by exposing a desired number of patterns. Our implementation, FlashProfile, shows a median profiling time of 0.7s over 142 tasks on 74 real datasets. We also show that access to the generated data profiles allows for more accurate synthesis of programs, using fewer examples in programming-by-example workflows.
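To make the notion of a syntactic profile concrete, the following is a minimal illustrative sketch, not FlashProfile's algorithm or pattern language. It assumes a small, hypothetical set of token classes (`Digit+`, `Upper+`, `Lower+`, `Space+`), maps each string to a coarse pattern, and groups strings that share one. The names `TOKEN_CLASSES`, `signature`, and `naive_profile` are invented for this example.

```python
import re
from collections import defaultdict

# Hypothetical token classes, tried in order. These are assumptions for
# illustration only, not the pattern language used by FlashProfile.
TOKEN_CLASSES = [
    ("Digit+", re.compile(r"[0-9]+")),
    ("Upper+", re.compile(r"[A-Z]+")),
    ("Lower+", re.compile(r"[a-z]+")),
    ("Space+", re.compile(r"\s+")),
]


def signature(s):
    """Return a coarse pattern for s: a tuple of token-class names,
    with any unmatched character kept as a quoted literal."""
    parts = []
    i = 0
    while i < len(s):
        for name, rx in TOKEN_CLASSES:
            m = rx.match(s, i)
            if m:
                parts.append(name)
                i = m.end()
                break
        else:
            # No token class matched at position i: keep the literal character.
            parts.append(repr(s[i]))
            i += 1
    return tuple(parts)


def naive_profile(strings):
    """Group strings by signature; the keys form a (naive) set of patterns
    that together describe every input string."""
    groups = defaultdict(list)
    for s in strings:
        groups[signature(s)].append(s)
    return dict(groups)


if __name__ == "__main__":
    data = ["2017-05-12", "1999-01-30", "May 12", "N/A"]
    for pattern, members in naive_profile(data).items():
        print(" ".join(pattern), "<-", members)
```

This sketch always produces one pattern per distinct signature, so it cannot expose a user-chosen number of patterns or accept user-defined patterns, both of which the abstract attributes to the actual technique.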