SPINE: Scaling up Programming-by-Negative-Example for String Filtering and Transformation

Proceedings of the 2022 International Conference on Management of Data (SIGMOD '22), 2022

Abstract
Program synthesis from examples (a.k.a. programming-by-example, PBE) has been deployed in several widely used commercial products, such as Microsoft Excel, Power BI, and Google Spreadsheet, due to its effectiveness and user-friendliness. It takes a few user-provided positive and negative examples as input and produces a program that is consistent with all the examples, which helps end users wrangle messy text without writing any code. In this paper, we focus on two text wrangling tasks: string filtering and string transformation. Existing PBE systems for string filtering do not scale well with negative examples, because they first explicitly synthesize all the consistent programs and then greedily search for a good one among them. When there are negative examples, however, the number of consistent programs can be exponential, so synthesizing all of them can take exponential time and space. In contrast, we propose to first synthesize all the programs consistent with the positive examples and then, during the search step, lazily determine on demand whether a program is also consistent with all the negative examples. For this purpose, we develop a dynamic programming algorithm that searches for the optimal consistent program. Many programs are never explored during dynamic programming because they are dominated by better consistent programs. As for string transformation, existing PBE systems do not support negative examples at all; our approach extends naturally to this task. Experimental results show that our methods significantly outperform state-of-the-art string filtering and transformation approaches and achieve better scalability.
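
The lazy treatment of negative examples can be illustrated with a small Python sketch. It is not the paper's algorithm: the dynamic programming search is replaced here by a plain cost-ordered scan, and the candidate programs, their costs, and the names `CANDIDATES`, `accepts`, and `lazy_search` are all hypothetical, chosen only for illustration. The sketch shows the one idea stated in the abstract: candidates are first restricted to those consistent with the positive examples, and a candidate is tested against the negative examples only when the search actually reaches it.

```python
import re
from typing import Iterator, List, Optional, Tuple

# Hypothetical candidate filter programs as (cost, regex) pairs.
# Lower cost means "preferred"; the costs are illustrative only.
CANDIDATES: List[Tuple[int, str]] = [
    (1, r".*"),
    (2, r"\d+.*"),
    (3, r"\d{4}-.*"),
    (4, r"\d{4}-\d{2}-\d{2}"),
]


def accepts(pattern: str, text: str) -> bool:
    """Return True if the filter program (a regex here) accepts the whole string."""
    return re.fullmatch(pattern, text) is not None


def consistent_with_positives(
    candidates: List[Tuple[int, str]], positives: List[str]
) -> Iterator[Tuple[int, str]]:
    """Yield, in increasing cost order, candidates that accept every positive example."""
    for cost, pattern in sorted(candidates):
        if all(accepts(pattern, p) for p in positives):
            yield cost, pattern


def lazy_search(
    candidates: List[Tuple[int, str]],
    positives: List[str],
    negatives: List[str],
) -> Optional[str]:
    """Return the cheapest candidate consistent with all examples, or None.

    Negative examples are checked on demand, only for the candidate the
    search is currently considering; the scan stops at the first success.
    """
    for _cost, pattern in consistent_with_positives(candidates, positives):
        if not any(accepts(pattern, n) for n in negatives):
            return pattern
    return None


positives = ["2022-06-12", "2021-01-03"]
negatives = ["2022-xx", "12345"]
best = lazy_search(CANDIDATES, positives, negatives)
print(best)
```

With no negative examples, the cheapest candidate `.*` is returned immediately; with the two negative examples above, the first three candidates are rejected one at a time and the scan stops at the date pattern without ever materializing the full set of consistent programs.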
Keywords
program synthesis, data wrangling, string transformation, filtering