Learning Good Generators for Property-Based Testing in Deductive Program Verifiers

semanticscholar(2022)

引用 0|浏览1
暂无评分
摘要
Deductive Program Verifiers like Dafny and Frama-C have proven themselves both useful and powerful tools in the fight against un-verified software. However, their adoption is hindered by the fact that graduate-level expertise in verification is required in order to verify any non-trivial program. One approach to lowering the barrier to entry for deductive verifiers is to incorporate lighter-weight formal methods into the mix. While SMT based verification is very powerful, its nuances lead to much of the difficulty in using verifiers based on it, and so selectively substituting this verification technique with a less precise but easier-to-use formal method would lower the difficulty while retaining the same workflow and some of the theoretical guarantees. In this report, we investigate the use of Property-Based Testing in place of SMT solving as a core for program verification tools. Property-Based Testing (PBT) [2] is a lightweight formal method wherein the specifications for a program are checked by randomly generating thousands of inputs, and ensuring that the program meets its specification by executing it at those inputs. While PBT doesn’t give full correctness guarantees like SMT solving does, it brings a level of confidence in a manner similar to Model Checking. Moreover, PBT is fully modular, and so it can be used selectively: randomly-tested code and SMT-verified code can coexist in the same file, enabling users to take a gradual verification approach by gradually replacing tested code with verified code. Further, PBT has significant usability benefits over SMT-based verification. A common complaint while using SMT-based verifiers is that it is often difficult to interpret the SMT output. When an SMT solver fails to verify a specification, it can be for one of two reasons: either (a) the specification is incorrect, or (b) the solver was simply unable to find a proof. These two results are often indistinguishable, as the solver may not return a counter-model indicating that the specification is falsifiable — in this case, the only course of action is to attempt to provide more hints to the solver in the hopes that it will eventually prove the property. With PBT, the only failure mode is through counterexamples, and so no interpretation is required. Moreover, the counter-examples in a PBT approach are more easily interpretable. With SMT, counter-examples are encoded in an SMT data format and are not easily lifted back to program values. With testing, the inputs were concrete to begin with, and so counter-examples can be presented as-is. Finally, PBT can actually aid in the development of SMT-verified programs by allowing the user to test specifications to gain confidence that they are true before spending costly time attempting to develop tricky loop invariants. In this report, we develop a simple deductive program verifier based on Dafny which is backed by PBT instead of SMT. In Section 2, we show how PBT can be used in place of an SMT solver to do deductive verification. This process is conceptually simple, but we will see that it hides a major challenge: generating thousands of independent random input sets which satisfy a function’s precondition is NP-hard. In fact, tackling a small variant of problem is the bulk of the content of this report, and its main technical contribution. We give a sketch of this contribution in this section. In brief, our technique first infers a set of “candidate" example-generators from the precondition in question, and then uses a reinforcement-learning-based online learning algorithm to find the best generator among the set. In Section 3, we introduce a subset of a domain-specific language called ALuck based on Luck [5], whose programs are generators. We discuss the semantics of these generators, and give some examples of good and bad generators. In Section 4, we present our algorithm for inferring candidate generators for a precondition. In Section 5, we discuss the online learning technique we employ to discover the best generator among our candidates. Finally, in Section 6, we discuss our implementation of this algorithm, present benchmarks, and discuss future work.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要