Lexical Parsing Expression Recognition Schemata

ASWEC(2015)

引用 0|浏览25
暂无评分
摘要
Parsing expression grammars (PEGs) have emerged as a promising substitute for context-free grammars (CFGs) and regular expressions (REs) in programming language specification. The benefits of PEGs are twofold. First, parsing expression grammars replace unordered choice between alternatives by prioritized choice, which naturally solves the ubiquitous "dangling else" problem in grammar definitions. Second, PEGs employ "character-level syntax" specifications that eliminate the need to separate the lexical and hierarchical components of a language specification. However, there is "no free lunch" in PEGs. PEGs capture only syntactic relationships, but many language constructs cannot be parsed without additional semantic information. Moreover, character-level specifications can become unwieldy, as every aspect of the language, including spacing, has to be accounted for. To overcome these issues, we extend the original PEG formalism to incorporate semantic predicates that yield a programmatic means for state-based token recognition control. Furthermore, rather than requiring a single complete specification, we capture lexical components as PEG closures that provide a self-contained token recognition mechanism to reduce the clutter associated with purely character-level PEGs. To test the effectiveness of our approach, we use it for the construction of a Delphi language front-end and practically confirm that Ford's theoretical linear-time result also holds for PEG closures.
更多
查看译文
关键词
parsing expression grammars,semantic predicates,language composition,language processing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要