Blitz: A Preprocessor for Heuristically Detecting Context-Independent Linguistic Structures

msra

Abstract
etc., which are difficult to analyze with conventional linguistic parsers. Blitz, a heuristic-based natural language preprocessor, has been integrated into the START Natural Language System [4], considerably improving START's ability to analyze real-world sentences.

Motivation: Real-world sentences are populated with numerous constructions that do not submit neatly to regular linguistic parsing methods. To handle these constructions, natural language systems typically implement specialized new rules. This leads to a level of complexity which renders development and maintenance difficult. Because these constructions have highly regular forms, and can be largely understood in the absence of context, it is possible to shift the burden of processing away from the primary parser, and onto a simpler, faster, non-linguistic preprocessor.

Previous Work: There already exist several systems (such as [7, 9, 3, 2, 6, 8, 10, 1]) which specialize in the extraction of proper nouns and names. However, the focus of the Blitz system differs somewhat from these other systems. Blitz was designed to handle not only proper nouns, but the entire spectrum of special constructions, and to assist in the goal of natural language understanding, as opposed to previous systems' somewhat less ambitious goals of automatic indexing, keyword extraction, and summary generation.

Approach: Blitz uses very simple heuristic rules to extract the above-mentioned constructions from free text and return results in a uniform structure. Ultimately, all information is passed back to START (or any natural language system), endowing it with the ability to understand sentences that it otherwise would not be able to understand.
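The general idea of a heuristic preprocessor that returns its findings in a uniform structure can be illustrated with a minimal Python sketch. This is not Blitz's implementation: the abstract does not describe its rules, so the three construction types (`date`, `money`, `proper_noun`), the regular expressions, and the names `Construction`, `RULES`, and `preprocess` are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Construction:
    """Uniform record returned for every detected construction (hypothetical)."""
    kind: str   # label of the rule that fired
    text: str   # matched surface string
    start: int  # character offset where the match begins
    end: int    # character offset just past the match


MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Hypothetical context-independent rules, listed in priority order.
RULES = [
    ("date", re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")),
    ("money", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")),
    ("proper_noun", re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")),
]


def preprocess(sentence: str) -> list[Construction]:
    """Apply each rule in turn and return matches sorted by position."""
    found: list[Construction] = []
    for kind, pattern in RULES:
        for m in pattern.finditer(sentence):
            # Skip a span that overlaps one claimed by a higher-priority rule.
            if any(m.start() < f.end and f.start < m.end() for f in found):
                continue
            found.append(Construction(kind, m.group(), m.start(), m.end()))
    return sorted(found, key=lambda f: f.start)


if __name__ == "__main__":
    for item in preprocess("John Smith paid $1,250.50 on March 3, 1998."):
        print(item)
```

Because every rule yields the same record type, a downstream parser could in principle treat each detected span as a single pre-analyzed unit, whichever rule produced it.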