A system for automated content organization (2006)

Abstract
The main goal of Information Retrieval (IR) is to facilitate information access from large document collections. Starting from a user's query, usually expressed in natural language, a classic IR system retrieves a set of items relevant to the query and displays them as a ranked list. Search engines are examples of IR systems. They are effective at finding specific items, but search results for less specific information tend to be off-target, overwhelming, and less useful. In this thesis, we report the design, prototyping, and experiences of an experimental system called ACOSys for automated organization of content using menu/folder hierarchies based on a mathematical theory called Formal Concept Analysis (FCA). ACOSys uses the concept of FCA and the structure of a hierarchical menu to categorize search results into more specific groups. Relevant items can then be found by quickly zeroing in on the subfolders where they may reside, saving the effort of browsing through thousands of off-target items. The technical contribution of this thesis consists of the design and implementation of algorithms in three related categories. First, we develop the principle and the coding of a new algorithm for generating concepts and rules. We show through both theoretical and practical study that it is efficient for both sparse and dense contexts. Second, we develop an algorithm for maintaining and updating concept lattices. In comparison to other incremental algorithms, our algorithm updates not only the concept set but also the menu/folder structure, so additional items can be added incrementally rather than by rebuilding the whole structure. Third, instead of using simple string matching, we provide a semi-automated process for keyword selection, in which a user makes decisions based on measures such as the word-distribution statistics of a collection.
By using the indexing strategy of Berkeley DB (a database system), context-sensitive menu hierarchies are constructed in seconds, making ACOSys practical for large numbers of objects and attributes.
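
To make the notion of a formal concept more concrete, the following is a minimal Python sketch of concept generation for a tiny formal context. It is not the algorithm developed in the thesis: it is a naive enumeration that closes every subset of objects, which takes exponential time and is only meant to illustrate what a concept (an extent paired with its intent) is. The document names, the keywords, and the function names `common_attributes`, `common_objects`, and `formal_concepts` are all hypothetical.

```python
from itertools import combinations


def common_attributes(context, objects):
    """Return the attributes shared by every object in `objects`.

    For an empty object set this is the full attribute set, by convention.
    """
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(context, attributes):
    """Return the objects that carry every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Naively enumerate all formal concepts as (extent, intent) pairs.

    Illustration only: every subset of objects is closed, so the running
    time is exponential in the number of objects.
    """
    concepts = set()
    objects = sorted(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(context, subset)
            extent = common_objects(context, intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical context: documents (objects) and their keywords (attributes).
context = {
    "doc1": {"lattice", "algorithm"},
    "doc2": {"lattice", "menu"},
    "doc3": {"algorithm"},
}

concepts = formal_concepts(context)
# Print concepts from the most general (largest extent) to the most specific.
for extent, intent in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(extent), sorted(intent))
```

In a menu/folder reading of this sketch, each concept would correspond to a folder labeled by its intent and containing the items in its extent, with more specific concepts nested under more general ones. How ACOSys actually maps concepts to folders is not specified in the abstract.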
Keywords
hierarchical menu, incremental algorithm, context sensitive menu hierarchy, concept set, concept lattice, new algorithm, classic IR system, automated content organization, efficient algorithm, IR Systems, search result