A pattern tree-based approach to learning URL normalization rules
WWW, pp. 611-620, 2010.
Keywords: pattern tree-based approach, URL normalization, duplicate URLs, local duplicate pair, URL pattern tree
Duplicate URLs cause serious problems across the whole pipeline of a search engine, from crawling and indexing to result serving. URL normalization transforms duplicate URLs into a canonical form using a set of rewrite rules. It has recently attracted significant attention because it is lightweight and can be flexibly integr...
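To make the idea of rewrite rules concrete, here is a minimal Python sketch of applying a fixed rule set so that duplicate URLs collapse to one canonical form. It does not reproduce the paper's method: the paper learns its rules, whereas the rules here are hand-written. The names `IRRELEVANT_PARAMS`, `lowercase_host`, `drop_irrelevant_params`, `normalize`, and `group_duplicates`, as well as the example URLs, are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical rule data: query parameters assumed not to change page content.
IRRELEVANT_PARAMS = {"sid", "ref"}


def lowercase_host(url: str) -> str:
    """Rule 1: lowercase the network location (simplification: this also
    lowercases any user-info or port text inside the netloc)."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=parts.netloc.lower()))


def drop_irrelevant_params(url: str) -> str:
    """Rule 2: remove parameters listed in IRRELEVANT_PARAMS and sort the rest
    so that parameter order no longer distinguishes two URLs."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IRRELEVANT_PARAMS
    ]
    kept.sort()
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Rules are applied in order; each maps a URL string to a URL string.
RULES = [lowercase_host, drop_irrelevant_params]


def normalize(url: str) -> str:
    """Apply every rewrite rule in sequence and return the canonical form."""
    for rule in RULES:
        url = rule(url)
    return url


def group_duplicates(urls):
    """Group URLs that share the same canonical form."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize(url), []).append(url)
    return groups


if __name__ == "__main__":
    sample = [
        "http://Example.com/item?id=7&sid=abc",
        "http://example.com/item?ref=home&id=7",
        "http://example.com/item?id=8",
    ]
    for canonical, members in group_duplicates(sample).items():
        print(canonical, "<-", members)
```

In this sketch the first two sample URLs fall into one group, because both rewrite to the same canonical string. A crawler or indexer could use that canonical string as its deduplication key.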