Smart Approach to Reduce the Web Crawling Traffic of Existing System using HTML based Update File at Web Server

International Journal of Computer Applications(2010)

Abstract
A web crawler downloads information from the web. Because web pages change without notice, a crawler must frequently revisit websites to check for updates. An estimated 40% of current internet traffic is attributed to web crawling. In this paper we propose an UPDATE file that maintains the list of URLs of a website's updated pages. The file format is based on HTML. The crawler visits only the UPDATE file and does not need to revisit the full website to learn what has changed. The scheme can be implemented on today's systems with small modifications to the web application and the web crawler. We test the proposed method in a simulator, using a website of 13 pages for the experiment. The experimental results show that the scheme is very promising.
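
The abstract does not give the exact layout of the UPDATE file, so the following is only a minimal sketch of the idea, assuming the file lists each updated page as an ordinary HTML `<a href="...">` link. The names `UpdateFileParser`, `updated_urls`, and `SAMPLE_UPDATE_FILE`, and the `example.com` URLs, are hypothetical and not taken from the paper. The crawler side reads one file and obtains the list of pages it needs to fetch again, instead of revisiting every page of the site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class UpdateFileParser(HTMLParser):
    """Collects href values of <a> tags from a hypothetical HTML update file."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Assumption: each updated page appears as <a href="...">.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def updated_urls(update_html, base_url):
    """Return absolute URLs listed in the update file, in order, without duplicates."""
    parser = UpdateFileParser()
    parser.feed(update_html)
    parser.close()
    seen = set()
    result = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links against the file's URL
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# Hypothetical update file content; a real crawler would download this from the server.
SAMPLE_UPDATE_FILE = """
<html><body>
<a href="/news.html">news</a>
<a href="/products/list.html">products</a>
<a href="/news.html">news (listed twice)</a>
</body></html>
"""

if __name__ == "__main__":
    for url in updated_urls(SAMPLE_UPDATE_FILE, "http://example.com/update.html"):
        print(url)
```

In this sketch the crawler would re-download only the URLs returned by `updated_urls`; how the server keeps the file current, and whether it records modification times, is left open because the abstract does not say.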
Key words
web crawling traffic, update file, HTML