A Fast Big Data Collection System Using Mapreduce Framework

2014 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS)(2014)

Abstract
Social networks, as corpora of valuable data, have attracted much attention in recent years from researchers in various fields, especially in big data analytics. However, efficient and accurate data collection, the foundation of such analytics, has received little attention in previously published work. As data on the web grows rapidly, this article identifies two major challenges that traditional distributed web crawler systems cannot meet: handling the big data in social networks quickly, and supporting multiple web sources with a unified collection model. To address these two challenges and thereby build a foundation for big data analytics, this article proposes an Ontology-based adaptive web crawler system called the OACM system. It uses the MapReduce model to balance processing resources effectively and so speed up the collection procedure, and it designs a unified Ontology model to estimate the semantic content of both social networks and collection tasks in order to adapt to different web sources. In a set of experiments, the proposed OACM system scheduled system resources efficiently and collected a large amount of data from multiple web sources.
Keywords
Social Network, Big Data Analytics, Web Crawler, Ontology Model, MapReduce
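
The abstract describes using the MapReduce model to spread collection work across web sources. As a rough illustration only, the sketch below shows the general map, shuffle, and reduce pattern applied to a list of crawl targets. It is not the OACM system's implementation: the function names (`map_task`, `shuffle`, `reduce_task`, `plan_crawl`) and the choice of the URL host as the grouping key are assumptions made for this example, and everything runs in a single process rather than on a cluster.

```python
from collections import defaultdict
from urllib.parse import urlparse


def map_task(url):
    """Map step (hypothetical): emit a (host, url) pair for one crawl target."""
    host = urlparse(url).netloc
    return host, url


def shuffle(pairs):
    """Shuffle step: group the emitted values by key, as a MapReduce runtime would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_task(host, urls):
    """Reduce step (hypothetical): drop duplicate URLs for one web source."""
    return host, sorted(set(urls))


def plan_crawl(urls):
    """Run map, shuffle, and reduce in-process and return one work list per host."""
    pairs = [map_task(u) for u in urls]
    groups = shuffle(pairs)
    return dict(reduce_task(host, us) for host, us in groups.items())


if __name__ == "__main__":
    targets = [
        "http://a.example/1",
        "http://b.example/x",
        "http://a.example/1",
        "http://a.example/2",
    ]
    for host, work in plan_crawl(targets).items():
        print(host, work)
```

Grouping by host is only one possible key; a real system could equally partition by task type or by estimated page cost to balance load.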