Knock It Off: Profiling The Online Storefronts Of Counterfeit Merchandise

KDD '14: The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining New York New York USA August, 2014(2014)

引用 32|浏览94
暂无评分
摘要
We describe an automated system for the large-scale monitoring of Web sites that serve as online storefronts for spam-advertised goods. Our system is developed from an extensive crawl of black-market Web sites that deal in illegal pharmaceuticals, replica luxury goods, and counterfeit software. The operational goal of the system is to identify the affiliate programs of online merchants behind these Web sites; the system itself is part of a larger effort to improve the tracking and targeting of these affiliate programs. There are two main challenges in this domain. The first is that appearances can be deceiving: Web pages that render very differently are often linked to the same affiliate program of merchants. The second is the difficulty of acquiring training data: the manual labeling of Web pages, though necessary to some degree, is a laborious and time-consuming process. Our approach in this paper is to extract features that reveal when Web pages linked to the same affiliate program share a similar underlying structure. Using these features, which are mined from a small initial seed of labeled data, we are able to profile the Web sites of forty-four distinct affiliate programs that account, collectively, for hundreds of millions of dollars in illicit e-commerce. Our work also highlights several broad challenges that arise in the large-scale, empirical study of malicious activity on the Web.
更多
查看译文
关键词
Web page classification,Email spam
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要