Dimensionality Reduction for Low-Latency High-Throughput Fraud Detection on Datastreams.

ICMLA (2019)

Abstract
Given exponential data growth and the recent focus on understanding high-dimensional "in-motion" data, fundamental machine learning tools, such as Principal Component Analysis (PCA), require computation-efficient streaming algorithms that operate in near real time. Despite the different streaming PCA flavors, there is no algorithm that provably recovers the principal components in the same precision regime as the batch PCA algorithm does while maintaining low-latency and high-throughput processing. This work introduces a novel temporal accumulate/retract learning framework for streaming PCA. We consider the accumulate/retract framework implementation of several competitive PCA algorithms with proven theoretical advantages. We benchmark the improved PCA algorithms on real-world streams (i.e., bank transaction fraud detection) and prove their low-latency (millisecond-level) and high-throughput (thousands of events per second) processing guarantees.
Keywords
Stream Processing,PCA,Online Learning
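
The abstract does not spell out how the accumulate/retract framework works, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes a count-based sliding window: each arriving event is accumulated into running first and second moments, and each expired event is retracted from them, so the window covariance is available without rescanning the window. The class name `SlidingWindowPCA`, its parameters, and the use of a full eigendecomposition are all hypothetical choices made for illustration.

```python
from collections import deque

import numpy as np


class SlidingWindowPCA:
    """Hypothetical sketch: PCA over the last `window` events.

    Assumption: "accumulate / retract" is read here as adding a new
    event's contribution to running sums and subtracting the
    contribution of the event that leaves the window.
    """

    def __init__(self, dim, window):
        self.window = window
        self.events = deque()                   # events currently in the window
        self.sum_x = np.zeros(dim)              # running sum of x
        self.sum_outer = np.zeros((dim, dim))   # running sum of x x^T

    def accumulate(self, x):
        """Add one event's contribution to the running moments."""
        self.events.append(x)
        self.sum_x += x
        self.sum_outer += np.outer(x, x)

    def retract(self):
        """Remove the oldest event's contribution from the running moments."""
        x = self.events.popleft()
        self.sum_x -= x
        self.sum_outer -= np.outer(x, x)

    def update(self, x):
        """Process one stream event, keeping at most `window` events."""
        x = np.asarray(x, dtype=float)
        self.accumulate(x)
        if len(self.events) > self.window:
            self.retract()

    def components(self, k):
        """Return the top-k eigenvalues and eigenvectors of the window covariance."""
        n = len(self.events)
        mean = self.sum_x / n
        cov = self.sum_outer / n - np.outer(mean, mean)  # population covariance
        eigvals, eigvecs = np.linalg.eigh(cov)           # ascending order
        order = np.argsort(eigvals)[::-1][:k]
        return eigvals[order], eigvecs[:, order]
```

Each `update` costs O(d²) for a d-dimensional event regardless of window length, which is the property that makes this style of update attractive for low-latency streams. The eigendecomposition in `components` costs O(d³) and, in this sketch, is recomputed from scratch on every call; an incremental PCA variant would replace that step.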