Approximate Distributed Joins in Apache Spark

arXiv: Distributed, Parallel, and Cluster Computing, Volume abs/1805.05874, 2018.

Abstract:

The join operation is a fundamental building block of parallel data processing. Unfortunately, computing an equi-join across massive datasets is very resource-intensive. The approximate computing paradigm allows users to trade accuracy for lower latency in expensive data processing operations. The equi-join operator is thus a natural candi...
