Streaming Algorithms for Diversity Maximization with Fairness Constraints
IEEE International Conference on Data Engineering (ICDE), 2022 (CCF A)
School of Data Science and Engineering | Department of Information and Communication Technologies | Department of Computer Science
Abstract
Diversity maximization is a fundamental problem with wide applications in data summarization, web search, and recommender systems. Given a set $X$ of $n$ elements, it asks to select a subset $S$ of $k \ll n$ elements with maximum diversity, as quantified by the dissimilarities among the elements in $S$. In this paper, we focus on the diversity maximization problem with fairness constraints in the streaming setting. Specifically, we consider the max-min diversity objective, which selects a subset $S$ that maximizes the minimum distance (dissimilarity) between any pair of distinct elements within it. Assuming that the set $X$ is partitioned into $m$ disjoint groups by some sensitive attribute, e.g., sex or race, ensuring fairness requires that the selected subset $S$ contains $k_i$ elements from each group $i \in [1, m]$. A streaming algorithm should process $X$ sequentially in one pass and return a subset with maximum diversity while guaranteeing the fairness constraint. Although diversity maximization has been extensively studied, the only known algorithms that can work with the max-min diversity objective and fairness constraints are very inefficient for data streams. Since diversity maximization is NP-hard in general, we propose two approximation algorithms for fair diversity maximization in data streams, the first of which is $\frac{1-\varepsilon}{4}$-approximate and specific for $m = 2$, where $\varepsilon \in (0,1)$, and the second of which achieves a $\frac{1-\varepsilon}{3m+2}$-approximation for an arbitrary $m$. Experimental results on real-world and synthetic datasets show that both algorithms provide solutions of comparable quality to the state-of-the-art algorithms while running several orders of magnitude faster in the streaming setting.
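To make the objective and the fairness constraint concrete, here is a minimal Python sketch of the problem definition only. It is not the paper's streaming algorithm: it is an exhaustive offline search that takes exponential time and is useful only for checking tiny instances. The function names (`div`, `is_fair`, `brute_force_fair_div`), the use of Euclidean distance, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
import math


def div(points):
    """Max-min diversity: the smallest pairwise distance within the subset.

    Euclidean distance is an assumption; any metric could be substituted.
    """
    return min(math.dist(p, q) for p, q in combinations(points, 2))


def is_fair(groups, quotas):
    """True if the subset holds exactly quotas[g] elements from each group g."""
    counts = Counter(groups)
    return set(counts) <= set(quotas) and all(
        counts.get(g, 0) == k for g, k in quotas.items()
    )


def brute_force_fair_div(X, group_of, quotas):
    """Exhaustive offline baseline (hypothetical helper, exponential time)."""
    k = sum(quotas.values())
    best, best_val = None, -1.0
    for idx in combinations(range(len(X)), k):
        if not is_fair([group_of[i] for i in idx], quotas):
            continue
        val = div([X[i] for i in idx])
        if val > best_val:
            best, best_val = idx, val
    return best, best_val


# Toy instance: three elements in group 0, one in group 1 (m = 2).
X = [(0, 0), (10, 0), (0, 10), (1, 1)]
group_of = [0, 0, 0, 1]
quotas = {0: 2, 1: 1}  # k_0 = 2, k_1 = 1, so k = 3

best, best_val = brute_force_fair_div(X, group_of, quotas)
unconstrained = div(X[:3])  # best size-3 subset when fairness is ignored
```

On this toy instance the fair optimum is the index set `(1, 2, 3)` with diversity $\sqrt{82} \approx 9.06$, whereas the unconstrained optimum `(0, 1, 2)` reaches 10 but contains no element of group 1. The gap illustrates that enforcing the quotas $k_i$ can lower the achievable max-min diversity.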
Key words
algorithmic fairness, diversity maximization, max-min dispersion, streaming algorithm