Paper 2025/2256

Scalable Private Set Intersection over Distributed Encrypted Data

Seunghun Paik, Hanyang University
Nirajan Koirala, University of Notre Dame
Jack Nero, University of Notre Dame
Hyunjung Son, Hanyang University
Yunki Kim, Hanyang University
Jae Hong Seo, Hanyang University
Taeho Jung, University of Notre Dame
Abstract

Finding intersections across sensitive data is a core operation in many real-world data-driven applications, such as healthcare, anti-money laundering, financial fraud, or watchlist applications. These applications often require large-scale collaboration across thousands or more independent sources, such as hospitals, financial institutions, or identity bureaus, where all records must remain encrypted during storage and computation, and are typically outsourced to dedicated/cloud servers. Such a highly distributed, large-scale, and encrypted setting makes it very challenging to apply existing solutions, e.g., (multi-party) private set intersection (PSI) or private membership test (PMT). In this paper, we present Distributed and Outsourced PSI (DO-PSI), an efficient and scalable PSI protocol over outsourced, encrypted, and highly distributed datasets. Our key technique lies in a generic threshold fully homomorphic encryption (FHE) based framework that aggregates equality results additively, which ensures high scalability to a large number of data sources. In addition, we propose a novel technique called \textit{nonzero-preserving mapping}, which maps a zero vector to zero and preserves nonzero values. This allows homomorphic equality tests over a smaller base field, substantially reducing computation while enabling higher-precision representations. We implement DO-PSI and conduct extensive experiments, showing that ours substantially outperforms existing methods in both computation and communication overheads. Our protocol handles a billion-scale set distributed and outsourced to a thousand data owners within one minute, directly reflecting large-scale deployment scenarios, and achieves up to an 11.16$\times$ improvement in end-to-end latency over prior state-of-the-art methods.

Note: Full Version

Metadata
Available format(s)
PDF
Category
Cryptographic protocols
Publication info
Published elsewhere. Minor revision. ASIACCS'26
DOI
https://doi.org/10.1145/3779208.3785272
Keywords
Private Set IntersectionFully Homomorphic EncryptionEncrypted Datasets
Contact author(s)
whitesoonguh @ hanyang ac kr
nkoirala @ nd edu
jnero @ nd edu
dk9050rx @ hanyang ac kr
yunki @ hanyang ac kr
jaehongseo @ hanyang ac kr
tjung @ nd edu
History
2025-12-18: approved
2025-12-16: received
See all versions
Short URL
https://ia.cr/2025/2256
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2025/2256,
      author = {Seunghun Paik and Nirajan Koirala and Jack Nero and Hyunjung Son and Yunki Kim and Jae Hong Seo and Taeho Jung},
      title = {Scalable Private Set Intersection over Distributed Encrypted Data},
      howpublished = {Cryptology {ePrint} Archive, Paper 2025/2256},
      year = {2025},
      doi = {https://doi.org/10.1145/3779208.3785272},
      url = {https://eprint.iacr.org/2025/2256}
}
Note: In order to protect the privacy of readers, eprint.iacr.org does not use cookies or embedded third party content.