Intoday’sdigitalsociety,electronicinformationisincreasinglyshared among different entities, and decisions are often made based on common at- tributes. In order to address related privacy concerns, the research community has proposed a few cryptographic techniques for limited (i.e., privacy-preserving) sharing of sensitive information. One interesting, but often overlooked, problem occurs when two mutually distrustful parties need to assess the similarity of their information sets, without disclosing their content. This paper presents the first efficient and provably-secure construction for privacy-preserving evaluation of sample set similarity, measured as per the Jaccard similarity index. We propose two efficient protocols: the first securely computes the Jaccard index of two sets, the second approximates it using MinHash techniques, at a significantly lower cost, with the same privacy guarantees. We show that our novel protocols are attractive in many compelling applications, including document similarity, bio- metric authentication, genetic tests, multimedia file similarity. Finally, we demon- strate, both analytically and experimentally, that our constructions are appreciably more efficient than prior work.
EsPRESSo: Efficient Privacy-Preserving Evaluation of Sample Set Similarity
BLUNDO, Carlo;
2013-01-01
Abstract
Intoday’sdigitalsociety,electronicinformationisincreasinglyshared among different entities, and decisions are often made based on common at- tributes. In order to address related privacy concerns, the research community has proposed a few cryptographic techniques for limited (i.e., privacy-preserving) sharing of sensitive information. One interesting, but often overlooked, problem occurs when two mutually distrustful parties need to assess the similarity of their information sets, without disclosing their content. This paper presents the first efficient and provably-secure construction for privacy-preserving evaluation of sample set similarity, measured as per the Jaccard similarity index. We propose two efficient protocols: the first securely computes the Jaccard index of two sets, the second approximates it using MinHash techniques, at a significantly lower cost, with the same privacy guarantees. We show that our novel protocols are attractive in many compelling applications, including document similarity, bio- metric authentication, genetic tests, multimedia file similarity. Finally, we demon- strate, both analytically and experimentally, that our constructions are appreciably more efficient than prior work.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.