The weighted possibilistic c-means algorithm is an important soft clustering technique for big data analytics with cloud computing. However, the private data will be disclosed when the raw data is directly uploaded to cloud for efficient clustering. In this paper, a secure weighted possibilistic c-means algorithm based on the BGV encryption scheme is proposed for big data clustering on cloud. Specially, BGV is used to encrypt the raw data for the privacy preservation on cloud. Furthermore, the Taylor theorem is used to approximate the functions for calculating the weight value of each object and updating the membership matrix and the cluster centers as the polynomial functions which only include addition and multiplication operations such that the weighed possibilistic c-means algorithm can be securely and correctly performed on the encrypted data in cloud. Finally, the presented scheme is estimated on two big datasets, i.e., eGSAD and sWSN, by comparing with the traditional weighted possibilistic c-means method in terms of effectiveness, efficiency and scalability. The results show that the presented scheme performs more efficiently than the traditional weighted possiblistic c-means algorithm and it achieves a good scalability on cloud for big data clustering.

Secure weighted possibilistic c-means algorithm on cloud for clustering big data

Castiglione, Arcangelo;
2018-01-01

Abstract

The weighted possibilistic c-means algorithm is an important soft clustering technique for big data analytics with cloud computing. However, the private data will be disclosed when the raw data is directly uploaded to cloud for efficient clustering. In this paper, a secure weighted possibilistic c-means algorithm based on the BGV encryption scheme is proposed for big data clustering on cloud. Specially, BGV is used to encrypt the raw data for the privacy preservation on cloud. Furthermore, the Taylor theorem is used to approximate the functions for calculating the weight value of each object and updating the membership matrix and the cluster centers as the polynomial functions which only include addition and multiplication operations such that the weighed possibilistic c-means algorithm can be securely and correctly performed on the encrypted data in cloud. Finally, the presented scheme is estimated on two big datasets, i.e., eGSAD and sWSN, by comparing with the traditional weighted possibilistic c-means method in terms of effectiveness, efficiency and scalability. The results show that the presented scheme performs more efficiently than the traditional weighted possiblistic c-means algorithm and it achieves a good scalability on cloud for big data clustering.
2018
File in questo prodotto:
File Dimensione Formato  
secure_preprint.pdf

accesso aperto

Tipologia: Documento in Pre-print (manoscritto inviato all'editore, precedente alla peer review)
Licenza: Creative commons
Dimensione 757.96 kB
Formato Adobe PDF
757.96 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11386/4705962
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 36
  • ???jsp.display-item.citation.isi??? 27
social impact