The Support Vector Machine (SVM) is one of the most powerful algorithms for machine learning and data mining in numerous and heterogenous application domains. However, in spite of its competitiveness, SVM suffers from scalability problems which drastically worsens its performance in terms of memory requirements and execution time. As a consequence, there is a strong emergence of approaches for supporting SVM in efficiently addressing the aforementioned problems without affecting its classification capabilities. In this scenario, methods for Training Set Selection (TSS) represent a suitable and consolidated pre-processing technique to compute a reduced but representative training dataset, and improve SVM's scalability without deprecating its classification accuracy. Recently, TSS has been formulated as an optimization problem characterized by two objectives (the classification accuracy and the reduction rate) and solved through the application of evolutionary algorithms. However, so far, all the evolutionary approaches for TSS have been based on a so-called multi-objective a priori technique, where multiple objectives are aggregated together into a single objective through a weighted combination. This paper proposes to apply, for the first time, a Pareto-based multi-objective optimization approach to the TSS problem in order to explicitly deal with both its objectives and offer a better trade-off between SVM's classification and reduction performance. The benefits of the proposed approach are validated by a set of experiments involving well-known datasets taken from the UCI Machine Learning Database Repository. As shown by statistical tests, the application of a Pareto-based multi-objective optimization approach improves on state-of-the-art TSS techniques and enhances SVM efficiency.

A multi-objective evolutionary approach to training set selection for support vector machine

Genoveffa Tortora;Autilia Vitiello
2018-01-01

Abstract

The Support Vector Machine (SVM) is one of the most powerful algorithms for machine learning and data mining in numerous and heterogenous application domains. However, in spite of its competitiveness, SVM suffers from scalability problems which drastically worsens its performance in terms of memory requirements and execution time. As a consequence, there is a strong emergence of approaches for supporting SVM in efficiently addressing the aforementioned problems without affecting its classification capabilities. In this scenario, methods for Training Set Selection (TSS) represent a suitable and consolidated pre-processing technique to compute a reduced but representative training dataset, and improve SVM's scalability without deprecating its classification accuracy. Recently, TSS has been formulated as an optimization problem characterized by two objectives (the classification accuracy and the reduction rate) and solved through the application of evolutionary algorithms. However, so far, all the evolutionary approaches for TSS have been based on a so-called multi-objective a priori technique, where multiple objectives are aggregated together into a single objective through a weighted combination. This paper proposes to apply, for the first time, a Pareto-based multi-objective optimization approach to the TSS problem in order to explicitly deal with both its objectives and offer a better trade-off between SVM's classification and reduction performance. The benefits of the proposed approach are validated by a set of experiments involving well-known datasets taken from the UCI Machine Learning Database Repository. As shown by statistical tests, the application of a Pareto-based multi-objective optimization approach improves on state-of-the-art TSS techniques and enhances SVM efficiency.
2018
File in questo prodotto:
File Dimensione Formato  
post-print-KBS copia.pdf

accesso aperto

Tipologia: Documento in Post-print (versione successiva alla peer review e accettata per la pubblicazione)
Licenza: Creative commons
Dimensione 670.01 kB
Formato Adobe PDF
670.01 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11386/4706191
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 34
  • ???jsp.display-item.citation.isi??? 33
social impact