The Support Vector Machine (SVM) is one of the most powerful algorithms for machine learning and data mining in numerous and heterogenous application domains. However, in spite of its competitiveness, SVM suffers from scalability problems which drastically worsens its performance in terms of memory requirements and execution time. As a consequence, there is a strong emergence of approaches for supporting SVM in efficiently addressing the aforementioned problems without affecting its classification capabilities. In this scenario, methods for Training Set Selection (TSS) represent a suitable and consolidated pre-processing technique to compute a reduced but representative training dataset, and improve SVM's scalability without deprecating its classification accuracy. Recently, TSS has been formulated as an optimization problem characterized by two objectives (the classification accuracy and the reduction rate) and solved through the application of evolutionary algorithms. However, so far, all the evolutionary approaches for TSS have been based on a so-called multi-objective a priori technique, where multiple objectives are aggregated together into a single objective through a weighted combination. This paper proposes to apply, for the first time, a Pareto-based multi-objective optimization approach to the TSS problem in order to explicitly deal with both its objectives and offer a better trade-off between SVM's classification and reduction performance. The benefits of the proposed approach are validated by a set of experiments involving well-known datasets taken from the UCI Machine Learning Database Repository. As shown by statistical tests, the application of a Pareto-based multi-objective optimization approach improves on state-of-the-art TSS techniques and enhances SVM efficiency.
A multi-objective evolutionary approach to training set selection for support vector machine
Genoveffa Tortora;Autilia Vitiello
2018-01-01
Abstract
The Support Vector Machine (SVM) is one of the most powerful algorithms for machine learning and data mining in numerous and heterogenous application domains. However, in spite of its competitiveness, SVM suffers from scalability problems which drastically worsens its performance in terms of memory requirements and execution time. As a consequence, there is a strong emergence of approaches for supporting SVM in efficiently addressing the aforementioned problems without affecting its classification capabilities. In this scenario, methods for Training Set Selection (TSS) represent a suitable and consolidated pre-processing technique to compute a reduced but representative training dataset, and improve SVM's scalability without deprecating its classification accuracy. Recently, TSS has been formulated as an optimization problem characterized by two objectives (the classification accuracy and the reduction rate) and solved through the application of evolutionary algorithms. However, so far, all the evolutionary approaches for TSS have been based on a so-called multi-objective a priori technique, where multiple objectives are aggregated together into a single objective through a weighted combination. This paper proposes to apply, for the first time, a Pareto-based multi-objective optimization approach to the TSS problem in order to explicitly deal with both its objectives and offer a better trade-off between SVM's classification and reduction performance. The benefits of the proposed approach are validated by a set of experiments involving well-known datasets taken from the UCI Machine Learning Database Repository. As shown by statistical tests, the application of a Pareto-based multi-objective optimization approach improves on state-of-the-art TSS techniques and enhances SVM efficiency.File | Dimensione | Formato | |
---|---|---|---|
post-print-KBS copia.pdf
accesso aperto
Tipologia:
Documento in Post-print (versione successiva alla peer review e accettata per la pubblicazione)
Licenza:
Creative commons
Dimensione
670.01 kB
Formato
Adobe PDF
|
670.01 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.