Synthetic Data for Fairness: Bias Mitigation in Facial Attribute Recognition
Cascone, Lucia; Di Maio, Marco; Loia, Vincenzo; Nappi, Michele; Pero, Chiara
2025
Abstract
Bias mitigation in soft facial attribute recognition remains a critical and underexplored challenge in the development of fair and accountable Artificial Intelligence (AI) systems. This work introduces a fully automated and reproducible pipeline for the controlled generation of synthetic facial images under demographic constraints. The framework integrates prompt-driven diffusion models with latent-space generative architectures to synthesize demographically balanced datasets, followed by a refinement stage and an automated annotation process that jointly provide facial attributes and demographic labels for fine-grained fairness analysis. Although designed to address disparities across gender, age, and ethnicity, the present study focuses on ethnicity-related bias as a representative case study, with experiments conducted on hair color prediction. Evaluations on the MAAD-Face benchmark with Slim-CNN, a lightweight yet competitive architecture, show that the proposed approach improves fairness metrics in hair color classification while maintaining competitive accuracy. The results highlight the potential of targeted synthetic augmentation to mitigate structural demographic imbalances, offering a generalizable strategy for advancing equity in biometric applications and supporting the design of more transparent and trustworthy AI systems.
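
As an illustrative sketch of the controlled generation stage, the snippet below balances a synthetic set by iterating over an ethnicity-by-hair-color grid and issuing one demographically constrained prompt per cell to an off-the-shelf diffusion pipeline. The checkpoint, prompt template, attribute lists, and per-cell quota are assumptions for illustration only; the abstract does not fix them, and the refinement and automated annotation stages are omitted here.

```python
# Minimal sketch of prompt-driven, demographically balanced synthesis.
# Checkpoint, prompts, attribute lists, and quotas are illustrative assumptions.
import itertools
from pathlib import Path

import torch
from diffusers import StableDiffusionPipeline

ETHNICITIES = ["African", "Asian", "Caucasian", "Latino", "Middle Eastern"]  # assumed groups
HAIR_COLORS = ["black", "blond", "brown", "gray"]                            # assumed attribute values
SAMPLES_PER_CELL = 50  # assumed quota that equalizes every (ethnicity, hair color) cell

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder checkpoint, not necessarily the one used
    torch_dtype=torch.float16,
).to("cuda")

out_dir = Path("synthetic_faces")
out_dir.mkdir(exist_ok=True)

for ethnicity, hair in itertools.product(ETHNICITIES, HAIR_COLORS):
    # The demographic constraint is carried entirely by the text prompt,
    # so each cell of the grid receives the same number of samples.
    prompt = (
        f"frontal portrait photo of a {ethnicity} person with {hair} hair, "
        "neutral expression, studio lighting"
    )
    for i in range(SAMPLES_PER_CELL):
        image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
        image.save(out_dir / f"{ethnicity}_{hair}_{i:04d}.png")
```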
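
For the fairness analysis, a minimal sketch of per-group reporting is given below: accuracy and true-positive rate for a binary hair-color attribute, grouped by ethnicity, with the worst-case accuracy gap as a summary value. The metric choices and the toy labels are placeholders and do not reproduce the exact metrics reported on MAAD-Face.

```python
# Sketch of per-ethnicity fairness reporting for a binary hair-color attribute.
# Metrics and data below are illustrative, not the paper's reported results.
from collections import defaultdict

def group_rates(y_true, y_pred, groups):
    """Return per-group accuracy and true-positive rate."""
    stats = defaultdict(lambda: {"correct": 0, "total": 0, "tp": 0, "pos": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        s = stats[g]
        s["total"] += 1
        s["correct"] += int(t == p)
        if t == 1:
            s["pos"] += 1
            s["tp"] += int(p == 1)
    return {
        g: {
            "accuracy": s["correct"] / s["total"],
            "tpr": s["tp"] / s["pos"] if s["pos"] else float("nan"),
        }
        for g, s in stats.items()
    }

# Toy example: ground-truth and predicted "blond hair" labels with ethnicity tags.
rates = group_rates(
    y_true=[1, 0, 1, 1, 0, 1],
    y_pred=[1, 0, 0, 1, 0, 1],
    groups=["Asian", "Asian", "African", "African", "Caucasian", "Caucasian"],
)
acc_gap = max(r["accuracy"] for r in rates.values()) - min(r["accuracy"] for r in rates.values())
print(rates)
print("worst-case accuracy gap:", acc_gap)
```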


