
Matching centralized learning performance via compressed decentralized learning with error feedback

Carpentiero M.; Matta V.
2024

Abstract

The DEF-ATC (Differential Error Feedback - Adapt Then Combine) approach is a novel strategy for addressing decentralized learning and optimization problems under communication constraints. The strategy blends differential quantization and error feedback to mitigate the negative impact of exchanging compressed updates between neighboring agents. Differential quantization exploits the correlation between successive iterates, while error feedback (which consists of incorporating the compression error into subsequent steps) compensates for the bias caused by compression. In this work, we examine the steady-state mean-square-error performance of the DEF-ATC approach to uncover how several factors, including the gradient noise, the network topology, the learning step-size, and the compression scheme, influence network performance. The theoretical findings indicate that, under general conditions on the compression error and in the small step-size regime, it is possible to achieve performance comparable to that obtained without compression. This implies that, despite using compression to reduce communication overhead, the decentralized compressed approach can still match its uncompressed counterpart, which in turn can match centralized learning, where all data are aggregated and processed at a single location.
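To make the two mechanisms described in the abstract concrete, the following is a minimal, illustrative sketch of an adapt-then-combine recursion with differential quantization and error feedback on a toy decentralized least-squares problem. All names, dimensions, the uniform quantizer, and the fully connected averaging step are assumptions for illustration; this is not the paper's exact DEF-ATC recursion.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(x, step=0.25):
    # Deterministic uniform quantizer: a stand-in for any compression
    # operator with bounded error.
    return step * np.round(x / step)

# Toy setup (illustrative): N agents estimate a common model w_star
# from noisy linear measurements.
N, d, mu = 5, 3, 0.05                  # agents, model size, step-size
w_star = rng.standard_normal(d)        # common model to estimate
w = np.zeros((N, d))                   # combined iterates
q = np.zeros((N, d))                   # quantized reconstructions known network-wide
e = np.zeros((N, d))                   # error-feedback accumulators

for _ in range(3000):
    # Adapt: one stochastic-gradient (LMS-type) step per agent.
    H = rng.standard_normal((N, d))
    y = H @ w_star + 0.01 * rng.standard_normal(N)
    psi = w + mu * (y - np.sum(H * w, axis=1))[:, None] * H

    # Differential quantization + error feedback: compress the innovation
    # (difference from the last reconstruction) plus the accumulated
    # compression error, and transmit only the quantized increment.
    innov = psi - q + e
    delta = quantize(innov)            # only delta is communicated
    e = innov - delta                  # feed the compression error forward
    q = q + delta                      # neighbors rebuild the same q

    # Combine: uniform averaging (a doubly stochastic combination matrix
    # over a fully connected topology, for simplicity).
    w = np.tile(q.mean(axis=0), (N, 1))

msd = np.mean(np.sum((w - w_star) ** 2, axis=1))
```

The key point the sketch illustrates is the one made in the abstract: because only the small innovation `psi - q` is quantized, and because the leftover error `e` is re-injected at the next step, the compression bias does not accumulate, and the steady-state deviation stays close to what the uncompressed recursion would achieve.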

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4926498
