Flaky tests are software tests that exhibit a seemingly random outcome (pass or fail) despite exercising unchanged code. In this work, we examine the perceptions of software developers about the nature, relevance, and challenges of flaky tests. We asked 21 professional developers to classify 200 flaky tests they previously fixed, in terms of the nature and the origin of the flakiness, as well as of the fixing effort. We also examined developers' fixing strategies. Subsequently, we conducted an online survey with 121 developers with a median industrial programming experience of five years. Our research shows that: The flakiness is due to several different causes, four of which have never been reported before, despite being the most costly to fix; flakiness is perceived as significant by the vast majority of developers, regardless of their team's size and project's domain, and it can have effects on resource allocation, scheduling, and the perceived reliability of the test suite; and the challenges developers report to face regard mostly the reproduction of the flaky behavior and the identification of the cause for the flakiness. Public preprint [http://arxiv.org/abs/1907.01466], data and materials [https://doi.org/10.5281/zenodo.3265785].

Understanding Flaky Tests: The Developer's Perspective

Fabio Palomba;
2019-01-01

Abstract

Flaky tests are software tests that exhibit a seemingly random outcome (pass or fail) despite exercising unchanged code. In this work, we examine the perceptions of software developers about the nature, relevance, and challenges of flaky tests. We asked 21 professional developers to classify 200 flaky tests they previously fixed, in terms of the nature and the origin of the flakiness, as well as of the fixing effort. We also examined developers' fixing strategies. Subsequently, we conducted an online survey with 121 developers with a median industrial programming experience of five years. Our research shows that: The flakiness is due to several different causes, four of which have never been reported before, despite being the most costly to fix; flakiness is perceived as significant by the vast majority of developers, regardless of their team's size and project's domain, and it can have effects on resource allocation, scheduling, and the perceived reliability of the test suite; and the challenges developers report to face regard mostly the reproduction of the flaky behavior and the identification of the cause for the flakiness. Public preprint [http://arxiv.org/abs/1907.01466], data and materials [https://doi.org/10.5281/zenodo.3265785].
2019
978-1-4503-5572-8
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11386/4727974
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 89
  • ???jsp.display-item.citation.isi??? 72
social impact