Breaking Bad? Semantic Versioning and Impact of Breaking Changes in Maven Central

The content presented in this website accompanies the paper “Breaking Bad? Semantic Versioning and Impact of Breaking Changes in Maven Central” authored by Lina Ochoa, Thomas Degueule, Jean-Rémy Falleri, and Jurgen Vinju. The paper was submitted and accepted in the Journal of Empirical Software Engineering (EMSE’21). This study is an external and differentiated replication study of the paper “Semantic Versioning and Impact of Breaking Changes in the Maven Repository” presented by Steven Raemaekers, Arie van Deursen, and Joost Visser.

Research Questions & Hypotheses

Q1: How are semantic versioning principles applied in the Maven Central repository in terms of BCs?
- H1: BCs are widespread without regard for semantic versioning principles.
Q2: To what extent has the adherence to semantic versioning principles increased over time?
- H2: The adherence to semantic versioning principles has increased over time.
Q3: What is the impact of BCs on clients?
- H3: BCs have a significant impact in terms of compilation errors in client systems.

Corpora

Maven Dependency Dataset (MDD): This corpus is a snapshot of Maven Central Repository (MCR). It was introduced by Raemaekers et al. in this paper and it can be downloaded from here.
Maven Dependency Graph (MDG): This corpus is a snapshot of MCR 2018. It was introduced by Benelalla et al. in this paper and it can be found at Zenodo.

Conclusions

Q1: We conclude that although semver principles are not always strictly applied (20.1% of non-major releases are breaking), they are largely followed: 83.4% of all upgrades comply with semver regarding backwards compatibility guarantees, and the differences between semver levels are notable. Not only do minor and patch releases break less often than major releases, they also introduce fewer BCs. This leads us to reject H1.
Q2: Our results confirm the results of the original study. They also show that the improvement over time is much higher than initially reported for the 2005–2011 period. The tendency persists in the 2011–2018 period, although the slope is less steep. Thus, we cannot reject H2.
Q3: We observe that in most cases breaking declarations are not used by client projects, which instead yields a low number of broken clients (7.9% for all releases). The number is even lower in the case of minor and patch upgrades. However, when a breaking declaration is used by a client, there is a high chance that it will be impacted. These results contrast with those of the original study and lead us to reject H3.

Takeaway

The introduction of BCs is inherent to software evolution and cannot always be avoided.
Introducing BCs is tolerable as long as they are properly announced in advance to not take clients by surprise (use versioning conventions or code-level mechanisms).
In MCR, most releases comply with semver requirements and avoid BCs in non-major releases. Besides, the situation has significantly improved over time.
Each ecosystem has its own policy regarding versioning conventions and the treatment of BCs. Client developers should thus pay attention to ecosystem-specific guidelines and pick an ecosystem that advocates a strict policy to minimize the risk of being impacted by unwanted changes.
There is an open opportunity to clarify the role of code-level mechanisms and their relation with semver.
Programming languages can incorporate better mechanisms to delimit public APIs.
Researchers should also strive to design and implement benchmarks to compare tools related to library evolution objectively.
When analysing software evolution, the design of a study protocol and the creation of the underlying datasets should be carefully performed. Sampling bias is a recurrent threat to validity that can hurt the interpretation of the results.

Annexes

Breaking Changes and Detections

In the study, we consider 31 types of breaking changes as reported on the Java Language Specification Java SE 8 Edition, and the blog post “Evolving Java-based APIs” authored by Jim Des Rivières. We rely on japicmp to extract these changes from Java bytecode.

Interface added
Interface removed
Class less accessible
Class no longer public
Class now abstract
Class now final
Class removed
Class type changed
Constructor less accessible
Constructor removed
Field less accessible
Field more accessible
Field no longer static
Field now final
Field now static
Field removed
Field type changed
Method abstract added to class
Method abstract now default
Method added to interface
Method less accessible
Method more accessible
Method new default
Method no longer static
Method now abstract
Method now final
Method now static
Method removed
Method return type changed
Superclass added
Superclass removed

Details on how we detect these breaking changes are provided here.

Maracas

Datasets and Notebooks

The main datasets and notebooks used for this study are contained inn the maven-api-dataset repository (warning: > 2GB). This repository contains:

Jupyter notebooks for every research question and both corpora (MDD and MDG). They can be used to play with the data, re-generate the paper’s figures, and run the statistical tests.
The code used to compute the datasets, deltas, and detections.
CSV files containing the annotations extracted from the top-1000 most popular libraries on Maven Central, the 148,253 artefacts extracted from the MDD, and the most popular version suffixes in the MDG.
CSV files containing the raw data used to answer every research question, automatically generated by our analysis scripts, imported and analyzed in the notebooks.

Maracas The API-client co-evolution project