Now in its seventh year, Sonatype’s 2021 State of the Software Supply Chain Report blends a broad set of public and proprietary data to reveal important findings about open source and its increasingly important role in digital innovation.
Open Source Supply, Demand, and Security
Open source supply is growing exponentially.
Currently, the top four open source ecosystems contain a combined 37,451,682 components and packages. These same communities released a combined 6,302,733 new versions of components/packages over the past year and have introduced 723,570 brand new projects in support of 27 million developers worldwide.
Available Supply of Open Source
Increase in Downloads
Year Over Year 2020 – 2021
Open source demand continues to explode.
In 2021, developers around the world will request more than 2.2 trillion open source packages, a 73% year-over-year increase in developer downloads of open source components. Despite the growing volume of downloads, the percentage of available components utilized in production applications is shockingly low.
Vulnerabilities are more common in popular projects.
On average, 29% of the top 10% most popular OSS project versions contain known vulnerabilities. Conversely, only 6.5% of the remaining 90% of project versions contain known vulnerabilities. Taken together, these statistics indicate that the vast majority of security research (whitehat and blackhat) is focused on finding and fixing (or exploiting) vulnerabilities in the projects that are most commonly utilized.
Vulnerability Release Density Vs. Popularity
2021 Software Supply Chain Statistics
Total Project Versions
Year-Over-Year Download Growth
Ecosystem Project Utilization
Vuln Density for Utilized Versions:
10% Most Popular
Vuln Density for Utilized Versions:
90% Least Popular
High-Profile Software Supply Chain Attacks
Dec 2020-July 2021
Threat actors gained access to SolarWinds dev infrastructure, and injected malicious code into Orion update binaries. 18,000 customers automatically pulled trojanized updates, planting backdoors into their systems and allowing bad actors to exploit private networks at will.
Three days after news broke of an ethical researcher hacking over 35 big tech firms in a novel supply-chain attack, more than 300 malicious copycat attacks were recorded. Within one month, more than 10,000 namespace confusion copycats had infiltrated npm and other ecosystems.
An attacker gained access to a credential via a mistake in how Codecov was building Docker images. That credential let them modify Codecov’s Bash uploader script, which customers used either directly or via Codecov’s other uploaders, such as its GitHub Action. The attacker used the modified script to steal credentials from the CI environments of customers running it.
The weekend after launching, WinGet’s software registry was flooded with pull requests for apps that were either duplicates or malformed. Some newly added duplicate packages were corrupted and ended up overwriting the existing packages, raising serious concerns about the integrity of the WinGet ecosystem.
A ransomware group discovered and exploited a zero-day vulnerability in a remote monitoring and management software platform used by dozens of managed security providers (MSP). Because these MSPs service thousands of downstream customers, the hackers were able to conduct a ransomware attack against 1,500 victims.
Software Supply Chain Attacks Increase 650%
Members of the world’s open source community are facing a novel and rapidly expanding threat that has nothing to do with passive adversaries exploiting known vulnerabilities in the wild — and everything to do with aggressive attackers intentionally tampering with open source projects to infiltrate the commercial software supply chain.
From February 2015 to June 2019, 216 software supply chain attacks were recorded. From July 2019 to May 2020, the number increased to 929. In the past year, however, such attacks numbered more than 12,000, a 650% year-over-year increase.
To accelerate the pace of digital innovation without sacrificing quality or security, engineering and risk management leaders should understand the supply, demand, and risk dynamics associated with third-party open source ecosystems. Furthermore, they should carefully define and automatically enforce open source policies across every phase of the software supply chain.
Understanding Exemplary Open Source Projects
Some open source projects are definitely better than others. But how do you know? This year we examined three different methods for identifying exemplary open source projects: Sonatype Mean Time to Update (MTTU), OpenSSF Criticality, and Libraries.io Sourcerank. We found that MTTU, combined with OpenSSF Criticality, is strongly associated with exemplary project outcomes in the areas of security and developer productivity.
Metrics to Use to Assess Relative Quality of an OSS Project
- Sonatype MTTU
- OpenSSF Criticality
- Libraries.io Sourcerank
Sonatype MTTU provides a measure of project quality that is based on how quickly the project moves to update dependencies. Lower (faster) is better. Components that consistently react quickly to dependency upgrades will have lower MTTU. Components that react slowly or have high variance in their update times will have higher MTTU.
OpenSSF Criticality measures a project’s community, usage, and activity. This is distilled into a score that is intended to measure how critical the project is in the open source ecosystem.
Libraries.io Sourcerank aims to measure the quality of software, mostly focusing on project documentation, maturity, and community. It is computed by evaluating a number of yes/no responses such as “Is the project more than six months old?” and a set of numerical questions, such as “How many ‘stars’ does the project have?” These are distilled into a single score, with yes/no questions adding or subtracting a fixed number of “points” and numerical questions being converted into points using a formula, e.g. “log(num_stars)/2.” The current maximum number of points is approximately 30.
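As a rough sketch of how a Sourcerank-style score combines these signals, consider the following Python snippet. The specific yes/no checks, point values, and logarithm base shown here are illustrative assumptions; only the “log(num_stars)/2” shape of the formula comes from the description above, and this is not Libraries.io’s actual implementation.

```python
import math

def sourcerank_sketch(num_stars: int, older_than_six_months: bool,
                      has_readme: bool) -> float:
    """Illustrative Sourcerank-style score (hypothetical weights).

    Yes/no questions add a fixed number of points; numerical signals
    are converted into points with a formula such as log(num_stars)/2.
    """
    score = 0.0
    if older_than_six_months:   # e.g. "Is the project more than six months old?"
        score += 1
    if has_readme:              # e.g. "Does the project have documentation?"
        score += 1
    if num_stars > 0:           # numerical signal, compressed logarithmically
        score += math.log(num_stars) / 2
    return score

# A mature, documented project with 100 stars scores higher than a
# brand-new one with no stars.
print(sourcerank_sketch(100, True, True))
print(sourcerank_sketch(0, False, False))
```

The logarithm is the key design choice: it rewards popularity while preventing mega-projects from dominating the score, which is consistent with the capped maximum of roughly 30 points.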
Suppose we have a component A with dependencies B and C, both at version 1.2. Suppose B and C each release a new version (1.3) and some time later A releases a new version that bumps the version of B and C to 1.3. The time between the release of B version 1.3 and the release of A version 1.3 is the Time To Upgrade (TTU) for A’s migration to B version 1.3 (and similarly for A’s adoption of C version 1.3). The average of all these upgrade times is then the MTTU.
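The worked example above can be expressed directly in code. The following Python sketch computes TTU for each dependency upgrade and averages them into MTTU; the data shape (pairs of release dates) is an illustrative assumption, not Sonatype’s implementation.

```python
from datetime import date

def time_to_upgrade(dep_release: date, parent_adoption: date) -> int:
    """TTU in days: time between a dependency version's release and the
    parent project's release that adopts it."""
    return (parent_adoption - dep_release).days

def mean_time_to_update(upgrades) -> float:
    """MTTU: the average TTU across all observed dependency upgrades.
    `upgrades` is a list of (dep_release_date, parent_adoption_date) pairs."""
    ttus = [time_to_upgrade(released, adopted) for released, adopted in upgrades]
    return sum(ttus) / len(ttus)

# Component A depends on B and C. B 1.3 and C 1.3 are released on
# different dates; A 1.3, which bumps both, ships on March 15.
upgrades = [
    (date(2021, 3, 1), date(2021, 3, 15)),   # B 1.3 adopted after 14 days
    (date(2021, 3, 5), date(2021, 3, 15)),   # C 1.3 adopted after 10 days
]
print(mean_time_to_update(upgrades))  # 12.0
```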
Aggregate MTTUs are improving over time.
In addition to the number of projects growing over the years, there has been a clear trend toward lower (faster) MTTU. The average MTTU across projects in 2011 was 371 days. By 2014 it was 302 days, and by 2018 it was 158 days. In 2021, as of August 1, the average MTTU was 28 days – less than half of the 73 days the average project took in 2020.
Suppose a project A includes a dependency B, and B has a vulnerability disclosed at date D1. Then A updates the version of B it’s using on date D2. Time to Remediate (TTR) is then the time between D1 and D2 measured in days, and MTTR is the average TTR for a project across all disclosed security vulnerabilities.
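The TTR definition above follows the same pattern as TTU, just measured from the disclosure date instead of the release date. A minimal Python sketch, again with an assumed data shape:

```python
from datetime import date

def time_to_remediate(disclosed: date, updated: date) -> int:
    """TTR in days: from the vulnerability disclosure date (D1) to the
    date the project updates the affected dependency (D2)."""
    return (updated - disclosed).days

def mean_time_to_remediate(remediations) -> float:
    """MTTR: the average TTR across all disclosed vulnerabilities in a
    project's dependencies. `remediations` is a list of (D1, D2) pairs."""
    return sum(time_to_remediate(d1, d2) for d1, d2 in remediations) / len(remediations)

remediations = [
    (date(2021, 1, 10), date(2021, 1, 20)),  # remediated in 10 days
    (date(2021, 4, 1), date(2021, 5, 1)),    # remediated in 30 days
]
print(mean_time_to_remediate(remediations))  # 20.0
```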
MTTU is highly correlated to MTTR.
While MTTU does not directly measure the speed at which projects fix publicly disclosed vulnerabilities, it does correlate to a project’s Mean Time to Remediate (MTTR), which is the time required to update dependencies that have published vulnerabilities. Thus, we consider MTTU to be the best metric available to determine the impact a component will have on the security of projects that incorporate it.
Choosing high quality open source projects should be considered an important strategic decision for enterprise software development organizations.
To avoid stale dependencies and minimize security risks associated with third party open source, software engineering teams should actively embrace projects that consistently demonstrate low mean time to update (MTTU) values and high OpenSSF Criticality scores.
How do your peers manage open source?
For this year’s report, we examined 4 million real-world dependency management decisions spread across 100,000 applications. The findings, highlighted below, are revealing.
Despite the growing volume of downloads, the percentage of available components observed in production applications is shockingly low.
On average, production enterprise Java applications utilize 10% of available open source components, and commercial engineering teams actively update only 25% of those components that are utilized.
Active Projects in the Maven Central Repository
5 Groups of Migration Decisions
69% of dependency management decisions are suboptimal.
The average modern application contains 128 open source dependencies, and the average open source project releases 10 times per year. This reality, combined with the fact that a few hyperactive projects release more than 8,000 times per year, creates a situation in which developers must constantly decide when (and when not) to update third-party dependencies in their applications. In light of these circumstances, Sonatype researchers set out to answer the question: are developers making efficient dependency management decisions? We studied 100,000 applications, analyzed more than 4,000,000 component migrations (upgrades), and found that 69% of such decisions were suboptimal.
Despite unstructured decision making, there is evidence of wisdom in the crowd.
The chart below provides a visual summary of herd migration behavior over the past year associated with spring-core, a single component within the highly popular spring-framework. The y-axis shows the past 52 weeks of upgrade activity, with the top row representing herd migration decisions made one year ago, and the bottom row representing herd migration decisions made during the most recent week. The x-axis represents the 150 most recent versions with older versions to the left, and newer versions to the right.
Herd Migration Behavior of org.springframework:spring-core
August 9, 2020 – August 1, 2021
The most recent release (5.3.x) of spring-core releases approximately every 4 weeks.
The project is actively maintaining these 2 releases. Darker shading indicates the majority of the community is using these releases.
The project is no longer actively supporting these releases. Teams should migrate away from these stale versions.
Laggards continue to update to older, unsupported, and even vulnerable versions.
Older versions are vulnerable, and older non-vulnerable versions (4.3.15+) will inevitably be subject to new vulnerability disclosures.
The community generally avoids .0 releases and pre-releases.
8 Rules for Upgrading to the Optimal Version
Avoid Objectively Bad Choices
Don’t choose an alpha, beta, milestone, release candidate, etc. version.
Don’t upgrade to a vulnerable version.
Upgrade to a lower risk severity if your current version is vulnerable.
When a component is published twice in close succession, choose the later version.
Avoid Subjectively Bad Choices
Choose a migration path (from version to version) others have chosen.
Choose a version that minimizes breaking code changes.
Choose a version that the majority of the population is using.
If all else is tied, choose the newest version.
Passing these rules results in optimal upgrades.
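The first four “objectively bad” rules can be applied mechanically once you know each version’s vulnerability status. The Python sketch below filters a candidate list accordingly; the data shape, the severity scale, and the pre-release naming patterns are illustrative assumptions, and the “subjectively bad” rules (herd paths, breaking changes, majority adoption) are omitted because they require population-level migration data.

```python
import re

# Hypothetical pre-release markers; real ecosystems vary (alpha, beta,
# release candidates, milestones, snapshots, etc.).
PRERELEASE = re.compile(r"(alpha|beta|rc\d*|milestone|snapshot)", re.IGNORECASE)

def candidate_versions(versions, current_severity):
    """Apply the 'objectively bad' upgrade rules.

    `versions` is a list of dicts: {"name": str, "severity": int | None},
    where severity is None for non-vulnerable versions.
    `current_severity` is the severity of the version currently in use,
    or None if it is not vulnerable.
    """
    out = []
    for v in versions:
        if PRERELEASE.search(v["name"]):
            continue  # rule: don't choose an alpha/beta/milestone/RC version
        sev = v["severity"]
        if sev is not None:
            # rule: don't upgrade to a vulnerable version, unless it
            # strictly lowers the risk severity of the current version
            if current_severity is None or sev >= current_severity:
                continue
        out.append(v["name"])
    return out

versions = [
    {"name": "2.0.0-rc1", "severity": None},  # pre-release: excluded
    {"name": "1.9.0", "severity": 7},         # vulnerable, but less severe
    {"name": "1.9.1", "severity": None},      # clean: always a candidate
]
print(candidate_versions(versions, current_severity=9))  # ['1.9.0', '1.9.1']
```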
Save time and money.
Intelligent automation that standardizes engineering teams on exemplary open source projects could remove 1.6M hours and $240M of real-world waste spread across our sample of 100,000 production applications. Extrapolated to the entire software industry, the associated savings would run into the billions.
The Benefit of Intelligent Automation to Dev Teams
Strategies for optimal dependency management: near the edge is best.
The bleeding edge is dangerous. The near edge is optimal. When analyzing herd migration behavior around dependency management practices, we observed three distinct patterns of team behavior: teams living in disarray, teams living on the edge, and teams living close to the edge.
Strategies for Dependency Management
Teams living in disarray
Developers working on these teams lack automated guidance. They update dependencies infrequently. When they do update dependencies, they utilize gut instincts and commonly make suboptimal decisions. This approach to dependency management is highly reactive, not scalable, and leads to stale software and increased security risk.
Teams living close to the edge
Developers working on these teams have the benefit of intelligent and contextual automation. Dependencies are automatically recommended for updating, but only when optimal. This type of intelligent automation keeps software fresh without inadvertently introducing wasted effort or increased security risks. This approach is proactive, scalable, and optimal in terms of cost efficiency and quality outcomes.
Teams living on the edge
Developers working on these teams have the benefit of simplistic but non-contextual automation. Dependencies are automatically updated to the latest version, whether optimal or not. Such automation helps to keep software fresh, but it can inadvertently lead to increased security risks and higher costs associated with unnecessary updates and broken builds. This approach is proactive and scalable, but not optimal in terms of costs or outcomes.
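The difference between “on the edge” and “close to the edge” can be illustrated with a toy policy: instead of always taking the newest release, take the newest release the herd has already validated. The function, data shape, and adoption threshold below are hypothetical stand-ins for the contextual signals real tooling would use.

```python
def near_edge_choice(releases, min_adopters=100):
    """Hypothetical 'close to the edge' policy.

    `releases` is newest-first: [(version, adopter_count), ...].
    Rather than jumping to the bleeding-edge release, pick the newest
    version that some minimum share of the community has already
    migrated to, a crude proxy for 'optimal'. Returns None if no
    release clears the bar.
    """
    for version, adopters in releases:
        if adopters >= min_adopters:
            return version
    return None

# The newest release (5.3.9) has few adopters yet; a 'living on the
# edge' team would take it anyway, while this policy waits one step back.
releases = [("5.3.9", 40), ("5.3.8", 950), ("5.3.7", 1200)]
print(near_edge_choice(releases))  # '5.3.8'
```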
Software engineering teams should strive to standardize dependency management decisions.
Engineering leaders should maximize information available to developers to save time and money.
Engineering leaders should embrace tools to automate intelligent dependency management decisions.
Software Supply Chain Maturity Survey
For this year’s report, we surveyed 702 engineering professionals about software supply chain management practices, including approaches and philosophies to utilizing open source components, organizational design, governance, approval processes, and tooling.
Disconnect Between Perception vs Reality on Software Supply Chain Maturity
Subjectively, survey respondents report they are doing a good job remediating defective components and indicate that they understand where supply chain risk resides. Objectively, research shows development teams lack structured guidance and frequently make suboptimal decisions with respect to software supply chain management.
We plotted all survey responses against the five different stages of software supply chain maturity and found that the majority of respondents scored below the “Control” level – the point at which an organization transitions from “figuring it out” to a minimal level of maturity that enables high-quality outcomes.
Software Supply Chain Maturity Score by Theme
5th, 50th, and 95th Percentile
The majority of respondents demonstrate an “Ad Hoc” approach to software supply chain management
The only two themes in which respondents demonstrated a high level of maturity were Inventory and Remediation.
Comparing survey responses to our objective analysis, we see a disconnect between what is actually happening and what people think is happening: 70% of remediations are actually suboptimal.
The survey suggests that respondents have talked themselves into believing they’re doing a good job in areas where, objectively, they are not. This is a reminder to be mindful of the gap between what you think your organization is doing and what’s actually happening, and to continuously measure your workflows and systems against desired outcomes.
Emergence of Software Supply Chain Regulation and Standards
Following several attacks in 2020 aimed at critical infrastructure, governments around the world began to pursue regulations and standards aimed at improving software supply chain security and hygiene.
The United States
In May 2021, President Biden signed the Executive Order on Improving the Nation’s Cybersecurity, which has been heralded as a milestone for the U.S. government at a time when cyber espionage and nation-state attacks on critical infrastructure are reaching crisis proportions.
Germany
Germany passed the Information Technology Security Act 2.0 as an update to the First Act to “increase cyber and information security against the backdrop of increasingly frequent and complex cyber-attacks and the continued digitalisation of everyday life.”
The European Union
The European Union Agency for Cybersecurity (ENISA) released a July 2021 report titled “Understanding the increase in Supply Chain Security Attacks.” The report reviewed 24 different software supply chain attacks and shared recommendations that organizations should put in place to protect themselves against attacks.
As governments finally recognize the risks associated with unmanaged software supply chains, they are aggressively pursuing mandates that align the software industry with other manufacturing sectors. Pay attention to what’s happening legislatively in your market, get involved in the public conversations and be prepared to make changes to your development practices accordingly.
Dig Deeper and Download the Full Report
Engineers are making a wide variety of digital decisions at every phase of the DevSecOps value stream that they didn’t have to think about just a year ago. Understanding how to optimize those decisions and how they affect the greater software supply chain is paramount to a company’s success.
Dig into the full report for more insights, analysis and guidance around developing optimal software supply chains.