Why Scanners Fail in Practice: Lessons from the Shai-Hulud Attacks on NPM

9.12.2025 | 8 minutes reading time

2025 marks the year supply chain security stopped being a theoretical risk and became a practical nightmare for anyone managing a package.json file. The recent attack waves on the NPM ecosystem demonstrated this vividly, turning trusted libraries into attack vectors that compromised pipelines before the code even hit production.

First, the compromise of several popular NPM packages, including chalk and debug, showed how easily a phishing attack on a single developer can have widespread implications. Only a week later, the first Shai-Hulud wave introduced a self-replicating worm and large-scale credential theft from developer machines. Most recently, Shai-Hulud 2 (a.k.a. Sha1-Hulud) took this strategy to the extreme and added the perfidious behavior of deleting user files in case of a takedown attempt.

How can development and infosec teams tackle this situation and identify compromises? When discussing this question, Software Composition Analysis (SCA) through dependency scanning and/or Software Bills of Materials (SBOMs) usually comes up. The reasoning: once you know all your dependencies, it should be easy to automatically identify compromised ones and start the mitigation process from there.

There are some inherent limitations of this approach in the case of Shai-Hulud:

The malicious payload does not need to be deployed as part of a release, but runs during the build process. This means that any execution can be a risk, including CI pipelines for any branch. However, SCA often only runs on the main branch or release artifacts.
Since the malware's targets include developer machines, it only needs to be installed locally, far removed from the environments where SCA typically runs.
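
Because both limitations come down to where a check runs, one complementary option is a check small enough to run anywhere, including a developer machine or a feature-branch pipeline. The following is a minimal sketch, not a replacement for a real scanner: it assumes an npm lockfile in the v2/v3 layout (a top-level "packages" object) and a hand-maintained denylist. The names COMPROMISED, locked_packages, and find_compromised are hypothetical; the two denylist entries are the versions used in the demo project later in this article.

```python
from typing import Iterator

# Hypothetical denylist of (name, version) pairs. In practice this would be
# fed from an advisory source rather than maintained by hand.
COMPROMISED = {
    ("ansi-regex", "6.2.1"),
    ("kill-port", "2.0.2"),
}


def locked_packages(lock: dict) -> Iterator[tuple[str, str]]:
    """Yield (name, version) for each entry of a v2/v3 package-lock.json."""
    for path, meta in lock.get("packages", {}).items():
        if not path:  # the "" key describes the root project itself
            continue
        # "node_modules/a/node_modules/@scope/b" -> "@scope/b"
        name = meta.get("name") or path.rsplit("node_modules/", 1)[-1]
        version = meta.get("version")
        if version:  # entries without a version (e.g. links) are skipped
            yield name, version


def find_compromised(lock: dict, denylist=COMPROMISED) -> list[tuple[str, str]]:
    """Return the locked packages that appear on the denylist, sorted."""
    return sorted(set(locked_packages(lock)) & set(denylist))
```

To use it, load the lockfile with json.load and pass the resulting dictionary to find_compromised. A check like this only sees what the lockfile records, so it cannot tell whether a malicious install script has already run.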

Despite these limits, one would expect composition analysis to serve as one line of defense and provide valuable insights into the propagation of the compromised packages.

Evaluating dependency scanning tools

When running Trivy on an allegedly affected project, I was surprised that it could not identify any issues. I then ran a poll on Mastodon to figure out if I was missing something. Of the 50 participants, the majority (68 %) shared my expectation that the scan should have flagged the project.

This caused us to dig deeper and investigate the behavior of the most popular open-source SCA and dependency scanning tools: Besides Trivy by Aqua Security, this includes Grype by Anchore, OSV-Scanner by Google, and OWASP Dependency-Track. Due to their restricted availability, we did not investigate commercial tools such as Snyk, Aikido, or GitLab Ultimate Dependency Scanning.

For this comparison, I set up a demo project with affected versions of ansi-regex and kill-port:

{
  "dependencies": {
    "ansi-regex": "6.2.1",
    "kill-port": "2.0.2"
  }
}

ansi-regex was affected by the first major wave of NPM package compromises in September 2025, while kill-port was compromised as part of Shai-Hulud 2.

Since those versions had quickly been removed from NPM (and, of course, contain malware), we did not actually install them, but set up fake package.json and package-lock.json files. We verified that all scanners picked up these fake versions from the metadata files.
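
For readers who want to reproduce such a setup without touching the malicious packages, the metadata files can be generated instead of installed. The sketch below is an assumption about how this could be done, not the procedure used for this article: the function name fake_manifests is hypothetical, the lockfile follows the lockfileVersion 3 layout, and the "integrity" hashes a real lockfile would contain are left out because nothing is downloaded. Some scanners may expect fields that are missing here.

```python
def fake_manifests(name: str, deps: dict[str, str]) -> tuple[dict, dict]:
    """Build package.json and package-lock.json contents as dictionaries.

    Nothing is fetched or installed; the files only claim that the given
    versions are present.
    """
    package_json = {"name": name, "version": "1.0.0", "dependencies": dict(deps)}

    # The "" key describes the root project in the v3 lockfile layout.
    packages = {"": {"name": name, "version": "1.0.0", "dependencies": dict(deps)}}
    for dep, version in deps.items():
        basename = dep.rsplit("/", 1)[-1]  # "@scope/pkg" -> "pkg"
        packages[f"node_modules/{dep}"] = {
            "version": version,
            # Usual registry tarball URL pattern; no "integrity" field is
            # written because no tarball was ever downloaded.
            "resolved": f"https://registry.npmjs.org/{dep}/-/{basename}-{version}.tgz",
        }

    package_lock = {
        "name": name,
        "version": "1.0.0",
        "lockfileVersion": 3,
        "requires": True,
        "packages": packages,
    }
    return package_json, package_lock
```

Writing both dictionaries with json.dump to package.json and package-lock.json yields a project directory that scanners can read, while no package code ever reaches the disk.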

Unless otherwise noted, the default configurations of all tools were used. All tests were performed with current vulnerability information as of 2025-12-03.

Trivy

As mentioned above, Trivy did not identify any issues. This held true whether scanning the project directly (trivy fs) or scanning a pre-generated SBOM (trivy sbom):

Report Summary

┌───────────────────┬──────┬─────────────────┐
│      Target       │ Type │ Vulnerabilities │
├───────────────────┼──────┼─────────────────┤
│ package-lock.json │ npm  │        0        │
└───────────────────┴──────┴─────────────────┘

Grype

Grype did find both issues as Critical when run on the SBOM generated with Trivy (grype bom.json):

NAME        INSTALLED  TYPE  VULNERABILITY        SEVERITY  EPSS  RISK
ansi-regex  6.2.1      npm   GHSA-jvhh-2m83-6w29  Critical  N/A   N/A
kill-port   2.0.2      npm   GHSA-3j2r-p9f6-rw66  Critical  N/A   N/A

Executing it directly on the project folder (grype .) also reported both issues, along with many false positives:

NAME                   INSTALLED  FIXED IN  TYPE  VULNERABILITY        SEVERITY  EPSS           RISK
json5                  1.0.1      1.0.2     npm   GHSA-9c47-m6qq-7p4h  High      37.3% (97th)   27.2
json5                  2.2.1      2.2.2     npm   GHSA-9c47-m6qq-7p4h  High      37.3% (97th)   27.2
trim-newlines          1.0.0      3.0.1     npm   GHSA-7p7h-4mm5-852v  High      1.3% (78th)    0.9

# [...]

ansi-regex             6.2.1                npm   GHSA-jvhh-2m83-6w29  Critical  N/A            N/A
kill-port              2.0.2                npm   GHSA-3j2r-p9f6-rw66  Critical  N/A            N/A

Note the high-severity alerts for json5 and trim-newlines. These (and many other omitted ones) are present because Grype descended into the node_modules directory, read the package.json metadata of all modules, and incorrectly identified their devDependencies as part of our project. This could probably be fixed through configuration, but we stuck to the default and did not bother with that.

OSV-Scanner

OSV-Scanner scored (almost) perfectly, identifying the issues both directly in the project directory as well as on the SBOM from Trivy. However, it could not provide information on the criticality or fixed versions:

╭─────────────────────────────────┬──────┬───────────┬────────────┬─────────┬───────────────┬──────────╮
│ OSV URL                         │ CVSS │ ECOSYSTEM │ PACKAGE    │ VERSION │ FIXED VERSION │ SOURCE   │
├─────────────────────────────────┼──────┼───────────┼────────────┼─────────┼───────────────┼──────────┤
│ https://osv.dev/MAL-2025-46966  │      │ npm       │ ansi-regex │ 6.2.1   │ --            │ bom.json │
│ https://osv.dev/MAL-2025-191116 │      │ npm       │ kill-port  │ 2.0.2   │ --            │ bom.json │
╰─────────────────────────────────┴──────┴───────────┴────────────┴─────────┴───────────────┴──────────╯

OWASP Dependency-Track

Finally, let's have a look at OWASP Dependency-Track. In contrast to the other tools, it is not invoked from the command line, but runs as a web application. It also cannot perform its own composition analysis, but always needs to be provided with an existing SBOM. For this purpose, I once again used the SBOM file from Trivy.

In its default configuration, Dependency-Track could not identify any vulnerabilities.

However, Dependency-Track allows you to configure its data sources for vulnerability information. Besides the built-in NVD CVE feed, the openly available additional options include GitHub Advisories and Open Source Vulnerabilities (OSV, the data source behind OSV-Scanner).

Enabling GitHub Advisories changed nothing: still no vulnerabilities were detected. Although the OSV data source is only in beta, enabling it actually made a difference.

Exploring data sources

What are the different sources of vulnerability information typically accessed by dependency scanning tools?

The most well-known is the Common Vulnerabilities and Exposures (CVE) program, whose records are also published through NIST's National Vulnerability Database (NVD). In addition to its ongoing general data quality issues, there is a major catch here for compromised NPM packages: for (almost?) all of them, nobody bothered to issue CVEs! We can therefore rule out this data source for identifying Shai-Hulud and the like.

GitHub, which also happens to run the NPM package registry, provides its own Security Advisory database (GHSA). Whenever GitHub removed a compromised package version, they also issued a corresponding advisory for it. However, those are special malware advisories and therefore well hidden: in order to find them, you have to add the special filter type:malware to your search query.

GitHub's stated reasoning is as follows:

Our malware advisories are mostly about substitution attacks. During this type of attack, an attacker publishes a package to the public registry with the same name as a dependency that users rely on from a third party or private registry, with the hope that the malicious version is consumed. [...] Users who have their dependencies appropriately scoped should not be affected by malware.

While this makes sense for substitution attacks, it falls apart once real, trustworthy packages get compromised with malware.

Malware advisories are not returned from the GHSA API by default, which is probably the reason why they are not picked up by OWASP Dependency-Track and Trivy (which also uses GHSAs as its data source for NPM packages). Grype appears to handle this differently and includes the malware advisories by default.
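
To make the default behavior concrete, here is a minimal sketch of asking the GitHub REST API for malware advisories explicitly. It relies on the documented "type", "ecosystem", and "affects" query parameters of the global advisories endpoint; the helper names advisory_query_url and fetch_advisory_ids are hypothetical, and how Trivy, Grype, or Dependency-Track actually query GitHub is not shown here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

GITHUB_ADVISORIES_URL = "https://api.github.com/advisories"


def advisory_query_url(package: str, version: str,
                       advisory_type: str = "malware",
                       ecosystem: str = "npm") -> str:
    """Build a query URL for advisories affecting one package version.

    Without an explicit "type", the API returns reviewed advisories only,
    so malware advisories have to be requested by name.
    """
    params = {
        "type": advisory_type,
        "ecosystem": ecosystem,
        "affects": f"{package}@{version}",
    }
    return f"{GITHUB_ADVISORIES_URL}?{urlencode(params)}"


def fetch_advisory_ids(url: str) -> list[str]:
    """Fetch the advisories behind the URL and return their GHSA IDs."""
    request = Request(url, headers={"Accept": "application/vnd.github+json"})
    with urlopen(request, timeout=10) as response:
        return [advisory["ghsa_id"] for advisory in json.load(response)]
```

Calling fetch_advisory_ids(advisory_query_url("ansi-regex", "6.2.1")) performs an unauthenticated request, which is subject to GitHub's rate limits. Passing advisory_type="reviewed" instead shows what a client sees when it relies on the default.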

The third major data source is Google's Open Source Vulnerabilities (OSV) project. While it "only" aggregates vulnerability information from other sources, including GHSA, it does include GitHub malware advisories in its main feed. From there, the information does find its way to OSV-Scanner and (optionally) OWASP Dependency-Track.
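
The OSV data can also be queried directly, without a scanner in between. The following sketch uses the public query endpoint of the OSV API and separates malware entries from regular vulnerabilities by their ID prefix, based on the MAL- identifiers visible in the OSV-Scanner output above. The function names osv_payload, split_findings, and query_osv are hypothetical, and treating the prefix as the distinguishing mark is an assumption made for illustration.

```python
import json
from urllib.request import Request, urlopen

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def osv_payload(name: str, version: str, ecosystem: str = "npm") -> dict:
    """Request body for a single package version lookup."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def split_findings(response: dict) -> tuple[list[str], list[str]]:
    """Split the IDs in an OSV response into (malware, other).

    A response without findings has no "vulns" key, hence the default.
    """
    ids = [entry["id"] for entry in response.get("vulns", [])]
    malware = [i for i in ids if i.startswith("MAL-")]
    other = [i for i in ids if not i.startswith("MAL-")]
    return malware, other


def query_osv(name: str, version: str) -> dict:
    """Send the lookup to the OSV API and return the decoded JSON response."""
    request = Request(
        OSV_QUERY_URL,
        data=json.dumps(osv_payload(name, version)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request, timeout=10) as response:
        return json.load(response)
```

A call such as split_findings(query_osv("kill-port", "2.0.2")) then reports malware entries separately, which makes it easy to treat them as blocking even when no severity score is attached.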

Some might argue that malware infections should not be part of vulnerability databases, since they are not exploitable vulnerabilities, but rather instances where the compromise has already occurred in the supply chain. We view this as a purely theoretical distinction. This is supported by the fact that there is indeed a CWE (Common Weakness Enumeration, a categorization system from the CVE ecosystem) entry for replicating malicious code (CWE-509).

Ultimately, these dependencies are blatantly insecure. Of course, we want to learn about them from a security vulnerability feed! If anything, the risk is higher than that of a vulnerable, but unexploited package.

Endpoint protection to the rescue?

In light of the limitations discussed initially and dependency scanners being a somewhat mixed bag, should we move to another line of defense? After all, in the Shai-Hulud waves, a huge part of the risk stems from the malware getting installed on developer machines. Couldn't an endpoint protection solution (antivirus/EDR) identify it and prevent further damage?

To verify this idea, I did another poll on Mastodon, where the majority once again confirmed it.

As endpoint protection is notoriously hard to test, we had a look at the VirusTotal results for the well-known payloads of Shai-Hulud 2. VirusTotal primarily checks static signatures, whereas an advanced EDR would ideally catch the malicious behavior. However, if the file signature isn't even flagged, the first line of defense is already broken. After all, detection in this case basically comes down to spotting files with a few, well-known checksums.
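
How little is needed for this kind of checksum matching can be sketched in a few lines. The example below hashes every file under a directory with SHA-256 and compares the result against a set of known-bad digests. The function names are hypothetical, no real payload hashes are included, and the set has to be filled from a trusted indicator list. It illustrates static matching only, not what an EDR product does.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_known_bad(root: Path, bad_hashes: set[str]) -> list[Path]:
    """List regular files below root whose digest is in bad_hashes."""
    hits = []
    for path in root.rglob("*"):
        # Symlinks are skipped so that each file is hashed only once.
        if path.is_file() and not path.is_symlink():
            if sha256_of(path) in bad_hashes:
                hits.append(path)
    return sorted(hits)
```

Pointing find_known_bad at a node_modules directory with the published digests would cover exactly the static case discussed here. Unreadable files raise an exception in this sketch, and any payload that changes by a single byte is missed.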

[Screenshot: VirusTotal result for one of the malicious files as of 2025-11-26, two days after the initial detection]

When looking at the live results now, detection rates have improved slightly, but several major players are still not detecting it.

Things looked even more dire for another one of the malicious payloads on 2025-11-26.

In this case, live results have also improved, but as of now, the malware is still not detected by around half of the scanners.

Conclusion

While GitHub as the operator of NPM was quick to remove package versions affected by large-scale compromises (at least in most instances), the information sharing around it leaves room for improvement, from notes on the packages' NPM pages to issuing (the right kind of) security advisories. This makes dependency scanners less reliable than they could be and leaves detection heavily dependent on the specific data sources used.

We were surprised to learn how many scanners could not identify the compromises, be it at the software composition or the endpoint layer. While the right scanners usually identify the issues or at least can be configured to do so, that is not without its pitfalls and not every product delivers a satisfying result. This is particularly concerning given the large impact and widespread attention for the recent NPM compromises. One can only wonder what happens in case of more subtle attacks with less public attention.

Blog author

Felix Dreißig

Information Security Specialist
