The weird state of the PyPI ecosystem in 2025
There are more lessons to be learned from my Pipreqs dependency confusion
Intro
Just over two years ago I got sidetracked into a weird dependency confusion story while trying to put together a requirements.txt file for an internship challenge web app I had slapped together in a few days. The full story is available here.
Tl;dr
Like every great professional, I just googled my way out of this. Some wise people advocated using a third-party tool called pipreqs instead of the tried and true pip freeze, cause who needs virtual environments, am I right?
Well, I installed it, ran it, and got the wrong package in my requirements.txt.
What was more important, it looked like I was not alone in my misery:
My mind started racing — was there something wrong with pipreqs? Can this behavior be exploited?
Yep, there is an issue
Weird world of PyPI packages
Remember how you wanted to install a jwt package using pip install jwt, only to find out that it is actually called pyjwt on PyPI and that you had installed some crap instead?
Well, there is technically no obligation for a PyPI package name and the names of the modules it exports to be the same. Nothing stops you from defining a PyPI package called my_package_python with the following structure:
my_package_python/
├── pyproject.toml
├── README.md
├── my_package/
│   ├── __init__.py
│   └── code.py
└── tests/
    └── test_main.py
In actual Python code, the package is then imported like this:
from my_package import code
In fact, a lot of popular projects follow this structure, since the cool names were already taken on PyPI.
CVE-2023-31543
The main advertised feature of Pipreqs is its “smart” resolution of the packages imported by a given project's code. How is it done? Well, it just takes every package name from the import statements and looks it up in the PyPI repository. See the issue here?
Given our example package above, Pipreqs will actually try to look up and add the my_package PyPI package, not my_package_python.
Nothing happens if there is no such package on PyPI. But if someone creates a malicious my_package on PyPI, Pipreqs will happily add it to your requirements.txt file ;)
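To make the failure mode concrete, here is a minimal sketch of the naive resolution described above. It is not the actual Pipreqs code: the helper names top_level_imports and naive_requirements are made up for illustration, and the real tool does more (it skips standard library modules, for example). The sketch only shows the core assumption, namely that the import name is also the package name.

```python
import ast


def top_level_imports(source: str) -> set[str]:
    """Collect the top-level module names imported by a piece of source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports such as "from . import x".
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def naive_requirements(source: str) -> set[str]:
    # Hypothetical naive step: treat every import name as the name of the
    # package to look up remotely. This is where the confusion creeps in.
    return top_level_imports(source)


source = "from my_package import code\nimport jwt\n"
print(sorted(naive_requirements(source)))
```

Neither of the printed names is the package the project actually depends on (my_package_python and pyjwt), yet both are perfectly valid names for someone else to register.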
Together with the pipreqs developers, we scrambled together a small fix that was supposed to prevent remote lookup of package names that were already installed locally, and called it a day.
Did it get any better?
Fast forward 2 years. I got curious and visited the issues page in Pipreqs. I wish I had not done this. Looks like the fix did not help at all. The issue persisted in one form or another:
Is Pipreqs still used?
Unfortunately, the pipreqs suggestion is firmly lodged in a 2015 Stack Overflow answer.
As an unpleasant surprise, it is also now forever stuck as a viable option somewhere in the back neurons of the LLMs:
“Only includes libraries actually imported in the code” — no shit, Sherlock!
How popular is Pipreqs exactly?
According to pypistats.org data, the tool exploded in popularity in early 2023 and then dropped dramatically somewhere near the date I found the issue and submitted the CVE. Most probably that was because it failed to do its job properly and produce a working requirements.txt. I don't think my finding a critical vulnerability did anything to make the public stop using it.
Nonetheless, Pipreqs still seems to be gaining some traction, and as of 2025 its monthly downloads have more than doubled (500k → 1.25M).
Well, at least my PoC dependency confusion packages should be long forgotten, right? Wait… WTF???
My dependency confusion packages were not, in fact, forgotten
Monthly downloads have grown fivefold since I published the package in 2023. I currently happen to poison ~50,000 installations a month with my “Gotcha” print message…
Who is to blame here? It could not have been Pipreqs, for sure: its download count dropped by a factor of 5 around the time I published the research, and it still hasn't recovered. Was I fighting windmills this whole time?
I think the answer might be really simple. There were no code vulnerabilities behind this download count surge. It was people confusing the package names all along. I have personally fallen for the “pip install jwt instead of pip install pyjwt” trick at least a dozen times already.
Python really needs a whole redesign of the import naming.
It won’t get any better. For now, at least
You see, the Python committee does acknowledge the problem in numerous PEPs, including PEP 708, but backporting a fix into a system used by millions is obviously not an easy job to pull off.
Conclusion
I am fighting windmills indeed. Go grab pip freeze or something. Don't use third-party tools to generate your dependencies.
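If you are curious why the freeze approach is not exposed to this kind of confusion, here is a rough standard-library sketch of the idea. It is not how pip freeze is implemented, and pinned_requirements is a made-up helper. The point is that the names come from the metadata of what is actually installed in the environment, not from guessing based on import statements.

```python
from importlib.metadata import distributions


def pinned_requirements() -> list[str]:
    """List installed distributions as name==version lines, sorted by name."""
    pins = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            pins[name.lower()] = f"{name}=={dist.version}"
    return [pins[key] for key in sorted(pins)]


for line in pinned_requirements():
    print(line)
```

Run inside a project's virtual environment, this lists exactly the distributions that were installed there, under the names they were installed with.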