Removing PGP from PyPI (2023)

blog.pypi.org

58 points by harporoeder 7 hours ago


woodruffw - 5 hours ago

This is slightly old news. For those curious, PGP support on the modern PyPI (i.e. the new codebase that began to be used in 2017-18) was always vestigial, and this change merely polished off a component that was, empirically[1], doing very little to improve the security of the packaging ecosystem.

Since then, PyPI has been working to adopt PEP 740[2], which both enforces a more modern cryptographic suite and signature scheme (built on Sigstore, although the design is adaptable) and is bootstrapped on PyPI's support for Trusted Publishing[3], meaning that it doesn't have the fundamental "identity" problem that PyPI-hosted PGP signatures have.

The hard next step from there is putting verification in client hands, which is the #1 thing that actually makes any signature scheme actually useful.

[1]: https://blog.yossarian.net/2023/05/21/PGP-signatures-on-PyPI...

[2]: https://peps.python.org/pep-0740/

[3]: https://docs.pypi.org/trusted-publishers/

politelemon - 5 hours ago

This feels like perfect being the enemy of good enough. There are examples where the system falls over but that doesn't mean that it completely negates the benefits.

It is very easy to get blinkered into thinking that the specific problems they're citing absolutely need to be solved, and quite possibly an element of trying to use that as an excuse to reduce some maintenance overhead without understanding its benefits.

bjornsing - 5 hours ago

Does it matter much if the key can be verified? I mean it seems like a pretty big step up security wise to know that a new version of a package is signed with the same key was previous versions.

dang - 24 minutes ago

Discussed at the time:

Removing PGP from PyPI - https://news.ycombinator.com/item?id=36044543 - May 2023 (187 comments)

jacques_chester - 6 hours ago

I performed a similar analysis on RubyGems and found that of the top 10k most-downloaded gems, less than one percent had valid signatures. That plus the general hassle of managing key material means that this was a dead-end for large scale adoption.

I'm still hopeful that sigstore will see wide adoption and bring authorial attestation (code signing) to the masses.

SethMLarson - 4 hours ago

I've authored a proposal to deprecate the expectation of PGP signatures for future CPython releases. We've been providing Sigstore signatures for CPython since 3.11.

https://peps.python.org/pep-0761/

aborsy - an hour ago

What’s the current best solution for associating a public key to an identity or person?

This is not related to cryptography protocols.

OpenPGP key server verifies email. Keybase was a good idea but seems dead. Maybe identity providers?

troismph - an hour ago

I am curious why we still need PyPI to hold packages: it may be better to install from github.

Github provides much better integrated experience: source code, issues, docs, etc.

rurban - 5 hours ago

On the other hand PGP keys were widely successful for cpan, the perl5 repo. It's very simple to use, not as complicated as with pypi.

upofadown - 4 hours ago

> Of those 1069 unique keys, about 30% of them were not discoverable on major public keyservers, making it difficult or impossible to meaningfully verify those signatures. Of the remaining 71%, nearly half of them were unable to be meaningfully verified at the time of the audit (2023-05-19).

A PGP keyserver provides no identity verification. It is simply a place to store keys. So I don't understand this statement. What is the ultimate goal here? I thought that things like this mostly provided a consistent identity for contributing entities with no requirement to know who the people behind the identities actually were in real life.

hiatus - 5 hours ago

Should be tagged 2023.

badgersnake - 5 hours ago

Most people do security badly so let’s not do it at all.

Right.

opello - 5 hours ago

Wouldn't another very good answer be for PyPI to have a keyserver and require keys be sent to it for them to be used in package publishing?

bee_rider - 5 hours ago

I dunno, not all projects are equally important or popular, so it seems to me that the number of downloads which had keys is the better metric to look at.

But, if there are fundamental issues with the key system anyway, the percentages don’t matter anyway.

nonameiguess - 4 hours ago

I feel like there is a broader issue being pushed aside here. Verifying a signature means you have a cryptographic guarantee that whoever generated an artifact possessed a private key associated with a public key. That key doesn't necessarily need to be published in a web-facing keystore to be useful. For packages associated with an OS-approved app store or a Linux distro's official repo, the store of trusted keys is baked into the package manager.

What value does that provide? As the installer of something, you almost never personally know the developer. You don't really trust them. At best, you trust the operating system vendor to sufficient vet contributors to a blessed app store. Whoever published package A is actually a maintainer of Arch Linux. Whoever published app B went through whatever the heck hoops Apple makes you go through. If malware gets through, some sort of process failed that can potentially be mediated.

If you're downloading a package from PyPI or RubyGems or crates.io or whatever, a web repository that does no vetting and allow anyone to publish anything, what assurance is this giving? Great, some package was legitimately published by a person who also published a public key. Who are they exactly? A pseudonym on Github with a cartoon avatar? Does that make them trustworthy? If they publish malware, what process can be changed to prevent that from happening again? As far as I can tell, nothing.

If you change the keystore provider to sigstore, what does that give you? Fulcio just requires that you control an e-mail address to issue you a signing key. They're not vetting you in any way or requiring you to disclose a real-world identity that can be pursued if you do something bad. It's a step up in a limited scope of use cases in which packages are published by corporate entities that control an e-mail domain and ideally use their own private artifact registry. It does nothing for public repositories in which anyone is allowed to publish anything.

Fundamentally, if a public repository allows anyone to publish anything, does no vetting and requires no real identity disclosure, what is the basis of trust? If you're going to say something like "well I'm looking for .whl files but only from Microsoft," then the answer is for Microsoft to host its own repository that you can download from, not for Microsoft to publish packages to PyPI.

There are examples of making this sort of simpler for the consumer to get everything from a single place. Docker Hub, for instance. You can choose to only ever pull official library images and verify them against sigstore, but that works because Docker is itself a well-funded corporate entity that restricts who can publish official library images by vetting and verifying real identities.

exabrial - 3 hours ago

Not Invented Here!

- 6 hours ago
[deleted]
throwaway984393 - 5 hours ago

[dead]