Thoughts on validating License-File presence in metadata version 2.1? #862

Open
hauntsaninja opened this issue Dec 17, 2024 · 1 comment

Comments

@hauntsaninja
Contributor

hauntsaninja commented Dec 17, 2024

Due to pypa/setuptools#4759, an extremely large number of wheels fail metadata validation.
In a venv of around 800 packages I had lying around, about 75% have metadata that fails validation for this reason.

  |   File ".../python3.11/site-packages/packaging/metadata.py", line 752, in from_raw
  |     raise ExceptionGroup("invalid metadata", exceptions)
  | ExceptionGroup: invalid metadata (1 sub-exception)
  +-+---------------- 1 ----------------
    | packaging.metadata.InvalidMetadata: license-file introduced in metadata version 2.4, not 2.1
    +------------------------------------

It seems like it will take many years before this stops being a regular issue for people who want to parse metadata (the setuptools issue is still open, and I learnt today that a wheel I uploaded today has METADATA that fails validation).

Given how widespread this is, should packaging.metadata offer some special handling (an opt-in or opt-out kwarg) of License-File presence in metadata version 2.1, beyond just turning validation off entirely?
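For illustration only, here is a rough sketch of what "special handling" could look like in caller code today, without any change to packaging. It filters the sub-exceptions of the validation error by the message text shown in the traceback above. The helper names (`remaining_errors`, `load_ignoring_license_file`) are hypothetical and not part of packaging, and matching on the message string is an assumption that could break if the wording changes.

```python
# Message prefix taken from the traceback above; matching on it is an assumption.
LICENSE_FILE_MARKER = "license-file introduced in metadata version"


def is_license_file_version_error(exc):
    """True if `exc` looks like the License-File/metadata-version complaint."""
    return LICENSE_FILE_MARKER in str(exc)


def remaining_errors(group):
    """Return the sub-exceptions of `group` that are NOT the License-File error."""
    return [exc for exc in group.exceptions if not is_license_file_version_error(exc)]


def load_ignoring_license_file(data):
    """Hypothetical wrapper: validate eagerly, tolerating only the License-File error.

    Requires the third-party `packaging` distribution and Python 3.11+
    (for the built-in ExceptionGroup).
    """
    from packaging.metadata import Metadata

    try:
        return Metadata.from_email(data)  # eager validation (the default)
    except ExceptionGroup as group:
        if remaining_errors(group):
            raise  # something other than the License-File error is wrong
        # Only the known License-File problem was reported: re-parse lazily.
        return Metadata.from_email(data, validate=False)
```

The trade-off in this sketch is that the metadata gets parsed twice when the only failure is the License-File one, which seemed acceptable for an illustration.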

@brettcannon
Member

So I think there's a not-so-painful workaround for this: if you're after eager validation, manually access every field except that one. In fact, if you look at what eager validation does, it just accesses every attribute on your behalf; there's nothing magical that one couldn't do in one's own code. This sort of scenario is exactly why eager validation was made optional (and also why we have raw metadata access).

So basically I'm not sure packaging needs to carry around the fix as suggested so much as people need to be aware of how to work around it. I would be fine with putting something in the docs showing how to validate a select number of keys, using this issue as the example. I would also be okay with adding an attribute that lists all the keys for a metadata version, so that validation is as simple as:

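from packaging.metadata import Metadata

# `data` holds the contents of a METADATA file (bytes or str).
meta = Metadata.from_email(data, validate=False)

# `bikeshed_later` stands in for the proposed attribute; it does not exist yet.
for attr in bikeshed_later["2.1"]:
    getattr(meta, attr)  # accessing an attribute triggers its validation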
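Until such an attribute exists, the same idea can be sketched with a hand-written list of attribute names. The helper below is a minimal sketch, not part of packaging: `validate_selected`, `ILLUSTRATIVE_ATTRS`, and `check_metadata` are hypothetical names, and the attribute list is a hand-picked subset chosen for illustration rather than the complete set of fields for any metadata version.

```python
# Hand-picked subset of Metadata attribute names, for illustration only.
ILLUSTRATIVE_ATTRS = ("metadata_version", "name", "version", "requires_python", "requires_dist")


def validate_selected(meta, attrs, skip=frozenset(), error_type=Exception):
    """Access each attribute in `attrs` (except those in `skip`) on `meta`.

    Returns a dict mapping attribute name to the exception raised on access,
    so an empty dict means every checked attribute validated cleanly.
    """
    errors = {}
    for name in attrs:
        if name in skip:
            continue
        try:
            getattr(meta, name)  # lazy validation happens on attribute access
        except error_type as exc:
            errors[name] = exc
    return errors


def check_metadata(data):
    """Hypothetical usage with packaging (third-party; imported lazily here)."""
    from packaging.metadata import InvalidMetadata, Metadata

    meta = Metadata.from_email(data, validate=False)
    return validate_selected(
        meta,
        ILLUSTRATIVE_ATTRS,
        skip={"license_files"},  # the field this issue is about
        error_type=InvalidMetadata,
    )
```

Collecting errors in a dict rather than raising on the first failure mirrors how eager validation reports all problems at once, but that choice is a matter of taste.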
