Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TypeGuard type narrowing on returning False #996

Open
ikamensh opened this issue Dec 27, 2021 · 12 comments
Open

TypeGuard type narrowing on returning False #996

ikamensh opened this issue Dec 27, 2021 · 12 comments
Labels
topic: feature Discussions about new features for Python's type annotations

Comments

@ikamensh
Copy link

I've faced limitation of not narrowing types when trying to add types in https://github.com/openai/gym/blob/8e5ae02ab13a89c976ce7b1278c21f755dfa4bd2/gym/spaces/box.py#L33 - (np.isscalar)[https://numpy.org/doc/stable/reference/generated/numpy.isscalar.html] returning False rules out SupportsFloat type, and given input type SupportsFloat | np.ndarray I now know it was narrowed down to np.ndarray.

Numpy has not added type guard yet in its stub, but this is the logical type for this function once all supported python versions will have it. Yet with current syntax it's not possible to describe the function narrowing type by returning False.

Proposal: give TypeGuard second optional argument: TypeGuard[A, B] should narrow argument type A | B to A if returning True, and to B if returning False. Old syntax should still work, where one can have TypeGuard[A] work as it does currently.

I didn't find this in rejected ideas of https://www.python.org/dev/peps/pep-0647/, so wanted to share. I'm not sure how much of stdlib could use this, but it definitely helps to be able to write function in the most semantically convenient way. I can express both "is_x" and "is_not_x" function in typesystem if this is added. This could have synergy with NonType currently being discussed.

Unclear yet: how best to describe situation where no type narrowing occurs on True return, but only on False. Maybe TypeGuard[Any, B] could have this semantics.

@ikamensh ikamensh added the topic: feature Discussions about new features for Python's type annotations label Dec 27, 2021
@Gobot1234
Copy link
Contributor

Relevant to #926

@erictraut
Copy link
Collaborator

I understand the motivation, but I think the proposed solution is problematic. Unless and until the Python type system adds support for default type arguments, all unspecified type arguments for a generic class are assumed to be Any. That means if some future version of TypeGuard were to accept two type arguments, TypeGuard[A] would be equivalent to TypeGuard[A, Any], which is not semantically equivalent to today's TypeGuard[A]. I suppose it's possible to special-case TypeGuard so it acts unlike any other subscripted type, but I don't think that's advisable.

There are other reasons why we rejected the idea of doing type narrowing in the negative ("else") case for TypeGuard. It's generally unsafe to make this assumption. To support this in a safe manner, the output type needs to be a proper subtype of the input type. For example, if the input parameter for the user-defined type guard function is annotated A | B and the return type is TypeGuard[A], it would be possible to safely narrow the type to B in the "else" case, but only if B and A are not subtypes of each other. Consider the case where these conditions are not met. For example, let's say that the input parameter for the user-defined type guard function is annotated object (or worse, Any) and the output is annotated as TypeGuard[A]. If the value passed to this user-defined type guard function is of type A | B, it's not safe to narrow the type to B in the "else" case.

Here's an alternative proposal. Perhaps we should consider amending the current TypeGuard mechanism to include the following rule:

Type narrowing is applied in the negative case only if the following conditions are met:

  1. The input parameter to the user-defined type guard function is a union.
  2. The return type of the user-defined type guard is one of the types within the input type union.

In this case, the type can be narrowed in the positive case to be the TypeGuard return type, and it can be narrowed in the negative case to be the input union with the TypeGuard return type eliminated.

For example:

def tg(val: A | B | C) -> TypeGuard[A]: ...

def func(x: A | B | C, y: A | B):
    if tg(x):
        reveal_type(x) # A
    else:
        reveal_type(x) # B | C

    if tg(y):
        reveal_type(y) # A
    else:
        reveal_type(y) # B

Am I correct in assuming this would address your use case?

This proposal would not require any new type arguments or new forms of TypeGuard, and it would require a relatively minor update to the logic in the various type checkers.

Thoughts?

@ikamensh
Copy link
Author

ikamensh commented Dec 29, 2021

Given the truth table for np.isscalar, I think this would be sufficient. It's interesting if the type system will be able to deduce following detail: it's essentially returning true for both SupportsFloat and str, but string is not a part of the input type in the particular case I'm interested in:

def isscalar(x: str | SupportsFloat | np.ndarray) -> TypeGuard[str | SupportsFloat]: ...

def func(a: SupportsFloat | np.ndarray):
  if not isscalar(a):
    reveal_type(a)  # desired: np.ndarray

Because TypeGuard is not strictly narrowing types currently, but allows upcasting, I'm wondering if it will suddenly consider it to be a string now. Re-reading the PEP, I still can't understand because List[str] is not a subtype of List[object] because of invariance rules. I thought str is a subtype of object + List is covariant.

On the bigger picture, you are mentioning that current handling of default values in type system would hinder this proposal. I think this is a current implementation detail and not a part of python spec, so my ideal would be to discuss desirable spec for python, agree on it, and then delay implementation as far as needed. But it's of course easier for me to say, I know you are maintaining pyright type checker and I've no idea how much work would such change be / what would be implications for the architecture.

To check if I'm getting another paragraph of yours, There are other reasons why we rejected the idea of doing type narrowing in the negative ("else") case for TypeGuard. - Are you saying that assuming any type narrowing without any further specs from the annotation is unsafe? Because just adding not to a typeguard should not change possible type transformations.

@ikamensh
Copy link
Author

Your alternative proposal doesn't require any change in annotations by users of type guards, and would strictly add type narrowing in a safe case, right?

@sobolevn
Copy link
Member

sobolevn commented Dec 29, 2021

I think that this rule would be super hard to explain:

Type narrowing is applied in the negative case only if the following conditions are met:

  1. The input parameter to the user-defined type guard function is a union.
  2. The return type of the user-defined type guard is one of the types within the input type union.

Instead, I think that we can add a second version of TypeGuard[IfCase, ElseCase]. When the second type argument is provided, then the user explicitly want to support the else case.

Some examples:

def first(o: object) -> TypeGuard[Literal[True]]: pass

b: bool
if first(b):
   reveal_type(b)  # Revealed type is "Literal[True]"
else: 
   reveal_type(b)  # Revealed type is "builtins.bool"

But:

def second(o: object) -> TypeGuard[Literal[True], Literal[False]]: pass

b: bool
if second(b):
   reveal_type(b)  # Revealed type is "Literal[True]"
else: 
   reveal_type(b)  # Revealed type is "Literal[False]"

I think that it is quite easy to teach, readable, backward-compatible, minimalistic.
The only problem is that this pattern of "variadic" or "generics with default params" is not common and requires a special-casing in type checkers.

@gvanrossum
Copy link
Member

I don't think Eric's proposal can work, since it changes behavior. One could have a type guard that returns true only for some values of the indicated type. For example:

def is_positive_int(a: int|str) -> TypeGuard[int]:
    return isinstance(a, int) and a >= 0

PEP 647 allows this, but the new rule Eric proposes would incorrectly assume the value is a string when this returns false.

However, we could introduce a variant, ExactTypeGuard[T], that uses Eric's rules. Or, alternative, ExactTypeGuard[T1, T2]. This would require a new PEP, but it would be a short one.

@erictraut
Copy link
Collaborator

Yeah, my proposal would admittedly change existing behavior. I have a hard time imagining any non-contrived cases breaking, but I suppose it's theoretically possible.

So the proposals are:

  1. Allow the existing TypeGuard form to accept either one or two type arguments. If one is provided, the behavior is as today. If a second one is provided, it provides a negative narrowing type.
  2. Introduce an ExactTypeGuard that takes two type arguments only.
  3. Introduce an ExactTypeGuard that acts like TypeGuard today with the added rules for conditionally narrowing in the negative case.

Normal generic classes cannot accept a variable number of type arguments today (at least not without the use of PEP 646’s TypeVarTuple). That makes options 1 a little odd, but perhaps it's OK to bend the rule here since TypeGuard is a special form.

Of these three options, I'm leaning toward option 1. I will prototype this in pyright and report back.

@erictraut
Copy link
Collaborator

I added provisional support for option 1 in pyright. It was pretty trivial to implement, and I do like the usability of it. If anyone would like to argue for one of the other two options (or would like to present a fourth option), please speak up.

I'll publish a new version of pyright (1.1.202) within the next twelve hours with the provisional support for option 1. Folks can then play with it and provide feedback. If the general consensus is that this is the best solution, we can write an amendment to PEP 647 (or write a follow-on smaller PEP if we think that's required in this case).

@erictraut
Copy link
Collaborator

I just published pyright 1.1.202, and it has provisional support for the two-argument form of TypeGuard. If a second argument is provided, it's assumed to be the negative narrowing type. Please give it a try and let me know if you have any feedback.

@erictraut
Copy link
Collaborator

I've started a thread in the typing-sig to get broader feedback about this proposal.
https://mail.python.org/archives/list/[email protected]/thread/EMUD2D424OI53DCWQ4H5L6SJD2IXBHUL/

@ikamensh
Copy link
Author

ikamensh commented Jan 7, 2022

Thanks for the test implementation, Eric. I've tested it and it works as described, and I would have use for this in projects I work on. However for the specific case that motivated creation of this issue, following would have to work:

from typing import TypeGuard

def is_str(x: str | int | object) -> TypeGuard[str, int | object]:
    return isinstance(x, str)

def foo(y: str | int):
    if is_str(y):
        return y.upper()
    else:
        return 2 + y  # error: Operator "+" not supported for types "int | object"

Here the type guard together with more restrictive type in foo could be used to infer that object is not a possible type for y in the else clause.

If this is asking too much, it would already be a big step forward to have else clause TypeGuard as currently in Pyright.

@MajorDallas
Copy link

Type narrowing is applied in the negative case only if the following conditions are met:

  1. The input parameter to the user-defined type guard function is a union.
  2. The return type of the user-defined type guard is one of the types within the input type union.

Honestly, this was the behavior I was expecting in the first place. I find the PEP-647 behavior extremely counter-intuitive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: feature Discussions about new features for Python's type annotations
Projects
None yet
Development

No branches or pull requests

6 participants