Problem with str being treated as Sequence[str] #5090
This is interesting. We can suggest workarounds etc. for the example code,
but I wonder -- does any code that declares it wants a Sequence[str] really
expect a single str object?
We could remove the Sequence[str] and instead add explicit dunder methods
to str objects. It would still be seen as Sized,
Iterable[str], Container[str] (in 3.6+), and Reversible[str], because those
are all Protocols. But Sequence is not, and (for the first time ever!) we
would have a way to state that a function takes a Sequence[str] but not a
str.
Not sure how much code would break...
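As a small runtime illustration of the point above (an editorial sketch, not part of the original comment), the standard-library ABCs already report that a str satisfies these interfaces, Sequence included. The variable names are illustrative only.

```python
from collections.abc import Container, Iterable, Sequence, Sized

s = "abc"

# These ABCs are checked structurally: they only look for the relevant
# dunder methods (__len__, __iter__, __contains__) on the class.
structural = {abc.__name__: isinstance(s, abc) for abc in (Sized, Iterable, Container)}

# Sequence is a nominal ABC; str passes because it is registered with it.
is_sequence = isinstance(s, Sequence)

# Indexing a str yields another str, so the nesting never bottoms out.
first = s[0]
same_all_the_way_down = first == first[0] == first[0][0]

print(structural, is_sequence, same_all_the_way_down)
```

Because every element of a str is itself a str, a checker that sees str as a Sequence has no structural reason to reject it where Sequence[str] is expected.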
|
@gvanrossum Thank you for the reply. One more thing (which is probably obvious to someone more knowledgeable): why does mypy treat these two cases differently?
d1 = {'k1': 'qwe', 'k2': []}
reveal_type(d1)
d2 = {'k1': [1], 'k2': []}
reveal_type(d2)
mypy:
I'd say the type of the dict's value in the second case should be, analogously, EDIT: I think I can guess: in the first case the empty list is treated kind of like an empty string, which is quite wrong. |
The second example is a known issue #5045 |
Yep! For example, in Hypothesis we have So at the very least, I have a strong concrete (as well as aesthetic) preference to keep treating |
If we went this way, adding the type hint to sampled_from would be the
right solution there.
It sounds like there's no way to please everybody.
|
I also have a use case for this, where I have parsers some of which expect |
I think both the runtime behavior of treating a string as instance of
def process_words(seq: Iterable[str]) -> Iterable[str]:
    return [s.upper() for s in seq]
where the intended use case is
process_words(['hello', 'world'])  # ['HELLO', 'WORLD']
and an unintended side effect of the implementation is
process_words('hello world')  # ['H', 'E', 'L', 'L', 'O', ' ', 'W', 'O', 'R', 'L', 'D']
Given that the runtime behavior of |
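Since the type checker accepts a bare str here, one common stopgap is a runtime guard. The following is a minimal sketch of that idea, not something proposed in this comment; the explicit isinstance check and the error message are assumptions added for illustration.

```python
from typing import Iterable, List


def process_words(seq: Iterable[str]) -> List[str]:
    """Upper-case each word, rejecting a bare string (hypothetical guard)."""
    # A single str is itself an Iterable[str], so it must be excluded by hand.
    if isinstance(seq, str):
        raise TypeError("expected an iterable of words, got a single str")
    return [s.upper() for s in seq]


print(process_words(['hello', 'world']))

try:
    process_words('hello world')
except TypeError as exc:
    print(f"rejected: {exc}")
```

The guard only catches the mistake at run time; a static checker would still accept the second call without complaint.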
But |
Yes, and this is actually really nice, having to deal with character types adds a lot of unneeded complexity. The correct way to deal with this is that whenever you want an |
A clean, albeit somewhat ugly, solution would be to introduce new types in
Example:
x: typing.NewIterable[str] = ['hello', 'world']  # typechecks
y: typing.NewIterable[str] = 'hello world'  # doesn't typecheck
isinstance(['hello', 'world'], collections.abc.NewIterable)  # True
isinstance('hello world', collections.abc.NewIterable)  # False
I think this is doable, but alas
What should be the names of the new In the spirit of the Zen of Python (explicit is better than implicit, there should be one obvious way to do it, yada yada) annotations like |
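The runtime half of this idea can be approximated today with a custom ABC. The sketch below is an assumption-laden illustration: NonStrIterable is a hypothetical name, it is not part of typing or collections.abc, and static checkers will not treat it specially, so only the isinstance behavior is reproduced.

```python
from abc import ABCMeta


class NonStrIterable(metaclass=ABCMeta):
    """Hypothetical runtime-only ABC: any iterable type other than str."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is not NonStrIterable:
            return NotImplemented
        # Explicitly exclude str (and its subclasses) from the check.
        if issubclass(C, str):
            return False
        # Otherwise accept any class that defines __iter__ somewhere in its MRO.
        if any("__iter__" in B.__dict__ for B in C.__mro__):
            return True
        return NotImplemented


print(isinstance(['hello', 'world'], NonStrIterable))
print(isinstance('hello world', NonStrIterable))
```

Getting the same distinction at type-check time is the hard part discussed in this thread, since a checker sees str as structurally compatible with Iterable[str].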
experiencing the same issue, any update? |
One way to deal with this would be to introduce a new optional error that is generated when |
Again, My understanding of the goal of mypy as a project is that it's a type system for Python as it exists, not as it might be if it were slightly nicer. |
@JukkaL how challenging do you think it would be to implement a putative
For example:
def just_sequence(vals: Iterable[str]) -> Any:
    ...
def sequence_or_str(val_or_vals: Union[str, Iterable[str]]) -> Any:
    ...
# mypy error or warning if `--warn-iterable-anystr` or `--no-iterable-anystr` is provided
just_sequence("bad")
just_sequence(["this", "is", "fine"])
sequence_or_str("this_is_fine")
sequence_or_str(["this", "is", "fine"])
This feels like a longstanding issue, e.g. python/typing#256, and there's strong evidence that this is both a common error and a pain point in the language. So much so that it's one of the very few deviations between Python's and Starlark's semantics. However, it seems clear from the discussion here and elsewhere that a I don't believe this can be implemented through |
I for one would make heavy use of an ability to specify that I expect some strings to not be treated as
NestedIntList = Union[int, Sequence["NestedIntList"]]
def complex_interpreter(src: str) -> NestedIntList:
    nesting = src.count("[")
    result: NestedIntList = src.replace("[", "")
    for _ in range(nesting):
        result = [result]
    return result
$ pyright demo6.py
0 errors, 0 warnings, 0 informations
$ mypy demo6.py
Success: no issues found in 1 source file
$ pytype demo6.py
Success: no errors found
I'd be heavily in favor of a deprecation cycle that allows str to no longer typecheck as Sequence, except for an opt-in annotation of |
Closing as a duplicate of the more popular #11001. I've also added a |
Are you reporting a bug, or opening a feature request?
I think it's a bug or very unexpected behavior at least. Related issues I found: #1965, #4334.
Code
Actual Output
So, as far as I understand, mypy tried to find the strictest type for the values of d, which is Sequence[str] in this case. In Python, a string is a Sequence of (sub)strings (s[0] == s[0][0] == s[0][0][0]), and a list is also a Sequence, so this is understandable. Unfortunately, the rule that a string is a sequence of (sub)strings does not work in the case of value: str = d['k1'].
Expected output
To get rid of the error I need to manually give a more general type: d: Dict[str, Any], so this is not such a big problem. Having said that, the expected behavior would be: no "Incompatible types in assignment" error for the given code.
Versions
Python 3.6.5
mypy==0.600
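The workaround described in the report can be sketched as follows. The original code listing did not survive in this copy of the issue, so the dict contents here are an assumption modeled on the follow-up comment; the second variant, with an explicit Union and isinstance narrowing, is an alternative that does not come from the report.

```python
from typing import Any, Dict, List, Union

# Workaround from the report: widen the value type explicitly, so the
# checker no longer infers a Sequence[str] join for the values.
d: Dict[str, Any] = {'k1': 'qwe', 'k2': []}
value: str = d['k1']

# Alternative (assumption, not from the report): keep the value type
# precise with a Union and narrow it before use.
d2: Dict[str, Union[str, List[str]]] = {'k1': 'qwe', 'k2': []}
item = d2['k1']
if isinstance(item, str):
    narrowed: str = item

print(value, narrowed)
```

The Any annotation silences the error at the cost of losing checking on the values; the Union variant keeps checking but needs a narrowing step at each use.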