Proposal: syntax for variable and attribute annotations (PEP 526) #258
Comments
Very good proposal! I was thinking some time ago about how to introduce There is something that could probably be discussed here: should it be possible to add annotations to variables in
for i: int in my_iter():
    print(i+42)
with open('/folder/file', 'rb') as f: IO[bytes]:
    print(f.read()[-1])
"Practical" questions:
|
|
This comment is going to list things brought up in python-ideas that need more thought...
[UPDATES:]
|
Here are some thoughts about PEP 484 support using type comments on a for-statement, to indicate the type of the control variables, e.g.
for x, y in points:  # type: float, float
    # Here x and y are floats
    ...
This isn't implemented by mypy and doesn't strike me as super useful (typically type inference from points should suffice) but we could perhaps support this with the proposal under discussion:
for x: float, y: float in points:
    ...
However, this makes it a bit hard to find the iterable (
x: float
y: float
for x, y in points:
    ...
It's even worse for a with-statement -- we'd have to somehow support
with foo() as tt: Tuple[str, int]:
    ...
which would be unreasonable to ask our parser to handle -- the expression after the first |
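The alternative of declaring the control variables on their own lines before the loop can be sketched as follows. This is only an illustration, assuming the bare-annotation syntax under discussion is available; `weighted_sum` and `last_byte` are hypothetical names, not code from this thread.

```python
import io
from typing import IO, List, Tuple


def weighted_sum(points: List[Tuple[float, float]]) -> float:
    # Declare the loop variables' types up front instead of inside the
    # for-statement header. These two lines assign nothing.
    x: float
    y: float
    total = 0.0
    for x, y in points:
        total += x * y
    return total


def last_byte(data: bytes) -> int:
    # Same idea for a with-statement target: the annotation stays out of
    # the header, so the parser never has to find the end of the expression.
    f: IO[bytes]
    with io.BytesIO(data) as f:
        return f.read()[-1]


print(weighted_sum([(1.0, 2.0), (3.0, 4.0)]))
print(last_byte(b"abc"))
```

The iterable and the context-manager expression stay easy to spot because the statement headers are unchanged.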
OK, I agree, also the syntax:
x: float
y: float
for x, y in points:
    ...
emphasizes the fact that
Concerning why it is better than comments I have two typecheck-unrelated "practical" points to add (not sure if this was already mentioned):
|
Nick just proposed a: ClassAttr[x] instead of a: class X, which simplifies the grammar and is close to what I proposed to put in annotations anyway. So let's run with this. |
OK, I want to make this into a PEP in time for 3.6. And I want to hack on the implementation at the Core Devs Sprint Sept. 6-9 at Facebook, and get it into 3.6 beta 1. That should just about be possible. |
It appears to me as if |
@srkunze There is a convention for generics: if a type variable is missing it is assumed to be |
I actually think a dim color is a good thing here. Moreover, it's already true for many editors with docstrings and annotations as those are barely relevant for code execution. I find this very convenient as it helps me focusing on the important pieces. |
@srkunze I understand your point, but nevertheless I would like to differentiate two levels of "relevance": docstrings and type annotations are in some sense parts of "public API" while comments are not such parts. |
Got it. |
How about allowing the use of parenthesized If I'm repeating old proposals, I still think it will be helpful to document the reasons to reject it. |
That could make sense and can be another motivation of why local variables don't have annotations: they simply don't represent a "public API". |
Good point. |
It was brought up on python-ideas and I briefly mention it in the section "Multiple types/variables" in the original post above, but I expect that the syntax will be hairy, the benefits slight, and the readability poor. |
@gvanrossum I have a few short questions about the new PEP:
|
Oops, @kirbyfan64 and @phouse512 are the lucky lottery winners, and they are writing a draft in a repo cloned from the peps repo by the latter. If you have text to submit best post it here so they can cherry-pick it. I don't think the PEP should go into details about the AST nodes, though if you want to work on a reference implementation separately just go ahead! |
What is the point of variable declaration? You haven't given any justification, and I don't see any. |
It's less verbose, and makes the meaning clear. The existing notation is On Tuesday, August 9, 2016, Mark Shannon [email protected] wrote:
--Guido (mobile) |
Which existing notation? |
The type comments. Even when types can be inferred, sometimes adding them On Tuesday, August 9, 2016, Mark Shannon [email protected] wrote:
--Guido (mobile) |
Regarding typed instance variables with initializers, it might be worth comparing this proposal with the ipython traitlets project, which is a heavier version of that. |
@gvanrossum There is one important question: Currently, there is a statement in the draft
I think that the limitation for the right side is not necessary, and these could be allowed (both assignments currently work, if you omit the types):
a: Tuple[int, int] = 1, 2
b: Tuple[int, int, int, int] = *a, *a
If we are going to stick only to the left side limitation (one variable per type annotation), then it looks like there is a nice and simple way to change the grammar by adding a "declaration statement" (you choose the name). This requires only two small changes to
The above change requires to add an AST node:
If you agree with this, then I will proceed this way. |
@gvanrossum I tried my idea above and it turns out it requires a bit more work on the grammar to avoid ambiguity, so the actual implementation is a bit different from what is described above. I implemented the parse step (source to AST) with the rules described in the previous post and nice syntax errors; here is a snippet:
>>> x: int = 5; y: float = 10
>>> a: Tuple[int, int] = 1, 2
>>> b: Tuple[int, int, int, int] = *a, *a
>>> x, y: int
File "<stdin>", line 1
SyntaxError: only single variable can be type annotated
>>> 2+2: int
File "<stdin>", line 1
SyntaxError: can't type annotate operator
>>> True: bool = 0
File "<stdin>", line 1
SyntaxError: can't type annotate keyword
>>> x, *y, z: int = range(5)
File "<stdin>", line 1
SyntaxError: only single variable can be type annotated
>>> x:
File "<stdin>", line 1
x:
^
SyntaxError: invalid syntax
(Note: the set of allowed expressions for type annotations is exactly the same as for function annotations.)
If you think this is OK, I will continue with the actual compilation. |
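The same accept/reject behaviour can be probed without a REPL by handing candidate statements to the built-in `compile()`. This is a rough sketch against whatever interpreter runs it, not against the prototype described above; error messages differ between versions, so only acceptance is checked. The helper name `is_accepted` is an assumption made for this example.

```python
def is_accepted(source: str) -> bool:
    """Return True if `source` compiles as a statement, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


candidates = [
    "x: int = 5; y: float = 10",  # one annotated name per statement
    "x: int",                     # bare annotation, no value
    "obj.attr: int = 0",          # attribute as the annotated target
    "x, y: int",                  # several targets
    "2+2: int",                   # operator expression as target
    "x, *y, z: int = range(5)",   # unpacking combined with an annotation
    "x:",                         # annotation missing
]

# Map each candidate to whether the current interpreter accepts it.
results = {src: is_accepted(src) for src in candidates}
for src, ok in results.items():
    print(f"{src!r:32} -> {'accepted' if ok else 'SyntaxError'}")
```

Nothing is executed here, only compiled, so undefined names such as `obj` do not matter.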
@NeilGirdhar Regarding traitlets, that seems focused on runtime enforcement -- here we're interested in signalling types to the type checker. We're also only interested in a notation that reuses PEP 484 as the way to "spell" types (e.g. |
@ilevkivskyi I am not super committed to allowing only a single expression on the right, but I feel allowing it would just reopen the debate on why we are not allowing multiple variables on the left. Regarding your syntax: I'm glad you've got something working, but IIRC the parser looks ahead only one token, which means that if you have a rule
it will prevent the parser from also recognizing regular assignments
that start with a In addition I think we need to allow more than just a single name on the left -- I think we must allow at least My own thoughts on how to update the grammar are more along the lines of changing the existing
to
There will then have to be some additional manual checks to ensure that when a len(a) = 42 I guess we can quibble about whether the initializer should be |
Exactly, I implemented something very similar to the version you propose, and indeed I added some checks to Ah, you just didn't write it up that way in your previous comment. I'd accept anything that's acceptable as a single lvalue, so |
@gvanrossum Actually, I have just checked that the grammar in my implementation is equivalent to what you propose except with It looks like it does not lead to ambiguity, at least all the test suite runs OK and I tried to play with new syntax a bit without generating any errors in parser. |
@kirbyfan64 Yes that's right. And sorry, I agree that mega-threads stink no matter which medium you use. :-( |
So who wants to write the PEP update? |
I could do this in two-three hours, if someone is available sooner, please go ahead. |
I'll try to do it now a sec. |
Done. Note that I'm assuming |
@kirbyfan64 Yes, it is valid, but it will not store an annotation in |
@kirbyfan64 Your commit looks good. I have left three small comments on the commit. |
@ilevkivskyi Yes please go ahead and update the implementation. We're in
the final stretch!
|
Can my idea (of treating bare annotations the same way as global or nonlocal declarations) be added to the PEP as rejected? The closest thing it currently says is
but the same thing can be said for global and nonlocal, and that didn't stop you then. I'm sure you have valid reasons why this is fundamentally different, but I'd like to see them in the PEP. At some moment Guido said
and while this is a strong stance, I don't feel it's adequately explained - again, global and nonlocal statements are also "not comments with a funny syntax", but they are evaluated statically, no matter where they are put in the scope. I'm very worried about what semantics I should assign to re-annotations and conditional annotations.
In each of these cases, I'm extremely confused about what actually happens. And reading the PEP doesn't help very much. Are these illegal? What tool detects them (if any)? If the behavior is specified, where is the specification? (Note that all (or most) of these would be trivially flagged as errors if annotations were treated statically.) |
Please don't worry. The title of the PEP contains "Syntax", not "Semantics" because we don't want to impose any new type semantics (apart from addition of |
@vedgar I've added text to the PEP to explain (hopefully) your rejected proposal:
I have nothing to add to what @ilevkivskyi said about the semantics of redefinition -- that is to be worked out between type checkers. (Much like PEP 484 doesn't specify how type checkers should behave -- while it gives examples of suggested behavior, those examples are not normative, and there are huge gray areas where the PEP doesn't give any guidance. It's the same for PEP 526.) |
Reading your explanation, it just occurred to me that you could detect misnamings and treat bare annotations statically. At least, Python already does this when nonlocal is used (not for global, of course, since the name can be inserted dynamically afterwards).
If I end the input here with a blank line, I get "SyntaxError: no binding for nonlocal 'x' found". (But I can introduce x later, and get no error.) Python obviously doesn't evaluate x, but it somehow knows, statically, whether there is a variable it could refer to, even if it's assigned to later. Similar thing can be done here, I presume. "no binding for 'slef' found" could be reported without evaluating slef. But yeah, it gets hairy with more complicated names. Ok, I'll stop here. I'm happy with the current solution. At least I think so. :-) |
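The static `nonlocal` check described above can be reproduced with `compile()`, which reports the problem without running anything. This is a small sketch; the helper `compiles` and the three source snippets are made up for illustration.

```python
import textwrap


def compiles(source: str) -> bool:
    """Return True if the dedented source compiles, False on SyntaxError."""
    try:
        compile(textwrap.dedent(source), "<example>", "exec")
    except SyntaxError:
        return False
    return True


# No enclosing binding for x anywhere: rejected at compile time.
NO_BINDING = """
    def outer():
        def inner():
            nonlocal x
            x = 1
"""

# The binding appears lexically later in outer(): accepted.
BOUND_LATER = """
    def outer():
        def inner():
            nonlocal x
            x = 1
        x = 0
"""

# A misspelled name in an annotated target is not caught by the compiler.
MISSPELLED = """
    def method(self):
        slef.x: int
"""

print(compiles(NO_BINDING), compiles(BOUND_LATER), compiles(MISSPELLED))
```

The third case compiles because the compiler does no name resolution for annotated targets.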
@fperez @Carreau I am 👍 on introspect-ability.
|
@SylvainCorlay @fperez @Carreau Mypy already doesn't understand what's going on with libraries that use traitlets so I'm not particularly worried about how mypy should deal with this -- we'll cross that bridge when we get to it. Personally I hope to be able to define named tuples like this:
class User(NamedTuple):
    username: str
    userid: int
    first_name: Optional[str]
    last_name: Optional[str] |
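For reference, the class-based form hoped for here is accepted by `typing.NamedTuple` in current Python. The sketch below assumes that version of `typing`; the sample values are invented.

```python
from typing import NamedTuple, Optional


class User(NamedTuple):
    username: str
    userid: int
    first_name: Optional[str]
    last_name: Optional[str]


# Fields are positional, in declaration order; none has a default here.
guido = User("guido", 1, "Guido", None)
print(guido)
print(User._fields)
```

The bare annotations are what define the fields; no values are assigned in the class body.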
^ @ellisonbg @minrk. |
This looks cool! I already want to implement this. Maybe this can even go into |
@ilevkivskyi Can't you just add an |
@NeilGirdhar As I understand |
Feel free to get started on future_typing with this as the first feature!
|
Since you're in the mood of allowing expressivity expansion of the language, I have allowed my imagination to go wild a bit: what about Enums? Currently, there is a big discussion about how to declare Enum (and Flags and similar classes) whose member values "don't matter". I think it could be really cool if we could say
and be done with it. Of course, lack of unpacking and unavailability of class name are technical problems, and probably the best way would be just
but that's probably too big a change. However, the first variant is already possible to implement, and requires just a bit of philosophical adaptation, which I think is justified in this case. After all, blue, green and red are of "type" (Enum, to be precise) Color. If we really want to strengthen the bond between annotation and assignment, it's not too great a stretch to envision a metaclass treating bare annotations as "I don't care about assigned values" than just ordinary "missing default assigned values". What do you say? For me, it reads much better than the currently proposed abomination of assigning None, (), or some "magical object" like |
I think it's up to the enum folks to come up with that. I think the first
form is too noisy (you have to say 'Color' for each value) -- maybe we'll
end up relenting and allow `a, b, c: T` in beta 2 or so. Personally I am
happy though to give my enums a value: `red, green, blue = 1, 2, 3` or `red
= 1; green = 2` etc.
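The explicit-value style mentioned here works with the standard `enum` module as it is. A minimal sketch, with `Color` as the illustrative name:

```python
from enum import Enum


class Color(Enum):
    # Plain tuple unpacking in the class body; each name becomes a member.
    red, green, blue = 1, 2, 3


print(list(Color))
print(Color.green.value)
```

Each name assigned in the body becomes a member, so unpacking gives the compact one-line form without any new syntax.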
|
I'm happy with explicit values too. But there are people who really want to have the syntax for "don't care" values. I just think that having a class body where you have
(a and b are simple identifiers) and after the class definition you have |
Enums are begging for syntactic support. They may yet get it, if someone
writes the PEP for Python 3.7.
|
Hmm... by syntactic support, do you mean the "relenting" you have talked above (allowing unpacking on bare annotations, and maybe even a forward ("outward"?:) declaration without quotes), or you mean a full-fledged syntactic support like enum keyword? If we're going to go that route, I'm sure the general "make" keyword would be the better choice, and IIRC, you were against it. Or maybe you mean something in between? Like
This we can do right now, but it requires philosophical adjustment too. [The true solution is probably giving up the idea of the current |
Let's take this off this tracker, if you want to propose something,
python-ideas is the place to go.
|
Shouldn't we close this issue? PEP 526 is accepted and marked Final. |
I am glad to close this megathread. IIRC PEP 526 is accepted provisionally, but if there are further ideas, it is better to open a new issue. |
Introduction
This issue is reserved for substantive work on PEP 526, "Syntax for Variable and Attribute Annotations". For textual nits please comment directly on the latest PR for this PEP in the peps repo.
I sent a strawman proposal to python-ideas. The feedback was mixed but useful -- people tried to poke holes in it from many angles.
In this issue I want to arrive at a more solid specification. I'm out of time right now, but here are some notes:
`__init__` or `__new__`
`a: <type>` vs. how it strikes people the wrong way
`__annotations__`
Work in progress here!
I'm updating the issue description to avoid spamming subscribers to this tracker. I'll keep doing this until we have reasonable discussion.
Basic proposal
My basic observation is that introducing a new keyword has two downsides: (a) choice of a good keyword is hard (e.g. it can't be 'var' because that is way too common a variable name, and it can't be 'local' if we want to use it for class variables or globals,) and (b) no matter what we choose, we'll still need a `__future__` import.
So I'm proposing something keyword-free:
The idea is that this is pretty easy to explain to someone who's already familiar with function annotations.
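A rough sketch of the keyword-free form, using the same `name: type` spelling as function parameters. All names here (`primes`, `Starship`, `describe`) are illustrative assumptions.

```python
from typing import Dict, List

# Module level: annotation and initial value in one statement.
primes: List[int] = []


class Starship:
    # The same form inside a class body.
    stats: Dict[str, int] = {}


def describe(ship: Starship) -> str:
    # And inside a function, next to the familiar parameter annotation.
    label: str = "ship"
    return f"{label} with {len(ship.stats)} stats"


print(describe(Starship()))
```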
Multiple types/variables
An obvious question is whether to allow combining type declarations with tuple unpacking (e.g. `a, b, c = x`). This leads to (real or perceived) ambiguity, and I propose not to support this. If there's a type annotation there can only be one variable to its left, and one value to its right. This still allows tuple packing (just put the tuple in parentheses) but it disallows tuple unpacking. (It's been proposed to allow multiple parenthesized variable names, or types inside parentheses, but none of these look attractive to me.)
There's a similar question about what to do about the type of `a = b = c = x`. My answer to this is the same: Let's not go there; if you want to add a type you have to split it up.
Omitting the initial value
My next step is to observe that sometimes it's convenient to decouple the type declaration from the initialization. One example is a variable that is initialized in each branch of a big sequence of `if`/`elif`/etc. blocks, where you want to declare its type before entering the first `if`, and there's no convenient initial value (e.g. `None` is not valid because the type is not `Optional[...]`). So I propose to allow leaving out the assignment:
The line `log: Logger` looks a little odd at first but I believe you can get used to it easily. Also, it is again similar to what you can do in function annotations. (However, don't hyper-generalize. A line containing just `log` by itself means something different -- it's probably a `NameError`.)
Note that this is something that you currently can't do with `# type` comments -- you currently have to put the type on the (lexically) first assignment, like this:
(In this particular example, a type declaration may be needed because `heavy_logger()` returns a subclass of `Logger`, while other branches produce different subclasses; in general the type checker shouldn't just compute the common superclass because then a type error would just infer the type `object`.)
What about runtime
Suppose we have `a: int` -- what should this do at runtime? Is it ignored, or does it initialize `a` to `None`, or should we perhaps introduce something new like JavaScript's `undefined`? I feel quite strongly that it should leave `a` uninitialized, just as if the line was not there at all.
Instance variables and class variables
Based on working with mypy since last December I feel strongly that it's very useful to be able to declare the types of instance variables in class bodies. In fact this is one place where I find the value-less notation (`a: int`) particularly useful, to declare instance variables that should always be initialized by `__init__` (or `__new__`), e.g. variables whose type is mutable or cannot be `None`.
We still need a way to declare class variables, and here I propose some new syntax, prefixing the type with a `class` keyword:
I do have to admit that this is entirely unproven. PEP 484 and mypy currently don't have a way to distinguish between instance and class variables, and it hasn't been a big problem (though I think I've seen a few mypy bug reports related to mypy's inability to tell the difference).
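The instance/class distinction can be sketched in runnable form. One loud assumption: the `class`-prefixed spelling proposed here is not valid Python, so the sketch substitutes `typing.ClassVar` from the current `typing` module for it. `Account` and its attributes are hypothetical.

```python
from typing import ClassVar, List


class Account:
    # Class variable: one value shared by all instances.
    # (ClassVar stands in for the proposed `class` prefix.)
    interest_rate: ClassVar[float] = 0.02

    # Instance variables: declared without a value in the class body,
    # always initialised in __init__.
    owner: str
    history: List[str]

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.history = []  # mutable, so it must not be a shared class-level default


acct = Account("ada")
print(hasattr(Account, "interest_rate"), hasattr(Account, "owner"))
print(acct.owner, acct.history)
```

The value-less declarations create no class attributes, so a forgotten assignment in `__init__` shows up as an `AttributeError` rather than a silently shared default.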
Capturing the declared types at runtime
For function annotations, the types are captured in the function's `__annotations__` object. It would be an obvious extension of this idea to do the same thing for variable declarations. But where exactly would we store this info? A strawman proposal is to introduce `__annotations__` dictionaries at various levels. At each level, the types would go into the `__annotations__` dict at that same level. Examples:
Global variables
This would print `{'players': Dict[str, Player]}` (where the value is the runtime representation of the type `Dict[str, Player]`).
Class and instance variables:
This would print a dict with five keys, and corresponding values:
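As a hedged illustration of a class-level `__annotations__` dict with five entries: the attribute names below are assumptions, and `typing.ClassVar` is used in place of the proposed `class` prefix, which is not valid syntax.

```python
from typing import ClassVar, Dict


class Starship:
    hitpoints: ClassVar[int] = 50
    stats: ClassVar[Dict[str, int]] = {}
    damage: int = 0
    shield: int = 100
    captain: str  # no initial value


# One entry per annotated name, in declaration order.
for name, annotation in Starship.__annotations__.items():
    print(name, "->", annotation)
```

`captain` appears in the dict even though no class attribute of that name exists.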
Finally, locals. Here I think we should not store the types -- the value of having the annotations available locally is just not enough to offset the cost of creating and populating the dictionary on each function call.
In fact, I don't even think that the type expression should be evaluated during the function execution. So for example:
should not print anything. (A type checker would also complain that `side_effect()` is not a valid type.)
This is inconsistent with the behavior of
which does print something (at function definition time). But there's a limit to how much consistency I am prepared to propose. (OTOH for globals and class/instance variables I think that there would be some cool use cases for having the information available.)
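The treatment of locals can be checked with a small sketch. It assumes an interpreter that implements the behaviour proposed here, which current CPython does as far as the language reference describes; the `calls` list is an addition for observing whether the annotation expression ran.

```python
calls = []


def side_effect():
    # Record the call so evaluation of the annotation can be detected.
    calls.append("called")
    return int


def foo() -> int:
    a: side_effect()  # annotation on a local: neither evaluated nor stored
    a = 0
    return a


result = foo()
print(result, calls, foo.__annotations__)
```

Only the return annotation is recorded on the function; nothing about `a` is kept anywhere at runtime.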
Effect of presence of `a: <type>`
The presence of a local variable declaration without initialization still has an effect: it ensures that the variable is considered to be a local variable, and it is given a "slot" as if it was assigned to. So, for example:
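A minimal sketch of this, with hypothetical names; the second function and the `classify` helper are added only for contrast. In the sketch, `with_declaration()`

```python
def with_declaration():
    undeclared_value: int    # no value, but the name is now local
    return undeclared_value  # read before any assignment


def without_declaration():
    return undeclared_value  # falls through to a failing global lookup


def classify(func) -> str:
    """Report which exception calling `func` raises."""
    try:
        func()
    except UnboundLocalError:  # subclass of NameError, so test it first
        return "UnboundLocalError"
    except NameError:
        return "NameError"
    return "no error"


print(classify(with_declaration))
print(classify(without_declaration))
```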
will raise `UnboundLocalError`, not `NameError`. It's the same as if the code had read
Instance variables inside methods
Mypy currently supports `# type` comments on assignments to instance variables (and other things). At least for `__init__` (and `__new__`, and functions called from either) this seems useful, in case you prefer a style where instance variables are declared in `__init__` (etc.) rather than in the class body.
I'd like to support this, at least for cases that obviously refer to instance variables of `self`. In this case we should probably not update `__annotations__`.
What about `global` or `nonlocal`?
We should not change `global` and `nonlocal`. The reason is that those don't declare new variables, they declare that an existing variable is write-accessible in the current scope. Their type belongs in the scope where they are defined.
Redundant declarations
I propose that the Python compiler should ignore duplicate declarations of the same variable in the same scope. It should also not bother to validate the type expression (other than evaluating it when not in a local scope). It's up to the type checker to complain about this. The following nonsensical snippet should be allowed at runtime:
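A hedged sketch of the kind of code meant here, wrapped in a function so it can be called; the name `redundant` and the string annotation are invented for the example.

```python
def redundant() -> int:
    a: int
    a: str                     # conflicting redeclaration: a type checker's job to flag
    b: "anything goes" = 2     # not a meaningful type, but not validated at runtime
    a = 1
    return a + b


print(redundant())
```

The function compiles and runs normally; any complaint about `a` or `b` would come from a separate type checker.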