CVE-2019-19275
Description
typed_ast 1.3.0/1.3.1 has an out-of-bounds read in ast_for_arguments, potentially allowing a crash via malicious Python source parsing.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Vulnerability Description
typed_ast versions 1.3.0 and 1.3.1, as well as certain Python 3.8.0-alpha prereleases, contain an out-of-bounds read in the ast_for_arguments function. After consuming a double-star (keyword-arguments) parameter and its name, the function tested whether the next child node was a comma without first checking that any child remained, so it could read past the end of the node's child array [1]. The affected code reached CPython when typed_ast was merged back into it (bpo-35766), which is how the alpha prereleases came to be affected. The flaw is triggered by parsing Python source that contains specific function-parameter constructs.
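The missing bounds check can be illustrated with a toy model. This is a sketch only: a Python list stands in for the C parse-tree node, `len(children)` plays the role of `NCH(n)`, and the helper name `skip_kwarg` is hypothetical and does not appear in typed_ast or CPython.

```python
# Toy model only: a Python list stands in for the C parse-tree node and
# len(children) plays the role of NCH(n). This is not the CPython code.
COMMA = ","


def skip_kwarg(children, i, checked=True):
    """Advance past '**', the name, and an optional trailing comma.

    Returns the new index. With checked=False the function mimics the
    unpatched logic and indexes past the end when no comma follows.
    """
    i += 2  # the double star and the name
    if checked:
        if i < len(children) and children[i] == COMMA:
            i += 1  # the comma, if present
    else:
        if children[i] == COMMA:  # may index past the end
            i += 1
    return i


# '**kwargs' is the last child: nothing follows the name.
children = ["**", "kwargs"]
print(skip_kwarg(children, 0))  # patched logic stops at the end of the list
try:
    skip_kwarg(children, 0, checked=False)
except IndexError as exc:
    # Python raises here; the C code instead read memory past the array.
    print("out of bounds:", exc)
```

Python turns the bad index into an exception, whereas the C code has no such safety net, which is why the real bug is undefined behavior rather than a clean error.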
Exploitation Conditions
An attacker does not need the target to execute any code: supplying Python source that triggers the out-of-bounds read during parsing is enough. This makes the issue relevant to services that parse Python code without executing it, such as certain static analysis tools or web-based code evaluation endpoints [1][2]. No special privileges are required where such services accept source without authentication [2]. The out-of-bounds read is undefined behavior, which may manifest as a crash.
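For a service of the kind described above, one defensive pattern is to parse untrusted source in a child process so that a parser crash does not take down the service itself. The sketch below is an illustration, not a fix for the bug: it uses the standard-library `ast` module as a stand-in for typed_ast, and the function name `parse_in_child` and the result labels are hypothetical.

```python
import subprocess
import sys

# Child program: read source bytes from stdin and parse them without
# executing them. The stdlib ast module stands in for typed_ast here.
_CHILD = "import ast, sys; ast.parse(sys.stdin.buffer.read())"


def parse_in_child(source, timeout=10.0):
    """Return 'ok', 'rejected', 'timeout', or 'crashed' for the source text."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD],
            input=source.encode("utf-8"),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    if proc.returncode == 0:
        return "ok"
    # An uncaught Python exception such as SyntaxError exits with status 1.
    # Any other status (negative on POSIX when killed by a signal such as
    # SIGSEGV) is treated as a crash of the child interpreter.
    return "rejected" if proc.returncode == 1 else "crashed"


if __name__ == "__main__":
    print(parse_in_child("x = 1\n"))
    print(parse_in_child("def f(:\n"))
```

Isolation of this kind only contains the crash; it costs a process launch per request and does not remove the underlying out-of-bounds read.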
Impact
Successful exploitation causes a denial-of-service (DoS) condition: the Python interpreter or the application using typed_ast crashes. An attacker who repeatedly submits malicious input can keep the service unavailable [1][2]. No code execution or data theft is expected from this vulnerability alone.
Mitigation
The vulnerability is addressed in typed_ast version 1.3.2 and later. Users are advised to upgrade to the latest version. Additionally, the fix has been integrated into CPython starting from Python 3.8.0 beta releases [1][3][4]. No workarounds are available, so upgrading is the recommended course of action.
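A deployment can check whether its installed typed-ast falls inside the affected range (>= 1.3.0, < 1.3.2). The following is a minimal sketch: the helper names are hypothetical, it relies on `importlib.metadata` from the standard library (Python 3.8 and later), and it compares only the leading numeric part of the version string, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata

AFFECTED_MIN = (1, 3, 0)  # first affected release, per the advisory range
FIRST_FIXED = (1, 3, 2)   # first patched release


def release_tuple(version):
    """Extract the leading numeric 'X.Y[.Z]' part of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version):
    """True when the version lies in the range >= 1.3.0, < 1.3.2."""
    return AFFECTED_MIN <= release_tuple(version) < FIRST_FIXED


def installed_typed_ast_is_affected():
    """Return True or False, or None when typed-ast is not installed."""
    try:
        return is_affected(metadata.version("typed-ast"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_typed_ast_is_affected())
```

A result of `True` means the environment should be upgraded to 1.3.2 or later.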
AI Insight generated on May 21, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| typed-ast (PyPI) | >= 1.3.0, < 1.3.2 | 1.3.2 |
Affected products (4)
- typed_ast/typed_ast (description)
- ghsa-coords, 3 versions: pkg:pypi/typed-ast; pkg:rpm/opensuse/python-typed-ast&distro=openSUSE%20Leap%2015.1; pkg:rpm/suse/python-typed-ast&distro=SUSE%20Package%20Hub%2015%20SP1
  >= 1.3.0, < 1.3.2 (+ 2 more)
- (no CPE) range: >= 1.3.0, < 1.3.2
- (no CPE) range: < 1.3.1-lp151.2.6.1
- (no CPE) range: < 1.3.1-bp151.2.6.1
Patches (4)
dc317ac9cff8 Fix two out-of-bounds array reads (#99)
1 file changed · +2 −2
ast3/Python/ast.c +2 −2 modified
@@ -1445,7 +1445,7 @@ handle_keywordonly_args(struct compiling *c, const node *n, int start,
                 goto error;
             asdl_seq_SET(kwonlyargs, j++, arg);
             i += 1; /* the name */
-            if (TYPE(CHILD(n, i)) == COMMA)
+            if (i < NCH(n) && TYPE(CHILD(n, i)) == COMMA)
                 i += 1; /* the comma, if present */
             break;
         case TYPE_COMMENT:
@@ -1644,7 +1644,7 @@ ast_for_arguments(struct compiling *c, const node *n)
                 if (!kwarg)
                     return NULL;
                 i += 2; /* the double star and the name */
-                if (TYPE(CHILD(n, i)) == COMMA)
+                if (i < NCH(n) && TYPE(CHILD(n, i)) == COMMA)
                     i += 1; /* the comma, if present */
                 break;
             case TYPE_COMMENT:
a4d78362397f bpo-36495: Fix two out-of-bounds array reads (GH-12641)
1 file changed · +2 −2
Python/ast.c +2 −2 modified
@@ -1400,7 +1400,7 @@ handle_keywordonly_args(struct compiling *c, const node *n, int start,
                 goto error;
             asdl_seq_SET(kwonlyargs, j++, arg);
             i += 1; /* the name */
-            if (TYPE(CHILD(n, i)) == COMMA)
+            if (i < NCH(n) && TYPE(CHILD(n, i)) == COMMA)
                 i += 1; /* the comma, if present */
             break;
         case TYPE_COMMENT:
@@ -1599,7 +1599,7 @@ ast_for_arguments(struct compiling *c, const node *n)
                 if (!kwarg)
                     return NULL;
                 i += 2; /* the double star and the name */
-                if (TYPE(CHILD(n, i)) == COMMA)
+                if (i < NCH(n) && TYPE(CHILD(n, i)) == COMMA)
                     i += 1; /* the comma, if present */
                 break;
             case TYPE_COMMENT:
dcfcd146f8e6 bpo-35766: Merge typed_ast back into CPython (GH-11645)
30 files changed · +2039 −651
Doc/library/ast.rst +18 −1 modified
@@ -126,16 +126,33 @@ The abstract grammar is currently defined as follows:
 Apart from the node classes, the :mod:`ast` module defines these utility functions
 and classes for traversing abstract syntax trees:
-.. function:: parse(source, filename='<unknown>', mode='exec')
+.. function:: parse(source, filename='<unknown>', mode='exec', *, type_comments=False)
    Parse the source into an AST node. Equivalent to ``compile(source,
    filename, mode, ast.PyCF_ONLY_AST)``.
+   If ``type_comments=True`` is given, the parser is modified to check
+   and return type comments as specified by :pep:`484` and :pep:`526`.
+   This is equivalent to adding :data:`ast.PyCF_TYPE_COMMENTS` to the
+   flags passed to :func:`compile()`. This will report syntax errors
+   for misplaced type comments. Without this flag, type comments will
+   be ignored, and the ``type_comment`` field on selected AST nodes
+   will always be ``None``. In addition, the locations of ``# type:
+   ignore`` comments will be returned as the ``type_ignores``
+   attribute of :class:`Module` (otherwise it is always an empty list).
+
+   In addition, if ``mode`` is ``'func_type'``, the input syntax is
+   modified to correspond to :pep:`484` "signature type comments",
+   e.g. ``(str, int) -> List[str]``.
+
    .. warning::
       It is possible to crash the Python interpreter with a
       sufficiently large/complex string due to stack depth limitations
       in Python's AST compiler.
+   .. versionchanged:: 3.8
+      Added ``type_comments=True`` and ``mode='func_type'``.
+
 .. function:: literal_eval(node_or_string)
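The documentation change above describes the `type_comments` flag and the `'func_type'` mode. A short sketch of both follows, assuming Python 3.8 or later; the source strings are made up for illustration.

```python
import ast

source = (
    "x = []  # type: List[int]\n"
    "y = f()  # type: ignore\n"
)

# Without the flag, type comments are dropped.
plain = ast.parse(source)
# With the flag, they are attached to the nodes that support them.
typed = ast.parse(source, type_comments=True)

print(plain.body[0].type_comment)  # None
print(typed.body[0].type_comment)  # the text after '# type:'
print([ignore.lineno for ignore in typed.type_ignores])

# mode='func_type' parses a PEP 484 signature type comment.
sig = ast.parse("(str, int) -> List[str]", mode="func_type")
print(len(sig.argtypes), ast.dump(sig.returns))
```

The source is only parsed, never executed, so the undefined names `List` and `f` do not matter here.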
Doc/library/token-list.inc +4 −0 modified
@@ -203,6 +203,10 @@
 .. data:: OP
+.. data:: TYPE_IGNORE
+
+.. data:: TYPE_COMMENT
+
 .. data:: ERRORTOKEN
 .. data:: N_TOKENS
Doc/library/token.rst +10 −0 modified
@@ -69,6 +69,13 @@ the :mod:`tokenize` module.
    always be an ``ENCODING`` token.
+.. data:: TYPE_COMMENT
+
+   Token value indicating that a type comment was recognized. Such
+   tokens are only produced when :func:`ast.parse()` is invoked with
+   ``type_comments=True``.
+
+
 .. versionchanged:: 3.5
    Added :data:`AWAIT` and :data:`ASYNC` tokens.
@@ -78,3 +85,6 @@ the :mod:`tokenize` module.
 .. versionchanged:: 3.7
    Removed :data:`AWAIT` and :data:`ASYNC` tokens. "async" and "await" are
    now tokenized as :data:`NAME` tokens.
+
+.. versionchanged:: 3.8
+   Added :data:`TYPE_COMMENT`.
Grammar/Grammar +22 −9 modified
@@ -7,7 +7,9 @@
 # single_input is a single interactive statement;
 # file_input is a module or sequence of commands read from an input file;
 # eval_input is the input for the eval() functions.
+# func_type_input is a PEP 484 Python 2 function type comment
 # NB: compound_stmt in single_input is followed by extra NEWLINE!
+# NB: due to the way TYPE_COMMENT is tokenized it will always be followed by a NEWLINE
 single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
 file_input: (NEWLINE | stmt)* ENDMARKER
 eval_input: testlist NEWLINE* ENDMARKER
@@ -17,14 +19,14 @@ decorators: decorator+
 decorated: decorators (classdef | funcdef | async_funcdef)
 async_funcdef: 'async' funcdef
-funcdef: 'def' NAME parameters ['->' test] ':' suite
+funcdef: 'def' NAME parameters ['->' test] ':' [TYPE_COMMENT] func_body_suite
 parameters: '(' [typedargslist] ')'
-typedargslist: (tfpdef ['=' test] (',' tfpdef ['=' test])* [',' [
-        '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
-      | '**' tfpdef [',']]]
-  | '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
-  | '**' tfpdef [','])
+typedargslist: (tfpdef ['=' test] (',' [TYPE_COMMENT] tfpdef ['=' test])* (TYPE_COMMENT | [',' [TYPE_COMMENT] [
+        '*' [tfpdef] (',' [TYPE_COMMENT] tfpdef ['=' test])* (TYPE_COMMENT | [',' [TYPE_COMMENT] ['**' tfpdef [','] [TYPE_COMMENT]]])
+      | '**' tfpdef [','] [TYPE_COMMENT]]])
+  | '*' [tfpdef] (',' [TYPE_COMMENT] tfpdef ['=' test])* (TYPE_COMMENT | [',' [TYPE_COMMENT] ['**' tfpdef [','] [TYPE_COMMENT]]])
+  | '**' tfpdef [','] [TYPE_COMMENT])
 tfpdef: NAME [':' test]
 varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
         '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
@@ -39,7 +41,7 @@ simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
 small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
              import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
 expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) |
-                     ('=' (yield_expr|testlist_star_expr))*)
+                     [('=' (yield_expr|testlist_star_expr))+ [TYPE_COMMENT]] )
 annassign: ':' test ['=' (yield_expr|testlist)]
 testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [',']
 augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
@@ -71,13 +73,13 @@ compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef
 async_stmt: 'async' (funcdef | with_stmt | for_stmt)
 if_stmt: 'if' namedexpr_test ':' suite ('elif' namedexpr_test ':' suite)* ['else' ':' suite]
 while_stmt: 'while' test ':' suite ['else' ':' suite]
-for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
+for_stmt: 'for' exprlist 'in' testlist ':' [TYPE_COMMENT] suite ['else' ':' suite]
 try_stmt: ('try' ':' suite
            ((except_clause ':' suite)+
             ['else' ':' suite]
             ['finally' ':' suite] |
            'finally' ':' suite))
-with_stmt: 'with' with_item (',' with_item)* ':' suite
+with_stmt: 'with' with_item (',' with_item)* ':' [TYPE_COMMENT] suite
 with_item: test ['as' expr]
 # NB compile.c makes sure that the default except clause is last
 except_clause: 'except' [test ['as' NAME]]
@@ -150,3 +152,14 @@ encoding_decl: NAME
 yield_expr: 'yield' [yield_arg]
 yield_arg: 'from' test | testlist_star_expr
+
+# the TYPE_COMMENT in suites is only parsed for funcdefs,
+# but can't go elsewhere due to ambiguity
+func_body_suite: simple_stmt | NEWLINE [TYPE_COMMENT NEWLINE] INDENT stmt+ DEDENT
+
+func_type_input: func_type NEWLINE* ENDMARKER
+func_type: '(' [typelist] ')' '->' test
+# typelist is a modified typedargslist (see above)
+typelist: (test (',' test)* [','
+       ['*' [test] (',' test)* [',' '**' test] | '**' test]]
+     | '*' [test] (',' test)* [',' '**' test] | '**' test)
Grammar/Tokens +2 −0 modified
@@ -55,6 +55,8 @@
 ELLIPSIS '...'
 COLONEQUAL ':='
 OP
+TYPE_IGNORE
+TYPE_COMMENT
 ERRORTOKEN
 # These aren't used by the C tokenizer but are needed for tokenize.py
Include/compile.h +3 −2 modified
@@ -22,6 +22,7 @@ PyAPI_FUNC(PyCodeObject *) PyNode_Compile(struct _node *, const char *);
 #define PyCF_DONT_IMPLY_DEDENT 0x0200
 #define PyCF_ONLY_AST 0x0400
 #define PyCF_IGNORE_COOKIE 0x0800
+#define PyCF_TYPE_COMMENTS 0x1000
 #ifndef Py_LIMITED_API
 typedef struct {
@@ -85,10 +86,10 @@ PyAPI_FUNC(int) _PyAST_Optimize(struct _mod *, PyArena *arena, int optimize);
 #endif /* !Py_LIMITED_API */
-/* These definitions must match corresponding definitions in graminit.h.
-   There's code in compile.c that checks that they are the same. */
+/* These definitions must match corresponding definitions in graminit.h. */
 #define Py_single_input 256
 #define Py_file_input 257
 #define Py_eval_input 258
+#define Py_func_type_input 345
 #endif /* !Py_COMPILE_H */
Include/graminit.h +4 −0 modified
@@ -88,3 +88,7 @@
 #define encoding_decl 341
 #define yield_expr 342
 #define yield_arg 343
+#define func_body_suite 344
+#define func_type_input 345
+#define func_type 346
+#define typelist 347
Include/parsetok.h +1 −0 modified
@@ -37,6 +37,7 @@ typedef struct {
 #define PyPARSE_IGNORE_COOKIE 0x0010
 #define PyPARSE_BARRY_AS_BDFL 0x0020
+#define PyPARSE_TYPE_COMMENTS 0x0040
 PyAPI_FUNC(node *) PyParser_ParseString(const char *, grammar *, int, perrdetail *);
Include/Python-ast.h+64 −30 modified@@ -46,14 +46,17 @@ typedef struct _alias *alias_ty; typedef struct _withitem *withitem_ty; +typedef struct _type_ignore *type_ignore_ty; + enum _mod_kind {Module_kind=1, Interactive_kind=2, Expression_kind=3, - Suite_kind=4}; + FunctionType_kind=4, Suite_kind=5}; struct _mod { enum _mod_kind kind; union { struct { asdl_seq *body; + asdl_seq *type_ignores; } Module; struct { @@ -64,6 +67,11 @@ struct _mod { expr_ty body; } Expression; + struct { + asdl_seq *argtypes; + expr_ty returns; + } FunctionType; + struct { asdl_seq *body; } Suite; @@ -88,6 +96,7 @@ struct _stmt { asdl_seq *body; asdl_seq *decorator_list; expr_ty returns; + string type_comment; } FunctionDef; struct { @@ -96,6 +105,7 @@ struct _stmt { asdl_seq *body; asdl_seq *decorator_list; expr_ty returns; + string type_comment; } AsyncFunctionDef; struct { @@ -117,6 +127,7 @@ struct _stmt { struct { asdl_seq *targets; expr_ty value; + string type_comment; } Assign; struct { @@ -137,13 +148,15 @@ struct _stmt { expr_ty iter; asdl_seq *body; asdl_seq *orelse; + string type_comment; } For; struct { expr_ty target; expr_ty iter; asdl_seq *body; asdl_seq *orelse; + string type_comment; } AsyncFor; struct { @@ -161,11 +174,13 @@ struct _stmt { struct { asdl_seq *items; asdl_seq *body; + string type_comment; } With; struct { asdl_seq *items; asdl_seq *body; + string type_comment; } AsyncWith; struct { @@ -421,6 +436,7 @@ struct _arguments { struct _arg { identifier arg; expr_ty annotation; + string type_comment; int lineno; int col_offset; int end_lineno; @@ -442,26 +458,40 @@ struct _withitem { expr_ty optional_vars; }; +enum _type_ignore_kind {TypeIgnore_kind=1}; +struct _type_ignore { + enum _type_ignore_kind kind; + union { + struct { + int lineno; + } TypeIgnore; + + } v; +}; + // Note: these macros affect function definitions, not only call sites. 
-#define Module(a0, a1) _Py_Module(a0, a1) -mod_ty _Py_Module(asdl_seq * body, PyArena *arena); +#define Module(a0, a1, a2) _Py_Module(a0, a1, a2) +mod_ty _Py_Module(asdl_seq * body, asdl_seq * type_ignores, PyArena *arena); #define Interactive(a0, a1) _Py_Interactive(a0, a1) mod_ty _Py_Interactive(asdl_seq * body, PyArena *arena); #define Expression(a0, a1) _Py_Expression(a0, a1) mod_ty _Py_Expression(expr_ty body, PyArena *arena); +#define FunctionType(a0, a1, a2) _Py_FunctionType(a0, a1, a2) +mod_ty _Py_FunctionType(asdl_seq * argtypes, expr_ty returns, PyArena *arena); #define Suite(a0, a1) _Py_Suite(a0, a1) mod_ty _Py_Suite(asdl_seq * body, PyArena *arena); -#define FunctionDef(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9) _Py_FunctionDef(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9) +#define FunctionDef(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10) _Py_FunctionDef(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10) stmt_ty _Py_FunctionDef(identifier name, arguments_ty args, asdl_seq * body, - asdl_seq * decorator_list, expr_ty returns, int lineno, - int col_offset, int end_lineno, int end_col_offset, - PyArena *arena); -#define AsyncFunctionDef(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9) _Py_AsyncFunctionDef(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9) + asdl_seq * decorator_list, expr_ty returns, string + type_comment, int lineno, int col_offset, int + end_lineno, int end_col_offset, PyArena *arena); +#define AsyncFunctionDef(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10) _Py_AsyncFunctionDef(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10) stmt_ty _Py_AsyncFunctionDef(identifier name, arguments_ty args, asdl_seq * body, asdl_seq * decorator_list, expr_ty returns, - int lineno, int col_offset, int end_lineno, int - end_col_offset, PyArena *arena); + string type_comment, int lineno, int col_offset, + int end_lineno, int end_col_offset, PyArena + *arena); #define ClassDef(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9) _Py_ClassDef(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9) stmt_ty _Py_ClassDef(identifier 
name, asdl_seq * bases, asdl_seq * keywords, asdl_seq * body, asdl_seq * decorator_list, int lineno, @@ -473,10 +503,10 @@ stmt_ty _Py_Return(expr_ty value, int lineno, int col_offset, int end_lineno, #define Delete(a0, a1, a2, a3, a4, a5) _Py_Delete(a0, a1, a2, a3, a4, a5) stmt_ty _Py_Delete(asdl_seq * targets, int lineno, int col_offset, int end_lineno, int end_col_offset, PyArena *arena); -#define Assign(a0, a1, a2, a3, a4, a5, a6) _Py_Assign(a0, a1, a2, a3, a4, a5, a6) -stmt_ty _Py_Assign(asdl_seq * targets, expr_ty value, int lineno, int - col_offset, int end_lineno, int end_col_offset, PyArena - *arena); +#define Assign(a0, a1, a2, a3, a4, a5, a6, a7) _Py_Assign(a0, a1, a2, a3, a4, a5, a6, a7) +stmt_ty _Py_Assign(asdl_seq * targets, expr_ty value, string type_comment, int + lineno, int col_offset, int end_lineno, int end_col_offset, + PyArena *arena); #define AugAssign(a0, a1, a2, a3, a4, a5, a6, a7) _Py_AugAssign(a0, a1, a2, a3, a4, a5, a6, a7) stmt_ty _Py_AugAssign(expr_ty target, operator_ty op, expr_ty value, int lineno, int col_offset, int end_lineno, int @@ -485,14 +515,14 @@ stmt_ty _Py_AugAssign(expr_ty target, operator_ty op, expr_ty value, int stmt_ty _Py_AnnAssign(expr_ty target, expr_ty annotation, expr_ty value, int simple, int lineno, int col_offset, int end_lineno, int end_col_offset, PyArena *arena); -#define For(a0, a1, a2, a3, a4, a5, a6, a7, a8) _Py_For(a0, a1, a2, a3, a4, a5, a6, a7, a8) +#define For(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9) _Py_For(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9) stmt_ty _Py_For(expr_ty target, expr_ty iter, asdl_seq * body, asdl_seq * - orelse, int lineno, int col_offset, int end_lineno, int - end_col_offset, PyArena *arena); -#define AsyncFor(a0, a1, a2, a3, a4, a5, a6, a7, a8) _Py_AsyncFor(a0, a1, a2, a3, a4, a5, a6, a7, a8) + orelse, string type_comment, int lineno, int col_offset, int + end_lineno, int end_col_offset, PyArena *arena); +#define AsyncFor(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9) _Py_AsyncFor(a0, 
a1, a2, a3, a4, a5, a6, a7, a8, a9) stmt_ty _Py_AsyncFor(expr_ty target, expr_ty iter, asdl_seq * body, asdl_seq * - orelse, int lineno, int col_offset, int end_lineno, int - end_col_offset, PyArena *arena); + orelse, string type_comment, int lineno, int col_offset, + int end_lineno, int end_col_offset, PyArena *arena); #define While(a0, a1, a2, a3, a4, a5, a6, a7) _Py_While(a0, a1, a2, a3, a4, a5, a6, a7) stmt_ty _Py_While(expr_ty test, asdl_seq * body, asdl_seq * orelse, int lineno, int col_offset, int end_lineno, int end_col_offset, PyArena @@ -501,13 +531,14 @@ stmt_ty _Py_While(expr_ty test, asdl_seq * body, asdl_seq * orelse, int lineno, stmt_ty _Py_If(expr_ty test, asdl_seq * body, asdl_seq * orelse, int lineno, int col_offset, int end_lineno, int end_col_offset, PyArena *arena); -#define With(a0, a1, a2, a3, a4, a5, a6) _Py_With(a0, a1, a2, a3, a4, a5, a6) -stmt_ty _Py_With(asdl_seq * items, asdl_seq * body, int lineno, int col_offset, - int end_lineno, int end_col_offset, PyArena *arena); -#define AsyncWith(a0, a1, a2, a3, a4, a5, a6) _Py_AsyncWith(a0, a1, a2, a3, a4, a5, a6) -stmt_ty _Py_AsyncWith(asdl_seq * items, asdl_seq * body, int lineno, int - col_offset, int end_lineno, int end_col_offset, PyArena - *arena); +#define With(a0, a1, a2, a3, a4, a5, a6, a7) _Py_With(a0, a1, a2, a3, a4, a5, a6, a7) +stmt_ty _Py_With(asdl_seq * items, asdl_seq * body, string type_comment, int + lineno, int col_offset, int end_lineno, int end_col_offset, + PyArena *arena); +#define AsyncWith(a0, a1, a2, a3, a4, a5, a6, a7) _Py_AsyncWith(a0, a1, a2, a3, a4, a5, a6, a7) +stmt_ty _Py_AsyncWith(asdl_seq * items, asdl_seq * body, string type_comment, + int lineno, int col_offset, int end_lineno, int + end_col_offset, PyArena *arena); #define Raise(a0, a1, a2, a3, a4, a5, a6) _Py_Raise(a0, a1, a2, a3, a4, a5, a6) stmt_ty _Py_Raise(expr_ty exc, expr_ty cause, int lineno, int col_offset, int end_lineno, int end_col_offset, PyArena *arena); @@ -656,16 +687,19 @@ excepthandler_ty 
_Py_ExceptHandler(expr_ty type, identifier name, asdl_seq * arguments_ty _Py_arguments(asdl_seq * args, arg_ty vararg, asdl_seq * kwonlyargs, asdl_seq * kw_defaults, arg_ty kwarg, asdl_seq * defaults, PyArena *arena); -#define arg(a0, a1, a2, a3, a4, a5, a6) _Py_arg(a0, a1, a2, a3, a4, a5, a6) -arg_ty _Py_arg(identifier arg, expr_ty annotation, int lineno, int col_offset, - int end_lineno, int end_col_offset, PyArena *arena); +#define arg(a0, a1, a2, a3, a4, a5, a6, a7) _Py_arg(a0, a1, a2, a3, a4, a5, a6, a7) +arg_ty _Py_arg(identifier arg, expr_ty annotation, string type_comment, int + lineno, int col_offset, int end_lineno, int end_col_offset, + PyArena *arena); #define keyword(a0, a1, a2) _Py_keyword(a0, a1, a2) keyword_ty _Py_keyword(identifier arg, expr_ty value, PyArena *arena); #define alias(a0, a1, a2) _Py_alias(a0, a1, a2) alias_ty _Py_alias(identifier name, identifier asname, PyArena *arena); #define withitem(a0, a1, a2) _Py_withitem(a0, a1, a2) withitem_ty _Py_withitem(expr_ty context_expr, expr_ty optional_vars, PyArena *arena); +#define TypeIgnore(a0, a1) _Py_TypeIgnore(a0, a1) +type_ignore_ty _Py_TypeIgnore(int lineno, PyArena *arena); PyObject* PyAST_mod2obj(mod_ty t); mod_ty PyAST_obj2mod(PyObject* ast, PyArena* arena, int mode);
Include/token.h +4 −2 modified
@@ -65,8 +65,10 @@ extern "C" {
 #define ELLIPSIS 52
 #define COLONEQUAL 53
 #define OP 54
-#define ERRORTOKEN 55
-#define N_TOKENS 59
+#define TYPE_IGNORE 55
+#define TYPE_COMMENT 56
+#define ERRORTOKEN 57
+#define N_TOKENS 61
 #define NT_OFFSET 256
 /* Special definitions for cooperation with parser */
Lib/ast.py +6 −2 modified
@@ -27,12 +27,16 @@
 from _ast import *
-def parse(source, filename='<unknown>', mode='exec'):
+def parse(source, filename='<unknown>', mode='exec', *, type_comments=False):
     """
     Parse the source into an AST node.
     Equivalent to compile(source, filename, mode, PyCF_ONLY_AST).
+    Pass type_comments=True to get back type comments where the syntax allows.
     """
-    return compile(source, filename, mode, PyCF_ONLY_AST)
+    flags = PyCF_ONLY_AST
+    if type_comments:
+        flags |= PyCF_TYPE_COMMENTS
+    return compile(source, filename, mode, flags)
 def literal_eval(node_or_string):
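The change to `parse()` above amounts to OR-ing one extra flag into the `compile()` call. A small sketch of that equivalence, assuming Python 3.8 or later (where `ast.PyCF_TYPE_COMMENTS` is exposed); the source string is made up for illustration.

```python
import ast

source = "n = 0  # type: int\n"

# Spell out the flags that ast.parse(..., type_comments=True) sets.
flags = ast.PyCF_ONLY_AST | ast.PyCF_TYPE_COMMENTS
via_compile = compile(source, "<unknown>", "exec", flags)
via_parse = ast.parse(source, type_comments=True)

# Both routes should yield the same tree, including the type comment.
print(ast.dump(via_compile) == ast.dump(via_parse))
```

Calling `compile()` directly is mainly useful when other compiler flags need to be combined with the type-comment flag.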
Lib/symbol.py +4 −0 modified
@@ -100,6 +100,10 @@
 encoding_decl = 341
 yield_expr = 342
 yield_arg = 343
+func_body_suite = 344
+func_type_input = 345
+func_type = 346
+typelist = 347
 #--end constants--
 sym_name = {}
Lib/test/test_asdl_parser.py +2 −1 modified
@@ -117,7 +117,8 @@ def visitConstructor(self, cons):
         v = CustomVisitor()
         v.visit(self.types['mod'])
-        self.assertEqual(v.names_with_seq, ['Module', 'Interactive', 'Suite'])
+        self.assertEqual(v.names_with_seq,
+                         ['Module', 'Module', 'Interactive', 'FunctionType', 'Suite'])
 if __name__ == '__main__':
Lib/test/test_ast.py+66 −65 modified@@ -455,7 +455,7 @@ class N2(ast.Num): def test_module(self): body = [ast.Num(42)] - x = ast.Module(body) + x = ast.Module(body, []) self.assertEqual(x.body, body) def test_nodeclasses(self): @@ -524,13 +524,13 @@ def test_pickling(self): def test_invalid_sum(self): pos = dict(lineno=2, col_offset=3) - m = ast.Module([ast.Expr(ast.expr(**pos), **pos)]) + m = ast.Module([ast.Expr(ast.expr(**pos), **pos)], []) with self.assertRaises(TypeError) as cm: compile(m, "<test>", "exec") self.assertIn("but got <_ast.expr", str(cm.exception)) def test_invalid_identitifer(self): - m = ast.Module([ast.Expr(ast.Name(42, ast.Load()))]) + m = ast.Module([ast.Expr(ast.Name(42, ast.Load()))], []) ast.fix_missing_locations(m) with self.assertRaises(TypeError) as cm: compile(m, "<test>", "exec") @@ -575,11 +575,11 @@ def test_dump(self): self.assertEqual(ast.dump(node), "Module(body=[Expr(value=Call(func=Name(id='spam', ctx=Load()), " "args=[Name(id='eggs', ctx=Load()), Constant(value='and cheese')], " - "keywords=[]))])" + "keywords=[]))], type_ignores=[])" ) self.assertEqual(ast.dump(node, annotate_fields=False), "Module([Expr(Call(Name('spam', Load()), [Name('eggs', Load()), " - "Constant('and cheese')], []))])" + "Constant('and cheese')], []))], [])" ) self.assertEqual(ast.dump(node, include_attributes=True), "Module(body=[Expr(value=Call(func=Name(id='spam', ctx=Load(), " @@ -588,7 +588,7 @@ def test_dump(self): "end_lineno=1, end_col_offset=9), Constant(value='and cheese', " "lineno=1, col_offset=11, end_lineno=1, end_col_offset=23)], keywords=[], " "lineno=1, col_offset=0, end_lineno=1, end_col_offset=24), " - "lineno=1, col_offset=0, end_lineno=1, end_col_offset=24)])" + "lineno=1, col_offset=0, end_lineno=1, end_col_offset=24)], type_ignores=[])" ) def test_copy_location(self): @@ -617,7 +617,8 @@ def test_fix_missing_locations(self): "lineno=1, col_offset=0, end_lineno=1, end_col_offset=0), " "args=[Constant(value='eggs', lineno=1, 
col_offset=0, end_lineno=1, " "end_col_offset=0)], keywords=[], lineno=1, col_offset=0, end_lineno=1, " - "end_col_offset=0), lineno=1, col_offset=0, end_lineno=1, end_col_offset=0)])" + "end_col_offset=0), lineno=1, col_offset=0, end_lineno=1, end_col_offset=0)], " + "type_ignores=[])" ) def test_increment_lineno(self): @@ -760,7 +761,7 @@ def test_bad_integer(self): names=[ast.alias(name='sleep')], level=None, lineno=None, col_offset=None)] - mod = ast.Module(body) + mod = ast.Module(body, []) with self.assertRaises(ValueError) as cm: compile(mod, 'test', 'exec') self.assertIn("invalid integer value: None", str(cm.exception)) @@ -770,7 +771,7 @@ def test_level_as_none(self): names=[ast.alias(name='sleep')], level=None, lineno=0, col_offset=0)] - mod = ast.Module(body) + mod = ast.Module(body, []) code = compile(mod, 'test', 'exec') ns = {} exec(code, ns) @@ -790,11 +791,11 @@ def mod(self, mod, msg=None, mode="exec", *, exc=ValueError): self.assertIn(msg, str(cm.exception)) def expr(self, node, msg=None, *, exc=ValueError): - mod = ast.Module([ast.Expr(node)]) + mod = ast.Module([ast.Expr(node)], []) self.mod(mod, msg, exc=exc) def stmt(self, stmt, msg=None): - mod = ast.Module([stmt]) + mod = ast.Module([stmt], []) self.mod(mod, msg) def test_module(self): @@ -1603,61 +1604,61 @@ def main(): raise SystemExit unittest.main() -#### EVERYTHING BELOW IS GENERATED ##### +#### EVERYTHING BELOW IS GENERATED BY python Lib/test/test_ast.py -g ##### exec_results = [ -('Module', [('Expr', (1, 0), ('Constant', (1, 0), None))]), -('Module', [('Expr', (1, 0), ('Constant', (1, 0), 'module docstring'))]), -('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('Pass', (1, 9))], [], None)]), -('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('Expr', (1, 9), ('Constant', (1, 9), 'function docstring'))], [], None)]), -('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [('arg', (1, 6), 'a', None)], None, [], [], 
None, []), [('Pass', (1, 10))], [], None)]), -('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [('arg', (1, 6), 'a', None)], None, [], [], None, [('Constant', (1, 8), 0)]), [('Pass', (1, 12))], [], None)]), -('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [], ('arg', (1, 7), 'args', None), [], [], None, []), [('Pass', (1, 14))], [], None)]), -('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], ('arg', (1, 8), 'kwargs', None), []), [('Pass', (1, 17))], [], None)]), -('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [('arg', (1, 6), 'a', None), ('arg', (1, 9), 'b', None), ('arg', (1, 14), 'c', None), ('arg', (1, 22), 'd', None), ('arg', (1, 28), 'e', None)], ('arg', (1, 35), 'args', None), [('arg', (1, 41), 'f', None)], [('Constant', (1, 43), 42)], ('arg', (1, 49), 'kwargs', None), [('Constant', (1, 11), 1), ('Constant', (1, 16), None), ('List', (1, 24), [], ('Load',)), ('Dict', (1, 30), [], [])]), [('Expr', (1, 58), ('Constant', (1, 58), 'doc for f()'))], [], None)]), -('Module', [('ClassDef', (1, 0), 'C', [], [], [('Pass', (1, 8))], [])]), -('Module', [('ClassDef', (1, 0), 'C', [], [], [('Expr', (1, 9), ('Constant', (1, 9), 'docstring for class C'))], [])]), -('Module', [('ClassDef', (1, 0), 'C', [('Name', (1, 8), 'object', ('Load',))], [], [('Pass', (1, 17))], [])]), -('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('Return', (1, 8), ('Constant', (1, 15), 1))], [], None)]), -('Module', [('Delete', (1, 0), [('Name', (1, 4), 'v', ('Del',))])]), -('Module', [('Assign', (1, 0), [('Name', (1, 0), 'v', ('Store',))], ('Constant', (1, 4), 1))]), -('Module', [('Assign', (1, 0), [('Tuple', (1, 0), [('Name', (1, 0), 'a', ('Store',)), ('Name', (1, 2), 'b', ('Store',))], ('Store',))], ('Name', (1, 6), 'c', ('Load',)))]), -('Module', [('Assign', (1, 0), [('Tuple', (1, 0), [('Name', (1, 1), 'a', ('Store',)), ('Name', (1, 3), 'b', ('Store',))], ('Store',))], ('Name', (1, 8), 'c', ('Load',)))]), 
-('Module', [('Assign', (1, 0), [('List', (1, 0), [('Name', (1, 1), 'a', ('Store',)), ('Name', (1, 3), 'b', ('Store',))], ('Store',))], ('Name', (1, 8), 'c', ('Load',)))]), -('Module', [('AugAssign', (1, 0), ('Name', (1, 0), 'v', ('Store',)), ('Add',), ('Constant', (1, 5), 1))]), -('Module', [('For', (1, 0), ('Name', (1, 4), 'v', ('Store',)), ('Name', (1, 9), 'v', ('Load',)), [('Pass', (1, 11))], [])]), -('Module', [('While', (1, 0), ('Name', (1, 6), 'v', ('Load',)), [('Pass', (1, 8))], [])]), -('Module', [('If', (1, 0), ('Name', (1, 3), 'v', ('Load',)), [('Pass', (1, 5))], [])]), -('Module', [('With', (1, 0), [('withitem', ('Name', (1, 5), 'x', ('Load',)), ('Name', (1, 10), 'y', ('Store',)))], [('Pass', (1, 13))])]), -('Module', [('With', (1, 0), [('withitem', ('Name', (1, 5), 'x', ('Load',)), ('Name', (1, 10), 'y', ('Store',))), ('withitem', ('Name', (1, 13), 'z', ('Load',)), ('Name', (1, 18), 'q', ('Store',)))], [('Pass', (1, 21))])]), -('Module', [('Raise', (1, 0), ('Call', (1, 6), ('Name', (1, 6), 'Exception', ('Load',)), [('Constant', (1, 16), 'string')], []), None)]), -('Module', [('Try', (1, 0), [('Pass', (2, 2))], [('ExceptHandler', (3, 0), ('Name', (3, 7), 'Exception', ('Load',)), None, [('Pass', (4, 2))])], [], [])]), -('Module', [('Try', (1, 0), [('Pass', (2, 2))], [], [], [('Pass', (4, 2))])]), -('Module', [('Assert', (1, 0), ('Name', (1, 7), 'v', ('Load',)), None)]), -('Module', [('Import', (1, 0), [('alias', 'sys', None)])]), -('Module', [('ImportFrom', (1, 0), 'sys', [('alias', 'v', None)], 0)]), -('Module', [('Global', (1, 0), ['v'])]), -('Module', [('Expr', (1, 0), ('Constant', (1, 0), 1))]), -('Module', [('Pass', (1, 0))]), -('Module', [('For', (1, 0), ('Name', (1, 4), 'v', ('Store',)), ('Name', (1, 9), 'v', ('Load',)), [('Break', (1, 11))], [])]), -('Module', [('For', (1, 0), ('Name', (1, 4), 'v', ('Store',)), ('Name', (1, 9), 'v', ('Load',)), [('Continue', (1, 11))], [])]), -('Module', [('For', (1, 0), ('Tuple', (1, 4), [('Name', (1, 4), 'a', 
('Store',)), ('Name', (1, 6), 'b', ('Store',))], ('Store',)), ('Name', (1, 11), 'c', ('Load',)), [('Pass', (1, 14))], [])]), -('Module', [('For', (1, 0), ('Tuple', (1, 4), [('Name', (1, 5), 'a', ('Store',)), ('Name', (1, 7), 'b', ('Store',))], ('Store',)), ('Name', (1, 13), 'c', ('Load',)), [('Pass', (1, 16))], [])]), -('Module', [('For', (1, 0), ('List', (1, 4), [('Name', (1, 5), 'a', ('Store',)), ('Name', (1, 7), 'b', ('Store',))], ('Store',)), ('Name', (1, 13), 'c', ('Load',)), [('Pass', (1, 16))], [])]), -('Module', [('Expr', (1, 0), ('GeneratorExp', (1, 0), ('Tuple', (2, 4), [('Name', (3, 4), 'Aa', ('Load',)), ('Name', (5, 7), 'Bb', ('Load',))], ('Load',)), [('comprehension', ('Tuple', (8, 4), [('Name', (8, 4), 'Aa', ('Store',)), ('Name', (10, 4), 'Bb', ('Store',))], ('Store',)), ('Name', (10, 10), 'Cc', ('Load',)), [], 0)]))]), -('Module', [('Expr', (1, 0), ('DictComp', (1, 0), ('Name', (1, 1), 'a', ('Load',)), ('Name', (1, 5), 'b', ('Load',)), [('comprehension', ('Name', (1, 11), 'w', ('Store',)), ('Name', (1, 16), 'x', ('Load',)), [], 0), ('comprehension', ('Name', (1, 22), 'm', ('Store',)), ('Name', (1, 27), 'p', ('Load',)), [('Name', (1, 32), 'g', ('Load',))], 0)]))]), -('Module', [('Expr', (1, 0), ('DictComp', (1, 0), ('Name', (1, 1), 'a', ('Load',)), ('Name', (1, 5), 'b', ('Load',)), [('comprehension', ('Tuple', (1, 11), [('Name', (1, 11), 'v', ('Store',)), ('Name', (1, 13), 'w', ('Store',))], ('Store',)), ('Name', (1, 18), 'x', ('Load',)), [], 0)]))]), -('Module', [('Expr', (1, 0), ('SetComp', (1, 0), ('Name', (1, 1), 'r', ('Load',)), [('comprehension', ('Name', (1, 7), 'l', ('Store',)), ('Name', (1, 12), 'x', ('Load',)), [('Name', (1, 17), 'g', ('Load',))], 0)]))]), -('Module', [('Expr', (1, 0), ('SetComp', (1, 0), ('Name', (1, 1), 'r', ('Load',)), [('comprehension', ('Tuple', (1, 7), [('Name', (1, 7), 'l', ('Store',)), ('Name', (1, 9), 'm', ('Store',))], ('Store',)), ('Name', (1, 14), 'x', ('Load',)), [], 0)]))]), -('Module', [('AsyncFunctionDef', 
(1, 0), 'f', ('arguments', [], None, [], [], None, []), [('Expr', (2, 1), ('Constant', (2, 1), 'async function')), ('Expr', (3, 1), ('Await', (3, 1), ('Call', (3, 7), ('Name', (3, 7), 'something', ('Load',)), [], [])))], [], None)]), -('Module', [('AsyncFunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('AsyncFor', (2, 1), ('Name', (2, 11), 'e', ('Store',)), ('Name', (2, 16), 'i', ('Load',)), [('Expr', (2, 19), ('Constant', (2, 19), 1))], [('Expr', (3, 7), ('Constant', (3, 7), 2))])], [], None)]), -('Module', [('AsyncFunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('AsyncWith', (2, 1), [('withitem', ('Name', (2, 12), 'a', ('Load',)), ('Name', (2, 17), 'b', ('Store',)))], [('Expr', (2, 20), ('Constant', (2, 20), 1))])], [], None)]), -('Module', [('Expr', (1, 0), ('Dict', (1, 0), [None, ('Constant', (1, 10), 2)], [('Dict', (1, 3), [('Constant', (1, 4), 1)], [('Constant', (1, 6), 2)]), ('Constant', (1, 12), 3)]))]), -('Module', [('Expr', (1, 0), ('Set', (1, 0), [('Starred', (1, 1), ('Set', (1, 2), [('Constant', (1, 3), 1), ('Constant', (1, 6), 2)]), ('Load',)), ('Constant', (1, 10), 3)]))]), -('Module', [('AsyncFunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('Expr', (2, 1), ('ListComp', (2, 1), ('Name', (2, 2), 'i', ('Load',)), [('comprehension', ('Name', (2, 14), 'b', ('Store',)), ('Name', (2, 19), 'c', ('Load',)), [], 1)]))], [], None)]), -('Module', [('FunctionDef', (3, 0), 'f', ('arguments', [], None, [], [], None, []), [('Pass', (3, 9))], [('Name', (1, 1), 'deco1', ('Load',)), ('Call', (2, 0), ('Name', (2, 1), 'deco2', ('Load',)), [], [])], None)]), -('Module', [('AsyncFunctionDef', (3, 0), 'f', ('arguments', [], None, [], [], None, []), [('Pass', (3, 15))], [('Name', (1, 1), 'deco1', ('Load',)), ('Call', (2, 0), ('Name', (2, 1), 'deco2', ('Load',)), [], [])], None)]), -('Module', [('ClassDef', (3, 0), 'C', [], [], [('Pass', (3, 9))], [('Name', (1, 1), 'deco1', ('Load',)), ('Call', (2, 0), 
('Name', (2, 1), 'deco2', ('Load',)), [], [])])]), -('Module', [('FunctionDef', (2, 0), 'f', ('arguments', [], None, [], [], None, []), [('Pass', (2, 9))], [('Call', (1, 1), ('Name', (1, 1), 'deco', ('Load',)), [('GeneratorExp', (1, 5), ('Name', (1, 6), 'a', ('Load',)), [('comprehension', ('Name', (1, 12), 'a', ('Store',)), ('Name', (1, 17), 'b', ('Load',)), [], 0)])], [])], None)]), +('Module', [('Expr', (1, 0), ('Constant', (1, 0), None))], []), +('Module', [('Expr', (1, 0), ('Constant', (1, 0), 'module docstring'))], []), +('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('Pass', (1, 9))], [], None, None)], []), +('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('Expr', (1, 9), ('Constant', (1, 9), 'function docstring'))], [], None, None)], []), +('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [('arg', (1, 6), 'a', None, None)], None, [], [], None, []), [('Pass', (1, 10))], [], None, None)], []), +('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [('arg', (1, 6), 'a', None, None)], None, [], [], None, [('Constant', (1, 8), 0)]), [('Pass', (1, 12))], [], None, None)], []), +('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [], ('arg', (1, 7), 'args', None, None), [], [], None, []), [('Pass', (1, 14))], [], None, None)], []), +('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], ('arg', (1, 8), 'kwargs', None, None), []), [('Pass', (1, 17))], [], None, None)], []), +('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [('arg', (1, 6), 'a', None, None), ('arg', (1, 9), 'b', None, None), ('arg', (1, 14), 'c', None, None), ('arg', (1, 22), 'd', None, None), ('arg', (1, 28), 'e', None, None)], ('arg', (1, 35), 'args', None, None), [('arg', (1, 41), 'f', None, None)], [('Constant', (1, 43), 42)], ('arg', (1, 49), 'kwargs', None, None), [('Constant', (1, 11), 1), ('Constant', (1, 16), None), ('List', (1, 24), [], ('Load',)), ('Dict', (1, 30), [], [])]), 
[('Expr', (1, 58), ('Constant', (1, 58), 'doc for f()'))], [], None, None)], []), +('Module', [('ClassDef', (1, 0), 'C', [], [], [('Pass', (1, 8))], [])], []), +('Module', [('ClassDef', (1, 0), 'C', [], [], [('Expr', (1, 9), ('Constant', (1, 9), 'docstring for class C'))], [])], []), +('Module', [('ClassDef', (1, 0), 'C', [('Name', (1, 8), 'object', ('Load',))], [], [('Pass', (1, 17))], [])], []), +('Module', [('FunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('Return', (1, 8), ('Constant', (1, 15), 1))], [], None, None)], []), +('Module', [('Delete', (1, 0), [('Name', (1, 4), 'v', ('Del',))])], []), +('Module', [('Assign', (1, 0), [('Name', (1, 0), 'v', ('Store',))], ('Constant', (1, 4), 1), None)], []), +('Module', [('Assign', (1, 0), [('Tuple', (1, 0), [('Name', (1, 0), 'a', ('Store',)), ('Name', (1, 2), 'b', ('Store',))], ('Store',))], ('Name', (1, 6), 'c', ('Load',)), None)], []), +('Module', [('Assign', (1, 0), [('Tuple', (1, 0), [('Name', (1, 1), 'a', ('Store',)), ('Name', (1, 3), 'b', ('Store',))], ('Store',))], ('Name', (1, 8), 'c', ('Load',)), None)], []), +('Module', [('Assign', (1, 0), [('List', (1, 0), [('Name', (1, 1), 'a', ('Store',)), ('Name', (1, 3), 'b', ('Store',))], ('Store',))], ('Name', (1, 8), 'c', ('Load',)), None)], []), +('Module', [('AugAssign', (1, 0), ('Name', (1, 0), 'v', ('Store',)), ('Add',), ('Constant', (1, 5), 1))], []), +('Module', [('For', (1, 0), ('Name', (1, 4), 'v', ('Store',)), ('Name', (1, 9), 'v', ('Load',)), [('Pass', (1, 11))], [], None)], []), +('Module', [('While', (1, 0), ('Name', (1, 6), 'v', ('Load',)), [('Pass', (1, 8))], [])], []), +('Module', [('If', (1, 0), ('Name', (1, 3), 'v', ('Load',)), [('Pass', (1, 5))], [])], []), +('Module', [('With', (1, 0), [('withitem', ('Name', (1, 5), 'x', ('Load',)), ('Name', (1, 10), 'y', ('Store',)))], [('Pass', (1, 13))], None)], []), +('Module', [('With', (1, 0), [('withitem', ('Name', (1, 5), 'x', ('Load',)), ('Name', (1, 10), 'y', ('Store',))), 
('withitem', ('Name', (1, 13), 'z', ('Load',)), ('Name', (1, 18), 'q', ('Store',)))], [('Pass', (1, 21))], None)], []), +('Module', [('Raise', (1, 0), ('Call', (1, 6), ('Name', (1, 6), 'Exception', ('Load',)), [('Constant', (1, 16), 'string')], []), None)], []), +('Module', [('Try', (1, 0), [('Pass', (2, 2))], [('ExceptHandler', (3, 0), ('Name', (3, 7), 'Exception', ('Load',)), None, [('Pass', (4, 2))])], [], [])], []), +('Module', [('Try', (1, 0), [('Pass', (2, 2))], [], [], [('Pass', (4, 2))])], []), +('Module', [('Assert', (1, 0), ('Name', (1, 7), 'v', ('Load',)), None)], []), +('Module', [('Import', (1, 0), [('alias', 'sys', None)])], []), +('Module', [('ImportFrom', (1, 0), 'sys', [('alias', 'v', None)], 0)], []), +('Module', [('Global', (1, 0), ['v'])], []), +('Module', [('Expr', (1, 0), ('Constant', (1, 0), 1))], []), +('Module', [('Pass', (1, 0))], []), +('Module', [('For', (1, 0), ('Name', (1, 4), 'v', ('Store',)), ('Name', (1, 9), 'v', ('Load',)), [('Break', (1, 11))], [], None)], []), +('Module', [('For', (1, 0), ('Name', (1, 4), 'v', ('Store',)), ('Name', (1, 9), 'v', ('Load',)), [('Continue', (1, 11))], [], None)], []), +('Module', [('For', (1, 0), ('Tuple', (1, 4), [('Name', (1, 4), 'a', ('Store',)), ('Name', (1, 6), 'b', ('Store',))], ('Store',)), ('Name', (1, 11), 'c', ('Load',)), [('Pass', (1, 14))], [], None)], []), +('Module', [('For', (1, 0), ('Tuple', (1, 4), [('Name', (1, 5), 'a', ('Store',)), ('Name', (1, 7), 'b', ('Store',))], ('Store',)), ('Name', (1, 13), 'c', ('Load',)), [('Pass', (1, 16))], [], None)], []), +('Module', [('For', (1, 0), ('List', (1, 4), [('Name', (1, 5), 'a', ('Store',)), ('Name', (1, 7), 'b', ('Store',))], ('Store',)), ('Name', (1, 13), 'c', ('Load',)), [('Pass', (1, 16))], [], None)], []), +('Module', [('Expr', (1, 0), ('GeneratorExp', (1, 0), ('Tuple', (2, 4), [('Name', (3, 4), 'Aa', ('Load',)), ('Name', (5, 7), 'Bb', ('Load',))], ('Load',)), [('comprehension', ('Tuple', (8, 4), [('Name', (8, 4), 'Aa', ('Store',)), 
('Name', (10, 4), 'Bb', ('Store',))], ('Store',)), ('Name', (10, 10), 'Cc', ('Load',)), [], 0)]))], []), +('Module', [('Expr', (1, 0), ('DictComp', (1, 0), ('Name', (1, 1), 'a', ('Load',)), ('Name', (1, 5), 'b', ('Load',)), [('comprehension', ('Name', (1, 11), 'w', ('Store',)), ('Name', (1, 16), 'x', ('Load',)), [], 0), ('comprehension', ('Name', (1, 22), 'm', ('Store',)), ('Name', (1, 27), 'p', ('Load',)), [('Name', (1, 32), 'g', ('Load',))], 0)]))], []), +('Module', [('Expr', (1, 0), ('DictComp', (1, 0), ('Name', (1, 1), 'a', ('Load',)), ('Name', (1, 5), 'b', ('Load',)), [('comprehension', ('Tuple', (1, 11), [('Name', (1, 11), 'v', ('Store',)), ('Name', (1, 13), 'w', ('Store',))], ('Store',)), ('Name', (1, 18), 'x', ('Load',)), [], 0)]))], []), +('Module', [('Expr', (1, 0), ('SetComp', (1, 0), ('Name', (1, 1), 'r', ('Load',)), [('comprehension', ('Name', (1, 7), 'l', ('Store',)), ('Name', (1, 12), 'x', ('Load',)), [('Name', (1, 17), 'g', ('Load',))], 0)]))], []), +('Module', [('Expr', (1, 0), ('SetComp', (1, 0), ('Name', (1, 1), 'r', ('Load',)), [('comprehension', ('Tuple', (1, 7), [('Name', (1, 7), 'l', ('Store',)), ('Name', (1, 9), 'm', ('Store',))], ('Store',)), ('Name', (1, 14), 'x', ('Load',)), [], 0)]))], []), +('Module', [('AsyncFunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('Expr', (2, 1), ('Constant', (2, 1), 'async function')), ('Expr', (3, 1), ('Await', (3, 1), ('Call', (3, 7), ('Name', (3, 7), 'something', ('Load',)), [], [])))], [], None, None)], []), +('Module', [('AsyncFunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('AsyncFor', (2, 1), ('Name', (2, 11), 'e', ('Store',)), ('Name', (2, 16), 'i', ('Load',)), [('Expr', (2, 19), ('Constant', (2, 19), 1))], [('Expr', (3, 7), ('Constant', (3, 7), 2))], None)], [], None, None)], []), +('Module', [('AsyncFunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('AsyncWith', (2, 1), [('withitem', ('Name', (2, 12), 'a', ('Load',)), ('Name', (2, 
17), 'b', ('Store',)))], [('Expr', (2, 20), ('Constant', (2, 20), 1))], None)], [], None, None)], []), +('Module', [('Expr', (1, 0), ('Dict', (1, 0), [None, ('Constant', (1, 10), 2)], [('Dict', (1, 3), [('Constant', (1, 4), 1)], [('Constant', (1, 6), 2)]), ('Constant', (1, 12), 3)]))], []), +('Module', [('Expr', (1, 0), ('Set', (1, 0), [('Starred', (1, 1), ('Set', (1, 2), [('Constant', (1, 3), 1), ('Constant', (1, 6), 2)]), ('Load',)), ('Constant', (1, 10), 3)]))], []), +('Module', [('AsyncFunctionDef', (1, 0), 'f', ('arguments', [], None, [], [], None, []), [('Expr', (2, 1), ('ListComp', (2, 1), ('Name', (2, 2), 'i', ('Load',)), [('comprehension', ('Name', (2, 14), 'b', ('Store',)), ('Name', (2, 19), 'c', ('Load',)), [], 1)]))], [], None, None)], []), +('Module', [('FunctionDef', (3, 0), 'f', ('arguments', [], None, [], [], None, []), [('Pass', (3, 9))], [('Name', (1, 1), 'deco1', ('Load',)), ('Call', (2, 0), ('Name', (2, 1), 'deco2', ('Load',)), [], [])], None, None)], []), +('Module', [('AsyncFunctionDef', (3, 0), 'f', ('arguments', [], None, [], [], None, []), [('Pass', (3, 15))], [('Name', (1, 1), 'deco1', ('Load',)), ('Call', (2, 0), ('Name', (2, 1), 'deco2', ('Load',)), [], [])], None, None)], []), +('Module', [('ClassDef', (3, 0), 'C', [], [], [('Pass', (3, 9))], [('Name', (1, 1), 'deco1', ('Load',)), ('Call', (2, 0), ('Name', (2, 1), 'deco2', ('Load',)), [], [])])], []), +('Module', [('FunctionDef', (2, 0), 'f', ('arguments', [], None, [], [], None, []), [('Pass', (2, 9))], [('Call', (1, 1), ('Name', (1, 1), 'deco', ('Load',)), [('GeneratorExp', (1, 5), ('Name', (1, 6), 'a', ('Load',)), [('comprehension', ('Name', (1, 12), 'a', ('Store',)), ('Name', (1, 17), 'b', ('Load',)), [], 0)])], [])], None, None)], []), ] single_results = [ ('Interactive', [('Expr', (1, 0), ('BinOp', (1, 0), ('Constant', (1, 0), 1), ('Add',), ('Constant', (1, 2), 2)))]),
Lib/test/test_type_comments.py +295 −0 added
@@ -0,0 +1,295 @@
+import ast
+import unittest
+
+
+funcdef = """\
+def foo():
+    # type: () -> int
+    pass
+
+def bar(): # type: () -> None
+    pass
+"""
+
+asyncdef = """\
+async def foo():
+    # type: () -> int
+    return await bar()
+
+async def bar(): # type: () -> int
+    return await bar()
+"""
+
+redundantdef = """\
+def foo(): # type: () -> int
+    # type: () -> str
+    return ''
+"""
+
+nonasciidef = """\
+def foo():
+    # type: () -> àçčéñt
+    pass
+"""
+
+forstmt = """\
+for a in []: # type: int
+    pass
+"""
+
+withstmt = """\
+with context() as a: # type: int
+    pass
+"""
+
+vardecl = """\
+a = 0 # type: int
+"""
+
+ignores = """\
+def foo():
+    pass # type: ignore
+
+def bar():
+    x = 1 # type: ignore
+"""
+
+# Test for long-form type-comments in arguments. A test function
+# named 'fabvk' would have two positional args, a and b, plus a
+# var-arg *v, plus a kw-arg **k. It is verified in test_longargs()
+# that it has exactly these arguments, no more, no fewer.
+longargs = """\
+def fa(
+    a = 1, # type: A
+):
+    pass
+
+def fa(
+    a = 1 # type: A
+):
+    pass
+
+def fab(
+    a, # type: A
+    b, # type: B
+):
+    pass
+
+def fab(
+    a, # type: A
+    b # type: B
+):
+    pass
+
+def fv(
+    *v, # type: V
+):
+    pass
+
+def fv(
+    *v # type: V
+):
+    pass
+
+def fk(
+    **k, # type: K
+):
+    pass
+
+def fk(
+    **k # type: K
+):
+    pass
+
+def fvk(
+    *v, # type: V
+    **k, # type: K
+):
+    pass
+
+def fvk(
+    *v, # type: V
+    **k # type: K
+):
+    pass
+
+def fav(
+    a, # type: A
+    *v, # type: V
+):
+    pass
+
+def fav(
+    a, # type: A
+    *v # type: V
+):
+    pass
+
+def fak(
+    a, # type: A
+    **k, # type: K
+):
+    pass
+
+def fak(
+    a, # type: A
+    **k # type: K
+):
+    pass
+
+def favk(
+    a, # type: A
+    *v, # type: V
+    **k, # type: K
+):
+    pass
+
+def favk(
+    a, # type: A
+    *v, # type: V
+    **k # type: K
+):
+    pass
+"""
+
+
+class TypeCommentTests(unittest.TestCase):
+
+    def parse(self, source):
+        return ast.parse(source, type_comments=True)
+
+    def classic_parse(self, source):
+        return ast.parse(source)
+
+    def test_funcdef(self):
+        tree = self.parse(funcdef)
+        self.assertEqual(tree.body[0].type_comment, "() -> int")
+        self.assertEqual(tree.body[1].type_comment, "() -> None")
+        tree = self.classic_parse(funcdef)
+        self.assertEqual(tree.body[0].type_comment, None)
+        self.assertEqual(tree.body[1].type_comment, None)
+
+    def test_asyncdef(self):
+        tree = self.parse(asyncdef)
+        self.assertEqual(tree.body[0].type_comment, "() -> int")
+        self.assertEqual(tree.body[1].type_comment, "() -> int")
+        tree = self.classic_parse(asyncdef)
+        self.assertEqual(tree.body[0].type_comment, None)
+        self.assertEqual(tree.body[1].type_comment, None)
+
+    def test_redundantdef(self):
+        with self.assertRaisesRegex(SyntaxError, "^Cannot have two type comments on def"):
+            tree = self.parse(redundantdef)
+
+    def test_nonasciidef(self):
+        tree = self.parse(nonasciidef)
+        self.assertEqual(tree.body[0].type_comment, "() -> àçčéñt")
+
+    def test_forstmt(self):
+        tree = self.parse(forstmt)
+        self.assertEqual(tree.body[0].type_comment, "int")
+        tree = self.classic_parse(forstmt)
+        self.assertEqual(tree.body[0].type_comment, None)
+
+    def test_withstmt(self):
+        tree = self.parse(withstmt)
+        self.assertEqual(tree.body[0].type_comment, "int")
+        tree = self.classic_parse(withstmt)
+        self.assertEqual(tree.body[0].type_comment, None)
+
+    def test_vardecl(self):
+        tree = self.parse(vardecl)
+        self.assertEqual(tree.body[0].type_comment, "int")
+        tree = self.classic_parse(vardecl)
+        self.assertEqual(tree.body[0].type_comment, None)
+
+    def test_ignores(self):
+        tree = self.parse(ignores)
+        self.assertEqual([ti.lineno for ti in tree.type_ignores], [2, 5])
+        tree = self.classic_parse(ignores)
+        self.assertEqual(tree.type_ignores, [])
+
+    def test_longargs(self):
+        tree = self.parse(longargs)
+        for t in tree.body:
+            # The expected args are encoded in the function name
+            todo = set(t.name[1:])
+            self.assertEqual(len(t.args.args),
+                             len(todo) - bool(t.args.vararg) - bool(t.args.kwarg))
+            self.assertTrue(t.name.startswith('f'), t.name)
+            for c in t.name[1:]:
+                todo.remove(c)
+                if c == 'v':
+                    arg = t.args.vararg
+                elif c == 'k':
+                    arg = t.args.kwarg
+                else:
+                    assert 0 <= ord(c) - ord('a') < len(t.args.args)
+                    arg = t.args.args[ord(c) - ord('a')]
+                self.assertEqual(arg.arg, c) # That's the argument name
+                self.assertEqual(arg.type_comment, arg.arg.upper())
+            assert not todo
+        tree = self.classic_parse(longargs)
+        for t in tree.body:
+            for arg in t.args.args + [t.args.vararg, t.args.kwarg]:
+                if arg is not None:
+                    self.assertIsNone(arg.type_comment, "%s(%s:%r)" %
+                                      (t.name, arg.arg, arg.type_comment))
+
+    def test_inappropriate_type_comments(self):
+        """Tests for inappropriately-placed type comments.
+
+        These should be silently ignored with type comments off,
+        but raise SyntaxError with type comments on.
+
+        This is not meant to be exhaustive.
+        """
+
+        def check_both_ways(source):
+            ast.parse(source, type_comments=False)
+            with self.assertRaises(SyntaxError):
+                ast.parse(source, type_comments=True)
+
+        check_both_ways("pass # type: int\n")
+        check_both_ways("foo() # type: int\n")
+        check_both_ways("x += 1 # type: int\n")
+        check_both_ways("while True: # type: int\n continue\n")
+        check_both_ways("while True:\n continue # type: int\n")
+        check_both_ways("try: # type: int\n pass\nfinally:\n pass\n")
+        check_both_ways("try:\n pass\nfinally: # type: int\n pass\n")
+
+    def test_func_type_input(self):
+
+        def parse_func_type_input(source):
+            return ast.parse(source, "<unknown>", "func_type")
+
+        # Some checks below will crash if the returned structure is wrong
+        tree = parse_func_type_input("() -> int")
+        self.assertEqual(tree.argtypes, [])
+        self.assertEqual(tree.returns.id, "int")
+
+        tree = parse_func_type_input("(int) -> List[str]")
+        self.assertEqual(len(tree.argtypes), 1)
+        arg = tree.argtypes[0]
+        self.assertEqual(arg.id, "int")
+        self.assertEqual(tree.returns.value.id, "List")
+        self.assertEqual(tree.returns.slice.value.id, "str")
+
+        tree = parse_func_type_input("(int, *str, **Any) -> float")
+        self.assertEqual(tree.argtypes[0].id, "int")
+        self.assertEqual(tree.argtypes[1].id, "str")
+        self.assertEqual(tree.argtypes[2].id, "Any")
+        self.assertEqual(tree.returns.id, "float")
+
+        with self.assertRaises(SyntaxError):
+            tree = parse_func_type_input("(int, *str, *Any) -> float")
+
+        with self.assertRaises(SyntaxError):
+            tree = parse_func_type_input("(int, **str, Any) -> float")
+
+        with self.assertRaises(SyntaxError):
+            tree = parse_func_type_input("(**int, **str) -> float")
+
+
+if __name__ == '__main__':
+    unittest.main()
Lib/token.py +7 −5 modified
@@ -58,12 +58,14 @@
 ELLIPSIS = 52
 COLONEQUAL = 53
 OP = 54
+TYPE_IGNORE = 55
+TYPE_COMMENT = 56
 # These aren't used by the C tokenizer but are needed for tokenize.py
-ERRORTOKEN = 55
-COMMENT = 56
-NL = 57
-ENCODING = 58
-N_TOKENS = 59
+ERRORTOKEN = 57
+COMMENT = 58
+NL = 59
+ENCODING = 60
+N_TOKENS = 61
 # Special definitions for cooperation with parser
 NT_OFFSET = 256
Misc/NEWS.d/next/Core and Builtins/2019-01-22-19-17-27.bpo-35766.gh1tHZ.rst +1 −0 added
@@ -0,0 +1 @@
+Add the option to parse PEP 484 type comments in the ast module. (Off by default.) This is merging the key functionality of the third party fork thereof, [typed_ast](https://github.com/python/typed_ast).
\ No newline at end of file
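The option described in this NEWS entry can be exercised from Python roughly as follows. This is a minimal sketch, assuming Python 3.8 or later, where `ast.parse` accepts `type_comments=True`; the sample source and the variable names are illustrative only and are not part of the commit.

```python
import ast

# Illustrative source: a signature type comment, a variable type
# comment, and a "type: ignore" comment (names are made up).
source = '''\
def add(a, b):
    # type: (int, int) -> int
    return a + b

x = []  # type: list
y = compute()  # type: ignore
'''

# Type comments are off by default and must be requested explicitly.
typed = ast.parse(source, type_comments=True)
plain = ast.parse(source)

func_comment = typed.body[0].type_comment      # comment on the def
assign_comment = typed.body[1].type_comment    # comment on the assignment
ignored_lines = [ti.lineno for ti in typed.type_ignores]

# Without the flag the comments are skipped like ordinary comments.
plain_comment = plain.body[0].type_comment
plain_ignores = plain.type_ignores

print(func_comment, assign_comment, ignored_lines, plain_comment, plain_ignores)
```

With the flag off, the same tree shape is returned, but every `type_comment` attribute is `None` and `type_ignores` is empty, which is what the `classic_parse` helper in the test file checks.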
Modules/parsermodule.c +6 −0 modified
@@ -663,6 +663,12 @@ validate_node(node *tree)
     for (pos = 0; pos < nch; ++pos) {
         node *ch = CHILD(tree, pos);
         int ch_type = TYPE(ch);
+        if (ch_type == suite && TYPE(tree) == funcdef) {
+            /* This is the opposite hack of what we do in parser.c
+               (search for func_body_suite), except we don't ever
+               support type comments here. */
+            ch_type = func_body_suite;
+        }
         for (arc = 0; arc < dfa_state->s_narcs; ++arc) {
             short a_label = dfa_state->s_arc[arc].a_lbl;
             assert(a_label < _PyParser_Grammar.g_ll.ll_nlabels);
Parser/asdl_c.py+14 −2 modified@@ -890,6 +890,15 @@ def visitModule(self, mod): return obj2ast_object(obj, out, arena); } +static int obj2ast_string(PyObject* obj, PyObject** out, PyArena* arena) +{ + if (!PyUnicode_CheckExact(obj) && !PyBytes_CheckExact(obj)) { + PyErr_SetString(PyExc_TypeError, "AST string must be of type str"); + return 1; + } + return obj2ast_object(obj, out, arena); +} + static int obj2ast_int(PyObject* obj, int* out, PyArena* arena) { int i; @@ -993,6 +1002,8 @@ def visitModule(self, mod): self.emit('if (PyDict_SetItemString(d, "AST", (PyObject*)&AST_type) < 0) return NULL;', 1) self.emit('if (PyModule_AddIntMacro(m, PyCF_ONLY_AST) < 0)', 1) self.emit("return NULL;", 2) + self.emit('if (PyModule_AddIntMacro(m, PyCF_TYPE_COMMENTS) < 0)', 1) + self.emit("return NULL;", 2) for dfn in mod.dfns: self.visit(dfn) self.emit("return m;", 1) @@ -1176,18 +1187,19 @@ class PartingShots(StaticVisitor): } /* mode is 0 for "exec", 1 for "eval" and 2 for "single" input */ +/* and 3 for "func_type" */ mod_ty PyAST_obj2mod(PyObject* ast, PyArena* arena, int mode) { mod_ty res; PyObject *req_type[3]; - char *req_name[] = {"Module", "Expression", "Interactive"}; + char *req_name[] = {"Module", "Expression", "Interactive", "FunctionType"}; int isinstance; req_type[0] = (PyObject*)Module_type; req_type[1] = (PyObject*)Expression_type; req_type[2] = (PyObject*)Interactive_type; - assert(0 <= mode && mode <= 2); + assert(0 <= mode && mode <= 3); if (!init_types()) return NULL;
Parser/parser.c+11 −2 modified@@ -12,6 +12,7 @@ #include "node.h" #include "parser.h" #include "errcode.h" +#include "graminit.h" #ifdef Py_DEBUG @@ -260,15 +261,23 @@ PyParser_AddToken(parser_state *ps, int type, char *str, /* Push non-terminal */ int nt = (x >> 8) + NT_OFFSET; int arrow = x & ((1<<7)-1); - dfa *d1 = PyGrammar_FindDFA( + dfa *d1; + if (nt == func_body_suite && !(ps->p_flags & PyCF_TYPE_COMMENTS)) { + /* When parsing type comments is not requested, + we can provide better errors about bad indentation + by using 'suite' for the body of a funcdef */ + D(printf(" [switch func_body_suite to suite]")); + nt = suite; + } + d1 = PyGrammar_FindDFA( ps->p_grammar, nt); if ((err = push(&ps->p_stack, nt, d1, arrow, lineno, col_offset, end_lineno, end_col_offset)) > 0) { D(printf(" MemError: push\n")); return err; } - D(printf(" Push ...\n")); + D(printf(" Push '%s'\n", d1->d_name)); continue; }
Parser/parsetok.c+78 −0 modified@@ -15,6 +15,42 @@ static node *parsetok(struct tok_state *, grammar *, int, perrdetail *, int *); static int initerr(perrdetail *err_ret, PyObject * filename); +typedef struct { + int *items; + size_t size; + size_t num_items; +} growable_int_array; + +static int +growable_int_array_init(growable_int_array *arr, size_t initial_size) { + assert(initial_size > 0); + arr->items = malloc(initial_size * sizeof(*arr->items)); + arr->size = initial_size; + arr->num_items = 0; + + return arr->items != NULL; +} + +static int +growable_int_array_add(growable_int_array *arr, int item) { + if (arr->num_items >= arr->size) { + arr->size *= 2; + arr->items = realloc(arr->items, arr->size * sizeof(*arr->items)); + if (!arr->items) { + return 0; + } + } + + arr->items[arr->num_items] = item; + arr->num_items++; + return 1; +} + +static void +growable_int_array_deallocate(growable_int_array *arr) { + free(arr->items); +} + /* Parse input coming from a string. Return error code, print some errors. */ node * PyParser_ParseString(const char *s, grammar *g, int start, perrdetail *err_ret) @@ -59,6 +95,9 @@ PyParser_ParseStringObject(const char *s, PyObject *filename, err_ret->error = PyErr_Occurred() ? 
E_DECODE : E_NOMEM; return NULL; } + if (*flags & PyPARSE_TYPE_COMMENTS) { + tok->type_comments = 1; + } #ifndef PGEN Py_INCREF(err_ret->filename); @@ -127,6 +166,9 @@ PyParser_ParseFileObject(FILE *fp, PyObject *filename, err_ret->error = E_NOMEM; return NULL; } + if (*flags & PyPARSE_TYPE_COMMENTS) { + tok->type_comments = 1; + } #ifndef PGEN Py_INCREF(err_ret->filename); tok->filename = err_ret->filename; @@ -188,6 +230,13 @@ parsetok(struct tok_state *tok, grammar *g, int start, perrdetail *err_ret, node *n; int started = 0; int col_offset, end_col_offset; + growable_int_array type_ignores; + + if (!growable_int_array_init(&type_ignores, 10)) { + err_ret->error = E_NOMEM; + PyTokenizer_Free(tok); + return NULL; + } if ((ps = PyParser_New(g, start)) == NULL) { err_ret->error = E_NOMEM; @@ -197,6 +246,8 @@ parsetok(struct tok_state *tok, grammar *g, int start, perrdetail *err_ret, #ifdef PY_PARSER_REQUIRES_FUTURE_KEYWORD if (*flags & PyPARSE_BARRY_AS_BDFL) ps->p_flags |= CO_FUTURE_BARRY_AS_BDFL; + if (*flags & PyPARSE_TYPE_COMMENTS) + ps->p_flags |= PyCF_TYPE_COMMENTS; #endif for (;;) { @@ -277,6 +328,15 @@ parsetok(struct tok_state *tok, grammar *g, int start, perrdetail *err_ret, else { end_col_offset = -1; } + + if (type == TYPE_IGNORE) { + if (!growable_int_array_add(&type_ignores, tok->lineno)) { + err_ret->error = E_NOMEM; + break; + } + continue; + } + if ((err_ret->error = PyParser_AddToken(ps, (int)type, str, lineno, col_offset, tok->lineno, end_col_offset, @@ -293,6 +353,24 @@ parsetok(struct tok_state *tok, grammar *g, int start, perrdetail *err_ret, n = ps->p_tree; ps->p_tree = NULL; + if (n->n_type == file_input) { + /* Put type_ignore nodes in the ENDMARKER of file_input. 
*/ + int num; + node *ch; + size_t i; + + num = NCH(n); + ch = CHILD(n, num - 1); + REQ(ch, ENDMARKER); + + for (i = 0; i < type_ignores.num_items; i++) { + PyNode_AddChild(ch, TYPE_IGNORE, NULL, + type_ignores.items[i], 0, + type_ignores.items[i], 0); + } + } + growable_int_array_deallocate(&type_ignores); + #ifndef PGEN /* Check that the source for a single input statement really is a single statement by looking at what is left in the
Parser/Python.asdl+14 −9 modified@@ -3,17 +3,20 @@ module Python { - mod = Module(stmt* body) + mod = Module(stmt* body, type_ignore *type_ignores) | Interactive(stmt* body) | Expression(expr body) + | FunctionType(expr* argtypes, expr returns) -- not really an actual node but useful in Jython's typesystem. | Suite(stmt* body) stmt = FunctionDef(identifier name, arguments args, - stmt* body, expr* decorator_list, expr? returns) + stmt* body, expr* decorator_list, expr? returns, + string? type_comment) | AsyncFunctionDef(identifier name, arguments args, - stmt* body, expr* decorator_list, expr? returns) + stmt* body, expr* decorator_list, expr? returns, + string? type_comment) | ClassDef(identifier name, expr* bases, @@ -23,18 +26,18 @@ module Python | Return(expr? value) | Delete(expr* targets) - | Assign(expr* targets, expr value) + | Assign(expr* targets, expr value, string? type_comment) | AugAssign(expr target, operator op, expr value) -- 'simple' indicates that we annotate simple name without parens | AnnAssign(expr target, expr annotation, expr? value, int simple) -- use 'orelse' because else is a keyword in target languages - | For(expr target, expr iter, stmt* body, stmt* orelse) - | AsyncFor(expr target, expr iter, stmt* body, stmt* orelse) + | For(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment) + | AsyncFor(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment) | While(expr test, stmt* body, stmt* orelse) | If(expr test, stmt* body, stmt* orelse) - | With(withitem* items, stmt* body) - | AsyncWith(withitem* items, stmt* body) + | With(withitem* items, stmt* body, string? type_comment) + | AsyncWith(withitem* items, stmt* body, string? type_comment) | Raise(expr? exc, expr? cause) | Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody) @@ -111,7 +114,7 @@ module Python arguments = (arg* args, arg? vararg, arg* kwonlyargs, expr* kw_defaults, arg? kwarg, expr* defaults) - arg = (identifier arg, expr? 
annotation) + arg = (identifier arg, expr? annotation, string? type_comment) attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset) -- keyword arguments supplied to call (NULL identifier for **kwargs) @@ -121,5 +124,7 @@ module Python alias = (identifier name, identifier? asname) withitem = (expr context_expr, expr? optional_vars) + + type_ignore = TypeIgnore(int lineno) }
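The ASDL changes above surface in Python as new node fields and a new `FunctionType` root node. The following is a small sketch for inspecting them, assuming Python 3.8 or later; membership tests are used instead of exact tuples because later versions added further fields to some nodes.

```python
import ast

# Module gained a second field holding TypeIgnore nodes.
module_fields = ast.Module._fields

# arg and the statement nodes gained an optional type_comment string.
arg_has_comment = "type_comment" in ast.arg._fields
nodes_with_comment = [
    cls.__name__
    for cls in (ast.FunctionDef, ast.AsyncFunctionDef, ast.Assign,
                ast.For, ast.AsyncFor, ast.With, ast.AsyncWith)
    if "type_comment" in cls._fields
]

# The new "func_type" mode parses a signature type comment on its own
# and returns a FunctionType node with argtypes and returns.
func_type = ast.parse("(int, str) -> bool", mode="func_type")
arg_names = [a.id for a in func_type.argtypes]
return_name = func_type.returns.id

print(module_fields, arg_has_comment, nodes_with_comment, arg_names, return_name)
```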
Parser/token.c +2 −0 modified
@@ -61,6 +61,8 @@ const char * const _PyParser_TokenNames[] = {
     "ELLIPSIS",
     "COLONEQUAL",
     "OP",
+    "TYPE_IGNORE",
+    "TYPE_COMMENT",
     "<ERRORTOKEN>",
     "<COMMENT>",
     "<NL>",
Parser/tokenizer.c+56 −1 modified@@ -48,6 +48,10 @@ static int tok_nextc(struct tok_state *tok); static void tok_backup(struct tok_state *tok, int c); +/* Spaces in this constant are treated as "zero or more spaces or tabs" when + tokenizing. */ +static const char* type_comment_prefix = "# type: "; + /* Create and initialize a new tok_state structure */ static struct tok_state * @@ -82,6 +86,7 @@ tok_new(void) tok->decoding_readline = NULL; tok->decoding_buffer = NULL; #endif + tok->type_comments = 0; return tok; } @@ -1245,11 +1250,61 @@ tok_get(struct tok_state *tok, char **p_start, char **p_end) /* Set start of current token */ tok->start = tok->cur - 1; - /* Skip comment */ + /* Skip comment, unless it's a type comment */ if (c == '#') { + const char *prefix, *p, *type_start; + while (c != EOF && c != '\n') { c = tok_nextc(tok); } + + if (tok->type_comments) { + p = tok->start; + prefix = type_comment_prefix; + while (*prefix && p < tok->cur) { + if (*prefix == ' ') { + while (*p == ' ' || *p == '\t') { + p++; + } + } else if (*prefix == *p) { + p++; + } else { + break; + } + + prefix++; + } + + /* This is a type comment if we matched all of type_comment_prefix. */ + if (!*prefix) { + int is_type_ignore = 1; + tok_backup(tok, c); /* don't eat the newline or EOF */ + + type_start = p; + + is_type_ignore = tok->cur >= p + 6 && memcmp(p, "ignore", 6) == 0; + p += 6; + while (is_type_ignore && p < tok->cur) { + if (*p == '#') + break; + is_type_ignore = is_type_ignore && (*p == ' ' || *p == '\t'); + p++; + } + + if (is_type_ignore) { + /* If this type ignore is the only thing on the line, consume the newline also. */ + if (blankline) { + tok_nextc(tok); + tok->atbol = 1; + } + return TYPE_IGNORE; + } else { + *p_start = (char *) type_start; /* after type_comment_prefix */ + *p_end = tok->cur; + return TYPE_COMMENT; + } + } + } } /* Check for EOF and errors now */
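The prefix matching added to `tok_get` above can be paraphrased in Python to make the rules easier to follow. This is only an illustrative sketch of the logic as it reads in the diff, not CPython's implementation: `classify_comment` and its return labels are hypothetical names, and the handling of blank lines and token positions is left out.

```python
# Spaces in the prefix mean "zero or more spaces or tabs", as in the C code.
TYPE_COMMENT_PREFIX = "# type: "


def classify_comment(comment):
    """Classify one comment string (hypothetical helper).

    Returns ("TYPE_IGNORE", None), ("TYPE_COMMENT", text) or ("COMMENT", None).
    """
    pos = 0
    for ch in TYPE_COMMENT_PREFIX:
        if pos >= len(comment):
            # Ran out of input before the whole prefix matched.
            return ("COMMENT", None)
        if ch == " ":
            while pos < len(comment) and comment[pos] in " \t":
                pos += 1
        elif comment[pos] == ch:
            pos += 1
        else:
            return ("COMMENT", None)

    rest = comment[pos:]
    if rest.startswith("ignore"):
        # Only spaces or tabs may follow "ignore", up to an optional '#'.
        tail = rest[len("ignore"):].split("#", 1)[0]
        if tail.strip(" \t") == "":
            return ("TYPE_IGNORE", None)
    return ("TYPE_COMMENT", rest)


for sample in ("# type: int", "#type:\tList[str]", "# type: ignore",
               "# type: ignore  # why", "# type: ignoreme", "# just a note"):
    print(repr(sample), classify_comment(sample))
```

Note that a comment such as `# type: ignoreme` is not treated as an ignore marker; it falls through to an ordinary type comment whose text is `ignoreme`.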
Parser/tokenizer.h +2 −0 modified
@@ -70,6 +70,8 @@ struct tok_state {
     const char* enc; /* Encoding for the current str. */
     const char* str;
     const char* input; /* Tokenizer's newline translated copy of the string. */
+
+    int type_comments; /* Whether to look for type comments */
 };

 extern struct tok_state *PyTokenizer_FromString(const char *, int);
Python/ast.c+246 −58 modified@@ -698,6 +698,13 @@ ast_error(struct compiling *c, const node *n, const char *errmsg, ...) small_stmt elements is returned. */ +static string +new_type_comment(const char *s) +{ + return PyUnicode_DecodeUTF8(s, strlen(s), NULL); +} +#define NEW_TYPE_COMMENT(n) new_type_comment(STR(n)) + static int num_stmts(const node *n) { @@ -725,11 +732,17 @@ num_stmts(const node *n) case simple_stmt: return NCH(n) / 2; /* Divide by 2 to remove count of semi-colons */ case suite: + case func_body_suite: + /* func_body_suite: simple_stmt | NEWLINE [TYPE_COMMENT NEWLINE] INDENT stmt+ DEDENT */ + /* suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT */ if (NCH(n) == 1) return num_stmts(CHILD(n, 0)); else { + i = 2; l = 0; - for (i = 2; i < (NCH(n) - 1); i++) + if (TYPE(CHILD(n, 1)) == TYPE_COMMENT) + i += 2; + for (; i < (NCH(n) - 1); i++) l += num_stmts(CHILD(n, i)); return l; } @@ -753,10 +766,13 @@ PyAST_FromNodeObject(const node *n, PyCompilerFlags *flags, { int i, j, k, num; asdl_seq *stmts = NULL; + asdl_seq *type_ignores = NULL; stmt_ty s; node *ch; struct compiling c; mod_ty res = NULL; + asdl_seq *argtypes = NULL; + expr_ty ret, arg; c.c_arena = arena; /* borrowed reference */ @@ -795,7 +811,23 @@ PyAST_FromNodeObject(const node *n, PyCompilerFlags *flags, } } } - res = Module(stmts, arena); + + /* Type ignores are stored under the ENDMARKER in file_input. 
*/ + ch = CHILD(n, NCH(n) - 1); + REQ(ch, ENDMARKER); + num = NCH(ch); + type_ignores = _Py_asdl_seq_new(num, arena); + if (!type_ignores) + goto out; + + for (i = 0; i < num; i++) { + type_ignore_ty ti = TypeIgnore(LINENO(CHILD(ch, i)), arena); + if (!ti) + goto out; + asdl_seq_SET(type_ignores, i, ti); + } + + res = Module(stmts, type_ignores, arena); break; case eval_input: { expr_ty testlist_ast; @@ -847,6 +879,46 @@ PyAST_FromNodeObject(const node *n, PyCompilerFlags *flags, res = Interactive(stmts, arena); } break; + case func_type_input: + n = CHILD(n, 0); + REQ(n, func_type); + + if (TYPE(CHILD(n, 1)) == typelist) { + ch = CHILD(n, 1); + /* this is overly permissive -- we don't pay any attention to + * stars on the args -- just parse them into an ordered list */ + num = 0; + for (i = 0; i < NCH(ch); i++) { + if (TYPE(CHILD(ch, i)) == test) { + num++; + } + } + + argtypes = _Py_asdl_seq_new(num, arena); + if (!argtypes) + goto out; + + j = 0; + for (i = 0; i < NCH(ch); i++) { + if (TYPE(CHILD(ch, i)) == test) { + arg = ast_for_expr(&c, CHILD(ch, i)); + if (!arg) + goto out; + asdl_seq_SET(argtypes, j++, arg); + } + } + } + else { + argtypes = _Py_asdl_seq_new(0, arena); + if (!argtypes) + goto out; + } + + ret = ast_for_expr(&c, CHILD(n, NCH(n) - 1)); + if (!ret) + goto out; + res = FunctionType(argtypes, ret, arena); + break; default: PyErr_Format(PyExc_SystemError, "invalid node %d for PyAST_FromNode", TYPE(n)); @@ -1269,7 +1341,7 @@ ast_for_arg(struct compiling *c, const node *n) return NULL; } - ret = arg(name, annotation, LINENO(n), n->n_col_offset, + ret = arg(name, annotation, NULL, LINENO(n), n->n_col_offset, n->n_end_lineno, n->n_end_col_offset, c->c_arena); if (!ret) return NULL; @@ -1328,13 +1400,22 @@ handle_keywordonly_args(struct compiling *c, const node *n, int start, goto error; if (forbidden_name(c, argname, ch, 0)) goto error; - arg = arg(argname, annotation, LINENO(ch), ch->n_col_offset, + arg = arg(argname, annotation, NULL, LINENO(ch), 
ch->n_col_offset, ch->n_end_lineno, ch->n_end_col_offset, c->c_arena); if (!arg) goto error; asdl_seq_SET(kwonlyargs, j++, arg); - i += 2; /* the name and the comma */ + i += 1; /* the name */ + if (TYPE(CHILD(n, i)) == COMMA) + i += 1; /* the comma, if present */ + break; + case TYPE_COMMENT: + /* arg will be equal to the last argument processed */ + arg->type_comment = NEW_TYPE_COMMENT(ch); + if (!arg->type_comment) + goto error; + i += 1; break; case DOUBLESTAR: return i; @@ -1464,19 +1545,29 @@ ast_for_arguments(struct compiling *c, const node *n) if (!arg) return NULL; asdl_seq_SET(posargs, k++, arg); - i += 2; /* the name and the comma */ + i += 1; /* the name */ + if (i < NCH(n) && TYPE(CHILD(n, i)) == COMMA) + i += 1; /* the comma, if present */ break; case STAR: if (i+1 >= NCH(n) || - (i+2 == NCH(n) && TYPE(CHILD(n, i+1)) == COMMA)) { + (i+2 == NCH(n) && (TYPE(CHILD(n, i+1)) == COMMA + || TYPE(CHILD(n, i+1)) == TYPE_COMMENT))) { ast_error(c, CHILD(n, i), - "named arguments must follow bare *"); + "named arguments must follow bare *"); return NULL; } ch = CHILD(n, i+1); /* tfpdef or COMMA */ if (TYPE(ch) == COMMA) { int res = 0; i += 2; /* now follows keyword only arguments */ + + if (i < NCH(n) && TYPE(CHILD(n, i)) == TYPE_COMMENT) { + ast_error(c, CHILD(n, i), + "bare * has associated type comment"); + return NULL; + } + res = handle_keywordonly_args(c, n, i, kwonlyargs, kwdefaults); if (res == -1) return NULL; @@ -1487,7 +1578,17 @@ ast_for_arguments(struct compiling *c, const node *n) if (!vararg) return NULL; - i += 3; + i += 2; /* the star and the name */ + if (i < NCH(n) && TYPE(CHILD(n, i)) == COMMA) + i += 1; /* the comma, if present */ + + if (i < NCH(n) && TYPE(CHILD(n, i)) == TYPE_COMMENT) { + vararg->type_comment = NEW_TYPE_COMMENT(CHILD(n, i)); + if (!vararg->type_comment) + return NULL; + i += 1; + } + if (i < NCH(n) && (TYPE(CHILD(n, i)) == tfpdef || TYPE(CHILD(n, i)) == vfpdef)) { int res = 0; @@ -1504,7 +1605,21 @@ ast_for_arguments(struct 
compiling *c, const node *n) kwarg = ast_for_arg(c, ch); if (!kwarg) return NULL; - i += 3; + i += 2; /* the double star and the name */ + if (TYPE(CHILD(n, i)) == COMMA) + i += 1; /* the comma, if present */ + break; + case TYPE_COMMENT: + assert(i); + + if (kwarg) + arg = kwarg; + + /* arg will be equal to the last argument processed */ + arg->type_comment = NEW_TYPE_COMMENT(ch); + if (!arg->type_comment) + return NULL; + i += 1; break; default: PyErr_Format(PyExc_SystemError, @@ -1613,14 +1728,16 @@ static stmt_ty ast_for_funcdef_impl(struct compiling *c, const node *n0, asdl_seq *decorator_seq, bool is_async) { - /* funcdef: 'def' NAME parameters ['->' test] ':' suite */ + /* funcdef: 'def' NAME parameters ['->' test] ':' [TYPE_COMMENT] suite */ const node * const n = is_async ? CHILD(n0, 1) : n0; identifier name; arguments_ty args; asdl_seq *body; expr_ty returns = NULL; int name_i = 1; int end_lineno, end_col_offset; + node *tc; + string type_comment = NULL; REQ(n, funcdef); @@ -1638,16 +1755,37 @@ ast_for_funcdef_impl(struct compiling *c, const node *n0, return NULL; name_i += 2; } + if (TYPE(CHILD(n, name_i + 3)) == TYPE_COMMENT) { + type_comment = NEW_TYPE_COMMENT(CHILD(n, name_i + 3)); + if (!type_comment) + return NULL; + name_i += 1; + } body = ast_for_suite(c, CHILD(n, name_i + 3)); if (!body) return NULL; get_last_end_pos(body, &end_lineno, &end_col_offset); + if (NCH(CHILD(n, name_i + 3)) > 1) { + /* Check if the suite has a type comment in it. 
*/ + tc = CHILD(CHILD(n, name_i + 3), 1); + + if (TYPE(tc) == TYPE_COMMENT) { + if (type_comment != NULL) { + ast_error(c, n, "Cannot have two type comments on def"); + return NULL; + } + type_comment = NEW_TYPE_COMMENT(tc); + if (!type_comment) + return NULL; + } + } + if (is_async) - return AsyncFunctionDef(name, args, body, decorator_seq, returns, + return AsyncFunctionDef(name, args, body, decorator_seq, returns, type_comment, LINENO(n0), n0->n_col_offset, end_lineno, end_col_offset, c->c_arena); else - return FunctionDef(name, args, body, decorator_seq, returns, + return FunctionDef(name, args, body, decorator_seq, returns, type_comment, LINENO(n), n->n_col_offset, end_lineno, end_col_offset, c->c_arena); } @@ -2295,7 +2433,7 @@ ast_for_atom(struct compiling *c, const node *n) /* It's a dictionary comprehension. */ if (is_dict) { ast_error(c, n, "dict unpacking cannot be used in " - "dict comprehension"); + "dict comprehension"); return NULL; } res = ast_for_dictcomp(c, ch); @@ -2870,13 +3008,13 @@ ast_for_call(struct compiling *c, const node *n, expr_ty func, if (nkeywords) { if (ndoublestars) { ast_error(c, chch, - "positional argument follows " - "keyword argument unpacking"); + "positional argument follows " + "keyword argument unpacking"); } else { ast_error(c, chch, - "positional argument follows " - "keyword argument"); + "positional argument follows " + "keyword argument"); } return NULL; } @@ -2890,8 +3028,8 @@ ast_for_call(struct compiling *c, const node *n, expr_ty func, expr_ty starred; if (ndoublestars) { ast_error(c, chch, - "iterable argument unpacking follows " - "keyword argument unpacking"); + "iterable argument unpacking follows " + "keyword argument unpacking"); return NULL; } e = ast_for_expr(c, CHILD(ch, 1)); @@ -2929,13 +3067,13 @@ ast_for_call(struct compiling *c, const node *n, expr_ty func, if (nkeywords) { if (ndoublestars) { ast_error(c, chch, - "positional argument follows " - "keyword argument unpacking"); + "positional argument 
follows " + "keyword argument unpacking"); } else { ast_error(c, chch, - "positional argument follows " - "keyword argument"); + "positional argument follows " + "keyword argument"); } return NULL; } @@ -2996,7 +3134,7 @@ ast_for_call(struct compiling *c, const node *n, expr_ty func, tmp = ((keyword_ty)asdl_seq_GET(keywords, k))->arg; if (tmp && !PyUnicode_Compare(tmp, key)) { ast_error(c, chch, - "keyword argument repeated"); + "keyword argument repeated"); return NULL; } } @@ -3045,15 +3183,16 @@ ast_for_expr_stmt(struct compiling *c, const node *n) { REQ(n, expr_stmt); /* expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) | - ('=' (yield_expr|testlist_star_expr))*) - annassign: ':' test ['=' test] - testlist_star_expr: (test|star_expr) (',' test|star_expr)* [','] - augassign: '+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' - | '<<=' | '>>=' | '**=' | '//=' + [('=' (yield_expr|testlist_star_expr))+ [TYPE_COMMENT]] ) + annassign: ':' test ['=' (yield_expr|testlist)] + testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [','] + augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' | + '<<=' | '>>=' | '**=' | '//=') test: ... 
here starts the operator precedence dance */ + int num = NCH(n); - if (NCH(n) == 1) { + if (num == 1) { expr_ty e = ast_for_testlist(c, CHILD(n, 0)); if (!e) return NULL; @@ -3178,17 +3317,22 @@ ast_for_expr_stmt(struct compiling *c, const node *n) } } else { - int i; + int i, nch_minus_type, has_type_comment; asdl_seq *targets; node *value; expr_ty expression; + string type_comment; /* a normal assignment */ REQ(CHILD(n, 1), EQUAL); - targets = _Py_asdl_seq_new(NCH(n) / 2, c->c_arena); + + has_type_comment = TYPE(CHILD(n, num - 1)) == TYPE_COMMENT; + nch_minus_type = num - has_type_comment; + + targets = _Py_asdl_seq_new(nch_minus_type / 2, c->c_arena); if (!targets) return NULL; - for (i = 0; i < NCH(n) - 2; i += 2) { + for (i = 0; i < nch_minus_type - 2; i += 2) { expr_ty e; node *ch = CHILD(n, i); if (TYPE(ch) == yield_expr) { @@ -3205,14 +3349,21 @@ ast_for_expr_stmt(struct compiling *c, const node *n) asdl_seq_SET(targets, i / 2, e); } - value = CHILD(n, NCH(n) - 1); + value = CHILD(n, nch_minus_type - 1); if (TYPE(value) == testlist_star_expr) expression = ast_for_testlist(c, value); else expression = ast_for_expr(c, value); if (!expression) return NULL; - return Assign(targets, expression, LINENO(n), n->n_col_offset, + if (has_type_comment) { + type_comment = NEW_TYPE_COMMENT(CHILD(n, nch_minus_type)); + if (!type_comment) + return NULL; + } + else + type_comment = NULL; + return Assign(targets, expression, type_comment, LINENO(n), n->n_col_offset, n->n_end_lineno, n->n_end_col_offset, c->c_arena); } } @@ -3520,8 +3671,9 @@ ast_for_import_stmt(struct compiling *c, const node *n) n = CHILD(n, idx); n_children = NCH(n); if (n_children % 2 == 0) { - ast_error(c, n, "trailing comma not allowed without" - " surrounding parentheses"); + ast_error(c, n, + "trailing comma not allowed without" + " surrounding parentheses"); return NULL; } break; @@ -3639,13 +3791,15 @@ ast_for_assert_stmt(struct compiling *c, const node *n) static asdl_seq * ast_for_suite(struct 
compiling *c, const node *n) { - /* suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT */ + /* suite: simple_stmt | NEWLINE [TYPE_COMMENT NEWLINE] INDENT stmt+ DEDENT */ asdl_seq *seq; stmt_ty s; int i, total, num, end, pos = 0; node *ch; - REQ(n, suite); + if (TYPE(n) != func_body_suite) { + REQ(n, suite); + } total = num_stmts(n); seq = _Py_asdl_seq_new(total, c->c_arena); @@ -3669,7 +3823,13 @@ ast_for_suite(struct compiling *c, const node *n) } } else { - for (i = 2; i < (NCH(n) - 1); i++) { + i = 2; + if (TYPE(CHILD(n, 1)) == TYPE_COMMENT) { + i += 2; + REQ(CHILD(n, 2), NEWLINE); + } + + for (; i < (NCH(n) - 1); i++) { ch = CHILD(n, i); REQ(ch, stmt); num = num_stmts(ch); @@ -3903,11 +4063,15 @@ ast_for_for_stmt(struct compiling *c, const node *n0, bool is_async) expr_ty target, first; const node *node_target; int end_lineno, end_col_offset; - /* for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite] */ + int has_type_comment; + string type_comment; + /* for_stmt: 'for' exprlist 'in' testlist ':' [TYPE_COMMENT] suite ['else' ':' suite] */ REQ(n, for_stmt); - if (NCH(n) == 9) { - seq = ast_for_suite(c, CHILD(n, 8)); + has_type_comment = TYPE(CHILD(n, 5)) == TYPE_COMMENT; + + if (NCH(n) == 9 + has_type_comment) { + seq = ast_for_suite(c, CHILD(n, 8 + has_type_comment)); if (!seq) return NULL; } @@ -3929,7 +4093,7 @@ ast_for_for_stmt(struct compiling *c, const node *n0, bool is_async) expression = ast_for_testlist(c, CHILD(n, 3)); if (!expression) return NULL; - suite_seq = ast_for_suite(c, CHILD(n, 5)); + suite_seq = ast_for_suite(c, CHILD(n, 5 + has_type_comment)); if (!suite_seq) return NULL; @@ -3938,12 +4102,21 @@ ast_for_for_stmt(struct compiling *c, const node *n0, bool is_async) } else { get_last_end_pos(suite_seq, &end_lineno, &end_col_offset); } + + if (has_type_comment) { + type_comment = NEW_TYPE_COMMENT(CHILD(n, 5)); + if (!type_comment) + return NULL; + } + else + type_comment = NULL; + if (is_async) - return AsyncFor(target, expression, 
suite_seq, seq, + return AsyncFor(target, expression, suite_seq, seq, type_comment, LINENO(n0), n0->n_col_offset, end_lineno, end_col_offset, c->c_arena); else - return For(target, expression, suite_seq, seq, + return For(target, expression, suite_seq, seq, type_comment, LINENO(n), n->n_col_offset, end_lineno, end_col_offset, c->c_arena); } @@ -4111,21 +4284,25 @@ ast_for_with_item(struct compiling *c, const node *n) return withitem(context_expr, optional_vars, c->c_arena); } -/* with_stmt: 'with' with_item (',' with_item)* ':' suite */ +/* with_stmt: 'with' with_item (',' with_item)* ':' [TYPE_COMMENT] suite */ static stmt_ty ast_for_with_stmt(struct compiling *c, const node *n0, bool is_async) { const node * const n = is_async ? CHILD(n0, 1) : n0; - int i, n_items, end_lineno, end_col_offset; + int i, n_items, nch_minus_type, has_type_comment, end_lineno, end_col_offset; asdl_seq *items, *body; + string type_comment; REQ(n, with_stmt); - n_items = (NCH(n) - 2) / 2; + has_type_comment = TYPE(CHILD(n, NCH(n) - 2)) == TYPE_COMMENT; + nch_minus_type = NCH(n) - has_type_comment; + + n_items = (nch_minus_type - 2) / 2; items = _Py_asdl_seq_new(n_items, c->c_arena); if (!items) return NULL; - for (i = 1; i < NCH(n) - 2; i += 2) { + for (i = 1; i < nch_minus_type - 2; i += 2) { withitem_ty item = ast_for_with_item(c, CHILD(n, i)); if (!item) return NULL; @@ -4137,11 +4314,19 @@ ast_for_with_stmt(struct compiling *c, const node *n0, bool is_async) return NULL; get_last_end_pos(body, &end_lineno, &end_col_offset); + if (has_type_comment) { + type_comment = NEW_TYPE_COMMENT(CHILD(n, NCH(n) - 2)); + if (!type_comment) + return NULL; + } + else + type_comment = NULL; + if (is_async) - return AsyncWith(items, body, LINENO(n0), n0->n_col_offset, + return AsyncWith(items, body, type_comment, LINENO(n0), n0->n_col_offset, end_lineno, end_col_offset, c->c_arena); else - return With(items, body, LINENO(n), n->n_col_offset, + return With(items, body, type_comment, LINENO(n), 
n->n_col_offset, end_lineno, end_col_offset, c->c_arena); } @@ -4768,8 +4953,9 @@ fstring_find_expr(const char **str, const char *end, int raw, int recurse_lvl, if (ch == '\\') { /* Error: can't include a backslash character, inside parens or strings or not. */ - ast_error(c, n, "f-string expression part " - "cannot include a backslash"); + ast_error(c, n, + "f-string expression part " + "cannot include a backslash"); return -1; } if (quote_char) { @@ -4893,8 +5079,9 @@ fstring_find_expr(const char **str, const char *end, int raw, int recurse_lvl, /* Validate the conversion. */ if (!(conversion == 's' || conversion == 'r' || conversion == 'a')) { - ast_error(c, n, "f-string: invalid conversion character: " - "expected 's', 'r', or 'a'"); + ast_error(c, n, + "f-string: invalid conversion character: " + "expected 's', 'r', or 'a'"); return -1; } } @@ -5446,7 +5633,8 @@ parsestr(struct compiling *c, const node *n, int *bytesmode, int *rawmode, const char *ch; for (ch = s; *ch; ch++) { if (Py_CHARMASK(*ch) >= 0x80) { - ast_error(c, n, "bytes can only contain ASCII " + ast_error(c, n, + "bytes can only contain ASCII " "literal characters."); return -1; }
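The Python/ast.c changes above attach "# type:" comments to function, argument, assignment, for and with nodes, and collect "# type: ignore" lines on the module. A minimal sketch of how those fields look from Python, assuming a released CPython 3.8 or later, where the standard library exposes them through ast.parse(..., type_comments=True). The sample source and the names SOURCE and scale are made up for illustration and are not part of the patch.

```python
import ast

# Hypothetical sample input; the "# type:" comments use PEP 484 syntax.
SOURCE = '''\
def scale(
    x,  # type: float
    *factors  # type: float
):
    # type: (...) -> float
    total = x  # type: float
    for f in factors:  # type: float
        total *= f
    return total

legacy = []  # type: ignore
'''

# With type_comments=True the parser keeps the comments on the nodes.
tree = ast.parse(SOURCE, type_comments=True)
func = tree.body[0]

print("def:    ", func.type_comment)
print("arg:    ", func.args.args[0].type_comment)
print("vararg: ", func.args.vararg.type_comment)
print("assign: ", func.body[0].type_comment)
print("for:    ", func.body[1].type_comment)
print("ignores:", [ti.lineno for ti in tree.type_ignores])

# Without the flag the same source parses, but the comments are dropped.
plain = ast.parse(SOURCE)
print("default:", plain.body[0].type_comment, plain.type_ignores)
```

Parsing untrusted source this way is the path the advisory is concerned with, so it should only be done on a typed_ast or CPython build that contains the fix.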
Python/bltinmodule.c +16 −4 modified
@@ -765,13 +765,13 @@ builtin_compile_impl(PyObject *module, PyObject *source, PyObject *filename,
     int compile_mode = -1;
     int is_ast;
     PyCompilerFlags cf;
-    int start[] = {Py_file_input, Py_eval_input, Py_single_input};
+    int start[] = {Py_file_input, Py_eval_input, Py_single_input, Py_func_type_input};
     PyObject *result;

     cf.cf_flags = flags | PyCF_SOURCE_IS_UTF8;

     if (flags &
-        ~(PyCF_MASK | PyCF_MASK_OBSOLETE | PyCF_DONT_IMPLY_DEDENT | PyCF_ONLY_AST))
+        ~(PyCF_MASK | PyCF_MASK_OBSOLETE | PyCF_DONT_IMPLY_DEDENT | PyCF_ONLY_AST | PyCF_TYPE_COMMENTS))
     {
         PyErr_SetString(PyExc_ValueError,
                         "compile(): unrecognised flags");
@@ -795,9 +795,21 @@ builtin_compile_impl(PyObject *module, PyObject *source, PyObject *filename,
         compile_mode = 1;
     else if (strcmp(mode, "single") == 0)
         compile_mode = 2;
+    else if (strcmp(mode, "func_type") == 0) {
+        if (!(flags & PyCF_ONLY_AST)) {
+            PyErr_SetString(PyExc_ValueError,
+                            "compile() mode 'func_type' requires flag PyCF_ONLY_AST");
+            goto error;
+        }
+        compile_mode = 3;
+    }
     else {
-        PyErr_SetString(PyExc_ValueError,
-                        "compile() mode must be 'exec', 'eval' or 'single'");
+        const char *msg;
+        if (flags & PyCF_ONLY_AST)
+            msg = "compile() mode must be 'exec', 'eval', 'single' or 'func_type'";
+        else
+            msg = "compile() mode must be 'exec', 'eval' or 'single'";
+        PyErr_SetString(PyExc_ValueError, msg);
         goto error;
     }
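The Python/bltinmodule.c hunk above adds a "func_type" mode to compile() and rejects it unless PyCF_ONLY_AST is set. A short sketch of that behavior as seen from Python, assuming CPython 3.8 or later. The signature string and the variable names are illustrative only.

```python
import ast

# Hypothetical signature-style type comment, as used on a def line.
SIGNATURE = "(int, *str, **float) -> bool"

# ast.parse passes PyCF_ONLY_AST itself, so mode="func_type" is accepted.
sig = ast.parse(SIGNATURE, mode="func_type")
arg_names = [a.id for a in sig.argtypes]  # stars are dropped, order is kept
print(type(sig).__name__, arg_names, sig.returns.id)

# Calling compile() directly works when the flag is supplied explicitly.
via_compile = compile(SIGNATURE, "<sig>", "func_type", ast.PyCF_ONLY_AST)

# Without PyCF_ONLY_AST the new mode is refused with ValueError.
try:
    compile(SIGNATURE, "<sig>", "func_type")
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False
```

The flag requirement follows from the mode producing only a FunctionType AST node, which has no bytecode form to compile to.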
Python/graminit.c+613 −412 modified@@ -135,30 +135,35 @@ static arc arcs_7_3[2] = { static arc arcs_7_4[1] = { {26, 6}, }; -static arc arcs_7_5[1] = { +static arc arcs_7_5[2] = { {28, 7}, + {29, 8}, }; static arc arcs_7_6[1] = { {27, 5}, }; static arc arcs_7_7[1] = { - {0, 7}, + {29, 8}, +}; +static arc arcs_7_8[1] = { + {0, 8}, }; -static state states_7[8] = { +static state states_7[9] = { {1, arcs_7_0}, {1, arcs_7_1}, {1, arcs_7_2}, {2, arcs_7_3}, {1, arcs_7_4}, - {1, arcs_7_5}, + {2, arcs_7_5}, {1, arcs_7_6}, {1, arcs_7_7}, + {1, arcs_7_8}, }; static arc arcs_8_0[1] = { {13, 1}, }; static arc arcs_8_1[2] = { - {29, 2}, + {30, 2}, {15, 3}, }; static arc arcs_8_2[1] = { @@ -174,107 +179,144 @@ static state states_8[4] = { {1, arcs_8_3}, }; static arc arcs_9_0[3] = { - {30, 1}, - {33, 2}, - {34, 3}, + {31, 1}, + {34, 2}, + {35, 3}, }; -static arc arcs_9_1[3] = { - {31, 4}, - {32, 5}, +static arc arcs_9_1[4] = { + {32, 4}, + {33, 5}, + {28, 6}, {0, 1}, }; -static arc arcs_9_2[3] = { - {30, 6}, - {32, 7}, +static arc arcs_9_2[4] = { + {31, 7}, + {33, 8}, + {28, 6}, {0, 2}, }; static arc arcs_9_3[1] = { - {30, 8}, + {31, 9}, }; static arc arcs_9_4[1] = { - {26, 9}, + {26, 10}, }; -static arc arcs_9_5[4] = { - {30, 10}, - {33, 11}, - {34, 3}, +static arc arcs_9_5[5] = { + {28, 11}, + {31, 12}, + {34, 13}, + {35, 3}, {0, 5}, }; -static arc arcs_9_6[2] = { - {32, 7}, +static arc arcs_9_6[1] = { {0, 6}, }; static arc arcs_9_7[3] = { - {30, 12}, - {34, 3}, + {33, 8}, + {28, 6}, {0, 7}, }; -static arc arcs_9_8[2] = { - {32, 13}, +static arc arcs_9_8[4] = { + {28, 14}, + {31, 15}, + {35, 3}, {0, 8}, }; -static arc arcs_9_9[2] = { - {32, 5}, +static arc arcs_9_9[3] = { + {33, 16}, + {28, 6}, {0, 9}, }; static arc arcs_9_10[3] = { - {32, 5}, - {31, 4}, + {33, 5}, + {28, 6}, {0, 10}, }; -static arc arcs_9_11[3] = { - {30, 14}, - {32, 15}, +static arc arcs_9_11[4] = { + {31, 12}, + {34, 13}, + {35, 3}, {0, 11}, }; -static arc arcs_9_12[3] = { - {32, 7}, - {31, 16}, +static arc 
arcs_9_12[4] = { + {33, 5}, + {32, 4}, + {28, 6}, {0, 12}, }; -static arc arcs_9_13[1] = { +static arc arcs_9_13[4] = { + {31, 17}, + {33, 18}, + {28, 6}, {0, 13}, }; -static arc arcs_9_14[2] = { - {32, 15}, +static arc arcs_9_14[3] = { + {31, 15}, + {35, 3}, {0, 14}, }; -static arc arcs_9_15[3] = { - {30, 17}, - {34, 3}, +static arc arcs_9_15[4] = { + {33, 8}, + {32, 19}, + {28, 6}, {0, 15}, }; -static arc arcs_9_16[1] = { - {26, 6}, +static arc arcs_9_16[2] = { + {28, 6}, + {0, 16}, }; static arc arcs_9_17[3] = { - {32, 15}, - {31, 18}, + {33, 18}, + {28, 6}, {0, 17}, }; -static arc arcs_9_18[1] = { - {26, 14}, +static arc arcs_9_18[4] = { + {28, 20}, + {31, 21}, + {35, 3}, + {0, 18}, +}; +static arc arcs_9_19[1] = { + {26, 7}, }; -static state states_9[19] = { +static arc arcs_9_20[3] = { + {31, 21}, + {35, 3}, + {0, 20}, +}; +static arc arcs_9_21[4] = { + {33, 18}, + {32, 22}, + {28, 6}, + {0, 21}, +}; +static arc arcs_9_22[1] = { + {26, 17}, +}; +static state states_9[23] = { {3, arcs_9_0}, - {3, arcs_9_1}, - {3, arcs_9_2}, + {4, arcs_9_1}, + {4, arcs_9_2}, {1, arcs_9_3}, {1, arcs_9_4}, - {4, arcs_9_5}, - {2, arcs_9_6}, + {5, arcs_9_5}, + {1, arcs_9_6}, {3, arcs_9_7}, - {2, arcs_9_8}, - {2, arcs_9_9}, + {4, arcs_9_8}, + {3, arcs_9_9}, {3, arcs_9_10}, - {3, arcs_9_11}, - {3, arcs_9_12}, - {1, arcs_9_13}, - {2, arcs_9_14}, - {3, arcs_9_15}, - {1, arcs_9_16}, + {4, arcs_9_11}, + {4, arcs_9_12}, + {4, arcs_9_13}, + {3, arcs_9_14}, + {4, arcs_9_15}, + {2, arcs_9_16}, {3, arcs_9_17}, - {1, arcs_9_18}, + {4, arcs_9_18}, + {1, arcs_9_19}, + {3, arcs_9_20}, + {4, arcs_9_21}, + {1, arcs_9_22}, }; static arc arcs_10_0[1] = { {23, 1}, @@ -296,82 +338,82 @@ static state states_10[4] = { {1, arcs_10_3}, }; static arc arcs_11_0[3] = { - {36, 1}, - {33, 2}, - {34, 3}, + {37, 1}, + {34, 2}, + {35, 3}, }; static arc arcs_11_1[3] = { - {31, 4}, - {32, 5}, + {32, 4}, + {33, 5}, {0, 1}, }; static arc arcs_11_2[3] = { - {36, 6}, - {32, 7}, + {37, 6}, + {33, 7}, {0, 2}, }; static 
arc arcs_11_3[1] = { - {36, 8}, + {37, 8}, }; static arc arcs_11_4[1] = { {26, 9}, }; static arc arcs_11_5[4] = { - {36, 10}, - {33, 11}, - {34, 3}, + {37, 10}, + {34, 11}, + {35, 3}, {0, 5}, }; static arc arcs_11_6[2] = { - {32, 7}, + {33, 7}, {0, 6}, }; static arc arcs_11_7[3] = { - {36, 12}, - {34, 3}, + {37, 12}, + {35, 3}, {0, 7}, }; static arc arcs_11_8[2] = { - {32, 13}, + {33, 13}, {0, 8}, }; static arc arcs_11_9[2] = { - {32, 5}, + {33, 5}, {0, 9}, }; static arc arcs_11_10[3] = { - {32, 5}, - {31, 4}, + {33, 5}, + {32, 4}, {0, 10}, }; static arc arcs_11_11[3] = { - {36, 14}, - {32, 15}, + {37, 14}, + {33, 15}, {0, 11}, }; static arc arcs_11_12[3] = { - {32, 7}, - {31, 16}, + {33, 7}, + {32, 16}, {0, 12}, }; static arc arcs_11_13[1] = { {0, 13}, }; static arc arcs_11_14[2] = { - {32, 15}, + {33, 15}, {0, 14}, }; static arc arcs_11_15[3] = { - {36, 17}, - {34, 3}, + {37, 17}, + {35, 3}, {0, 15}, }; static arc arcs_11_16[1] = { {26, 6}, }; static arc arcs_11_17[3] = { - {32, 15}, - {31, 18}, + {33, 15}, + {32, 18}, {0, 17}, }; static arc arcs_11_18[1] = { @@ -420,14 +462,14 @@ static state states_13[2] = { {1, arcs_13_1}, }; static arc arcs_14_0[1] = { - {37, 1}, + {38, 1}, }; static arc arcs_14_1[2] = { - {38, 2}, + {39, 2}, {2, 3}, }; static arc arcs_14_2[2] = { - {37, 1}, + {38, 1}, {2, 3}, }; static arc arcs_14_3[1] = { @@ -440,14 +482,14 @@ static state states_14[4] = { {1, arcs_14_3}, }; static arc arcs_15_0[8] = { - {39, 1}, {40, 1}, {41, 1}, {42, 1}, {43, 1}, {44, 1}, {45, 1}, {46, 1}, + {47, 1}, }; static arc arcs_15_1[1] = { {0, 1}, @@ -457,27 +499,28 @@ static state states_15[2] = { {1, arcs_15_1}, }; static arc arcs_16_0[1] = { - {47, 1}, + {48, 1}, }; static arc arcs_16_1[4] = { - {48, 2}, - {49, 3}, - {31, 4}, + {49, 2}, + {50, 3}, + {32, 4}, {0, 1}, }; static arc arcs_16_2[1] = { {0, 2}, }; static arc arcs_16_3[2] = { - {50, 2}, + {51, 2}, {9, 2}, }; static arc arcs_16_4[2] = { - {50, 5}, - {47, 5}, + {51, 5}, + {48, 5}, }; -static arc 
arcs_16_5[2] = { - {31, 4}, +static arc arcs_16_5[3] = { + {32, 4}, + {28, 2}, {0, 5}, }; static state states_16[6] = { @@ -486,7 +529,7 @@ static state states_16[6] = { {1, arcs_16_2}, {2, arcs_16_3}, {2, arcs_16_4}, - {2, arcs_16_5}, + {3, arcs_16_5}, }; static arc arcs_17_0[1] = { {27, 1}, @@ -495,11 +538,11 @@ static arc arcs_17_1[1] = { {26, 2}, }; static arc arcs_17_2[2] = { - {31, 3}, + {32, 3}, {0, 2}, }; static arc arcs_17_3[2] = { - {50, 4}, + {51, 4}, {9, 4}, }; static arc arcs_17_4[1] = { @@ -514,15 +557,15 @@ static state states_17[5] = { }; static arc arcs_18_0[2] = { {26, 1}, - {51, 1}, + {52, 1}, }; static arc arcs_18_1[2] = { - {32, 2}, + {33, 2}, {0, 1}, }; static arc arcs_18_2[3] = { {26, 1}, - {51, 1}, + {52, 1}, {0, 2}, }; static state states_18[3] = { @@ -531,7 +574,6 @@ static state states_18[3] = { {3, arcs_18_2}, }; static arc arcs_19_0[13] = { - {52, 1}, {53, 1}, {54, 1}, {55, 1}, @@ -544,6 +586,7 @@ static arc arcs_19_0[13] = { {62, 1}, {63, 1}, {64, 1}, + {65, 1}, }; static arc arcs_19_1[1] = { {0, 1}, @@ -553,10 +596,10 @@ static state states_19[2] = { {1, arcs_19_1}, }; static arc arcs_20_0[1] = { - {65, 1}, + {66, 1}, }; static arc arcs_20_1[1] = { - {66, 2}, + {67, 2}, }; static arc arcs_20_2[1] = { {0, 2}, @@ -567,7 +610,7 @@ static state states_20[3] = { {1, arcs_20_2}, }; static arc arcs_21_0[1] = { - {67, 1}, + {68, 1}, }; static arc arcs_21_1[1] = { {0, 1}, @@ -577,11 +620,11 @@ static state states_21[2] = { {1, arcs_21_1}, }; static arc arcs_22_0[5] = { - {68, 1}, {69, 1}, {70, 1}, {71, 1}, {72, 1}, + {73, 1}, }; static arc arcs_22_1[1] = { {0, 1}, @@ -591,7 +634,7 @@ static state states_22[2] = { {1, arcs_22_1}, }; static arc arcs_23_0[1] = { - {73, 1}, + {74, 1}, }; static arc arcs_23_1[1] = { {0, 1}, @@ -601,7 +644,7 @@ static state states_23[2] = { {1, arcs_23_1}, }; static arc arcs_24_0[1] = { - {74, 1}, + {75, 1}, }; static arc arcs_24_1[1] = { {0, 1}, @@ -611,10 +654,10 @@ static state states_24[2] = { {1, arcs_24_1}, }; 
static arc arcs_25_0[1] = { - {75, 1}, + {76, 1}, }; static arc arcs_25_1[2] = { - {47, 2}, + {48, 2}, {0, 1}, }; static arc arcs_25_2[1] = { @@ -626,7 +669,7 @@ static state states_25[3] = { {1, arcs_25_2}, }; static arc arcs_26_0[1] = { - {50, 1}, + {51, 1}, }; static arc arcs_26_1[1] = { {0, 1}, @@ -636,14 +679,14 @@ static state states_26[2] = { {1, arcs_26_1}, }; static arc arcs_27_0[1] = { - {76, 1}, + {77, 1}, }; static arc arcs_27_1[2] = { {26, 2}, {0, 1}, }; static arc arcs_27_2[2] = { - {77, 3}, + {78, 3}, {0, 2}, }; static arc arcs_27_3[1] = { @@ -660,8 +703,8 @@ static state states_27[5] = { {1, arcs_27_4}, }; static arc arcs_28_0[2] = { - {78, 1}, {79, 1}, + {80, 1}, }; static arc arcs_28_1[1] = { {0, 1}, @@ -671,10 +714,10 @@ static state states_28[2] = { {1, arcs_28_1}, }; static arc arcs_29_0[1] = { - {80, 1}, + {81, 1}, }; static arc arcs_29_1[1] = { - {81, 2}, + {82, 2}, }; static arc arcs_29_2[1] = { {0, 2}, @@ -685,32 +728,32 @@ static state states_29[3] = { {1, arcs_29_2}, }; static arc arcs_30_0[1] = { - {77, 1}, + {78, 1}, }; static arc arcs_30_1[3] = { - {82, 2}, {83, 2}, + {84, 2}, {12, 3}, }; static arc arcs_30_2[4] = { - {82, 2}, {83, 2}, + {84, 2}, {12, 3}, - {80, 4}, + {81, 4}, }; static arc arcs_30_3[1] = { - {80, 4}, + {81, 4}, }; static arc arcs_30_4[3] = { - {33, 5}, + {34, 5}, {13, 6}, - {84, 5}, + {85, 5}, }; static arc arcs_30_5[1] = { {0, 5}, }; static arc arcs_30_6[1] = { - {84, 7}, + {85, 7}, }; static arc arcs_30_7[1] = { {15, 5}, @@ -729,7 +772,7 @@ static arc arcs_31_0[1] = { {23, 1}, }; static arc arcs_31_1[2] = { - {86, 2}, + {87, 2}, {0, 1}, }; static arc arcs_31_2[1] = { @@ -748,7 +791,7 @@ static arc arcs_32_0[1] = { {12, 1}, }; static arc arcs_32_1[2] = { - {86, 2}, + {87, 2}, {0, 1}, }; static arc arcs_32_2[1] = { @@ -764,14 +807,14 @@ static state states_32[4] = { {1, arcs_32_3}, }; static arc arcs_33_0[1] = { - {85, 1}, + {86, 1}, }; static arc arcs_33_1[2] = { - {32, 2}, + {33, 2}, {0, 1}, }; static arc 
arcs_33_2[2] = { - {85, 1}, + {86, 1}, {0, 2}, }; static state states_33[3] = { @@ -780,10 +823,10 @@ static state states_33[3] = { {2, arcs_33_2}, }; static arc arcs_34_0[1] = { - {87, 1}, + {88, 1}, }; static arc arcs_34_1[2] = { - {32, 0}, + {33, 0}, {0, 1}, }; static state states_34[2] = { @@ -794,21 +837,21 @@ static arc arcs_35_0[1] = { {23, 1}, }; static arc arcs_35_1[2] = { - {82, 0}, + {83, 0}, {0, 1}, }; static state states_35[2] = { {1, arcs_35_0}, {2, arcs_35_1}, }; static arc arcs_36_0[1] = { - {88, 1}, + {89, 1}, }; static arc arcs_36_1[1] = { {23, 2}, }; static arc arcs_36_2[2] = { - {32, 1}, + {33, 1}, {0, 2}, }; static state states_36[3] = { @@ -817,13 +860,13 @@ static state states_36[3] = { {2, arcs_36_2}, }; static arc arcs_37_0[1] = { - {89, 1}, + {90, 1}, }; static arc arcs_37_1[1] = { {23, 2}, }; static arc arcs_37_2[2] = { - {32, 1}, + {33, 1}, {0, 2}, }; static state states_37[3] = { @@ -832,13 +875,13 @@ static state states_37[3] = { {2, arcs_37_2}, }; static arc arcs_38_0[1] = { - {90, 1}, + {91, 1}, }; static arc arcs_38_1[1] = { {26, 2}, }; static arc arcs_38_2[2] = { - {32, 3}, + {33, 3}, {0, 2}, }; static arc arcs_38_3[1] = { @@ -855,15 +898,15 @@ static state states_38[5] = { {1, arcs_38_4}, }; static arc arcs_39_0[9] = { - {91, 1}, {92, 1}, {93, 1}, {94, 1}, {95, 1}, + {96, 1}, {19, 1}, {18, 1}, {17, 1}, - {96, 1}, + {97, 1}, }; static arc arcs_39_1[1] = { {0, 1}, @@ -877,8 +920,8 @@ static arc arcs_40_0[1] = { }; static arc arcs_40_1[3] = { {19, 2}, - {95, 2}, - {93, 2}, + {96, 2}, + {94, 2}, }; static arc arcs_40_2[1] = { {0, 2}, @@ -889,27 +932,27 @@ static state states_40[3] = { {1, arcs_40_2}, }; static arc arcs_41_0[1] = { - {97, 1}, + {98, 1}, }; static arc arcs_41_1[1] = { - {98, 2}, + {99, 2}, }; static arc arcs_41_2[1] = { {27, 3}, }; static arc arcs_41_3[1] = { - {28, 4}, + {100, 4}, }; static arc arcs_41_4[3] = { - {99, 1}, - {100, 5}, + {101, 1}, + {102, 5}, {0, 4}, }; static arc arcs_41_5[1] = { {27, 6}, }; static arc 
arcs_41_6[1] = { - {28, 7}, + {100, 7}, }; static arc arcs_41_7[1] = { {0, 7}, @@ -925,7 +968,7 @@ static state states_41[8] = { {1, arcs_41_7}, }; static arc arcs_42_0[1] = { - {101, 1}, + {103, 1}, }; static arc arcs_42_1[1] = { {26, 2}, @@ -934,17 +977,17 @@ static arc arcs_42_2[1] = { {27, 3}, }; static arc arcs_42_3[1] = { - {28, 4}, + {100, 4}, }; static arc arcs_42_4[2] = { - {100, 5}, + {102, 5}, {0, 4}, }; static arc arcs_42_5[1] = { {27, 6}, }; static arc arcs_42_6[1] = { - {28, 7}, + {100, 7}, }; static arc arcs_42_7[1] = { {0, 7}, @@ -960,60 +1003,65 @@ static state states_42[8] = { {1, arcs_42_7}, }; static arc arcs_43_0[1] = { - {102, 1}, + {104, 1}, }; static arc arcs_43_1[1] = { - {66, 2}, + {67, 2}, }; static arc arcs_43_2[1] = { - {103, 3}, + {105, 3}, }; static arc arcs_43_3[1] = { {9, 4}, }; static arc arcs_43_4[1] = { {27, 5}, }; -static arc arcs_43_5[1] = { +static arc arcs_43_5[2] = { {28, 6}, + {100, 7}, }; -static arc arcs_43_6[2] = { +static arc arcs_43_6[1] = { {100, 7}, - {0, 6}, }; -static arc arcs_43_7[1] = { - {27, 8}, +static arc arcs_43_7[2] = { + {102, 8}, + {0, 7}, }; static arc arcs_43_8[1] = { - {28, 9}, + {27, 9}, }; static arc arcs_43_9[1] = { - {0, 9}, + {100, 10}, +}; +static arc arcs_43_10[1] = { + {0, 10}, }; -static state states_43[10] = { +static state states_43[11] = { {1, arcs_43_0}, {1, arcs_43_1}, {1, arcs_43_2}, {1, arcs_43_3}, {1, arcs_43_4}, - {1, arcs_43_5}, - {2, arcs_43_6}, - {1, arcs_43_7}, + {2, arcs_43_5}, + {1, arcs_43_6}, + {2, arcs_43_7}, {1, arcs_43_8}, {1, arcs_43_9}, + {1, arcs_43_10}, }; static arc arcs_44_0[1] = { - {104, 1}, + {106, 1}, }; static arc arcs_44_1[1] = { {27, 2}, }; static arc arcs_44_2[1] = { - {28, 3}, + {100, 3}, }; static arc arcs_44_3[2] = { - {105, 4}, - {106, 5}, + {107, 4}, + {108, 5}, }; static arc arcs_44_4[1] = { {27, 6}, @@ -1022,15 +1070,15 @@ static arc arcs_44_5[1] = { {27, 7}, }; static arc arcs_44_6[1] = { - {28, 8}, + {100, 8}, }; static arc arcs_44_7[1] = { - {28, 9}, 
+ {100, 9}, }; static arc arcs_44_8[4] = { - {105, 4}, - {100, 10}, - {106, 5}, + {107, 4}, + {102, 10}, + {108, 5}, {0, 8}, }; static arc arcs_44_9[1] = { @@ -1040,10 +1088,10 @@ static arc arcs_44_10[1] = { {27, 11}, }; static arc arcs_44_11[1] = { - {28, 12}, + {100, 12}, }; static arc arcs_44_12[2] = { - {106, 5}, + {108, 5}, {0, 12}, }; static state states_44[13] = { @@ -1062,37 +1110,42 @@ static state states_44[13] = { {2, arcs_44_12}, }; static arc arcs_45_0[1] = { - {107, 1}, + {109, 1}, }; static arc arcs_45_1[1] = { - {108, 2}, + {110, 2}, }; static arc arcs_45_2[2] = { - {32, 1}, + {33, 1}, {27, 3}, }; -static arc arcs_45_3[1] = { +static arc arcs_45_3[2] = { {28, 4}, + {100, 5}, }; static arc arcs_45_4[1] = { - {0, 4}, + {100, 5}, }; -static state states_45[5] = { +static arc arcs_45_5[1] = { + {0, 5}, +}; +static state states_45[6] = { {1, arcs_45_0}, {1, arcs_45_1}, {2, arcs_45_2}, - {1, arcs_45_3}, + {2, arcs_45_3}, {1, arcs_45_4}, + {1, arcs_45_5}, }; static arc arcs_46_0[1] = { {26, 1}, }; static arc arcs_46_1[2] = { - {86, 2}, + {87, 2}, {0, 1}, }; static arc arcs_46_2[1] = { - {109, 3}, + {111, 3}, }; static arc arcs_46_3[1] = { {0, 3}, @@ -1104,14 +1157,14 @@ static state states_46[4] = { {1, arcs_46_3}, }; static arc arcs_47_0[1] = { - {110, 1}, + {112, 1}, }; static arc arcs_47_1[2] = { {26, 2}, {0, 1}, }; static arc arcs_47_2[2] = { - {86, 3}, + {87, 3}, {0, 2}, }; static arc arcs_47_3[1] = { @@ -1135,14 +1188,14 @@ static arc arcs_48_1[1] = { {0, 1}, }; static arc arcs_48_2[1] = { - {111, 3}, + {113, 3}, }; static arc arcs_48_3[1] = { {6, 4}, }; static arc arcs_48_4[2] = { {6, 4}, - {112, 1}, + {114, 1}, }; static state states_48[5] = { {2, arcs_48_0}, @@ -1155,7 +1208,7 @@ static arc arcs_49_0[1] = { {26, 1}, }; static arc arcs_49_1[2] = { - {113, 2}, + {115, 2}, {0, 1}, }; static arc arcs_49_2[1] = { @@ -1171,21 +1224,21 @@ static state states_49[4] = { {1, arcs_49_3}, }; static arc arcs_50_0[2] = { - {114, 1}, - {115, 2}, + {116, 1}, + 
{117, 2}, }; static arc arcs_50_1[2] = { - {97, 3}, + {98, 3}, {0, 1}, }; static arc arcs_50_2[1] = { {0, 2}, }; static arc arcs_50_3[1] = { - {114, 4}, + {116, 4}, }; static arc arcs_50_4[1] = { - {100, 5}, + {102, 5}, }; static arc arcs_50_5[1] = { {26, 2}, @@ -1199,8 +1252,8 @@ static state states_50[6] = { {1, arcs_50_5}, }; static arc arcs_51_0[2] = { - {114, 1}, - {117, 1}, + {116, 1}, + {119, 1}, }; static arc arcs_51_1[1] = { {0, 1}, @@ -1210,10 +1263,10 @@ static state states_51[2] = { {1, arcs_51_1}, }; static arc arcs_52_0[1] = { - {118, 1}, + {120, 1}, }; static arc arcs_52_1[2] = { - {35, 2}, + {36, 2}, {27, 3}, }; static arc arcs_52_2[1] = { @@ -1233,17 +1286,17 @@ static state states_52[5] = { {1, arcs_52_4}, }; static arc arcs_53_0[1] = { - {118, 1}, + {120, 1}, }; static arc arcs_53_1[2] = { - {35, 2}, + {36, 2}, {27, 3}, }; static arc arcs_53_2[1] = { {27, 3}, }; static arc arcs_53_3[1] = { - {116, 4}, + {118, 4}, }; static arc arcs_53_4[1] = { {0, 4}, @@ -1256,33 +1309,33 @@ static state states_53[5] = { {1, arcs_53_4}, }; static arc arcs_54_0[1] = { - {119, 1}, + {121, 1}, }; static arc arcs_54_1[2] = { - {120, 0}, + {122, 0}, {0, 1}, }; static state states_54[2] = { {1, arcs_54_0}, {2, arcs_54_1}, }; static arc arcs_55_0[1] = { - {121, 1}, + {123, 1}, }; static arc arcs_55_1[2] = { - {122, 0}, + {124, 0}, {0, 1}, }; static state states_55[2] = { {1, arcs_55_0}, {2, arcs_55_1}, }; static arc arcs_56_0[2] = { - {123, 1}, - {124, 2}, + {125, 1}, + {126, 2}, }; static arc arcs_56_1[1] = { - {121, 2}, + {123, 2}, }; static arc arcs_56_2[1] = { {0, 2}, @@ -1293,36 +1346,36 @@ static state states_56[3] = { {1, arcs_56_2}, }; static arc arcs_57_0[1] = { - {109, 1}, + {111, 1}, }; static arc arcs_57_1[2] = { - {125, 0}, + {127, 0}, {0, 1}, }; static state states_57[2] = { {1, arcs_57_0}, {2, arcs_57_1}, }; static arc arcs_58_0[10] = { - {126, 1}, - {127, 1}, {128, 1}, {129, 1}, {130, 1}, {131, 1}, {132, 1}, - {103, 1}, - {123, 2}, - {133, 3}, + {133, 
1}, + {134, 1}, + {105, 1}, + {125, 2}, + {135, 3}, }; static arc arcs_58_1[1] = { {0, 1}, }; static arc arcs_58_2[1] = { - {103, 1}, + {105, 1}, }; static arc arcs_58_3[2] = { - {123, 1}, + {125, 1}, {0, 3}, }; static state states_58[4] = { @@ -1332,10 +1385,10 @@ static state states_58[4] = { {2, arcs_58_3}, }; static arc arcs_59_0[1] = { - {33, 1}, + {34, 1}, }; static arc arcs_59_1[1] = { - {109, 2}, + {111, 2}, }; static arc arcs_59_2[1] = { {0, 2}, @@ -1346,85 +1399,85 @@ static state states_59[3] = { {1, arcs_59_2}, }; static arc arcs_60_0[1] = { - {134, 1}, + {136, 1}, }; static arc arcs_60_1[2] = { - {135, 0}, + {137, 0}, {0, 1}, }; static state states_60[2] = { {1, arcs_60_0}, {2, arcs_60_1}, }; static arc arcs_61_0[1] = { - {136, 1}, + {138, 1}, }; static arc arcs_61_1[2] = { - {137, 0}, + {139, 0}, {0, 1}, }; static state states_61[2] = { {1, arcs_61_0}, {2, arcs_61_1}, }; static arc arcs_62_0[1] = { - {138, 1}, + {140, 1}, }; static arc arcs_62_1[2] = { - {139, 0}, + {141, 0}, {0, 1}, }; static state states_62[2] = { {1, arcs_62_0}, {2, arcs_62_1}, }; static arc arcs_63_0[1] = { - {140, 1}, + {142, 1}, }; static arc arcs_63_1[3] = { - {141, 0}, - {142, 0}, + {143, 0}, + {144, 0}, {0, 1}, }; static state states_63[2] = { {1, arcs_63_0}, {3, arcs_63_1}, }; static arc arcs_64_0[1] = { - {143, 1}, + {145, 1}, }; static arc arcs_64_1[3] = { - {144, 0}, - {145, 0}, + {146, 0}, + {147, 0}, {0, 1}, }; static state states_64[2] = { {1, arcs_64_0}, {3, arcs_64_1}, }; static arc arcs_65_0[1] = { - {146, 1}, + {148, 1}, }; static arc arcs_65_1[6] = { - {33, 0}, + {34, 0}, {11, 0}, - {147, 0}, - {148, 0}, {149, 0}, + {150, 0}, + {151, 0}, {0, 1}, }; static state states_65[2] = { {1, arcs_65_0}, {6, arcs_65_1}, }; static arc arcs_66_0[4] = { - {144, 1}, - {145, 1}, - {150, 1}, - {151, 2}, + {146, 1}, + {147, 1}, + {152, 1}, + {153, 2}, }; static arc arcs_66_1[1] = { - {146, 2}, + {148, 2}, }; static arc arcs_66_2[1] = { {0, 2}, @@ -1435,14 +1488,14 @@ static state 
states_66[3] = { {1, arcs_66_2}, }; static arc arcs_67_0[1] = { - {152, 1}, + {154, 1}, }; static arc arcs_67_1[2] = { - {34, 2}, + {35, 2}, {0, 1}, }; static arc arcs_67_2[1] = { - {146, 3}, + {148, 3}, }; static arc arcs_67_3[1] = { {0, 3}, @@ -1454,14 +1507,14 @@ static state states_67[4] = { {1, arcs_67_3}, }; static arc arcs_68_0[2] = { - {153, 1}, - {154, 2}, + {155, 1}, + {156, 2}, }; static arc arcs_68_1[1] = { - {154, 2}, + {156, 2}, }; static arc arcs_68_2[2] = { - {155, 2}, + {157, 2}, {0, 2}, }; static state states_68[3] = { @@ -1471,44 +1524,44 @@ static state states_68[3] = { }; static arc arcs_69_0[10] = { {13, 1}, - {157, 2}, - {159, 3}, + {159, 2}, + {161, 3}, {23, 4}, - {162, 4}, - {163, 5}, - {83, 4}, {164, 4}, - {165, 4}, + {165, 5}, + {84, 4}, {166, 4}, + {167, 4}, + {168, 4}, }; static arc arcs_69_1[3] = { - {50, 6}, - {156, 6}, + {51, 6}, + {158, 6}, {15, 4}, }; static arc arcs_69_2[2] = { - {156, 7}, - {158, 4}, + {158, 7}, + {160, 4}, }; static arc arcs_69_3[2] = { - {160, 8}, - {161, 4}, + {162, 8}, + {163, 4}, }; static arc arcs_69_4[1] = { {0, 4}, }; static arc arcs_69_5[2] = { - {163, 5}, + {165, 5}, {0, 5}, }; static arc arcs_69_6[1] = { {15, 4}, }; static arc arcs_69_7[1] = { - {158, 4}, + {160, 4}, }; static arc arcs_69_8[1] = { - {161, 4}, + {163, 4}, }; static state states_69[9] = { {10, arcs_69_0}, @@ -1522,24 +1575,24 @@ static state states_69[9] = { {1, arcs_69_8}, }; static arc arcs_70_0[2] = { - {98, 1}, - {51, 1}, + {99, 1}, + {52, 1}, }; static arc arcs_70_1[3] = { - {167, 2}, - {32, 3}, + {169, 2}, + {33, 3}, {0, 1}, }; static arc arcs_70_2[1] = { {0, 2}, }; static arc arcs_70_3[3] = { - {98, 4}, - {51, 4}, + {99, 4}, + {52, 4}, {0, 3}, }; static arc arcs_70_4[2] = { - {32, 3}, + {33, 3}, {0, 4}, }; static state states_70[5] = { @@ -1551,15 +1604,15 @@ static state states_70[5] = { }; static arc arcs_71_0[3] = { {13, 1}, - {157, 2}, - {82, 3}, + {159, 2}, + {83, 3}, }; static arc arcs_71_1[2] = { {14, 4}, {15, 5}, }; static 
arc arcs_71_2[1] = { - {168, 6}, + {170, 6}, }; static arc arcs_71_3[1] = { {23, 5}, @@ -1571,7 +1624,7 @@ static arc arcs_71_5[1] = { {0, 5}, }; static arc arcs_71_6[1] = { - {158, 5}, + {160, 5}, }; static state states_71[7] = { {3, arcs_71_0}, @@ -1583,14 +1636,14 @@ static state states_71[7] = { {1, arcs_71_6}, }; static arc arcs_72_0[1] = { - {169, 1}, + {171, 1}, }; static arc arcs_72_1[2] = { - {32, 2}, + {33, 2}, {0, 1}, }; static arc arcs_72_2[2] = { - {169, 1}, + {171, 1}, {0, 2}, }; static state states_72[3] = { @@ -1608,11 +1661,11 @@ static arc arcs_73_1[2] = { }; static arc arcs_73_2[3] = { {26, 3}, - {170, 4}, + {172, 4}, {0, 2}, }; static arc arcs_73_3[2] = { - {170, 4}, + {172, 4}, {0, 3}, }; static arc arcs_73_4[1] = { @@ -1641,16 +1694,16 @@ static state states_74[3] = { {1, arcs_74_2}, }; static arc arcs_75_0[2] = { - {109, 1}, - {51, 1}, + {111, 1}, + {52, 1}, }; static arc arcs_75_1[2] = { - {32, 2}, + {33, 2}, {0, 1}, }; static arc arcs_75_2[3] = { - {109, 1}, - {51, 1}, + {111, 1}, + {52, 1}, {0, 2}, }; static state states_75[3] = { @@ -1662,7 +1715,7 @@ static arc arcs_76_0[1] = { {26, 1}, }; static arc arcs_76_1[2] = { - {32, 2}, + {33, 2}, {0, 1}, }; static arc arcs_76_2[2] = { @@ -1676,21 +1729,21 @@ static state states_76[3] = { }; static arc arcs_77_0[3] = { {26, 1}, - {34, 2}, - {51, 3}, + {35, 2}, + {52, 3}, }; static arc arcs_77_1[4] = { {27, 4}, - {167, 5}, - {32, 6}, + {169, 5}, + {33, 6}, {0, 1}, }; static arc arcs_77_2[1] = { - {109, 7}, + {111, 7}, }; static arc arcs_77_3[3] = { - {167, 5}, - {32, 6}, + {169, 5}, + {33, 6}, {0, 3}, }; static arc arcs_77_4[1] = { @@ -1701,34 +1754,34 @@ static arc arcs_77_5[1] = { }; static arc arcs_77_6[3] = { {26, 8}, - {51, 8}, + {52, 8}, {0, 6}, }; static arc arcs_77_7[3] = { - {167, 5}, - {32, 9}, + {169, 5}, + {33, 9}, {0, 7}, }; static arc arcs_77_8[2] = { - {32, 6}, + {33, 6}, {0, 8}, }; static arc arcs_77_9[3] = { {26, 10}, - {34, 11}, + {35, 11}, {0, 9}, }; static arc arcs_77_10[1] = { 
{27, 12}, }; static arc arcs_77_11[1] = { - {109, 13}, + {111, 13}, }; static arc arcs_77_12[1] = { {26, 13}, }; static arc arcs_77_13[2] = { - {32, 9}, + {33, 9}, {0, 13}, }; static state states_77[14] = { @@ -1748,7 +1801,7 @@ static state states_77[14] = { {2, arcs_77_13}, }; static arc arcs_78_0[1] = { - {171, 1}, + {173, 1}, }; static arc arcs_78_1[1] = { {23, 2}, @@ -1762,7 +1815,7 @@ static arc arcs_78_3[2] = { {15, 6}, }; static arc arcs_78_4[1] = { - {28, 7}, + {100, 7}, }; static arc arcs_78_5[1] = { {15, 6}, @@ -1784,14 +1837,14 @@ static state states_78[8] = { {1, arcs_78_7}, }; static arc arcs_79_0[1] = { - {172, 1}, + {174, 1}, }; static arc arcs_79_1[2] = { - {32, 2}, + {33, 2}, {0, 1}, }; static arc arcs_79_2[2] = { - {172, 1}, + {174, 1}, {0, 2}, }; static state states_79[3] = { @@ -1801,13 +1854,13 @@ static state states_79[3] = { }; static arc arcs_80_0[3] = { {26, 1}, + {35, 2}, {34, 2}, - {33, 2}, }; static arc arcs_80_1[4] = { - {167, 3}, - {113, 2}, - {31, 2}, + {169, 3}, + {115, 2}, + {32, 2}, {0, 1}, }; static arc arcs_80_2[1] = { @@ -1823,8 +1876,8 @@ static state states_80[4] = { {1, arcs_80_3}, }; static arc arcs_81_0[2] = { - {167, 1}, - {174, 1}, + {169, 1}, + {176, 1}, }; static arc arcs_81_1[1] = { {0, 1}, @@ -1834,19 +1887,19 @@ static state states_81[2] = { {1, arcs_81_1}, }; static arc arcs_82_0[1] = { - {102, 1}, + {104, 1}, }; static arc arcs_82_1[1] = { - {66, 2}, + {67, 2}, }; static arc arcs_82_2[1] = { - {103, 3}, + {105, 3}, }; static arc arcs_82_3[1] = { - {114, 4}, + {116, 4}, }; static arc arcs_82_4[2] = { - {173, 5}, + {175, 5}, {0, 4}, }; static arc arcs_82_5[1] = { @@ -1862,10 +1915,10 @@ static state states_82[6] = { }; static arc arcs_83_0[2] = { {21, 1}, - {175, 2}, + {177, 2}, }; static arc arcs_83_1[1] = { - {175, 2}, + {177, 2}, }; static arc arcs_83_2[1] = { {0, 2}, @@ -1876,13 +1929,13 @@ static state states_83[3] = { {1, arcs_83_2}, }; static arc arcs_84_0[1] = { - {97, 1}, + {98, 1}, }; static arc 
arcs_84_1[1] = { - {116, 2}, + {118, 2}, }; static arc arcs_84_2[2] = { - {173, 3}, + {175, 3}, {0, 2}, }; static arc arcs_84_3[1] = { @@ -1905,10 +1958,10 @@ static state states_85[2] = { {1, arcs_85_1}, }; static arc arcs_86_0[1] = { - {177, 1}, + {179, 1}, }; static arc arcs_86_1[2] = { - {178, 2}, + {180, 2}, {0, 1}, }; static arc arcs_86_2[1] = { @@ -1920,8 +1973,8 @@ static state states_86[3] = { {1, arcs_86_2}, }; static arc arcs_87_0[2] = { - {77, 1}, - {47, 2}, + {78, 1}, + {48, 2}, }; static arc arcs_87_1[1] = { {26, 2}, @@ -1934,13 +1987,148 @@ static state states_87[3] = { {1, arcs_87_1}, {1, arcs_87_2}, }; -static dfa dfas[88] = { +static arc arcs_88_0[2] = { + {3, 1}, + {2, 2}, +}; +static arc arcs_88_1[1] = { + {0, 1}, +}; +static arc arcs_88_2[2] = { + {28, 3}, + {113, 4}, +}; +static arc arcs_88_3[1] = { + {2, 5}, +}; +static arc arcs_88_4[1] = { + {6, 6}, +}; +static arc arcs_88_5[1] = { + {113, 4}, +}; +static arc arcs_88_6[2] = { + {6, 6}, + {114, 1}, +}; +static state states_88[7] = { + {2, arcs_88_0}, + {1, arcs_88_1}, + {2, arcs_88_2}, + {1, arcs_88_3}, + {1, arcs_88_4}, + {1, arcs_88_5}, + {2, arcs_88_6}, +}; +static arc arcs_89_0[1] = { + {182, 1}, +}; +static arc arcs_89_1[2] = { + {2, 1}, + {7, 2}, +}; +static arc arcs_89_2[1] = { + {0, 2}, +}; +static state states_89[3] = { + {1, arcs_89_0}, + {2, arcs_89_1}, + {1, arcs_89_2}, +}; +static arc arcs_90_0[1] = { + {13, 1}, +}; +static arc arcs_90_1[2] = { + {183, 2}, + {15, 3}, +}; +static arc arcs_90_2[1] = { + {15, 3}, +}; +static arc arcs_90_3[1] = { + {25, 4}, +}; +static arc arcs_90_4[1] = { + {26, 5}, +}; +static arc arcs_90_5[1] = { + {0, 5}, +}; +static state states_90[6] = { + {1, arcs_90_0}, + {2, arcs_90_1}, + {1, arcs_90_2}, + {1, arcs_90_3}, + {1, arcs_90_4}, + {1, arcs_90_5}, +}; +static arc arcs_91_0[3] = { + {26, 1}, + {34, 2}, + {35, 3}, +}; +static arc arcs_91_1[2] = { + {33, 4}, + {0, 1}, +}; +static arc arcs_91_2[3] = { + {26, 5}, + {33, 6}, + {0, 2}, +}; +static arc 
arcs_91_3[1] = { + {26, 7}, +}; +static arc arcs_91_4[4] = { + {26, 1}, + {34, 8}, + {35, 3}, + {0, 4}, +}; +static arc arcs_91_5[2] = { + {33, 6}, + {0, 5}, +}; +static arc arcs_91_6[2] = { + {26, 5}, + {35, 3}, +}; +static arc arcs_91_7[1] = { + {0, 7}, +}; +static arc arcs_91_8[3] = { + {26, 9}, + {33, 10}, + {0, 8}, +}; +static arc arcs_91_9[2] = { + {33, 10}, + {0, 9}, +}; +static arc arcs_91_10[2] = { + {26, 9}, + {35, 3}, +}; +static state states_91[11] = { + {3, arcs_91_0}, + {2, arcs_91_1}, + {3, arcs_91_2}, + {1, arcs_91_3}, + {4, arcs_91_4}, + {2, arcs_91_5}, + {2, arcs_91_6}, + {1, arcs_91_7}, + {3, arcs_91_8}, + {2, arcs_91_9}, + {2, arcs_91_10}, +}; +static dfa dfas[92] = { {256, "single_input", 0, 3, states_0, - "\004\050\340\000\002\000\000\000\012\076\011\007\142\011\100\010\000\000\103\242\174\010\002"}, + "\004\050\340\000\004\000\000\000\024\174\022\016\204\045\000\041\000\000\014\211\362\041\010"}, {257, "file_input", 0, 2, states_1, - "\204\050\340\000\002\000\000\000\012\076\011\007\142\011\100\010\000\000\103\242\174\010\002"}, + "\204\050\340\000\004\000\000\000\024\174\022\016\204\045\000\041\000\000\014\211\362\041\010"}, {258, "eval_input", 0, 3, states_2, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {259, "decorator", 0, 7, states_3, "\000\010\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {260, "decorators", 0, 2, states_4, @@ -1949,54 +2137,54 @@ static dfa dfas[88] = { "\000\010\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {262, "async_funcdef", 0, 3, states_6, "\000\000\040\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, - {263, "funcdef", 0, 8, states_7, + {263, "funcdef", 0, 9, states_7, 
"\000\000\100\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {264, "parameters", 0, 4, states_8, "\000\040\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, - {265, "typedargslist", 0, 19, states_9, - "\000\000\200\000\006\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + {265, "typedargslist", 0, 23, states_9, + "\000\000\200\000\014\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {266, "tfpdef", 0, 4, states_10, "\000\000\200\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {267, "varargslist", 0, 19, states_11, - "\000\000\200\000\006\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\200\000\014\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {268, "vfpdef", 0, 2, states_12, "\000\000\200\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {269, "stmt", 0, 2, states_13, - "\000\050\340\000\002\000\000\000\012\076\011\007\142\011\100\010\000\000\103\242\174\010\002"}, + "\000\050\340\000\004\000\000\000\024\174\022\016\204\045\000\041\000\000\014\211\362\041\010"}, {270, "simple_stmt", 0, 4, states_14, - "\000\040\200\000\002\000\000\000\012\076\011\007\000\000\100\010\000\000\103\242\174\000\002"}, + "\000\040\200\000\004\000\000\000\024\174\022\016\000\000\000\041\000\000\014\211\362\001\010"}, {271, "small_stmt", 0, 2, states_15, - "\000\040\200\000\002\000\000\000\012\076\011\007\000\000\100\010\000\000\103\242\174\000\002"}, + "\000\040\200\000\004\000\000\000\024\174\022\016\000\000\000\041\000\000\014\211\362\001\010"}, {272, "expr_stmt", 0, 6, states_16, - "\000\040\200\000\002\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\004\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {273, "annassign", 0, 5, states_17, 
"\000\000\000\010\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {274, "testlist_star_expr", 0, 3, states_18, - "\000\040\200\000\002\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\004\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {275, "augassign", 0, 2, states_19, - "\000\000\000\000\000\000\360\377\001\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\340\377\003\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {276, "del_stmt", 0, 3, states_20, - "\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {277, "pass_stmt", 0, 2, states_21, - "\000\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {278, "flow_stmt", 0, 2, states_22, - "\000\000\000\000\000\000\000\000\000\036\000\000\000\000\000\000\000\000\000\000\000\000\002"}, + "\000\000\000\000\000\000\000\000\000\074\000\000\000\000\000\000\000\000\000\000\000\000\010"}, {279, "break_stmt", 0, 2, states_23, - "\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000\000\000"}, - {280, "continue_stmt", 0, 2, states_24, "\000\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000\000\000\000\000\000"}, - {281, "return_stmt", 0, 3, states_25, + {280, "continue_stmt", 0, 2, states_24, "\000\000\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + {281, "return_stmt", 0, 3, states_25, + "\000\000\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {282, "yield_stmt", 0, 2, states_26, - 
"\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\002"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\010"}, {283, "raise_stmt", 0, 5, states_27, - "\000\000\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\040\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {284, "import_stmt", 0, 2, states_28, - "\000\000\000\000\000\000\000\000\000\040\001\000\000\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\100\002\000\000\000\000\000\000\000\000\000\000\000\000"}, {285, "import_name", 0, 3, states_29, - "\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000\000"}, {286, "import_from", 0, 8, states_30, - "\000\000\000\000\000\000\000\000\000\040\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\100\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {287, "import_as_name", 0, 4, states_31, "\000\000\200\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {288, "dotted_as_name", 0, 4, states_32, @@ -2008,111 +2196,119 @@ static dfa dfas[88] = { {291, "dotted_name", 0, 2, states_35, "\000\000\200\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {292, "global_stmt", 0, 3, states_36, - "\000\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\000\000\000\000"}, - {293, "nonlocal_stmt", 0, 3, states_37, "\000\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000"}, - {294, "assert_stmt", 0, 5, states_38, + {293, "nonlocal_stmt", 0, 3, states_37, "\000\000\000\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000\000\000\000"}, + {294, "assert_stmt", 0, 5, states_38, + 
"\000\000\000\000\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\000\000\000\000"}, {295, "compound_stmt", 0, 2, states_39, - "\000\010\140\000\000\000\000\000\000\000\000\000\142\011\000\000\000\000\000\000\000\010\000"}, + "\000\010\140\000\000\000\000\000\000\000\000\000\204\045\000\000\000\000\000\000\000\040\000"}, {296, "async_stmt", 0, 3, states_40, "\000\000\040\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {297, "if_stmt", 0, 8, states_41, - "\000\000\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000\000\000"}, {298, "while_stmt", 0, 8, states_42, - "\000\000\000\000\000\000\000\000\000\000\000\000\040\000\000\000\000\000\000\000\000\000\000"}, - {299, "for_stmt", 0, 10, states_43, - "\000\000\000\000\000\000\000\000\000\000\000\000\100\000\000\000\000\000\000\000\000\000\000"}, - {300, "try_stmt", 0, 13, states_44, + "\000\000\000\000\000\000\000\000\000\000\000\000\200\000\000\000\000\000\000\000\000\000\000"}, + {299, "for_stmt", 0, 11, states_43, "\000\000\000\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\000\000"}, - {301, "with_stmt", 0, 5, states_45, - "\000\000\000\000\000\000\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\000\000"}, + {300, "try_stmt", 0, 13, states_44, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000\000"}, + {301, "with_stmt", 0, 6, states_45, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\040\000\000\000\000\000\000\000\000\000"}, {302, "with_item", 0, 4, states_46, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {303, "except_clause", 0, 5, states_47, - 
"\000\000\000\000\000\000\000\000\000\000\000\000\000\100\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\000"}, {304, "suite", 0, 5, states_48, - "\004\040\200\000\002\000\000\000\012\076\011\007\000\000\100\010\000\000\103\242\174\000\002"}, + "\004\040\200\000\004\000\000\000\024\174\022\016\000\000\000\041\000\000\014\211\362\001\010"}, {305, "namedexpr_test", 0, 4, states_49, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {306, "test", 0, 6, states_50, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {307, "test_nocond", 0, 2, states_51, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {308, "lambdef", 0, 5, states_52, - "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\100\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000"}, {309, "lambdef_nocond", 0, 5, states_53, - "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\100\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000"}, {310, "or_test", 0, 2, states_54, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\040\000\000\014\211\362\001\000"}, {311, "and_test", 0, 2, states_55, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\010\000\000\103\242\174\000\000"}, + 
"\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\040\000\000\014\211\362\001\000"}, {312, "not_test", 0, 3, states_56, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\040\000\000\014\211\362\001\000"}, {313, "comparison", 0, 2, states_57, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\014\211\362\001\000"}, {314, "comp_op", 0, 4, states_58, - "\000\000\000\000\000\000\000\000\000\000\000\000\200\000\000\310\077\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\002\000\040\377\000\000\000\000\000\000"}, {315, "star_expr", 0, 3, states_59, - "\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\004\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {316, "expr", 0, 2, states_60, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\014\211\362\001\000"}, {317, "xor_expr", 0, 2, states_61, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\014\211\362\001\000"}, {318, "and_expr", 0, 2, states_62, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\014\211\362\001\000"}, {319, "shift_expr", 0, 2, states_63, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\014\211\362\001\000"}, {320, "arith_expr", 0, 2, states_64, - 
"\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\014\211\362\001\000"}, {321, "term", 0, 2, states_65, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\014\211\362\001\000"}, {322, "factor", 0, 3, states_66, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\014\211\362\001\000"}, {323, "power", 0, 4, states_67, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\000\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\000\210\362\001\000"}, {324, "atom_expr", 0, 3, states_68, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\000\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\000\210\362\001\000"}, {325, "atom", 0, 9, states_69, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\000\240\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\000\200\362\001\000"}, {326, "testlist_comp", 0, 5, states_70, - "\000\040\200\000\002\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\004\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {327, "trailer", 0, 7, states_71, - "\000\040\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000\040\000\000\000"}, + "\000\040\000\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\000\200\000\000\000"}, {328, "subscriptlist", 0, 3, states_72, - "\000\040\200\010\000\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + 
"\000\040\200\010\000\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {329, "subscript", 0, 5, states_73, - "\000\040\200\010\000\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\010\000\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {330, "sliceop", 0, 3, states_74, "\000\000\000\010\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {331, "exprlist", 0, 3, states_75, - "\000\040\200\000\002\000\000\000\000\000\010\000\000\000\000\000\000\000\103\242\174\000\000"}, + "\000\040\200\000\004\000\000\000\000\000\020\000\000\000\000\000\000\000\014\211\362\001\000"}, {332, "testlist", 0, 3, states_76, - "\000\040\200\000\000\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {333, "dictorsetmaker", 0, 14, states_77, - "\000\040\200\000\006\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\014\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {334, "classdef", 0, 8, states_78, - "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\010\000"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\040\000"}, {335, "arglist", 0, 3, states_79, - "\000\040\200\000\006\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\014\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {336, "argument", 0, 4, states_80, - "\000\040\200\000\006\000\000\000\000\000\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\014\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, {337, "comp_iter", 0, 2, states_81, - 
"\000\000\040\000\000\000\000\000\000\000\000\000\102\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\040\000\000\000\000\000\000\000\000\000\004\001\000\000\000\000\000\000\000\000\000"}, {338, "sync_comp_for", 0, 6, states_82, - "\000\000\000\000\000\000\000\000\000\000\000\000\100\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\000\000"}, {339, "comp_for", 0, 3, states_83, - "\000\000\040\000\000\000\000\000\000\000\000\000\100\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\040\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\000\000"}, {340, "comp_if", 0, 4, states_84, - "\000\000\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000\000\000"}, {341, "encoding_decl", 0, 2, states_85, "\000\000\200\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {342, "yield_expr", 0, 3, states_86, - "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\002"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\010"}, {343, "yield_arg", 0, 3, states_87, - "\000\040\200\000\002\000\000\000\000\040\010\000\000\000\100\010\000\000\103\242\174\000\000"}, + "\000\040\200\000\004\000\000\000\000\100\020\000\000\000\000\041\000\000\014\211\362\001\000"}, + {344, "func_body_suite", 0, 7, states_88, + "\004\040\200\000\004\000\000\000\024\174\022\016\000\000\000\041\000\000\014\211\362\001\010"}, + {345, "func_type_input", 0, 3, states_89, + "\000\040\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + {346, "func_type", 0, 6, states_90, + "\000\040\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, + {347, "typelist", 0, 11, states_91, + 
"\000\040\200\000\014\000\000\000\000\000\020\000\000\000\000\041\000\000\014\211\362\001\000"}, }; -static label labels[179] = { +static label labels[184] = { {0, "EMPTY"}, {256, 0}, {4, 0}, @@ -2141,7 +2337,8 @@ static label labels[179] = { {51, 0}, {306, 0}, {11, 0}, - {304, 0}, + {56, 0}, + {344, 0}, {265, 0}, {266, 0}, {22, 0}, @@ -2212,6 +2409,7 @@ static label labels[179] = { {296, 0}, {1, "if"}, {305, 0}, + {304, 0}, {1, "elif"}, {1, "else"}, {1, "while"}, @@ -2292,10 +2490,13 @@ static label labels[179] = { {341, 0}, {1, "yield"}, {343, 0}, + {345, 0}, + {346, 0}, + {347, 0}, }; grammar _PyParser_Grammar = { - 88, + 92, dfas, - {179, labels}, + {184, labels}, 256 };
Python/Python-ast.c +466 −46 modified
@@ -10,8 +10,10 @@ static PyTypeObject *mod_type;
 static PyObject* ast2obj_mod(void*);
 static PyTypeObject *Module_type;
 _Py_IDENTIFIER(body);
+_Py_IDENTIFIER(type_ignores);
 static char *Module_fields[]={
     "body",
+    "type_ignores",
 };
 static PyTypeObject *Interactive_type;
 static char *Interactive_fields[]={
@@ -21,6 +23,13 @@ static PyTypeObject *Expression_type;
 static char *Expression_fields[]={
     "body",
 };
+static PyTypeObject *FunctionType_type;
+_Py_IDENTIFIER(argtypes);
+_Py_IDENTIFIER(returns);
+static char *FunctionType_fields[]={
+    "argtypes",
+    "returns",
+};
 static PyTypeObject *Suite_type;
 static char *Suite_fields[]={
     "body",
@@ -41,13 +50,14 @@ static PyTypeObject *FunctionDef_type;
 _Py_IDENTIFIER(name);
 _Py_IDENTIFIER(args);
 _Py_IDENTIFIER(decorator_list);
-_Py_IDENTIFIER(returns);
+_Py_IDENTIFIER(type_comment);
 static char *FunctionDef_fields[]={
     "name",
     "args",
     "body",
     "decorator_list",
     "returns",
+    "type_comment",
 };
 static PyTypeObject *AsyncFunctionDef_type;
 static char *AsyncFunctionDef_fields[]={
@@ -56,6 +66,7 @@ static char *AsyncFunctionDef_fields[]={
     "body",
     "decorator_list",
     "returns",
+    "type_comment",
 };
 static PyTypeObject *ClassDef_type;
 _Py_IDENTIFIER(bases);
@@ -81,6 +92,7 @@ static PyTypeObject *Assign_type;
 static char *Assign_fields[]={
     "targets",
     "value",
+    "type_comment",
 };
 static PyTypeObject *AugAssign_type;
 _Py_IDENTIFIER(target);
@@ -107,13 +119,15 @@ static char *For_fields[]={
     "iter",
     "body",
     "orelse",
+    "type_comment",
 };
 static PyTypeObject *AsyncFor_type;
 static char *AsyncFor_fields[]={
     "target",
     "iter",
     "body",
     "orelse",
+    "type_comment",
 };
 static PyTypeObject *While_type;
 _Py_IDENTIFIER(test);
@@ -133,11 +147,13 @@ _Py_IDENTIFIER(items);
 static char *With_fields[]={
     "items",
     "body",
+    "type_comment",
 };
 static PyTypeObject *AsyncWith_type;
 static char *AsyncWith_fields[]={
     "items",
     "body",
+    "type_comment",
 };
 static PyTypeObject *Raise_type;
 _Py_IDENTIFIER(exc);
@@ -478,6 +494,7 @@ _Py_IDENTIFIER(arg);
 static char *arg_fields[]={
     "arg",
     "annotation",
+    "type_comment",
 };
 static PyTypeObject *keyword_type;
 static PyObject* ast2obj_keyword(void*);
@@ -500,6 +517,12 @@ static char *withitem_fields[]={
     "context_expr",
     "optional_vars",
 };
+static PyTypeObject *type_ignore_type;
+static PyObject* ast2obj_type_ignore(void*);
+static PyTypeObject *TypeIgnore_type;
+static char *TypeIgnore_fields[]={
+    "lineno",
+};


 _Py_IDENTIFIER(_fields);
@@ -769,6 +792,15 @@ static int obj2ast_identifier(PyObject* obj, PyObject** out, PyArena* arena)
     return obj2ast_object(obj, out, arena);
 }

+static int obj2ast_string(PyObject* obj, PyObject** out, PyArena* arena)
+{
+    if (!PyUnicode_CheckExact(obj) && !PyBytes_CheckExact(obj)) {
+        PyErr_SetString(PyExc_TypeError, "AST string must be of type str");
+        return 1;
+    }
+    return obj2ast_object(obj, out, arena);
+}
+
 static int obj2ast_int(PyObject* obj, int* out, PyArena* arena)
 {
     int i;
@@ -810,47 +842,50 @@ static int init_types(void)
     mod_type = make_type("mod", &AST_type, NULL, 0);
     if (!mod_type) return 0;
     if (!add_attributes(mod_type, NULL, 0)) return 0;
-    Module_type = make_type("Module", mod_type, Module_fields, 1);
+    Module_type = make_type("Module", mod_type, Module_fields, 2);
     if (!Module_type) return 0;
     Interactive_type = make_type("Interactive", mod_type, Interactive_fields,
                                  1);
     if (!Interactive_type) return 0;
     Expression_type = make_type("Expression", mod_type, Expression_fields, 1);
     if (!Expression_type) return 0;
+    FunctionType_type = make_type("FunctionType", mod_type,
+                                  FunctionType_fields, 2);
+    if (!FunctionType_type) return 0;
     Suite_type = make_type("Suite", mod_type, Suite_fields, 1);
     if (!Suite_type) return 0;
     stmt_type = make_type("stmt", &AST_type, NULL, 0);
     if (!stmt_type) return 0;
     if (!add_attributes(stmt_type, stmt_attributes, 4)) return 0;
     FunctionDef_type = make_type("FunctionDef", stmt_type, FunctionDef_fields,
-                                 5);
+                                 6);
     if (!FunctionDef_type) return 0;
     AsyncFunctionDef_type = make_type("AsyncFunctionDef", stmt_type,
-                                      AsyncFunctionDef_fields, 5);
+                                      AsyncFunctionDef_fields, 6);
     if (!AsyncFunctionDef_type) return 0;
     ClassDef_type = make_type("ClassDef", stmt_type, ClassDef_fields, 5);
     if (!ClassDef_type) return 0;
     Return_type = make_type("Return", stmt_type, Return_fields, 1);
     if (!Return_type) return 0;
     Delete_type = make_type("Delete", stmt_type, Delete_fields, 1);
     if (!Delete_type) return 0;
-    Assign_type = make_type("Assign", stmt_type, Assign_fields, 2);
+    Assign_type = make_type("Assign", stmt_type, Assign_fields, 3);
     if (!Assign_type) return 0;
     AugAssign_type = make_type("AugAssign", stmt_type, AugAssign_fields, 3);
     if (!AugAssign_type) return 0;
     AnnAssign_type = make_type("AnnAssign", stmt_type, AnnAssign_fields, 4);
     if (!AnnAssign_type) return 0;
-    For_type = make_type("For", stmt_type, For_fields, 4);
+    For_type = make_type("For", stmt_type, For_fields, 5);
     if (!For_type) return 0;
-    AsyncFor_type = make_type("AsyncFor", stmt_type, AsyncFor_fields, 4);
+    AsyncFor_type = make_type("AsyncFor", stmt_type, AsyncFor_fields, 5);
     if (!AsyncFor_type) return 0;
     While_type = make_type("While", stmt_type, While_fields, 3);
     if (!While_type) return 0;
     If_type = make_type("If", stmt_type, If_fields, 3);
     if (!If_type) return 0;
-    With_type = make_type("With", stmt_type, With_fields, 2);
+    With_type = make_type("With", stmt_type, With_fields, 3);
     if (!With_type) return 0;
-    AsyncWith_type = make_type("AsyncWith", stmt_type, AsyncWith_fields, 2);
+    AsyncWith_type = make_type("AsyncWith", stmt_type, AsyncWith_fields, 3);
     if (!AsyncWith_type) return 0;
     Raise_type = make_type("Raise", stmt_type, Raise_fields, 2);
     if (!Raise_type) return 0;
@@ -1113,7 +1148,7 @@ static int init_types(void)
     arguments_type = make_type("arguments", &AST_type, arguments_fields, 6);
     if (!arguments_type) return 0;
     if (!add_attributes(arguments_type, NULL, 0)) return 0;
-    arg_type = make_type("arg", &AST_type, arg_fields, 2);
+    arg_type = make_type("arg", &AST_type,
arg_fields, 3); if (!arg_type) return 0; if (!add_attributes(arg_type, arg_attributes, 4)) return 0; keyword_type = make_type("keyword", &AST_type, keyword_fields, 2); @@ -1125,6 +1160,12 @@ static int init_types(void) withitem_type = make_type("withitem", &AST_type, withitem_fields, 2); if (!withitem_type) return 0; if (!add_attributes(withitem_type, NULL, 0)) return 0; + type_ignore_type = make_type("type_ignore", &AST_type, NULL, 0); + if (!type_ignore_type) return 0; + if (!add_attributes(type_ignore_type, NULL, 0)) return 0; + TypeIgnore_type = make_type("TypeIgnore", type_ignore_type, + TypeIgnore_fields, 1); + if (!TypeIgnore_type) return 0; initialized = 1; return 1; } @@ -1148,16 +1189,19 @@ static int obj2ast_arg(PyObject* obj, arg_ty* out, PyArena* arena); static int obj2ast_keyword(PyObject* obj, keyword_ty* out, PyArena* arena); static int obj2ast_alias(PyObject* obj, alias_ty* out, PyArena* arena); static int obj2ast_withitem(PyObject* obj, withitem_ty* out, PyArena* arena); +static int obj2ast_type_ignore(PyObject* obj, type_ignore_ty* out, PyArena* + arena); mod_ty -Module(asdl_seq * body, PyArena *arena) +Module(asdl_seq * body, asdl_seq * type_ignores, PyArena *arena) { mod_ty p; p = (mod_ty)PyArena_Malloc(arena, sizeof(*p)); if (!p) return NULL; p->kind = Module_kind; p->v.Module.body = body; + p->v.Module.type_ignores = type_ignores; return p; } @@ -1190,6 +1234,24 @@ Expression(expr_ty body, PyArena *arena) return p; } +mod_ty +FunctionType(asdl_seq * argtypes, expr_ty returns, PyArena *arena) +{ + mod_ty p; + if (!returns) { + PyErr_SetString(PyExc_ValueError, + "field returns is required for FunctionType"); + return NULL; + } + p = (mod_ty)PyArena_Malloc(arena, sizeof(*p)); + if (!p) + return NULL; + p->kind = FunctionType_kind; + p->v.FunctionType.argtypes = argtypes; + p->v.FunctionType.returns = returns; + return p; +} + mod_ty Suite(asdl_seq * body, PyArena *arena) { @@ -1204,8 +1266,8 @@ Suite(asdl_seq * body, PyArena *arena) stmt_ty 
FunctionDef(identifier name, arguments_ty args, asdl_seq * body, asdl_seq * - decorator_list, expr_ty returns, int lineno, int col_offset, int - end_lineno, int end_col_offset, PyArena *arena) + decorator_list, expr_ty returns, string type_comment, int lineno, + int col_offset, int end_lineno, int end_col_offset, PyArena *arena) { stmt_ty p; if (!name) { @@ -1227,6 +1289,7 @@ FunctionDef(identifier name, arguments_ty args, asdl_seq * body, asdl_seq * p->v.FunctionDef.body = body; p->v.FunctionDef.decorator_list = decorator_list; p->v.FunctionDef.returns = returns; + p->v.FunctionDef.type_comment = type_comment; p->lineno = lineno; p->col_offset = col_offset; p->end_lineno = end_lineno; @@ -1236,8 +1299,9 @@ FunctionDef(identifier name, arguments_ty args, asdl_seq * body, asdl_seq * stmt_ty AsyncFunctionDef(identifier name, arguments_ty args, asdl_seq * body, asdl_seq - * decorator_list, expr_ty returns, int lineno, int col_offset, - int end_lineno, int end_col_offset, PyArena *arena) + * decorator_list, expr_ty returns, string type_comment, int + lineno, int col_offset, int end_lineno, int end_col_offset, + PyArena *arena) { stmt_ty p; if (!name) { @@ -1259,6 +1323,7 @@ AsyncFunctionDef(identifier name, arguments_ty args, asdl_seq * body, asdl_seq p->v.AsyncFunctionDef.body = body; p->v.AsyncFunctionDef.decorator_list = decorator_list; p->v.AsyncFunctionDef.returns = returns; + p->v.AsyncFunctionDef.type_comment = type_comment; p->lineno = lineno; p->col_offset = col_offset; p->end_lineno = end_lineno; @@ -1328,8 +1393,8 @@ Delete(asdl_seq * targets, int lineno, int col_offset, int end_lineno, int } stmt_ty -Assign(asdl_seq * targets, expr_ty value, int lineno, int col_offset, int - end_lineno, int end_col_offset, PyArena *arena) +Assign(asdl_seq * targets, expr_ty value, string type_comment, int lineno, int + col_offset, int end_lineno, int end_col_offset, PyArena *arena) { stmt_ty p; if (!value) { @@ -1343,6 +1408,7 @@ Assign(asdl_seq * targets, expr_ty value, 
int lineno, int col_offset, int p->kind = Assign_kind; p->v.Assign.targets = targets; p->v.Assign.value = value; + p->v.Assign.type_comment = type_comment; p->lineno = lineno; p->col_offset = col_offset; p->end_lineno = end_lineno; @@ -1416,8 +1482,9 @@ AnnAssign(expr_ty target, expr_ty annotation, expr_ty value, int simple, int } stmt_ty -For(expr_ty target, expr_ty iter, asdl_seq * body, asdl_seq * orelse, int - lineno, int col_offset, int end_lineno, int end_col_offset, PyArena *arena) +For(expr_ty target, expr_ty iter, asdl_seq * body, asdl_seq * orelse, string + type_comment, int lineno, int col_offset, int end_lineno, int + end_col_offset, PyArena *arena) { stmt_ty p; if (!target) { @@ -1438,6 +1505,7 @@ For(expr_ty target, expr_ty iter, asdl_seq * body, asdl_seq * orelse, int p->v.For.iter = iter; p->v.For.body = body; p->v.For.orelse = orelse; + p->v.For.type_comment = type_comment; p->lineno = lineno; p->col_offset = col_offset; p->end_lineno = end_lineno; @@ -1446,9 +1514,9 @@ For(expr_ty target, expr_ty iter, asdl_seq * body, asdl_seq * orelse, int } stmt_ty -AsyncFor(expr_ty target, expr_ty iter, asdl_seq * body, asdl_seq * orelse, int - lineno, int col_offset, int end_lineno, int end_col_offset, PyArena - *arena) +AsyncFor(expr_ty target, expr_ty iter, asdl_seq * body, asdl_seq * orelse, + string type_comment, int lineno, int col_offset, int end_lineno, int + end_col_offset, PyArena *arena) { stmt_ty p; if (!target) { @@ -1469,6 +1537,7 @@ AsyncFor(expr_ty target, expr_ty iter, asdl_seq * body, asdl_seq * orelse, int p->v.AsyncFor.iter = iter; p->v.AsyncFor.body = body; p->v.AsyncFor.orelse = orelse; + p->v.AsyncFor.type_comment = type_comment; p->lineno = lineno; p->col_offset = col_offset; p->end_lineno = end_lineno; @@ -1525,8 +1594,8 @@ If(expr_ty test, asdl_seq * body, asdl_seq * orelse, int lineno, int } stmt_ty -With(asdl_seq * items, asdl_seq * body, int lineno, int col_offset, int - end_lineno, int end_col_offset, PyArena *arena) 
+With(asdl_seq * items, asdl_seq * body, string type_comment, int lineno, int + col_offset, int end_lineno, int end_col_offset, PyArena *arena) { stmt_ty p; p = (stmt_ty)PyArena_Malloc(arena, sizeof(*p)); @@ -1535,6 +1604,7 @@ With(asdl_seq * items, asdl_seq * body, int lineno, int col_offset, int p->kind = With_kind; p->v.With.items = items; p->v.With.body = body; + p->v.With.type_comment = type_comment; p->lineno = lineno; p->col_offset = col_offset; p->end_lineno = end_lineno; @@ -1543,8 +1613,8 @@ With(asdl_seq * items, asdl_seq * body, int lineno, int col_offset, int } stmt_ty -AsyncWith(asdl_seq * items, asdl_seq * body, int lineno, int col_offset, int - end_lineno, int end_col_offset, PyArena *arena) +AsyncWith(asdl_seq * items, asdl_seq * body, string type_comment, int lineno, + int col_offset, int end_lineno, int end_col_offset, PyArena *arena) { stmt_ty p; p = (stmt_ty)PyArena_Malloc(arena, sizeof(*p)); @@ -1553,6 +1623,7 @@ AsyncWith(asdl_seq * items, asdl_seq * body, int lineno, int col_offset, int p->kind = AsyncWith_kind; p->v.AsyncWith.items = items; p->v.AsyncWith.body = body; + p->v.AsyncWith.type_comment = type_comment; p->lineno = lineno; p->col_offset = col_offset; p->end_lineno = end_lineno; @@ -2518,8 +2589,8 @@ arguments(asdl_seq * args, arg_ty vararg, asdl_seq * kwonlyargs, asdl_seq * } arg_ty -arg(identifier arg, expr_ty annotation, int lineno, int col_offset, int - end_lineno, int end_col_offset, PyArena *arena) +arg(identifier arg, expr_ty annotation, string type_comment, int lineno, int + col_offset, int end_lineno, int end_col_offset, PyArena *arena) { arg_ty p; if (!arg) { @@ -2532,6 +2603,7 @@ arg(identifier arg, expr_ty annotation, int lineno, int col_offset, int return NULL; p->arg = arg; p->annotation = annotation; + p->type_comment = type_comment; p->lineno = lineno; p->col_offset = col_offset; p->end_lineno = end_lineno; @@ -2590,6 +2662,18 @@ withitem(expr_ty context_expr, expr_ty optional_vars, PyArena *arena) return p; } 
+type_ignore_ty +TypeIgnore(int lineno, PyArena *arena) +{ + type_ignore_ty p; + p = (type_ignore_ty)PyArena_Malloc(arena, sizeof(*p)); + if (!p) + return NULL; + p->kind = TypeIgnore_kind; + p->v.TypeIgnore.lineno = lineno; + return p; +} + PyObject* ast2obj_mod(void* _o) @@ -2609,6 +2693,11 @@ ast2obj_mod(void* _o) if (_PyObject_SetAttrId(result, &PyId_body, value) == -1) goto failed; Py_DECREF(value); + value = ast2obj_list(o->v.Module.type_ignores, ast2obj_type_ignore); + if (!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_type_ignores, value) == -1) + goto failed; + Py_DECREF(value); break; case Interactive_kind: result = PyType_GenericNew(Interactive_type, NULL, NULL); @@ -2628,6 +2717,20 @@ ast2obj_mod(void* _o) goto failed; Py_DECREF(value); break; + case FunctionType_kind: + result = PyType_GenericNew(FunctionType_type, NULL, NULL); + if (!result) goto failed; + value = ast2obj_list(o->v.FunctionType.argtypes, ast2obj_expr); + if (!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_argtypes, value) == -1) + goto failed; + Py_DECREF(value); + value = ast2obj_expr(o->v.FunctionType.returns); + if (!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_returns, value) == -1) + goto failed; + Py_DECREF(value); + break; case Suite_kind: result = PyType_GenericNew(Suite_type, NULL, NULL); if (!result) goto failed; @@ -2683,6 +2786,11 @@ ast2obj_stmt(void* _o) if (_PyObject_SetAttrId(result, &PyId_returns, value) == -1) goto failed; Py_DECREF(value); + value = ast2obj_string(o->v.FunctionDef.type_comment); + if (!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1) + goto failed; + Py_DECREF(value); break; case AsyncFunctionDef_kind: result = PyType_GenericNew(AsyncFunctionDef_type, NULL, NULL); @@ -2713,6 +2821,11 @@ ast2obj_stmt(void* _o) if (_PyObject_SetAttrId(result, &PyId_returns, value) == -1) goto failed; Py_DECREF(value); + value = ast2obj_string(o->v.AsyncFunctionDef.type_comment); + if 
(!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1) + goto failed; + Py_DECREF(value); break; case ClassDef_kind: result = PyType_GenericNew(ClassDef_type, NULL, NULL); @@ -2774,6 +2887,11 @@ ast2obj_stmt(void* _o) if (_PyObject_SetAttrId(result, &PyId_value, value) == -1) goto failed; Py_DECREF(value); + value = ast2obj_string(o->v.Assign.type_comment); + if (!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1) + goto failed; + Py_DECREF(value); break; case AugAssign_kind: result = PyType_GenericNew(AugAssign_type, NULL, NULL); @@ -2841,6 +2959,11 @@ ast2obj_stmt(void* _o) if (_PyObject_SetAttrId(result, &PyId_orelse, value) == -1) goto failed; Py_DECREF(value); + value = ast2obj_string(o->v.For.type_comment); + if (!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1) + goto failed; + Py_DECREF(value); break; case AsyncFor_kind: result = PyType_GenericNew(AsyncFor_type, NULL, NULL); @@ -2865,6 +2988,11 @@ ast2obj_stmt(void* _o) if (_PyObject_SetAttrId(result, &PyId_orelse, value) == -1) goto failed; Py_DECREF(value); + value = ast2obj_string(o->v.AsyncFor.type_comment); + if (!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1) + goto failed; + Py_DECREF(value); break; case While_kind: result = PyType_GenericNew(While_type, NULL, NULL); @@ -2917,6 +3045,11 @@ ast2obj_stmt(void* _o) if (_PyObject_SetAttrId(result, &PyId_body, value) == -1) goto failed; Py_DECREF(value); + value = ast2obj_string(o->v.With.type_comment); + if (!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1) + goto failed; + Py_DECREF(value); break; case AsyncWith_kind: result = PyType_GenericNew(AsyncWith_type, NULL, NULL); @@ -2931,6 +3064,11 @@ ast2obj_stmt(void* _o) if (_PyObject_SetAttrId(result, &PyId_body, value) == -1) goto failed; Py_DECREF(value); + value = ast2obj_string(o->v.AsyncWith.type_comment); + if 
(!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1) + goto failed; + Py_DECREF(value); break; case Raise_kind: result = PyType_GenericNew(Raise_type, NULL, NULL); @@ -3870,6 +4008,11 @@ ast2obj_arg(void* _o) if (_PyObject_SetAttrId(result, &PyId_annotation, value) == -1) goto failed; Py_DECREF(value); + value = ast2obj_string(o->type_comment); + if (!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1) + goto failed; + Py_DECREF(value); value = ast2obj_int(o->lineno); if (!value) goto failed; if (_PyObject_SetAttrId(result, &PyId_lineno, value) < 0) @@ -3981,6 +4124,33 @@ ast2obj_withitem(void* _o) return NULL; } +PyObject* +ast2obj_type_ignore(void* _o) +{ + type_ignore_ty o = (type_ignore_ty)_o; + PyObject *result = NULL, *value = NULL; + if (!o) { + Py_RETURN_NONE; + } + + switch (o->kind) { + case TypeIgnore_kind: + result = PyType_GenericNew(TypeIgnore_type, NULL, NULL); + if (!result) goto failed; + value = ast2obj_int(o->v.TypeIgnore.lineno); + if (!value) goto failed; + if (_PyObject_SetAttrId(result, &PyId_lineno, value) == -1) + goto failed; + Py_DECREF(value); + break; + } + return result; +failed: + Py_XDECREF(value); + Py_XDECREF(result); + return NULL; +} + int obj2ast_mod(PyObject* obj, mod_ty* out, PyArena* arena) @@ -3999,6 +4169,7 @@ obj2ast_mod(PyObject* obj, mod_ty* out, PyArena* arena) } if (isinstance) { asdl_seq* body; + asdl_seq* type_ignores; if (_PyObject_LookupAttrId(obj, &PyId_body, &tmp) < 0) { return 1; @@ -4030,7 +4201,37 @@ obj2ast_mod(PyObject* obj, mod_ty* out, PyArena* arena) } Py_CLEAR(tmp); } - *out = Module(body, arena); + if (_PyObject_LookupAttrId(obj, &PyId_type_ignores, &tmp) < 0) { + return 1; + } + if (tmp == NULL) { + PyErr_SetString(PyExc_TypeError, "required field \"type_ignores\" missing from Module"); + return 1; + } + else { + int res; + Py_ssize_t len; + Py_ssize_t i; + if (!PyList_Check(tmp)) { + PyErr_Format(PyExc_TypeError, "Module field 
\"type_ignores\" must be a list, not a %.200s", tmp->ob_type->tp_name); + goto failed; + } + len = PyList_GET_SIZE(tmp); + type_ignores = _Py_asdl_seq_new(len, arena); + if (type_ignores == NULL) goto failed; + for (i = 0; i < len; i++) { + type_ignore_ty val; + res = obj2ast_type_ignore(PyList_GET_ITEM(tmp, i), &val, arena); + if (res != 0) goto failed; + if (len != PyList_GET_SIZE(tmp)) { + PyErr_SetString(PyExc_RuntimeError, "Module field \"type_ignores\" changed size during iteration"); + goto failed; + } + asdl_seq_SET(type_ignores, i, val); + } + Py_CLEAR(tmp); + } + *out = Module(body, type_ignores, arena); if (*out == NULL) goto failed; return 0; } @@ -4099,6 +4300,61 @@ obj2ast_mod(PyObject* obj, mod_ty* out, PyArena* arena) if (*out == NULL) goto failed; return 0; } + isinstance = PyObject_IsInstance(obj, (PyObject*)FunctionType_type); + if (isinstance == -1) { + return 1; + } + if (isinstance) { + asdl_seq* argtypes; + expr_ty returns; + + if (_PyObject_LookupAttrId(obj, &PyId_argtypes, &tmp) < 0) { + return 1; + } + if (tmp == NULL) { + PyErr_SetString(PyExc_TypeError, "required field \"argtypes\" missing from FunctionType"); + return 1; + } + else { + int res; + Py_ssize_t len; + Py_ssize_t i; + if (!PyList_Check(tmp)) { + PyErr_Format(PyExc_TypeError, "FunctionType field \"argtypes\" must be a list, not a %.200s", tmp->ob_type->tp_name); + goto failed; + } + len = PyList_GET_SIZE(tmp); + argtypes = _Py_asdl_seq_new(len, arena); + if (argtypes == NULL) goto failed; + for (i = 0; i < len; i++) { + expr_ty val; + res = obj2ast_expr(PyList_GET_ITEM(tmp, i), &val, arena); + if (res != 0) goto failed; + if (len != PyList_GET_SIZE(tmp)) { + PyErr_SetString(PyExc_RuntimeError, "FunctionType field \"argtypes\" changed size during iteration"); + goto failed; + } + asdl_seq_SET(argtypes, i, val); + } + Py_CLEAR(tmp); + } + if (_PyObject_LookupAttrId(obj, &PyId_returns, &tmp) < 0) { + return 1; + } + if (tmp == NULL) { + PyErr_SetString(PyExc_TypeError, "required 
field \"returns\" missing from FunctionType"); + return 1; + } + else { + int res; + res = obj2ast_expr(tmp, &returns, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } + *out = FunctionType(argtypes, returns, arena); + if (*out == NULL) goto failed; + return 0; + } isinstance = PyObject_IsInstance(obj, (PyObject*)Suite_type); if (isinstance == -1) { return 1; @@ -4224,6 +4480,7 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) asdl_seq* body; asdl_seq* decorator_list; expr_ty returns; + string type_comment; if (_PyObject_LookupAttrId(obj, &PyId_name, &tmp) < 0) { return 1; @@ -4324,8 +4581,22 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) if (res != 0) goto failed; Py_CLEAR(tmp); } - *out = FunctionDef(name, args, body, decorator_list, returns, lineno, - col_offset, end_lineno, end_col_offset, arena); + if (_PyObject_LookupAttrId(obj, &PyId_type_comment, &tmp) < 0) { + return 1; + } + if (tmp == NULL || tmp == Py_None) { + Py_CLEAR(tmp); + type_comment = NULL; + } + else { + int res; + res = obj2ast_string(tmp, &type_comment, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } + *out = FunctionDef(name, args, body, decorator_list, returns, + type_comment, lineno, col_offset, end_lineno, + end_col_offset, arena); if (*out == NULL) goto failed; return 0; } @@ -4339,6 +4610,7 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) asdl_seq* body; asdl_seq* decorator_list; expr_ty returns; + string type_comment; if (_PyObject_LookupAttrId(obj, &PyId_name, &tmp) < 0) { return 1; @@ -4439,9 +4711,22 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) if (res != 0) goto failed; Py_CLEAR(tmp); } + if (_PyObject_LookupAttrId(obj, &PyId_type_comment, &tmp) < 0) { + return 1; + } + if (tmp == NULL || tmp == Py_None) { + Py_CLEAR(tmp); + type_comment = NULL; + } + else { + int res; + res = obj2ast_string(tmp, &type_comment, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } *out = AsyncFunctionDef(name, args, body, 
decorator_list, returns, - lineno, col_offset, end_lineno, end_col_offset, - arena); + type_comment, lineno, col_offset, end_lineno, + end_col_offset, arena); if (*out == NULL) goto failed; return 0; } @@ -4668,6 +4953,7 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) if (isinstance) { asdl_seq* targets; expr_ty value; + string type_comment; if (_PyObject_LookupAttrId(obj, &PyId_targets, &tmp) < 0) { return 1; @@ -4712,8 +4998,21 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) if (res != 0) goto failed; Py_CLEAR(tmp); } - *out = Assign(targets, value, lineno, col_offset, end_lineno, - end_col_offset, arena); + if (_PyObject_LookupAttrId(obj, &PyId_type_comment, &tmp) < 0) { + return 1; + } + if (tmp == NULL || tmp == Py_None) { + Py_CLEAR(tmp); + type_comment = NULL; + } + else { + int res; + res = obj2ast_string(tmp, &type_comment, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } + *out = Assign(targets, value, type_comment, lineno, col_offset, + end_lineno, end_col_offset, arena); if (*out == NULL) goto failed; return 0; } @@ -4846,6 +5145,7 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) expr_ty iter; asdl_seq* body; asdl_seq* orelse; + string type_comment; if (_PyObject_LookupAttrId(obj, &PyId_target, &tmp) < 0) { return 1; @@ -4933,8 +5233,21 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) } Py_CLEAR(tmp); } - *out = For(target, iter, body, orelse, lineno, col_offset, end_lineno, - end_col_offset, arena); + if (_PyObject_LookupAttrId(obj, &PyId_type_comment, &tmp) < 0) { + return 1; + } + if (tmp == NULL || tmp == Py_None) { + Py_CLEAR(tmp); + type_comment = NULL; + } + else { + int res; + res = obj2ast_string(tmp, &type_comment, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } + *out = For(target, iter, body, orelse, type_comment, lineno, + col_offset, end_lineno, end_col_offset, arena); if (*out == NULL) goto failed; return 0; } @@ -4947,6 +5260,7 @@ obj2ast_stmt(PyObject* obj, stmt_ty* 
out, PyArena* arena) expr_ty iter; asdl_seq* body; asdl_seq* orelse; + string type_comment; if (_PyObject_LookupAttrId(obj, &PyId_target, &tmp) < 0) { return 1; @@ -5034,8 +5348,21 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) } Py_CLEAR(tmp); } - *out = AsyncFor(target, iter, body, orelse, lineno, col_offset, - end_lineno, end_col_offset, arena); + if (_PyObject_LookupAttrId(obj, &PyId_type_comment, &tmp) < 0) { + return 1; + } + if (tmp == NULL || tmp == Py_None) { + Py_CLEAR(tmp); + type_comment = NULL; + } + else { + int res; + res = obj2ast_string(tmp, &type_comment, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } + *out = AsyncFor(target, iter, body, orelse, type_comment, lineno, + col_offset, end_lineno, end_col_offset, arena); if (*out == NULL) goto failed; return 0; } @@ -5220,6 +5547,7 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) if (isinstance) { asdl_seq* items; asdl_seq* body; + string type_comment; if (_PyObject_LookupAttrId(obj, &PyId_items, &tmp) < 0) { return 1; @@ -5281,7 +5609,20 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) } Py_CLEAR(tmp); } - *out = With(items, body, lineno, col_offset, end_lineno, + if (_PyObject_LookupAttrId(obj, &PyId_type_comment, &tmp) < 0) { + return 1; + } + if (tmp == NULL || tmp == Py_None) { + Py_CLEAR(tmp); + type_comment = NULL; + } + else { + int res; + res = obj2ast_string(tmp, &type_comment, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } + *out = With(items, body, type_comment, lineno, col_offset, end_lineno, end_col_offset, arena); if (*out == NULL) goto failed; return 0; @@ -5293,6 +5634,7 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) if (isinstance) { asdl_seq* items; asdl_seq* body; + string type_comment; if (_PyObject_LookupAttrId(obj, &PyId_items, &tmp) < 0) { return 1; @@ -5354,8 +5696,21 @@ obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena) } Py_CLEAR(tmp); } - *out = AsyncWith(items, body, lineno, col_offset, 
end_lineno, - end_col_offset, arena); + if (_PyObject_LookupAttrId(obj, &PyId_type_comment, &tmp) < 0) { + return 1; + } + if (tmp == NULL || tmp == Py_None) { + Py_CLEAR(tmp); + type_comment = NULL; + } + else { + int res; + res = obj2ast_string(tmp, &type_comment, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } + *out = AsyncWith(items, body, type_comment, lineno, col_offset, + end_lineno, end_col_offset, arena); if (*out == NULL) goto failed; return 0; } @@ -8073,6 +8428,7 @@ obj2ast_arg(PyObject* obj, arg_ty* out, PyArena* arena) PyObject* tmp = NULL; identifier arg; expr_ty annotation; + string type_comment; int lineno; int col_offset; int end_lineno; @@ -8104,6 +8460,19 @@ obj2ast_arg(PyObject* obj, arg_ty* out, PyArena* arena) if (res != 0) goto failed; Py_CLEAR(tmp); } + if (_PyObject_LookupAttrId(obj, &PyId_type_comment, &tmp) < 0) { + return 1; + } + if (tmp == NULL || tmp == Py_None) { + Py_CLEAR(tmp); + type_comment = NULL; + } + else { + int res; + res = obj2ast_string(tmp, &type_comment, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } if (_PyObject_LookupAttrId(obj, &PyId_lineno, &tmp) < 0) { return 1; } @@ -8156,8 +8525,8 @@ obj2ast_arg(PyObject* obj, arg_ty* out, PyArena* arena) if (res != 0) goto failed; Py_CLEAR(tmp); } - *out = arg(arg, annotation, lineno, col_offset, end_lineno, end_col_offset, - arena); + *out = arg(arg, annotation, type_comment, lineno, col_offset, end_lineno, + end_col_offset, arena); return 0; failed: Py_XDECREF(tmp); @@ -8284,6 +8653,48 @@ obj2ast_withitem(PyObject* obj, withitem_ty* out, PyArena* arena) return 1; } +int +obj2ast_type_ignore(PyObject* obj, type_ignore_ty* out, PyArena* arena) +{ + int isinstance; + + PyObject *tmp = NULL; + + if (obj == Py_None) { + *out = NULL; + return 0; + } + isinstance = PyObject_IsInstance(obj, (PyObject*)TypeIgnore_type); + if (isinstance == -1) { + return 1; + } + if (isinstance) { + int lineno; + + if (_PyObject_LookupAttrId(obj, &PyId_lineno, &tmp) < 0) { + 
return 1; + } + if (tmp == NULL) { + PyErr_SetString(PyExc_TypeError, "required field \"lineno\" missing from TypeIgnore"); + return 1; + } + else { + int res; + res = obj2ast_int(tmp, &lineno, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } + *out = TypeIgnore(lineno, arena); + if (*out == NULL) goto failed; + return 0; + } + + PyErr_Format(PyExc_TypeError, "expected some sort of type_ignore, but got %R", obj); + failed: + Py_XDECREF(tmp); + return 1; +} + static struct PyModuleDef _astmodule = { PyModuleDef_HEAD_INIT, "_ast" @@ -8299,13 +8710,17 @@ PyInit__ast(void) if (PyDict_SetItemString(d, "AST", (PyObject*)&AST_type) < 0) return NULL; if (PyModule_AddIntMacro(m, PyCF_ONLY_AST) < 0) return NULL; + if (PyModule_AddIntMacro(m, PyCF_TYPE_COMMENTS) < 0) + return NULL; if (PyDict_SetItemString(d, "mod", (PyObject*)mod_type) < 0) return NULL; if (PyDict_SetItemString(d, "Module", (PyObject*)Module_type) < 0) return NULL; if (PyDict_SetItemString(d, "Interactive", (PyObject*)Interactive_type) < 0) return NULL; if (PyDict_SetItemString(d, "Expression", (PyObject*)Expression_type) < 0) return NULL; + if (PyDict_SetItemString(d, "FunctionType", (PyObject*)FunctionType_type) < + 0) return NULL; if (PyDict_SetItemString(d, "Suite", (PyObject*)Suite_type) < 0) return NULL; if (PyDict_SetItemString(d, "stmt", (PyObject*)stmt_type) < 0) return NULL; @@ -8486,6 +8901,10 @@ PyInit__ast(void) NULL; if (PyDict_SetItemString(d, "withitem", (PyObject*)withitem_type) < 0) return NULL; + if (PyDict_SetItemString(d, "type_ignore", (PyObject*)type_ignore_type) < + 0) return NULL; + if (PyDict_SetItemString(d, "TypeIgnore", (PyObject*)TypeIgnore_type) < 0) + return NULL; return m; } @@ -8498,18 +8917,19 @@ PyObject* PyAST_mod2obj(mod_ty t) } /* mode is 0 for "exec", 1 for "eval" and 2 for "single" input */ +/* and 3 for "func_type" */ mod_ty PyAST_obj2mod(PyObject* ast, PyArena* arena, int mode) { mod_ty res; PyObject *req_type[3]; - char *req_name[] = {"Module", 
"Expression", "Interactive"}; + char *req_name[] = {"Module", "Expression", "Interactive", "FunctionType"}; int isinstance; req_type[0] = (PyObject*)Module_type; req_type[1] = (PyObject*)Expression_type; req_type[2] = (PyObject*)Interactive_type; - assert(0 <= mode && mode <= 2); + assert(0 <= mode && mode <= 3); if (!init_types()) return NULL;
Python/pythonrun.c +2 −0 modified
@@ -158,6 +158,8 @@ static int PARSER_FLAGS(PyCompilerFlags *flags)
         parser_flags |= PyPARSE_IGNORE_COOKIE;
     if (flags->cf_flags & CO_FUTURE_BARRY_AS_BDFL)
         parser_flags |= PyPARSE_BARRY_AS_BDFL;
+    if (flags->cf_flags & PyCF_TYPE_COMMENTS)
+        parser_flags |= PyPARSE_TYPE_COMMENTS;
     return parser_flags;
 }
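The PARSER_FLAGS hunk above translates a compiler-level flag bit into the matching parser-level bit. A minimal standalone sketch of that pattern follows. The constant names and bit values here are hypothetical placeholders, not CPython's real definitions, and the table-driven loop is one possible way to write the mapping rather than how pythonrun.c does it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values for illustration only; NOT CPython's constants. */
enum { CF_IGNORE_COOKIE = 0x1, CF_TYPE_COMMENTS = 0x2 };
enum { PARSE_IGNORE_COOKIE = 0x10, PARSE_TYPE_COMMENTS = 0x20 };

struct flag_map {
    int compiler_bit;  /* bit tested in the incoming compiler flags */
    int parser_bit;    /* bit set in the outgoing parser flags */
};

static const struct flag_map FLAG_MAP[] = {
    { CF_IGNORE_COOKIE, PARSE_IGNORE_COOKIE },
    { CF_TYPE_COMMENTS, PARSE_TYPE_COMMENTS },
};

/* Translate compiler flag bits into parser flag bits, one pair at a time. */
int translate_flags(int cf_flags)
{
    int parser_flags = 0;
    for (size_t i = 0; i < sizeof FLAG_MAP / sizeof FLAG_MAP[0]; i++) {
        if (cf_flags & FLAG_MAP[i].compiler_bit)
            parser_flags |= FLAG_MAP[i].parser_bit;
    }
    return parser_flags;
}

/* Small self-check a caller could run once at startup. */
void translate_flags_self_check(void)
{
    assert(translate_flags(0) == 0);
    assert(translate_flags(CF_TYPE_COMMENTS) == PARSE_TYPE_COMMENTS);
}
```

With a table like this, supporting a new flag means adding one row, which is the same effect the two added lines have in the hunk.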
156afcb26c19 Fully incorporate the code from Python 3.7.2 (#78)
53 files changed · +3393 −6148
appveyor.yml +0 −3 modified
@@ -2,9 +2,6 @@ environment:
   matrix:
     # For Python versions available on Appveyor, see
     # http://www.appveyor.com/docs/installed-software#python
-    - PYTHON: "C:\\Python33"
-    - PYTHON: "C:\\Python33-x64"
-      DISTUTILS_USE_SDK: 1
     - PYTHON: "C:\\Python34"
     - PYTHON: "C:\\Python34-x64"
       DISTUTILS_USE_SDK: 1
ast27/Python/ast.c +2 −3 modified
@@ -1500,7 +1500,6 @@ ast_for_atom(struct compiling *c, const node *n)
     case STRING: {
         PyObject *kind, *str = parsestrplus(c, n);
         const char *raw, *s = STR(CHILD(n, 0));
-        int quote = Py_CHARMASK(*s);
         /* currently Python allows up to 2 string modifiers */
         char *ch, s_kind[3] = {0, 0, 0};
         ch = s_kind;
@@ -1519,7 +1518,7 @@ ast_for_atom(struct compiling *c, const node *n)
                 PyErr_Fetch(&type, &value, &tback);
                 errstr = PyObject_Str(value);
                 if (errstr) {
-                    char *s = "";
+                    const char *s = "";
                     char buf[128];
                     s = _PyUnicode_AsString(errstr);
                     PyOS_snprintf(buf, sizeof(buf), "(unicode error) %s", s);
@@ -2190,7 +2189,7 @@ ast_for_call(struct compiling *c, const node *n, expr_ty func)
                 keyword_ty kw;
                 identifier key;
                 int k;
-                char *tmp;
+                const char *tmp;

                 /* CHILD(ch, 0) is test, but must be an identifier? */
                 e = ast_for_expr(c, CHILD(ch, 0));
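The second hunk above changes a local pointer to `const char *` before formatting its text into a fixed 128-byte buffer. A minimal sketch of that read-only-pointer-plus-bounded-formatting pattern, using only the standard `snprintf`, might look like this. The helper name `format_unicode_error` is hypothetical, and the real code uses `PyOS_snprintf` on a string obtained from a Python object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of typed_ast): format a borrowed,
 * read-only detail string into a caller-supplied buffer.
 * Returns 1 if the whole message fit, 0 if it was truncated.
 */
int format_unicode_error(char *buf, size_t size, const char *detail)
{
    int n;

    assert(buf != NULL && size > 0);
    /* snprintf writes at most size - 1 characters plus a terminating NUL. */
    n = snprintf(buf, size, "(unicode error) %s", detail ? detail : "");
    return n >= 0 && (size_t)n < size;
}
```

Declaring `detail` as `const char *` documents that the function only reads the text, which matches the intent of the `const` changes in the hunk.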
ast3/compile_pgen +0 −22 removed
@@ -1,22 +0,0 @@
-gcc -I Parser -I Include $(python3-config --includes) \
-    -o pgen.out \
-    Parser/acceler.c \
-    Parser/grammar1.c \
-    Parser/node.c \
-    Parser/parser.c \
-    Parser/bitset.c \
-    Parser/grammar.c \
-    Pgen/listnode.c \
-    Pgen/metagrammar.c \
-    Pgen/firstsets.c \
-    Pgen/pgen.c \
-    Pgen/obmalloc.c \
-    Pgen/dynamic_annotations.c \
-    Pgen/mysnprintf.c \
-    Pgen/pyctype.c \
-    Pgen/tokenizer_pgen.c \
-    Pgen/printgrammar.c \
-    Pgen/parsetok_pgen.c \
-    Pgen/pgenmain.c
-
-
ast3/Custom/typed_ast.c +7 −4 modified
@@ -1,6 +1,6 @@
 #include "Python.h"
 #include "Python-ast.h"
-#include "compile.h"
+#include "compile-ast3.h"
 #include "node.h"
 #include "grammar.h"
 #include "token.h"
@@ -222,10 +222,13 @@ string_object_to_c_ast(const char *s, PyObject *filename, int start,
     PyCompilerFlags localflags;
     perrdetail err;
     int iflags = PARSER_FLAGS(flags);
+    node *n;
-    node *n = Ta3Parser_ParseStringObject(s, filename,
-                                          &_Ta3Parser_Grammar, start, &err,
-                                          &iflags);
+    if (feature_version >= 7)
+        iflags |= PyPARSE_ASYNC_ALWAYS;
+    n = Ta3Parser_ParseStringObject(s, filename,
+                                    &_Ta3Parser_Grammar, start, &err,
+                                    &iflags);
     if (flags == NULL) {
         localflags.cf_flags = 0;
         flags = &localflags;
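The second hunk above makes the parser flags depend on `feature_version`. A minimal sketch of that decision, assuming the flag value `0x8000` that this same commit defines for `PyPARSE_ASYNC_ALWAYS` in ast3/Include/parsetok.h. The helper name `parser_flags_for` is hypothetical and is not a typed_ast function. Judging by the flag name, it presumably makes the tokenizer always treat `async`/`await` as keywords for 3.7+ source.

```python
# Illustrative only: mirrors the flag logic added to string_object_to_c_ast.
# PyPARSE_ASYNC_ALWAYS value is taken from parsetok.h in this commit.
PyPARSE_ASYNC_ALWAYS = 0x8000


def parser_flags_for(base_flags: int, feature_version: int) -> int:
    """Return the parser flags for a given feature_version (hypothetical helper)."""
    iflags = base_flags
    if feature_version >= 7:
        # From feature version 7 on, the ASYNC_ALWAYS bit is set.
        iflags |= PyPARSE_ASYNC_ALWAYS
    return iflags
```

For example, `parser_flags_for(0x0010, 6)` leaves the flags untouched, while `parser_flags_for(0x0010, 7)` sets the extra bit.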
ast3/Grammar/Grammar +3 −9 modified
@@ -1,14 +1,7 @@
 # Grammar for Python
-# Note: Changing the grammar specified in this file will most likely
-# require corresponding changes in the parser module
-# (../Modules/parsermodule.c). If you can't make the changes to
-# that module yourself, please co-ordinate the required changes
-# with someone who can; ask around on python-dev for help. Fred
-# Drake <fdrake@acm.org> will probably be listening there.
-
 # NOTE WELL: You should also follow all the steps listed at
-# https://docs.python.org/devguide/grammar.html
+# https://devguide.python.org/grammar/
 # Start symbols for the grammar:
 # single_input is a single interactive statement;
@@ -150,7 +143,8 @@ argument: ( test [comp_for] |
             '*' test )
 comp_iter: comp_for | comp_if
-comp_for: [ASYNC] 'for' exprlist 'in' or_test [comp_iter]
+sync_comp_for: 'for' exprlist 'in' or_test [comp_iter]
+comp_for: [ASYNC] sync_comp_for
 comp_if: 'if' test_nocond [comp_iter]
 # not used in grammar, but may appear in "node" passed from Parser to Compiler
ast3/Include/ast.h +7 −0 modified
@@ -18,6 +18,13 @@ extern mod_ty Ta3AST_FromNodeObject(
     int feature_version,
     PyArena *arena);
+#ifndef Py_LIMITED_API
+
+/* _PyAST_ExprAsUnicode is defined in ast_unparse.c */
+extern PyObject * _PyAST_ExprAsUnicode(expr_ty);
+
+#endif /* !Py_LIMITED_API */
+
 #ifdef __cplusplus
 }
 #endif
ast3/Include/bitset.h +7 −7 modified
@@ -7,7 +7,7 @@ extern "C" {
 /* Bitset interface */
-#define BYTE char
+#define BYTE char
 typedef BYTE *bitset;
@@ -18,13 +18,13 @@ int addbit(bitset bs, int ibit); /* Returns 0 if already set */
 int samebitset(bitset bs1, bitset bs2, int nbits);
 void mergebitset(bitset bs1, bitset bs2, int nbits);
-#define BITSPERBYTE (8*sizeof(BYTE))
-#define NBYTES(nbits) (((nbits) + BITSPERBYTE - 1) / BITSPERBYTE)
+#define BITSPERBYTE (8*sizeof(BYTE))
+#define NBYTES(nbits) (((nbits) + BITSPERBYTE - 1) / BITSPERBYTE)
-#define BIT2BYTE(ibit) ((ibit) / BITSPERBYTE)
-#define BIT2SHIFT(ibit) ((ibit) % BITSPERBYTE)
-#define BIT2MASK(ibit) (1 << BIT2SHIFT(ibit))
-#define BYTE2BIT(ibyte) ((ibyte) * BITSPERBYTE)
+#define BIT2BYTE(ibit) ((ibit) / BITSPERBYTE)
+#define BIT2SHIFT(ibit) ((ibit) % BITSPERBYTE)
+#define BIT2MASK(ibit) (1 << BIT2SHIFT(ibit))
+#define BYTE2BIT(ibyte) ((ibyte) * BITSPERBYTE)
 #ifdef __cplusplus
 }
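The bitset.h hunks are whitespace-only, so the removed and added lines read the same here. The index arithmetic the macros encode can be restated as a short sketch, assuming `BYTE` is `char` so that `sizeof(BYTE)` is 1 and `BITSPERBYTE` is 8. The lowercase function names are illustrative stand-ins for the macros, not part of typed_ast.

```python
# Illustrative restatement of the bitset.h macros.
# Assumption: sizeof(BYTE) == 1, hence 8 bits per byte.
BITSPERBYTE = 8


def nbytes(nbits: int) -> int:
    """NBYTES: bytes needed to hold nbits bits, rounding up."""
    return (nbits + BITSPERBYTE - 1) // BITSPERBYTE


def bit2byte(ibit: int) -> int:
    """BIT2BYTE: index of the byte that holds bit ibit."""
    return ibit // BITSPERBYTE


def bit2shift(ibit: int) -> int:
    """BIT2SHIFT: position of bit ibit inside its byte."""
    return ibit % BITSPERBYTE


def bit2mask(ibit: int) -> int:
    """BIT2MASK: single-bit mask selecting bit ibit inside its byte."""
    return 1 << bit2shift(ibit)


def byte2bit(ibyte: int) -> int:
    """BYTE2BIT: index of the first bit stored in byte ibyte."""
    return ibyte * BITSPERBYTE
```

Under that assumption, bit 10 lives in byte 1 at shift 2, so its mask is `0b100`.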
ast3/Include/compile-ast3.h +1 −6 renamed
@@ -1,11 +1,6 @@
-#ifndef Ta3_COMPILE_H
-#define Ta3_COMPILE_H
-
 /* These definitions must match corresponding definitions in graminit.h.
    There's code in compile.c that checks that they are the same. */
 #define Py_single_input 256
 #define Py_file_input 257
 #define Py_eval_input 258
-#define Py_func_type_input 342
-
-#endif /* !Ta3_COMPILE_H */
+#define Py_func_type_input 343
ast3/Include/errcode.h+17 −17 modified@@ -13,24 +13,24 @@ extern "C" { the parser only returns E_EOF when it hits EOF immediately, and it never returns E_OK. */ -#define E_OK 10 /* No error */ -#define E_EOF 11 /* End Of File */ -#define E_INTR 12 /* Interrupted */ -#define E_TOKEN 13 /* Bad token */ -#define E_SYNTAX 14 /* Syntax error */ -#define E_NOMEM 15 /* Ran out of memory */ -#define E_DONE 16 /* Parsing complete */ -#define E_ERROR 17 /* Execution error */ -#define E_TABSPACE 18 /* Inconsistent mixing of tabs and spaces */ -#define E_OVERFLOW 19 /* Node had too many children */ -#define E_TOODEEP 20 /* Too many indentation levels */ -#define E_DEDENT 21 /* No matching outer block for dedent */ -#define E_DECODE 22 /* Error in decoding into Unicode */ -#define E_EOFS 23 /* EOF in triple-quoted string */ -#define E_EOLS 24 /* EOL in single-quoted string */ -#define E_LINECONT 25 /* Unexpected characters after a line continuation */ +#define E_OK 10 /* No error */ +#define E_EOF 11 /* End Of File */ +#define E_INTR 12 /* Interrupted */ +#define E_TOKEN 13 /* Bad token */ +#define E_SYNTAX 14 /* Syntax error */ +#define E_NOMEM 15 /* Ran out of memory */ +#define E_DONE 16 /* Parsing complete */ +#define E_ERROR 17 /* Execution error */ +#define E_TABSPACE 18 /* Inconsistent mixing of tabs and spaces */ +#define E_OVERFLOW 19 /* Node had too many children */ +#define E_TOODEEP 20 /* Too many indentation levels */ +#define E_DEDENT 21 /* No matching outer block for dedent */ +#define E_DECODE 22 /* Error in decoding into Unicode */ +#define E_EOFS 23 /* EOF in triple-quoted string */ +#define E_EOLS 24 /* EOL in single-quoted string */ +#define E_LINECONT 25 /* Unexpected characters after a line continuation */ #define E_IDENTIFIER 26 /* Invalid characters in identifier */ -#define E_BADSINGLE 27 /* Ill-formed single statement input */ +#define E_BADSINGLE 27 /* Ill-formed single statement input */ #ifdef __cplusplus }
ast3/Include/graminit.h +9 −8 modified
@@ -81,11 +81,12 @@
 #define arglist 334
 #define argument 335
 #define comp_iter 336
-#define comp_for 337
-#define comp_if 338
-#define encoding_decl 339
-#define yield_expr 340
-#define yield_arg 341
-#define func_type_input 342
-#define func_type 343
-#define typelist 344
+#define sync_comp_for 337
+#define comp_for 338
+#define comp_if 339
+#define encoding_decl 340
+#define yield_expr 341
+#define yield_arg 342
+#define func_type_input 343
+#define func_type 344
+#define typelist 345
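Inserting `sync_comp_for` at 337 shifts every later non-terminal up by one, which is consistent with `Py_func_type_input` in compile-ast3.h moving from 342 to 343 in the same commit. The sketch below reproduces that renumbering using only the symbol names and numbers visible in this hunk. The helper `renumber` is hypothetical and only illustrates the sequential numbering.

```python
# Symbol order as shown in the removed lines of the graminit.h hunk.
OLD_NAMES = [
    "comp_for", "comp_if", "encoding_decl", "yield_expr",
    "yield_arg", "func_type_input", "func_type", "typelist",
]
FIRST_NUMBER = 337  # number of the first symbol in the hunk


def renumber(names, first):
    """Assign consecutive numbers to names, starting at first (hypothetical helper)."""
    return {name: first + i for i, name in enumerate(names)}


old = renumber(OLD_NAMES, FIRST_NUMBER)
# The new grammar adds sync_comp_for just before comp_for.
new = renumber(["sync_comp_for"] + OLD_NAMES, FIRST_NUMBER)
```

Here `old["func_type_input"]` is 342 and `new["func_type_input"]` is 343, matching the two headers.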
ast3/Include/grammar.h+24 −24 modified@@ -12,58 +12,58 @@ extern "C" { /* A label of an arc */ typedef struct { - int lb_type; - char *lb_str; + int lb_type; + char *lb_str; } label; -#define EMPTY 0 /* Label number 0 is by definition the empty label */ +#define EMPTY 0 /* Label number 0 is by definition the empty label */ /* A list of labels */ typedef struct { - int ll_nlabels; - label *ll_label; + int ll_nlabels; + label *ll_label; } labellist; /* An arc from one state to another */ typedef struct { - short a_lbl; /* Label of this arc */ - short a_arrow; /* State where this arc goes to */ + short a_lbl; /* Label of this arc */ + short a_arrow; /* State where this arc goes to */ } arc; /* A state in a DFA */ typedef struct { - int s_narcs; - arc *s_arc; /* Array of arcs */ + int s_narcs; + arc *s_arc; /* Array of arcs */ /* Optional accelerators */ - int s_lower; /* Lowest label index */ - int s_upper; /* Highest label index */ - int *s_accel; /* Accelerator */ - int s_accept; /* Nonzero for accepting state */ + int s_lower; /* Lowest label index */ + int s_upper; /* Highest label index */ + int *s_accel; /* Accelerator */ + int s_accept; /* Nonzero for accepting state */ } state; /* A DFA */ typedef struct { - int d_type; /* Non-terminal this represents */ - char *d_name; /* For printing */ - int d_initial; /* Initial state */ - int d_nstates; - state *d_state; /* Array of states */ - bitset d_first; + int d_type; /* Non-terminal this represents */ + char *d_name; /* For printing */ + int d_initial; /* Initial state */ + int d_nstates; + state *d_state; /* Array of states */ + bitset d_first; } dfa; /* A grammar */ typedef struct { - int g_ndfas; - dfa *g_dfa; /* Array of DFAs */ - labellist g_ll; - int g_start; /* Start symbol of the grammar */ - int g_accel; /* Set if accelerators present */ + int g_ndfas; + dfa *g_dfa; /* Array of DFAs */ + labellist g_ll; + int g_start; /* Start symbol of the grammar */ + int g_accel; /* Set if accelerators present */ } 
grammar; /* FUNCTIONS */
ast3/Include/node.h+12 −12 modified@@ -8,15 +8,15 @@ extern "C" { #endif typedef struct _node { - short n_type; - char *n_str; - int n_lineno; - int n_col_offset; - int n_nchildren; - struct _node *n_child; + short n_type; + char *n_str; + int n_lineno; + int n_col_offset; + int n_nchildren; + struct _node *n_child; } node; -extern node *Ta3Node_New(int type); +extern node * Ta3Node_New(int type); extern int Ta3Node_AddChild(node *n, int type, char *str, int lineno, int col_offset); extern void Ta3Node_Free(node *n); @@ -25,12 +25,12 @@ extern Py_ssize_t _Ta3Node_SizeOf(node *n); #endif /* Node access functions */ -#define NCH(n) ((n)->n_nchildren) +#define NCH(n) ((n)->n_nchildren) -#define CHILD(n, i) (&(n)->n_child[i]) -#define RCHILD(n, i) (CHILD(n, NCH(n) + i)) -#define TYPE(n) ((n)->n_type) -#define STR(n) ((n)->n_str) +#define CHILD(n, i) (&(n)->n_child[i]) +#define RCHILD(n, i) (CHILD(n, NCH(n) + i)) +#define TYPE(n) ((n)->n_type) +#define STR(n) ((n)->n_str) #define LINENO(n) ((n)->n_lineno) /* Assert that the type of a node is what we expect */
ast3/Include/parsetok.h +6 −5 modified
@@ -21,19 +21,20 @@ typedef struct {
 } perrdetail;
 #if 0
-#define PyPARSE_YIELD_IS_KEYWORD 0x0001
+#define PyPARSE_YIELD_IS_KEYWORD 0x0001
 #endif
-#define PyPARSE_DONT_IMPLY_DEDENT 0x0002
+#define PyPARSE_DONT_IMPLY_DEDENT 0x0002
 #if 0
-#define PyPARSE_WITH_IS_KEYWORD 0x0003
+#define PyPARSE_WITH_IS_KEYWORD 0x0003
 #define PyPARSE_PRINT_IS_FUNCTION 0x0004
 #define PyPARSE_UNICODE_LITERALS 0x0008
 #endif
 #define PyPARSE_IGNORE_COOKIE 0x0010
 #define PyPARSE_BARRY_AS_BDFL 0x0020
+#define PyPARSE_ASYNC_ALWAYS 0x8000
 extern node * Ta3Parser_ParseString(const char *, grammar *, int, perrdetail *);
@@ -98,8 +99,8 @@ extern node * Ta3Parser_ParseStringObject(
 /* Note that the following functions are defined in pythonrun.c, not in parsetok.c */
-PyAPI_FUNC(void) PyParser_SetError(perrdetail *);
-PyAPI_FUNC(void) PyParser_ClearError(perrdetail *);
+extern void PyParser_SetError(perrdetail *);
+extern void PyParser_ClearError(perrdetail *);
 #ifdef __cplusplus
 }
ast3/Include/Python-ast.h+62 −63 modified@@ -50,24 +50,24 @@ struct _mod { asdl_seq *body; asdl_seq *type_ignores; } Module; - + struct { asdl_seq *body; } Interactive; - + struct { expr_ty body; } Expression; - + struct { asdl_seq *argtypes; expr_ty returns; } FunctionType; - + struct { asdl_seq *body; } Suite; - + } v; }; @@ -90,7 +90,7 @@ struct _stmt { expr_ty returns; string type_comment; } FunctionDef; - + struct { identifier name; arguments_ty args; @@ -99,121 +99,121 @@ struct _stmt { expr_ty returns; string type_comment; } AsyncFunctionDef; - + struct { identifier name; asdl_seq *bases; asdl_seq *keywords; asdl_seq *body; asdl_seq *decorator_list; } ClassDef; - + struct { expr_ty value; } Return; - + struct { asdl_seq *targets; } Delete; - + struct { asdl_seq *targets; expr_ty value; string type_comment; } Assign; - + struct { expr_ty target; operator_ty op; expr_ty value; } AugAssign; - + struct { expr_ty target; expr_ty annotation; expr_ty value; int simple; } AnnAssign; - + struct { expr_ty target; expr_ty iter; asdl_seq *body; asdl_seq *orelse; string type_comment; } For; - + struct { expr_ty target; expr_ty iter; asdl_seq *body; asdl_seq *orelse; string type_comment; } AsyncFor; - + struct { expr_ty test; asdl_seq *body; asdl_seq *orelse; } While; - + struct { expr_ty test; asdl_seq *body; asdl_seq *orelse; } If; - + struct { asdl_seq *items; asdl_seq *body; string type_comment; } With; - + struct { asdl_seq *items; asdl_seq *body; string type_comment; } AsyncWith; - + struct { expr_ty exc; expr_ty cause; } Raise; - + struct { asdl_seq *body; asdl_seq *handlers; asdl_seq *orelse; asdl_seq *finalbody; } Try; - + struct { expr_ty test; expr_ty msg; } Assert; - + struct { asdl_seq *names; } Import; - + struct { identifier module; asdl_seq *names; int level; } ImportFrom; - + struct { asdl_seq *names; } Global; - + struct { asdl_seq *names; } Nonlocal; - + struct { expr_ty value; } Expr; - + } v; int lineno; int col_offset; @@ -235,146 +235,146 @@ struct 
_expr { boolop_ty op; asdl_seq *values; } BoolOp; - + struct { expr_ty left; operator_ty op; expr_ty right; } BinOp; - + struct { unaryop_ty op; expr_ty operand; } UnaryOp; - + struct { arguments_ty args; expr_ty body; } Lambda; - + struct { expr_ty test; expr_ty body; expr_ty orelse; } IfExp; - + struct { asdl_seq *keys; asdl_seq *values; } Dict; - + struct { asdl_seq *elts; } Set; - + struct { expr_ty elt; asdl_seq *generators; } ListComp; - + struct { expr_ty elt; asdl_seq *generators; } SetComp; - + struct { expr_ty key; expr_ty value; asdl_seq *generators; } DictComp; - + struct { expr_ty elt; asdl_seq *generators; } GeneratorExp; - + struct { expr_ty value; } Await; - + struct { expr_ty value; } Yield; - + struct { expr_ty value; } YieldFrom; - + struct { expr_ty left; asdl_int_seq *ops; asdl_seq *comparators; } Compare; - + struct { expr_ty func; asdl_seq *args; asdl_seq *keywords; } Call; - + struct { object n; } Num; - + struct { string s; string kind; } Str; - + struct { expr_ty value; int conversion; expr_ty format_spec; } FormattedValue; - + struct { asdl_seq *values; } JoinedStr; - + struct { bytes s; } Bytes; - + struct { singleton value; } NameConstant; - + struct { constant value; } Constant; - + struct { expr_ty value; identifier attr; expr_context_ty ctx; } Attribute; - + struct { expr_ty value; slice_ty slice; expr_context_ty ctx; } Subscript; - + struct { expr_ty value; expr_context_ty ctx; } Starred; - + struct { identifier id; expr_context_ty ctx; } Name; - + struct { asdl_seq *elts; expr_context_ty ctx; } List; - + struct { asdl_seq *elts; expr_context_ty ctx; } Tuple; - + } v; int lineno; int col_offset; @@ -389,15 +389,15 @@ struct _slice { expr_ty upper; expr_ty step; } Slice; - + struct { asdl_seq *dims; } ExtSlice; - + struct { expr_ty value; } Index; - + } v; }; @@ -417,7 +417,7 @@ struct _excepthandler { identifier name; asdl_seq *body; } ExceptHandler; - + } v; int lineno; int col_offset; @@ -462,7 +462,7 @@ struct _type_ignore { 
struct { int lineno; } TypeIgnore; - + } v; }; @@ -603,8 +603,7 @@ expr_ty _Ta3_Call(expr_ty func, asdl_seq * args, asdl_seq * keywords, int #define Num(a0, a1, a2, a3) _Ta3_Num(a0, a1, a2, a3) expr_ty _Ta3_Num(object n, int lineno, int col_offset, PyArena *arena); #define Str(a0, a1, a2, a3, a4) _Ta3_Str(a0, a1, a2, a3, a4) -expr_ty _Ta3_Str(string s, string kind, int lineno, int col_offset, PyArena - *arena); +expr_ty _Ta3_Str(string s, string kind, int lineno, int col_offset, PyArena *arena); #define FormattedValue(a0, a1, a2, a3, a4, a5) _Ta3_FormattedValue(a0, a1, a2, a3, a4, a5) expr_ty _Ta3_FormattedValue(expr_ty value, int conversion, expr_ty format_spec, int lineno, int col_offset, PyArena *arena);
ast3/Include/token.h+63 −59 modified@@ -9,78 +9,82 @@ extern "C" { #undef TILDE /* Prevent clash of our definition with system macro. Ex AIX, ioctl.h */ -#define ENDMARKER 0 -#define NAME 1 -#define NUMBER 2 -#define STRING 3 -#define NEWLINE 4 -#define INDENT 5 -#define DEDENT 6 -#define LPAR 7 -#define RPAR 8 -#define LSQB 9 -#define RSQB 10 -#define COLON 11 -#define COMMA 12 -#define SEMI 13 -#define PLUS 14 -#define MINUS 15 -#define STAR 16 -#define SLASH 17 -#define VBAR 18 -#define AMPER 19 -#define LESS 20 -#define GREATER 21 -#define EQUAL 22 -#define DOT 23 -#define PERCENT 24 -#define LBRACE 25 -#define RBRACE 26 -#define EQEQUAL 27 -#define NOTEQUAL 28 -#define LESSEQUAL 29 -#define GREATEREQUAL 30 -#define TILDE 31 -#define CIRCUMFLEX 32 -#define LEFTSHIFT 33 -#define RIGHTSHIFT 34 -#define DOUBLESTAR 35 -#define PLUSEQUAL 36 -#define MINEQUAL 37 -#define STAREQUAL 38 -#define SLASHEQUAL 39 -#define PERCENTEQUAL 40 -#define AMPEREQUAL 41 -#define VBAREQUAL 42 -#define CIRCUMFLEXEQUAL 43 -#define LEFTSHIFTEQUAL 44 -#define RIGHTSHIFTEQUAL 45 -#define DOUBLESTAREQUAL 46 -#define DOUBLESLASH 47 +#define ENDMARKER 0 +#define NAME 1 +#define NUMBER 2 +#define STRING 3 +#define NEWLINE 4 +#define INDENT 5 +#define DEDENT 6 +#define LPAR 7 +#define RPAR 8 +#define LSQB 9 +#define RSQB 10 +#define COLON 11 +#define COMMA 12 +#define SEMI 13 +#define PLUS 14 +#define MINUS 15 +#define STAR 16 +#define SLASH 17 +#define VBAR 18 +#define AMPER 19 +#define LESS 20 +#define GREATER 21 +#define EQUAL 22 +#define DOT 23 +#define PERCENT 24 +#define LBRACE 25 +#define RBRACE 26 +#define EQEQUAL 27 +#define NOTEQUAL 28 +#define LESSEQUAL 29 +#define GREATEREQUAL 30 +#define TILDE 31 +#define CIRCUMFLEX 32 +#define LEFTSHIFT 33 +#define RIGHTSHIFT 34 +#define DOUBLESTAR 35 +#define PLUSEQUAL 36 +#define MINEQUAL 37 +#define STAREQUAL 38 +#define SLASHEQUAL 39 +#define PERCENTEQUAL 40 +#define AMPEREQUAL 41 +#define VBAREQUAL 42 +#define CIRCUMFLEXEQUAL 43 +#define 
LEFTSHIFTEQUAL 44 +#define RIGHTSHIFTEQUAL 45 +#define DOUBLESTAREQUAL 46 +#define DOUBLESLASH 47 #define DOUBLESLASHEQUAL 48 #define AT 49 -#define ATEQUAL 50 +#define ATEQUAL 50 #define RARROW 51 #define ELLIPSIS 52 /* Don't forget to update the table _Ta3Parser_TokenNames in tokenizer.c! */ -#define OP 53 -#define AWAIT 54 -#define ASYNC 55 +#define OP 53 +#define AWAIT 54 +#define ASYNC 55 #define TYPE_IGNORE 56 #define TYPE_COMMENT 57 -#define ERRORTOKEN 58 -#define N_TOKENS 59 +#define ERRORTOKEN 58 +/* These aren't used by the C tokenizer but are needed for tokenize.py */ +#define COMMENT 59 +#define NL 60 +#define ENCODING 61 +#define N_TOKENS 62 /* Special definitions for cooperation with parser */ -#define NT_OFFSET 256 +#define NT_OFFSET 256 -#define ISTERMINAL(x) ((x) < NT_OFFSET) -#define ISNONTERMINAL(x) ((x) >= NT_OFFSET) -#define ISEOF(x) ((x) == ENDMARKER) +#define ISTERMINAL(x) ((x) < NT_OFFSET) +#define ISNONTERMINAL(x) ((x) >= NT_OFFSET) +#define ISEOF(x) ((x) == ENDMARKER) -extern const char *_Ta3Parser_TokenNames[]; /* Token names */ +extern const char * _Ta3Parser_TokenNames[]; /* Token names */ extern int Ta3Token_OneChar(int); extern int Ta3Token_TwoChars(int, int); extern int Ta3Token_ThreeChars(int, int, int);
ast3/Parser/asdl_c.py+130 −131 modified@@ -94,8 +94,9 @@ def emit(self, s, depth, reflow=True): else: lines = [s] for line in lines: - line = (" " * TABSIZE * depth) + line + "\n" - self.file.write(line) + if line: + line = (" " * TABSIZE * depth) + line + self.file.write(line + "\n") class TypeDefVisitor(EmitVisitor): @@ -496,18 +497,31 @@ def isSimpleType(self, field): def visitField(self, field, name, sum=None, prod=None, depth=0): ctype = get_c_type(field.type) - if field.opt: - check = "exists_not_none(obj, &PyId_%s)" % (field.name,) + self.emit("if (lookup_attr_id(obj, &PyId_%s, &tmp) < 0) {" % field.name, depth) + self.emit("return 1;", depth+1) + self.emit("}", depth) + if not field.opt: + self.emit("if (tmp == NULL) {", depth) + message = "required field \\\"%s\\\" missing from %s" % (field.name, name) + format = "PyErr_SetString(PyExc_TypeError, \"%s\");" + self.emit(format % message, depth+1, reflow=False) + self.emit("return 1;", depth+1) else: - check = "_PyObject_HasAttrId(obj, &PyId_%s)" % (field.name,) - self.emit("if (%s) {" % (check,), depth, reflow=False) + self.emit("if (tmp == NULL || tmp == Py_None) {", depth) + self.emit("Py_CLEAR(tmp);", depth+1) + if self.isNumeric(field): + self.emit("%s = 0;" % field.name, depth+1) + elif not self.isSimpleType(field): + self.emit("%s = NULL;" % field.name, depth+1) + else: + raise TypeError("could not determine the default value for %s" % field.name) + self.emit("}", depth) + self.emit("else {", depth) + self.emit("int res;", depth+1) if field.seq: self.emit("Py_ssize_t len;", depth+1) self.emit("Py_ssize_t i;", depth+1) - self.emit("tmp = _PyObject_GetAttrId(obj, &PyId_%s);" % field.name, depth+1) - self.emit("if (tmp == NULL) goto failed;", depth+1) - if field.seq: self.emit("if (!PyList_Check(tmp)) {", depth+1) self.emit("PyErr_Format(PyExc_TypeError, \"%s field \\\"%s\\\" must " "be a list, not a %%.200s\", tmp->ob_type->tp_name);" % @@ -522,8 +536,8 @@ def visitField(self, field, name, sum=None, 
prod=None, depth=0): self.emit("%s = _Ta3_asdl_seq_new(len, arena);" % field.name, depth+1) self.emit("if (%s == NULL) goto failed;" % field.name, depth+1) self.emit("for (i = 0; i < len; i++) {", depth+1) - self.emit("%s value;" % ctype, depth+2) - self.emit("res = obj2ast_%s(PyList_GET_ITEM(tmp, i), &value, arena);" % + self.emit("%s val;" % ctype, depth+2) + self.emit("res = obj2ast_%s(PyList_GET_ITEM(tmp, i), &val, arena);" % field.type, depth+2, reflow=False) self.emit("if (res != 0) goto failed;", depth+2) self.emit("if (len != PyList_GET_SIZE(tmp)) {", depth+2) @@ -533,27 +547,14 @@ def visitField(self, field, name, sum=None, prod=None, depth=0): depth+3, reflow=False) self.emit("goto failed;", depth+3) self.emit("}", depth+2) - self.emit("asdl_seq_SET(%s, i, value);" % field.name, depth+2) + self.emit("asdl_seq_SET(%s, i, val);" % field.name, depth+2) self.emit("}", depth+1) else: self.emit("res = obj2ast_%s(tmp, &%s, arena);" % (field.type, field.name), depth+1) self.emit("if (res != 0) goto failed;", depth+1) self.emit("Py_CLEAR(tmp);", depth+1) - self.emit("} else {", depth) - if not field.opt: - message = "required field \\\"%s\\\" missing from %s" % (field.name, name) - format = "PyErr_SetString(PyExc_TypeError, \"%s\");" - self.emit(format % message, depth+1, reflow=False) - self.emit("return 1;", depth+1) - else: - if self.isNumeric(field): - self.emit("%s = 0;" % field.name, depth+1) - elif not self.isSimpleType(field): - self.emit("%s = NULL;" % field.name, depth+1) - else: - raise TypeError("could not determine the default value for %s" % field.name) self.emit("}", depth) @@ -622,6 +623,9 @@ class PyTypesVisitor(PickleVisitor): def visitModule(self, mod): self.emit(""" +_Py_IDENTIFIER(_fields); +_Py_IDENTIFIER(_attributes); + typedef struct { PyObject_HEAD PyObject *dict; @@ -630,6 +634,8 @@ def visitModule(self, mod): static void ast_dealloc(AST_object *self) { + /* bpo-31095: UnTrack is needed before calling any callbacks */ + 
PyObject_GC_UnTrack(self); Py_CLEAR(self->dict); Py_TYPE(self)->tp_free(self); } @@ -641,50 +647,65 @@ def visitModule(self, mod): return 0; } -static void +static int ast_clear(AST_object *self) { Py_CLEAR(self->dict); + return 0; +} + +static int lookup_attr_id(PyObject *v, _Py_Identifier *name, PyObject **result) +{ + PyObject *oname = _PyUnicode_FromId(name); /* borrowed */ + if (!oname) { + *result = NULL; + return -1; + } + *result = PyObject_GetAttr(v, oname); + if (*result == NULL) { + if (!PyErr_ExceptionMatches(PyExc_AttributeError)) { + return -1; + } + PyErr_Clear(); + } + return 0; } static int ast_type_init(PyObject *self, PyObject *args, PyObject *kw) { - _Py_IDENTIFIER(_fields); Py_ssize_t i, numfields = 0; int res = -1; PyObject *key, *value, *fields; - fields = _PyObject_GetAttrId((PyObject*)Py_TYPE(self), &PyId__fields); - if (!fields) - PyErr_Clear(); + if (lookup_attr_id((PyObject*)Py_TYPE(self), &PyId__fields, &fields) < 0) { + goto cleanup; + } if (fields) { numfields = PySequence_Size(fields); if (numfields == -1) goto cleanup; } + res = 0; /* if no error occurs, this stays 0 to the end */ - if (PyTuple_GET_SIZE(args) > 0) { - if (numfields != PyTuple_GET_SIZE(args)) { - PyErr_Format(PyExc_TypeError, "%.400s constructor takes %s" - "%zd positional argument%s", - Py_TYPE(self)->tp_name, - numfields == 0 ? "" : "either 0 or ", - numfields, numfields == 1 ? "" : "s"); + if (numfields < PyTuple_GET_SIZE(args)) { + PyErr_Format(PyExc_TypeError, "%.400s constructor takes at most " + "%zd positional argument%s", + Py_TYPE(self)->tp_name, + numfields, numfields == 1 ? 
"" : "s"); + res = -1; + goto cleanup; + } + for (i = 0; i < PyTuple_GET_SIZE(args); i++) { + /* cannot be reached when fields is NULL */ + PyObject *name = PySequence_GetItem(fields, i); + if (!name) { res = -1; goto cleanup; } - for (i = 0; i < PyTuple_GET_SIZE(args); i++) { - /* cannot be reached when fields is NULL */ - PyObject *name = PySequence_GetItem(fields, i); - if (!name) { - res = -1; - goto cleanup; - } - res = PyObject_SetAttr(self, name, PyTuple_GET_ITEM(args, i)); - Py_DECREF(name); - if (res < 0) - goto cleanup; - } + res = PyObject_SetAttr(self, name, PyTuple_GET_ITEM(args, i)); + Py_DECREF(name); + if (res < 0) + goto cleanup; } if (kw) { i = 0; /* needed by PyDict_Next */ @@ -703,19 +724,13 @@ def visitModule(self, mod): static PyObject * ast_type_reduce(PyObject *self, PyObject *unused) { - PyObject *res; _Py_IDENTIFIER(__dict__); - PyObject *dict = _PyObject_GetAttrId(self, &PyId___dict__); - if (dict == NULL) { - if (PyErr_ExceptionMatches(PyExc_AttributeError)) - PyErr_Clear(); - else - return NULL; + PyObject *dict; + if (lookup_attr_id(self, &PyId___dict__, &dict) < 0) { + return NULL; } if (dict) { - res = Py_BuildValue("O()O", Py_TYPE(self), dict); - Py_DECREF(dict); - return res; + return Py_BuildValue("O()N", Py_TYPE(self), dict); } return Py_BuildValue("O()", Py_TYPE(self)); } @@ -775,6 +790,8 @@ def visitModule(self, mod): static PyTypeObject* make_type(char *type, PyTypeObject* base, char**fields, int num_fields) { + _Py_IDENTIFIER(__module__); + _Py_IDENTIFIER(_ast3); PyObject *fnames, *result; int i; fnames = PyTuple_New(num_fields); @@ -787,16 +804,18 @@ def visitModule(self, mod): } PyTuple_SET_ITEM(fnames, i, field); } - result = PyObject_CallFunction((PyObject*)&PyType_Type, "s(O){sOss}", - type, base, "_fields", fnames, "__module__", "_ast3"); + result = PyObject_CallFunction((PyObject*)&PyType_Type, "s(O){OOOO}", + type, base, + _PyUnicode_FromId(&PyId__fields), fnames, + _PyUnicode_FromId(&PyId___module__), + 
_PyUnicode_FromId(&PyId__ast3)); Py_DECREF(fnames); return (PyTypeObject*)result; } static int add_attributes(PyTypeObject* type, char**attrs, int num_fields) { int i, result; - _Py_IDENTIFIER(_attributes); PyObject *s, *l = PyTuple_New(num_fields); if (!l) return 0; @@ -942,28 +961,15 @@ def visitModule(self, mod): d = AST_type.tp_dict; empty_tuple = PyTuple_New(0); if (!empty_tuple || - PyDict_SetItemString(d, "_fields", empty_tuple) < 0 || - PyDict_SetItemString(d, "_attributes", empty_tuple) < 0) { + _PyDict_SetItemId(d, &PyId__fields, empty_tuple) < 0 || + _PyDict_SetItemId(d, &PyId__attributes, empty_tuple) < 0) { Py_XDECREF(empty_tuple); return -1; } Py_DECREF(empty_tuple); return 0; } -static int exists_not_none(PyObject *obj, _Py_Identifier *id) -{ - int isnone; - PyObject *attr = _PyObject_GetAttrId(obj, id); - if (!attr) { - PyErr_Clear(); - return 0; - } - isnone = attr == Py_None; - Py_DECREF(attr); - return !isnone; -} - """, 0, reflow=False) self.emit("static int init_types(void)",0) @@ -1021,22 +1027,20 @@ def visitConstructor(self, cons, name, simple): class ASTModuleVisitor(PickleVisitor): def visitModule(self, mod): - # add parse method to module - self.emit('PyObject *ast3_parse(PyObject *self, PyObject *args);', 0) - self.emit('static PyMethodDef ast3_methods[] = {', 0) - self.emit('{"_parse", ast3_parse, METH_VARARGS, "Parse string into typed AST."},', 1) - self.emit('{NULL, NULL, 0, NULL}', 1) - self.emit('};', 0) - - self.emit("static struct PyModuleDef _astmodule3 = {", 0) - self.emit(' PyModuleDef_HEAD_INIT, "_ast3", NULL, 0, ast3_methods', 0) + self.emit("PyObject *ast3_parse(PyObject *self, PyObject *args);", 0) + self.emit("static PyMethodDef ast3_methods[] = {", 0) + self.emit(' {"_parse", ast3_parse, METH_VARARGS, "Parse string into typed AST."},', 0) + self.emit(" {NULL, NULL, 0, NULL}", 0) + self.emit("};", 0) + self.emit("static struct PyModuleDef _astmodule = {", 0) + self.emit(' PyModuleDef_HEAD_INIT, "_ast3", NULL, 0, 
ast3_methods', 0) self.emit("};", 0) self.emit("PyMODINIT_FUNC", 0) self.emit("PyInit__ast3(void)", 0) self.emit("{", 0) self.emit("PyObject *m, *d;", 1) self.emit("if (!init_types()) return NULL;", 1) - self.emit('m = PyModule_Create(&_astmodule3);', 1) + self.emit('m = PyModule_Create(&_astmodule);', 1) self.emit("if (!m) return NULL;", 1) self.emit("d = PyModule_GetDict(m);", 1) self.emit('if (PyDict_SetItemString(d, "AST", (PyObject*)&AST_type) < 0) return NULL;', 1) @@ -1098,8 +1102,7 @@ def func_begin(self, name): self.emit("%s o = (%s)_o;" % (ctype, ctype), 1) self.emit("PyObject *result = NULL, *value = NULL;", 1) self.emit('if (!o) {', 1) - self.emit("Py_INCREF(Py_None);", 2) - self.emit('return Py_None;', 2) + self.emit("Py_RETURN_NONE;", 2) self.emit("}", 1) self.emit('', 0) @@ -1286,59 +1289,55 @@ def main(srcfile, dump_module=False): print(mod) if not asdl.check(mod): sys.exit(1) - if INC_DIR: - p = "%s/%s-ast.h" % (INC_DIR, mod.name) - f = open(p, "w") - f.write(auto_gen_msg) - f.write('#include "asdl.h"\n\n') - c = ChainOfVisitors(TypeDefVisitor(f), - StructVisitor(f), - PrototypeVisitor(f), - ) - c.visit(mod) - f.write("PyObject* Ta3AST_mod2obj(mod_ty t);\n") - f.write("mod_ty Ta3AST_obj2mod(PyObject* ast, PyArena* arena, int mode);\n") - f.write("int Ta3AST_Check(PyObject* obj);\n") - f.close() - - if SRC_DIR: - p = os.path.join(SRC_DIR, str(mod.name) + "-ast.c") - f = open(p, "w") - f.write(auto_gen_msg) - f.write('#include <stddef.h>\n') - f.write('\n') - f.write('#include "Python.h"\n') - f.write('#include "%s-ast.h"\n' % mod.name) - f.write('\n') - f.write("static PyTypeObject AST_type;\n") - v = ChainOfVisitors( - PyTypesDeclareVisitor(f), - PyTypesVisitor(f), - Obj2ModPrototypeVisitor(f), - FunctionVisitor(f), - ObjVisitor(f), - Obj2ModVisitor(f), - ASTModuleVisitor(f), - PartingShots(f), - ) - v.visit(mod) - f.close() + if H_FILE: + with open(H_FILE, "w") as f: + f.write(auto_gen_msg) + f.write('#include "asdl.h"\n\n') + c = 
ChainOfVisitors(TypeDefVisitor(f), + StructVisitor(f), + PrototypeVisitor(f), + ) + c.visit(mod) + f.write("PyObject* Ta3AST_mod2obj(mod_ty t);\n") + f.write("mod_ty Ta3AST_obj2mod(PyObject* ast, PyArena* arena, int mode);\n") + f.write("int Ta3AST_Check(PyObject* obj);\n") + + if C_FILE: + with open(C_FILE, "w") as f: + f.write(auto_gen_msg) + f.write('#include <stddef.h>\n') + f.write('\n') + f.write('#include "Python.h"\n') + f.write('#include "%s-ast.h"\n' % mod.name) + f.write('\n') + f.write("static PyTypeObject AST_type;\n") + v = ChainOfVisitors( + PyTypesDeclareVisitor(f), + PyTypesVisitor(f), + Obj2ModPrototypeVisitor(f), + FunctionVisitor(f), + ObjVisitor(f), + Obj2ModVisitor(f), + ASTModuleVisitor(f), + PartingShots(f), + ) + v.visit(mod) if __name__ == "__main__": import getopt - INC_DIR = '' - SRC_DIR = '' + H_FILE = '' + C_FILE = '' dump_module = False opts, args = getopt.getopt(sys.argv[1:], "dh:c:") for o, v in opts: if o == '-h': - INC_DIR = v + H_FILE = v if o == '-c': - SRC_DIR = v + C_FILE = v if o == '-d': dump_module = True - if INC_DIR and SRC_DIR: + if H_FILE and C_FILE: print('Must specify exactly one output file') sys.exit(1) elif len(args) != 1:
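The first hunk of ast3/Parser/asdl_c.py changes `emit` so that empty lines are written without indentation, which avoids trailing whitespace in the generated C and appears to be what produces the blank-line-only hunks in Python-ast.h. A self-contained sketch of the new behavior follows. The free-function form, the `out` parameter, and `TABSIZE = 4` are assumptions for illustration, and the real method also reflows long lines.

```python
import io

TABSIZE = 4  # assumed indentation width


def emit(out, lines, depth):
    """Write lines at the given depth, leaving empty lines unindented."""
    for line in lines:
        if line:
            # Only non-empty lines receive indentation.
            line = (" " * TABSIZE * depth) + line
        out.write(line + "\n")


buf = io.StringIO()
emit(buf, ["struct {", "", "} v;"], 1)
result = buf.getvalue()
```

With the old code the middle line would have been four spaces followed by a newline. With the new code it is a bare newline.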
ast3/Parser/grammar1.c +1 −2 modified
@@ -25,8 +25,7 @@ Ta3Grammar_FindDFA(grammar *g, int type)
         if (d->d_type == type)
             return d;
     }
-    assert(0);
-    /* NOTREACHED */
+    abort();
 #endif
 }
ast3/Parser/parser.h+10 −10 modified@@ -10,23 +10,23 @@ extern "C" { #define MAXSTACK 1500 typedef struct { - int s_state; /* State in current DFA */ - dfa *s_dfa; /* Current DFA */ - struct _node *s_parent; /* Where to add next node */ + int s_state; /* State in current DFA */ + dfa *s_dfa; /* Current DFA */ + struct _node *s_parent; /* Where to add next node */ } stackentry; typedef struct { - stackentry *s_top; /* Top entry */ - stackentry s_base[MAXSTACK];/* Array of stack entries */ - /* NB The stack grows down */ + stackentry *s_top; /* Top entry */ + stackentry s_base[MAXSTACK];/* Array of stack entries */ + /* NB The stack grows down */ } stack; typedef struct { - stack p_stack; /* Stack of parser states */ - grammar *p_grammar; /* Grammar to use */ - node *p_tree; /* Top of parse tree */ + stack p_stack; /* Stack of parser states */ + grammar *p_grammar; /* Grammar to use */ + node *p_tree; /* Top of parse tree */ #ifdef PY_PARSER_REQUIRES_FUTURE_KEYWORD - unsigned long p_flags; /* see co_flags in Include/code.h */ + unsigned long p_flags; /* see co_flags in Include/code.h */ #endif } parser_state;
ast3/Parser/parsetok.c +8 −5 modified
@@ -64,6 +64,8 @@ Ta3Parser_ParseStringObject(const char *s, PyObject *filename,
     Py_INCREF(err_ret->filename);
     tok->filename = err_ret->filename;
 #endif
+    if (*flags & PyPARSE_ASYNC_ALWAYS)
+        tok->async_always = 1;
     return parsetok(tok, g, start, err_ret, flags);
 }
 
@@ -264,7 +266,7 @@ parsetok(struct tok_state *tok, grammar *g, int start, perrdetail *err_ret,
             }
             else
                 started = 1;
-            len = b - a; /* XXX this may compute NULL - NULL */
+            len = (a != NULL && b != NULL) ? b - a : 0;
             str = (char *) PyObject_MALLOC(len + 1);
             if (str == NULL) {
                 err_ret->error = E_NOMEM;
@@ -285,18 +287,19 @@ parsetok(struct tok_state *tok, grammar *g, int start, perrdetail *err_ret,
             else if ((ps->p_flags & CO_FUTURE_BARRY_AS_BDFL) &&
                             strcmp(str, "<>")) {
                 PyObject_FREE(str);
-                err_ret->text = "with Barry as BDFL, use '<>' "
-                                "instead of '!='";
+                err_ret->expected = NOTEQUAL;
                 err_ret->error = E_SYNTAX;
                 break;
             }
         }
 #endif
-        if (a >= tok->line_start)
+        if (a != NULL && a >= tok->line_start) {
             col_offset = Py_SAFE_DOWNCAST(a - tok->line_start,
                                           intptr_t, int);
-        else
+        }
+        else {
             col_offset = -1;
+        }
 
         if (type == TYPE_IGNORE) {
             if (!growable_int_array_add(&type_ignores, tok->lineno)) {
ast3/Parser/Python.asdl +1 −1 modified
@@ -77,7 +77,7 @@ module Python
          | Compare(expr left, cmpop* ops, expr* comparators)
          | Call(expr func, expr* args, keyword* keywords)
          | Num(object n) -- a number as a PyObject.
-         | Str(string s, string kind)
+         | Str(string s, string kind) -- need to specify raw, unicode, etc?
          | FormattedValue(expr value, int? conversion, expr? format_spec)
          | JoinedStr(expr* values)
          | Bytes(bytes s)
ast3/Parser/tokenizer.c +29 −43 modified
@@ -27,6 +27,13 @@
     } while (0)
 #endif /* Py_XSETREF */
 
+#ifndef _PyObject_CallNoArg
+#define _PyObject_CallNoArg(func) PyObject_CallObject(func, NULL)
+#endif
+
+/* Alternate tab spacing */
+#define ALTTABSIZE 1
+
 #define is_potential_identifier_start(c) (\
               (c >= 'a' && c <= 'z')\
                || (c >= 'A' && c <= 'Z')\
@@ -40,12 +47,7 @@
                || c == '_'\
                || (c >= 128))
 
-#if PY_MINOR_VERSION >= 4
 PyAPI_FUNC(char *) PyOS_Readline(FILE *, FILE *, const char *);
-#else
-// Python 3.3 doesn't have PyAPI_FUNC, but it's not supported on Windows anyway.
-char *PyOS_Readline(FILE *, FILE *, char *);
-#endif
 /* Return malloc'ed string including trailing \n;
    empty malloc'ed string for EOF;
    NULL if interrupted */
@@ -122,6 +124,9 @@ const char *_Ta3Parser_TokenNames[] = {
     "TYPE_IGNORE",
     "TYPE_COMMENT",
     "<ERRORTOKEN>",
+    "COMMENT",
+    "NL",
+    "ENCODING",
     "<N_TOKENS>"
 };
 
@@ -152,9 +157,6 @@ tok_new(void)
     tok->prompt = tok->nextprompt = NULL;
     tok->lineno = 0;
     tok->level = 0;
-    tok->altwarning = 1;
-    tok->alterror = 1;
-    tok->alttabsize = 1;
     tok->altindstack[0] = 0;
     tok->decoding_state = STATE_INIT;
     tok->decoding_erred = 0;
@@ -171,6 +173,7 @@ tok_new(void)
     tok->async_def = 0;
     tok->async_def_indent = 0;
     tok->async_def_nl = 0;
+    tok->async_always = 0;
 
     return tok;
 }
@@ -460,7 +463,7 @@ fp_readl(char *s, int size, struct tok_state *tok)
     }
     else
     {
-        bufobj = PyObject_CallObject(tok->decoding_readline, NULL);
+        bufobj = _PyObject_CallNoArg(tok->decoding_readline);
         if (bufobj == NULL)
             goto error;
     }
@@ -553,7 +556,7 @@ fp_setreadl(struct tok_state *tok, const char* enc)
     Py_XSETREF(tok->decoding_readline, readline);
 
     if (pos > 0) {
-        PyObject *bufobj = PyObject_CallObject(readline, NULL);
+        PyObject *bufobj = _PyObject_CallNoArg(readline);
         if (bufobj == NULL)
             return 0;
         Py_DECREF(bufobj);
@@ -670,7 +673,7 @@ decoding_feof(struct tok_state *tok)
     } else {
         PyObject* buf = tok->decoding_buffer;
         if (buf == NULL) {
-            buf = PyObject_CallObject(tok->decoding_readline, NULL);
+            buf = _PyObject_CallNoArg(tok->decoding_readline);
             if (buf == NULL) {
                 error_ret(tok);
                 return 1;
@@ -976,6 +979,11 @@ tok_nextc(struct tok_state *tok)
                     buflen = PyBytes_GET_SIZE(u);
                     buf = PyBytes_AS_STRING(u);
                     newtok = PyMem_MALLOC(buflen+1);
+                    if (newtok == NULL) {
+                        Py_DECREF(u);
+                        tok->done = E_NOMEM;
+                        return EOF;
+                    }
                     strcpy(newtok, buf);
                     Py_DECREF(u);
                 }
@@ -1306,22 +1314,9 @@ Ta3Token_ThreeChars(int c1, int c2, int c3)
 static int
 indenterror(struct tok_state *tok)
 {
-    if (tok->alterror) {
-        tok->done = E_TABSPACE;
-        tok->cur = tok->inp;
-        return 1;
-    }
-    if (tok->altwarning) {
-#ifdef PGEN
-        PySys_WriteStderr("inconsistent use of tabs and spaces "
-                          "in indentation\n");
-#else
-        PySys_FormatStderr("%U: inconsistent use of tabs and spaces "
-                          "in indentation\n", tok->filename);
-#endif
-        tok->altwarning = 0;
-    }
-    return 0;
+    tok->done = E_TABSPACE;
+    tok->cur = tok->inp;
+    return ERRORTOKEN;
 }
 
 #ifdef PGEN
@@ -1401,9 +1396,8 @@ tok_get(struct tok_state *tok, char **p_start, char **p_end)
                     col++, altcol++;
                 }
                 else if (c == '\t') {
-                    col = (col/tok->tabsize + 1) * tok->tabsize;
-                    altcol = (altcol/tok->alttabsize + 1)
-                        * tok->alttabsize;
+                    col = (col / tok->tabsize + 1) * tok->tabsize;
+                    altcol = (altcol / ALTTABSIZE + 1) * ALTTABSIZE;
                 }
                 else if (c == '\014') {/* Control-L (formfeed) */
                     col = altcol = 0; /* For Emacs users */
@@ -1432,9 +1426,7 @@ tok_get(struct tok_state *tok, char **p_start, char **p_end)
                 if (col == tok->indstack[tok->indent]) {
                     /* No change */
                     if (altcol != tok->altindstack[tok->indent]) {
-                        if (indenterror(tok)) {
-                            return ERRORTOKEN;
-                        }
+                        return indenterror(tok);
                     }
                 }
                 else if (col > tok->indstack[tok->indent]) {
@@ -1445,9 +1437,7 @@ tok_get(struct tok_state *tok, char **p_start, char **p_end)
                         return ERRORTOKEN;
                     }
                     if (altcol <= tok->altindstack[tok->indent]) {
-                        if (indenterror(tok)) {
-                            return ERRORTOKEN;
-                        }
+                        return indenterror(tok);
                     }
                     tok->pendin++;
                     tok->indstack[++tok->indent] = col;
@@ -1466,9 +1456,7 @@ tok_get(struct tok_state *tok, char **p_start, char **p_end)
                         return ERRORTOKEN;
                     }
                     if (altcol != tok->altindstack[tok->indent]) {
-                        if (indenterror(tok)) {
-                            return ERRORTOKEN;
-                        }
+                        return indenterror(tok);
                     }
                 }
             }
@@ -1574,7 +1562,7 @@ tok_get(struct tok_state *tok, char **p_start, char **p_end)
     /* Identifier (most frequent token!) */
     nonascii = 0;
     if (is_potential_identifier_start(c)) {
-        /* Process b"", r"", u"", br"" and rb"" */
+        /* Process the various legal combinations of b"", r"", u"", and f"". */
         int saw_b = 0, saw_r = 0, saw_u = 0, saw_f = 0;
         while (1) {
             if (!(saw_b || saw_u || saw_f) && (c == 'b' || c == 'B'))
@@ -1616,7 +1604,7 @@ tok_get(struct tok_state *tok, char **p_start, char **p_end)
         /* async/await parsing block. */
         if (tok->cur - tok->start == 5) {
             /* Current token length is 5. */
-            if (tok->async_def) {
+            if (tok->async_always || tok->async_def) {
                 /* We're inside an 'async def' function. */
                 if (memcmp(tok->start, "async", 5) == 0) {
                     return ASYNC;
@@ -1988,9 +1976,7 @@ Ta3Tokenizer_FindEncodingFilename(int fd, PyObject *filename)
     char *p_start =NULL , *p_end =NULL , *encoding = NULL;
 
 #ifndef PGEN
-#if PY_MINOR_VERSION >= 4
     fd = _Py_dup(fd);
-#endif
 #else
     fd = dup(fd);
 #endif
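The `col`/`altcol` bookkeeping in the `tok_get` hunks of tokenizer.c is easier to follow with concrete numbers. The following is a rough Python model of that C logic, not code from typed_ast: `indent_columns` and `same_level_is_inconsistent` are hypothetical names, `TABSIZE = 8` is assumed to match the default `tok->tabsize`, and only the "no change in `col`" branch of the indentation check is modelled.

```python
TABSIZE = 8      # assumed default for tok->tabsize
ALTTABSIZE = 1   # value of the ALTTABSIZE macro added in the diff


def indent_columns(prefix):
    """Return (col, altcol) for a line's leading whitespace."""
    col = altcol = 0
    for ch in prefix:
        if ch == ' ':
            col += 1
            altcol += 1
        elif ch == '\t':
            # Advance to the next tab stop under each tab size.
            col = (col // TABSIZE + 1) * TABSIZE
            altcol = (altcol // ALTTABSIZE + 1) * ALTTABSIZE
        elif ch == '\f':
            # Form feed resets both counters, as in the C code.
            col = altcol = 0
        else:
            break
    return col, altcol


def same_level_is_inconsistent(prev_prefix, cur_prefix):
    """True when col matches the previous level but altcol does not,
    the case where the patched indenterror() is reached."""
    prev_col, prev_altcol = indent_columns(prev_prefix)
    col, altcol = indent_columns(cur_prefix)
    return col == prev_col and altcol != prev_altcol
```

Under these assumptions a single tab gives `(8, 1)` while eight spaces give `(8, 8)`, so a block that switches between the two at the same visual depth trips the check. After the patch that path always sets `E_TABSPACE` and returns `ERRORTOKEN`, since the warning-only branch was removed.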
ast3/Parser/tokenizer.h +1 −5 modified
@@ -47,9 +47,6 @@ struct tok_state {
                              (Grammar/Grammar). */
     PyObject *filename;
 #endif
-    int altwarning;     /* Issue warning if alternate tabs don't match */
-    int alterror;       /* Issue error if alternate tabs don't match */
-    int alttabsize;     /* Alternate tab spacing */
     int altindstack[MAXINDENT];         /* Stack of alternate indents */
     /* Stuff for PEP 0263 */
     enum decoding_state decoding_state;
@@ -72,6 +69,7 @@ struct tok_state {
     int async_def_indent; /* Indentation level of the outermost 'async def'. */
     int async_def_nl;     /* =1 if the outermost 'async def' had at least one
                              NEWLINE token after it. */
+    int async_always;     /* =1 if async/await are always keywords */
 };
 
 extern struct tok_state *Ta3Tokenizer_FromString(const char *, int);
@@ -80,8 +78,6 @@ extern struct tok_state *Ta3Tokenizer_FromFile(FILE *, const char*,
                                               const char *, const char *);
 extern void Ta3Tokenizer_Free(struct tok_state *);
 extern int Ta3Tokenizer_Get(struct tok_state *, char **, char **);
-extern char * PyTokenizer_RestoreEncoding(struct tok_state* tok,
-                                          int len, int *offset);
 
 #ifdef __cplusplus
 }
ast3/Pgen/dynamic_annotations.c+0 −154 removed@@ -1,154 +0,0 @@ -/* Copyright (c) 2008-2009, Google Inc. - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions are - * met: - * - * * Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * * Neither the name of Google Inc. nor the names of its - * contributors may be used to endorse or promote products derived from - * this software without specific prior written permission. - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT - * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - * - * --- - * Author: Kostya Serebryany - */ - -#ifdef _MSC_VER -# include <windows.h> -#endif - -#ifdef __cplusplus -# error "This file should be built as pure C to avoid name mangling" -#endif - -#include <stdlib.h> -#include <string.h> - -#include "dynamic_annotations.h" - -/* Each function is empty and called (via a macro) only in debug mode. - The arguments are captured by dynamic tools at runtime. 
*/ - -#if DYNAMIC_ANNOTATIONS_ENABLED == 1 - -void AnnotateRWLockCreate(const char *file, int line, - const volatile void *lock){} -void AnnotateRWLockDestroy(const char *file, int line, - const volatile void *lock){} -void AnnotateRWLockAcquired(const char *file, int line, - const volatile void *lock, long is_w){} -void AnnotateRWLockReleased(const char *file, int line, - const volatile void *lock, long is_w){} -void AnnotateBarrierInit(const char *file, int line, - const volatile void *barrier, long count, - long reinitialization_allowed) {} -void AnnotateBarrierWaitBefore(const char *file, int line, - const volatile void *barrier) {} -void AnnotateBarrierWaitAfter(const char *file, int line, - const volatile void *barrier) {} -void AnnotateBarrierDestroy(const char *file, int line, - const volatile void *barrier) {} - -void AnnotateCondVarWait(const char *file, int line, - const volatile void *cv, - const volatile void *lock){} -void AnnotateCondVarSignal(const char *file, int line, - const volatile void *cv){} -void AnnotateCondVarSignalAll(const char *file, int line, - const volatile void *cv){} -void AnnotatePublishMemoryRange(const char *file, int line, - const volatile void *address, - long size){} -void AnnotateUnpublishMemoryRange(const char *file, int line, - const volatile void *address, - long size){} -void AnnotatePCQCreate(const char *file, int line, - const volatile void *pcq){} -void AnnotatePCQDestroy(const char *file, int line, - const volatile void *pcq){} -void AnnotatePCQPut(const char *file, int line, - const volatile void *pcq){} -void AnnotatePCQGet(const char *file, int line, - const volatile void *pcq){} -void AnnotateNewMemory(const char *file, int line, - const volatile void *mem, - long size){} -void AnnotateExpectRace(const char *file, int line, - const volatile void *mem, - const char *description){} -void AnnotateBenignRace(const char *file, int line, - const volatile void *mem, - const char *description){} -void 
AnnotateBenignRaceSized(const char *file, int line, - const volatile void *mem, - long size, - const char *description) {} -void AnnotateMutexIsUsedAsCondVar(const char *file, int line, - const volatile void *mu){} -void AnnotateTraceMemory(const char *file, int line, - const volatile void *arg){} -void AnnotateThreadName(const char *file, int line, - const char *name){} -void AnnotateIgnoreReadsBegin(const char *file, int line){} -void AnnotateIgnoreReadsEnd(const char *file, int line){} -void AnnotateIgnoreWritesBegin(const char *file, int line){} -void AnnotateIgnoreWritesEnd(const char *file, int line){} -void AnnotateIgnoreSyncBegin(const char *file, int line){} -void AnnotateIgnoreSyncEnd(const char *file, int line){} -void AnnotateEnableRaceDetection(const char *file, int line, int enable){} -void AnnotateNoOp(const char *file, int line, - const volatile void *arg){} -void AnnotateFlushState(const char *file, int line){} - -static int GetRunningOnValgrind(void) { -#ifdef RUNNING_ON_VALGRIND - if (RUNNING_ON_VALGRIND) return 1; -#endif - -#ifndef _MSC_VER - char *running_on_valgrind_str = getenv("RUNNING_ON_VALGRIND"); - if (running_on_valgrind_str) { - return strcmp(running_on_valgrind_str, "0") != 0; - } -#else - /* Visual Studio issues warnings if we use getenv, - * so we use GetEnvironmentVariableA instead. - */ - char value[100] = "1"; - int res = GetEnvironmentVariableA("RUNNING_ON_VALGRIND", - value, sizeof(value)); - /* value will remain "1" if res == 0 or res >= sizeof(value). The latter - * can happen only if the given value is long, in this case it can't be "0". - */ - if (res > 0 && !strcmp(value, "0")) - return 1; -#endif - return 0; -} - -/* See the comments in dynamic_annotations.h */ -int RunningOnValgrind(void) { - static volatile int running_on_valgrind = -1; - /* C doesn't have thread-safe initialization of statics, and we - don't want to depend on pthread_once here, so hack it. 
*/ - int local_running_on_valgrind = running_on_valgrind; - if (local_running_on_valgrind == -1) - running_on_valgrind = local_running_on_valgrind = GetRunningOnValgrind(); - return local_running_on_valgrind; -} - -#endif /* DYNAMIC_ANNOTATIONS_ENABLED == 1 */
ast3/Pgen/firstsets.c+0 −113 removed@@ -1,113 +0,0 @@ - -/* Computation of FIRST stets */ - -#include "pgenheaders.h" -#include "grammar.h" -#include "token.h" - -extern int Py_DebugFlag; - -/* Forward */ -static void calcfirstset(grammar *, dfa *); - -void -addfirstsets(grammar *g) -{ - int i; - dfa *d; - - if (Py_DebugFlag) - printf("Adding FIRST sets ...\n"); - for (i = 0; i < g->g_ndfas; i++) { - d = &g->g_dfa[i]; - if (d->d_first == NULL) - calcfirstset(g, d); - } -} - -static void -calcfirstset(grammar *g, dfa *d) -{ - int i, j; - state *s; - arc *a; - int nsyms; - int *sym; - int nbits; - static bitset dummy; - bitset result; - int type; - dfa *d1; - label *l0; - - if (Py_DebugFlag) - printf("Calculate FIRST set for '%s'\n", d->d_name); - - if (dummy == NULL) - dummy = newbitset(1); - if (d->d_first == dummy) { - fprintf(stderr, "Left-recursion for '%s'\n", d->d_name); - return; - } - if (d->d_first != NULL) { - fprintf(stderr, "Re-calculating FIRST set for '%s' ???\n", - d->d_name); - } - d->d_first = dummy; - - l0 = g->g_ll.ll_label; - nbits = g->g_ll.ll_nlabels; - result = newbitset(nbits); - - sym = (int *)PyObject_MALLOC(sizeof(int)); - if (sym == NULL) - Py_FatalError("no mem for new sym in calcfirstset"); - nsyms = 1; - sym[0] = findlabel(&g->g_ll, d->d_type, (char *)NULL); - - s = &d->d_state[d->d_initial]; - for (i = 0; i < s->s_narcs; i++) { - a = &s->s_arc[i]; - for (j = 0; j < nsyms; j++) { - if (sym[j] == a->a_lbl) - break; - } - if (j >= nsyms) { /* New label */ - sym = (int *)PyObject_REALLOC(sym, - sizeof(int) * (nsyms + 1)); - if (sym == NULL) - Py_FatalError( - "no mem to resize sym in calcfirstset"); - sym[nsyms++] = a->a_lbl; - type = l0[a->a_lbl].lb_type; - if (ISNONTERMINAL(type)) { - d1 = Ta3Grammar_FindDFA(g, type); - if (d1->d_first == dummy) { - fprintf(stderr, - "Left-recursion below '%s'\n", - d->d_name); - } - else { - if (d1->d_first == NULL) - calcfirstset(g, d1); - mergebitset(result, - d1->d_first, nbits); - } - } - else if 
(ISTERMINAL(type)) { - addbit(result, a->a_lbl); - } - } - } - d->d_first = result; - if (Py_DebugFlag) { - printf("FIRST set for '%s': {", d->d_name); - for (i = 0; i < nbits; i++) { - if (testbit(result, i)) - printf(" %s", Ta3Grammar_LabelRepr(&l0[i])); - } - printf(" }\n"); - } - - PyObject_FREE(sym); -}
ast3/Pgen/listnode.c+0 −66 removed@@ -1,66 +0,0 @@ - -/* List a node on a file */ - -#include "pgenheaders.h" -#include "token.h" -#include "node.h" - -/* Forward */ -static void list1node(FILE *, node *); -static void listnode(FILE *, node *); - -void -PyNode_ListTree(node *n) -{ - listnode(stdout, n); -} - -static int level, atbol; - -static void -listnode(FILE *fp, node *n) -{ - level = 0; - atbol = 1; - list1node(fp, n); -} - -static void -list1node(FILE *fp, node *n) -{ - if (n == 0) - return; - if (ISNONTERMINAL(TYPE(n))) { - int i; - for (i = 0; i < NCH(n); i++) - list1node(fp, CHILD(n, i)); - } - else if (ISTERMINAL(TYPE(n))) { - switch (TYPE(n)) { - case INDENT: - ++level; - break; - case DEDENT: - --level; - break; - default: - if (atbol) { - int i; - for (i = 0; i < level; ++i) - fprintf(fp, "\t"); - atbol = 0; - } - if (TYPE(n) == NEWLINE) { - if (STR(n) != NULL) - fprintf(fp, "%s", STR(n)); - fprintf(fp, "\n"); - atbol = 1; - } - else - fprintf(fp, "%s ", STR(n)); - break; - } - } - else - fprintf(fp, "? "); -}
ast3/Pgen/metagrammar.c+0 −159 removed@@ -1,159 +0,0 @@ - -#include "pgenheaders.h" -#include "metagrammar.h" -#include "grammar.h" -#include "pgen.h" -static arc arcs_0_0[3] = { - {2, 0}, - {3, 0}, - {4, 1}, -}; -static arc arcs_0_1[1] = { - {0, 1}, -}; -static state states_0[2] = { - {3, arcs_0_0}, - {1, arcs_0_1}, -}; -static arc arcs_1_0[1] = { - {5, 1}, -}; -static arc arcs_1_1[1] = { - {6, 2}, -}; -static arc arcs_1_2[1] = { - {7, 3}, -}; -static arc arcs_1_3[1] = { - {3, 4}, -}; -static arc arcs_1_4[1] = { - {0, 4}, -}; -static state states_1[5] = { - {1, arcs_1_0}, - {1, arcs_1_1}, - {1, arcs_1_2}, - {1, arcs_1_3}, - {1, arcs_1_4}, -}; -static arc arcs_2_0[1] = { - {8, 1}, -}; -static arc arcs_2_1[2] = { - {9, 0}, - {0, 1}, -}; -static state states_2[2] = { - {1, arcs_2_0}, - {2, arcs_2_1}, -}; -static arc arcs_3_0[1] = { - {10, 1}, -}; -static arc arcs_3_1[2] = { - {10, 1}, - {0, 1}, -}; -static state states_3[2] = { - {1, arcs_3_0}, - {2, arcs_3_1}, -}; -static arc arcs_4_0[2] = { - {11, 1}, - {13, 2}, -}; -static arc arcs_4_1[1] = { - {7, 3}, -}; -static arc arcs_4_2[3] = { - {14, 4}, - {15, 4}, - {0, 2}, -}; -static arc arcs_4_3[1] = { - {12, 4}, -}; -static arc arcs_4_4[1] = { - {0, 4}, -}; -static state states_4[5] = { - {2, arcs_4_0}, - {1, arcs_4_1}, - {3, arcs_4_2}, - {1, arcs_4_3}, - {1, arcs_4_4}, -}; -static arc arcs_5_0[3] = { - {5, 1}, - {16, 1}, - {17, 2}, -}; -static arc arcs_5_1[1] = { - {0, 1}, -}; -static arc arcs_5_2[1] = { - {7, 3}, -}; -static arc arcs_5_3[1] = { - {18, 1}, -}; -static state states_5[4] = { - {3, arcs_5_0}, - {1, arcs_5_1}, - {1, arcs_5_2}, - {1, arcs_5_3}, -}; -static dfa dfas[6] = { - {256, "MSTART", 0, 2, states_0, - "\070\000\000"}, - {257, "RULE", 0, 5, states_1, - "\040\000\000"}, - {258, "RHS", 0, 2, states_2, - "\040\010\003"}, - {259, "ALT", 0, 2, states_3, - "\040\010\003"}, - {260, "ITEM", 0, 5, states_4, - "\040\010\003"}, - {261, "ATOM", 0, 4, states_5, - "\040\000\003"}, -}; -static label labels[19] = { - 
{0, "EMPTY"}, - {256, 0}, - {257, 0}, - {4, 0}, - {0, 0}, - {1, 0}, - {11, 0}, - {258, 0}, - {259, 0}, - {18, 0}, - {260, 0}, - {9, 0}, - {10, 0}, - {261, 0}, - {16, 0}, - {14, 0}, - {3, 0}, - {7, 0}, - {8, 0}, -}; -static grammar _Ta3Parser_Grammar = { - 6, - dfas, - {19, labels}, - 256 -}; - -grammar * -meta_grammar(void) -{ - return &_Ta3Parser_Grammar; -} - -grammar * -Py_meta_grammar(void) -{ - return meta_grammar(); -}
ast3/Pgen/mysnprintf.c+0 −104 removed@@ -1,104 +0,0 @@ -#include "Python.h" - -/* snprintf() wrappers. If the platform has vsnprintf, we use it, else we - emulate it in a half-hearted way. Even if the platform has it, we wrap - it because platforms differ in what vsnprintf does in case the buffer - is too small: C99 behavior is to return the number of characters that - would have been written had the buffer not been too small, and to set - the last byte of the buffer to \0. At least MS _vsnprintf returns a - negative value instead, and fills the entire buffer with non-\0 data. - - The wrappers ensure that str[size-1] is always \0 upon return. - - PyOS_snprintf and PyOS_vsnprintf never write more than size bytes - (including the trailing '\0') into str. - - If the platform doesn't have vsnprintf, and the buffer size needed to - avoid truncation exceeds size by more than 512, Python aborts with a - Py_FatalError. - - Return value (rv): - - When 0 <= rv < size, the output conversion was unexceptional, and - rv characters were written to str (excluding a trailing \0 byte at - str[rv]). - - When rv >= size, output conversion was truncated, and a buffer of - size rv+1 would have been needed to avoid truncation. str[size-1] - is \0 in this case. - - When rv < 0, "something bad happened". str[size-1] is \0 in this - case too, but the rest of str is unreliable. It could be that - an error in format codes was detected by libc, or on platforms - with a non-C99 vsnprintf simply that the buffer wasn't big enough - to avoid truncation, or on platforms without any vsnprintf that - PyMem_Malloc couldn't obtain space for a temp buffer. - - CAUTION: Unlike C99, str != NULL and size > 0 are required. -*/ - -int -PyOS_snprintf(char *str, size_t size, const char *format, ...) 
-{ - int rc; - va_list va; - - va_start(va, format); - rc = PyOS_vsnprintf(str, size, format, va); - va_end(va); - return rc; -} - -int -PyOS_vsnprintf(char *str, size_t size, const char *format, va_list va) -{ - int len; /* # bytes written, excluding \0 */ -#ifdef HAVE_SNPRINTF -#define _PyOS_vsnprintf_EXTRA_SPACE 1 -#else -#define _PyOS_vsnprintf_EXTRA_SPACE 512 - char *buffer; -#endif - assert(str != NULL); - assert(size > 0); - assert(format != NULL); - /* We take a size_t as input but return an int. Sanity check - * our input so that it won't cause an overflow in the - * vsnprintf return value or the buffer malloc size. */ - if (size > INT_MAX - _PyOS_vsnprintf_EXTRA_SPACE) { - len = -666; - goto Done; - } - -#ifdef HAVE_SNPRINTF - len = vsnprintf(str, size, format, va); -#else - /* Emulate it. */ - buffer = PyMem_MALLOC(size + _PyOS_vsnprintf_EXTRA_SPACE); - if (buffer == NULL) { - len = -666; - goto Done; - } - - len = vsprintf(buffer, format, va); - if (len < 0) - /* ignore the error */; - - else if ((size_t)len >= size + _PyOS_vsnprintf_EXTRA_SPACE) - Py_FatalError("Buffer overflow in PyOS_snprintf/PyOS_vsnprintf"); - - else { - const size_t to_copy = (size_t)len < size ? - (size_t)len : size - 1; - assert(to_copy < size); - memcpy(str, buffer, to_copy); - str[to_copy] = '\0'; - } - PyMem_FREE(buffer); -#endif -Done: - if (size > 0) - str[size-1] = '\0'; - return len; -#undef _PyOS_vsnprintf_EXTRA_SPACE -}
ast3/Pgen/obmalloc.c+0 −2385 removed@@ -1,2385 +0,0 @@ -#include "Python.h" - -#include <stdbool.h> - - -/* Defined in tracemalloc.c */ -extern void _PyMem_DumpTraceback(int fd, const void *ptr); - - -/* Python's malloc wrappers (see pymem.h) */ - -#undef uint -#define uint unsigned int /* assuming >= 16 bits */ - -/* Forward declaration */ -static void* _PyMem_DebugRawMalloc(void *ctx, size_t size); -static void* _PyMem_DebugRawCalloc(void *ctx, size_t nelem, size_t elsize); -static void* _PyMem_DebugRawRealloc(void *ctx, void *ptr, size_t size); -static void _PyMem_DebugRawFree(void *ctx, void *p); - -static void* _PyMem_DebugMalloc(void *ctx, size_t size); -static void* _PyMem_DebugCalloc(void *ctx, size_t nelem, size_t elsize); -static void* _PyMem_DebugRealloc(void *ctx, void *ptr, size_t size); -static void _PyMem_DebugFree(void *ctx, void *p); - -static void _PyObject_DebugDumpAddress(const void *p); -static void _PyMem_DebugCheckAddress(char api_id, const void *p); - -#if defined(__has_feature) /* Clang */ - #if __has_feature(address_sanitizer) /* is ASAN enabled? */ - #define ATTRIBUTE_NO_ADDRESS_SAFETY_ANALYSIS \ - __attribute__((no_address_safety_analysis)) - #else - #define ATTRIBUTE_NO_ADDRESS_SAFETY_ANALYSIS - #endif -#else - #if defined(__SANITIZE_ADDRESS__) /* GCC 4.8.x, is ASAN enabled? 
*/ - #define ATTRIBUTE_NO_ADDRESS_SAFETY_ANALYSIS \ - __attribute__((no_address_safety_analysis)) - #else - #define ATTRIBUTE_NO_ADDRESS_SAFETY_ANALYSIS - #endif -#endif - -#ifdef WITH_PYMALLOC - -#ifdef MS_WINDOWS -# include <windows.h> -#elif defined(HAVE_MMAP) -# include <sys/mman.h> -# ifdef MAP_ANONYMOUS -# define ARENAS_USE_MMAP -# endif -#endif - -/* Forward declaration */ -static void* _PyObject_Malloc(void *ctx, size_t size); -static void* _PyObject_Calloc(void *ctx, size_t nelem, size_t elsize); -static void _PyObject_Free(void *ctx, void *p); -static void* _PyObject_Realloc(void *ctx, void *ptr, size_t size); -#endif - - -static void * -_PyMem_RawMalloc(void *ctx, size_t size) -{ - /* PyMem_RawMalloc(0) means malloc(1). Some systems would return NULL - for malloc(0), which would be treated as an error. Some platforms would - return a pointer with no memory behind it, which would break pymalloc. - To solve these problems, allocate an extra byte. */ - if (size == 0) - size = 1; - return malloc(size); -} - -static void * -_PyMem_RawCalloc(void *ctx, size_t nelem, size_t elsize) -{ - /* PyMem_RawCalloc(0, 0) means calloc(1, 1). Some systems would return NULL - for calloc(0, 0), which would be treated as an error. Some platforms - would return a pointer with no memory behind it, which would break - pymalloc. To solve these problems, allocate an extra byte. 
*/ - if (nelem == 0 || elsize == 0) { - nelem = 1; - elsize = 1; - } - return calloc(nelem, elsize); -} - -static void * -_PyMem_RawRealloc(void *ctx, void *ptr, size_t size) -{ - if (size == 0) - size = 1; - return realloc(ptr, size); -} - -static void -_PyMem_RawFree(void *ctx, void *ptr) -{ - free(ptr); -} - - -#ifdef MS_WINDOWS -static void * -_PyObject_ArenaVirtualAlloc(void *ctx, size_t size) -{ - return VirtualAlloc(NULL, size, - MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE); -} - -static void -_PyObject_ArenaVirtualFree(void *ctx, void *ptr, size_t size) -{ - VirtualFree(ptr, 0, MEM_RELEASE); -} - -#elif defined(ARENAS_USE_MMAP) -static void * -_PyObject_ArenaMmap(void *ctx, size_t size) -{ - void *ptr; - ptr = mmap(NULL, size, PROT_READ|PROT_WRITE, - MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); - if (ptr == MAP_FAILED) - return NULL; - assert(ptr != NULL); - return ptr; -} - -static void -_PyObject_ArenaMunmap(void *ctx, void *ptr, size_t size) -{ - munmap(ptr, size); -} - -#else -static void * -_PyObject_ArenaMalloc(void *ctx, size_t size) -{ - return malloc(size); -} - -static void -_PyObject_ArenaFree(void *ctx, void *ptr, size_t size) -{ - free(ptr); -} -#endif - - -#define PYRAW_FUNCS _PyMem_RawMalloc, _PyMem_RawCalloc, _PyMem_RawRealloc, _PyMem_RawFree -#ifdef WITH_PYMALLOC -# define PYOBJ_FUNCS _PyObject_Malloc, _PyObject_Calloc, _PyObject_Realloc, _PyObject_Free -#else -# define PYOBJ_FUNCS PYRAW_FUNCS -#endif -#define PYMEM_FUNCS PYOBJ_FUNCS - -typedef struct { - /* We tag each block with an API ID in order to tag API violations */ - char api_id; - PyMemAllocatorEx alloc; -} debug_alloc_api_t; -static struct { - debug_alloc_api_t raw; - debug_alloc_api_t mem; - debug_alloc_api_t obj; -} _PyMem_Debug = { - {'r', {NULL, PYRAW_FUNCS}}, - {'m', {NULL, PYMEM_FUNCS}}, - {'o', {NULL, PYOBJ_FUNCS}} - }; - -#define PYRAWDBG_FUNCS \ - _PyMem_DebugRawMalloc, _PyMem_DebugRawCalloc, _PyMem_DebugRawRealloc, _PyMem_DebugRawFree -#define PYDBG_FUNCS \ - _PyMem_DebugMalloc, 
_PyMem_DebugCalloc, _PyMem_DebugRealloc, _PyMem_DebugFree - -static PyMemAllocatorEx _PyMem_Raw = { -#ifdef Py_DEBUG - &_PyMem_Debug.raw, PYRAWDBG_FUNCS -#else - NULL, PYRAW_FUNCS -#endif - }; - -static PyMemAllocatorEx _PyMem = { -#ifdef Py_DEBUG - &_PyMem_Debug.mem, PYDBG_FUNCS -#else - NULL, PYMEM_FUNCS -#endif - }; - -static PyMemAllocatorEx _PyObject = { -#ifdef Py_DEBUG - &_PyMem_Debug.obj, PYDBG_FUNCS -#else - NULL, PYOBJ_FUNCS -#endif - }; - -int -_PyMem_SetupAllocators(const char *opt) -{ - if (opt == NULL || *opt == '\0') { - /* PYTHONMALLOC is empty or is not set or ignored (-E/-I command line - options): use default allocators */ -#ifdef Py_DEBUG -# ifdef WITH_PYMALLOC - opt = "pymalloc_debug"; -# else - opt = "malloc_debug"; -# endif -#else - /* !Py_DEBUG */ -# ifdef WITH_PYMALLOC - opt = "pymalloc"; -# else - opt = "malloc"; -# endif -#endif - } - - if (strcmp(opt, "debug") == 0) { - PyMem_SetupDebugHooks(); - } - else if (strcmp(opt, "malloc") == 0 || strcmp(opt, "malloc_debug") == 0) - { - PyMemAllocatorEx alloc = {NULL, PYRAW_FUNCS}; - - PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc); - PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc); - PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc); - - if (strcmp(opt, "malloc_debug") == 0) - PyMem_SetupDebugHooks(); - } -#ifdef WITH_PYMALLOC - else if (strcmp(opt, "pymalloc") == 0 - || strcmp(opt, "pymalloc_debug") == 0) - { - PyMemAllocatorEx raw_alloc = {NULL, PYRAW_FUNCS}; - PyMemAllocatorEx mem_alloc = {NULL, PYMEM_FUNCS}; - PyMemAllocatorEx obj_alloc = {NULL, PYOBJ_FUNCS}; - - PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &raw_alloc); - PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &mem_alloc); - PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &obj_alloc); - - if (strcmp(opt, "pymalloc_debug") == 0) - PyMem_SetupDebugHooks(); - } -#endif - else { - /* unknown allocator */ - return -1; - } - return 0; -} - -#undef PYRAW_FUNCS -#undef PYMEM_FUNCS -#undef PYOBJ_FUNCS -#undef PYRAWDBG_FUNCS -#undef PYDBG_FUNCS - -static PyObjectArenaAllocator 
_PyObject_Arena = {NULL, -#ifdef MS_WINDOWS - _PyObject_ArenaVirtualAlloc, _PyObject_ArenaVirtualFree -#elif defined(ARENAS_USE_MMAP) - _PyObject_ArenaMmap, _PyObject_ArenaMunmap -#else - _PyObject_ArenaMalloc, _PyObject_ArenaFree -#endif - }; - -#ifdef WITH_PYMALLOC -static int -_PyMem_DebugEnabled(void) -{ - return (_PyObject.malloc == _PyMem_DebugMalloc); -} - -int -_PyMem_PymallocEnabled(void) -{ - if (_PyMem_DebugEnabled()) { - return (_PyMem_Debug.obj.alloc.malloc == _PyObject_Malloc); - } - else { - return (_PyObject.malloc == _PyObject_Malloc); - } -} -#endif - -void -PyMem_SetupDebugHooks(void) -{ - PyMemAllocatorEx alloc; - - alloc.malloc = _PyMem_DebugRawMalloc; - alloc.calloc = _PyMem_DebugRawCalloc; - alloc.realloc = _PyMem_DebugRawRealloc; - alloc.free = _PyMem_DebugRawFree; - - if (_PyMem_Raw.malloc != _PyMem_DebugRawMalloc) { - alloc.ctx = &_PyMem_Debug.raw; - PyMem_GetAllocator(PYMEM_DOMAIN_RAW, &_PyMem_Debug.raw.alloc); - PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc); - } - - alloc.malloc = _PyMem_DebugMalloc; - alloc.calloc = _PyMem_DebugCalloc; - alloc.realloc = _PyMem_DebugRealloc; - alloc.free = _PyMem_DebugFree; - - if (_PyMem.malloc != _PyMem_DebugMalloc) { - alloc.ctx = &_PyMem_Debug.mem; - PyMem_GetAllocator(PYMEM_DOMAIN_MEM, &_PyMem_Debug.mem.alloc); - PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc); - } - - if (_PyObject.malloc != _PyMem_DebugMalloc) { - alloc.ctx = &_PyMem_Debug.obj; - PyMem_GetAllocator(PYMEM_DOMAIN_OBJ, &_PyMem_Debug.obj.alloc); - PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc); - } -} - -void -PyMem_GetAllocator(PyMemAllocatorDomain domain, PyMemAllocatorEx *allocator) -{ - switch(domain) - { - case PYMEM_DOMAIN_RAW: *allocator = _PyMem_Raw; break; - case PYMEM_DOMAIN_MEM: *allocator = _PyMem; break; - case PYMEM_DOMAIN_OBJ: *allocator = _PyObject; break; - default: - /* unknown domain: set all attributes to NULL */ - allocator->ctx = NULL; - allocator->malloc = NULL; - allocator->calloc = NULL; - allocator->realloc = NULL; 
- allocator->free = NULL; - } -} - -void -PyMem_SetAllocator(PyMemAllocatorDomain domain, PyMemAllocatorEx *allocator) -{ - switch(domain) - { - case PYMEM_DOMAIN_RAW: _PyMem_Raw = *allocator; break; - case PYMEM_DOMAIN_MEM: _PyMem = *allocator; break; - case PYMEM_DOMAIN_OBJ: _PyObject = *allocator; break; - /* ignore unknown domain */ - } -} - -void -PyObject_GetArenaAllocator(PyObjectArenaAllocator *allocator) -{ - *allocator = _PyObject_Arena; -} - -void -PyObject_SetArenaAllocator(PyObjectArenaAllocator *allocator) -{ - _PyObject_Arena = *allocator; -} - -void * -PyMem_RawMalloc(size_t size) -{ - /* - * Limit ourselves to PY_SSIZE_T_MAX bytes to prevent security holes. - * Most python internals blindly use a signed Py_ssize_t to track - * things without checking for overflows or negatives. - * As size_t is unsigned, checking for size < 0 is not required. - */ - if (size > (size_t)PY_SSIZE_T_MAX) - return NULL; - return _PyMem_Raw.malloc(_PyMem_Raw.ctx, size); -} - -void * -PyMem_RawCalloc(size_t nelem, size_t elsize) -{ - /* see PyMem_RawMalloc() */ - if (elsize != 0 && nelem > (size_t)PY_SSIZE_T_MAX / elsize) - return NULL; - return _PyMem_Raw.calloc(_PyMem_Raw.ctx, nelem, elsize); -} - -void* -PyMem_RawRealloc(void *ptr, size_t new_size) -{ - /* see PyMem_RawMalloc() */ - if (new_size > (size_t)PY_SSIZE_T_MAX) - return NULL; - return _PyMem_Raw.realloc(_PyMem_Raw.ctx, ptr, new_size); -} - -void PyMem_RawFree(void *ptr) -{ - _PyMem_Raw.free(_PyMem_Raw.ctx, ptr); -} - -void * -PyMem_Malloc(size_t size) -{ - /* see PyMem_RawMalloc() */ - if (size > (size_t)PY_SSIZE_T_MAX) - return NULL; - return _PyMem.malloc(_PyMem.ctx, size); -} - -void * -PyMem_Calloc(size_t nelem, size_t elsize) -{ - /* see PyMem_RawMalloc() */ - if (elsize != 0 && nelem > (size_t)PY_SSIZE_T_MAX / elsize) - return NULL; - return _PyMem.calloc(_PyMem.ctx, nelem, elsize); -} - -void * -PyMem_Realloc(void *ptr, size_t new_size) -{ - /* see PyMem_RawMalloc() */ - if (new_size > 
(size_t)PY_SSIZE_T_MAX) - return NULL; - return _PyMem.realloc(_PyMem.ctx, ptr, new_size); -} - -void -PyMem_Free(void *ptr) -{ - _PyMem.free(_PyMem.ctx, ptr); -} - -char * -_PyMem_RawStrdup(const char *str) -{ - size_t size; - char *copy; - - size = strlen(str) + 1; - copy = PyMem_RawMalloc(size); - if (copy == NULL) - return NULL; - memcpy(copy, str, size); - return copy; -} - -char * -_PyMem_Strdup(const char *str) -{ - size_t size; - char *copy; - - size = strlen(str) + 1; - copy = PyMem_Malloc(size); - if (copy == NULL) - return NULL; - memcpy(copy, str, size); - return copy; -} - -void * -PyObject_Malloc(size_t size) -{ - /* see PyMem_RawMalloc() */ - if (size > (size_t)PY_SSIZE_T_MAX) - return NULL; - return _PyObject.malloc(_PyObject.ctx, size); -} - -void * -PyObject_Calloc(size_t nelem, size_t elsize) -{ - /* see PyMem_RawMalloc() */ - if (elsize != 0 && nelem > (size_t)PY_SSIZE_T_MAX / elsize) - return NULL; - return _PyObject.calloc(_PyObject.ctx, nelem, elsize); -} - -void * -PyObject_Realloc(void *ptr, size_t new_size) -{ - /* see PyMem_RawMalloc() */ - if (new_size > (size_t)PY_SSIZE_T_MAX) - return NULL; - return _PyObject.realloc(_PyObject.ctx, ptr, new_size); -} - -void -PyObject_Free(void *ptr) -{ - _PyObject.free(_PyObject.ctx, ptr); -} - - -#ifdef WITH_PYMALLOC - -#ifdef WITH_VALGRIND -#include <valgrind/valgrind.h> - -/* If we're using GCC, use __builtin_expect() to reduce overhead of - the valgrind checks */ -#if defined(__GNUC__) && (__GNUC__ > 2) && defined(__OPTIMIZE__) -# define UNLIKELY(value) __builtin_expect((value), 0) -#else -# define UNLIKELY(value) (value) -#endif - -/* -1 indicates that we haven't checked that we're running on valgrind yet. */ -static int running_on_valgrind = -1; -#endif - -/* An object allocator for Python. 
- - Here is an introduction to the layers of the Python memory architecture, - showing where the object allocator is actually used (layer +2), It is - called for every object allocation and deallocation (PyObject_New/Del), - unless the object-specific allocators implement a proprietary allocation - scheme (ex.: ints use a simple free list). This is also the place where - the cyclic garbage collector operates selectively on container objects. - - - Object-specific allocators - _____ ______ ______ ________ - [ int ] [ dict ] [ list ] ... [ string ] Python core | -+3 | <----- Object-specific memory -----> | <-- Non-object memory --> | - _______________________________ | | - [ Python's object allocator ] | | -+2 | ####### Object memory ####### | <------ Internal buffers ------> | - ______________________________________________________________ | - [ Python's raw memory allocator (PyMem_ API) ] | -+1 | <----- Python memory (under PyMem manager's control) ------> | | - __________________________________________________________________ - [ Underlying general-purpose allocator (ex: C library malloc) ] - 0 | <------ Virtual memory allocated for the python process -------> | - - ========================================================================= - _______________________________________________________________________ - [ OS-specific Virtual Memory Manager (VMM) ] --1 | <--- Kernel dynamic storage allocation & management (page-based) ---> | - __________________________________ __________________________________ - [ ] [ ] --2 | <-- Physical memory: ROM/RAM --> | | <-- Secondary storage (swap) --> | - -*/ -/*==========================================================================*/ - -/* A fast, special-purpose memory allocator for small blocks, to be used - on top of a general-purpose malloc -- heavily based on previous art. 
*/ - -/* Vladimir Marangozov -- August 2000 */ - -/* - * "Memory management is where the rubber meets the road -- if we do the wrong - * thing at any level, the results will not be good. And if we don't make the - * levels work well together, we are in serious trouble." (1) - * - * (1) Paul R. Wilson, Mark S. Johnstone, Michael Neely, and David Boles, - * "Dynamic Storage Allocation: A Survey and Critical Review", - * in Proc. 1995 Int'l. Workshop on Memory Management, September 1995. - */ - -/* #undef WITH_MEMORY_LIMITS */ /* disable mem limit checks */ - -/*==========================================================================*/ - -/* - * Allocation strategy abstract: - * - * For small requests, the allocator sub-allocates <Big> blocks of memory. - * Requests greater than SMALL_REQUEST_THRESHOLD bytes are routed to the - * system's allocator. - * - * Small requests are grouped in size classes spaced 8 bytes apart, due - * to the required valid alignment of the returned address. Requests of - * a particular size are serviced from memory pools of 4K (one VMM page). - * Pools are fragmented on demand and contain free lists of blocks of one - * particular size class. In other words, there is a fixed-size allocator - * for each size class. Free pools are shared by the different allocators - * thus minimizing the space reserved for a particular size class. - * - * This allocation strategy is a variant of what is known as "simple - * segregated storage based on array of free lists". The main drawback of - * simple segregated storage is that we might end up with lot of reserved - * memory for the different free lists, which degenerate in time. To avoid - * this, we partition each free list in pools and we share dynamically the - * reserved space between all free lists. This technique is quite efficient - * for memory intensive programs which allocate mainly small-sized blocks. 
- *
- * For small requests we have the following table:
- *
- * Request in bytes     Size of allocated block      Size class idx
- * ----------------------------------------------------------------
- *        1-8                     8                       0
- *        9-16                   16                       1
- *       17-24                   24                       2
- *       25-32                   32                       3
- *       33-40                   40                       4
- *       41-48                   48                       5
- *       49-56                   56                       6
- *       57-64                   64                       7
- *       65-72                   72                       8
- *        ...                   ...                     ...
- *      497-504                 504                      62
- *      505-512                 512                      63
- *
- *      0, SMALL_REQUEST_THRESHOLD + 1 and up: routed to the underlying
- *      allocator.
- */
-
-/*==========================================================================*/
-
-/*
- * -- Main tunable settings section --
- */
-
-/*
- * Alignment of addresses returned to the user. 8-bytes alignment works
- * on most current architectures (with 32-bit or 64-bit address busses).
- * The alignment value is also used for grouping small requests in size
- * classes spaced ALIGNMENT bytes apart.
- *
- * You shouldn't change this unless you know what you are doing.
- */
-#define ALIGNMENT               8               /* must be 2^N */
-#define ALIGNMENT_SHIFT         3
-
-/* Return the number of bytes in size class I, as a uint. */
-#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
-
-/*
- * Max size threshold below which malloc requests are considered to be
- * small enough in order to use preallocated memory pools. You can tune
- * this value according to your application behaviour and memory needs.
- *
- * Note: a size threshold of 512 guarantees that newly created dictionaries
- * will be allocated from preallocated memory pools on 64-bit.
- *
- * The following invariants must hold:
- *      1) ALIGNMENT <= SMALL_REQUEST_THRESHOLD <= 512
- *      2) SMALL_REQUEST_THRESHOLD is evenly divisible by ALIGNMENT
- *
- * Although not required, for better performance and space efficiency,
- * it is recommended that SMALL_REQUEST_THRESHOLD is set to a power of 2.
- */
-#define SMALL_REQUEST_THRESHOLD 512
-#define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD / ALIGNMENT)
-
-/*
- * The system's VMM page size can be obtained on most unices with a
- * getpagesize() call or deduced from various header files. To make
- * things simpler, we assume that it is 4K, which is OK for most systems.
- * It is probably better if this is the native page size, but it doesn't
- * have to be. In theory, if SYSTEM_PAGE_SIZE is larger than the native page
- * size, then `POOL_ADDR(p)->arenaindex' could rarely cause a segmentation
- * violation fault. 4K is apparently OK for all the platforms that python
- * currently targets.
- */
-#define SYSTEM_PAGE_SIZE        (4 * 1024)
-#define SYSTEM_PAGE_SIZE_MASK   (SYSTEM_PAGE_SIZE - 1)
-
-/*
- * Maximum amount of memory managed by the allocator for small requests.
- */
-#ifdef WITH_MEMORY_LIMITS
-#ifndef SMALL_MEMORY_LIMIT
-#define SMALL_MEMORY_LIMIT      (64 * 1024 * 1024)      /* 64 MB -- more? */
-#endif
-#endif
-
-/*
- * The allocator sub-allocates <Big> blocks of memory (called arenas) aligned
- * on a page boundary. This is a reserved virtual address space for the
- * current process (obtained through a malloc()/mmap() call). In no way this
- * means that the memory arenas will be used entirely. A malloc(<Big>) is
- * usually an address range reservation for <Big> bytes, unless all pages within
- * this space are referenced subsequently. So malloc'ing big blocks and not
- * using them does not mean "wasting memory". It's an addressable range
- * wastage...
- *
- * Arenas are allocated with mmap() on systems supporting anonymous memory
- * mappings to reduce heap fragmentation.
- */
-#define ARENA_SIZE              (256 << 10)     /* 256KB */
-
-#ifdef WITH_MEMORY_LIMITS
-#define MAX_ARENAS              (SMALL_MEMORY_LIMIT / ARENA_SIZE)
-#endif
-
-/*
- * Size of the pools used for small blocks. Should be a power of 2,
- * between 1K and SYSTEM_PAGE_SIZE, that is: 1k, 2k, 4k.
- */ -#define POOL_SIZE SYSTEM_PAGE_SIZE /* must be 2^N */ -#define POOL_SIZE_MASK SYSTEM_PAGE_SIZE_MASK - -/* - * -- End of tunable settings section -- - */ - -/*==========================================================================*/ - -/* - * Locking - * - * To reduce lock contention, it would probably be better to refine the - * crude function locking with per size class locking. I'm not positive - * however, whether it's worth switching to such locking policy because - * of the performance penalty it might introduce. - * - * The following macros describe the simplest (should also be the fastest) - * lock object on a particular platform and the init/fini/lock/unlock - * operations on it. The locks defined here are not expected to be recursive - * because it is assumed that they will always be called in the order: - * INIT, [LOCK, UNLOCK]*, FINI. - */ - -/* - * Python's threads are serialized, so object malloc locking is disabled. - */ -#define SIMPLELOCK_DECL(lock) /* simple lock declaration */ -#define SIMPLELOCK_INIT(lock) /* allocate (if needed) and initialize */ -#define SIMPLELOCK_FINI(lock) /* free/destroy an existing lock */ -#define SIMPLELOCK_LOCK(lock) /* acquire released lock */ -#define SIMPLELOCK_UNLOCK(lock) /* release acquired lock */ - -/* When you say memory, my mind reasons in terms of (pointers to) blocks */ -typedef uint8_t block; - -/* Pool for small blocks. */ -struct pool_header { - union { block *_padding; - uint count; } ref; /* number of allocated blocks */ - block *freeblock; /* pool's free list head */ - struct pool_header *nextpool; /* next pool of this size class */ - struct pool_header *prevpool; /* previous pool "" */ - uint arenaindex; /* index into arenas of base adr */ - uint szidx; /* block size class index */ - uint nextoffset; /* bytes to virgin block */ - uint maxnextoffset; /* largest valid nextoffset */ -}; - -typedef struct pool_header *poolp; - -/* Record keeping for arenas. 
*/ -struct arena_object { - /* The address of the arena, as returned by malloc. Note that 0 - * will never be returned by a successful malloc, and is used - * here to mark an arena_object that doesn't correspond to an - * allocated arena. - */ - uintptr_t address; - - /* Pool-aligned pointer to the next pool to be carved off. */ - block* pool_address; - - /* The number of available pools in the arena: free pools + never- - * allocated pools. - */ - uint nfreepools; - - /* The total number of pools in the arena, whether or not available. */ - uint ntotalpools; - - /* Singly-linked list of available pools. */ - struct pool_header* freepools; - - /* Whenever this arena_object is not associated with an allocated - * arena, the nextarena member is used to link all unassociated - * arena_objects in the singly-linked `unused_arena_objects` list. - * The prevarena member is unused in this case. - * - * When this arena_object is associated with an allocated arena - * with at least one available pool, both members are used in the - * doubly-linked `usable_arenas` list, which is maintained in - * increasing order of `nfreepools` values. - * - * Else this arena_object is associated with an allocated arena - * all of whose pools are in use. `nextarena` and `prevarena` - * are both meaningless in this case. - */ - struct arena_object* nextarena; - struct arena_object* prevarena; -}; - -#define POOL_OVERHEAD _Py_SIZE_ROUND_UP(sizeof(struct pool_header), ALIGNMENT) - -#define DUMMY_SIZE_IDX 0xffff /* size class of newly cached pools */ - -/* Round pointer P down to the closest pool-aligned address <= P, as a poolp */ -#define POOL_ADDR(P) ((poolp)_Py_ALIGN_DOWN((P), POOL_SIZE)) - -/* Return total number of blocks in pool of size index I, as a uint. 
*/ -#define NUMBLOCKS(I) ((uint)(POOL_SIZE - POOL_OVERHEAD) / INDEX2SIZE(I)) - -/*==========================================================================*/ - -/* - * This malloc lock - */ -SIMPLELOCK_DECL(_malloc_lock) -#define LOCK() SIMPLELOCK_LOCK(_malloc_lock) -#define UNLOCK() SIMPLELOCK_UNLOCK(_malloc_lock) -#define LOCK_INIT() SIMPLELOCK_INIT(_malloc_lock) -#define LOCK_FINI() SIMPLELOCK_FINI(_malloc_lock) - -/* - * Pool table -- headed, circular, doubly-linked lists of partially used pools. - -This is involved. For an index i, usedpools[i+i] is the header for a list of -all partially used pools holding small blocks with "size class idx" i. So -usedpools[0] corresponds to blocks of size 8, usedpools[2] to blocks of size -16, and so on: index 2*i <-> blocks of size (i+1)<<ALIGNMENT_SHIFT. - -Pools are carved off an arena's highwater mark (an arena_object's pool_address -member) as needed. Once carved off, a pool is in one of three states forever -after: - -used == partially used, neither empty nor full - At least one block in the pool is currently allocated, and at least one - block in the pool is not currently allocated (note this implies a pool - has room for at least two blocks). - This is a pool's initial state, as a pool is created only when malloc - needs space. - The pool holds blocks of a fixed size, and is in the circular list headed - at usedpools[i] (see above). It's linked to the other used pools of the - same size class via the pool_header's nextpool and prevpool members. - If all but one block is currently allocated, a malloc can cause a - transition to the full state. If all but one block is not currently - allocated, a free can cause a transition to the empty state. - -full == all the pool's blocks are currently allocated - On transition to full, a pool is unlinked from its usedpools[] list. - It's not linked to from anything then anymore, and its nextpool and - prevpool members are meaningless until it transitions back to used. 
- A free of a block in a full pool puts the pool back in the used state. - Then it's linked in at the front of the appropriate usedpools[] list, so - that the next allocation for its size class will reuse the freed block. - -empty == all the pool's blocks are currently available for allocation - On transition to empty, a pool is unlinked from its usedpools[] list, - and linked to the front of its arena_object's singly-linked freepools list, - via its nextpool member. The prevpool member has no meaning in this case. - Empty pools have no inherent size class: the next time a malloc finds - an empty list in usedpools[], it takes the first pool off of freepools. - If the size class needed happens to be the same as the size class the pool - last had, some pool initialization can be skipped. - - -Block Management - -Blocks within pools are again carved out as needed. pool->freeblock points to -the start of a singly-linked list of free blocks within the pool. When a -block is freed, it's inserted at the front of its pool's freeblock list. Note -that the available blocks in a pool are *not* linked all together when a pool -is initialized. Instead only "the first two" (lowest addresses) blocks are -set up, returning the first such block, and setting pool->freeblock to a -one-block list holding the second such block. This is consistent with that -pymalloc strives at all levels (arena, pool, and block) never to touch a piece -of memory until it's actually needed. - -So long as a pool is in the used state, we're certain there *is* a block -available for allocating, and pool->freeblock is not NULL. If pool->freeblock -points to the end of the free list before we've carved the entire pool into -blocks, that means we simply haven't yet gotten to one of the higher-address -blocks. 
The offset from the pool_header to the start of "the next" virgin -block is stored in the pool_header nextoffset member, and the largest value -of nextoffset that makes sense is stored in the maxnextoffset member when a -pool is initialized. All the blocks in a pool have been passed out at least -once when and only when nextoffset > maxnextoffset. - - -Major obscurity: While the usedpools vector is declared to have poolp -entries, it doesn't really. It really contains two pointers per (conceptual) -poolp entry, the nextpool and prevpool members of a pool_header. The -excruciating initialization code below fools C so that - - usedpool[i+i] - -"acts like" a genuine poolp, but only so long as you only reference its -nextpool and prevpool members. The "- 2*sizeof(block *)" gibberish is -compensating for that a pool_header's nextpool and prevpool members -immediately follow a pool_header's first two members: - - union { block *_padding; - uint count; } ref; - block *freeblock; - -each of which consume sizeof(block *) bytes. So what usedpools[i+i] really -contains is a fudged-up pointer p such that *if* C believes it's a poolp -pointer, then p->nextpool and p->prevpool are both p (meaning that the headed -circular list is empty). - -It's unclear why the usedpools setup is so convoluted. It could be to -minimize the amount of cache required to hold this heavily-referenced table -(which only *needs* the two interpool pointer members of a pool_header). OTOH, -referencing code has to remember to "double the index" and doing so isn't -free, usedpools[0] isn't a strictly legal pointer, and we're crucially relying -on that C doesn't insert any padding anywhere in a pool_header at or before -the prevpool member. 
-**************************************************************************** */
-
-#define PTA(x)  ((poolp )((uint8_t *)&(usedpools[2*(x)]) - 2*sizeof(block *)))
-#define PT(x)   PTA(x), PTA(x)
-
-static poolp usedpools[2 * ((NB_SMALL_SIZE_CLASSES + 7) / 8) * 8] = {
-    PT(0), PT(1), PT(2), PT(3), PT(4), PT(5), PT(6), PT(7)
-#if NB_SMALL_SIZE_CLASSES > 8
-    , PT(8), PT(9), PT(10), PT(11), PT(12), PT(13), PT(14), PT(15)
-#if NB_SMALL_SIZE_CLASSES > 16
-    , PT(16), PT(17), PT(18), PT(19), PT(20), PT(21), PT(22), PT(23)
-#if NB_SMALL_SIZE_CLASSES > 24
-    , PT(24), PT(25), PT(26), PT(27), PT(28), PT(29), PT(30), PT(31)
-#if NB_SMALL_SIZE_CLASSES > 32
-    , PT(32), PT(33), PT(34), PT(35), PT(36), PT(37), PT(38), PT(39)
-#if NB_SMALL_SIZE_CLASSES > 40
-    , PT(40), PT(41), PT(42), PT(43), PT(44), PT(45), PT(46), PT(47)
-#if NB_SMALL_SIZE_CLASSES > 48
-    , PT(48), PT(49), PT(50), PT(51), PT(52), PT(53), PT(54), PT(55)
-#if NB_SMALL_SIZE_CLASSES > 56
-    , PT(56), PT(57), PT(58), PT(59), PT(60), PT(61), PT(62), PT(63)
-#if NB_SMALL_SIZE_CLASSES > 64
-#error "NB_SMALL_SIZE_CLASSES should be less than 64"
-#endif /* NB_SMALL_SIZE_CLASSES > 64 */
-#endif /* NB_SMALL_SIZE_CLASSES > 56 */
-#endif /* NB_SMALL_SIZE_CLASSES > 48 */
-#endif /* NB_SMALL_SIZE_CLASSES > 40 */
-#endif /* NB_SMALL_SIZE_CLASSES > 32 */
-#endif /* NB_SMALL_SIZE_CLASSES > 24 */
-#endif /* NB_SMALL_SIZE_CLASSES > 16 */
-#endif /* NB_SMALL_SIZE_CLASSES > 8 */
-};
-
-/*==========================================================================
-Arena management.
-
-`arenas` is a vector of arena_objects. It contains maxarenas entries, some of
-which may not be currently used (== they're arena_objects that aren't
-currently associated with an allocated arena). Note that arenas proper are
-separately malloc'ed.
-
-Prior to Python 2.5, arenas were never free()'ed. Starting with Python 2.5,
-we do try to free() arenas, and use some mild heuristic strategies to increase
-the likelihood that arenas eventually can be freed.
- -unused_arena_objects - - This is a singly-linked list of the arena_objects that are currently not - being used (no arena is associated with them). Objects are taken off the - head of the list in new_arena(), and are pushed on the head of the list in - PyObject_Free() when the arena is empty. Key invariant: an arena_object - is on this list if and only if its .address member is 0. - -usable_arenas - - This is a doubly-linked list of the arena_objects associated with arenas - that have pools available. These pools are either waiting to be reused, - or have not been used before. The list is sorted to have the most- - allocated arenas first (ascending order based on the nfreepools member). - This means that the next allocation will come from a heavily used arena, - which gives the nearly empty arenas a chance to be returned to the system. - In my unscientific tests this dramatically improved the number of arenas - that could be freed. - -Note that an arena_object associated with an arena all of whose pools are -currently in use isn't on either list. -*/ - -/* Array of objects used to track chunks of memory (arenas). */ -static struct arena_object* arenas = NULL; -/* Number of slots currently allocated in the `arenas` vector. */ -static uint maxarenas = 0; - -/* The head of the singly-linked, NULL-terminated list of available - * arena_objects. - */ -static struct arena_object* unused_arena_objects = NULL; - -/* The head of the doubly-linked, NULL-terminated at each end, list of - * arena_objects associated with arenas that have pools available. - */ -static struct arena_object* usable_arenas = NULL; - -/* How many arena_objects do we initially allocate? - * 16 = can allocate 16 arenas = 16 * ARENA_SIZE = 4MB before growing the - * `arenas` vector. - */ -#define INITIAL_ARENA_OBJECTS 16 - -/* Number of arenas allocated that haven't been free()'d. */ -static size_t narenas_currently_allocated = 0; - -/* Total number of times malloc() called to allocate an arena. 
*/ -static size_t ntimes_arena_allocated = 0; -/* High water mark (max value ever seen) for narenas_currently_allocated. */ -static size_t narenas_highwater = 0; - -static Py_ssize_t _Py_AllocatedBlocks = 0; - -Py_ssize_t -_Py_GetAllocatedBlocks(void) -{ - return _Py_AllocatedBlocks; -} - - -/* Allocate a new arena. If we run out of memory, return NULL. Else - * allocate a new arena, and return the address of an arena_object - * describing the new arena. It's expected that the caller will set - * `usable_arenas` to the return value. - */ -static struct arena_object* -new_arena(void) -{ - struct arena_object* arenaobj; - uint excess; /* number of bytes above pool alignment */ - void *address; - static int debug_stats = -1; - - if (debug_stats == -1) { - char *opt = Py_GETENV("PYTHONMALLOCSTATS"); - debug_stats = (opt != NULL && *opt != '\0'); - } - if (debug_stats) - _PyObject_DebugMallocStats(stderr); - - if (unused_arena_objects == NULL) { - uint i; - uint numarenas; - size_t nbytes; - - /* Double the number of arena objects on each allocation. - * Note that it's possible for `numarenas` to overflow. - */ - numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS; - if (numarenas <= maxarenas) - return NULL; /* overflow */ -#if SIZEOF_SIZE_T <= SIZEOF_INT - if (numarenas > SIZE_MAX / sizeof(*arenas)) - return NULL; /* overflow */ -#endif - nbytes = numarenas * sizeof(*arenas); - arenaobj = (struct arena_object *)PyMem_RawRealloc(arenas, nbytes); - if (arenaobj == NULL) - return NULL; - arenas = arenaobj; - - /* We might need to fix pointers that were copied. However, - * new_arena only gets called when all the pages in the - * previous arenas are full. Thus, there are *no* pointers - * into the old array. Thus, we don't have to worry about - * invalid pointers. Just to be sure, some asserts: - */ - assert(usable_arenas == NULL); - assert(unused_arena_objects == NULL); - - /* Put the new arenas on the unused_arena_objects list. 
*/ - for (i = maxarenas; i < numarenas; ++i) { - arenas[i].address = 0; /* mark as unassociated */ - arenas[i].nextarena = i < numarenas - 1 ? - &arenas[i+1] : NULL; - } - - /* Update globals. */ - unused_arena_objects = &arenas[maxarenas]; - maxarenas = numarenas; - } - - /* Take the next available arena object off the head of the list. */ - assert(unused_arena_objects != NULL); - arenaobj = unused_arena_objects; - unused_arena_objects = arenaobj->nextarena; - assert(arenaobj->address == 0); - address = _PyObject_Arena.alloc(_PyObject_Arena.ctx, ARENA_SIZE); - if (address == NULL) { - /* The allocation failed: return NULL after putting the - * arenaobj back. - */ - arenaobj->nextarena = unused_arena_objects; - unused_arena_objects = arenaobj; - return NULL; - } - arenaobj->address = (uintptr_t)address; - - ++narenas_currently_allocated; - ++ntimes_arena_allocated; - if (narenas_currently_allocated > narenas_highwater) - narenas_highwater = narenas_currently_allocated; - arenaobj->freepools = NULL; - /* pool_address <- first pool-aligned address in the arena - nfreepools <- number of whole pools that fit after alignment */ - arenaobj->pool_address = (block*)arenaobj->address; - arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE; - assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE); - excess = (uint)(arenaobj->address & POOL_SIZE_MASK); - if (excess != 0) { - --arenaobj->nfreepools; - arenaobj->pool_address += POOL_SIZE - excess; - } - arenaobj->ntotalpools = arenaobj->nfreepools; - - return arenaobj; -} - -/* -address_in_range(P, POOL) - -Return true if and only if P is an address that was allocated by pymalloc. 
-POOL must be the pool address associated with P, i.e., POOL = POOL_ADDR(P) -(the caller is asked to compute this because the macro expands POOL more than -once, and for efficiency it's best for the caller to assign POOL_ADDR(P) to a -variable and pass the latter to the macro; because address_in_range is -called on every alloc/realloc/free, micro-efficiency is important here). - -Tricky: Let B be the arena base address associated with the pool, B = -arenas[(POOL)->arenaindex].address. Then P belongs to the arena if and only if - - B <= P < B + ARENA_SIZE - -Subtracting B throughout, this is true iff - - 0 <= P-B < ARENA_SIZE - -By using unsigned arithmetic, the "0 <=" half of the test can be skipped. - -Obscure: A PyMem "free memory" function can call the pymalloc free or realloc -before the first arena has been allocated. `arenas` is still NULL in that -case. We're relying on that maxarenas is also 0 in that case, so that -(POOL)->arenaindex < maxarenas must be false, saving us from trying to index -into a NULL arenas. - -Details: given P and POOL, the arena_object corresponding to P is AO = -arenas[(POOL)->arenaindex]. Suppose obmalloc controls P. Then (barring wild -stores, etc), POOL is the correct address of P's pool, AO.address is the -correct base address of the pool's arena, and P must be within ARENA_SIZE of -AO.address. In addition, AO.address is not 0 (no arena can start at address 0 -(NULL)). Therefore address_in_range correctly reports that obmalloc -controls P. - -Now suppose obmalloc does not control P (e.g., P was obtained via a direct -call to the system malloc() or realloc()). (POOL)->arenaindex may be anything -in this case -- it may even be uninitialized trash. If the trash arenaindex -is >= maxarenas, the macro correctly concludes at once that obmalloc doesn't -control P. - -Else arenaindex is < maxarena, and AO is read up. If AO corresponds to an -allocated arena, obmalloc controls all the memory in slice AO.address : -AO.address+ARENA_SIZE. 
By case assumption, P is not controlled by obmalloc, -so P doesn't lie in that slice, so the macro correctly reports that P is not -controlled by obmalloc. - -Finally, if P is not controlled by obmalloc and AO corresponds to an unused -arena_object (one not currently associated with an allocated arena), -AO.address is 0, and the second test in the macro reduces to: - - P < ARENA_SIZE - -If P >= ARENA_SIZE (extremely likely), the macro again correctly concludes -that P is not controlled by obmalloc. However, if P < ARENA_SIZE, this part -of the test still passes, and the third clause (AO.address != 0) is necessary -to get the correct result: AO.address is 0 in this case, so the macro -correctly reports that P is not controlled by obmalloc (despite that P lies in -slice AO.address : AO.address + ARENA_SIZE). - -Note: The third (AO.address != 0) clause was added in Python 2.5. Before -2.5, arenas were never free()'ed, and an arenaindex < maxarena always -corresponded to a currently-allocated arena, so the "P is not controlled by -obmalloc, AO corresponds to an unused arena_object, and P < ARENA_SIZE" case -was impossible. - -Note that the logic is excruciating, and reading up possibly uninitialized -memory when P is not controlled by obmalloc (to get at (POOL)->arenaindex) -creates problems for some memory debuggers. The overwhelming advantage is -that this test determines whether an arbitrary address is controlled by -obmalloc in a small constant time, independent of the number of arenas -obmalloc controls. Since this test is needed at every entry point, it's -extremely desirable that it be this fast. -*/ - -static bool ATTRIBUTE_NO_ADDRESS_SAFETY_ANALYSIS -address_in_range(void *p, poolp pool) -{ - // Since address_in_range may be reading from memory which was not allocated - // by Python, it is important that pool->arenaindex is read only once, as - // another thread may be concurrently modifying the value without holding - // the GIL. 
The following dance forces the compiler to read pool->arenaindex - // only once. - uint arenaindex = *((volatile uint *)&pool->arenaindex); - return arenaindex < maxarenas && - (uintptr_t)p - arenas[arenaindex].address < ARENA_SIZE && - arenas[arenaindex].address != 0; -} - -/*==========================================================================*/ - -/* malloc. Note that nbytes==0 tries to return a non-NULL pointer, distinct - * from all other currently live pointers. This may not be possible. - */ - -/* - * The basic blocks are ordered by decreasing execution frequency, - * which minimizes the number of jumps in the most common cases, - * improves branching prediction and instruction scheduling (small - * block allocations typically result in a couple of instructions). - * Unless the optimizer reorders everything, being too smart... - */ - -static void * -_PyObject_Alloc(int use_calloc, void *ctx, size_t nelem, size_t elsize) -{ - size_t nbytes; - block *bp; - poolp pool; - poolp next; - uint size; - - _Py_AllocatedBlocks++; - - assert(nelem <= PY_SSIZE_T_MAX / elsize); - nbytes = nelem * elsize; - -#ifdef WITH_VALGRIND - if (UNLIKELY(running_on_valgrind == -1)) - running_on_valgrind = RUNNING_ON_VALGRIND; - if (UNLIKELY(running_on_valgrind)) - goto redirect; -#endif - - if (nelem == 0 || elsize == 0) - goto redirect; - - if ((nbytes - 1) < SMALL_REQUEST_THRESHOLD) { - LOCK(); - /* - * Most frequent paths first - */ - size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT; - pool = usedpools[size + size]; - if (pool != pool->nextpool) { - /* - * There is a used pool for this size class. - * Pick up the head block of its free list. - */ - ++pool->ref.count; - bp = pool->freeblock; - assert(bp != NULL); - if ((pool->freeblock = *(block **)bp) != NULL) { - UNLOCK(); - if (use_calloc) - memset(bp, 0, nbytes); - return (void *)bp; - } - /* - * Reached the end of the free list, try to extend it. 
- */ - if (pool->nextoffset <= pool->maxnextoffset) { - /* There is room for another block. */ - pool->freeblock = (block*)pool + - pool->nextoffset; - pool->nextoffset += INDEX2SIZE(size); - *(block **)(pool->freeblock) = NULL; - UNLOCK(); - if (use_calloc) - memset(bp, 0, nbytes); - return (void *)bp; - } - /* Pool is full, unlink from used pools. */ - next = pool->nextpool; - pool = pool->prevpool; - next->prevpool = pool; - pool->nextpool = next; - UNLOCK(); - if (use_calloc) - memset(bp, 0, nbytes); - return (void *)bp; - } - - /* There isn't a pool of the right size class immediately - * available: use a free pool. - */ - if (usable_arenas == NULL) { - /* No arena has a free pool: allocate a new arena. */ -#ifdef WITH_MEMORY_LIMITS - if (narenas_currently_allocated >= MAX_ARENAS) { - UNLOCK(); - goto redirect; - } -#endif - usable_arenas = new_arena(); - if (usable_arenas == NULL) { - UNLOCK(); - goto redirect; - } - usable_arenas->nextarena = - usable_arenas->prevarena = NULL; - } - assert(usable_arenas->address != 0); - - /* Try to get a cached free pool. */ - pool = usable_arenas->freepools; - if (pool != NULL) { - /* Unlink from cached pools. */ - usable_arenas->freepools = pool->nextpool; - - /* This arena already had the smallest nfreepools - * value, so decreasing nfreepools doesn't change - * that, and we don't need to rearrange the - * usable_arenas list. However, if the arena has - * become wholly allocated, we need to remove its - * arena_object from usable_arenas. - */ - --usable_arenas->nfreepools; - if (usable_arenas->nfreepools == 0) { - /* Wholly allocated: remove. 
*/ - assert(usable_arenas->freepools == NULL); - assert(usable_arenas->nextarena == NULL || - usable_arenas->nextarena->prevarena == - usable_arenas); - - usable_arenas = usable_arenas->nextarena; - if (usable_arenas != NULL) { - usable_arenas->prevarena = NULL; - assert(usable_arenas->address != 0); - } - } - else { - /* nfreepools > 0: it must be that freepools - * isn't NULL, or that we haven't yet carved - * off all the arena's pools for the first - * time. - */ - assert(usable_arenas->freepools != NULL || - usable_arenas->pool_address <= - (block*)usable_arenas->address + - ARENA_SIZE - POOL_SIZE); - } - init_pool: - /* Frontlink to used pools. */ - next = usedpools[size + size]; /* == prev */ - pool->nextpool = next; - pool->prevpool = next; - next->nextpool = pool; - next->prevpool = pool; - pool->ref.count = 1; - if (pool->szidx == size) { - /* Luckily, this pool last contained blocks - * of the same size class, so its header - * and free list are already initialized. - */ - bp = pool->freeblock; - assert(bp != NULL); - pool->freeblock = *(block **)bp; - UNLOCK(); - if (use_calloc) - memset(bp, 0, nbytes); - return (void *)bp; - } - /* - * Initialize the pool header, set up the free list to - * contain just the second block, and return the first - * block. - */ - pool->szidx = size; - size = INDEX2SIZE(size); - bp = (block *)pool + POOL_OVERHEAD; - pool->nextoffset = POOL_OVERHEAD + (size << 1); - pool->maxnextoffset = POOL_SIZE - size; - pool->freeblock = bp + size; - *(block **)(pool->freeblock) = NULL; - UNLOCK(); - if (use_calloc) - memset(bp, 0, nbytes); - return (void *)bp; - } - - /* Carve off a new pool. 
*/ - assert(usable_arenas->nfreepools > 0); - assert(usable_arenas->freepools == NULL); - pool = (poolp)usable_arenas->pool_address; - assert((block*)pool <= (block*)usable_arenas->address + - ARENA_SIZE - POOL_SIZE); - pool->arenaindex = (uint)(usable_arenas - arenas); - assert(&arenas[pool->arenaindex] == usable_arenas); - pool->szidx = DUMMY_SIZE_IDX; - usable_arenas->pool_address += POOL_SIZE; - --usable_arenas->nfreepools; - - if (usable_arenas->nfreepools == 0) { - assert(usable_arenas->nextarena == NULL || - usable_arenas->nextarena->prevarena == - usable_arenas); - /* Unlink the arena: it is completely allocated. */ - usable_arenas = usable_arenas->nextarena; - if (usable_arenas != NULL) { - usable_arenas->prevarena = NULL; - assert(usable_arenas->address != 0); - } - } - - goto init_pool; - } - - /* The small block allocator ends here. */ - -redirect: - /* Redirect the original request to the underlying (libc) allocator. - * We jump here on bigger requests, on error in the code above (as a - * last chance to serve the request) or when the max memory limit - * has been reached. - */ - { - void *result; - if (use_calloc) - result = PyMem_RawCalloc(nelem, elsize); - else - result = PyMem_RawMalloc(nbytes); - if (!result) - _Py_AllocatedBlocks--; - return result; - } -} - -static void * -_PyObject_Malloc(void *ctx, size_t nbytes) -{ - return _PyObject_Alloc(0, ctx, 1, nbytes); -} - -static void * -_PyObject_Calloc(void *ctx, size_t nelem, size_t elsize) -{ - return _PyObject_Alloc(1, ctx, nelem, elsize); -} - -/* free */ - -static void -_PyObject_Free(void *ctx, void *p) -{ - poolp pool; - block *lastfree; - poolp next, prev; - uint size; - - if (p == NULL) /* free(NULL) has no effect */ - return; - - _Py_AllocatedBlocks--; - -#ifdef WITH_VALGRIND - if (UNLIKELY(running_on_valgrind > 0)) - goto redirect; -#endif - - pool = POOL_ADDR(p); - if (address_in_range(p, pool)) { - /* We allocated this address. 
*/ - LOCK(); - /* Link p to the start of the pool's freeblock list. Since - * the pool had at least the p block outstanding, the pool - * wasn't empty (so it's already in a usedpools[] list, or - * was full and is in no list -- it's not in the freeblocks - * list in any case). - */ - assert(pool->ref.count > 0); /* else it was empty */ - *(block **)p = lastfree = pool->freeblock; - pool->freeblock = (block *)p; - if (lastfree) { - struct arena_object* ao; - uint nf; /* ao->nfreepools */ - - /* freeblock wasn't NULL, so the pool wasn't full, - * and the pool is in a usedpools[] list. - */ - if (--pool->ref.count != 0) { - /* pool isn't empty: leave it in usedpools */ - UNLOCK(); - return; - } - /* Pool is now empty: unlink from usedpools, and - * link to the front of freepools. This ensures that - * previously freed pools will be allocated later - * (being not referenced, they are perhaps paged out). - */ - next = pool->nextpool; - prev = pool->prevpool; - next->prevpool = prev; - prev->nextpool = next; - - /* Link the pool to freepools. This is a singly-linked - * list, and pool->prevpool isn't used there. - */ - ao = &arenas[pool->arenaindex]; - pool->nextpool = ao->freepools; - ao->freepools = pool; - nf = ++ao->nfreepools; - - /* All the rest is arena management. We just freed - * a pool, and there are 4 cases for arena mgmt: - * 1. If all the pools are free, return the arena to - * the system free(). - * 2. If this is the only free pool in the arena, - * add the arena back to the `usable_arenas` list. - * 3. If the "next" arena has a smaller count of free - * pools, we have to "slide this arena right" to - * restore that usable_arenas is sorted in order of - * nfreepools. - * 4. Else there's nothing more to do. - */ - if (nf == ao->ntotalpools) { - /* Case 1. First unlink ao from usable_arenas. 
- */ - assert(ao->prevarena == NULL || - ao->prevarena->address != 0); - assert(ao ->nextarena == NULL || - ao->nextarena->address != 0); - - /* Fix the pointer in the prevarena, or the - * usable_arenas pointer. - */ - if (ao->prevarena == NULL) { - usable_arenas = ao->nextarena; - assert(usable_arenas == NULL || - usable_arenas->address != 0); - } - else { - assert(ao->prevarena->nextarena == ao); - ao->prevarena->nextarena = - ao->nextarena; - } - /* Fix the pointer in the nextarena. */ - if (ao->nextarena != NULL) { - assert(ao->nextarena->prevarena == ao); - ao->nextarena->prevarena = - ao->prevarena; - } - /* Record that this arena_object slot is - * available to be reused. - */ - ao->nextarena = unused_arena_objects; - unused_arena_objects = ao; - - /* Free the entire arena. */ - _PyObject_Arena.free(_PyObject_Arena.ctx, - (void *)ao->address, ARENA_SIZE); - ao->address = 0; /* mark unassociated */ - --narenas_currently_allocated; - - UNLOCK(); - return; - } - if (nf == 1) { - /* Case 2. Put ao at the head of - * usable_arenas. Note that because - * ao->nfreepools was 0 before, ao isn't - * currently on the usable_arenas list. - */ - ao->nextarena = usable_arenas; - ao->prevarena = NULL; - if (usable_arenas) - usable_arenas->prevarena = ao; - usable_arenas = ao; - assert(usable_arenas->address != 0); - - UNLOCK(); - return; - } - /* If this arena is now out of order, we need to keep - * the list sorted. The list is kept sorted so that - * the "most full" arenas are used first, which allows - * the nearly empty arenas to be completely freed. In - * a few un-scientific tests, it seems like this - * approach allowed a lot more memory to be freed. - */ - if (ao->nextarena == NULL || - nf <= ao->nextarena->nfreepools) { - /* Case 4. Nothing to do. */ - UNLOCK(); - return; - } - /* Case 3: We have to move the arena towards the end - * of the list, because it has more free pools than - * the arena to its right. - * First unlink ao from usable_arenas. 
- */ - if (ao->prevarena != NULL) { - /* ao isn't at the head of the list */ - assert(ao->prevarena->nextarena == ao); - ao->prevarena->nextarena = ao->nextarena; - } - else { - /* ao is at the head of the list */ - assert(usable_arenas == ao); - usable_arenas = ao->nextarena; - } - ao->nextarena->prevarena = ao->prevarena; - - /* Locate the new insertion point by iterating over - * the list, using our nextarena pointer. - */ - while (ao->nextarena != NULL && - nf > ao->nextarena->nfreepools) { - ao->prevarena = ao->nextarena; - ao->nextarena = ao->nextarena->nextarena; - } - - /* Insert ao at this point. */ - assert(ao->nextarena == NULL || - ao->prevarena == ao->nextarena->prevarena); - assert(ao->prevarena->nextarena == ao->nextarena); - - ao->prevarena->nextarena = ao; - if (ao->nextarena != NULL) - ao->nextarena->prevarena = ao; - - /* Verify that the swaps worked. */ - assert(ao->nextarena == NULL || - nf <= ao->nextarena->nfreepools); - assert(ao->prevarena == NULL || - nf > ao->prevarena->nfreepools); - assert(ao->nextarena == NULL || - ao->nextarena->prevarena == ao); - assert((usable_arenas == ao && - ao->prevarena == NULL) || - ao->prevarena->nextarena == ao); - - UNLOCK(); - return; - } - /* Pool was full, so doesn't currently live in any list: - * link it to the front of the appropriate usedpools[] list. - * This mimics LRU pool usage for new allocations and - * targets optimal filling when several pools contain - * blocks of the same size class. - */ - --pool->ref.count; - assert(pool->ref.count > 0); /* else the pool is empty */ - size = pool->szidx; - next = usedpools[size + size]; - prev = next->prevpool; - /* insert pool before next: prev <-> pool <-> next */ - pool->nextpool = next; - pool->prevpool = prev; - next->prevpool = pool; - prev->nextpool = pool; - UNLOCK(); - return; - } - -#ifdef WITH_VALGRIND -redirect: -#endif - /* We didn't allocate this address. */ - PyMem_RawFree(p); -} - -/* realloc. If p is NULL, this acts like malloc(nbytes). 
Else if nbytes==0, - * then as the Python docs promise, we do not treat this like free(p), and - * return a non-NULL result. - */ - -static void * -_PyObject_Realloc(void *ctx, void *p, size_t nbytes) -{ - void *bp; - poolp pool; - size_t size; - - if (p == NULL) - return _PyObject_Alloc(0, ctx, 1, nbytes); - -#ifdef WITH_VALGRIND - /* Treat running_on_valgrind == -1 the same as 0 */ - if (UNLIKELY(running_on_valgrind > 0)) - goto redirect; -#endif - - pool = POOL_ADDR(p); - if (address_in_range(p, pool)) { - /* We're in charge of this block */ - size = INDEX2SIZE(pool->szidx); - if (nbytes <= size) { - /* The block is staying the same or shrinking. If - * it's shrinking, there's a tradeoff: it costs - * cycles to copy the block to a smaller size class, - * but it wastes memory not to copy it. The - * compromise here is to copy on shrink only if at - * least 25% of size can be shaved off. - */ - if (4 * nbytes > 3 * size) { - /* It's the same, - * or shrinking and new/old > 3/4. - */ - return p; - } - size = nbytes; - } - bp = _PyObject_Alloc(0, ctx, 1, nbytes); - if (bp != NULL) { - memcpy(bp, p, size); - _PyObject_Free(ctx, p); - } - return bp; - } -#ifdef WITH_VALGRIND - redirect: -#endif - /* We're not managing this block. If nbytes <= - * SMALL_REQUEST_THRESHOLD, it's tempting to try to take over this - * block. However, if we do, we need to copy the valid data from - * the C-managed block to one of our blocks, and there's no portable - * way to know how much of the memory space starting at p is valid. - * As bug 1185883 pointed out the hard way, it's possible that the - * C-managed block is "at the end" of allocated VM space, so that - * a memory fault can occur if we try to copy nbytes bytes starting - * at p. Instead we punt: let C continue to manage this block. 
- */ - if (nbytes) - return PyMem_RawRealloc(p, nbytes); - /* C doesn't define the result of realloc(p, 0) (it may or may not - * return NULL then), but Python's docs promise that nbytes==0 never - * returns NULL. We don't pass 0 to realloc(), to avoid that endcase - * to begin with. Even then, we can't be sure that realloc() won't - * return NULL. - */ - bp = PyMem_RawRealloc(p, 1); - return bp ? bp : p; -} - -#else /* ! WITH_PYMALLOC */ - -/*==========================================================================*/ -/* pymalloc not enabled: Redirect the entry points to malloc. These will - * only be used by extensions that are compiled with pymalloc enabled. */ - -Py_ssize_t -_Py_GetAllocatedBlocks(void) -{ - return 0; -} - -#endif /* WITH_PYMALLOC */ - - -/*==========================================================================*/ -/* A x-platform debugging allocator. This doesn't manage memory directly, - * it wraps a real allocator, adding extra debugging info to the memory blocks. - */ - -/* Special bytes broadcast into debug memory blocks at appropriate times. - * Strings of these are unlikely to be valid addresses, floats, ints or - * 7-bit ASCII. - */ -#undef CLEANBYTE -#undef DEADBYTE -#undef FORBIDDENBYTE -#define CLEANBYTE 0xCB /* clean (newly allocated) memory */ -#define DEADBYTE 0xDB /* dead (newly freed) memory */ -#define FORBIDDENBYTE 0xFB /* untouchable bytes at each end of a block */ - -static size_t serialno = 0; /* incremented on each debug {m,re}alloc */ - -/* serialno is always incremented via calling this routine. The point is - * to supply a single place to set a breakpoint. - */ -static void -bumpserialno(void) -{ - ++serialno; -} - -#define SST SIZEOF_SIZE_T - -/* Read sizeof(size_t) bytes at p as a big-endian size_t. */ -static size_t -read_size_t(const void *p) -{ - const uint8_t *q = (const uint8_t *)p; - size_t result = *q++; - int i; - - for (i = SST; --i > 0; ++q) - ... [truncated]
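The three-clause range test described in the removed obmalloc comment above can be sketched in isolation. This is a minimal illustration, not the removed code: `sketch_arena`, `sketch_address_in_range`, and `SKETCH_ARENA_SIZE` are hypothetical names, the arena size is an assumed value, and the arena index is passed in directly rather than read from a pool header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed arena size for this sketch only. */
#define SKETCH_ARENA_SIZE ((uintptr_t)(256 << 10))

/* Hypothetical, simplified stand-in for an arena table entry. */
struct sketch_arena {
    uintptr_t address;   /* 0 means the slot is not tied to an arena */
};

/* Return true only if p lies inside the arena named by arenaindex. */
static bool
sketch_address_in_range(uintptr_t p, unsigned int arenaindex,
                        const struct sketch_arena *arenas,
                        unsigned int maxarenas)
{
    assert(arenas != NULL);

    /* Clause 1: the index must name a slot in the table. */
    if (arenaindex >= maxarenas)
        return false;

    uintptr_t base = arenas[arenaindex].address;

    /* Clause 2: unsigned subtraction wraps to a huge value when
     * p < base, so a single comparison checks both bounds. */
    if (p - base >= SKETCH_ARENA_SIZE)
        return false;

    /* Clause 3: an unused slot has address 0; without this check a
     * small p (p < SKETCH_ARENA_SIZE) would be accepted by mistake. */
    return base != 0;
}
```

A caller would build a small `sketch_arena` table and probe addresses just inside and just outside an arena. The third clause only matters for addresses below `SKETCH_ARENA_SIZE` paired with an unused slot, which is the corner case the original comment singles out.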
ast3/Pgen/parsetok_pgen.c+0 −2 removed@@ -1,2 +0,0 @@ -#define PGEN -#include "parsetok.c"
ast3/Pgen/pgen.c+0 −724 removed@@ -1,724 +0,0 @@ -/* Parser generator */ - -/* For a description, see the comments at end of this file */ - -#include "Python.h" -#include "pgenheaders.h" -#include "token.h" -#include "node.h" -#include "grammar.h" -#include "metagrammar.h" -#include "pgen.h" - -extern int Py_DebugFlag; -extern int Py_IgnoreEnvironmentFlag; /* needed by Py_GETENV */ - - -/* PART ONE -- CONSTRUCT NFA -- Cf. Algorithm 3.2 from [Aho&Ullman 77] */ - -typedef struct _nfaarc { - int ar_label; - int ar_arrow; -} nfaarc; - -typedef struct _nfastate { - int st_narcs; - nfaarc *st_arc; -} nfastate; - -typedef struct _nfa { - int nf_type; - char *nf_name; - int nf_nstates; - nfastate *nf_state; - int nf_start, nf_finish; -} nfa; - -/* Forward */ -static void compile_rhs(labellist *ll, - nfa *nf, node *n, int *pa, int *pb); -static void compile_alt(labellist *ll, - nfa *nf, node *n, int *pa, int *pb); -static void compile_item(labellist *ll, - nfa *nf, node *n, int *pa, int *pb); -static void compile_atom(labellist *ll, - nfa *nf, node *n, int *pa, int *pb); - -static int -addnfastate(nfa *nf) -{ - nfastate *st; - - nf->nf_state = (nfastate *)PyObject_REALLOC(nf->nf_state, - sizeof(nfastate) * (nf->nf_nstates + 1)); - if (nf->nf_state == NULL) - Py_FatalError("out of mem"); - st = &nf->nf_state[nf->nf_nstates++]; - st->st_narcs = 0; - st->st_arc = NULL; - return st - nf->nf_state; -} - -static void -addnfaarc(nfa *nf, int from, int to, int lbl) -{ - nfastate *st; - nfaarc *ar; - - st = &nf->nf_state[from]; - st->st_arc = (nfaarc *)PyObject_REALLOC(st->st_arc, - sizeof(nfaarc) * (st->st_narcs + 1)); - if (st->st_arc == NULL) - Py_FatalError("out of mem"); - ar = &st->st_arc[st->st_narcs++]; - ar->ar_label = lbl; - ar->ar_arrow = to; -} - -static nfa * -newnfa(char *name) -{ - nfa *nf; - static int type = NT_OFFSET; /* All types will be disjunct */ - - nf = (nfa *)PyObject_MALLOC(sizeof(nfa)); - if (nf == NULL) - Py_FatalError("no mem for new nfa"); - nf->nf_type 
= type++; - nf->nf_name = name; /* XXX strdup(name) ??? */ - nf->nf_nstates = 0; - nf->nf_state = NULL; - nf->nf_start = nf->nf_finish = -1; - return nf; -} - -typedef struct _nfagrammar { - int gr_nnfas; - nfa **gr_nfa; - labellist gr_ll; -} nfagrammar; - -/* Forward */ -static void compile_rule(nfagrammar *gr, node *n); - -static nfagrammar * -newnfagrammar(void) -{ - nfagrammar *gr; - - gr = (nfagrammar *)PyObject_MALLOC(sizeof(nfagrammar)); - if (gr == NULL) - Py_FatalError("no mem for new nfa grammar"); - gr->gr_nnfas = 0; - gr->gr_nfa = NULL; - gr->gr_ll.ll_nlabels = 0; - gr->gr_ll.ll_label = NULL; - addlabel(&gr->gr_ll, ENDMARKER, "EMPTY"); - return gr; -} - -static void -freenfagrammar(nfagrammar *gr) -{ - for (int i = 0; i < gr->gr_nnfas; i++) { - PyObject_FREE(gr->gr_nfa[i]->nf_state); - } - PyObject_FREE(gr->gr_nfa); - PyObject_FREE(gr); -} - -static nfa * -addnfa(nfagrammar *gr, char *name) -{ - nfa *nf; - - nf = newnfa(name); - gr->gr_nfa = (nfa **)PyObject_REALLOC(gr->gr_nfa, - sizeof(nfa*) * (gr->gr_nnfas + 1)); - if (gr->gr_nfa == NULL) - Py_FatalError("out of mem"); - gr->gr_nfa[gr->gr_nnfas++] = nf; - addlabel(&gr->gr_ll, NAME, nf->nf_name); - return nf; -} - -#ifdef Py_DEBUG - -static const char REQNFMT[] = "metacompile: less than %d children\n"; - -#define REQN(i, count) do { \ - if (i < count) { \ - fprintf(stderr, REQNFMT, count); \ - Py_FatalError("REQN"); \ - } \ -} while (0) - -#else -#define REQN(i, count) /* empty */ -#endif - -static nfagrammar * -metacompile(node *n) -{ - nfagrammar *gr; - int i; - - if (Py_DebugFlag) - printf("Compiling (meta-) parse tree into NFA grammar\n"); - gr = newnfagrammar(); - REQ(n, MSTART); - i = n->n_nchildren - 1; /* Last child is ENDMARKER */ - n = n->n_child; - for (; --i >= 0; n++) { - if (n->n_type != NEWLINE) - compile_rule(gr, n); - } - return gr; -} - -static void -compile_rule(nfagrammar *gr, node *n) -{ - nfa *nf; - - REQ(n, RULE); - REQN(n->n_nchildren, 4); - n = n->n_child; - REQ(n, NAME); - nf 
= addnfa(gr, n->n_str); - n++; - REQ(n, COLON); - n++; - REQ(n, RHS); - compile_rhs(&gr->gr_ll, nf, n, &nf->nf_start, &nf->nf_finish); - n++; - REQ(n, NEWLINE); -} - -static void -compile_rhs(labellist *ll, nfa *nf, node *n, int *pa, int *pb) -{ - int i; - int a, b; - - REQ(n, RHS); - i = n->n_nchildren; - REQN(i, 1); - n = n->n_child; - REQ(n, ALT); - compile_alt(ll, nf, n, pa, pb); - if (--i <= 0) - return; - n++; - a = *pa; - b = *pb; - *pa = addnfastate(nf); - *pb = addnfastate(nf); - addnfaarc(nf, *pa, a, EMPTY); - addnfaarc(nf, b, *pb, EMPTY); - for (; --i >= 0; n++) { - REQ(n, VBAR); - REQN(i, 1); - --i; - n++; - REQ(n, ALT); - compile_alt(ll, nf, n, &a, &b); - addnfaarc(nf, *pa, a, EMPTY); - addnfaarc(nf, b, *pb, EMPTY); - } -} - -static void -compile_alt(labellist *ll, nfa *nf, node *n, int *pa, int *pb) -{ - int i; - int a, b; - - REQ(n, ALT); - i = n->n_nchildren; - REQN(i, 1); - n = n->n_child; - REQ(n, ITEM); - compile_item(ll, nf, n, pa, pb); - --i; - n++; - for (; --i >= 0; n++) { - REQ(n, ITEM); - compile_item(ll, nf, n, &a, &b); - addnfaarc(nf, *pb, a, EMPTY); - *pb = b; - } -} - -static void -compile_item(labellist *ll, nfa *nf, node *n, int *pa, int *pb) -{ - int i; - int a, b; - - REQ(n, ITEM); - i = n->n_nchildren; - REQN(i, 1); - n = n->n_child; - if (n->n_type == LSQB) { - REQN(i, 3); - n++; - REQ(n, RHS); - *pa = addnfastate(nf); - *pb = addnfastate(nf); - addnfaarc(nf, *pa, *pb, EMPTY); - compile_rhs(ll, nf, n, &a, &b); - addnfaarc(nf, *pa, a, EMPTY); - addnfaarc(nf, b, *pb, EMPTY); - REQN(i, 1); - n++; - REQ(n, RSQB); - } - else { - compile_atom(ll, nf, n, pa, pb); - if (--i <= 0) - return; - n++; - addnfaarc(nf, *pb, *pa, EMPTY); - if (n->n_type == STAR) - *pb = *pa; - else - REQ(n, PLUS); - } -} - -static void -compile_atom(labellist *ll, nfa *nf, node *n, int *pa, int *pb) -{ - int i; - - REQ(n, ATOM); - i = n->n_nchildren; - (void)i; /* Don't warn about set but unused */ - REQN(i, 1); - n = n->n_child; - if (n->n_type == LPAR) { - 
REQN(i, 3); - n++; - REQ(n, RHS); - compile_rhs(ll, nf, n, pa, pb); - n++; - REQ(n, RPAR); - } - else if (n->n_type == NAME || n->n_type == STRING) { - *pa = addnfastate(nf); - *pb = addnfastate(nf); - addnfaarc(nf, *pa, *pb, addlabel(ll, n->n_type, n->n_str)); - } - else - REQ(n, NAME); -} - -static void -dumpstate(labellist *ll, nfa *nf, int istate) -{ - nfastate *st; - int i; - nfaarc *ar; - - printf("%c%2d%c", - istate == nf->nf_start ? '*' : ' ', - istate, - istate == nf->nf_finish ? '.' : ' '); - st = &nf->nf_state[istate]; - ar = st->st_arc; - for (i = 0; i < st->st_narcs; i++) { - if (i > 0) - printf("\n "); - printf("-> %2d %s", ar->ar_arrow, - Ta3Grammar_LabelRepr(&ll->ll_label[ar->ar_label])); - ar++; - } - printf("\n"); -} - -static void -dumpnfa(labellist *ll, nfa *nf) -{ - int i; - - printf("NFA '%s' has %d states; start %d, finish %d\n", - nf->nf_name, nf->nf_nstates, nf->nf_start, nf->nf_finish); - for (i = 0; i < nf->nf_nstates; i++) - dumpstate(ll, nf, i); -} - - -/* PART TWO -- CONSTRUCT DFA -- Algorithm 3.1 from [Aho&Ullman 77] */ - -static void -addclosure(bitset ss, nfa *nf, int istate) -{ - if (addbit(ss, istate)) { - nfastate *st = &nf->nf_state[istate]; - nfaarc *ar = st->st_arc; - int i; - - for (i = st->st_narcs; --i >= 0; ) { - if (ar->ar_label == EMPTY) - addclosure(ss, nf, ar->ar_arrow); - ar++; - } - } -} - -typedef struct _ss_arc { - bitset sa_bitset; - int sa_arrow; - int sa_label; -} ss_arc; - -typedef struct _ss_state { - bitset ss_ss; - int ss_narcs; - struct _ss_arc *ss_arc; - int ss_deleted; - int ss_finish; - int ss_rename; -} ss_state; - -typedef struct _ss_dfa { - int sd_nstates; - ss_state *sd_state; -} ss_dfa; - -/* Forward */ -static void printssdfa(int xx_nstates, ss_state *xx_state, int nbits, - labellist *ll, const char *msg); -static void simplify(int xx_nstates, ss_state *xx_state); -static void convert(dfa *d, int xx_nstates, ss_state *xx_state); - -static void -makedfa(nfagrammar *gr, nfa *nf, dfa *d) -{ - int 
nbits = nf->nf_nstates; - bitset ss; - int xx_nstates; - ss_state *xx_state, *yy; - ss_arc *zz; - int istate, jstate, iarc, jarc, ibit; - nfastate *st; - nfaarc *ar; - - ss = newbitset(nbits); - addclosure(ss, nf, nf->nf_start); - xx_state = (ss_state *)PyObject_MALLOC(sizeof(ss_state)); - if (xx_state == NULL) - Py_FatalError("no mem for xx_state in makedfa"); - xx_nstates = 1; - yy = &xx_state[0]; - yy->ss_ss = ss; - yy->ss_narcs = 0; - yy->ss_arc = NULL; - yy->ss_deleted = 0; - yy->ss_finish = testbit(ss, nf->nf_finish); - if (yy->ss_finish) - printf("Error: nonterminal '%s' may produce empty.\n", - nf->nf_name); - - /* This algorithm is from a book written before - the invention of structured programming... */ - - /* For each unmarked state... */ - for (istate = 0; istate < xx_nstates; ++istate) { - size_t size; - yy = &xx_state[istate]; - ss = yy->ss_ss; - /* For all its states... */ - for (ibit = 0; ibit < nf->nf_nstates; ++ibit) { - if (!testbit(ss, ibit)) - continue; - st = &nf->nf_state[ibit]; - /* For all non-empty arcs from this state... 
*/ - for (iarc = 0; iarc < st->st_narcs; iarc++) { - ar = &st->st_arc[iarc]; - if (ar->ar_label == EMPTY) - continue; - /* Look up in list of arcs from this state */ - for (jarc = 0; jarc < yy->ss_narcs; ++jarc) { - zz = &yy->ss_arc[jarc]; - if (ar->ar_label == zz->sa_label) - goto found; - } - /* Add new arc for this state */ - size = sizeof(ss_arc) * (yy->ss_narcs + 1); - yy->ss_arc = (ss_arc *)PyObject_REALLOC( - yy->ss_arc, size); - if (yy->ss_arc == NULL) - Py_FatalError("out of mem"); - zz = &yy->ss_arc[yy->ss_narcs++]; - zz->sa_label = ar->ar_label; - zz->sa_bitset = newbitset(nbits); - zz->sa_arrow = -1; - found: ; - /* Add destination */ - addclosure(zz->sa_bitset, nf, ar->ar_arrow); - } - } - /* Now look up all the arrow states */ - for (jarc = 0; jarc < xx_state[istate].ss_narcs; jarc++) { - zz = &xx_state[istate].ss_arc[jarc]; - for (jstate = 0; jstate < xx_nstates; jstate++) { - if (samebitset(zz->sa_bitset, - xx_state[jstate].ss_ss, nbits)) { - zz->sa_arrow = jstate; - goto done; - } - } - size = sizeof(ss_state) * (xx_nstates + 1); - xx_state = (ss_state *)PyObject_REALLOC(xx_state, - size); - if (xx_state == NULL) - Py_FatalError("out of mem"); - zz->sa_arrow = xx_nstates; - yy = &xx_state[xx_nstates++]; - yy->ss_ss = zz->sa_bitset; - yy->ss_narcs = 0; - yy->ss_arc = NULL; - yy->ss_deleted = 0; - yy->ss_finish = testbit(yy->ss_ss, nf->nf_finish); - done: ; - } - } - - if (Py_DebugFlag) - printssdfa(xx_nstates, xx_state, nbits, &gr->gr_ll, - "before minimizing"); - - simplify(xx_nstates, xx_state); - - if (Py_DebugFlag) - printssdfa(xx_nstates, xx_state, nbits, &gr->gr_ll, - "after minimizing"); - - convert(d, xx_nstates, xx_state); - - for (int i = 0; i < xx_nstates; i++) { - for (int j = 0; j < xx_state[i].ss_narcs; j++) - delbitset(xx_state[i].ss_arc[j].sa_bitset); - PyObject_FREE(xx_state[i].ss_arc); - } - PyObject_FREE(xx_state); -} - -static void -printssdfa(int xx_nstates, ss_state *xx_state, int nbits, - labellist *ll, const char *msg) -{ - 
int i, ibit, iarc; - ss_state *yy; - ss_arc *zz; - - printf("Subset DFA %s\n", msg); - for (i = 0; i < xx_nstates; i++) { - yy = &xx_state[i]; - if (yy->ss_deleted) - continue; - printf(" Subset %d", i); - if (yy->ss_finish) - printf(" (finish)"); - printf(" { "); - for (ibit = 0; ibit < nbits; ibit++) { - if (testbit(yy->ss_ss, ibit)) - printf("%d ", ibit); - } - printf("}\n"); - for (iarc = 0; iarc < yy->ss_narcs; iarc++) { - zz = &yy->ss_arc[iarc]; - printf(" Arc to state %d, label %s\n", - zz->sa_arrow, - Ta3Grammar_LabelRepr( - &ll->ll_label[zz->sa_label])); - } - } -} - - -/* PART THREE -- SIMPLIFY DFA */ - -/* Simplify the DFA by repeatedly eliminating states that are - equivalent to another oner. This is NOT Algorithm 3.3 from - [Aho&Ullman 77]. It does not always finds the minimal DFA, - but it does usually make a much smaller one... (For an example - of sub-optimal behavior, try S: x a b+ | y a b+.) -*/ - -static int -samestate(ss_state *s1, ss_state *s2) -{ - int i; - - if (s1->ss_narcs != s2->ss_narcs || s1->ss_finish != s2->ss_finish) - return 0; - for (i = 0; i < s1->ss_narcs; i++) { - if (s1->ss_arc[i].sa_arrow != s2->ss_arc[i].sa_arrow || - s1->ss_arc[i].sa_label != s2->ss_arc[i].sa_label) - return 0; - } - return 1; -} - -static void -renamestates(int xx_nstates, ss_state *xx_state, int from, int to) -{ - int i, j; - - if (Py_DebugFlag) - printf("Rename state %d to %d.\n", from, to); - for (i = 0; i < xx_nstates; i++) { - if (xx_state[i].ss_deleted) - continue; - for (j = 0; j < xx_state[i].ss_narcs; j++) { - if (xx_state[i].ss_arc[j].sa_arrow == from) - xx_state[i].ss_arc[j].sa_arrow = to; - } - } -} - -static void -simplify(int xx_nstates, ss_state *xx_state) -{ - int changes; - int i, j; - - do { - changes = 0; - for (i = 1; i < xx_nstates; i++) { - if (xx_state[i].ss_deleted) - continue; - for (j = 0; j < i; j++) { - if (xx_state[j].ss_deleted) - continue; - if (samestate(&xx_state[i], &xx_state[j])) { - xx_state[i].ss_deleted++; - 
renamestates(xx_nstates, xx_state, - i, j); - changes++; - break; - } - } - } - } while (changes); -} - - -/* PART FOUR -- GENERATE PARSING TABLES */ - -/* Convert the DFA into a grammar that can be used by our parser */ - -static void -convert(dfa *d, int xx_nstates, ss_state *xx_state) -{ - int i, j; - ss_state *yy; - ss_arc *zz; - - for (i = 0; i < xx_nstates; i++) { - yy = &xx_state[i]; - if (yy->ss_deleted) - continue; - yy->ss_rename = addstate(d); - } - - for (i = 0; i < xx_nstates; i++) { - yy = &xx_state[i]; - if (yy->ss_deleted) - continue; - for (j = 0; j < yy->ss_narcs; j++) { - zz = &yy->ss_arc[j]; - addarc(d, yy->ss_rename, - xx_state[zz->sa_arrow].ss_rename, - zz->sa_label); - } - if (yy->ss_finish) - addarc(d, yy->ss_rename, yy->ss_rename, 0); - } - - d->d_initial = 0; -} - - -/* PART FIVE -- GLUE IT ALL TOGETHER */ - -static grammar * -maketables(nfagrammar *gr) -{ - int i; - nfa *nf; - dfa *d; - grammar *g; - - if (gr->gr_nnfas == 0) - return NULL; - g = newgrammar(gr->gr_nfa[0]->nf_type); - /* XXX first rule must be start rule */ - g->g_ll = gr->gr_ll; - - for (i = 0; i < gr->gr_nnfas; i++) { - nf = gr->gr_nfa[i]; - if (Py_DebugFlag) { - printf("Dump of NFA for '%s' ...\n", nf->nf_name); - dumpnfa(&gr->gr_ll, nf); - printf("Making DFA for '%s' ...\n", nf->nf_name); - } - d = adddfa(g, nf->nf_type, nf->nf_name); - makedfa(gr, gr->gr_nfa[i], d); - } - - return g; -} - -grammar * -pgen(node *n) -{ - nfagrammar *gr; - grammar *g; - - gr = metacompile(n); - g = maketables(gr); - translatelabels(g); - addfirstsets(g); - freenfagrammar(gr); - return g; -} - -grammar * -Py_pgen(node *n) -{ - return pgen(n); -} - -/* - -Description ------------ - -Input is a grammar in extended BNF (using * for repetition, + for -at-least-once repetition, [] for optional parts, | for alternatives and -() for grouping). This has already been parsed and turned into a parse -tree. - -Each rule is considered as a regular expression in its own right. 
-It is turned into a Non-deterministic Finite Automaton (NFA), which -is then turned into a Deterministic Finite Automaton (DFA), which is then -optimized to reduce the number of states. See [Aho&Ullman 77] chapter 3, -or similar compiler books (this technique is more often used for lexical -analyzers). - -The DFA's are used by the parser as parsing tables in a special way -that's probably unique. Before they are usable, the FIRST sets of all -non-terminals are computed. - -Reference ---------- - -[Aho&Ullman 77] - Aho&Ullman, Principles of Compiler Design, Addison-Wesley 1977 - (first edition) - -*/
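The NFA-to-DFA step described in the removed pgen.c comment above rests on computing the set of NFA states reachable through empty arcs, which is what `addclosure` does in the removed file. The following is a minimal sketch of that idea under simplifying assumptions: states live in a fixed array, the state set is a 32-bit mask instead of pgen's `bitset`, and every `sk_` name and the `SK_EMPTY` value are hypothetical choices made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SK_EMPTY 0      /* label marking an empty (epsilon) arc in this sketch */
#define SK_MAX_ARCS 4   /* fixed arc capacity, chosen for brevity */

struct sk_arc { int label; int target; };
struct sk_state { int narcs; struct sk_arc arcs[SK_MAX_ARCS]; };

/* Add istate and every state reachable from it through empty arcs
 * to *set, where bit i of *set represents NFA state i. */
static void
sk_add_closure(uint32_t *set, const struct sk_state *states,
               int nstates, int istate)
{
    assert(nstates <= 32 && istate >= 0 && istate < nstates);

    uint32_t bit = (uint32_t)1 << istate;
    if (*set & bit)
        return;         /* already visited; this also ends epsilon cycles */
    *set |= bit;

    for (int i = 0; i < states[istate].narcs; i++) {
        const struct sk_arc *ar = &states[istate].arcs[i];
        if (ar->label == SK_EMPTY)
            sk_add_closure(set, states, nstates, ar->target);
    }
}
```

Starting from an empty mask and the NFA start state, the resulting mask plays the role of the first DFA state. The subset construction then repeats the same closure for the targets of each non-empty label.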
ast3/Pgen/pgenmain.c+0 −189 removed@@ -1,189 +0,0 @@ - -/* Parser generator main program */ - -/* This expects a filename containing the grammar as argv[1] (UNIX) - or asks the console for such a file name (THINK C). - It writes its output on two files in the current directory: - - "graminit.c" gets the grammar as a bunch of initialized data - - "graminit.h" gets the grammar's non-terminals as #defines. - Error messages and status info during the generation process are - written to stdout, or sometimes to stderr. */ - -/* XXX TO DO: - - check for duplicate definitions of names (instead of fatal err) -*/ - -#define PGEN - -#include "Python.h" -#include "pgenheaders.h" -#include "grammar.h" -#include "node.h" -#include "parsetok.h" -#include "pgen.h" - -int Py_DebugFlag; -int Py_VerboseFlag; -int Py_IgnoreEnvironmentFlag; - -/* Forward */ -grammar *getgrammar(const char *filename); - -void Py_Exit(int) _Py_NO_RETURN; - -void -Py_Exit(int sts) -{ - exit(sts); -} - -#ifdef WITH_THREAD -/* Needed by obmalloc.c */ -int PyGILState_Check(void) -{ return 1; } -#endif - -void _PyMem_DumpTraceback(int fd, const void *ptr) -{} - -int -main(int argc, char **argv) -{ - grammar *g; - FILE *fp; - char *filename, *graminit_h, *graminit_c; - - if (argc != 4) { - fprintf(stderr, - "usage: %s grammar graminit.h graminit.c\n", argv[0]); - Py_Exit(2); - } - filename = argv[1]; - graminit_h = argv[2]; - graminit_c = argv[3]; - g = getgrammar(filename); - fp = fopen(graminit_c, "w"); - if (fp == NULL) { - perror(graminit_c); - Py_Exit(1); - } - if (Py_DebugFlag) - printf("Writing %s ...\n", graminit_c); - printgrammar(g, fp); - fclose(fp); - fp = fopen(graminit_h, "w"); - if (fp == NULL) { - perror(graminit_h); - Py_Exit(1); - } - if (Py_DebugFlag) - printf("Writing %s ...\n", graminit_h); - printnonterminals(g, fp); - fclose(fp); - freegrammar(g); - Py_Exit(0); - return 0; /* Make gcc -Wall happy */ -} - -grammar * -getgrammar(const char *filename) -{ - FILE *fp; - node *n; - grammar 
*g0, *g; - perrdetail err; - - fp = fopen(filename, "r"); - if (fp == NULL) { - perror(filename); - Py_Exit(1); - } - g0 = meta_grammar(); - n = Ta3Parser_ParseFile(fp, filename, g0, g0->g_start, - (char *)NULL, (char *)NULL, &err); - fclose(fp); - if (n == NULL) { - fprintf(stderr, "Parsing error %d, line %d.\n", - err.error, err.lineno); - if (err.text != NULL) { - size_t len; - int i; - fprintf(stderr, "%s", err.text); - len = strlen(err.text); - if (len == 0 || err.text[len-1] != '\n') - fprintf(stderr, "\n"); - for (i = 0; i < err.offset; i++) { - if (err.text[i] == '\t') - putc('\t', stderr); - else - putc(' ', stderr); - } - fprintf(stderr, "^\n"); - PyObject_FREE(err.text); - } - Py_Exit(1); - } - g = pgen(n); - Ta3Node_Free(n); - if (g == NULL) { - printf("Bad grammar.\n"); - Py_Exit(1); - } - return g; -} - -/* Can't happen in pgen */ -PyObject* -PyErr_Occurred() -{ - return 0; -} - -void -Py_FatalError(const char *msg) -{ - fprintf(stderr, "pgen: FATAL ERROR: %s\n", msg); - Py_Exit(1); -} - -/* No-nonsense my_readline() for tokenizer.c */ - -char * -PyOS_Readline(FILE *sys_stdin, FILE *sys_stdout, const char *prompt) -{ - size_t n = 1000; - char *p = (char *)PyMem_MALLOC(n); - char *q; - if (p == NULL) - return NULL; - fprintf(stderr, "%s", prompt); - q = fgets(p, n, sys_stdin); - if (q == NULL) { - *p = '\0'; - return p; - } - n = strlen(p); - if (n > 0 && p[n-1] != '\n') - p[n-1] = '\n'; - return (char *)PyMem_REALLOC(p, n+1); -} - -/* No-nonsense fgets */ -char * -Py_UniversalNewlineFgets(char *buf, int n, FILE *stream, PyObject *fobj) -{ - return fgets(buf, n, stream); -} - - -#include <stdarg.h> - -void -PySys_WriteStderr(const char *format, ...) -{ - va_list va; - - va_start(va, format); - vfprintf(stderr, format, va); - va_end(va); -}
ast3/Pgen/printgrammar.c+0 −120 removed@@ -1,120 +0,0 @@ - -/* Print a bunch of C initializers that represent a grammar */ - -#define PGEN - -#include "pgenheaders.h" -#include "grammar.h" - -/* Forward */ -static void printarcs(int, dfa *, FILE *); -static void printstates(grammar *, FILE *); -static void printdfas(grammar *, FILE *); -static void printlabels(grammar *, FILE *); - -void -printgrammar(grammar *g, FILE *fp) -{ - fprintf(fp, "/* Generated by Parser/pgen */\n\n"); - fprintf(fp, "#include \"pgenheaders.h\"\n"); - fprintf(fp, "#include \"grammar.h\"\n"); - fprintf(fp, "extern grammar _Ta3Parser_Grammar;\n"); - printdfas(g, fp); - printlabels(g, fp); - fprintf(fp, "grammar _Ta3Parser_Grammar = {\n"); - fprintf(fp, " %d,\n", g->g_ndfas); - fprintf(fp, " dfas,\n"); - fprintf(fp, " {%d, labels},\n", g->g_ll.ll_nlabels); - fprintf(fp, " %d\n", g->g_start); - fprintf(fp, "};\n"); -} - -void -printnonterminals(grammar *g, FILE *fp) -{ - dfa *d; - int i; - - fprintf(fp, "/* Generated by Parser/pgen */\n\n"); - - d = g->g_dfa; - for (i = g->g_ndfas; --i >= 0; d++) - fprintf(fp, "#define %s %d\n", d->d_name, d->d_type); -} - -static void -printarcs(int i, dfa *d, FILE *fp) -{ - arc *a; - state *s; - int j, k; - - s = d->d_state; - for (j = 0; j < d->d_nstates; j++, s++) { - fprintf(fp, "static arc arcs_%d_%d[%d] = {\n", - i, j, s->s_narcs); - a = s->s_arc; - for (k = 0; k < s->s_narcs; k++, a++) - fprintf(fp, " {%d, %d},\n", a->a_lbl, a->a_arrow); - fprintf(fp, "};\n"); - } -} - -static void -printstates(grammar *g, FILE *fp) -{ - state *s; - dfa *d; - int i, j; - - d = g->g_dfa; - for (i = 0; i < g->g_ndfas; i++, d++) { - printarcs(i, d, fp); - fprintf(fp, "static state states_%d[%d] = {\n", - i, d->d_nstates); - s = d->d_state; - for (j = 0; j < d->d_nstates; j++, s++) - fprintf(fp, " {%d, arcs_%d_%d},\n", - s->s_narcs, i, j); - fprintf(fp, "};\n"); - } -} - -static void -printdfas(grammar *g, FILE *fp) -{ - dfa *d; - int i, j, n; - - printstates(g, fp); - 
fprintf(fp, "static dfa dfas[%d] = {\n", g->g_ndfas); - d = g->g_dfa; - for (i = 0; i < g->g_ndfas; i++, d++) { - fprintf(fp, " {%d, \"%s\", %d, %d, states_%d,\n", - d->d_type, d->d_name, d->d_initial, d->d_nstates, i); - fprintf(fp, " \""); - n = NBYTES(g->g_ll.ll_nlabels); - for (j = 0; j < n; j++) - fprintf(fp, "\\%03o", d->d_first[j] & 0xff); - fprintf(fp, "\"},\n"); - } - fprintf(fp, "};\n"); -} - -static void -printlabels(grammar *g, FILE *fp) -{ - label *l; - int i; - - fprintf(fp, "static label labels[%d] = {\n", g->g_ll.ll_nlabels); - l = g->g_ll.ll_label; - for (i = g->g_ll.ll_nlabels; --i >= 0; l++) { - if (l->lb_str == NULL) - fprintf(fp, " {%d, 0},\n", l->lb_type); - else - fprintf(fp, " {%d, \"%s\"},\n", - l->lb_type, l->lb_str); - } - fprintf(fp, "};\n"); -}
ast3/Pgen/pyctype.c+0 −214 removed@@ -1,214 +0,0 @@ -#include "Python.h" - -/* Our own locale-independent ctype.h-like macros */ - -const unsigned int _Py_ctype_table[256] = { - 0, /* 0x0 '\x00' */ - 0, /* 0x1 '\x01' */ - 0, /* 0x2 '\x02' */ - 0, /* 0x3 '\x03' */ - 0, /* 0x4 '\x04' */ - 0, /* 0x5 '\x05' */ - 0, /* 0x6 '\x06' */ - 0, /* 0x7 '\x07' */ - 0, /* 0x8 '\x08' */ - PY_CTF_SPACE, /* 0x9 '\t' */ - PY_CTF_SPACE, /* 0xa '\n' */ - PY_CTF_SPACE, /* 0xb '\v' */ - PY_CTF_SPACE, /* 0xc '\f' */ - PY_CTF_SPACE, /* 0xd '\r' */ - 0, /* 0xe '\x0e' */ - 0, /* 0xf '\x0f' */ - 0, /* 0x10 '\x10' */ - 0, /* 0x11 '\x11' */ - 0, /* 0x12 '\x12' */ - 0, /* 0x13 '\x13' */ - 0, /* 0x14 '\x14' */ - 0, /* 0x15 '\x15' */ - 0, /* 0x16 '\x16' */ - 0, /* 0x17 '\x17' */ - 0, /* 0x18 '\x18' */ - 0, /* 0x19 '\x19' */ - 0, /* 0x1a '\x1a' */ - 0, /* 0x1b '\x1b' */ - 0, /* 0x1c '\x1c' */ - 0, /* 0x1d '\x1d' */ - 0, /* 0x1e '\x1e' */ - 0, /* 0x1f '\x1f' */ - PY_CTF_SPACE, /* 0x20 ' ' */ - 0, /* 0x21 '!' */ - 0, /* 0x22 '"' */ - 0, /* 0x23 '#' */ - 0, /* 0x24 '$' */ - 0, /* 0x25 '%' */ - 0, /* 0x26 '&' */ - 0, /* 0x27 "'" */ - 0, /* 0x28 '(' */ - 0, /* 0x29 ')' */ - 0, /* 0x2a '*' */ - 0, /* 0x2b '+' */ - 0, /* 0x2c ',' */ - 0, /* 0x2d '-' */ - 0, /* 0x2e '.' */ - 0, /* 0x2f '/' */ - PY_CTF_DIGIT|PY_CTF_XDIGIT, /* 0x30 '0' */ - PY_CTF_DIGIT|PY_CTF_XDIGIT, /* 0x31 '1' */ - PY_CTF_DIGIT|PY_CTF_XDIGIT, /* 0x32 '2' */ - PY_CTF_DIGIT|PY_CTF_XDIGIT, /* 0x33 '3' */ - PY_CTF_DIGIT|PY_CTF_XDIGIT, /* 0x34 '4' */ - PY_CTF_DIGIT|PY_CTF_XDIGIT, /* 0x35 '5' */ - PY_CTF_DIGIT|PY_CTF_XDIGIT, /* 0x36 '6' */ - PY_CTF_DIGIT|PY_CTF_XDIGIT, /* 0x37 '7' */ - PY_CTF_DIGIT|PY_CTF_XDIGIT, /* 0x38 '8' */ - PY_CTF_DIGIT|PY_CTF_XDIGIT, /* 0x39 '9' */ - 0, /* 0x3a ':' */ - 0, /* 0x3b ';' */ - 0, /* 0x3c '<' */ - 0, /* 0x3d '=' */ - 0, /* 0x3e '>' */ - 0, /* 0x3f '?' 
*/ - 0, /* 0x40 '@' */ - PY_CTF_UPPER|PY_CTF_XDIGIT, /* 0x41 'A' */ - PY_CTF_UPPER|PY_CTF_XDIGIT, /* 0x42 'B' */ - PY_CTF_UPPER|PY_CTF_XDIGIT, /* 0x43 'C' */ - PY_CTF_UPPER|PY_CTF_XDIGIT, /* 0x44 'D' */ - PY_CTF_UPPER|PY_CTF_XDIGIT, /* 0x45 'E' */ - PY_CTF_UPPER|PY_CTF_XDIGIT, /* 0x46 'F' */ - PY_CTF_UPPER, /* 0x47 'G' */ - PY_CTF_UPPER, /* 0x48 'H' */ - PY_CTF_UPPER, /* 0x49 'I' */ - PY_CTF_UPPER, /* 0x4a 'J' */ - PY_CTF_UPPER, /* 0x4b 'K' */ - PY_CTF_UPPER, /* 0x4c 'L' */ - PY_CTF_UPPER, /* 0x4d 'M' */ - PY_CTF_UPPER, /* 0x4e 'N' */ - PY_CTF_UPPER, /* 0x4f 'O' */ - PY_CTF_UPPER, /* 0x50 'P' */ - PY_CTF_UPPER, /* 0x51 'Q' */ - PY_CTF_UPPER, /* 0x52 'R' */ - PY_CTF_UPPER, /* 0x53 'S' */ - PY_CTF_UPPER, /* 0x54 'T' */ - PY_CTF_UPPER, /* 0x55 'U' */ - PY_CTF_UPPER, /* 0x56 'V' */ - PY_CTF_UPPER, /* 0x57 'W' */ - PY_CTF_UPPER, /* 0x58 'X' */ - PY_CTF_UPPER, /* 0x59 'Y' */ - PY_CTF_UPPER, /* 0x5a 'Z' */ - 0, /* 0x5b '[' */ - 0, /* 0x5c '\\' */ - 0, /* 0x5d ']' */ - 0, /* 0x5e '^' */ - 0, /* 0x5f '_' */ - 0, /* 0x60 '`' */ - PY_CTF_LOWER|PY_CTF_XDIGIT, /* 0x61 'a' */ - PY_CTF_LOWER|PY_CTF_XDIGIT, /* 0x62 'b' */ - PY_CTF_LOWER|PY_CTF_XDIGIT, /* 0x63 'c' */ - PY_CTF_LOWER|PY_CTF_XDIGIT, /* 0x64 'd' */ - PY_CTF_LOWER|PY_CTF_XDIGIT, /* 0x65 'e' */ - PY_CTF_LOWER|PY_CTF_XDIGIT, /* 0x66 'f' */ - PY_CTF_LOWER, /* 0x67 'g' */ - PY_CTF_LOWER, /* 0x68 'h' */ - PY_CTF_LOWER, /* 0x69 'i' */ - PY_CTF_LOWER, /* 0x6a 'j' */ - PY_CTF_LOWER, /* 0x6b 'k' */ - PY_CTF_LOWER, /* 0x6c 'l' */ - PY_CTF_LOWER, /* 0x6d 'm' */ - PY_CTF_LOWER, /* 0x6e 'n' */ - PY_CTF_LOWER, /* 0x6f 'o' */ - PY_CTF_LOWER, /* 0x70 'p' */ - PY_CTF_LOWER, /* 0x71 'q' */ - PY_CTF_LOWER, /* 0x72 'r' */ - PY_CTF_LOWER, /* 0x73 's' */ - PY_CTF_LOWER, /* 0x74 't' */ - PY_CTF_LOWER, /* 0x75 'u' */ - PY_CTF_LOWER, /* 0x76 'v' */ - PY_CTF_LOWER, /* 0x77 'w' */ - PY_CTF_LOWER, /* 0x78 'x' */ - PY_CTF_LOWER, /* 0x79 'y' */ - PY_CTF_LOWER, /* 0x7a 'z' */ - 0, /* 0x7b '{' */ - 0, /* 0x7c '|' */ - 0, /* 0x7d '}' */ - 0, /* 0x7e 
'~' */ - 0, /* 0x7f '\x7f' */ - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -}; - - -const unsigned char _Py_ctype_tolower[256] = { - 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, - 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, - 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, - 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, - 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, - 0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f, - 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, - 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f, - 0x40, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, - 0x68, 0x69, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e, 0x6f, - 0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77, - 0x78, 0x79, 0x7a, 0x5b, 0x5c, 0x5d, 0x5e, 0x5f, - 0x60, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, - 0x68, 0x69, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e, 0x6f, - 0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77, - 0x78, 0x79, 0x7a, 0x7b, 0x7c, 0x7d, 0x7e, 0x7f, - 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, - 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, - 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, - 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f, - 0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7, - 0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf, - 0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7, - 0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf, - 0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7, - 0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf, - 0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7, - 0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf, - 0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7, - 0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef, - 0xf0, 0xf1, 
0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7, - 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff, -}; - -const unsigned char _Py_ctype_toupper[256] = { - 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, - 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, - 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, - 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, - 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, - 0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f, - 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, - 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f, - 0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, - 0x48, 0x49, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f, - 0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57, - 0x58, 0x59, 0x5a, 0x5b, 0x5c, 0x5d, 0x5e, 0x5f, - 0x60, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, - 0x48, 0x49, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f, - 0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57, - 0x58, 0x59, 0x5a, 0x7b, 0x7c, 0x7d, 0x7e, 0x7f, - 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, - 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, - 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, - 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f, - 0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7, - 0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf, - 0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7, - 0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf, - 0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7, - 0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf, - 0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7, - 0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf, - 0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7, - 0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef, - 0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7, - 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff, -}; -
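The removed file above is almost entirely table data, so it may help to see how such a locale-independent classification table is typically consulted. The following is a minimal sketch under stated assumptions, not the contents of CPython's `pyctype.h`: the `TOY_*` flag values and macro names are hypothetical, the table is filled at run time rather than written out as an initializer, and an ASCII-compatible execution character set is assumed.

```c
#include <assert.h>

/* Hypothetical flag bits, modelled loosely on the PY_CTF_* names above. */
#define TOY_LOWER  0x01u
#define TOY_UPPER  0x02u
#define TOY_DIGIT  0x04u
#define TOY_XDIGIT 0x10u

static unsigned int toy_table[256];

/* Fill the table once; bytes 0x80..0xff keep flag value 0, as in the diff. */
void toy_table_init(void)
{
    for (int c = 0; c < 256; c++) {
        unsigned int f = 0;
        if (c >= '0' && c <= '9')
            f |= TOY_DIGIT | TOY_XDIGIT;
        if (c >= 'a' && c <= 'z')
            f |= TOY_LOWER;
        if (c >= 'A' && c <= 'Z')
            f |= TOY_UPPER;
        if ((c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F'))
            f |= TOY_XDIGIT;
        toy_table[c] = f;
    }
}

/* Mask to 0..255 so a negative `char` value never indexes outside the table. */
#define TOY_CHARMASK(c) ((unsigned char)((c) & 0xff))
#define TOY_ISDIGIT(c)  ((toy_table[TOY_CHARMASK(c)] & TOY_DIGIT) != 0)
#define TOY_ISXDIGIT(c) ((toy_table[TOY_CHARMASK(c)] & TOY_XDIGIT) != 0)
#define TOY_ISUPPER(c)  ((toy_table[TOY_CHARMASK(c)] & TOY_UPPER) != 0)
#define TOY_ISLOWER(c)  ((toy_table[TOY_CHARMASK(c)] & TOY_LOWER) != 0)
```

Because the lookup never calls into `<ctype.h>`, the result does not change with the process locale, and the mask keeps the index in range even when plain `char` is signed.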
ast3/Pgen/tokenizer_pgen.c +0 −2 removed
@@ -1,2 +0,0 @@
-#define PGEN
-#include "tokenizer.c"
ast3/Python/ast.c+360 −171 modified@@ -8,15 +8,63 @@ #include "node.h" #include "ast.h" #include "token.h" +#include "pythonrun.h" #include <assert.h> -#if PY_MINOR_VERSION < 4 -#define PyErr_ProgramTextObject PyErr_ProgramText +// VS 2010 doesn't have <stdbool.h>... +typedef int bool; +#define false 0 +#define true 1 + +#ifndef _PyObject_FastCall +static PyObject * +_PyObject_FastCall(PyObject *func, PyObject *const *args, int nargs) +{ + PyObject *t, *res; + int i; + + t = PyTuple_New(nargs); + if (t == NULL) { + return NULL; + } + for (i = 0; i < nargs; i++) { + if (PyTuple_SetItem(t, i, args[i]) < 0) { + Py_DECREF(t); + return NULL; + } + } + res = PyObject_CallObject(func, t); + Py_DECREF(t); + return res; +} +#endif + +#if PY_MINOR_VERSION < 6 +#define _PyUnicode_EqualToASCIIString(a, b) (PyUnicode_CompareWithASCIIString((a), (b)) == 0) + +static PyObject * +_PyBytes_DecodeEscape(const char *s, + Py_ssize_t len, + const char *errors, + Py_ssize_t unicode, + const char *recode_encoding, + const char **first_invalid_escape) +{ + *first_invalid_escape = NULL; + return PyBytes_DecodeEscape(s, len, errors, unicode, recode_encoding); +} + +PyObject * +_PyUnicode_DecodeUnicodeEscape(const char *s, + Py_ssize_t size, + const char *errors, + const char **first_invalid_escape) +{ + *first_invalid_escape = NULL; + return PyUnicode_DecodeUnicodeEscape(s, size, errors); +} -#define PyMem_RawMalloc PyMem_Malloc -#define PyMem_RawRealloc PyMem_Realloc -#define PyMem_RawFree PyMem_Free #endif static int validate_stmts(asdl_seq *); @@ -112,8 +160,7 @@ expr_context_name(expr_context_ty ctx) case Param: return "Param"; default: - assert(0); - return "(unknown)"; + abort(); } } @@ -601,24 +648,23 @@ struct compiling { PyArena *c_arena; /* Arena for allocating memory. */ PyObject *c_filename; /* filename */ PyObject *c_normalize; /* Normalization function from unicodedata. */ - PyObject *c_normalize_args; /* Normalization argument tuple. 
*/ int c_feature_version; /* Latest minior version of Python for allowed features */ }; static asdl_seq *seq_for_testlist(struct compiling *, const node *); static expr_ty ast_for_expr(struct compiling *, const node *); static stmt_ty ast_for_stmt(struct compiling *, const node *); -static asdl_seq *ast_for_suite(struct compiling *, const node *); +static asdl_seq *ast_for_suite(struct compiling *c, const node *n); static asdl_seq *ast_for_exprlist(struct compiling *, const node *, expr_context_ty); static expr_ty ast_for_testlist(struct compiling *, const node *); static stmt_ty ast_for_classdef(struct compiling *, const node *, asdl_seq *); -static stmt_ty ast_for_with_stmt(struct compiling *, const node *, int); -static stmt_ty ast_for_for_stmt(struct compiling *, const node *, int); +static stmt_ty ast_for_with_stmt(struct compiling *, const node *, bool); +static stmt_ty ast_for_for_stmt(struct compiling *, const node *, bool); /* Note different signature for ast_for_call */ -static expr_ty ast_for_call(struct compiling *, const node *, expr_ty); +static expr_ty ast_for_call(struct compiling *, const node *, expr_ty, bool); static PyObject *parsenumber(struct compiling *, const char *); static expr_ty parsestrplus(struct compiling *, const node *n); @@ -637,12 +683,6 @@ init_normalization(struct compiling *c) Py_DECREF(m); if (!c->c_normalize) return 0; - c->c_normalize_args = Py_BuildValue("(sN)", "NFKC", Py_None); - if (!c->c_normalize_args) { - Py_CLEAR(c->c_normalize); - return 0; - } - PyTuple_SET_ITEM(c->c_normalize_args, 1, NULL); return 1; } @@ -658,15 +698,32 @@ new_identifier(const char *n, struct compiling *c) identifier; if so, normalize to NFKC. 
*/ if (!PyUnicode_IS_ASCII(id)) { PyObject *id2; + PyObject *form; + PyObject *args[2]; + _Py_IDENTIFIER(NFKC); if (!c->c_normalize && !init_normalization(c)) { Py_DECREF(id); return NULL; } - PyTuple_SET_ITEM(c->c_normalize_args, 1, id); - id2 = PyObject_Call(c->c_normalize, c->c_normalize_args, NULL); + form = _PyUnicode_FromId(&PyId_NFKC); + if (form == NULL) { + Py_DECREF(id); + return NULL; + } + args[0] = form; + args[1] = id; + id2 = _PyObject_FastCall(c->c_normalize, args, 2); Py_DECREF(id); if (!id2) return NULL; + if (!PyUnicode_Check(id2)) { + PyErr_Format(PyExc_TypeError, + "unicodedata.normalize() must return a string, not " + "%.200s", + Py_TYPE(id2)->tp_name); + Py_DECREF(id2); + return NULL; + } id = id2; } PyUnicode_InternInPlace(&id); @@ -775,17 +832,16 @@ num_stmts(const node *n) Py_FatalError(buf); } } - assert(0); - return 0; + abort(); } /* Transform the CST rooted at node * to the appropriate AST */ mod_ty Ta3AST_FromNodeObject(const node *n, PyCompilerFlags *flags, - PyObject *filename, int feature_version, - PyArena *arena) + PyObject *filename, int feature_version, + PyArena *arena) { int i, j, k, num; asdl_seq *stmts = NULL; @@ -801,7 +857,6 @@ Ta3AST_FromNodeObject(const node *n, PyCompilerFlags *flags, /* borrowed reference */ c.c_filename = filename; c.c_normalize = NULL; - c.c_normalize_args = NULL; c.c_feature_version = feature_version; if (TYPE(n) == encoding_decl) @@ -848,8 +903,8 @@ Ta3AST_FromNodeObject(const node *n, PyCompilerFlags *flags, for (i = 0; i < num; i++) { type_ignore_ty ti = TypeIgnore(LINENO(CHILD(ch, i)), arena); if (!ti) - goto out; - asdl_seq_SET(type_ignores, i, ti); + goto out; + asdl_seq_SET(type_ignores, i, ti); } res = Module(stmts, type_ignores, arena); @@ -945,15 +1000,13 @@ Ta3AST_FromNodeObject(const node *n, PyCompilerFlags *flags, out: if (c.c_normalize) { Py_DECREF(c.c_normalize); - PyTuple_SET_ITEM(c.c_normalize_args, 1, NULL); - Py_DECREF(c.c_normalize_args); } return res; } mod_ty 
Ta3AST_FromNode(const node *n, PyCompilerFlags *flags, const char *filename_str, - int feature_version, PyArena *arena) + int feature_version, PyArena *arena) { mod_ty mod; PyObject *filename; @@ -1019,14 +1072,14 @@ forbidden_name(struct compiling *c, identifier name, const node *n, int full_checks) { assert(PyUnicode_Check(name)); - if (PyUnicode_CompareWithASCIIString(name, "__debug__") == 0) { + if (_PyUnicode_EqualToASCIIString(name, "__debug__")) { ast_error(c, n, "assignment to keyword"); return 1; } if (full_checks) { const char * const *p; for (p = FORBIDDEN; *p; p++) { - if (PyUnicode_CompareWithASCIIString(name, *p) == 0) { + if (_PyUnicode_EqualToASCIIString(name, *p)) { ast_error(c, n, "assignment to keyword"); return 1; } @@ -1242,6 +1295,7 @@ ast_for_comp_op(struct compiling *c, const node *n) return In; if (strcmp(STR(n), "is") == 0) return Is; + /* fall through */ default: PyErr_Format(PyExc_SystemError, "invalid comp_op: %s", STR(n)); @@ -1256,6 +1310,7 @@ ast_for_comp_op(struct compiling *c, const node *n) return NotIn; if (strcmp(STR(CHILD(n, 0)), "is") == 0) return IsNot; + /* fall through */ default: PyErr_Format(PyExc_SystemError, "invalid comp_op: %s %s", STR(CHILD(n, 0)), STR(CHILD(n, 1))); @@ -1382,7 +1437,7 @@ handle_keywordonly_args(struct compiling *c, const node *n, int start, goto error; asdl_seq_SET(kwonlyargs, j++, arg); i += 1; /* the name */ - if (i < NCH(n) && TYPE(CHILD(n, i)) == COMMA) + if (TYPE(CHILD(n, i)) == COMMA) i += 1; /* the comma, if present */ break; case TYPE_COMMENT: @@ -1486,11 +1541,6 @@ ast_for_arguments(struct compiling *c, const node *n) if (!kwdefaults && nkwonlyargs) return NULL; - if (nposargs + nkwonlyargs > 255) { - ast_error(c, n, "more than 255 arguments"); - return NULL; - } - /* tfpdef: NAME [':' test] vfpdef: NAME */ @@ -1524,7 +1574,7 @@ ast_for_arguments(struct compiling *c, const node *n) return NULL; asdl_seq_SET(posargs, k++, arg); i += 1; /* the name */ - if (i < NCH(n) && TYPE(CHILD(n, i)) == 
COMMA) + if (TYPE(CHILD(n, i)) == COMMA) i += 1; /* the comma, if present */ break; case STAR: @@ -1540,7 +1590,7 @@ ast_for_arguments(struct compiling *c, const node *n) int res = 0; i += 2; /* now follows keyword only arguments */ - if (i < NCH(n) && TYPE(CHILD(n, i)) == TYPE_COMMENT) { + if (TYPE(CHILD(n, i)) == TYPE_COMMENT) { ast_error(c, CHILD(n, i), "bare * has associated type comment"); return NULL; @@ -1556,11 +1606,11 @@ ast_for_arguments(struct compiling *c, const node *n) if (!vararg) return NULL; - i += 2; /* the star and the name */ - if (i < NCH(n) && TYPE(CHILD(n, i)) == COMMA) - i += 1; /* the comma, if present */ + i += 2; /* the star and the name */ + if (TYPE(CHILD(n, i)) == COMMA) + i += 1; /* the comma, if present */ - if (i < NCH(n) && TYPE(CHILD(n, i)) == TYPE_COMMENT) { + if (TYPE(CHILD(n, i)) == TYPE_COMMENT) { vararg->type_comment = NEW_TYPE_COMMENT(CHILD(n, i)); i += 1; } @@ -1582,7 +1632,7 @@ ast_for_arguments(struct compiling *c, const node *n) if (!kwarg) return NULL; i += 2; /* the double star and the name */ - if (i < NCH(n) && TYPE(CHILD(n, i)) == COMMA) + if (TYPE(CHILD(n, i)) == COMMA) i += 1; /* the comma, if present */ break; case TYPE_COMMENT: @@ -1664,7 +1714,7 @@ ast_for_decorator(struct compiling *c, const node *n) name_expr = NULL; } else { - d = ast_for_call(c, CHILD(n, 3), name_expr); + d = ast_for_call(c, CHILD(n, 3), name_expr, true); if (!d) return NULL; name_expr = NULL; @@ -1695,10 +1745,11 @@ ast_for_decorators(struct compiling *c, const node *n) } static stmt_ty -ast_for_funcdef_impl(struct compiling *c, const node *n, - asdl_seq *decorator_seq, int is_async) +ast_for_funcdef_impl(struct compiling *c, const node *n0, + asdl_seq *decorator_seq, bool is_async) { /* funcdef: 'def' NAME parameters ['->' test] ':' [TYPE_COMMENT] suite */ + const node * const n = is_async ? 
CHILD(n0, 1) : n0; identifier name; arguments_ty args; asdl_seq *body; @@ -1709,7 +1760,7 @@ ast_for_funcdef_impl(struct compiling *c, const node *n, if (is_async && c->c_feature_version < 5) { ast_error(c, n, - "Async functions are only supported in Python 3.5 and greater"); + "Async functions are only supported in Python 3.5 and greater"); return NULL; } @@ -1748,53 +1799,53 @@ ast_for_funcdef_impl(struct compiling *c, const node *n, if (is_async) return AsyncFunctionDef(name, args, body, decorator_seq, returns, - type_comment, LINENO(n), - n->n_col_offset, c->c_arena); + type_comment, LINENO(n0), n0->n_col_offset, c->c_arena); else return FunctionDef(name, args, body, decorator_seq, returns, - type_comment, LINENO(n), - n->n_col_offset, c->c_arena); + type_comment, LINENO(n), n->n_col_offset, c->c_arena); } static stmt_ty ast_for_async_funcdef(struct compiling *c, const node *n, asdl_seq *decorator_seq) { - /* async_funcdef: ASYNC funcdef */ + /* async_funcdef: 'async' funcdef */ REQ(n, async_funcdef); - REQ(CHILD(n, 0), ASYNC); + REQ(CHILD(n, 0), NAME); + assert(strcmp(STR(CHILD(n, 0)), "async") == 0); REQ(CHILD(n, 1), funcdef); - return ast_for_funcdef_impl(c, CHILD(n, 1), decorator_seq, - 1 /* is_async */); + return ast_for_funcdef_impl(c, n, decorator_seq, + true /* is_async */); } static stmt_ty ast_for_funcdef(struct compiling *c, const node *n, asdl_seq *decorator_seq) { /* funcdef: 'def' NAME parameters ['->' test] ':' suite */ return ast_for_funcdef_impl(c, n, decorator_seq, - 0 /* is_async */); + false /* is_async */); } static stmt_ty ast_for_async_stmt(struct compiling *c, const node *n) { - /* async_stmt: ASYNC (funcdef | with_stmt | for_stmt) */ + /* async_stmt: 'async' (funcdef | with_stmt | for_stmt) */ REQ(n, async_stmt); - REQ(CHILD(n, 0), ASYNC); + REQ(CHILD(n, 0), NAME); + assert(strcmp(STR(CHILD(n, 0)), "async") == 0); switch (TYPE(CHILD(n, 1))) { case funcdef: - return ast_for_funcdef_impl(c, CHILD(n, 1), NULL, - 1 /* is_async */); + return 
ast_for_funcdef_impl(c, n, NULL, + true /* is_async */); case with_stmt: - return ast_for_with_stmt(c, CHILD(n, 1), - 1 /* is_async */); + return ast_for_with_stmt(c, n, + true /* is_async */); case for_stmt: - return ast_for_for_stmt(c, CHILD(n, 1), - 1 /* is_async */); + return ast_for_for_stmt(c, n, + true /* is_async */); default: PyErr_Format(PyExc_SystemError, @@ -1895,17 +1946,23 @@ static int count_comp_fors(struct compiling *c, const node *n) { int n_fors = 0; - int is_async; count_comp_for: - is_async = 0; n_fors++; REQ(n, comp_for); - if (TYPE(CHILD(n, 0)) == ASYNC) { - is_async = 1; + if (NCH(n) == 2) { + REQ(CHILD(n, 0), NAME); + assert(strcmp(STR(CHILD(n, 0)), "async") == 0); + n = CHILD(n, 1); } - if (NCH(n) == (5 + is_async)) { - n = CHILD(n, 4 + is_async); + else if (NCH(n) == 1) { + n = CHILD(n, 0); + } + else { + goto error; + } + if (NCH(n) == (5)) { + n = CHILD(n, 4); } else { return n_fors; @@ -1924,6 +1981,7 @@ count_comp_fors(struct compiling *c, const node *n) return n_fors; } + error: /* Should never be reached */ PyErr_SetString(PyExc_SystemError, "logic error in count_comp_fors"); @@ -1972,13 +2030,21 @@ ast_for_comprehension(struct compiling *c, const node *n) asdl_seq *t; expr_ty expression, first; node *for_ch; + node *sync_n; int is_async = 0; REQ(n, comp_for); - if (TYPE(CHILD(n, 0)) == ASYNC) { + if (NCH(n) == 2) { is_async = 1; + REQ(CHILD(n, 0), NAME); + assert(strcmp(STR(CHILD(n, 0)), "async") == 0); + sync_n = CHILD(n, 1); } + else { + sync_n = CHILD(n, 0); + } + REQ(sync_n, sync_comp_for); /* Async comprehensions only allowed in Python 3.6 and greater */ if (is_async && c->c_feature_version < 6) { @@ -1987,11 +2053,11 @@ ast_for_comprehension(struct compiling *c, const node *n) return NULL; } - for_ch = CHILD(n, 1 + is_async); + for_ch = CHILD(sync_n, 1); t = ast_for_exprlist(c, for_ch, Store); if (!t) return NULL; - expression = ast_for_expr(c, CHILD(n, 3 + is_async)); + expression = ast_for_expr(c, CHILD(sync_n, 3)); if 
(!expression) return NULL; @@ -2008,11 +2074,11 @@ ast_for_comprehension(struct compiling *c, const node *n) if (!comp) return NULL; - if (NCH(n) == (5 + is_async)) { + if (NCH(sync_n) == 5) { int j, n_ifs; asdl_seq *ifs; - n = CHILD(n, 4 + is_async); + n = CHILD(sync_n, 4); n_ifs = count_comp_ifs(c, n); if (n_ifs == -1) return NULL; @@ -2283,7 +2349,7 @@ ast_for_atom(struct compiling *c, const node *n) "Underscores in numeric literals are only supported in Python 3.6 and greater"); return NULL; } - pynum = parsenumber(c, s); + pynum = parsenumber(c, STR(ch)); if (!pynum) return NULL; @@ -2506,7 +2572,7 @@ ast_for_trailer(struct compiling *c, const node *n, expr_ty left_expr) return Call(left_expr, NULL, NULL, LINENO(n), n->n_col_offset, c->c_arena); else - return ast_for_call(c, CHILD(n, 1), left_expr); + return ast_for_call(c, CHILD(n, 1), left_expr, true); } else if (TYPE(CHILD(n, 0)) == DOT) { PyObject *attr_id = NEW_IDENTIFIER(CHILD(n, 1)); @@ -2635,7 +2701,7 @@ ast_for_atom_expr(struct compiling *c, const node *n) } if (start) { - /* there was an AWAIT */ + /* there was an 'await' */ return Await(e, LINENO(n), n->n_col_offset, c->c_arena); } else { @@ -2700,7 +2766,7 @@ ast_for_expr(struct compiling *c, const node *n) term: factor (('*'|'@'|'/'|'%'|'//') factor)* factor: ('+'|'-'|'~') factor | power power: atom_expr ['**' factor] - atom_expr: [AWAIT] atom trailer* + atom_expr: ['await'] atom trailer* yield_expr: 'yield' [yield_arg] */ @@ -2848,14 +2914,14 @@ ast_for_expr(struct compiling *c, const node *n) } static expr_ty -ast_for_call(struct compiling *c, const node *n, expr_ty func) +ast_for_call(struct compiling *c, const node *n, expr_ty func, bool allowgen) { /* arglist: argument (',' argument)* [','] argument: ( test [comp_for] | '*' test | test '=' test | '**' test ) */ - int i, nargs, nkeywords, ngens; + int i, nargs, nkeywords; int ndoublestars; asdl_seq *args; asdl_seq *keywords; @@ -2864,33 +2930,31 @@ ast_for_call(struct compiling *c, const node 
*n, expr_ty func) nargs = 0; nkeywords = 0; - ngens = 0; for (i = 0; i < NCH(n); i++) { node *ch = CHILD(n, i); if (TYPE(ch) == argument) { if (NCH(ch) == 1) nargs++; - else if (TYPE(CHILD(ch, 1)) == comp_for) - ngens++; + else if (TYPE(CHILD(ch, 1)) == comp_for) { + nargs++; + if (!allowgen) { + ast_error(c, ch, "invalid syntax"); + return NULL; + } + if (NCH(n) > 1) { + ast_error(c, ch, "Generator expression must be parenthesized"); + return NULL; + } + } else if (TYPE(CHILD(ch, 0)) == STAR) nargs++; else /* TYPE(CHILD(ch, 0)) == DOUBLESTAR or keyword argument */ nkeywords++; } } - if (ngens > 1 || (ngens && (nargs || nkeywords))) { - ast_error(c, n, "Generator expression must be parenthesized " - "if not sole argument"); - return NULL; - } - if (nargs + nkeywords + ngens > 255) { - ast_error(c, n, "more than 255 arguments"); - return NULL; - } - - args = _Ta3_asdl_seq_new(nargs + ngens, c->c_arena); + args = _Ta3_asdl_seq_new(nargs, c->c_arena); if (!args) return NULL; keywords = _Ta3_asdl_seq_new(nkeywords, c->c_arena); @@ -3042,6 +3106,7 @@ static stmt_ty ast_for_expr_stmt(struct compiling *c, const node *n) { int num; + REQ(n, expr_stmt); /* expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) | ('=' (yield_expr|testlist_star_expr))* [TYPE_COMMENT]) @@ -3313,6 +3378,7 @@ ast_for_flow_stmt(struct compiling *c, const node *n) } return Raise(expression, cause, LINENO(n), n->n_col_offset, c->c_arena); } + /* fall through */ default: PyErr_Format(PyExc_SystemError, "unexpected flow_stmt: %d", TYPE(ch)); @@ -3424,6 +3490,8 @@ alias_for_import_name(struct compiling *c, const node *n, int store) break; case STAR: str = PyUnicode_InternFromString("*"); + if (!str) + return NULL; if (PyArena_AddPyObject(c->c_arena, str) < 0) { Py_DECREF(str); return NULL; @@ -3855,8 +3923,9 @@ ast_for_while_stmt(struct compiling *c, const node *n) } static stmt_ty -ast_for_for_stmt(struct compiling *c, const node *n, int is_async) +ast_for_for_stmt(struct 
compiling *c, const node *n0, bool is_async) { + const node * const n = is_async ? CHILD(n0, 1) : n0; asdl_seq *_target, *seq = NULL, *suite_seq; expr_ty expression; expr_ty target, first; @@ -3906,12 +3975,12 @@ ast_for_for_stmt(struct compiling *c, const node *n, int is_async) type_comment = NULL; if (is_async) - return AsyncFor(target, expression, suite_seq, seq, - type_comment, LINENO(n), n->n_col_offset, + return AsyncFor(target, expression, suite_seq, seq, type_comment, + LINENO(n0), n0->n_col_offset, c->c_arena); else - return For(target, expression, suite_seq, seq, - type_comment, LINENO(n), n->n_col_offset, + return For(target, expression, suite_seq, seq, type_comment, + LINENO(n), n->n_col_offset, c->c_arena); } @@ -4059,8 +4128,9 @@ ast_for_with_item(struct compiling *c, const node *n) /* with_stmt: 'with' with_item (',' with_item)* ':' [TYPE_COMMENT] suite */ static stmt_ty -ast_for_with_stmt(struct compiling *c, const node *n, int is_async) +ast_for_with_stmt(struct compiling *c, const node *n0, bool is_async) { + const node * const n = is_async ? 
CHILD(n0, 1) : n0; int i, n_items, nch_minus_type, has_type_comment; asdl_seq *items, *body; string type_comment; @@ -4097,7 +4167,7 @@ ast_for_with_stmt(struct compiling *c, const node *n, int is_async) type_comment = NULL; if (is_async) - return AsyncWith(items, body, type_comment, LINENO(n), n->n_col_offset, c->c_arena); + return AsyncWith(items, body, type_comment, LINENO(n0), n0->n_col_offset, c->c_arena); else return With(items, body, type_comment, LINENO(n), n->n_col_offset, c->c_arena); } @@ -4121,21 +4191,21 @@ ast_for_classdef(struct compiling *c, const node *n, asdl_seq *decorator_seq) return NULL; if (forbidden_name(c, classname, CHILD(n, 3), 0)) return NULL; - return ClassDef(classname, NULL, NULL, s, decorator_seq, LINENO(n), - n->n_col_offset, c->c_arena); + return ClassDef(classname, NULL, NULL, s, decorator_seq, + LINENO(n), n->n_col_offset, c->c_arena); } if (TYPE(CHILD(n, 3)) == RPAR) { /* class NAME '(' ')' ':' suite */ - s = ast_for_suite(c, CHILD(n,5)); + s = ast_for_suite(c, CHILD(n, 5)); if (!s) return NULL; classname = NEW_IDENTIFIER(CHILD(n, 1)); if (!classname) return NULL; if (forbidden_name(c, classname, CHILD(n, 3), 0)) return NULL; - return ClassDef(classname, NULL, NULL, s, decorator_seq, LINENO(n), - n->n_col_offset, c->c_arena); + return ClassDef(classname, NULL, NULL, s, decorator_seq, + LINENO(n), n->n_col_offset, c->c_arena); } /* class NAME '(' arglist ')' ':' suite */ @@ -4147,7 +4217,7 @@ ast_for_classdef(struct compiling *c, const node *n, asdl_seq *decorator_seq) if (!dummy_name) return NULL; dummy = Name(dummy_name, Load, LINENO(n), n->n_col_offset, c->c_arena); - call = ast_for_call(c, CHILD(n, 3), dummy); + call = ast_for_call(c, CHILD(n, 3), dummy, false); if (!call) return NULL; } @@ -4294,6 +4364,9 @@ parsenumber(struct compiling *c, const char *s) } /* Create a duplicate without underscores. 
*/ dup = PyMem_Malloc(strlen(s) + 1); + if (dup == NULL) { + return PyErr_NoMemory(); + } end = dup; for (; *s; s++) { if (*s != '_') { @@ -4317,14 +4390,47 @@ decode_utf8(struct compiling *c, const char **sPtr, const char *end) return PyUnicode_DecodeUTF8(t, s - t, NULL); } +static int +warn_invalid_escape_sequence(struct compiling *c, const node *n, + unsigned char first_invalid_escape_char) +{ + PyObject *msg = PyUnicode_FromFormat("invalid escape sequence \\%c", + first_invalid_escape_char); + if (msg == NULL) { + return -1; + } + if (PyErr_WarnExplicitObject(PyExc_DeprecationWarning, msg, + c->c_filename, LINENO(n), + NULL, NULL) < 0) + { + if (PyErr_ExceptionMatches(PyExc_DeprecationWarning)) { + const char *s; + + /* Replace the DeprecationWarning exception with a SyntaxError + to get a more accurate error report */ + PyErr_Clear(); + + s = PyUnicode_AsUTF8(msg); + if (s != NULL) { + ast_error(c, n, s); + } + } + Py_DECREF(msg); + return -1; + } + Py_DECREF(msg); + return 0; +} + static PyObject * decode_unicode_with_escapes(struct compiling *c, const node *n, const char *s, size_t len) { - PyObject *u; + PyObject *v, *u; char *buf; char *p; const char *end; + const char *first_invalid_escape; /* check for integer overflow */ if (len > SIZE_MAX / 6) @@ -4339,9 +4445,11 @@ decode_unicode_with_escapes(struct compiling *c, const node *n, const char *s, while (s < end) { if (*s == '\\') { *p++ = *s++; - if (*s & 0x80) { + if (s >= end || *s & 0x80) { strcpy(p, "u005c"); p += 5; + if (s >= end) + break; } } if (*s & 0x80) { /* XXX inefficient */ @@ -4363,7 +4471,7 @@ decode_unicode_with_escapes(struct compiling *c, const node *n, const char *s, p += 10; } /* Should be impossible to overflow */ - assert(p - buf <= Py_SIZE(u)); + assert(p - buf <= PyBytes_GET_SIZE(u)); Py_DECREF(w); } else { *p++ = *s++; @@ -4372,14 +4480,88 @@ decode_unicode_with_escapes(struct compiling *c, const node *n, const char *s, len = p - buf; s = buf; - return 
PyUnicode_DecodeUnicodeEscape(s, len, NULL); + v = _PyUnicode_DecodeUnicodeEscape(s, len, NULL, &first_invalid_escape); + + if (v != NULL && first_invalid_escape != NULL) { + if (warn_invalid_escape_sequence(c, n, *first_invalid_escape) < 0) { + /* We have not decref u before because first_invalid_escape points + inside u. */ + Py_XDECREF(u); + Py_DECREF(v); + return NULL; + } + } + Py_XDECREF(u); + return v; } static PyObject * decode_bytes_with_escapes(struct compiling *c, const node *n, const char *s, size_t len) { - return PyBytes_DecodeEscape(s, len, NULL, 0, NULL); + const char *first_invalid_escape; + PyObject *result = _PyBytes_DecodeEscape(s, len, NULL, 0, NULL, + &first_invalid_escape); + if (result == NULL) + return NULL; + + if (first_invalid_escape != NULL) { + if (warn_invalid_escape_sequence(c, n, *first_invalid_escape) < 0) { + Py_DECREF(result); + return NULL; + } + } + return result; +} + +/* Shift locations for the given node and all its children by adding `lineno` + and `col_offset` to existing locations. */ +static void fstring_shift_node_locations(node *n, int lineno, int col_offset) +{ + int i; + n->n_col_offset = n->n_col_offset + col_offset; + for (i = 0; i < NCH(n); ++i) { + if (n->n_lineno && n->n_lineno < CHILD(n, i)->n_lineno) { + /* Shifting column offsets unnecessary if there's been newlines. */ + col_offset = 0; + } + fstring_shift_node_locations(CHILD(n, i), lineno, col_offset); + } + n->n_lineno = n->n_lineno + lineno; +} + +/* Fix locations for the given node and its children. + + `parent` is the enclosing node. + `n` is the node which locations are going to be fixed relative to parent. + `expr_str` is the child node's string representation, including braces. +*/ +static void +fstring_fix_node_location(const node *parent, node *n, char *expr_str) +{ + char *substr = NULL; + char *start; + int lines = LINENO(parent) - 1; + int cols = parent->n_col_offset; + /* Find the full fstring to fix location information in `n`. 
*/ + while (parent && parent->n_type != STRING) + parent = parent->n_child; + if (parent && parent->n_str) { + substr = strstr(parent->n_str, expr_str); + if (substr) { + start = substr; + while (start > parent->n_str) { + if (start[0] == '\n') + break; + start--; + } + cols += substr - start; + /* Fix lineno in mulitline strings. */ + while ((substr = strchr(substr + 1, '\n'))) + lines--; + } + } + fstring_shift_node_locations(n, lines, cols); } /* Compile this expression in to an expr_ty. Add parens around the @@ -4389,67 +4571,66 @@ fstring_compile_expr(const char *expr_start, const char *expr_end, struct compiling *c, const node *n) { - int all_whitespace = 1; - int kind; - void *data; PyCompilerFlags cf; + node *mod_n; mod_ty mod; char *str; - PyObject *o, *fstring_name; Py_ssize_t len; - Py_ssize_t i; + const char *s; + PyObject *fstring_name; assert(expr_end >= expr_start); assert(*(expr_start-1) == '{'); assert(*expr_end == '}' || *expr_end == '!' || *expr_end == ':'); - /* We know there are no escapes here, because backslashes are not allowed, - and we know it's utf-8 encoded (per PEP 263). But, in order to check - that each char is not whitespace, we need to decode it to unicode. - Which is unfortunate, but such is life. */ - - /* If the substring is all whitespace, it's an error. We need to catch - this here, and not when we call PyParser_ASTFromString, because turning - the expression '' in to '()' would go from being invalid to valid. */ - /* Note that this code says an empty string is all whitespace. That's - important. There's a test for it: f'{}'. */ - o = PyUnicode_DecodeUTF8(expr_start, expr_end-expr_start, NULL); - if (o == NULL) - return NULL; - len = PyUnicode_GET_LENGTH(o); - kind = PyUnicode_KIND(o); - data = PyUnicode_DATA(o); - for (i = 0; i < len; i++) { - if (!Py_UNICODE_ISSPACE(PyUnicode_READ(kind, data, i))) { - all_whitespace = 0; + /* If the substring is all whitespace, it's an error. 
We need to catch this + here, and not when we call PyParser_SimpleParseStringFlagsFilename, + because turning the expression '' in to '()' would go from being invalid + to valid. */ + for (s = expr_start; s != expr_end; s++) { + char c = *s; + /* The Python parser ignores only the following whitespace + characters (\r already is converted to \n). */ + if (!(c == ' ' || c == '\t' || c == '\n' || c == '\f')) { break; } } - Py_DECREF(o); - if (all_whitespace) { + if (s == expr_end) { ast_error(c, n, "f-string: empty expression not allowed"); return NULL; } - /* Reuse len to be the length of the utf-8 input string. */ len = expr_end - expr_start; /* Allocate 3 extra bytes: open paren, close paren, null byte. */ str = PyMem_RawMalloc(len + 3); - if (str == NULL) + if (str == NULL) { + PyErr_NoMemory(); return NULL; + } str[0] = '('; memcpy(str+1, expr_start, len); str[len+1] = ')'; str[len+2] = 0; cf.cf_flags = PyCF_ONLY_AST; + mod_n = PyParser_SimpleParseStringFlagsFilename(str, "<fstring>", + Py_eval_input, 0); + if (!mod_n) { + PyMem_RawFree(str); + return NULL; + } + /* Reuse str to find the correct column offset. */ + str[0] = '{'; + str[len+1] = '}'; + fstring_fix_node_location(n, mod_n, str); fstring_name = PyUnicode_FromString("<fstring>"); mod = string_object_to_c_ast(str, fstring_name, Py_eval_input, &cf, c->c_feature_version, c->c_arena); Py_DECREF(fstring_name); PyMem_RawFree(str); + Ta3Node_Free(mod_n); if (!mod) return NULL; return mod->v.Expression.body; @@ -4472,59 +4653,68 @@ fstring_find_literal(const char **str, const char *end, int raw, brace (which isn't part of a unicode name escape such as "\N{EULER CONSTANT}"), or the end of the string. 
*/ - const char *literal_start = *str; - const char *literal_end; - int in_named_escape = 0; + const char *s = *str; + const char *literal_start = s; int result = 0; assert(*literal == NULL); - for (; *str < end; (*str)++) { - char ch = **str; - if (!in_named_escape && ch == '{' && (*str)-literal_start >= 2 && - *(*str-2) == '\\' && *(*str-1) == 'N') { - in_named_escape = 1; - } else if (in_named_escape && ch == '}') { - in_named_escape = 0; - } else if (ch == '{' || ch == '}') { + while (s < end) { + char ch = *s++; + if (!raw && ch == '\\' && s < end) { + ch = *s++; + if (ch == 'N') { + if (s < end && *s++ == '{') { + while (s < end && *s++ != '}') { + } + continue; + } + break; + } + if (ch == '{' && warn_invalid_escape_sequence(c, n, ch) < 0) { + return -1; + } + } + if (ch == '{' || ch == '}') { /* Check for doubled braces, but only at the top level. If we checked at every level, then f'{0:{3}}' would fail with the two closing braces. */ if (recurse_lvl == 0) { - if (*str+1 < end && *(*str+1) == ch) { + if (s < end && *s == ch) { /* We're going to tell the caller that the literal ends here, but that they should continue scanning. But also skip over the second brace when we resume scanning. */ - literal_end = *str+1; - *str += 2; + *str = s + 1; result = 1; goto done; } /* Where a single '{' is the start of a new expression, a single '}' is not allowed. */ if (ch == '}') { + *str = s - 1; ast_error(c, n, "f-string: single '}' is not allowed"); return -1; } } /* We're either at a '{', which means we're starting another expression; or a '}', which means we're at the end of this f-string (for a nested format_spec). 
*/ + s--; break; } } - literal_end = *str; - assert(*str <= end); - assert(*str == end || **str == '{' || **str == '}'); + *str = s; + assert(s <= end); + assert(s == end || *s == '{' || *s == '}'); done: - if (literal_start != literal_end) { + if (literal_start != s) { if (raw) *literal = PyUnicode_DecodeUTF8Stateful(literal_start, - literal_end-literal_start, + s - literal_start, NULL, NULL); else *literal = decode_unicode_with_escapes(c, n, literal_start, - literal_end-literal_start); + s - literal_start); if (!*literal) return -1; } @@ -4936,6 +5126,7 @@ ExprList_Finish(ExprList *l, PyArena *arena) typedef struct { PyObject *last_str; ExprList expr_list; + int fmode; } FstringParser; #ifdef NDEBUG @@ -4954,6 +5145,7 @@ static void FstringParser_Init(FstringParser *state) { state->last_str = NULL; + state->fmode = 0; ExprList_Init(&state->expr_list); FstringParser_check_invariants(state); } @@ -5027,6 +5219,7 @@ FstringParser_ConcatFstring(FstringParser *state, const char **str, struct compiling *c, const node *n) { FstringParser_check_invariants(state); + state->fmode = 1; /* Parse the f-string. */ while (1) { @@ -5048,6 +5241,8 @@ FstringParser_ConcatFstring(FstringParser *state, const char **str, /* Do nothing. Just leave last_str alone (and possibly NULL). */ } else if (!state->last_str) { + /* Note that the literal can be zero length, if the + input string is "\\\n" or "\\\r", among others. */ state->last_str = literal; literal = NULL; } else { @@ -5057,8 +5252,6 @@ FstringParser_ConcatFstring(FstringParser *state, const char **str, return -1; literal = NULL; } - assert(!state->last_str || - PyUnicode_GET_LENGTH(state->last_str) != 0); /* We've dealt with the literal now. It can't be leaked on further errors. */ @@ -5118,7 +5311,8 @@ FstringParser_Finish(FstringParser *state, struct compiling *c, /* If we're just a constant string with no expressions, return that. 
*/
-    if(state->expr_list.size == 0) {
+    if (!state->fmode) {
+        assert(!state->expr_list.size);
         if (!state->last_str) {
             /* Create a zero length string. */
             state->last_str = PyUnicode_FromStringAndSize(NULL, 0);
@@ -5142,11 +5336,6 @@ FstringParser_Finish(FstringParser *state, struct compiling *c,
     if (!seq)
         goto error;
 
-    /* If there's only one expression, return it. Otherwise, we need
-       to join them together. */
-    if (seq->size == 1)
-        return seq->elements[0];
-
     return JoinedStr(seq, LINENO(n), n->n_col_offset, c->c_arena);
 
 error:
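The `s >= end ||` test added to `decode_unicode_with_escapes` in the ast.c hunks above guards the byte that follows a backslash. The following is a minimal Python model of that loop, written only to make the control flow easier to follow; the function name `escape_for_unicode_decoder` is hypothetical, and the model assumes the input is valid UTF-8. It is a sketch of the logic shown in the diff, not the library's code.

```python
def escape_for_unicode_decoder(src: bytes) -> bytes:
    """Hypothetical Python model of the rewriting loop in decode_unicode_with_escapes."""
    out = bytearray()
    i, end = 0, len(src)
    while i < end:
        if src[i] == 0x5C:  # backslash
            out.append(src[i])
            i += 1
            # Check the bound first, then look at the next byte. Without the
            # `i >= end` test, src[i] would be read past the end of the input.
            if i >= end or src[i] & 0x80:
                out += b"u005c"  # turns the lone backslash into \u005c
                if i >= end:
                    break
        if src[i] & 0x80:
            # Collect a run of non-ASCII bytes and emit \UXXXXXXXX per code point.
            j = i
            while j < end and src[j] & 0x80:
                j += 1
            for ch in src[i:j].decode("utf-8"):
                out += b"\\U%08x" % ord(ch)
            i = j
        else:
            out.append(src[i])
            i += 1
    return bytes(out)
```

In this model a trailing backslash is rewritten to `\u005c` and the loop stops, so no index at or beyond `end` is ever dereferenced. Removing the `i >= end` test makes the model raise `IndexError` on such input, which is the Python analogue of the unchecked read.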
ast3/Python/graminit.c+131 −118 modified@@ -1875,204 +1875,214 @@ static state states_80[2] = { {2, arcs_80_0}, {1, arcs_80_1}, }; -static arc arcs_81_0[2] = { - {21, 1}, - {102, 2}, +static arc arcs_81_0[1] = { + {102, 1}, }; static arc arcs_81_1[1] = { - {102, 2}, + {67, 2}, }; static arc arcs_81_2[1] = { - {67, 3}, + {103, 3}, }; static arc arcs_81_3[1] = { - {103, 4}, + {113, 4}, }; -static arc arcs_81_4[1] = { - {113, 5}, +static arc arcs_81_4[2] = { + {172, 5}, + {0, 4}, }; -static arc arcs_81_5[2] = { - {172, 6}, +static arc arcs_81_5[1] = { {0, 5}, }; -static arc arcs_81_6[1] = { - {0, 6}, -}; -static state states_81[7] = { - {2, arcs_81_0}, +static state states_81[6] = { + {1, arcs_81_0}, {1, arcs_81_1}, {1, arcs_81_2}, {1, arcs_81_3}, - {1, arcs_81_4}, - {2, arcs_81_5}, - {1, arcs_81_6}, + {2, arcs_81_4}, + {1, arcs_81_5}, }; -static arc arcs_82_0[1] = { - {98, 1}, +static arc arcs_82_0[2] = { + {21, 1}, + {174, 2}, }; static arc arcs_82_1[1] = { - {115, 2}, + {174, 2}, }; -static arc arcs_82_2[2] = { - {172, 3}, +static arc arcs_82_2[1] = { {0, 2}, }; -static arc arcs_82_3[1] = { - {0, 3}, -}; -static state states_82[4] = { - {1, arcs_82_0}, +static state states_82[3] = { + {2, arcs_82_0}, {1, arcs_82_1}, - {2, arcs_82_2}, - {1, arcs_82_3}, + {1, arcs_82_2}, }; static arc arcs_83_0[1] = { - {23, 1}, + {98, 1}, }; static arc arcs_83_1[1] = { - {0, 1}, + {115, 2}, }; -static state states_83[2] = { +static arc arcs_83_2[2] = { + {172, 3}, + {0, 2}, +}; +static arc arcs_83_3[1] = { + {0, 3}, +}; +static state states_83[4] = { {1, arcs_83_0}, {1, arcs_83_1}, + {2, arcs_83_2}, + {1, arcs_83_3}, }; static arc arcs_84_0[1] = { - {175, 1}, + {23, 1}, }; -static arc arcs_84_1[2] = { - {176, 2}, +static arc arcs_84_1[1] = { {0, 1}, }; -static arc arcs_84_2[1] = { - {0, 2}, -}; -static state states_84[3] = { +static state states_84[2] = { {1, arcs_84_0}, - {2, arcs_84_1}, - {1, arcs_84_2}, + {1, arcs_84_1}, }; -static arc arcs_85_0[2] = { - {78, 1}, - {9, 2}, 
+static arc arcs_85_0[1] = { + {176, 1}, }; -static arc arcs_85_1[1] = { - {26, 2}, +static arc arcs_85_1[2] = { + {177, 2}, + {0, 1}, }; static arc arcs_85_2[1] = { {0, 2}, }; static state states_85[3] = { - {2, arcs_85_0}, - {1, arcs_85_1}, + {1, arcs_85_0}, + {2, arcs_85_1}, {1, arcs_85_2}, }; -static arc arcs_86_0[1] = { - {178, 1}, +static arc arcs_86_0[2] = { + {78, 1}, + {9, 2}, }; -static arc arcs_86_1[2] = { - {2, 1}, - {7, 2}, +static arc arcs_86_1[1] = { + {26, 2}, }; static arc arcs_86_2[1] = { {0, 2}, }; static state states_86[3] = { - {1, arcs_86_0}, - {2, arcs_86_1}, + {2, arcs_86_0}, + {1, arcs_86_1}, {1, arcs_86_2}, }; static arc arcs_87_0[1] = { - {13, 1}, + {179, 1}, }; static arc arcs_87_1[2] = { - {179, 2}, - {15, 3}, + {2, 1}, + {7, 2}, }; static arc arcs_87_2[1] = { + {0, 2}, +}; +static state states_87[3] = { + {1, arcs_87_0}, + {2, arcs_87_1}, + {1, arcs_87_2}, +}; +static arc arcs_88_0[1] = { + {13, 1}, +}; +static arc arcs_88_1[2] = { + {180, 2}, {15, 3}, }; -static arc arcs_87_3[1] = { +static arc arcs_88_2[1] = { + {15, 3}, +}; +static arc arcs_88_3[1] = { {25, 4}, }; -static arc arcs_87_4[1] = { +static arc arcs_88_4[1] = { {26, 5}, }; -static arc arcs_87_5[1] = { +static arc arcs_88_5[1] = { {0, 5}, }; -static state states_87[6] = { - {1, arcs_87_0}, - {2, arcs_87_1}, - {1, arcs_87_2}, - {1, arcs_87_3}, - {1, arcs_87_4}, - {1, arcs_87_5}, +static state states_88[6] = { + {1, arcs_88_0}, + {2, arcs_88_1}, + {1, arcs_88_2}, + {1, arcs_88_3}, + {1, arcs_88_4}, + {1, arcs_88_5}, }; -static arc arcs_88_0[3] = { +static arc arcs_89_0[3] = { {26, 1}, {34, 2}, {35, 3}, }; -static arc arcs_88_1[2] = { +static arc arcs_89_1[2] = { {33, 4}, {0, 1}, }; -static arc arcs_88_2[3] = { +static arc arcs_89_2[3] = { {26, 5}, {33, 6}, {0, 2}, }; -static arc arcs_88_3[1] = { +static arc arcs_89_3[1] = { {26, 7}, }; -static arc arcs_88_4[4] = { +static arc arcs_89_4[4] = { {26, 1}, {34, 8}, {35, 3}, {0, 4}, }; -static arc arcs_88_5[2] = { +static arc 
arcs_89_5[2] = { {33, 6}, {0, 5}, }; -static arc arcs_88_6[2] = { +static arc arcs_89_6[2] = { {26, 5}, {35, 3}, }; -static arc arcs_88_7[1] = { +static arc arcs_89_7[1] = { {0, 7}, }; -static arc arcs_88_8[3] = { +static arc arcs_89_8[3] = { {26, 9}, {33, 10}, {0, 8}, }; -static arc arcs_88_9[2] = { +static arc arcs_89_9[2] = { {33, 10}, {0, 9}, }; -static arc arcs_88_10[2] = { +static arc arcs_89_10[2] = { {26, 9}, {35, 3}, }; -static state states_88[11] = { - {3, arcs_88_0}, - {2, arcs_88_1}, - {3, arcs_88_2}, - {1, arcs_88_3}, - {4, arcs_88_4}, - {2, arcs_88_5}, - {2, arcs_88_6}, - {1, arcs_88_7}, - {3, arcs_88_8}, - {2, arcs_88_9}, - {2, arcs_88_10}, -}; -static dfa dfas[89] = { +static state states_89[11] = { + {3, arcs_89_0}, + {2, arcs_89_1}, + {3, arcs_89_2}, + {1, arcs_89_3}, + {4, arcs_89_4}, + {2, arcs_89_5}, + {2, arcs_89_6}, + {1, arcs_89_7}, + {3, arcs_89_8}, + {2, arcs_89_9}, + {2, arcs_89_10}, +}; +static dfa dfas[90] = { {256, "single_input", 0, 3, states_0, - "\004\050\340\000\004\000\000\000\024\174\022\016\144\011\040\004\000\200\041\121\076\204\000"}, + "\004\050\340\000\004\000\000\000\024\174\022\016\144\011\040\004\000\200\041\121\076\004\001"}, {257, "file_input", 0, 2, states_1, - "\204\050\340\000\004\000\000\000\024\174\022\016\144\011\040\004\000\200\041\121\076\204\000"}, + "\204\050\340\000\004\000\000\000\024\174\022\016\144\011\040\004\000\200\041\121\076\004\001"}, {258, "eval_input", 0, 3, states_2, "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\040\004\000\200\041\121\076\000\000"}, {259, "decorator", 0, 7, states_3, @@ -2096,11 +2106,11 @@ static dfa dfas[89] = { {268, "vfpdef", 0, 2, states_12, "\000\000\200\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {269, "stmt", 0, 2, states_13, - "\000\050\340\000\004\000\000\000\024\174\022\016\144\011\040\004\000\200\041\121\076\204\000"}, + "\000\050\340\000\004\000\000\000\024\174\022\016\144\011\040\004\000\200\041\121\076\004\001"}, 
{270, "simple_stmt", 0, 4, states_14, - "\000\040\200\000\004\000\000\000\024\174\022\016\000\000\040\004\000\200\041\121\076\200\000"}, + "\000\040\200\000\004\000\000\000\024\174\022\016\000\000\040\004\000\200\041\121\076\000\001"}, {271, "small_stmt", 0, 2, states_15, - "\000\040\200\000\004\000\000\000\024\174\022\016\000\000\040\004\000\200\041\121\076\200\000"}, + "\000\040\200\000\004\000\000\000\024\174\022\016\000\000\040\004\000\200\041\121\076\000\001"}, {272, "expr_stmt", 0, 6, states_16, "\000\040\200\000\004\000\000\000\000\000\020\000\000\000\040\004\000\200\041\121\076\000\000"}, {273, "annassign", 0, 5, states_17, @@ -2114,15 +2124,15 @@ static dfa dfas[89] = { {277, "pass_stmt", 0, 2, states_21, "\000\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {278, "flow_stmt", 0, 2, states_22, - "\000\000\000\000\000\000\000\000\000\074\000\000\000\000\000\000\000\000\000\000\000\200\000"}, + "\000\000\000\000\000\000\000\000\000\074\000\000\000\000\000\000\000\000\000\000\000\000\001"}, {279, "break_stmt", 0, 2, states_23, "\000\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {280, "continue_stmt", 0, 2, states_24, "\000\000\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {281, "return_stmt", 0, 3, states_25, "\000\000\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {282, "yield_stmt", 0, 2, states_26, - "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\200\000"}, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\001"}, {283, "raise_stmt", 0, 5, states_27, "\000\000\000\000\000\000\000\000\000\040\000\000\000\000\000\000\000\000\000\000\000\000\000"}, {284, "import_stmt", 0, 2, states_28, @@ -2166,7 +2176,7 @@ static dfa dfas[89] = { {303, "except_clause", 0, 5, states_47, 
"\000\000\000\000\000\000\000\000\000\000\000\000\000\100\000\000\000\000\000\000\000\000\000"}, {304, "suite", 0, 7, states_48, - "\004\040\200\000\004\000\000\000\024\174\022\016\000\000\040\004\000\200\041\121\076\200\000"}, + "\004\040\200\000\004\000\000\000\024\174\022\016\000\000\040\004\000\200\041\121\076\000\001"}, {305, "test", 0, 6, states_49, "\000\040\200\000\000\000\000\000\000\000\020\000\000\000\040\004\000\200\041\121\076\000\000"}, {306, "test_nocond", 0, 2, states_50, @@ -2231,24 +2241,26 @@ static dfa dfas[89] = { "\000\040\200\000\014\000\000\000\000\000\020\000\000\000\040\004\000\200\041\121\076\000\000"}, {336, "comp_iter", 0, 2, states_80, "\000\000\040\000\000\000\000\000\000\000\000\000\104\000\000\000\000\000\000\000\000\000\000"}, - {337, "comp_for", 0, 7, states_81, + {337, "sync_comp_for", 0, 6, states_81, + "\000\000\000\000\000\000\000\000\000\000\000\000\100\000\000\000\000\000\000\000\000\000\000"}, + {338, "comp_for", 0, 3, states_82, "\000\000\040\000\000\000\000\000\000\000\000\000\100\000\000\000\000\000\000\000\000\000\000"}, - {338, "comp_if", 0, 4, states_82, + {339, "comp_if", 0, 4, states_83, "\000\000\000\000\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000\000\000"}, - {339, "encoding_decl", 0, 2, states_83, + {340, "encoding_decl", 0, 2, states_84, "\000\000\200\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, - {340, "yield_expr", 0, 3, states_84, - "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\200\000"}, - {341, "yield_arg", 0, 3, states_85, + {341, "yield_expr", 0, 3, states_85, + "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\001"}, + {342, "yield_arg", 0, 3, states_86, "\000\040\200\000\000\000\000\000\000\100\020\000\000\000\040\004\000\200\041\121\076\000\000"}, - {342, "func_type_input", 0, 3, states_86, + {343, "func_type_input", 0, 3, states_87, 
"\000\040\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, - {343, "func_type", 0, 6, states_87, + {344, "func_type", 0, 6, states_88, "\000\040\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"}, - {344, "typelist", 0, 11, states_88, + {345, "typelist", 0, 11, states_89, "\000\040\200\000\014\000\000\000\000\000\020\000\000\000\040\004\000\200\041\121\076\000\000"}, }; -static label labels[180] = { +static label labels[181] = { {0, "EMPTY"}, {256, 0}, {4, 0}, @@ -2300,7 +2312,7 @@ static label labels[180] = { {274, 0}, {273, 0}, {275, 0}, - {340, 0}, + {341, 0}, {314, 0}, {36, 0}, {37, 0}, @@ -2415,24 +2427,25 @@ static label labels[180] = { {1, "None"}, {1, "True"}, {1, "False"}, - {337, 0}, + {338, 0}, {327, 0}, {328, 0}, {329, 0}, {1, "class"}, {335, 0}, {336, 0}, - {338, 0}, {339, 0}, + {337, 0}, + {340, 0}, {1, "yield"}, - {341, 0}, {342, 0}, {343, 0}, {344, 0}, + {345, 0}, }; grammar _Ta3Parser_Grammar = { - 89, + 90, dfas, - {180, labels}, + {181, labels}, 256 };
ast3/Python/Python-ast.c +1485 −1131 modified

ast3/tests/test_basics.py +1 −1 modified
@@ -6,7 +6,7 @@
 # Lowest and highest supported Python 3 minor version (inclusive)
 MIN_VER = 4
-MAX_VER = 6
+MAX_VER = 7
 NEXT_VER = MAX_VER + 1
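The regenerated tables in ast3/Python/graminit.c earlier in this diff are hard to read as raw numbers. As a rough aid, the sketch below copies the three `arcs_82_*` arrays (the new `comp_for` rule) into Python and walks them. The helper name `dfa_accepts` is hypothetical. Two points are assumptions rather than facts stated in the diff: that label 21 is the optional `async` token and label 174 refers to `sync_comp_for`, and that an arc labelled 0 (`EMPTY`) marks an accepting state.

```python
EMPTY = 0  # label index 0 is "EMPTY" in the labels table shown in the diff

# comp_for after the change, copied from arcs_82_0 .. arcs_82_2.
# Each state is a list of (label_index, next_state) arcs.
COMP_FOR_STATES = [
    [(21, 1), (174, 2)],  # arcs_82_0: optional label 21, or label 174 directly
    [(174, 2)],           # arcs_82_1: label 174 must follow label 21
    [(EMPTY, 2)],         # arcs_82_2: assumed to mark the accepting state
]


def dfa_accepts(states, label_sequence):
    """Return True if the label sequence drives the DFA into an accepting state."""
    state = 0
    for label in label_sequence:
        for arc_label, next_state in states[state]:
            if arc_label == label and arc_label != EMPTY:
                state = next_state
                break
        else:
            return False  # no arc for this label in the current state
    # Assumption: an EMPTY arc in the final state means "accepting".
    return any(arc_label == EMPTY for arc_label, _ in states[state])
```

Read this way, the new `comp_for` table accepts label 174 on its own or label 21 followed by label 174, which matches the smaller state count (3 instead of 7) recorded for `comp_for` in the `dfas` table.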
setup.py +1 −1 modified
@@ -67,7 +67,7 @@
         'ast3/Include/asdl.h',
         'ast3/Include/ast.h',
         'ast3/Include/bitset.h',
-        'ast3/Include/compile.h',
+        'ast3/Include/compile-ast3.h',
         'ast3/Include/errcode.h',
         'ast3/Include/graminit.h',
         'ast3/Include/grammar.h',
tools/asdl_c.patch +114 −0 added
@@ -0,0 +1,114 @@
+--- /Users/guido/src/cpython37/Parser/asdl_c.py 2018-09-10 08:18:23.000000000 -0700
++++ ast3/Parser/asdl_c.py 2019-01-15 16:13:24.000000000 -0800
+@@ -270,9 +270,9 @@
+         margs = "a0"
+         for i in range(1, len(args)+1):
+             margs += ", a%d" % i
+-        self.emit("#define %s(%s) _Py_%s(%s)" % (name, margs, name, margs), 0,
++        self.emit("#define %s(%s) _Ta3_%s(%s)" % (name, margs, name, margs), 0,
+                 reflow=False)
+-        self.emit("%s _Py_%s(%s);" % (ctype, name, argstr), False)
++        self.emit("%s _Ta3_%s(%s);" % (ctype, name, argstr), False)
+
+     def visitProduct(self, prod, name):
+         self.emit_function(name, get_c_type(name),
+@@ -531,9 +531,9 @@
+             self.emit("}", depth+1)
+             self.emit("len = PyList_GET_SIZE(tmp);", depth+1)
+             if self.isSimpleType(field):
+-                self.emit("%s = _Py_asdl_int_seq_new(len, arena);" % field.name, depth+1)
++                self.emit("%s = _Ta3_asdl_int_seq_new(len, arena);" % field.name, depth+1)
+             else:
+-                self.emit("%s = _Py_asdl_seq_new(len, arena);" % field.name, depth+1)
++                self.emit("%s = _Ta3_asdl_seq_new(len, arena);" % field.name, depth+1)
+             self.emit("if (%s == NULL) goto failed;" % field.name, depth+1)
+             self.emit("for (i = 0; i < len; i++) {", depth+1)
+             self.emit("%s val;" % ctype, depth+2)
+@@ -729,8 +729,8 @@
+ };
+
+ static PyTypeObject AST_type = {
+-    PyVarObject_HEAD_INIT(&PyType_Type, 0)
+-    "_ast.AST",
++    PyVarObject_HEAD_INIT(NULL, 0)
++    "_ast3.AST",
+     sizeof(AST_object),
+     0,
+     (destructor)ast_dealloc, /* tp_dealloc */
+@@ -774,7 +774,7 @@
+ static PyTypeObject* make_type(char *type, PyTypeObject* base, char**fields, int num_fields)
+ {
+     _Py_IDENTIFIER(__module__);
+-    _Py_IDENTIFIER(_ast);
++    _Py_IDENTIFIER(_ast3);
+     PyObject *fnames, *result;
+     int i;
+     fnames = PyTuple_New(num_fields);
+@@ -791,7 +791,7 @@
+                     type, base,
+                     _PyUnicode_FromId(&PyId__fields), fnames,
+                     _PyUnicode_FromId(&PyId___module__),
+-                    _PyUnicode_FromId(&PyId__ast));
++                    _PyUnicode_FromId(&PyId__ast3));
+     Py_DECREF(fnames);
+     return (PyTypeObject*)result;
+ }
+@@ -1010,11 +1010,16 @@
+ class ASTModuleVisitor(PickleVisitor):
+
+     def visitModule(self, mod):
++        self.emit("PyObject *ast3_parse(PyObject *self, PyObject *args);", 0)
++        self.emit("static PyMethodDef ast3_methods[] = {", 0)
++        self.emit(' {"_parse", ast3_parse, METH_VARARGS, "Parse string into typed AST."},', 0)
++        self.emit(" {NULL, NULL, 0, NULL}", 0)
++        self.emit("};", 0)
+         self.emit("static struct PyModuleDef _astmodule = {", 0)
+-        self.emit(' PyModuleDef_HEAD_INIT, "_ast"', 0)
++        self.emit(' PyModuleDef_HEAD_INIT, "_ast3", NULL, 0, ast3_methods', 0)
+         self.emit("};", 0)
+         self.emit("PyMODINIT_FUNC", 0)
+-        self.emit("PyInit__ast(void)", 0)
++        self.emit("PyInit__ast3(void)", 0)
+         self.emit("{", 0)
+         self.emit("PyObject *m, *d;", 1)
+         self.emit("if (!init_types()) return NULL;", 1)
+@@ -1199,7 +1204,7 @@
+ class PartingShots(StaticVisitor):
+
+     CODE = """
+-PyObject* PyAST_mod2obj(mod_ty t)
++PyObject* Ta3AST_mod2obj(mod_ty t)
+ {
+     if (!init_types())
+         return NULL;
+@@ -1207,7 +1212,7 @@
+ }
+
+ /* mode is 0 for "exec", 1 for "eval" and 2 for "single" input */
+-mod_ty PyAST_obj2mod(PyObject* ast, PyArena* arena, int mode)
++mod_ty Ta3AST_obj2mod(PyObject* ast, PyArena* arena, int mode)
+ {
+     mod_ty res;
+     PyObject *req_type[3];
+@@ -1237,7 +1242,7 @@
+     return res;
+ }
+
+-int PyAST_Check(PyObject* obj)
++int Ta3AST_Check(PyObject* obj)
+ {
+     if (!init_types())
+         return -1;
+@@ -1276,9 +1281,9 @@
+                          PrototypeVisitor(f),
+                          )
+             c.visit(mod)
+-            f.write("PyObject* PyAST_mod2obj(mod_ty t);\n")
+-            f.write("mod_ty PyAST_obj2mod(PyObject* ast, PyArena* arena, int mode);\n")
+-            f.write("int PyAST_Check(PyObject* obj);\n")
++            f.write("PyObject* Ta3AST_mod2obj(mod_ty t);\n")
++            f.write("mod_ty Ta3AST_obj2mod(PyObject* ast, PyArena* arena, int mode);\n")
++            f.write("int Ta3AST_Check(PyObject* obj);\n")
+
+     if C_FILE:
+         with open(C_FILE, "w") as f:
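The first hunk of tools/asdl_c.patch changes the prefix that the generator puts on every AST constructor. A minimal standalone sketch of that string formatting follows. `macro_define` and its parameters are hypothetical names chosen for illustration, and reading the extra macro parameter as the trailing arena argument is an assumption based on calls such as `Module(stmts, type_ignores, arena)` elsewhere in this diff.

```python
def macro_define(name, num_fields, prefix="_Ta3_"):
    """Build the '#define' line the patched generator would emit for one constructor.

    Mirrors the loop in the patch: the macro takes a0 plus one parameter per
    index in range(1, num_fields + 1), so num_fields + 1 parameters in total.
    """
    margs = "a0"
    for i in range(1, num_fields + 1):
        margs += ", a%d" % i
    return "#define %s(%s) %s%s(%s)" % (name, margs, prefix, name, margs)


# With the patch applied (prefix "_Ta3_") versus the upstream generator ("_Py_").
patched = macro_define("Module", 2)
upstream = macro_define("Module", 2, prefix="_Py_")
```

Because call sites use the unprefixed macro name, the rename changes only the exported symbol. This would explain why the ast.c hunks can keep calling `Module(...)` and `arg(...)` unchanged while the extension avoids symbol clashes with the interpreter's own `_Py_` functions, though the diff itself does not state that motivation.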
tools/ast.patch+444 −0 added@@ -0,0 +1,444 @@ +diff --git a/ast3/Python/ast.c b/ast3/Python/ast.c +index e12f8e6..1fa762d 100644 +--- a/ast3/Python/ast.c ++++ b/ast3/Python/ast.c +@@ -665,6 +665,13 @@ new_identifier(const char *n, struct compiling *c) + + #define NEW_IDENTIFIER(n) new_identifier(STR(n), c) + ++static string ++new_type_comment(const char *s, struct compiling *c) ++{ ++ return PyUnicode_DecodeUTF8(s, strlen(s), NULL); ++} ++#define NEW_TYPE_COMMENT(n) new_type_comment(STR(n), c) ++ + static int + ast_error(struct compiling *c, const node *n, const char *errmsg) + { +@@ -734,11 +741,15 @@ num_stmts(const node *n) + case simple_stmt: + return NCH(n) / 2; /* Divide by 2 to remove count of semi-colons */ + case suite: ++ /* suite: simple_stmt | NEWLINE [TYPE_COMMENT NEWLINE] INDENT stmt+ DEDENT */ + if (NCH(n) == 1) + return num_stmts(CHILD(n, 0)); + else { ++ i = 2; + l = 0; +- for (i = 2; i < (NCH(n) - 1); i++) ++ if (TYPE(CHILD(n, 1)) == TYPE_COMMENT) ++ i += 2; ++ for (; i < (NCH(n) - 1); i++) + l += num_stmts(CHILD(n, i)); + return l; + } +@@ -763,10 +774,13 @@ Ta3AST_FromNodeObject(const node *n, PyCompilerFlags *flags, + { + int i, j, k, num; + asdl_seq *stmts = NULL; ++ asdl_seq *type_ignores = NULL; + stmt_ty s; + node *ch; + struct compiling c; + mod_ty res = NULL; ++ asdl_seq *argtypes = NULL; ++ expr_ty ret, arg; + + c.c_arena = arena; + /* borrowed reference */ +@@ -806,7 +820,23 @@ Ta3AST_FromNodeObject(const node *n, PyCompilerFlags *flags, + } + } + } +- res = Module(stmts, arena); ++ ++ /* Type ignores are stored under the ENDMARKER in file_input. 
*/ ++ ch = CHILD(n, NCH(n) - 1); ++ REQ(ch, ENDMARKER); ++ num = NCH(ch); ++ type_ignores = _Ta3_asdl_seq_new(num, arena); ++ if (!type_ignores) ++ goto out; ++ ++ for (i = 0; i < num; i++) { ++ type_ignore_ty ti = TypeIgnore(LINENO(CHILD(ch, i)), arena); ++ if (!ti) ++ goto out; ++ asdl_seq_SET(type_ignores, i, ti); ++ } ++ ++ res = Module(stmts, type_ignores, arena); + break; + case eval_input: { + expr_ty testlist_ast; +@@ -857,6 +887,40 @@ Ta3AST_FromNodeObject(const node *n, PyCompilerFlags *flags, + res = Interactive(stmts, arena); + } + break; ++ case func_type_input: ++ n = CHILD(n, 0); ++ REQ(n, func_type); ++ ++ if (TYPE(CHILD(n, 1)) == typelist) { ++ ch = CHILD(n, 1); ++ /* this is overly permissive -- we don't pay any attention to ++ * stars on the args -- just parse them into an ordered list */ ++ num = 0; ++ for (i = 0; i < NCH(ch); i++) { ++ if (TYPE(CHILD(ch, i)) == test) ++ num++; ++ } ++ ++ argtypes = _Ta3_asdl_seq_new(num, arena); ++ ++ j = 0; ++ for (i = 0; i < NCH(ch); i++) { ++ if (TYPE(CHILD(ch, i)) == test) { ++ arg = ast_for_expr(&c, CHILD(ch, i)); ++ if (!arg) ++ goto out; ++ asdl_seq_SET(argtypes, j++, arg); ++ } ++ } ++ } ++ else ++ argtypes = _Ta3_asdl_seq_new(0, arena); ++ ++ ret = ast_for_expr(&c, CHILD(n, NCH(n) - 1)); ++ if (!ret) ++ goto out; ++ res = FunctionType(argtypes, ret, arena); ++ break; + default: + PyErr_Format(PyExc_SystemError, + "invalid node %d for Ta3AST_FromNode", TYPE(n)); +@@ -1250,7 +1314,7 @@ ast_for_arg(struct compiling *c, const node *n) + return NULL; + } + +- ret = arg(name, annotation, LINENO(n), n->n_col_offset, c->c_arena); ++ ret = arg(name, annotation, NULL, LINENO(n), n->n_col_offset, c->c_arena); + if (!ret) + return NULL; + return ret; +@@ -1308,12 +1372,19 @@ handle_keywordonly_args(struct compiling *c, const node *n, int start, + goto error; + if (forbidden_name(c, argname, ch, 0)) + goto error; +- arg = arg(argname, annotation, LINENO(ch), ch->n_col_offset, ++ arg = arg(argname, annotation, NULL, 
LINENO(ch), ch->n_col_offset, + c->c_arena); + if (!arg) + goto error; + asdl_seq_SET(kwonlyargs, j++, arg); +- i += 2; /* the name and the comma */ ++ i += 1; /* the name */ ++ if (TYPE(CHILD(n, i)) == COMMA) ++ i += 1; /* the comma, if present */ ++ break; ++ case TYPE_COMMENT: ++ /* arg will be equal to the last argument processed */ ++ arg->type_comment = NEW_TYPE_COMMENT(ch); ++ i += 1; + break; + case DOUBLESTAR: + return i; +@@ -1448,11 +1519,14 @@ ast_for_arguments(struct compiling *c, const node *n) + if (!arg) + return NULL; + asdl_seq_SET(posargs, k++, arg); +- i += 2; /* the name and the comma */ ++ i += 1; /* the name */ ++ if (TYPE(CHILD(n, i)) == COMMA) ++ i += 1; /* the comma, if present */ + break; + case STAR: + if (i+1 >= NCH(n) || +- (i+2 == NCH(n) && TYPE(CHILD(n, i+1)) == COMMA)) { ++ (i+2 == NCH(n) && (TYPE(CHILD(n, i+1)) == COMMA ++ || TYPE(CHILD(n, i+1)) == TYPE_COMMENT))) { + ast_error(c, CHILD(n, i), + "named arguments must follow bare *"); + return NULL; +@@ -1461,6 +1535,13 @@ ast_for_arguments(struct compiling *c, const node *n) + if (TYPE(ch) == COMMA) { + int res = 0; + i += 2; /* now follows keyword only arguments */ ++ ++ if (TYPE(CHILD(n, i)) == TYPE_COMMENT) { ++ ast_error(c, CHILD(n, i), ++ "bare * has associated type comment"); ++ return NULL; ++ } ++ + res = handle_keywordonly_args(c, n, i, + kwonlyargs, kwdefaults); + if (res == -1) return NULL; +@@ -1471,7 +1552,15 @@ ast_for_arguments(struct compiling *c, const node *n) + if (!vararg) + return NULL; + +- i += 3; ++ i += 2; /* the star and the name */ ++ if (TYPE(CHILD(n, i)) == COMMA) ++ i += 1; /* the comma, if present */ ++ ++ if (TYPE(CHILD(n, i)) == TYPE_COMMENT) { ++ vararg->type_comment = NEW_TYPE_COMMENT(CHILD(n, i)); ++ i += 1; ++ } ++ + if (i < NCH(n) && (TYPE(CHILD(n, i)) == tfpdef + || TYPE(CHILD(n, i)) == vfpdef)) { + int res = 0; +@@ -1488,7 +1577,19 @@ ast_for_arguments(struct compiling *c, const node *n) + kwarg = ast_for_arg(c, ch); + if (!kwarg) + return 
NULL; +- i += 3; ++ i += 2; /* the double star and the name */ ++ if (TYPE(CHILD(n, i)) == COMMA) ++ i += 1; /* the comma, if present */ ++ break; ++ case TYPE_COMMENT: ++ assert(i); ++ ++ if (kwarg) ++ arg = kwarg; ++ ++ /* arg will be equal to the last argument processed */ ++ arg->type_comment = NEW_TYPE_COMMENT(ch); ++ i += 1; + break; + default: + PyErr_Format(PyExc_SystemError, +@@ -1593,12 +1694,14 @@ static stmt_ty + ast_for_funcdef_impl(struct compiling *c, const node *n, + asdl_seq *decorator_seq, int is_async) + { +- /* funcdef: 'def' NAME parameters ['->' test] ':' suite */ ++ /* funcdef: 'def' NAME parameters ['->' test] ':' [TYPE_COMMENT] suite */ + identifier name; + arguments_ty args; + asdl_seq *body; + expr_ty returns = NULL; + int name_i = 1; ++ node *tc; ++ string type_comment = NULL; + + REQ(n, funcdef); + +@@ -1616,17 +1719,30 @@ ast_for_funcdef_impl(struct compiling *c, const node *n, + return NULL; + name_i += 2; + } ++ if (TYPE(CHILD(n, name_i + 3)) == TYPE_COMMENT) { ++ type_comment = NEW_TYPE_COMMENT(CHILD(n, name_i + 3)); ++ name_i += 1; ++ } + body = ast_for_suite(c, CHILD(n, name_i + 3)); + if (!body) + return NULL; + ++ if (!type_comment && NCH(CHILD(n, name_i + 3)) > 1) { ++ /* If the function doesn't have a type comment on the same line, check ++ * if the suite has a type comment in it. 
*/ ++ tc = CHILD(CHILD(n, name_i + 3), 1); ++ ++ if (TYPE(tc) == TYPE_COMMENT) ++ type_comment = NEW_TYPE_COMMENT(tc); ++ } ++ + if (is_async) + return AsyncFunctionDef(name, args, body, decorator_seq, returns, +- LINENO(n), ++ type_comment, LINENO(n), + n->n_col_offset, c->c_arena); + else + return FunctionDef(name, args, body, decorator_seq, returns, +- LINENO(n), ++ type_comment, LINENO(n), + n->n_col_offset, c->c_arena); + } + +@@ -2896,15 +3012,16 @@ ast_for_expr_stmt(struct compiling *c, const node *n) + { + REQ(n, expr_stmt); + /* expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) | +- ('=' (yield_expr|testlist_star_expr))*) ++ ('=' (yield_expr|testlist_star_expr))* [TYPE_COMMENT]) + annassign: ':' test ['=' test] + testlist_star_expr: (test|star_expr) (',' test|star_expr)* [','] + augassign: '+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' + | '<<=' | '>>=' | '**=' | '//=' + test: ... here starts the operator precedence dance + */ ++ int num = NCH(n); + +- if (NCH(n) == 1) { ++ if (num == 1 || (num == 2 && TYPE(CHILD(n, 1)) == TYPE_COMMENT)) { + expr_ty e = ast_for_testlist(c, CHILD(n, 0)); + if (!e) + return NULL; +@@ -3020,17 +3137,22 @@ ast_for_expr_stmt(struct compiling *c, const node *n) + } + } + else { +- int i; ++ int i, nch_minus_type, has_type_comment; + asdl_seq *targets; + node *value; + expr_ty expression; ++ string type_comment; + + /* a normal assignment */ + REQ(CHILD(n, 1), EQUAL); +- targets = _Ta3_asdl_seq_new(NCH(n) / 2, c->c_arena); ++ ++ has_type_comment = TYPE(CHILD(n, num - 1)) == TYPE_COMMENT; ++ nch_minus_type = num - has_type_comment; ++ ++ targets = _Ta3_asdl_seq_new(nch_minus_type / 2, c->c_arena); + if (!targets) + return NULL; +- for (i = 0; i < NCH(n) - 2; i += 2) { ++ for (i = 0; i < nch_minus_type - 2; i += 2) { + expr_ty e; + node *ch = CHILD(n, i); + if (TYPE(ch) == yield_expr) { +@@ -3047,14 +3169,18 @@ ast_for_expr_stmt(struct compiling *c, const node *n) + + asdl_seq_SET(targets, i / 
2, e); + } +- value = CHILD(n, NCH(n) - 1); ++ value = CHILD(n, nch_minus_type - 1); + if (TYPE(value) == testlist_star_expr) + expression = ast_for_testlist(c, value); + else + expression = ast_for_expr(c, value); + if (!expression) + return NULL; +- return Assign(targets, expression, LINENO(n), n->n_col_offset, c->c_arena); ++ if (has_type_comment) ++ type_comment = NEW_TYPE_COMMENT(CHILD(n, nch_minus_type)); ++ else ++ type_comment = NULL; ++ return Assign(targets, expression, type_comment, LINENO(n), n->n_col_offset, c->c_arena); + } + } + +@@ -3461,7 +3587,7 @@ ast_for_assert_stmt(struct compiling *c, const node *n) + static asdl_seq * + ast_for_suite(struct compiling *c, const node *n) + { +- /* suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT */ ++ /* suite: simple_stmt | NEWLINE [TYPE_COMMENT NEWLINE] INDENT stmt+ DEDENT */ + asdl_seq *seq; + stmt_ty s; + int i, total, num, end, pos = 0; +@@ -3491,7 +3617,11 @@ ast_for_suite(struct compiling *c, const node *n) + } + } + else { +- for (i = 2; i < (NCH(n) - 1); i++) { ++ i = 2; ++ if (TYPE(CHILD(n, 1)) == TYPE_COMMENT) ++ i += 2; ++ ++ for (; i < (NCH(n) - 1); i++) { + ch = CHILD(n, i); + REQ(ch, stmt); + num = num_stmts(ch); +@@ -3692,11 +3822,15 @@ ast_for_for_stmt(struct compiling *c, const node *n, int is_async) + expr_ty expression; + expr_ty target, first; + const node *node_target; +- /* for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite] */ ++ int has_type_comment; ++ string type_comment; ++ /* for_stmt: 'for' exprlist 'in' testlist ':' [TYPE_COMMENT] suite ['else' ':' suite] */ + REQ(n, for_stmt); + +- if (NCH(n) == 9) { +- seq = ast_for_suite(c, CHILD(n, 8)); ++ has_type_comment = TYPE(CHILD(n, 5)) == TYPE_COMMENT; ++ ++ if (NCH(n) == 9 + has_type_comment) { ++ seq = ast_for_suite(c, CHILD(n, 8 + has_type_comment)); + if (!seq) + return NULL; + } +@@ -3716,17 +3850,22 @@ ast_for_for_stmt(struct compiling *c, const node *n, int is_async) + expression = ast_for_testlist(c, CHILD(n, 
3)); + if (!expression) + return NULL; +- suite_seq = ast_for_suite(c, CHILD(n, 5)); ++ suite_seq = ast_for_suite(c, CHILD(n, 5 + has_type_comment)); + if (!suite_seq) + return NULL; + ++ if (has_type_comment) ++ type_comment = NEW_TYPE_COMMENT(CHILD(n, 5)); ++ else ++ type_comment = NULL; ++ + if (is_async) + return AsyncFor(target, expression, suite_seq, seq, +- LINENO(n), n->n_col_offset, ++ type_comment, LINENO(n), n->n_col_offset, + c->c_arena); + else + return For(target, expression, suite_seq, seq, +- LINENO(n), n->n_col_offset, ++ type_comment, LINENO(n), n->n_col_offset, + c->c_arena); + } + +@@ -3872,20 +4011,24 @@ ast_for_with_item(struct compiling *c, const node *n) + return withitem(context_expr, optional_vars, c->c_arena); + } + +-/* with_stmt: 'with' with_item (',' with_item)* ':' suite */ ++/* with_stmt: 'with' with_item (',' with_item)* ':' [TYPE_COMMENT] suite */ + static stmt_ty + ast_for_with_stmt(struct compiling *c, const node *n, int is_async) + { +- int i, n_items; ++ int i, n_items, nch_minus_type, has_type_comment; + asdl_seq *items, *body; ++ string type_comment; + + REQ(n, with_stmt); + +- n_items = (NCH(n) - 2) / 2; ++ has_type_comment = TYPE(CHILD(n, NCH(n) - 2)) == TYPE_COMMENT; ++ nch_minus_type = NCH(n) - has_type_comment; ++ ++ n_items = (nch_minus_type - 2) / 2; + items = _Ta3_asdl_seq_new(n_items, c->c_arena); + if (!items) + return NULL; +- for (i = 1; i < NCH(n) - 2; i += 2) { ++ for (i = 1; i < nch_minus_type - 2; i += 2) { + withitem_ty item = ast_for_with_item(c, CHILD(n, i)); + if (!item) + return NULL; +@@ -3896,10 +4039,15 @@ ast_for_with_stmt(struct compiling *c, const node *n, int is_async) + if (!body) + return NULL; + ++ if (has_type_comment) ++ type_comment = NEW_TYPE_COMMENT(CHILD(n, NCH(n) - 2)); ++ else ++ type_comment = NULL; ++ + if (is_async) +- return AsyncWith(items, body, LINENO(n), n->n_col_offset, c->c_arena); ++ return AsyncWith(items, body, type_comment, LINENO(n), n->n_col_offset, c->c_arena); + else +- 
return With(items, body, LINENO(n), n->n_col_offset, c->c_arena); ++ return With(items, body, type_comment, LINENO(n), n->n_col_offset, c->c_arena); + } + + static stmt_ty
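The hunks above advance `i` past a name and then inspect `CHILD(n, i)` for an optional comma without first comparing `i` against `NCH(n)`. The sketch below is a toy Python model of that loop shape, not the C code and not the upstream fix; the function names and token constants are illustrative assumptions. Python raises `IndexError` where the C code would read past the end of the child array.

```python
# Toy token kinds; these names are illustrative, not typed_ast identifiers.
NAME, COMMA, TYPE_COMMENT = "NAME", "COMMA", "TYPE_COMMENT"


def walk_unchecked(children):
    """Step past each name, then peek at children[i] with no bounds check."""
    i = 0
    names = 0
    while i < len(children):
        kind = children[i]
        if kind == NAME:
            names += 1
            i += 1  # the name
            if children[i] == COMMA:  # may index one past the end
                i += 1  # the comma, if present
        elif kind == TYPE_COMMENT:
            i += 1
        else:
            raise ValueError(f"unexpected node {kind!r}")
    return names


def walk_checked(children):
    """Same walk, but only peek while i is still inside the list."""
    i = 0
    names = 0
    while i < len(children):
        kind = children[i]
        if kind == NAME:
            names += 1
            i += 1  # the name
            if i < len(children) and children[i] == COMMA:
                i += 1  # the comma, if present
        elif kind == TYPE_COMMENT:
            i += 1
        else:
            raise ValueError(f"unexpected node {kind!r}")
    return names
```

In this model, a parameter list holding a single name with no trailing comma, `[NAME]`, is enough to make the unchecked walk index past the end, while the checked walk returns normally.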
tools/find_exported_symbols +9 −5 modified
@@ -1,8 +1,12 @@
 #!/bin/bash
 PROJ_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )/.."
-gobjdump -t $PROJ_DIR/build/lib*/_ast${1}.*.so | grep ' g ' | grep -v UND > "exported_symbols${1}.txt"
-echo "Symbols written to exported_symbols${1}.txt. You should edit this file to "
-echo "remove any symbols you still want to export (like PyInit functions) "
-echo "and to make each line contain only a function name you want updated "
-echo "(and none of the other output) before running update_exported_symbols."
+# This requires GNU binutils (e.g. brew install binutils).
+
+/usr/local/opt/binutils/bin/gobjdump -t $PROJ_DIR/build/lib*/_ast${1}.*.so \
+  | grep ' g ' \
+  | grep -v UND \
+  | sed 's/.* _//' \
+  | grep -v PyInit__ast \
+  | grep 'Py' \
+  > "exported_symbols${1}.txt"
tools/Grammar.patch +78 −0 added
@@ -0,0 +1,78 @@
+diff --git a/ast3/Grammar/Grammar b/ast3/Grammar/Grammar
+index b139e9f..dfd730f 100644
+--- a/ast3/Grammar/Grammar
++++ b/ast3/Grammar/Grammar
+@@ -14,7 +14,10 @@
+ # single_input is a single interactive statement;
+ # file_input is a module or sequence of commands read from an input file;
+ # eval_input is the input for the eval() functions.
++# func_type_input is a PEP 484 Python 2 function type comment
+ # NB: compound_stmt in single_input is followed by extra NEWLINE!
++# NB: due to the way TYPE_COMMENT is tokenized it will always be followed by a
++# NEWLINE
+ single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
+ file_input: (NEWLINE | stmt)* ENDMARKER
+ eval_input: testlist NEWLINE* ENDMARKER
+@@ -24,14 +27,14 @@ decorators: decorator+
+ decorated: decorators (classdef | funcdef | async_funcdef)
+ 
+ async_funcdef: ASYNC funcdef
+-funcdef: 'def' NAME parameters ['->' test] ':' suite
++funcdef: 'def' NAME parameters ['->' test] ':' [TYPE_COMMENT] suite
+ 
+ parameters: '(' [typedargslist] ')'
+-typedargslist: (tfpdef ['=' test] (',' tfpdef ['=' test])* [',' [
+-        '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
+-      | '**' tfpdef [',']]]
+-  | '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
+-  | '**' tfpdef [','])
++typedargslist: (tfpdef ['=' test] (',' [TYPE_COMMENT] tfpdef ['=' test])* (TYPE_COMMENT | [',' [TYPE_COMMENT] [
++        '*' [tfpdef] (',' [TYPE_COMMENT] tfpdef ['=' test])* (TYPE_COMMENT | [',' [TYPE_COMMENT] ['**' tfpdef [','] [TYPE_COMMENT]]])
++      | '**' tfpdef [','] [TYPE_COMMENT]]])
++  | '*' [tfpdef] (',' [TYPE_COMMENT] tfpdef ['=' test])* (TYPE_COMMENT | [',' [TYPE_COMMENT] ['**' tfpdef [','] [TYPE_COMMENT]]])
++  | '**' tfpdef [','] [TYPE_COMMENT])
+ tfpdef: NAME [':' test]
+ varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
+         '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
+@@ -46,7 +49,7 @@ simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
+ small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
+              import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
+ expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) |
+-                     ('=' (yield_expr|testlist_star_expr))*)
++                     ('=' (yield_expr|testlist_star_expr))* [TYPE_COMMENT])
+ annassign: ':' test ['=' test]
+ testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [',']
+ augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
+@@ -78,17 +81,18 @@ compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef
+ async_stmt: ASYNC (funcdef | with_stmt | for_stmt)
+ if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
+ while_stmt: 'while' test ':' suite ['else' ':' suite]
+-for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
++for_stmt: 'for' exprlist 'in' testlist ':' [TYPE_COMMENT] suite ['else' ':' suite]
+ try_stmt: ('try' ':' suite
+            ((except_clause ':' suite)+
+             ['else' ':' suite]
+             ['finally' ':' suite] |
+            'finally' ':' suite))
+-with_stmt: 'with' with_item (',' with_item)* ':' suite
++with_stmt: 'with' with_item (',' with_item)* ':' [TYPE_COMMENT] suite
+ with_item: test ['as' expr]
+ # NB compile.c makes sure that the default except clause is last
+ except_clause: 'except' [test ['as' NAME]]
+-suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
++# the TYPE_COMMENT in suites is only parsed for funcdefs, but can't go elsewhere due to ambiguity
++suite: simple_stmt | NEWLINE [TYPE_COMMENT NEWLINE] INDENT stmt+ DEDENT
+ 
+ test: or_test ['if' or_test 'else' test] | lambdef
+ test_nocond: or_test | lambdef_nocond
+@@ -154,3 +158,10 @@ encoding_decl: NAME
+ 
+ yield_expr: 'yield' [yield_arg]
+ yield_arg: 'from' test | testlist
++
++func_type_input: func_type NEWLINE* ENDMARKER
++func_type: '(' [typelist] ')' '->' test
++# typelist is a modified typedargslist (see above)
++typelist: (test (',' test)* [','
++       ['*' [test] (',' test)* [',' '**' test] | '**' test]]
++     | '*' [test] (',' test)* [',' '**' test] | '**' test)
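The `func_type_input` start symbol added at the end of this grammar patch parses PEP 484 signature type comments. As a hedged illustration, the sketch below uses the standard-library `ast` module from Python 3.8 or later, which exposes an equivalent start symbol as `mode="func_type"`; it does not call typed_ast itself, and the signature string is an arbitrary example.

```python
import ast

# Parse a signature type comment of the form "(argtypes) -> returns".
sig = ast.parse("(int, str) -> bool", mode="func_type")

# In this example every argument type is a plain name, so .id is available.
arg_names = [arg.id for arg in sig.argtypes]
return_name = sig.returns.id

print(type(sig).__name__, arg_names, return_name)
```

The resulting node has the same two fields, `argtypes` and `returns`, that the `FunctionType` constructor receives in the `func_type_input` case of the ast.c patch.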
tools/parsetok.patch +95 −0 added
@@ -0,0 +1,95 @@
+diff --git a/ast3/Parser/parsetok.c b/ast3/Parser/parsetok.c
+index 9f01a0d..5529feb 100644
+--- a/ast3/Parser/parsetok.c
++++ b/ast3/Parser/parsetok.c
+@@ -177,6 +177,38 @@ warn(const char *msg, const char *filename, int lineno)
+ #endif
+ #endif
+ 
++typedef struct {
++    int *items;
++    size_t size;
++    size_t num_items;
++} growable_int_array;
++
++int growable_int_array_init(growable_int_array *arr, size_t initial_size) {
++    assert(initial_size > 0);
++    arr->items = malloc(initial_size * sizeof(*arr->items));
++    arr->size = initial_size;
++    arr->num_items = 0;
++
++    return arr->items != NULL;
++}
++
++int growable_int_array_add(growable_int_array *arr, int item) {
++    if (arr->num_items >= arr->size) {
++        arr->size *= 2;
++        arr->items = realloc(arr->items, arr->size * sizeof(*arr->items));
++        if (!arr->items)
++            return 0;
++    }
++
++    arr->items[arr->num_items] = item;
++    arr->num_items++;
++    return 1;
++}
++
++void growable_int_array_deallocate(growable_int_array *arr) {
++    free(arr->items);
++}
++
+ /* Parse input coming from the given tokenizer structure.
+    Return error code. */
+ 
+@@ -188,6 +220,13 @@ parsetok(struct tok_state *tok, grammar *g, int start, perrdetail *err_ret,
+     node *n;
+     int started = 0;
+ 
++    growable_int_array type_ignores;
++    if (!growable_int_array_init(&type_ignores, 10)) {
++        err_ret->error = E_NOMEM;
++        Ta3Tokenizer_Free(tok);
++        return NULL;
++    }
++
+     if ((ps = Ta3Parser_New(g, start)) == NULL) {
+         err_ret->error = E_NOMEM;
+         Ta3Tokenizer_Free(tok);
+@@ -259,6 +298,14 @@ parsetok(struct tok_state *tok, grammar *g, int start, perrdetail *err_ret,
+         else
+             col_offset = -1;
+ 
++        if (type == TYPE_IGNORE) {
++            if (!growable_int_array_add(&type_ignores, tok->lineno)) {
++                err_ret->error = E_NOMEM;
++                break;
++            }
++            continue;
++        }
++
+         if ((err_ret->error =
+              Ta3Parser_AddToken(ps, (int)type, str,
+                                 tok->lineno, col_offset,
+@@ -275,6 +322,22 @@ parsetok(struct tok_state *tok, grammar *g, int start, perrdetail *err_ret,
+         n = ps->p_tree;
+         ps->p_tree = NULL;
+ 
++        if (n->n_type == file_input) {
++            /* Put type_ignore nodes in the ENDMARKER of file_input. */
++            int num;
++            node *ch;
++            size_t i;
++
++            num = NCH(n);
++            ch = CHILD(n, num - 1);
++            REQ(ch, ENDMARKER);
++
++            for (i = 0; i < type_ignores.num_items; i++) {
++                Ta3Node_AddChild(ch, TYPE_IGNORE, NULL, type_ignores.items[i], 0);
++            }
++        }
++        growable_int_array_deallocate(&type_ignores);
++
+ #ifndef PGEN
+         /* Check that the source for a single input statement really
+            is a single statement by looking at what is left in the
tools/Python-asdl.patch +67 −0 added
@@ -0,0 +1,67 @@
+diff --git a/ast3/Parser/Python.asdl b/ast3/Parser/Python.asdl
+index f470ad1..7bde99c 100644
+--- a/ast3/Parser/Python.asdl
++++ b/ast3/Parser/Python.asdl
+@@ -6,17 +6,18 @@
+ 
+ module Python
+ {
+-    mod = Module(stmt* body)
++    mod = Module(stmt* body, type_ignore *type_ignores)
+         | Interactive(stmt* body)
+         | Expression(expr body)
++        | FunctionType(expr* argtypes, expr returns)
+ 
+         -- not really an actual node but useful in Jython's typesystem.
+         | Suite(stmt* body)
+ 
+     stmt = FunctionDef(identifier name, arguments args,
+-                       stmt* body, expr* decorator_list, expr? returns)
++                       stmt* body, expr* decorator_list, expr? returns, string? type_comment)
+           | AsyncFunctionDef(identifier name, arguments args,
+-                             stmt* body, expr* decorator_list, expr? returns)
++                             stmt* body, expr* decorator_list, expr? returns, string? type_comment)
+ 
+           | ClassDef(identifier name,
+              expr* bases,
+@@ -26,18 +27,18 @@ module Python
+           | Return(expr? value)
+ 
+           | Delete(expr* targets)
+-          | Assign(expr* targets, expr value)
++          | Assign(expr* targets, expr value, string? type_comment)
+           | AugAssign(expr target, operator op, expr value)
+           -- 'simple' indicates that we annotate simple name without parens
+           | AnnAssign(expr target, expr annotation, expr? value, int simple)
+ 
+           -- use 'orelse' because else is a keyword in target languages
+-          | For(expr target, expr iter, stmt* body, stmt* orelse)
+-          | AsyncFor(expr target, expr iter, stmt* body, stmt* orelse)
++          | For(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
++          | AsyncFor(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
+           | While(expr test, stmt* body, stmt* orelse)
+           | If(expr test, stmt* body, stmt* orelse)
+-          | With(withitem* items, stmt* body)
+-          | AsyncWith(withitem* items, stmt* body)
++          | With(withitem* items, stmt* body, string? type_comment)
++          | AsyncWith(withitem* items, stmt* body, string? type_comment)
+ 
+           | Raise(expr? exc, expr? cause)
+           | Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
+@@ -118,7 +119,7 @@ module Python
+     arguments = (arg* args, arg? vararg, arg* kwonlyargs, expr* kw_defaults,
+                  arg? kwarg, expr* defaults)
+ 
+-    arg = (identifier arg, expr? annotation)
++    arg = (identifier arg, expr? annotation, string? type_comment)
+            attributes (int lineno, int col_offset)
+ 
+     -- keyword arguments supplied to call (NULL identifier for **kwargs)
+@@ -128,5 +129,7 @@ module Python
+     alias = (identifier name, identifier? asname)
+ 
+     withitem = (expr context_expr, expr? optional_vars)
++
++    type_ignore = TypeIgnore(int lineno)
+ }
+ 
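This ASDL patch adds a `type_comment` field to `FunctionDef`, `Assign`, `For`, `With`, and `arg` nodes, plus a `type_ignores` list on `Module`. The sketch below shows how those fields surface to a caller. It is an illustration under an assumption: it uses the standard-library `ast` module from Python 3.8 or later with `type_comments=True` rather than typed_ast itself, and the sample source is made up for the example.

```python
import ast

# Sample source: a function type comment, an assignment type comment,
# and a "type: ignore" marker. The code is only parsed, never executed.
source = (
    "def add(a, b):\n"
    "    # type: (int, int) -> int\n"
    "    return a + b\n"
    "\n"
    "x = []  # type: List[int]\n"
    "y = compute()  # type: ignore\n"
)

tree = ast.parse(source, type_comments=True)
func, assign = tree.body[0], tree.body[1]

# Line numbers of every "type: ignore" comment collected on the module.
ignored_lines = [ti.lineno for ti in tree.type_ignores]

print(func.type_comment)
print(assign.type_comment)
print(ignored_lines)
```

Without `type_comments=True` the standard-library parser leaves `type_comment` as `None` and `type_ignores` empty, so the flag has to be passed explicitly.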
tools/script +44 −0 added
@@ -0,0 +1,44 @@
+#!/bin/bash -ex
+
+# Automate steps 1-4 of update_process.md (Mac).
+
+HERE=$(dirname ${BASH_SOURCE[0]})
+cd $HERE/..
+pwd
+
+CPYTHON=~/src/cpython37
+
+DIRS="Grammar Include Parser Python"
+C_FILES="Parser/acceler.c Parser/bitset.c Parser/grammar.c Parser/grammar1.c Parser/node.c Parser/parser.c Parser/parsetok.c Parser/tokenizer.c Python/asdl.c Python/ast.c Python/graminit.c Python/Python-ast.c"
+H_FILES="Include/asdl.h Include/ast.h Include/bitset.h Include/errcode.h Include/graminit.h Include/grammar.h Include/node.h Include/parsetok.h Include/Python-ast.h Include/token.h Parser/parser.h Parser/tokenizer.h"
+OTHER_FILES="Grammar/Grammar Parser/Python.asdl Parser/asdl.py Parser/asdl_c.py"
+
+for dir in $DIRS
+do
+    rm -rf ast3/$dir
+    mkdir -p ast3/$dir
+done
+
+for file in $C_FILES $H_FILES $OTHER_FILES
+do
+    cp $CPYTHON/$file ast3/$file
+done
+
+./tools/update_header_guards 3
+
+rm -rf build
+grep -v ast3/Custom setup.py | python3.7 - build
+
+./tools/find_exported_symbols 3
+./tools/update_exported_symbols 3
+
+patch ast3/Parser/asdl_c.py <tools/asdl_c.patch
+
+python3.7 ast3/Parser/asdl_c.py -h ast3/Include/Python-ast.h ast3/Parser/Python.asdl
+python3.7 ast3/Parser/asdl_c.py -c ast3/Python/Python-ast.c ast3/Parser/Python.asdl
+
+python3.7 setup.py build
+
+# Lots of manual changes go here...
+
+##PYTHONPATH=build/lib.macosx-10.9-x86_64-3.7/ python3.7 -m pytest -s ast3/tests
tools/tokenizer.patch +80 −0 added
@@ -0,0 +1,80 @@
+diff --git a/ast3/Parser/tokenizer.c b/ast3/Parser/tokenizer.c
+index 617a744..667fb4a 100644
+--- a/ast3/Parser/tokenizer.c
++++ b/ast3/Parser/tokenizer.c
+@@ -105,10 +105,16 @@ const char *_Ta3Parser_TokenNames[] = {
+     "OP",
+     "AWAIT",
+     "ASYNC",
++    "TYPE_IGNORE",
++    "TYPE_COMMENT",
+     "<ERRORTOKEN>",
+     "<N_TOKENS>"
+ };
+ 
++/* Spaces in this constant are treated as "zero or more spaces or tabs" when
++   tokenizing. */
++static const char* type_comment_prefix = "# type: ";
++
+ 
+ /* Create and initialize a new tok_state structure */
+ 
+@@ -1493,10 +1499,56 @@ tok_get(struct tok_state *tok, char **p_start, char **p_end)
+     /* Set start of current token */
+     tok->start = tok->cur - 1;
+ 
+-    /* Skip comment */
++    /* Skip comment, unless it's a type comment */
+     if (c == '#') {
+-        while (c != EOF && c != '\n') {
++        const char *prefix, *p, *type_start;
++
++        while (c != EOF && c != '\n')
+             c = tok_nextc(tok);
++
++        p = tok->start;
++        prefix = type_comment_prefix;
++        while (*prefix && p < tok->cur) {
++            if (*prefix == ' ') {
++                while (*p == ' ' || *p == '\t')
++                    p++;
++            } else if (*prefix == *p) {
++                p++;
++            } else {
++                break;
++            }
++
++            prefix++;
++        }
++
++        /* This is a type comment if we matched all of type_comment_prefix. */
++        if (!*prefix) {
++            int is_type_ignore = 1;
++            tok_backup(tok, c); /* don't eat the newline or EOF */
++
++            type_start = p;
++
++            is_type_ignore = tok->cur >= p + 6 && memcmp(p, "ignore", 6) == 0;
++            p += 6;
++            while (is_type_ignore && p < tok->cur) {
++                if (*p == '#')
++                    break;
++                is_type_ignore = is_type_ignore && (*p == ' ' || *p == '\t');
++                p++;
++            }
++
++            if (is_type_ignore) {
++                /* If this type ignore is the only thing on the line, consume the newline also. */
++                if (blankline) {
++                    tok_nextc(tok);
++                    tok->atbol = 1;
++                }
++                return TYPE_IGNORE;
++            } else {
++                *p_start = (char *) type_start; /* after type_comment_prefix */
++                *p_end = tok->cur;
++                return TYPE_COMMENT;
++            }
+         }
+     }
+ 
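The tokenizer hunk matches a comment against `"# type: "`, treating each space in that prefix as zero or more spaces or tabs, and then separates `ignore` markers from other type comments. The following is a hedged Python re-expression of that matching logic for illustration only: `classify_comment` and its return convention are hypothetical, and the blank-line newline handling in the C code is left out.

```python
TYPE_COMMENT_PREFIX = "# type: "


def classify_comment(comment):
    """Return (kind, text): kind is "TYPE_IGNORE", "TYPE_COMMENT", or None."""
    i = 0  # position in the prefix
    p = 0  # position in the comment
    while i < len(TYPE_COMMENT_PREFIX) and p < len(comment):
        want = TYPE_COMMENT_PREFIX[i]
        if want == " ":
            # A space in the prefix matches zero or more spaces or tabs.
            while p < len(comment) and comment[p] in " \t":
                p += 1
        elif want == comment[p]:
            p += 1
        else:
            break
        i += 1

    if i < len(TYPE_COMMENT_PREFIX):
        return (None, None)  # ordinary comment: the prefix did not fully match

    rest = comment[p:]
    if rest.startswith("ignore"):
        # Only spaces or tabs may follow "ignore", up to an optional '#'.
        tail = rest[len("ignore"):].split("#", 1)[0]
        if tail.strip(" \t") == "":
            return ("TYPE_IGNORE", None)
    return ("TYPE_COMMENT", rest)
```

Under this sketch, `#type:int` counts as a type comment with text `int`, and `# type: ignored` is an ordinary type comment rather than an ignore marker, because a non-blank character follows `ignore`.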
tools/token.patch +17 −0 added
@@ -0,0 +1,17 @@
+diff --git a/ast3/Include/token.h b/ast3/Include/token.h
+index a657fdd..d0b2b94 100644
+--- a/ast3/Include/token.h
++++ b/ast3/Include/token.h
+@@ -68,8 +68,10 @@ extern "C" {
+ /* These aren't used by the C tokenizer but are needed for tokenize.py */
+ #define COMMENT 55
+ #define NL 56
+-#define ENCODING 57
+-#define N_TOKENS 58
++#define ENCODING 57
++#define TYPE_IGNORE 58
++#define TYPE_COMMENT 59
++#define N_TOKENS 60
+ 
+ /* Special definitions for cooperation with parser */
+ 
tools/update_ast3_asdl +2 −2 modified
@@ -4,5 +4,5 @@
 PROJ_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )/.."
-python3 ast3/Parser/asdl_c.py -h ast3/Include/ ast3/Parser/Python.asdl
-python3 ast3/Parser/asdl_c.py -c ast3/Python/ ast3/Parser/Python.asdl
+python3 ast3/Parser/asdl_c.py -h ast3/Include/Python-ast.h ast3/Parser/Python.asdl
+python3 ast3/Parser/asdl_c.py -c ast3/Python/Python-ast.c ast3/Parser/Python.asdl
tools/update_ast3_grammar +2 −22 modified
@@ -4,28 +4,8 @@
 PROJ_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )/.."
-echo 'Compiling pgen'
-gcc -I ast3/Parser -I ast3/Include $(python3-config --includes) \
-    -o tools/pgen3 \
-    ast3/Parser/acceler.c \
-    ast3/Parser/grammar1.c \
-    ast3/Parser/node.c \
-    ast3/Parser/parser.c \
-    ast3/Parser/bitset.c \
-    ast3/Parser/grammar.c \
-    ast3/Pgen/listnode.c \
-    ast3/Pgen/metagrammar.c \
-    ast3/Pgen/firstsets.c \
-    ast3/Pgen/pgen.c \
-    ast3/Pgen/obmalloc.c \
-    ast3/Pgen/dynamic_annotations.c \
-    ast3/Pgen/mysnprintf.c \
-    ast3/Pgen/pyctype.c \
-    ast3/Pgen/tokenizer_pgen.c \
-    ast3/Pgen/printgrammar.c \
-    ast3/Pgen/parsetok_pgen.c \
-    ast3/Pgen/pgenmain.c
-
+echo 'Copying pgen'
+cp ~/src/cpython37/Parser/pgen tools/pgen3
 echo 'Updating graminit files'
 tools/pgen3 ast3/Grammar/Grammar ast3/Include/graminit.h ast3/Python/graminit.c
typed_ast/ast3.py +3 −1 modified
@@ -40,7 +40,7 @@
 import _ast3
 from _ast3 import *
 
-LATEST_MINOR_VERSION = 6
+LATEST_MINOR_VERSION = 7
 
 def parse(source, filename='<unknown>', mode='exec', feature_version=LATEST_MINOR_VERSION):
     """
@@ -56,6 +56,8 @@ def parse(source, filename='<unknown>', mode='exec', feature_version=LATEST_MINO
     When feature_version=4, the parser will forbid the use of the async/await
     keywords and the '@' operator, but will not forbid the use of PEP 448
     additional unpacking generalizations, which were also added in Python 3.5.
+
+    When feature_version>=7, 'async' and 'await' are always keywords.
     """
     return _ast3._parse(source, filename, mode, feature_version)
typed_ast/__init__.py +1 −1 modified
@@ -1 +1 @@
-__version__ = "1.2.1-dev"
+__version__ = "1.3.0.dev0"
update_process.md+59 −24 modified@@ -38,6 +38,18 @@ version of Python. They are not meant to be comprehensive -- you'll have to troubleshoot problems and use your own intuition along the way. Most steps have an example commit hash in parentheses from the Python 3.6 update. +(At a high level, steps 1-4 alter the code so it uses `Ta3` instead of +`Py` as a prefix for globals; the next few steps add support for type +comments to the lexer, grammar and "asdl" machinery; then we add +support for `feature_version`; finally we work on making the code +compatible with older Python versions and other platforms.) + +Note that steps 1-4 can be automated using tools/script and various +other files in tools/. You need to install GNU binutils in order +to be able to use gobjdump in tools/find_exported_symbols +(e.g. `brew install binutils`). The script assumes you're on a Mac +and your CPython source tree is at `~/src/cpython37`. + 1. Copy over the parser files from CPython. The set of files you want is likely the set currently present in `ast3`. ([a377f1e](https://github.com/python/typed_ast/commit/a377f1e3deb332bfbec3f3bb0d4c42768626d8d4)) @@ -49,37 +61,40 @@ an example commit hash in parentheses from the Python 3.6 update. version of Python it was copied from). 4. Update exported symbols: To avoid dynamic linker conflicts, exported `ast3` functions need their own unique prefix. - 1. Compile the module with `python3 setup.py build`. - 2. Generate the list of exported symbols with `./tools/find_exported_symbols 3`. - The script may require updating to work on your platform, but should serve - as a useful guide at minimum. - 3. The exported symbols will be written to `exported_symbols3.txt`. Make - sure this file looks sane, then remove `_PyInit__ast3` (which we want to - export) and delete the excess output at the beginning of each line (including - the leading `_` of each symbol) to end up with a list of function names to - change. - 4. 
Run `./tools/update_exported_symbols 3`, which updates the exported - symbols in all the `ast3` files with sed. It may take a few seconds to run. - If you're on Linux, the script will need some very slight modification to - work properly (due to cross-platform sed argument differences). Verify the - changes look sane. - ([d1ec7d0](https://github.com/python/typed_ast/commit/d1ec7d07cb6a7fe016d9446a196dfa3b86c5acf6)) - 5. Update `Parser/asdl_c.py`. Use the changes from git history to guide you. - ([29dbec4](https://github.com/python/typed_ast/commit/29dbec47aa145d84e5faaa431ce3b3afca233b3d)) + + 1. Starting with an empty `build` directory, compile the module with `python3 setup.py build`. + 2. Generate the list of exported symbols with `./tools/find_exported_symbols 3`. + The script may require updating to work on your platform, but should serve + as a useful guide at minimum. + 3. The exported symbols will be written to `exported_symbols3.txt`. Make + sure this file looks sane, then remove `_PyInit__ast3` (which we want to + export) to end up with a list of function names to change. + 4. Run `./tools/update_exported_symbols 3`, which updates the exported + symbols in all the `ast3` files with sed. It may take a few seconds to run. + If you're on Linux, the script will need some very slight modification to + work properly (due to cross-platform sed argument differences). Verify the + changes look sane. + ([d1ec7d0](https://github.com/python/typed_ast/commit/d1ec7d07cb6a7fe016d9446a196dfa3b86c5acf6)) + 5. Update `Parser/asdl_c.py`. Use the changes from git history to guide you. + (Don't be distracted by the generated files in that commit; look at asdl_c.py only. + Much of this renames _ast to _ast3 and substitutes certain _Py_ prefixes with _Ta3_.) + Update the generated files with `tools/update_ast3_asdl`. + ([29dbec4](https://github.com/python/typed_ast/commit/29dbec47aa145d84e5faaa431ce3b3afca233b3d)) + 5. Make a commit. 
You've likely been making commits along the way, but it's vitally important that there be a commit here so there can be a clean diff for the next time an update needs to be written (without the noise of the function prefix rewriting, etc). -6. Add `Custom/typed_ast.c` back to setup.py. Temporarily remove references to - `TYPE_COMMENT` and `Py_func_type_input` from `Custom/typed_ast.c` to allow it - to compile. +6. Add `Custom/typed_ast.c` back to setup.py. To allow it to compile, + temporarily comment out references to `TYPE_COMMENT` and `Py_func_type_input` and + also the `feature_version` argument in the call to `Ta3AST_FromNodeObject()`. ([b7a034b](https://github.com/python/typed_ast/commit/b7a034bc657dcfd5681b505f3949603fa6597116)) 7. Check that things seem to be working so far. At this point, if you add the `_parse` function to the ast module in `Python-ast.c`, `ast3` should be able to compile and parse things without type information. ([5e1885c](https://github.com/python/typed_ast/commit/5e1885cf54e1434a9422f3f797ecb1ed6fb42fb6)) 8. Port over the changes related to parsing type comments. Use git history to - guide you here. Diffing the previous `ast3` against it's external symbol + guide you here. Diffing the previous `ast3` against its external symbol update commit will show you which changes you need to make, and diffing the previous `ast3` against the current work in progress can be helpful for quickly finding where to put them. You'll need to make `Python.asdl` @@ -88,16 +103,36 @@ an example commit hash in parentheses from the Python 3.6 update. part that compiles pgen may need tweaking to work on your machine). Check that things work before moving on. ([f74d9f3](https://github.com/python/typed_ast/commit/f74d9f3f231110639752c30c0ae5fbebe870ebc6)) + A bit more detail: + + - Add the `TYPE_IGNORE` and `TYPE_COMMENT` symbols to `Include/token.h`, and updating `N_TOKENS`; + also add theze to the list of strings in `Parser/tokenizer.c` (in the same order!) 
+ - NOTE: As of Python 3.7, the `ASYNC` and `AWAIT` symbols also need to be added to both places + - Update `Parser/Python.asdl` to add `type_comments` and `type_ignores` to various definitions + and run `tools/update_ast3_asdl`; this updates `Include/Python-ast.h` and `Python/Python-ast.c` + - Reapply other patches to `Parser/parsetok.c` and `Parser/tokenizer.c` (these implement + recognition of type comments) + - Add `[TYPE_COMMENT]` to various places in `Grammar/Grammar`, and then run + `tools/update_ast3_grammar`; this updates `Python/graminit.c` and `Include/graminit.h` + - NOTE: As of Python 3.7, this is problematic because the upstream developers like to add + dependencies on CPython internals to pgen. I ended up copying some files into CPython, + running pgen there (`make regen-grammar`), and copying the results back + - Copy the definition of `Py_func_type_input` from `Python/graminit.h` to `Include/compile.h` + - NOTE: As of Python 3.7, compile.h depends on CPython internals; I ended up creating a small + file compile-ast3.h with just the four symbols we need (maybe we even only need the one) + - Attempt compilation and fix errors, e.g. add an extra argument to `Module(stmts, arena)` + to pass `type_ignores` + 9. Port over the changes for enforcing `feature_version`. Check this works. ([89aebce](https://github.com/python/typed_ast/commit/89aebcefb612c113446e3a877f78b93e4cf142b3)) 10. Add `feature_version` checks for any new syntax features in the Python version you're updating to. Check these work. -11. Make the changes necessary so `ast3` can compile on the previous Python - version. This is new territory every time. Spelunking in the rest of the +11. Make the changes necessary so `ast3` can compile on previous Python + versions. This is new territory every time. Spelunking in the rest of the CPython source can often be helpful here. ([5ea3eb8](https://github.com/python/typed_ast/commit/5ea3eb8447fd5c72c6f390014b1f7ea7cd6119ea)) 12. 
Port compatilibity with older Python versions. See git history for - details. The changes in the previous `ast3` should likely suffice here. + details. ([8d2aeae](https://github.com/python/typed_ast/commit/8d2aeae8651c7e86ac51d7abefb91cb563c94555)) 13. Port compatility with Windows. This largely involves replacing `PyAPI_FUNC` and `PyAPI_DATA` with `extern`. See the git history for details.
Vulnerability mechanics
Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
- github.com/advisories/GHSA-7xxv-wpxj-mx5v (ghsa, ADVISORY)
- lists.fedoraproject.org/archives/list/package-announce%40lists.fedoraproject.org/message/LG5H4Q6LFVRX7SFXLBEJMNQFI4T5SCEA/ (mitre, vendor-advisory, x_refsource_FEDORA)
- nvd.nist.gov/vuln/detail/CVE-2019-19275 (ghsa, ADVISORY)
- bugs.python.org/issue36495 (ghsa, x_refsource_MISC, WEB)
- github.com/pypa/advisory-database/tree/main/vulns/typed-ast/PYSEC-2019-131.yaml (ghsa, WEB)
- github.com/python/cpython/commit/a4d78362397fc3bced6ea80fbc7b5f4827aec55e (ghsa, x_refsource_MISC, WEB)
- github.com/python/cpython/commit/dcfcd146f8e6fc5c2fc16a4c192a0c5f5ca8c53c (ghsa, x_refsource_MISC, WEB)
- github.com/python/typed_ast/commit/156afcb26c198e162504a57caddfe0acd9ed7dce (ghsa, x_refsource_MISC, WEB)
- github.com/python/typed_ast/commit/dc317ac9cff859aa84eeabe03fb5004982545b3b (ghsa, x_refsource_MISC, WEB)
- lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/message/LG5H4Q6LFVRX7SFXLBEJMNQFI4T5SCEA (ghsa, WEB)