typeshed `float | int` #1617

beauxq · 2025-11-01T18:12:52Z

to make #319 easier

modifies typeshed stubs, changing float to float | int

doesn't change Final[float]
doesn't change return type -> float
- We might find with experiments that we want to change some return types, but then we'd want a big list of exceptions.

~~I thought it might be good to put it in a different file, but~~ it looks like this generate_docstubs.py is made to be used as both an importable module and an entry point, making a relative import with no package problematic.

beauxq · 2025-11-01T18:13:54Z

build/py3_8/generate_docstubs.py

        copytree(stubs_path, stubs_with_docs_path, dirs_exist_ok=False)
    elif not stubs_with_docs_path.exists():
        copytree(stubs_path, stubs_with_docs_path, dirs_exist_ok=True)
+    float_expand(stubs_with_docs_path)


Will the primer diff include this transformation? I don't know whether this step is included in the CI workflow.

the primer won't run on this PR because it's not changing any code that would typically effect the result. but now that this script does more than add docstrings to stubs, that's no longer the case i guess

KotlinIsland · 2025-11-02T04:06:15Z

are you patching all of typesheds float to be float | int?

beauxq · 2025-11-02T05:01:48Z

are you patching all of typesheds float to be float | int?

If it's in an annotation, and not in the exceptions in the OP

DetachHead · 2025-11-02T05:39:00Z

build/py3_8/generate_docstubs.py

+KEEP_FLOAT: Final[frozenset[str]] = frozenset((
+    # Example:
+    # If we didn't want to change the type of the `priority` parameter
+    # of the `register` function in the `Registry` class of markdown/util.pyi
+    # "stubs/Markdown/markdown/util.pyi/Registry.register.priority",
+    # See implementation `_node_path` for details on this identifier string.
+))
+""" identifiers for `float` that we don't want to change to `float | int` """
+
+
+def name_for_target(node: ast.AnnAssign) -> str:
+    return (
+        node.target.id
+        if isinstance(node.target, ast.Name)
+        else node.target.attr
+        if isinstance(node.target, ast.Attribute)
+        else "subscript"
+    )
+
+
+def name_for_node(
+    node: ast.ClassDef
+    | ast.FunctionDef
+    | ast.AsyncFunctionDef
+    | ast.arg
+    | ast.Name
+    | ast.AnnAssign,
+) -> str:
+    return (
+        node.name
+        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
+        else node.arg
+        if isinstance(node, ast.arg)
+        else name_for_target(node)
+        if isinstance(node, ast.AnnAssign)
+        else node.id
+    )
+
+
+def has_int_child(node: ast.BinOp) -> bool:
+    if isinstance(node.right, ast.Name) and node.right.id == "int":
+        return True
+    if isinstance(node.left, ast.Name) and node.left.id == "int":
+        return True
+    if isinstance(node.right, ast.BinOp):
+        # assuming "|" is the only BinOp in annotations
+        assert isinstance(node.right.op, ast.BitOr), node.right.op
+        if has_int_child(node.right):
+            return True
+    if isinstance(node.left, ast.BinOp):
+        assert isinstance(node.left.op, ast.BitOr), node.left.op
+        if has_int_child(node.left):
+            return True
+    return False
+
+
+class AnnotationTrackingVisitor(ast.NodeVisitor):
+    parent_stack: list[ast.AST]
+    in_ann: str | None = None
+    floats: list[ast.Name]
+    module: str
+
+    def __init__(self, module: str) -> None:
+        self.parent_stack = []
+        self.floats = []
+        self.module = module
+
+    @override
+    def visit(self, node: ast.AST) -> None:
+        self.parent_stack.append(node)
+        super().visit(node)
+        _ = self.parent_stack.pop()
+
+    @override
+    def visit_AnnAssign(self, node: ast.AnnAssign) -> None:
+        assert len(list(ast.iter_fields(node))) == 4, list(ast.iter_fields(node))
+        # I don't know what the 4th field "simple" is, but it's not an AST.
+
+        self.visit(node.target)
+
+        self.in_ann = name_for_target(node)
+        self.visit(node.annotation)
+        self.in_ann = None
+
+        if node.value:
+            self.visit(node.value)
+
+    @override
+    def visit_arg(self, node: ast.arg) -> None:
+        # arg name str, annotation, type comment str
+        assert len(list(ast.iter_fields(node))) == 3, list(ast.iter_fields(node))
+
+        self.in_ann = node.arg
+        if node.annotation:
+            self.visit(node.annotation)
+        self.in_ann = None
+
+    # NOTE: assuming function return values are actually float if annotated as such.
+    # If we don't want to assume that, uncomment this:
+    # (probably would also want `visit_AsyncFunctionDef`)
+
+    # @override
+    # def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
+    #     # Copied from implementation of base generic_visit
+    #     # and modified for "returns"
+    #     for field, value in ast.iter_fields(node):
+    #         if isinstance(value, list):
+    #             for item in value:
+    #                 if isinstance(item, ast.AST):
+    #                     self.visit(item)
+    #         elif isinstance(value, ast.AST):
+    #             if field == "returns":
+    #                 self.in_ann = "returns"
+    #             self.visit(value)
+    #             if field == "returns":
+    #                 self.in_ann = None
+
+    def _node_path(self) -> str:
+        """a string that identifies the current node (from `self.parent_stack`)"""
+        strs = [
+            name_for_node(n)
+            for n in self.parent_stack
+            if isinstance(
+                n,
+                (
+                    ast.ClassDef,
+                    ast.FunctionDef,
+                    ast.AsyncFunctionDef,
+                    ast.arg,
+                    ast.Name,
+                    ast.AnnAssign,
+                ),
+            )
+        ]
+        if len(strs) > 0 and strs[-1] == "float":
+            _ = strs.pop()
+        return self.module + "/" + ".".join(strs)
+
+    def _with_int(self) -> bool:
+        """`float | int` already"""
+        assert isinstance(self.parent_stack[-1], ast.Name), self.parent_stack
+        assert self.parent_stack[-1].id == "float", self.parent_stack[-1].id
+        index = len(self.parent_stack) - 2
+        while index >= 0:
+            traverse_node = self.parent_stack[index]
+            if not isinstance(traverse_node, ast.BinOp):
+                return False
+            # assuming "|" is the only BinOp in annotations
+            assert isinstance(traverse_node.op, ast.BitOr), traverse_node.op
+            if has_int_child(traverse_node):
+                return True
+            index -= 1
+        return False
+
+    def _is_final(self) -> bool:
+        assert isinstance(self.parent_stack[-1], ast.Name), self.parent_stack
+        assert self.parent_stack[-1].id == "float", self.parent_stack
+        if len(self.parent_stack) > 1:
+            parent = self.parent_stack[-2]
+            if (
+                isinstance(parent, ast.Subscript)
+                and isinstance(parent.value, ast.Name)
+                and parent.value.id == "Final"
+            ):
+                assert parent.slice is self.parent_stack[-1], parent.slice
+                return True
+        return False
+
+    @override
+    def generic_visit(self, node: ast.AST) -> None:
+        if self.in_ann is not None and isinstance(node, ast.Name) and node.id == "float":
+            assert node is self.parent_stack[-1], self.parent_stack
+
+            if self._with_int() or self._is_final():
+                # There's already some already float | int in typeshed
+                # and assuming `Final[float]` is really float
+                pass
+            else:
+                node_path = self._node_path()
+                if node_path not in KEEP_FLOAT:
+                    self.floats.append(node)
+        super().generic_visit(node)
+
+
+def float_expand(stubs_with_docs_path: Path) -> None:
+    """change stubs in the given directory from `float` to `float | int`"""
+    for dir_path, _dir_names, file_names in os.walk(stubs_with_docs_path):
+        for file_name in file_names:
+            if not file_name.endswith(".pyi"):
+                continue
+            file_path = Path(dir_path) / file_name
+            rel_path = Path(os.path.relpath(file_path, stubs_with_docs_path)).as_posix()
+            file_bytes = Path(file_path).read_bytes()
+            file_parsed = ast.parse(file_bytes)
+            v = AnnotationTrackingVisitor(rel_path)
+            v.visit(file_parsed)
+
+            if len(v.floats) > 0:
+                print(file_path.as_posix())
+                # compute start offset of each line in file
+                lines = file_bytes.split(b"\n")
+                line_starts = [0]
+                for line in lines:
+                    line_starts.append(line_starts[-1] + len(line) + 1)  # +1 for newline
+
+                # process in reverse order to avoid offset changes affecting subsequent replacements
+                v.floats.sort(key=lambda n: (n.lineno, n.col_offset), reverse=True)
+                for fl in v.floats:
+                    assert fl.end_lineno == fl.lineno, fl  # always within 1 line
+                    assert fl.end_col_offset is not None
+                    assert fl.end_col_offset == fl.col_offset + 5  # always "float" 5 chars
+
+                    # calculate offsets in file (from offsets in line)
+                    line_start = line_starts[fl.lineno - 1]
+                    start_offset = line_start + fl.col_offset
+                    end_offset = line_start + fl.end_col_offset
+
+                    assert file_bytes[start_offset:end_offset] == b"float", file_bytes[
+                        start_offset:end_offset
+                    ]
+
+                    file_bytes = (
+                        file_bytes[:start_offset] + b"float | int" + file_bytes[end_offset:]
+                    )
+
+                _ = Path(file_path).write_bytes(file_bytes)


the original purpose of this script was just to add docstrings to stubs. currently it will run across every supported python version on linux, macos and windows to gather all the docstrings from objects at runtime, which i don't think is necessary for what we're doing here.

so if we are going to extend it to do other things as well, i think we need to restructure this a bit. it should probably be moved into a separate script

moved into a separate script
Now it's not used. I'm not sure how we want to integrate it.
But it's not really useful until we do something with #319

yeah i think we should probably hold off merging until we actually address #319. i guess we can just keep this PR open in the mean time

KotlinIsland · 2025-11-02T07:59:26Z

@jorenham do you think this could have unintended side effects in your projects?

jorenham · 2025-11-02T09:36:33Z

It'll be pretty horrible for performance

jorenham · 2025-11-02T09:40:30Z

assert_type will likely also break in a couple of places. And there's a chance this'll break optype.Just, and therefore also scipy-stubs and pandas.

jorenham · 2025-11-02T16:53:08Z

I just remembered that a while back I ran an experiment doing exactly this on scipy-stubs and numtype:

data-apis/array-api-compat#291 (comment)

beauxq added 3 commits November 1, 2025 09:36

start towards patching float | int into typeshed

add8b25

file modification and cleaning

8903491

ruff format

1129817

beauxq commented Nov 1, 2025

View reviewed changes

beauxq added 2 commits November 1, 2025 11:51

annotate KEEP_FLOAT

b246781

some of what ruff finds

e1b1a9b

DetachHead reviewed Nov 2, 2025

View reviewed changes

beauxq added 3 commits November 3, 2025 19:29

remaining ruff rules

3421c1f

move to new file

91a1aab

a pylint issue

a4e338d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

typeshed `float | int` #1617

typeshed `float | int` #1617

Uh oh!

beauxq commented Nov 1, 2025 •

edited

Loading

Uh oh!

beauxq Nov 1, 2025

Uh oh!

DetachHead Nov 2, 2025

Uh oh!

KotlinIsland commented Nov 2, 2025

Uh oh!

beauxq commented Nov 2, 2025

Uh oh!

DetachHead Nov 2, 2025

Uh oh!

beauxq Nov 4, 2025

Uh oh!

DetachHead Nov 5, 2025

Uh oh!

KotlinIsland commented Nov 2, 2025

Uh oh!

jorenham commented Nov 2, 2025

Uh oh!

jorenham commented Nov 2, 2025

Uh oh!

jorenham commented Nov 2, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

typeshed float | int #1617

Are you sure you want to change the base?

typeshed float | int #1617

Uh oh!

Conversation

beauxq commented Nov 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

beauxq Nov 1, 2025

Choose a reason for hiding this comment

Uh oh!

DetachHead Nov 2, 2025

Choose a reason for hiding this comment

Uh oh!

KotlinIsland commented Nov 2, 2025

Uh oh!

beauxq commented Nov 2, 2025

Uh oh!

DetachHead Nov 2, 2025

Choose a reason for hiding this comment

Uh oh!

beauxq Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

DetachHead Nov 5, 2025

Choose a reason for hiding this comment

Uh oh!

KotlinIsland commented Nov 2, 2025

Uh oh!

jorenham commented Nov 2, 2025

Uh oh!

jorenham commented Nov 2, 2025

Uh oh!

jorenham commented Nov 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

typeshed `float | int` #1617

typeshed `float | int` #1617

beauxq commented Nov 1, 2025 •

edited

Loading

jorenham commented Nov 2, 2025 •

edited

Loading