
[mypyc] Speed up native-to-native imports within the same group #21101

Open
JukkaL wants to merge 50 commits into master from mypyc-imports


@JukkaL
Collaborator

JukkaL commented Mar 25, 2026

When compiling multiple modules, mypyc generally creates one big shared library containing all the code, plus a tiny shim shared library for each compiled module so that the Python import machinery can find the modules. This is inefficient, at least on macOS, since each shared library loaded into the process, including each shim, seems to have a non-trivial cost. On the first run this cost is much higher, and the first mypy run after pip install can take 30s or more on macOS.

This PR addresses the slow imports on macOS by adding a custom implementation of native-to-native imports within the same compilation group that avoids using the shim. We directly construct the module object, add it to sys.modules, and set the corresponding attribute on the parent package, without going through the Python import machinery.
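The three steps described above can be sketched in pure Python. This is only an illustration: the PR does this in generated C, and the helper name install_module_directly and the demo_pkg module names are hypothetical.

```python
import sys
import types
from typing import Optional


def install_module_directly(fullname: str, file: Optional[str] = None) -> types.ModuleType:
    """Create and register a module without calling the import machinery.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    existing = sys.modules.get(fullname)
    if existing is not None:
        # Already registered: behave like a repeated import.
        return existing

    # Step 1: construct the module object directly.
    module = types.ModuleType(fullname)
    if file is not None:
        module.__file__ = file

    # Step 2: make the module visible to later imports.
    sys.modules[fullname] = module

    # Step 3: expose the module as an attribute of its parent package,
    # so that "parent.child" attribute access works after the import.
    parent_name, _, child_name = fullname.rpartition(".")
    if parent_name:
        parent = sys.modules.get(parent_name)
        if parent is not None:
            setattr(parent, child_name, module)
    return module


pkg = install_module_directly("demo_pkg")
sub = install_module_directly("demo_pkg.sub")
```

After these calls, sys.modules["demo_pkg.sub"] and pkg.sub refer to the same module object, which is the state a regular import would also leave behind.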

This speeds up a minimal mypy run (mypy -c 'import os') on macOS by up to 10x on the first cold run after installation, and even small warm runs are significantly faster. The measurements were noisy, but in at least one measurement the minimal warm run was over 1.5x faster with these changes. The impact on Linux should be small (an earlier version of this PR was slightly faster on Linux, but I didn't measure the current one). I haven't measured the impact on Windows.

Some notes about the implementation:

  • Group similar imported names in from <...> import and try to generate a single call to import multiple names to avoid verbose IR.
  • When importing non-native modules or native modules defined in another group, we still rely on Python import machinery.
  • Various attributes are implicitly defined by Python when importing a module, and I set these attributes explicitly.
  • I split module init into two parts, since the attributes mentioned above need to be set before running the module body.
  • Avoid generating shims for some __init__.py files when compiling mypy as a micro-optimization.
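The two-part module init mentioned in the notes can be illustrated with a small Python sketch. It assumes that __file__ and __package__ are among the implicitly defined attributes (the description does not list them), and the function names create_module and exec_module as well as the module name and path are hypothetical.

```python
import sys
import types
from typing import Callable


def create_module(fullname: str, file: str, is_package: bool) -> types.ModuleType:
    """Phase 1: create the module and set attributes an import normally provides."""
    module = types.ModuleType(fullname)
    module.__file__ = file
    # A package is its own __package__; a plain module uses its parent's name.
    module.__package__ = fullname if is_package else fullname.rpartition(".")[0]
    sys.modules[fullname] = module
    return module


def exec_module(module: types.ModuleType, body: Callable[[types.ModuleType], None]) -> None:
    """Phase 2: run the module body, which may read the attributes set in phase 1."""
    body(module)


def demo_body(module: types.ModuleType) -> None:
    # The body relies on __file__ already being set by phase 1.
    module.origin = module.__file__
    module.answer = 42


mod = create_module("demo_two_phase", "/tmp/demo_two_phase.py", is_package=False)
exec_module(mod, demo_body)
```

Splitting the work this way means the body never observes a module that lacks these attributes, which is the ordering constraint the note describes.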

I used Claude Code and Codex to implement much of the code in small increments (based on a manually written core implementation). I also iterated on the code quite significantly after the basic implementation was done.

@ilevkivskyi
Member

@JukkaL the test failures on Windows look real.

@JukkaL
Collaborator Author

JukkaL commented Mar 25, 2026

Yeah, I'm investigating the Windows failures.

Comment on lines +1229 to +1238
PyObject *file = PyObject_GetAttrString(modobj, "__file__");
if (file != NULL) {
    // __file__ already set, nothing to do.
    Py_DECREF(file);
    return 0;
}
if (!PyErr_ExceptionMatches(PyExc_AttributeError)) {
    return -1;
}
PyErr_Clear();
Collaborator

Could use PyObject_GetOptionalAttrString to avoid having to deal with the exception here and in the other new functions.
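As a rough Python analogue of this suggestion (the reviewed code is C, and the helper name set_file_if_missing is hypothetical), an optional lookup with a sentinel default separates "attribute missing" from "attribute present" without raising and then clearing an AttributeError:

```python
import types

# Sentinel that cannot collide with any real attribute value.
_MISSING = object()


def set_file_if_missing(module: types.ModuleType, path: str) -> bool:
    """Set __file__ only when absent; return True if it was set."""
    if getattr(module, "__file__", _MISSING) is not _MISSING:
        # __file__ already set, nothing to do.
        return False
    module.__file__ = path
    return True


demo = types.ModuleType("demo_optional_attr")
first = set_file_if_missing(demo, "/tmp/a.py")
second = set_file_if_missing(demo, "/tmp/b.py")
```

The first call sets the attribute and the second leaves it unchanged, matching the early-return branch in the reviewed snippet.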

Comment on lines +235 to +239
def func() -> int:
    return 42
[file driver.py]
import native
assert native.f() == 42
Collaborator

Test whether __file__ is as expected?

@github-actions
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
