gh-146385: Switch back to re to detect shlex.quote slow path #146408
bonzini wants to merge 1 commit into python:main from
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
|
@AA-Turner Can you check if the results are consistent on Windows as well please? |
|
Here is what I suggest:
I will also try to check what happens on older systems (openSUSE + older GCC/clang). |
Commit 06a26fd ("gh-118761: Optimise import time for ``shlex`` (#132036)") made shlex.quote() slower when the input has to be quoted. This is because the regular expression search used in 3.13 was able to short-circuit at the first unsafe character. Go back to the same algorithm as 3.13, but make the "import re" and the compilation of the regular expression lazy. Testing s.isascii() makes shlex.quote() twice as fast in the non-ASCII case, but costs up to 25% of the full run time (because it necessitates an earlier isinstance check) if the string *is* ASCII. The latter is probably the common case, so drop the check.
|
Yeah I didn't use |
|
For now, let's check whether the regression is consistent across platforms. |
It is, and also it's really a complexity change. |
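For anyone who wants to probe this locally, the following is a rough timing sketch that reports a mean and a standard deviation per case. The input strings, the length N and the helper name time_quote are illustrative assumptions, not the benchmark used in this thread. It measures whichever shlex.quote the running interpreter provides.

```python
import shlex
import statistics
import timeit


def time_quote(s, number=200, repeat=5):
    """Return (mean, stdev) of the per-call time in seconds for shlex.quote(s)."""
    runs = timeit.repeat(lambda: shlex.quote(s), number=number, repeat=repeat)
    per_call = [t / number for t in runs]
    return statistics.mean(per_call), statistics.stdev(per_call)


# Hypothetical inputs: both need quoting, but the unsafe character
# sits at opposite ends of the string.
N = 10_000
cases = {
    "unsafe first": " " + "a" * N,
    "unsafe last": "a" * N + " ",
}

results = {name: time_quote(s) for name, s in cases.items()}
for name, (mean, sd) in results.items():
    print(f"{name}: {mean * 1e6:.2f} us +/- {sd * 1e6:.2f} us")
```

Running the same script under 3.13 and 3.14 and comparing the two cases gives a first impression. A dedicated benchmarking tool would give more trustworthy numbers.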
|
What I want to know is why Adam reported numbers that went in the other direction. Ideally I also want to know the stdev of the numbers using |
|
I suggested in #132036 (comment) to replace the
|
Commit 06a26fd ("gh-118761: Optimise import time for ``shlex`` (#132036)") made shlex.quote() slower when the input has to be quoted. This is because the regular expression search used in 3.13 was able to short-circuit at the first unsafe character.
Go back to the same algorithm as 3.13, but make the "import re" and the compilation of the regular expression lazy.
Testing s.isascii() makes shlex.quote() twice as fast in the non-ASCII case, but costs up to 25% of the full run time (because it necessitates an earlier isinstance check) if the string is ASCII. The latter is probably the common case, so drop the check.
shlex.quote from 3.13 to 3.14 #146385