
Python: add new shared-CFG-backed control flow graph (additive) #21921

Open: yoff wants to merge 1 commit into yoff/python-flow-py-namespace from yoff/python-add-new-cfg-library

@yoff yoff commented Jun 1, 2026

Summary

Preparatory refactor for the shared-CFG dataflow migration (#21894).
Based on #21920 — merge that first.

Adds the new Python CFG library additively, without changing any production behaviour.

Library additions

  • semmle.python.controlflow.internal.AstNodeImpl — mediates between the Python AST and the shared codeql.controlflow.ControlFlowGraph signature. Wraps Python's Stmt/Expr/Scope/Pattern and adds two synthetic kinds of node (BlockStmt for body slots, intermediate nodes for multi-operand boolean expressions).
  • semmle.python.controlflow.internal.Cfg — public facade re-exposing the same API surface as semmle/python/Flow.qll (ControlFlowNode, CallNode, BasicBlock, NameNode, DefinitionNode, CompareNode, ...), backed by the shared CFG. Intended as a drop-in replacement for use by the upcoming dataflow migration.
  • lib/printCfgNew.ql — debug/visualisation query for the new CFG.
  • consistency-queries/CfgConsistency.ql — runs the shared CFG's standard structural checks against Python.
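
The need for intermediate nodes in multi-operand boolean expressions can be illustrated with Python's own standard-library `ast` module. This is only a sketch: it inspects the CPython AST, not the CodeQL AST that AstNodeImpl wraps, and the helper name `bool_operands` is hypothetical. It shows that `a and b and c` is a single boolean node with three operands, so the short-circuit points between operands have no AST node of their own.

```python
import ast


def bool_operands(source: str) -> list[str]:
    """Return the operand sources of a top-level boolean expression.

    Hypothetical helper for illustration; it uses CPython's `ast`,
    not the CodeQL Python AST.
    """
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.BoolOp):
        raise ValueError("expected a boolean expression")
    # A chain of the same operator is flattened into one BoolOp node.
    return [ast.unparse(value) for value in node.values]


# One node, three operands: two short-circuit exits lie between them.
print(bool_operands("a and b and c"))
# Mixed operators nest instead: the `or` has two operands.
print(bool_operands("a and b or c"))
```

A CFG needs somewhere to branch after `a` and after `b`, which is presumably the role the synthetic intermediate nodes play.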

Shared library

  • shared/controlflow/.../ControlFlowGraph.qll — adds two defaulted predicates getWhileElse / getForeachElse to AstSig so Python can model while-else / for-else. The defaults are none(), so existing languages are unaffected.
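
For readers less familiar with the Python construct these hooks model, here is a minimal sketch of `for`-`else` and `while`-`else` semantics in plain Python. The function names are hypothetical and unrelated to the QL predicates. The `else` block runs only when the loop ends because its condition or iterator is exhausted, and is skipped on `break`.

```python
def find_first_negative(values):
    """Return (first negative value or None, whether the else block ran)."""
    found = None
    else_ran = False
    for v in values:
        if v < 0:
            found = v
            break  # leaving via break skips the else block
    else:
        else_ran = True  # reached only when the loop was not broken out of
    return found, else_ran


def countdown(n):
    """Collect loop steps, then a marker showing the else block ran."""
    steps = []
    while n > 0:
        steps.append(n)
        n -= 1
    else:
        steps.append("else")  # runs when the condition becomes false
    return steps


print(find_first_negative([1, -2, 3]))
print(find_first_negative([1, 2]))
print(countdown(2))
```

The loop therefore has two distinct exits: the normal exit flows through the `else` body, while `break` bypasses it.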

Test additions

  • ControlFlow/bindings/* — annotation-driven SSA-binding tests for the new CFG (annassign, compound, comprehension, decorated, except_handler, imports, match_pattern, parameters, simple, type_params, walrus_starred, with_stmt, dead_under_no_raise).
  • ControlFlow/store-load/* — basic store/load coverage.
  • ControlFlow/evaluation-order/NewCfg*.ql — mirrors of the existing OldCfg evaluation-order self-validation suite, run against the new CFG via NewCfgImpl.qll.
  • Minor extensions to existing test_if.py / test_boolean.py plus cosmetic .expected churn on a handful of OldCfg tests (to add more annotation coverage).
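
As a rough illustration of what store/load classification means at the Python level, the sketch below groups `Name` nodes by their context using the standard-library `ast` module. It is not the logic of StoreLoadTest.ql or BindingsTest.ql, and `classify_names` is a hypothetical helper. It only shows the distinction those tests annotate.

```python
import ast


def classify_names(source):
    """Group identifier uses by CPython AST context (store/load/delete).

    Hypothetical helper; bindings that are not `Name` nodes in CPython's
    AST (import aliases, `except ... as e`) are deliberately ignored.
    """
    kinds = {ast.Store: "store", ast.Load: "load", ast.Del: "delete"}
    result = {"store": [], "load": [], "delete": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            result[kinds[type(node.ctx)]].append(node.id)
    return {kind: sorted(names) for kind, names in result.items()}


print(classify_names("x = y\ndel z\n"))
print(classify_names("(n := len(data))\n"))
```

Constructs such as imports and exception handlers bind names without a `Name` node in CPython's AST, which is one reason a dedicated bindings suite covers them separately.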

Production impact

None. No dataflow, SSA, or production query is migrated yet — that lands in follow-up PRs. The new CFG library has zero callers in lib/ and src/ after this PR.

Verification

  • All 367 lib/ + src/ + consistency-queries/ queries compile clean.
  • All 56 ControlFlow library-tests pass.
  • All 474 dataflow + PointsTo library-tests + consistency tests pass.
  • syntax_error/CONSISTENCY/CfgConsistency passes.

Notes for reviewers

  • AstNodeImpl.qll is the bulk of the new code (~1.7k LOC) and is the file most worth careful review. Most of it is the AST→CFG mapping table (getChild, isExpressionNode, beginAbruptCompletion, etc.) plus the synthetic BlockStmt newtype.
  • Cfg.qll is the facade and is mostly mechanical class-by-class shadowing of Flow.qll's public API on top of the shared library.

Copilot AI review requested due to automatic review settings June 1, 2026 11:54
@yoff yoff requested review from a team as code owners June 1, 2026 11:54
@yoff yoff force-pushed the yoff/python-add-new-cfg-library branch from acf744a to b547f1b Compare June 1, 2026 11:55

Copilot AI left a comment


Pull request overview

This PR additively introduces a new Python control-flow-graph (CFG) facade backed by the shared codeql.controlflow.ControlFlowGraph library, along with supporting shared-library signature extensions for while-else / for-else, debug/consistency queries, and extensive new/updated library tests to validate the new CFG wiring.

Changes:

  • Extend the shared CFG AST signature with defaulted getWhileElse / getForeachElse hooks and wire loop else control-flow edges.
  • Add new Python shared-CFG adapter libraries (AstNodeImpl.qll, Cfg.qll), plus a new “print CFG” query and a consistency query.
  • Add/adjust ControlFlow library tests (bindings, store/load, evaluation-order) to exercise the new CFG.

Summary per file (file path, then description):
shared/controlflow/codeql/controlflow/ControlFlowGraph.qll Adds defaulted loop-else predicates to AstSig and wires while/foreach else edges into shared CFG stepping.
shared/controlflow/change-notes/2026-05-19-loop-else.md Change note for shared CFG loop-else support.
python/ql/test/library-tests/ControlFlow/store-load/test.py New store/load/delete/parameter annotation test source.
python/ql/test/library-tests/ControlFlow/store-load/StoreLoadTest.ql New inline-expectations test query for store/load classification on the new CFG facade.
python/ql/test/library-tests/ControlFlow/store-load/StoreLoadTest.expected Expected output for the store/load test query.
python/ql/test/library-tests/ControlFlow/evaluation-order/TimerUtils.qll Adds helper to extract timestamp literals (supporting tuple timestamps).
python/ql/test/library-tests/ControlFlow/evaluation-order/test_if.py Updates evaluation-order test source for if constructs.
python/ql/test/library-tests/ControlFlow/evaluation-order/test_boolean.py Adjusts boolean evaluation-order annotations to satisfy new branch-timestamp checking.
python/ql/test/library-tests/ControlFlow/evaluation-order/StrictForward.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/evaluation-order/StrictForward.expected Expected output adjustments due to source/label changes.
python/ql/test/library-tests/ControlFlow/evaluation-order/OldCfgImpl.qll Updates old-CFG implementation module (import aliasing/typing).
python/ql/test/library-tests/ControlFlow/evaluation-order/NoSharedReachable.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/evaluation-order/NoBasicBlock.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/evaluation-order/NoBackwardFlow.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/evaluation-order/NoBackwardFlow.expected Expected output adjustments due to source/label changes.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgStrictForward.ql Adds new-CFG variant of the StrictForward evaluation-order check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgStrictForward.expected Expected output for new-CFG StrictForward check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgNoSharedReachable.ql Adds new-CFG variant of the NoSharedReachable evaluation-order check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgNoSharedReachable.expected Expected output for new-CFG NoSharedReachable check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgNoBasicBlock.ql Adds new-CFG variant of the NoBasicBlock evaluation-order check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgNoBasicBlock.expected Expected output for new-CFG NoBasicBlock check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgNoBackwardFlow.ql Adds new-CFG variant of the NoBackwardFlow evaluation-order check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgNoBackwardFlow.expected Expected output for new-CFG NoBackwardFlow check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgNeverReachable.ql Adds new-CFG variant of the NeverReachable evaluation-order check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgNeverReachable.expected Expected output for new-CFG NeverReachable check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgImpl.qll Provides new-CFG implementation of the evaluation-order CFG signature.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgConsecutiveTimestamps.ql Adds new-CFG variant of ConsecutiveTimestamps check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgConsecutiveTimestamps.expected Expected output for new-CFG ConsecutiveTimestamps check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgConsecutivePredecessorTimestamps.ql Adds new-CFG variant of ConsecutivePredecessorTimestamps check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgConsecutivePredecessorTimestamps.expected Expected output for new-CFG ConsecutivePredecessorTimestamps check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgBranchTimestamps.ql Adds new-CFG branch timestamp completeness check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgBranchTimestamps.expected Expected output for new-CFG branch timestamp check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgBasicBlockOrdering.ql Adds new-CFG variant of BasicBlockOrdering check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgBasicBlockOrdering.expected Expected output for new-CFG BasicBlockOrdering check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgBasicBlockAnnotationGap.ql Adds new-CFG variant of BasicBlockAnnotationGap check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgBasicBlockAnnotationGap.expected Expected output for new-CFG BasicBlockAnnotationGap check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgAnnotationHasCfgNode.ql Adds new-CFG variant of AnnotationHasCfgNode check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgAnnotationHasCfgNode.expected Expected output for new-CFG AnnotationHasCfgNode check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgAllLiveReachable.ql Adds new-CFG variant of AllLiveReachable check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NewCfgAllLiveReachable.expected Expected output for new-CFG AllLiveReachable check.
python/ql/test/library-tests/ControlFlow/evaluation-order/NeverReachable.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/evaluation-order/ContiguousTimestamps.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/evaluation-order/ConsecutiveTimestamps.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/evaluation-order/ConsecutiveTimestamps.expected Expected output adjustments due to source/label changes.
python/ql/test/library-tests/ControlFlow/evaluation-order/BasicBlockOrdering.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/evaluation-order/BasicBlockOrdering.expected Expected output adjustments due to source/label changes.
python/ql/test/library-tests/ControlFlow/evaluation-order/BasicBlockAnnotationGap.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/evaluation-order/AnnotationHasCfgNode.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/evaluation-order/AllLiveReachable.ql Ensures necessary imports for evaluation-order query utilities.
python/ql/test/library-tests/ControlFlow/bindings/with_stmt.py Adds new-CFG binding coverage for with ... as ... constructs.
python/ql/test/library-tests/ControlFlow/bindings/walrus_starred.py Adds new-CFG binding coverage for walrus and starred-target edge cases.
python/ql/test/library-tests/ControlFlow/bindings/type_params.py Adds new-CFG binding coverage for PEP 695 type parameters and type statements.
python/ql/test/library-tests/ControlFlow/bindings/simple.py Adds basic binding coverage sanity tests for the new CFG.
python/ql/test/library-tests/ControlFlow/bindings/parameters.py Adds binding coverage for parameters (incl. varargs/kwargs/kw-only/pos-only).
python/ql/test/library-tests/ControlFlow/bindings/match_pattern.py Adds binding coverage for match-statement patterns.
python/ql/test/library-tests/ControlFlow/bindings/imports.py Adds binding coverage for import/from ... import ... aliases.
python/ql/test/library-tests/ControlFlow/bindings/except_handler.py Adds binding coverage for exception handler as name bindings (incl. except*).
python/ql/test/library-tests/ControlFlow/bindings/decorated.py Adds binding coverage for decorated defs/classes and decorator stacking.
python/ql/test/library-tests/ControlFlow/bindings/dead_under_no_raise.py Adds regression tests documenting intentionally-dead bindings under “no expressions raise”.
python/ql/test/library-tests/ControlFlow/bindings/comprehension.py Adds binding coverage for for-targets and comprehension scopes (incl. synthetic .0).
python/ql/test/library-tests/ControlFlow/bindings/compound.py Adds binding coverage for tuple/list unpacking, nesting, and star-unpacking.
python/ql/test/library-tests/ControlFlow/bindings/BindingsTest.ql Adds inline-expectations query to assert AST bindings have corresponding new-CFG nodes.
python/ql/test/library-tests/ControlFlow/bindings/BindingsTest.expected Expected output for bindings test query.
python/ql/test/library-tests/ControlFlow/bindings/annassign.py Adds binding coverage for annotated assignments (with/without initializer).
python/ql/test/extractor-tests/syntax_error/CONSISTENCY/CfgConsistency.expected Adds expected output for CFG consistency check on syntax-error corpus.
python/ql/lib/semmle/python/controlflow/internal/Cfg.qll Adds new shared-CFG-backed Python CFG facade mirroring legacy Flow API.
python/ql/lib/semmle/python/controlflow/internal/AstNodeImpl.qll Adds Python AST → shared CFG adapter implementing the shared CFG AstSig.
python/ql/lib/printCfgNew.ql Adds debug/visualisation query for the new CFG (IDE contextual query).
python/ql/lib/change-notes/2026-05-19-add-shared-cfg.md Change note for introducing the new shared-CFG-backed Python CFG library.
python/ql/consistency-queries/CfgConsistency.ql Adds CFG structural consistency query for the new Python shared CFG.

Copilot's findings

Comments suppressed due to low confidence (1)

python/ql/test/library-tests/ControlFlow/evaluation-order/test_if.py:115

  • The file ends with two bare @test decorators without any following function definition, which makes the test file syntactically invalid and will break the evaluation-order test suite.
  • Files reviewed: 60/70 changed files
  • Comments generated: 2
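
The suppressed finding is easy to check locally. The sketch below is a hypothetical helper, not part of the PR. It uses the standard-library `ast` module to confirm that a decorator with no following `def` or `class` is a syntax error, while the same decorator on a function parses fine even if the name `test` is undefined.

```python
import ast


def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Two bare decorators at end of file: nothing for them to decorate.
print(parses("@test\n@test\n"))
# A decorator followed by a definition parses; names are not resolved here.
print(parses("@test\ndef f():\n    pass\n"))
```

Running a check like this over test_if.py would show whether the reported line 115 is a real problem or a false positive.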

class BasicBlock = Py::BasicBlock;

CfgNode scopeGetEntryNode(PY::Scope s) { result = s.getEntryNode() }
CfgNode scopeGetEntryNode(Scope s) { result = s.getEntryNode() }
Comment on lines +10 to +13
* For subscript / attribute stores the tag fires on the Subscript /
* Attribute node itself, with `value` set to the rightmost identifier
* (the attribute name for `Attribute`, the index expression's textual
* form for `Subscript`).
yoff pushed a commit that referenced this pull request Jun 1, 2026
Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.

This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:

  P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
  P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
  P3: Add new shared-CFG-backed control flow graph (#21921).
  P4: Add new shared-SSA-backed SSA adapter (#21923).

The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.

GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.

Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.

A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
  pre/post-order pairs as distinct nodes.
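
To illustrate the first tweak at the Python level, here is a small sketch with hypothetical helper names, using CPython's `ast` rather than the CodeQL AST. The syntax tree marks an augmented-assignment target as a store only, even though executing it reads the variable first, which is why an analysis has to treat the target as both.

```python
import ast


def augassign_target_context(source):
    """Return the AST context name of an augmented-assignment target."""
    stmt = ast.parse(source).body[0]
    if not isinstance(stmt, ast.AugAssign):
        raise ValueError("expected an augmented assignment")
    return type(stmt.target.ctx).__name__


def reads_before_writing():
    """Show that `x += 1` reads x: with x unbound it raises NameError."""
    try:
        exec("x += 1", {})
    except NameError:
        return True
    return False


print(augassign_target_context("x += 1"))
print(reads_before_writing())
```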

Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
  FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.

Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.

Verification: all 367 lib + src + consistency-queries compile clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
yoff pushed a commit that referenced this pull request Jun 1, 2026
yoff pushed a commit that referenced this pull request Jun 1, 2026
@yoff yoff force-pushed the yoff/python-flow-py-namespace branch from 6bed86e to 73f9a1b Compare June 1, 2026 13:27
@yoff yoff force-pushed the yoff/python-add-new-cfg-library branch from b547f1b to 370dc98 Compare June 1, 2026 13:27
yoff pushed a commit that referenced this pull request Jun 1, 2026
@yoff yoff force-pushed the yoff/python-flow-py-namespace branch from 73f9a1b to 709a4a3 Compare June 1, 2026 14:05
Preparatory refactor for the shared-CFG dataflow migration. Adds the
new Python CFG library additively, without changing any production
behaviour.

Library additions:

- semmle.python.controlflow.internal.AstNodeImpl — mediates between
  the Python AST and the shared codeql.controlflow.ControlFlowGraph
  signature. Wraps Python's Stmt/Expr/Scope/Pattern and adds two
  synthetic kinds of node (BlockStmt for body slots, intermediate
  nodes for multi-operand boolean expressions).

- semmle.python.controlflow.internal.Cfg — public facade
  re-exposing the same API surface as semmle/python/Flow.qll
  (ControlFlowNode, CallNode, BasicBlock, NameNode, DefinitionNode,
  CompareNode, ...), backed by the shared CFG.

- lib/printCfgNew.ql — debug/visualisation query for the new CFG.

- consistency-queries/CfgConsistency.ql — consistency query running
  the shared CFG's standard checks against Python.

Shared library:

- shared.controlflow.ControlFlowGraph — adds two defaulted
  getWhileElse / getForeachElse predicates to AstSig so Python can
  model while-else / for-else (no behavioural change for other
  languages).

Test additions:

- ControlFlow/bindings/* — annotation-driven SSA-binding tests for
  the new CFG (annassign, compound, comprehension, decorated,
  except_handler, imports, match_pattern, parameters, simple,
  type_params, walrus_starred, with_stmt, dead_under_no_raise).

- ControlFlow/store-load/* — basic store/load coverage.

- ControlFlow/evaluation-order/NewCfg*.ql — mirrors of the existing
  OldCfg evaluation-order self-validation suite, run against the
  new CFG via NewCfgImpl.qll.

- Minor extensions to existing test_if.py / test_boolean.py +
  cosmetic .expected churn on a handful of OldCfg tests.

No dataflow, SSA, or production query is migrated yet — that lands in
follow-up PRs. The new CFG library has zero callers in lib/ and src/.

Verified by:
- All lib + src + consistency-queries compile clean (367 queries).
- All 56 ControlFlow library-tests pass.
- All 474 dataflow + PointsTo library-tests + consistency tests pass.
- syntax_error/CONSISTENCY/CfgConsistency passes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yoff yoff force-pushed the yoff/python-add-new-cfg-library branch from 370dc98 to 0ca4cab Compare June 1, 2026 14:05
yoff pushed a commit that referenced this pull request Jun 1, 2026