Add PTO pointer cast op#692
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the pto.ptr_cast operation, enabling the reinterpretation of global pointers with different element types. The changes include IR definitions, documentation, Python bindings, and lowering logic for both memref and EmitC. Review feedback identifies a potential issue where hardcoding the size to 1 in memref.reinterpret_cast could lead to out-of-bounds errors when the resulting pointer is used for array indexing. Additionally, it is suggested to optimize the lowering pass by replacing the nested while(changed) walk with a more efficient traversal strategy to improve performance on larger functions.
| auto offsetTy = | ||
| MemRefType::get({dyn}, targetTy.getElementType(), offsetLayout, | ||
| targetTy.getMemorySpace()); | ||
| SmallVector<OpFoldResult, 1> sizes{rewriter.getIndexAttr(1)}; |
There was a problem hiding this comment.
Hardcoding the size to 1 in the memref.reinterpret_cast operation is potentially problematic. If the resulting pointer is used to access an array (e.g., via pto.load_scalar with a non-zero offset or pto.make_tensor_view), a static size of 1 will lead to out-of-bounds accesses in the memref representation. In the provided test case, pto.load_scalar at index 1 would be invalid for a memref<1xT>. It is safer to preserve the dynamic size from the source memref, potentially scaling it if the element sizes differ.
| while (changed) { | ||
| changed = false; | ||
| DefaultInlineVector<mlir::pto::PtrCastOp> ptrCasts; | ||
| func.walk([&](mlir::pto::PtrCastOp op) { ptrCasts.push_back(op); }); |
There was a problem hiding this comment.
The current implementation uses a while(changed) loop with a full func.walk inside each iteration. This results in
Codex Review该评论由 review 机器人自动更新。
Summary发现 2 个问题: Findings
这里把 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 23a98762d8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (getConstIndexValue(offset, constOffset)) { | ||
| int64_t byteOffset = constOffset * static_cast<int64_t>(srcBytes); | ||
| if (byteOffset % static_cast<int64_t>(dstBytes) != 0) { | ||
| anchorOp->emitError( | ||
| "ptr_cast addptr byte offset is not aligned to the result element size"); |
There was a problem hiding this comment.
Accumulate addptr offsets before checking cast alignment
When lowering pto.ptr_cast(pto.addptr(...)), alignment is checked on each addptr offset independently (byteOffset % dstBytes) instead of on the combined byte offset. This rejects valid programs where individual offsets are misaligned but their sum is aligned (for example, casting !pto.ptr<ui16> to !pto.ptr<ui32> after two +1 element steps should be legal because total byte offset is 4). In this case the pass emits an error and fails lowering even though the final address is representable in destination element units.
Useful? React with 👍 / 👎.
c212290 to
fe5cdc4
Compare
fe5cdc4 to
a43976d
Compare
A5 板测成功
|
A3 板测失败
失败用例
|
A3 板测失败详情:PR #692down_proj_residual
out_proj_residual
|
Summary: add pto.ptr_cast IR op, verifier, Python helper, manual docs, GM memref lowering, and EmitC reinterpret_cast support. Validation: cmake --build build-codex-ci --target ptoas -j2; llvm-lit -sv --filter=ptr_cast build-codex-ci/test/lit; llvm-lit -sv --filter=issue481_addptr_gm_slot_buffer build-codex-ci/test/lit; python3 -m py_compile python/pto/dialects/pto.py; git diff --cached --check.