fix: LOB phantom undo suppression + fuzz test improvements #16
When Transaction::flush() accumulates multi-piece supplemental log records, an orphaned first-piece (FB_F only) from a LOB row migration could block records from other tables. The new record was dropped (warning 60017) but its FB_L flag still triggered processDml() with the wrong data, permanently losing the DML event. Fix: clear the orphaned redo1/redo2 and replace with the current record. Validated: 27,898 fuzz events, 0 non-LOB mismatches (was 4 before fix).
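The replacement behaviour described in this commit can be modelled in a few lines. This is only a sketch, not the code in Transaction.cpp: the `Accumulator` class, the `Record` type, the flag bit values, and the "different table" test for spotting an orphan are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Illustrative bit values only; not necessarily the ones OLR uses.
FB_F = 0x08  # first piece of a multi-piece record
FB_L = 0x04  # last piece of a multi-piece record


@dataclass
class Record:
    table: str
    fb: int
    payload: str


class Accumulator:
    """Hypothetical stand-in for the multi-piece state kept during flush."""

    def __init__(self):
        self.pending = []   # pieces collected so far
        self.emitted = []   # (table, payload) pairs handed to DML processing

    def add(self, rec):
        if self.pending and self.pending[0].table != rec.table:
            # Orphaned first piece from another table (assumed detection rule):
            # discard it and continue with the current record, rather than
            # dropping the current record and emitting stale data.
            self.pending.clear()
        self.pending.append(rec)
        if rec.fb & FB_L:
            table = self.pending[0].table
            payload = "".join(piece.payload for piece in self.pending)
            self.emitted.append((table, payload))
            self.pending.clear()
```

With this model, an orphaned `FB_F` piece from a LOB table followed by a complete record from another table yields one event for the second table, built from that record's own data.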
…ctions Commit 1a2d316 removed the FLG_ROLLBACK_OP0504 check from appendToTransactionCommit() as part of the olr#10 fix, but this was overly aggressive. The rollbackLastOp() fix in Transaction.cpp already handles LOB phantom undo at the op level independently. Without this check, OLR emits ~2% extra phantom events on LOB tables where Oracle internally commits then rolls back the same XID. Fixes #15
LogMiner only includes LOB column values when explicitly changed by the SQL statement. Unchanged LOB columns appear as __debezium_unavailable_value. This is documented Debezium behavior (DBZ-4276), not a bug — OLR delivers actual LOB content that LogMiner cannot. Update validator to skip unavailable markers in both before and after images instead of only before images.
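A minimal sketch of the comparison rule, assuming each event is a dict with optional `before` and `after` column maps. The function names and event shape are hypothetical and not taken from validator.py; only the marker string comes from the text above.

```python
UNAVAILABLE = "__debezium_unavailable_value"


def images_match(lm_image, olr_image):
    """Compare one row image, ignoring columns LogMiner marked unavailable."""
    lm_image = lm_image or {}
    olr_image = olr_image or {}
    for column, lm_value in lm_image.items():
        if lm_value == UNAVAILABLE:
            # LogMiner did not capture this LOB column; nothing to compare.
            continue
        if olr_image.get(column) != lm_value:
            return False
    return True


def events_match(lm_event, olr_event):
    """Apply the same skip rule to both the before and the after image."""
    return all(
        images_match(lm_event.get(side), olr_event.get(side))
        for side in ("before", "after")
    )
```

The point of the change is the second function: the marker is skipped on both sides of the row, so an unchanged LOB column that OLR delivers in full no longer counts as a mismatch in the after image.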
- Add p_skip_lob parameter to FUZZ_WKL.run() to skip LOB table operations. Usage: SKIP_LOB=1 ./fuzz-test.sh run 60. Debezium LogMiner has a known bug dropping LOB events on RAC (see DEBEZIUM-BUG-RAC-LOB.md). Skipping LOB allows sustained fuzz testing focused on absolute accuracy.
- Classify events beyond the safe frontier as "tail" instead of mismatches. OLR processes redo faster than Debezium LogMiner, so at drain time OLR is ahead. These tail events are timing lag, not data loss.
- Add DBZ_LM_CONNECTOR_JAR env var to mount a patched Debezium connector JAR for the LogMiner adapter.
- Add Debezium RAC LOB bug report and review questions.
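The tail classification can be sketched as follows. This assumes the safe frontier is the highest SCN the LogMiner side has delivered and that events are `(scn, key)` pairs; both are assumptions for illustration, and validator.py may define the frontier differently.

```python
def classify_olr_only(olr_events, lm_events):
    """Split OLR-only events into tail (timing lag) and real mismatches.

    Events are (scn, key) tuples. The frontier is assumed to be the highest
    SCN seen on the LogMiner side, which is the slower of the two readers.
    """
    lm_keys = {key for _, key in lm_events}
    frontier = max((scn for scn, _ in lm_events), default=0)
    tail, mismatches = [], []
    for scn, key in olr_events:
        if key in lm_keys:
            continue  # seen by both sides
        if scn > frontier:
            tail.append((scn, key))        # LogMiner has not reached this yet
        else:
            mismatches.append((scn, key))  # LogMiner passed it without emitting
    return frontier, tail, mismatches
```

Only events at or below the frontier can indicate data loss, since LogMiner had the chance to emit them and did not.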
Extend the phantom undo detection in rollbackLastOp() to cover UPDATE->UPDATE (0x0B05->0x0B05) in addition to INSERT->DELETE (0x0B02->0x0B03). Oracle RAC generates phantom undo for both patterns during LOB segment management. Legitimate LOB rollbacks always strip LOB index records first (lobStripped=true), so the guard remains safe. Fixes 12 missing LOB UPDATE events per 10-min fuzz test run.
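The extended guard can be written as a small predicate. The opcode pairs and the three guard conditions come from this PR; the function name, argument names, and Python form are illustrative and do not reproduce the C++ in rollbackLastOp().

```python
OP_INSERT = 0x0B02
OP_DELETE = 0x0B03
OP_UPDATE = 0x0B05

# (original op, undo op) pairs treated as phantom undo on LOB tables.
PHANTOM_PAIRS = {
    (OP_INSERT, OP_DELETE),
    (OP_UPDATE, OP_UPDATE),  # added by this change
}


def is_phantom_undo(last_op, undo_op, lob_stripped, defer_committed, is_lob_table):
    """Return True when the undo should be ignored instead of rolling back last_op."""
    return (
        not lob_stripped        # legitimate LOB rollbacks strip LOB index records first
        and defer_committed
        and is_lob_table
        and (last_op, undo_op) in PHANTOM_PAIRS
    )
```

Because a legitimate LOB rollback is expected to have `lob_stripped` set, adding the UPDATE pair widens the pattern match without weakening the guard.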
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (3)
📝 Walkthrough
Adds rollback classification for OP0504 records in the parser, expands phantom LOB-undo pattern detection, discards mismatched redo fragments on supplemental-log warnings, and updates the test harness and validator to support optional LOB skipping, unavailable-LOB markers, and tail-lag accounting.
Sequence Diagram(s)
(Skipped: changes are localized control-flow and test updates that do not introduce a new multi-component feature requiring sequential visualization.)
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/KNOWN-LIMITATIONS.md (1)
7-10: ⚠️ Potential issue | 🟡 Minor
Update header text to include L13.
The header states external limitations are "L1-L7" but L13 is now added as an external limitation. Update for consistency.
 Entries are split into two categories:

-- **External limitations** (L1-L7): Oracle LogMiner or Debezium behavior that
+- **External limitations** (L1-L7, L13): Oracle LogMiner or Debezium behavior that
   cannot be fixed in OLR. These require workarounds in test comparison scripts.
 - **OLR bugs** (L8-L12): Issues in OLR that should be fixed. Each has a
   corresponding GitHub issue.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/KNOWN-LIMITATIONS.md` around lines 7 - 10, The header "**External limitations** (L1-L7)" is now out-of-date because L13 was added; update the header text to list the new range (for example change "(L1-L7)" to "(L1-L7, L13)" or similar) so it accurately reflects that L13 is an external limitation; update the same header string in KNOWN-LIMITATIONS.md (the "**External limitations**" header) and search for any other occurrences of that header text to keep them consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/dbz-twin/rac/perf/fuzz-workload.sql`:
- Around line 642-648: The comment above the LOB-remapping block is inaccurate:
the code (IF g_skip_lob = 1 AND v_table_dice > 40 AND v_table_dice <= 55 THEN
v_table_dice := rand_int(1, 30); END IF;) remaps the entire 15% LOB range to the
scalar bucket (making scalar ~45%), not the redistributed percentages shown;
update the comment to state that when g_skip_lob=1 the 41–55 LOB range is
remapped entirely to scalar (1–30), or alternatively implement proper
redistribution logic across the other buckets if the original percentages (35%
scalar, 12% wide, etc.) were intended—refer to variables/functions v_table_dice,
g_skip_lob, and rand_int to locate the code to change.
In `@tests/dbz-twin/rac/validator.py`:
- Line 392: The print statement using an f-string with no placeholders
(print(f"\n RESULT: PASS", flush=True)) should be changed to a regular string
literal to satisfy static analysis; locate the print call that outputs "\n
RESULT: PASS" in validator.py (the print(...) near line 392) and remove the
unnecessary f prefix so it becomes print("\n RESULT: PASS", flush=True).
---
Outside diff comments:
In `@tests/KNOWN-LIMITATIONS.md`:
- Around line 7-10: The header "**External limitations** (L1-L7)" is now
out-of-date because L13 was added; update the header text to list the new range
(for example change "(L1-L7)" to "(L1-L7, L13)" or similar) so it accurately
reflects that L13 is an external limitation; update the same header string in
KNOWN-LIMITATIONS.md (the "**External limitations**" header) and search for any
other occurrences of that header text to keep them consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5b104883-499f-4994-95d4-f3c2171ccb46
📒 Files selected for processing (6)
- src/parser/Parser.cpp
- src/parser/Transaction.cpp
- tests/KNOWN-LIMITATIONS.md
- tests/dbz-twin/rac/fuzz-test.sh
- tests/dbz-twin/rac/perf/fuzz-workload.sql
- tests/dbz-twin/rac/validator.py
- Update KNOWN-LIMITATIONS.md header to include L13
- Fix inaccurate comment in fuzz-workload.sql about LOB skip redistribution
- Remove unnecessary f-string prefix in validator.py
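The arithmetic behind the corrected fuzz-workload.sql comment can be checked with a short calculation. This sketch assumes `v_table_dice` is drawn uniformly from 1 to 100 (not stated in the review) and uses the two bucket ranges the review does name: scalar at 1-30 and LOB at 41-55. The function name is hypothetical.

```python
from fractions import Fraction

SCALAR = range(1, 31)  # dice values 1-30
LOB = range(41, 56)    # dice values 41-55


def scalar_and_lob_share(skip_lob, dice_max=100):
    """Exact probability of landing in the scalar and LOB buckets."""
    scalar = Fraction(0)
    lob = Fraction(0)
    p = Fraction(1, dice_max)  # assumed uniform dice
    for dice in range(1, dice_max + 1):
        if skip_lob and dice in LOB:
            # Mirrors: v_table_dice := rand_int(1, 30); always a scalar value.
            scalar += p
        elif dice in SCALAR:
            scalar += p
        elif dice in LOB:
            lob += p
    return scalar, lob
```

Under these assumptions the scalar share rises from 30% to 45% when LOB operations are skipped, and the other buckets keep their original shares. That matches the review's reading that the LOB range is remapped entirely to scalar rather than redistributed.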
Summary
`FLG_ROLLBACK_OP0504` check removed in 1a2d316
Changes
OLR fixes
- `Parser.cpp`: Restore `FLG_ROLLBACK_OP0504` check on opcode 0x0504 commit records. Upstream had this; our olr#10 fix incorrectly removed it. The `rollbackLastOp()` fix in Transaction.cpp independently handles LOB phantom undo at the op level.
- `Transaction.cpp`: Extend phantom undo detection to cover UPDATE→UPDATE (0x0B05→0x0B05) in addition to INSERT→DELETE (0x0B02→0x0B03). Oracle RAC generates phantom undo for both patterns. Guard: `!lobStripped && deferCommittedTransactions && LOB table`.
Fuzz test improvements
- `fuzz-workload.sql`: Add `p_skip_lob` parameter. `SKIP_LOB=1 ./fuzz-test.sh run 60` skips LOB table ops for absolute non-LOB accuracy testing.
- `validator.py`: Classify events beyond the safe frontier as "tail" (timing lag) instead of mismatches. Skip `__debezium_unavailable_value` in both before and after images (L13).
- `KNOWN-LIMITATIONS.md`: Add L13 documenting LogMiner LOB unavailable value behavior per Debezium docs.
Test plan
Related
Summary by CodeRabbit
Bug Fixes
Documentation
`__debezium_unavailable_value` placeholders.
Tests