chore: DWARF / witness mapping discovery for fused output (sub-#130) #131

Draft: avrabe wants to merge 1 commit into main from chore/dwarf-witness-discovery

@avrabe commented May 1, 2026

Summary

  • Discovery and oracle test for the witness MC/DC integration epic tracked in #130 (DWARF / source-map remapping for fused output).
  • meld currently forwards DWARF custom sections byte-for-byte through fusion. DWARF addresses encode offsets into the input code section; the fused code section has a different layout, so every preserved address is wrong. With multi-core-module inputs, the fused output also carries duplicate .debug_* sections.
  • This PR ships only the discovery: a five-test file pinning today's lossy behavior, so the planned remap work has a green-to-red signal to flip. No production code changes; defaults unchanged.

Concrete numbers from the new test on tests/wit_bindgen/fixtures/lists.wasm:

DWARF sections at top level of fused module: {
    ".debug_abbrev": 2,  ".debug_info": 2,  ".debug_line": 2,
    ".debug_loc": 2,     ".debug_ranges": 2, ".debug_str": 2,
}
input  code-section length (sum across embedded modules): 231531 bytes
fused  code-section length:                                213242 bytes
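
As a rough illustration of how numbers like these can be gathered, the following std-only sketch walks the top-level sections of a single core module, counts custom sections whose name starts with `.debug_`, and sums the code-section size. It is not the code in `dwarf_passthrough.rs`: `read_u32_leb`, `Summary`, and `summarize_core_module` are hypothetical names, it does not descend into components (so it would have to be applied to each embedded module of `lists.wasm` separately), and it assumes "code-section length" means the section's declared body size.

```rust
use std::collections::BTreeMap;

/// Read an unsigned LEB128 u32 at `*pos`, advancing `*pos`.
/// Returns None on truncated input or an encoding longer than 5 bytes.
fn read_u32_leb(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result = 0u32;
    let mut shift = 0u32;
    loop {
        if shift >= 32 {
            return None;
        }
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Hypothetical summary type: DWARF section multiplicity and code size.
struct Summary {
    debug_sections: BTreeMap<String, usize>,
    code_len: usize,
}

/// Walk the sections of one core module (not a component).
fn summarize_core_module(bytes: &[u8]) -> Option<Summary> {
    // Magic "\0asm" followed by core-module version 1.
    if !bytes.starts_with(b"\0asm\x01\0\0\0") {
        return None;
    }
    let mut pos = 8;
    let mut summary = Summary { debug_sections: BTreeMap::new(), code_len: 0 };
    while pos < bytes.len() {
        let id = bytes[pos];
        pos += 1;
        let size = read_u32_leb(bytes, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        let body = bytes.get(pos..end)?;
        match id {
            // Custom section: a length-prefixed UTF-8 name, then payload.
            0 => {
                let mut p = 0;
                let name_len = read_u32_leb(body, &mut p)? as usize;
                let name_end = p.checked_add(name_len)?;
                let name = std::str::from_utf8(body.get(p..name_end)?).ok()?;
                if name.starts_with(".debug_") {
                    *summary.debug_sections.entry(name.to_string()).or_insert(0) += 1;
                }
            }
            // Code section: record its declared body size.
            10 => summary.code_len += size,
            _ => {}
        }
        pos = end;
    }
    Some(summary)
}
```

Running such a walker over each input module and over the fused module would give the two sides of the comparison: section multiplicity on one hand, and the mismatch in code-section size on the other.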

Audit findings (committed in the test docstring and filed in #130)

  • FuserConfig::default().custom_sections == Merge (meld-core/src/lib.rs:110). DWARF is preserved, not dropped.
  • merger.rs:2010-2012 naively concatenates per-module custom_sections with no dedup or rewriting.
  • lib.rs::encode_output (lines 1345-1356) emits all of them unless Drop is configured.
  • No code anywhere in meld-core/src/ parses or rewrites .debug_* content.
  • Component-level custom sections are dropped at parse time (parser.rs:1082-1084); only core-module custom sections flow through.
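
The findings above can be modelled in a few lines. This is a simplified sketch, not meld's merger code: the `CustomSectionHandling` enum here is a local stand-in with only the two variants named in this PR, and `CustomSection`, `fuse_custom_sections`, and `count_by_name` are hypothetical. It also assumes that Drop removes every custom section, not only DWARF.

```rust
/// Local stand-in for the policy enum; only the variants named in the PR.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CustomSectionHandling {
    Merge,
    Drop,
}

/// Hypothetical custom-section record: a name plus opaque payload bytes.
#[derive(Clone, Debug, PartialEq, Eq)]
struct CustomSection {
    name: String,
    data: Vec<u8>,
}

/// Model of today's behaviour: Merge concatenates every module's custom
/// sections in order, without dedup or rewriting; Drop emits none.
fn fuse_custom_sections(
    modules: &[Vec<CustomSection>],
    policy: CustomSectionHandling,
) -> Vec<CustomSection> {
    match policy {
        CustomSectionHandling::Drop => Vec::new(),
        CustomSectionHandling::Merge => modules.iter().flatten().cloned().collect(),
    }
}

/// Count how many sections carry a given name in the fused output.
fn count_by_name(sections: &[CustomSection], name: &str) -> usize {
    sections.iter().filter(|s| s.name == name).count()
}
```

With two input modules that each carry `.debug_info`, the Merge path yields two `.debug_info` sections, which matches the multiplicity of 2 reported for the fixture.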

Cross-repo dependency: witness

pulseengine/witness v0.11.x is the consumer that breaks. It uses gimli (crates/witness-core/src/decisions.rs) to build a (code-section byte offset) -> (file, line) map for MC/DC attribution. The DWARF that meld passes through today gives gimli wrong and duplicated input.
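
To make the failure mode concrete, here is a toy model of an offset-to-line map. It does not use gimli and is not witness's implementation; `LineTable` and `lookup` are hypothetical, and the table is assumed to hold one row per address at which a new source line starts.

```rust
use std::collections::BTreeMap;

/// Hypothetical line table: code-section byte offset -> (file, line).
type LineTable = BTreeMap<u32, (&'static str, u32)>;

/// Return the row covering `offset`: the last row that starts at or
/// before it, or None if the offset precedes every row.
fn lookup(table: &LineTable, offset: u32) -> Option<(&'static str, u32)> {
    table.range(..=offset).next_back().map(|(_, loc)| *loc)
}
```

Suppose the table was built for the input layout with rows at offsets 0 (line 10) and 40 (line 20). An instruction at input offset 45 resolves to line 20. If fusion moves that instruction to offset 30 while the table is passed through unchanged, a lookup at 30 returns line 10, and nothing signals that the answer is wrong.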

Witness is intentionally NOT a meld-core dependency. End-to-end "run witness on a fused module" verification belongs cross-repo (likely in wasm-component-examples release evidence). The test in this PR covers only meld-side invariants.

What this PR contains

  • meld-core/tests/dwarf_passthrough.rs — 5 tests, all green:
    • current_default_is_merge_not_drop — pins the default.
    • fixture_carries_dwarf_in_some_embedded_module — sanity-checks that the fixture still has DWARF.
    • fused_output_inherits_dwarf_byte_for_byte_today — pins that DWARF is forwarded with multiplicity (2 of each section in the fused output).
    • drop_policy_strips_dwarf_completely — pins the user's only escape hatch today (CustomSectionHandling::Drop).
    • dwarf_addresses_in_fused_output_are_known_to_be_wrong — pins the structural "code section length differs" invariant that proves passed-through addresses are invalid.
  • TODO comments call out where each assertion flips at Phase 1.5 / Phase 2 / Phase 3.

Phased plan (tracked in #130)

  • Phase 1 (this PR): discovery + oracle test.
  • Phase 1.5: .debug_*-aware policy distinct from generic custom-section handling. Recommend default = drop for .debug_* until Phase 2 ships, so users at least don't get wrong source attribution.
  • Phase 2: real DWARF remap — rewrite .debug_line programs and .debug_info DW_AT_low_pc / DW_AT_high_pc / DW_AT_ranges to the merged code section using the function-body relocation map the merger already builds. Dedup .debug_str. Likely needs a gimli dep on the meld-core side.
  • Phase 3: synthesize DIEs for adapter / lift-lower function bodies (or accept the gap, since witness has a strict-per-br_if fallback). Variable-level debug info explicitly out of scope.
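
The core of the Phase 2 remap can be sketched as an address translation over a function-body relocation map. The shape of the map the merger builds is not shown in this PR, so `BodyReloc` and `remap_address` are hypothetical. The sketch covers a single input module and assumes each function body is copied verbatim; if fusion re-encodes instruction immediates and changes body sizes, offsets inside a body would need a finer-grained map.

```rust
/// Hypothetical relocation record for one function body of one input module.
#[derive(Clone, Copy, Debug)]
struct BodyReloc {
    /// Start offset of the body in the input code section.
    old_start: u32,
    /// Length of the body in bytes (assumed unchanged by fusion).
    len: u32,
    /// Start offset of the same body in the fused code section.
    new_start: u32,
}

/// Translate an input code-section address into the fused layout.
/// Returns None when the address lies outside every relocated body,
/// in which case the DWARF entry should be dropped rather than guessed.
fn remap_address(relocs: &[BodyReloc], addr: u32) -> Option<u32> {
    relocs
        .iter()
        .find(|r| addr >= r.old_start && addr - r.old_start < r.len)
        .map(|r| r.new_start + (addr - r.old_start))
}
```

A real rewrite would apply this translation to every address in the `.debug_line` programs and to the `DW_AT_low_pc` / `DW_AT_high_pc` / `DW_AT_ranges` values. Exclusive end addresses, which sit one byte past a body, need separate handling, since this lookup treats them as out of range.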

Test plan

  • cargo test --package meld-core --test dwarf_passthrough — 5 tests green.
  • cargo test --package meld-core --test wit_bindgen_runtime — 73 tests green.
  • cargo test --package meld-core --lib — 198 unit tests green.
  • Pre-commit hooks (cargo-fmt, cargo-clippy, cargo-test) all green.
  • Cross-repo verification deferred to Phase 2 (witness end-to-end run on a fused module is the eventual integration oracle).

Refs #130

🤖 Generated with Claude Code

meld currently passes DWARF custom sections through fusion byte-for-byte.
DWARF addresses encode byte offsets into the per-input code section, but
fusion rewrites function bodies into a new merged code section — so every
preserved DWARF address points at the wrong instruction (or out of
range). The pulseengine `witness` MC/DC tool reads those addresses via
gimli, so coverage attribution for fused modules is currently silently
wrong.

This change:
- Adds `meld-core/tests/dwarf_passthrough.rs` pinning the current
  lossy-but-present behaviour as five green tests so any future
  remapping work has a clear oracle to flip.
- Documents the witness DWARF contract and the cross-repo integration
  shape in the test's module docstring (witness intentionally stays
  out of meld-core's dep graph).

No production code changes — defaults are unchanged. Phased plan
(Phase 1.5 explicit policy, Phase 2 DWARF remap, Phase 3 adapter DIEs)
tracked in #130.

Refs #130

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>