examples/llama: CoreML/MPS/QNN export still uses deprecated to_edge() + to_backend() split #19634

@msluszniak

Problem

_to_edge_and_lower_llama_xnnpack already uses to_edge_transform_and_lower(). The generic _to_edge_and_lower_llama, which handles CoreML, MPS, QNN, and Vulkan, still uses the deprecated export_to_edge() + to_backend() split. CoreMLPartitioner emits a deprecation warning about this on every invocation.

For LFM2.5 hybrid models, the split path desynchronises subgraph output-node names from the parent program's buffers_to_mutate map. The short-conv update self.conv_state.copy_(...) decomposes to slice_copy + index_put, and the partitioner records only one of the two as the mutation source. The verifier then raises:

torch._export.verifier.SpecViolationError: Mutation node aten_index_put_default_N is neither a buffer nor a user input.
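To make the failure mode concrete, here is a toy model of the check that trips. It is not the real torch verifier: the function name, the node names, and the assumption that the map holds the slice_copy name rather than the index_put name are all illustrative. It only shows how an output node name that is missing from buffers_to_mutate produces this message.

```python
class SpecViolationError(Exception):
    """Stand-in for torch._export.verifier.SpecViolationError (illustration only)."""


def check_mutation_outputs(output_node_names, buffers_to_mutate, user_inputs_to_mutate=None):
    """Toy check: every mutation output must be a key in one of the two maps."""
    user_inputs_to_mutate = user_inputs_to_mutate or {}
    for name in output_node_names:
        if name in buffers_to_mutate or name in user_inputs_to_mutate:
            continue
        raise SpecViolationError(
            f"Mutation node {name} is neither a buffer nor a user input."
        )


# Hypothetical signature: the map was recorded against the slice_copy node.
buffers_to_mutate = {"aten_slice_copy_tensor_0": "conv_state"}

# In sync: the graph output carries the same name as the map key, so the check passes.
check_mutation_outputs(["aten_slice_copy_tensor_0"], buffers_to_mutate)

# Out of sync: the graph output is the index_put node, which the map never recorded.
try:
    check_mutation_outputs(["aten_index_put_default_0"], buffers_to_mutate)
except SpecViolationError as err:
    message = str(err)
```

In this model the only fix is to keep the output node names and the map keys in step, which is what a single lowering call would be expected to do.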

Reproduce

git clone https://github.com/pytorch/executorch && cd executorch
./install_executorch.sh
source .venv/bin/activate
pip install coremltools

cat > examples/models/lfm2/config/lfm2_coreml.yaml <<'EOF'
base:
  metadata: '{"get_bos_id": 1, "get_eos_ids":[7]}'
model:
  use_kv_cache: True
  enable_dynamic_shape: False
  dtype_override: fp32
backend:
  coreml:
    enabled: True
    ios: 18
    enable_state: True
    preserve_sdpa: True
    compute_units: cpu_and_ne
EOF

python -m extension.llm.export.export_llm \
  --config examples/models/lfm2/config/lfm2_coreml.yaml \
  +base.model_class=lfm2_5_1_2b \
  +base.params=examples/models/lfm2/config/lfm2_5_1_2b_config.json \
  +export.max_seq_length=2048 \
  +export.max_context_length=2048 \
  +export.output_name=lfm2_coreml.pte

Suggested fix

Add a CoreML helper analogous to _to_edge_and_lower_llama_xnnpack, or short-circuit _to_edge_and_lower_llama when coreml=True:

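if coreml:
    # Build the CoreML partitioner from the existing export options.
    coreml_partitioner = get_coreml_partitioner(
        coreml_ios, embedding_quantize, pt2e_quantize,
        coreml_quantize, coreml_compute_units,
    )
    # Quantize, then convert to edge and lower in one call,
    # replacing the export_to_edge() + to_backend() split.
    builder = builder_exported.pt2e_quantize(quantizers).to_edge_transform_and_lower(
        [coreml_partitioner]
    )
    return builder.to_executorch(passes=additional_passes)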

The same migration likely applies to the MPS, QNN, and Vulkan branches; only CoreML has been exercised here.
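The first option, a dedicated helper, could look roughly like the sketch below. The helper name and its parameter list are hypothetical, and RecordingBuilder is a stand-in that only records method calls so the sketch runs without ExecuTorch installed. The three builder methods are the ones already used in the snippet above.

```python
def to_edge_and_lower_llama_coreml(builder_exported, quantizers, coreml_partitioner, additional_passes):
    """Hypothetical helper: lower via the single-call path, skipping the split path."""
    builder = builder_exported.pt2e_quantize(quantizers).to_edge_transform_and_lower(
        [coreml_partitioner]
    )
    return builder.to_executorch(passes=additional_passes)


class RecordingBuilder:
    """Stand-in for the real builder; it only records which methods were called."""

    def __init__(self):
        self.calls = []
        self.partitioners = None

    def pt2e_quantize(self, quantizers):
        self.calls.append("pt2e_quantize")
        return self

    def to_edge_transform_and_lower(self, partitioners):
        self.calls.append("to_edge_transform_and_lower")
        self.partitioners = list(partitioners)
        return self

    def to_executorch(self, passes=None):
        self.calls.append("to_executorch")
        return self


# Drive the helper with the stand-in; "coreml" is a placeholder for a real partitioner.
fake = RecordingBuilder()
result = to_edge_and_lower_llama_coreml(
    fake, quantizers=[], coreml_partitioner="coreml", additional_passes=[]
)
```

After the call, fake.calls lists the three methods in order, with no export_to_edge() or to_backend() step in between. A real helper would also need to carry over whatever the generic function does for CoreML beyond partitioning, which this sketch does not attempt.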
