backends/coreml: LFM2.5 1.2B PTE compiles to mlmodelc but ANE init fails (ANECCompile FAILED)


### Problem

After applying the workaround in #19634, an LFM2.5 1.2B CoreML PTE loads via `executorch.runtime.Runtime.load_program(...)` and all metadata methods (`get_eos_ids`, `get_max_seq_len`, `use_kv_cache`, …) succeed. `prog.load_method("forward")` then fails:

```
[ETCoreMLModelManager.mm:495] Successfully got compiled model ...
[ETCoreMLModelAnalyzer.mm:68] [Core ML] Failed to create model profiler.
    Failed to build the model execution plan using a model architecture file '.../model.mil'
[coreml_backend_delegate.mm:324] CoreMLBackend: Failed to init the model.
[method.cpp:114] Init failed for backend CoreMLBackend: 0x23
E5RT encountered an STL exception. msg = MILCompilerForANE error:
    failed to compile ANE model using ANEF. Error=_ANECompiler : ANECCompile() FAILED.
```

MIL → `mlmodelc` compilation succeeds; the ANE-specific execution-plan build fails. The failure reproduces via `executorch.runtime` on macOS and on an iPhone 17 Pro running iOS 26.4.2, where it surfaces in `react-native-executorch` as `code: 35 "Failed to load LLM runner"` (35 is 0x23 in decimal, the same code as the backend init error above). That points to a CoreML/ANE-side issue rather than a runtime one. `compute_units: cpu_only` and `cpu_and_gpu` both succeed, but XNNPACK already covers the CPU case at higher throughput, so the value of CoreML for this model is the ANE.
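For anyone who wants to repeat the compute-unit comparison, here is a minimal sketch that derives one config per `compute_units` value from the YAML in the Reproduce section. The helper names (`make_variant`, `write_variants`) are hypothetical and not part of ExecuTorch; the script only rewrites the `compute_units:` line as text, and each variant still has to be exported with the `export_llm` command shown under Reproduce.

```python
from pathlib import Path
from typing import List

# compute_units values mentioned in this issue.
COMPUTE_UNITS = ["cpu_only", "cpu_and_gpu", "cpu_and_ne"]


def make_variant(base_yaml: str, compute_units: str) -> str:
    """Return base_yaml with its compute_units line set to the given value."""
    out = []
    replaced = False
    for line in base_yaml.splitlines():
        if line.strip().startswith("compute_units:"):
            # Preserve the original indentation so the YAML nesting is unchanged.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(f"{indent}compute_units: {compute_units}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        raise ValueError("base config has no compute_units line")
    return "\n".join(out) + "\n"


def write_variants(base_path: str, out_dir: str) -> List[str]:
    """Write one config per compute unit next to each other; return the paths."""
    base = Path(base_path).read_text()
    stem = Path(base_path).stem
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for unit in COMPUTE_UNITS:
        path = target / f"{stem}_{unit}.yaml"
        path.write_text(make_variant(base, unit))
        paths.append(str(path))
    return paths
```

Calling `write_variants("examples/models/lfm2/config/lfm2_coreml_4w.yaml", "variants")` would produce three files that differ only in `compute_units`, which keeps the comparison limited to that one setting.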

Reproduces with **two different weight configurations**, which rules out a quantiser-specific cause:

1. unquantised fp16 (no `quantization:` block)
2. weight-only 4-bit via the documented torchao `quantize_` path (`qmode: 4w`, see `docs/source/backends/coreml/coreml-quantization.md`)

The likely failure point is therefore the lowering of LFM2's short-conv `conv_state` mutation (`self.conv_state.copy_(new_state)` in `examples/models/lfm2/short_conv.py`, which decomposes to `slice_copy + index_put`) to an ANE-compatible MIL representation. The same model graph works on `cpu_and_gpu`.

### Reproduce

```bash
# Apply workaround from #19634 first.

cat > examples/models/lfm2/config/lfm2_coreml_4w.yaml <<'EOF'
base:
  metadata: '{"get_bos_id": 1, "get_eos_ids":[7]}'
model:
  use_kv_cache: True
  enable_dynamic_shape: False
  dtype_override: fp32
quantization:
  qmode: 4w
  group_size: 32
backend:
  coreml:
    enabled: True
    ios: 18
    enable_state: True
    preserve_sdpa: True
    compute_units: cpu_and_ne
EOF

python -m extension.llm.export.export_llm \
  --config examples/models/lfm2/config/lfm2_coreml_4w.yaml \
  +base.model_class=lfm2_5_1_2b \
  +base.params=examples/models/lfm2/config/lfm2_5_1_2b_config.json \
  +export.max_seq_length=2048 \
  +export.max_context_length=2048 \
  +export.output_name=lfm2_coreml_4w.pte

python -c "
from executorch.runtime import Runtime
prog = Runtime.get().load_program('lfm2_coreml_4w.pte')
prog.load_method('get_eos_ids')   # OK
prog.load_method('forward')       # fails with backend init 0x23 + ANECCompile FAILED
"
```
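To get a per-method report instead of stopping at the first failure, the last step can be wrapped in a small probe. This is a sketch, not part of ExecuTorch: `probe_methods` and `probe_pte` are hypothetical helpers, the method names are the ones listed in this issue, and it assumes a failed `load_method` either raises or returns `None`.

```python
from typing import Callable, Dict, Iterable

# Method names taken from this issue.
METHODS = ("get_eos_ids", "get_max_seq_len", "use_kv_cache", "forward")


def probe_methods(
    load_method: Callable[[str], object], names: Iterable[str]
) -> Dict[str, str]:
    """Try to load each method and record 'ok' or the failure message."""
    report = {}
    for name in names:
        try:
            method = load_method(name)
        except Exception as exc:  # backend init errors are expected to land here
            report[name] = f"failed: {exc}"
        else:
            report[name] = "ok" if method is not None else "not loaded"
    return report


def probe_pte(pte_path: str, names: Iterable[str] = METHODS) -> Dict[str, str]:
    """Load a PTE with executorch.runtime and probe the given methods."""
    # Imported lazily so probe_methods stays usable without ExecuTorch installed.
    from executorch.runtime import Runtime

    program = Runtime.get().load_program(pte_path)
    return probe_methods(program.load_method, names)
```

With the PTE from the export step, `print(probe_pte("lfm2_coreml_4w.pte"))` should report the metadata methods as `ok` and `forward` as failed, if the behaviour matches what is described above.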

### Asks

1. Is the short-conv `.copy_(...)` mutation pattern expected to lower to ANE-compatible MIL? If not, what's the recommended rewrite of `examples/models/lfm2/short_conv.py` to produce an ANE-friendly graph?
2. Is there a documented way to identify, before compilation, which ops in a model will block ANE compilation?


cc @kimishpatel @YifanShenSZ @cymbalrush @metascroy
