Arm backend: Add Qwen3-VL_2B_IT FP32 layer tests #19628
Signed-off-by: Tom Allsop <tom.allsop@arm.com>
Change-Id: I62d3848e0a6546e21d508b4ed565c2403b63f72d
Co-authored-by: Baris Demir <baris.demir@arm.com>
Pull request overview
Adds per-layer FP32 export/lowering tests for the Qwen3-VL 2B Instruct model on the Arm TOSA-FP and VGF (no-quant) pipelines, including a checkpoint-shaped config helper and a directory of test modules covering both the vision and text sub-stacks.
Changes:
- New `qwen3_vl_test_config.py` builds a Qwen3-VL 2B Instruct-like `Qwen3VLConfig` (text + vision) for tests.
- New `test_qwen3_vl_layers.py` defines wrapper `nn.Module`s for each vision/text sub-layer and parametrizes them through `TosaPipelineFP` and `VgfPipeline`.
- `MODELS.md` lists Qwen3-VL among supported models.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| backends/arm/test/models/Qwen3_VL/qwen3_vl_test_config.py | Factory for a Qwen3-VL 2B Instruct test config. |
| backends/arm/test/models/Qwen3_VL/test_qwen3_vl_layers.py | Layer-level wrappers + TOSA-FP and VGF (no-quant) test parametrizations. |
| backends/arm/MODELS.md | Adds Qwen3-VL to supported models list. |
Signed-off-by: Tom Allsop <tom.allsop@arm.com>
Change-Id: I397e651fa8ac7dd48cb8deb595bf52e306a3f469
zingo left a comment
OK to merge if/when CI is OK.
I do not know for sure, but I assume the MyPy errors need to be handled.
Yes, I can fix the MyPy errors. Also, adding the Qwen model seems to cause the runner to run out of memory. Either the model is too big or we have a memory leak; I'll take a look.
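For a rough sense of scale on the "model is too big" hypothesis, the sketch below estimates the FP32 weight footprint of a model with about two billion parameters. The parameter count is an assumption read off the "2B" in the model name, not a measured value, and the estimate ignores activations, export-time copies, and anything else the runner holds in memory.

```python
# Back-of-the-envelope FP32 weight footprint.
# Assumption: ~2e9 parameters, inferred from the "2B" in the model name.
BYTES_PER_FP32 = 4


def fp32_weight_bytes(num_params: int) -> int:
    """Return the number of bytes needed to store num_params FP32 weights."""
    return num_params * BYTES_PER_FP32


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


approx_params = 2_000_000_000  # hypothetical round number, not measured
weights_gib = to_gib(fp32_weight_bytes(approx_params))
print(f"~{weights_gib:.2f} GiB for weights alone")
```

If a test instantiates the full model, weights alone could take several GiB under this assumption, so a runner with limited memory might fail even without a leak. This does not rule out a leak; it only shows that size by itself is a plausible cause.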
* Remove very large test that is running out of memory
Signed-off-by: Tom Allsop <tom.allsop@arm.com>
Change-Id: I15d9802e30445230427441e7623747f85f1824a3
    def __init__(self, config, max_hw: int) -> None:
        super().__init__()
        head_dim = config.vision_config.hidden_size // config.vision_config.num_heads
        self.rotary = Qwen3VLVisionRotaryEmbedding(head_dim // 2)
        self.max_hw = max_hw

    def forward(self, grid_thw: torch.Tensor) -> torch.Tensor:
        rotary = self.rotary(self.max_hw)
        return rotary

    @classmethod
    def prepare_model_and_inputs(cls):
        config = _make_qwen3_vl_2b_instruct_layer_config()
        grid_thw = _make_image_grid_thw(torch.device("cpu"))
        max_hw = int(grid_thw[:, 1:].max().item())
        model = cls(config, max_hw).eval()
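The derived quantities in the snippet above can be traced with plain Python. This is a minimal sketch using hypothetical values: the `hidden_size`, `num_heads`, and grid numbers below are placeholders chosen for illustration, not the values produced by `_make_qwen3_vl_2b_instruct_layer_config()` or `_make_image_grid_thw()`.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for illustration only; the real values come from
# the test helpers in the PR.
vision_config = SimpleNamespace(hidden_size=1024, num_heads=16)
grid_thw = [[1, 16, 24]]  # one row per image: (t, h, w)

# Per-head width of the vision attention, as computed in __init__.
head_dim = vision_config.hidden_size // vision_config.num_heads

# The rotary embedding is constructed with half the head dimension.
rotary_dim = head_dim // 2

# Plain-Python equivalent of int(grid_thw[:, 1:].max().item()):
# drop the t column and take the largest remaining h or w value.
max_hw = max(value for row in grid_thw for value in row[1:])

print(head_dim, rotary_dim, max_hw)
```

With these placeholder numbers, `max_hw` is the larger of the height and width entries across all rows, which is the size fixed at construction time and used by `forward` instead of its `grid_thw` argument.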
Change-Id: I62d3848e0a6546e21d508b4ed565c2403b63f72d
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani