hotplug: validate CPU hotplug using runtime-discovered topology #463
Merged
Add reusable CPU hotplug helpers to functestlib.sh for CPU online control handling, online mask checks, CPU state reads/writes, retry-based offline handling, dmesg evidence collection, topology logging, and best-effort cleanup of CPUs that were offlined during a test.

The helpers are intentionally limited to reusable hotplug mechanics. They reuse existing functestlib.sh infrastructure such as check_dependencies(), check_kernel_config(), logging helpers, and the generic CPU helper APIs instead of duplicating dependency, kernel config, or local counting logic.

This allows CPU hotplug and related CPU topology tests to scale across SoCs with different CPU counts, capacities, clusters, and topology layouts without hardcoding board-specific assumptions.

Signed-off-by: Srikanth Muppandam <smuppand@qti.qualcomm.com>
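As a rough illustration of the retry-based offline handling described in this commit, the sketch below retries a CPU state write a bounded number of times. The function names `retry_cmd` and `write_cpu_state`, their arguments, and the optional sysfs root parameter are hypothetical and chosen for this example; they are not the helpers added to functestlib.sh.

```sh
#!/bin/sh
# Hypothetical helpers for illustration only; not the functestlib.sh API.

# retry_cmd <attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or the attempts are exhausted.
retry_cmd() {
    attempts=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        try=$((try + 1))
        # Only sleep when another attempt will follow.
        [ "$try" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# write_cpu_state <cpu> <0|1> [sysfs_cpu_root]
# Writes the requested state to cpuN/online and reads it back to confirm.
# The root argument is an assumption that makes the helper testable
# against a scratch directory instead of the real sysfs tree.
write_cpu_state() {
    root=${3:-/sys/devices/system/cpu}
    ctl="$root/cpu$1/online"
    [ -w "$ctl" ] || return 1
    echo "$2" > "$ctl" 2>/dev/null || return 1
    [ "$(cat "$ctl" 2>/dev/null)" = "$2" ]
}
```

A caller could then use something like `retry_cmd 3 1 write_cpu_state 2 0` to make up to three attempts at offlining cpu2, treating a non-zero return as a persistent failure.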
Rework the hotplug test to dynamically discover online CPUs at runtime instead of hardcoding a fixed cpu0-cpu7 range.

The test now:
- discovers online CPUs from sysfs
- logs CPU topology, capacity, cluster, and affinity details
- uses check_dependencies() for required userspace tools
- uses check_kernel_config() for CONFIG_HOTPLUG_CPU validation
- validates only CPUs with writable hotplug control
- checks CPU schedulability before and after hotplug
- retries transient offline failures before declaring failure
- treats persistent EBUSY as a CI failure for hotplug-controllable CPUs
- verifies offline state through cpuX/online, online mask, and taskset
- restores any offlined CPU through cleanup on failure or interruption

This makes the test scalable across different Qualcomm SoCs and CPU cluster layouts while still catching real hotplug regressions such as persistent EBUSY, failed online recovery, stale online masks, or CPUs remaining schedulable while reported offline.

Signed-off-by: Srikanth Muppandam <smuppand@qti.qualcomm.com>
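The offline verification step in the list above cross-checks several sources. A minimal sketch of that idea follows, assuming the online mask uses the usual sysfs list format such as `0-3,5`. All function names here are hypothetical, and the optional sysfs root parameter exists only so the logic can be exercised against a scratch directory.

```sh
#!/bin/sh
# Hypothetical helpers for illustration only; not the test's actual code.

# expand_cpu_list "0-3,5" prints "0 1 2 3 5".
expand_cpu_list() {
    out=""
    old_ifs=$IFS
    IFS=,
    for range in $1; do
        start=${range%-*}   # "0-3" -> 0, "5" -> 5
        end=${range#*-}     # "0-3" -> 3, "5" -> 5
        i=$start
        while [ "$i" -le "$end" ]; do
            out="${out:+$out }$i"
            i=$((i + 1))
        done
    done
    IFS=$old_ifs
    echo "$out"
}

# cpu_in_list <cpu> <list> succeeds if the CPU appears in the list.
cpu_in_list() {
    for c in $(expand_cpu_list "$2"); do
        [ "$c" = "$1" ] && return 0
    done
    return 1
}

# cpu_reported_offline <cpu> [sysfs_cpu_root]
# Succeeds only if cpuN/online reads 0 AND the CPU is absent from the
# global online mask, which would catch a stale mask.
cpu_reported_offline() {
    root=${2:-/sys/devices/system/cpu}
    [ "$(cat "$root/cpu$1/online" 2>/dev/null)" = "0" ] || return 1
    ! cpu_in_list "$1" "$(cat "$root/online" 2>/dev/null)"
}

# cpu_is_schedulable <cpu>
# Tries to run a trivial command pinned to the CPU with taskset; this is
# expected to fail when the CPU is really offline.
cpu_is_schedulable() {
    taskset -c "$1" true >/dev/null 2>&1
}
```

Under this sketch, a CPU for which `cpu_reported_offline` succeeds while `cpu_is_schedulable` also succeeds would correspond to the "schedulable while reported offline" regression named in the commit message.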
Rework the CPU hotplug validation to scale across Qualcomm platforms with different CPU counts, capacities, and cluster layouts.
The previous test used a fixed CPU range and attempted to offline CPUs directly. This was not scalable across SoCs and could produce weak or misleading failures when CPU topology differed from the assumed layout.
This PR updates the test to discover CPU topology dynamically at runtime and validate only CPUs that are actually exposed as hotplug-controllable through sysfs.
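One way to picture "hotplug-controllable through sysfs" is a CPU directory that exposes a writable `online` file. The sketch below lists such CPUs; the function name `list_hotplug_cpus` and its optional root argument are assumptions made for this example and are not taken from the PR. On some platforms a CPU such as the boot CPU may expose no `online` file at all, in which case it is simply skipped.

```sh
#!/bin/sh
# Hypothetical helper for illustration only; not the PR's implementation.

# list_hotplug_cpus [sysfs_cpu_root]
# Prints, one per line in numeric order, the CPU numbers whose
# cpuN/online control file exists and is writable.
list_hotplug_cpus() {
    root=${1:-/sys/devices/system/cpu}
    # cpu[0-9]* matches cpu0, cpu10, ... but not cpufreq or cpuidle.
    for dir in "$root"/cpu[0-9]*; do
        [ -w "$dir/online" ] && echo "${dir##*/cpu}"
    done | sort -n
}
```

Iterating over this output instead of a fixed cpu0-cpu7 range keeps the test independent of the CPU count on a given SoC.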
Please refer to this LAVA job for the results with these patches: https://lava.infra.foundries.io/scheduler/job/236862