
Check for null host before proceeding with VM volume operations in managed storage while restoring VM#12879

Open
sureshanaparti wants to merge 1 commit into apache:4.22 from shapeblue:fix-npe-handle-managed-storage

@sureshanaparti sureshanaparti commented Mar 24, 2026

Description

This PR adds a null check on the host before proceeding with VM volume operations in managed storage while restoring a VM.

During VM restore, if the VM's last host ID refers to a host that has been deleted, the host lookup returns null. The VM then ends up with an additional ROOT volume in the Allocated state, and a later re-image operation fails with this validation error:

InvalidParameterValueException ex = new InvalidParameterValueException("There are " + rootVols.size() + " root volumes for VM " + vm.getUuid());

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

codecov bot commented Mar 24, 2026

Codecov Report

❌ Patch coverage is 5.40541% with 35 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.61%. Comparing base (bce5594) to head (2d51a71).
⚠️ Report is 1 commits behind head on 4.22.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../src/main/java/com/cloud/vm/UserVmManagerImpl.java | 5.40% | 32 Missing and 3 partials ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               4.22   #12879      +/-   ##
============================================
- Coverage     17.61%   17.61%   -0.01%     
  Complexity    15665    15665              
============================================
  Files          5917     5917              
  Lines        531461   531464       +3     
  Branches      64977    64978       +1     
============================================
- Hits          93608    93607       -1     
- Misses       427295   427298       +3     
- Partials      10558    10559       +1     

| Flag | Coverage Δ |
|---|---|
| uitests | 3.70% <ø> (ø) |
| unittests | 18.68% <5.40%> (-0.01%) ⬇️ |



Copilot AI left a comment


Pull request overview

Fixes a restore-VM failure mode in managed storage where vm.getLastHostId() points to a deleted host, causing host-dependent volume operations to NPE and leaving an extra ROOT volume behind.

Changes:

  • Add a null-host guard in managed-storage handling during VM restore to safely skip host-dependent operations when the host record no longer exists.
  • Minor formatting-only updates (whitespace/brace style) across UserVmManagerImpl.


Comment on lines +1211 to 1212
} catch (Exception ex) {
    throw new CloudRuntimeException("Router start failed due to" + ex);

Copilot AI Mar 24, 2026


The exception message concatenation is missing a space ("due to" + ex) and the thrown CloudRuntimeException drops the original cause/stack trace. Consider including a separating space and passing the caught exception as the cause so callers/logs preserve details.
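
A minimal sketch of that suggestion follows. It is not the PR's code: `SketchRuntimeException` is a hypothetical stand-in for `CloudRuntimeException` (assumed to offer a message-plus-cause constructor), and `startRouter()` is a hypothetical failing call used only to trigger the catch block.

```java
public class ExceptionChainingSketch {
    // Hypothetical stand-in for CloudRuntimeException so the sketch compiles on its own.
    static class SketchRuntimeException extends RuntimeException {
        SketchRuntimeException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical operation that always fails, standing in for the router start call.
    static void startRouter() throws Exception {
        throw new IllegalStateException("no capacity");
    }

    static void startRouterOrFail() {
        try {
            startRouter();
        } catch (Exception ex) {
            // Note the trailing space after "due to", and the caught exception
            // passed as the cause so its stack trace is preserved.
            throw new SketchRuntimeException("Router start failed due to " + ex.getMessage(), ex);
        }
    }

    public static void main(String[] args) {
        try {
            startRouterOrFail();
        } catch (SketchRuntimeException e) {
            System.out.println(e.getMessage());
            System.out.println(e.getCause().getClass().getSimpleName());
        }
    }
}
```

With the cause attached, a logger that prints the stack trace would also show the original `IllegalStateException` under "Caused by", which the string-concatenation form loses.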

// default findById() won't search entries with removed field not null
Host host = _hostDao.findById(hostId);
if (host == null) {
    logger.warn("Host {} not found", hostId);

Copilot AI Mar 24, 2026


This new log line only prints the hostId; when debugging restore/re-image issues it would be helpful to include the VM and root volume identifiers (e.g., vm UUID/id and root volume UUID/id) in the warning so operators can correlate the skip with the affected resources.

Suggested change
- logger.warn("Host {} not found", hostId);
+ logger.warn("Host {} not found for vm id: {}, uuid: {}, root volume id: {}, uuid: {}", hostId, vm.getId(), vm.getUuid(), root.getId(), root.getUuid());

Comment on lines 9221 to +9227
if (hostId != null) {
    VolumeInfo volumeInfo = volFactory.getVolume(root.getId());
    // default findById() won't search entries with removed field not null
    Host host = _hostDao.findById(hostId);
    if (host == null) {
        logger.warn("Host {} not found", hostId);
        return;
    }

Copilot AI Mar 24, 2026


There are unit tests for restoreVirtualMachine() in server/src/test/java/com/cloud/vm/UserVmManagerImplTest.java, but none appear to cover the managed-storage restore path where vm.getLastHostId() is set and _hostDao.findById(hostId) returns null (deleted host). Adding a test for this scenario would help prevent regressions (e.g., ensure restore proceeds without leaving an extra ROOT volume in Allocated state).
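
As a rough illustration of the scenario such a test would exercise, here is a self-contained sketch that does not use the project's test harness. Everything in it is hypothetical: `HostLookup` stands in for the host DAO, `ManagedStorageHandler` stands in for the managed-storage step of restore, and the recorded "operation" is a placeholder for whatever host-dependent work the real method performs. It only checks that a missing host is skipped cleanly; it does not model ROOT volume state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullHostGuardSketch {
    // Hypothetical stand-in for the host DAO: returns null when the host no longer exists.
    interface HostLookup {
        String findById(long hostId);
    }

    // Hypothetical stand-in for the managed-storage step of VM restore.
    static class ManagedStorageHandler {
        private final HostLookup hostLookup;
        // Records host-dependent operations so a test can assert on them.
        final List<String> operations = new ArrayList<>();

        ManagedStorageHandler(HostLookup hostLookup) {
            this.hostLookup = hostLookup;
        }

        // Returns true if host-dependent work ran, false if it was skipped.
        boolean handleManagedStorage(Long lastHostId, long rootVolumeId) {
            if (lastHostId == null) {
                return false;
            }
            String host = hostLookup.findById(lastHostId);
            if (host == null) {
                // Same shape as the guard in the PR: skip instead of dereferencing a null host.
                return false;
            }
            operations.add("host-dependent operation for volume " + rootVolumeId + " on " + host);
            return true;
        }
    }

    public static void main(String[] args) {
        Map<Long, String> hosts = new HashMap<>();
        hosts.put(1L, "host-1");
        ManagedStorageHandler handler = new ManagedStorageHandler(id -> hosts.get(id));

        System.out.println(handler.handleManagedStorage(1L, 10L)); // existing host
        System.out.println(handler.handleManagedStorage(2L, 10L)); // deleted host
    }
}
```

In the real test class the same idea would presumably be expressed by stubbing the DAO lookup to return null and verifying that no host-dependent calls are made; the exact mocks and verifications depend on the existing test setup.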

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17228
