
HDDS-14888. Improve DiskCheckUtil.checkReadWrite to tolerate disk full #9972

Open
ChenSammi wants to merge 1 commit into apache:master from ChenSammi:HDDS-14888

Conversation

@ChenSammi
Contributor

What changes were proposed in this pull request?

DiskCheckUtil.checkReadWrite writes a file and reads it back to verify that the disk is writable and readable.
If the disk is full, it catches the exception and returns false.

The outer StorageVolume#check then decides whether to mark this volume check as failed, depending on whether the volume has more than 100B * 2 of free space.
Because the available disk space changes dynamically, there is a chance that checkReadWrite() fails because the disk is full, yet by the time the logic below is evaluated the volume has enough available space again:
if (!diskChecksPassed && getCurrentUsage().getAvailable() < minimumDiskSpace) {

The goal of this change is to analyze the exception in checkReadWrite() directly, so that the disk-full case does not return false.
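As a rough illustration of the idea, here is a minimal sketch of classifying an exception as "disk full". It is not the code in this patch: the class `DiskFullClassifier`, the method `isDiskFull`, and the constant `NO_SPACE_MESSAGE` are hypothetical names, and the sketch assumes the OS reports the condition with the text "No space left on device", as in the trace captured during testing.

```java
import java.io.IOException;
import java.nio.file.FileSystemException;

// Hypothetical helper, not part of the patch.
final class DiskFullClassifier {
  // Assumption: the message text seen in the local disk-full test.
  // Other platforms may word this differently.
  static final String NO_SPACE_MESSAGE = "No space left on device";

  private DiskFullClassifier() { }

  // Returns true if the exception, or any of its causes, looks like a
  // disk-full error based on its reason or message text.
  static boolean isDiskFull(Throwable t) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (!(cur instanceof IOException)) {
        continue;
      }
      // FileSystemException exposes the OS reason separately from the path.
      if (cur instanceof FileSystemException
          && containsNoSpace(((FileSystemException) cur).getReason())) {
        return true;
      }
      if (containsNoSpace(cur.getMessage())) {
        return true;
      }
    }
    return false;
  }

  private static boolean containsNoSpace(String text) {
    return text != null && text.contains(NO_SPACE_MESSAGE);
  }
}
```

With a helper like this, a read/write check could treat a disk-full exception as "disk is healthy but full" instead of returning false. Matching on message text is fragile, so this is only a sketch of one possible approach.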

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14888

How was this patch tested?

A new unit test.

When tested against a full local disk volume, the exception is a FileSystemException for both the FileChannel open and the output stream write.


java.nio.file.FileSystemException: /Volumes/DiskFullTest/disk-check-ae1d2075-200a-409f-aeb4-655471324a4d: No space left on device
  at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
  at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
  at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
  at java.base/sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:182)
  at java.base/java.nio.channels.FileChannel.open(FileChannel.java:292)
  at java.base/java.nio.channels.FileChannel.open(FileChannel.java:345)
  at org.apache.ratis.util.FileUtils.lambda$newFileChannel$6(FileUtils.java:178)
  at org.apache.ratis.util.LogUtils.supplyAndLog(LogUtils.java:58)
  at org.apache.ratis.util.FileUtils.newFileChannel(FileUtils.java:177)
  at org.apache.ratis.util.FileUtils.newOutputStreamForceAtClose(FileUtils.java:165)
  at org.apache.ratis.util.FileUtils.newOutputStreamForceAtClose(FileUtils.java:169)


@sumitagrawl left a comment


@ChenSammi Thanks for working on this. Please check the case where the exception message is OS-dependent.

" interrupted.");
}

// As WRITE keeps happening there is probability, disk has become full during above check.

The check below is just a safeguard, so maybe we can keep it. We are depending on the exception message, which may differ between operating systems.

Contributor Author


Yeah, we can keep this with a revised comment, in case Ozone runs on a less common platform that doesn't throw the expected message.
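To make the discussed combination concrete, here is a minimal sketch of checking the exception message first and falling back to the free-space safeguard. All names (`DiskFullSafeguard`, `tolerateFailure`, `messageIndicatesDiskFull`) are hypothetical, and the sketch assumes that a failed check with less than `minimumDiskSpace` available is treated as "disk full" rather than as a volume failure. It is not the actual StorageVolume#check code.

```java
import java.io.IOException;

// Hypothetical helper illustrating the review discussion, not the patch itself.
final class DiskFullSafeguard {
  // Assumption: message text observed in the local disk-full test.
  private static final String NO_SPACE_MESSAGE = "No space left on device";

  private DiskFullSafeguard() { }

  static boolean messageIndicatesDiskFull(IOException e) {
    String msg = e == null ? null : e.getMessage();
    return msg != null && msg.contains(NO_SPACE_MESSAGE);
  }

  // Decides whether a failed read/write check should be tolerated as disk full.
  static boolean tolerateFailure(IOException e, long availableBytes,
      long minimumDiskSpace) {
    // Primary signal: the exception itself says the disk is full.
    if (messageIndicatesDiskFull(e)) {
      return true;
    }
    // Safeguard for platforms whose message differs: fall back to the
    // free-space comparison, which can race with concurrent writes.
    return availableBytes < minimumDiskSpace;
  }
}
```

The message check avoids the race on available space, while the fallback covers platforms where the message text differs.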

@github-actions

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

@github-actions github-actions Bot added the stale label Apr 24, 2026