Skip to content

code fixes and improvements#210

Merged
tis24dev merged 27 commits into
mainfrom
dev
May 12, 2026
Merged

code fixes and improvements#210
tis24dev merged 27 commits into
mainfrom
dev

Conversation

@tis24dev
Copy link
Copy Markdown
Owner

@tis24dev tis24dev commented May 11, 2026

  • Refactor directory recreation into modular components
  • Pin GitHub Actions and tidy Codacy docs
  • Refactor PBS mount guards and network UI flows
  • Fix nil candidate panic in raw artifact staging
  • Remove unused internal helpers
  • Fix ineffectual assignments
  • Fix Proxmox Backup zombie process filtering
  • fix: resolve golangci-lint staticcheck findings
  • Update go.mod
  • Fix restore, collector, and temp-file hardening issues
  • Use restore bundle planning helper
  • Skip staged apply after staging errors
  • Guard SAFE apply on completed export
  • Avoid deprecated tar TypeRegA
  • Clean up partial lock files & Make hardware probes best-effort
  • Propagate cancellation from best-effort probes
  • Defer pipe cleanup in encryption test
  • Update directory_recreation_pbs_config.go
  • Create race.yml
  • Update race.yml
  • Fix orchestrator race-test timeouts
  • Fix non-critical pvesh timeout handling and pin CI actions
  • fix: return worker context error in PBS datastore collector
  • fix: require Go 1.25.10 for stdlib security fixes

Summary by CodeRabbit

  • New Features

    • Added smart filesystem merge during restore with user confirmation.
    • Added TFA/WebAuthn category recommendations during restore.
    • Added network mount guards for PBS datastore protection.
    • Enhanced restore workflow with improved UI-driven orchestration.
  • Bug Fixes & Improvements

    • Improved resource cleanup and error propagation across file I/O operations.
    • Enhanced error handling for network configuration application and rollback.
    • Better validation and recovery logic for restore operations.
  • Infrastructure

    • Pinned GitHub Actions versions to specific commits for security.
    • Added Go race detector workflow.
    • Updated Go toolchain to 1.25.10.

Review Change Stack

tis24dev added 24 commits May 6, 2026 22:10
Split the monolithic directory recreation logic into focused, testable components. Parsing of PVE/PBS config files was moved to directory_recreation_config.go (load/parse helpers and normalization for datastore.cfg). PBS-specific preflight/initialization logic, subdir creation and lock handling were extracted into directory_recreation_pbs*.go files, and PVE-specific scaffolding moved to a dedicated pve file. Ownership and permission handling was isolated in directory_recreation_ownership.go, while path/device utilities and ZFS/root-fs safety checks were factored into directory_recreation_paths.go. The main RecreateStorageDirectories/RecreateDatastoreDirectories functions were simplified to use the new loaders/creators; tests updated accordingly.
Pin CI actions to specific commit SHAs for reproducible workflows and security: codecov action pinned to a v6 commit, anchore/sbom-action (syft) and goreleaser action pinned to specific commits. Also normalize list formatting/whitespace in Codacy MCP instructions for clarity; no functional changes to behavior.
Move and refactor PBS mount guard logic into a dedicated apply implementation, splitting responsibilities and adding helper functions for validation, directory setup and bind-readonly guard mounting. Simplify and harden proc path unescaping with small utility helpers. Extract and reorganize network apply UI flow into request/flow types and separate prompt/rollback files, update tests to use test tool helpers and a fake command runner, and add multiple restore/workflow UI stubs. Also add package comments and small housekeeping changes (embed.go, imports). This improves separation of concerns, testability and readability of mount guard and network apply codepaths.
Validate nil backup candidates before starting raw artifact staging trace logging.

This prevents copyRawArtifactsToWorkdirWithLogger from dereferencing a nil candidate before returning its intended error, and adds a regression test covering the nil candidate case.
Remove stale internal helper functions left behind by earlier refactors.

This cleans up unused environment detection, container detail formatting, symlink overlay, and restore abort helper code. Also removes the now-unused input import and verifies the cleanup with unused lint and the full Go test suite.
Remove ineffectual assignments reported by golangci-lint.

This simplifies config upgrade insertion logic, cleans up mount guard request normalization, and keeps executable permission checks while discarding unused refreshed file info. Verified with ineffassign lint and the full Go test suite.
Apply the Proxmox Backup zombie-process filter before suspicious process matching.

Previously the filter ran at the end of the process scan loop, so its continue had no practical effect. This moves the check ahead of suspicious signature detection and adds regression coverage for skipped zombie processes and warned non-zombie matches.
Clean up staticcheck findings across runtime and tests, including deprecated tar usage, error string formatting, fmt.Fprintf conversions, De Morgan simplifications, and embedded selector cleanup.

Strengthen fragile tests around nil contexts, NIC repair results, identity detection, and security checks. Make PVE cluster detection tests deterministic by avoiding dependence on the host pvecm command.
- Make nonessential system inventory probes best-effort
- Harden restore staging/export permissions and PBS datastore cfg temp handling
- Fix lock close propagation, help output routing, mount guard probe behavior, and chunk splitting
- Share close-error handling and clean up several small review findings
- Add coverage for tar TypeRegA restore metadata entries
Route the UI restore workflow through prepareBundleAndPlan so the helper is no longer dead code.

Preserve prepared bundle cleanup semantics when planning fails, and add a regression test for that failure path. Add the missing go.sum checksums for the tcell version required by go.mod.
Make staged extraction report whether staging completed successfully, not just whether it returned a fatal error.

Skip staged apply when extraction errors were downgraded to warnings, preserving restoreHadWarnings behavior and only recording stageLogPath after full staging success. Add regression coverage for incomplete staging.
Require a completed export extraction before running SAFE cluster apply.

Use exportLogPath as the success marker from extractSelectiveArchive, skip SAFE apply with a warning when export extraction did not complete, and add regression coverage for the partial-export path.
Replace deprecated tar.TypeRegA usage with explicit NUL typeflag constants.

Keep restore metadata parsing behavior for tar entries with typeflag 0, update the regression test to exercise that behavior directly, and remove the deprecated symbol from both test and production code.
Remove partially-created backup lock files when lock creation fails during WriteString or the final Close.

Log cleanup failures as warnings, preserve the original lock acquisition error, and add regression coverage for write and close failure paths.

Route lsmod and smartctl --scan through the best-effort probe path so failures no longer abort system collection.

Add debug-level handling for best-effort command failures and regression coverage to ensure lsmod and SMART scan errors continue without warnings or output files.
Make collectBestEffortProbe return cancellation and deadline errors instead of swallowing them as best-effort probe failures.

Propagate those errors from all probe call sites and preserve ordinary probe failures as debug-only skips. Extend command classification so canceled command executions are not downgraded to non-critical failures, and add tests for canceled/deadline probe paths.
Register pipe close defers immediately after each successful os.Pipe call in the non-interactive encryption recipient test.

Remove manual pipe closes from the second pipe failure path, preserving t.Fatalf behavior while avoiding duplicated cleanup handling.
Limit RunGoBackup integration tests to a minimal synthetic backup config so they do not collect and compress runner home directories under GitHub Actions.

Also make the TUI simulation harness create a fresh SimulationScreen per app instance, avoiding hangs when multi-step prompts reopen a TUI after the previous screen was finalized.
- Pin GitHub Actions workflow steps to immutable commit SHAs
- Treat pvesh-specific timeouts as non-fatal for non-critical collection commands
- Preserve parent context cancellation behavior for pvesh commands
- Use deterministic missing archive paths in restore workflow tests
- Check pipe close errors in encryption tests
- Return valid FileInfo from smartctl Stat test mock
- Add targeted regression coverage
Bump the module Go version and runtime minimum to 1.25.10 so builds use the patched standard library for GO-2026-4971 and GO-2026-4918.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 11, 2026

Warning

Rate limit exceeded

@tis24dev has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 17 minutes and 5 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 382c288a-a155-4e4a-bacc-e14fccebf89e

📥 Commits

Reviewing files that changed from the base of the PR and between 5e2d95e and d4a97f7.

📒 Files selected for processing (5)
  • cmd/proxsave/main.go
  • cmd/proxsave/main_runtime.go
  • docs/INSTALL.md
  • docs/RESTORE_GUIDE.md
  • go.mod
📝 Walkthrough

Walkthrough

Pins CI actions and adds a race workflow. Introduces closeerr and propagates robust close/error handling. Major orchestrator refactor: new UI-driven restore planning/apply, network apply with rollback/diagnostics, PBS/PVE directory recreation, mount guards. Updates collectors, notifications, metrics, security, storage, and extensive tests. Toolchain bumped.

Changes

Restore and Apply Orchestration

Layer / File(s) Summary
UI-driven planning and run
internal/orchestrator/restore_workflow_ui_*.go, ..._run.go, ..._full.go
Adds prepare/plan flows, selective/full restore orchestration, confirmations, abort normalization.
Network apply and guards
internal/orchestrator/network_apply_workflow_ui_*.go, mount_guard*.go
Staged apply, preflight + rollback timer, diagnostics/health, PBS mount guards.
Directory recreation
internal/orchestrator/directory_recreation*.go
Parsers for PVE/PBS configs, safety checks, ownership/permissions, lock handling.

Infrastructure

Layer / File(s) Summary
Close-error utility
internal/closeerr/closeerr.go, aliases
Adds CloseIntoErr and package aliases; propagates across subsystems.
CI and toolchain
.github/workflows/*, go.mod
Pins actions by SHA, adds race workflow, bumps toolchain.

Subsystem updates

Layer / File(s) Summary
Backup/collectors
internal/backup/*
Best-effort probes, improved error propagation, tar/pipe close handling.
Notify/Metrics
internal/notify/*, internal/metrics/prometheus.go
Checked writes, response-body close checks, Fprintf builders.
Security/Storage
internal/security/*, internal/storage/*
Resource management fixes, logic tweaks.
TUI components
internal/tui/*
Form/List/Panel wrappers use promoted methods; simulation harness improvements.
cmd/runtime
cmd/proxsave/*
Safer defers and loop simplifications.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant RestoreWorkflow
  participant Filesystem
  participant NetApply
  participant PBSGuards

  User->>RestoreWorkflow: Select mode and categories
  RestoreWorkflow->>Filesystem: Extract normal/staged payloads
  RestoreWorkflow->>PBSGuards: Maybe apply datastore mount guards
  RestoreWorkflow->>NetApply: Preflight validate
  NetApply-->>RestoreWorkflow: Result (ok/fail)
  alt ok
    RestoreWorkflow->>NetApply: Apply with rollback timer
    User-->>RestoreWorkflow: Commit decision
    RestoreWorkflow->>NetApply: Disarm or allow rollback
  else fail
    RestoreWorkflow->>Filesystem: Auto-rollback staged network or prompt
  end
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~150 minutes

Possibly related PRs

  • tis24dev/proxsave#189 — Earlier restore UI flow changes overlap with new workflow delegation and planning.
  • tis24dev/proxsave#198 — Previous edits to directory recreation intersect with new PBS/PVE recreation modules here.
  • tis24dev/proxsave#185 — Touches email relay code path also updated for response-body close handling.

Poem

I hop through logs and tidy ends,
Close the files, ensure they mend.
Restore’s a waltz with cautious tread,
Rollback ready, paths well-led.
Tags are pinned, the tests all race—
Ship it swift, with bunny grace. 🐇✨

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dev

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 11, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

PackageVersionScoreDetails
actions/actions/attest-build-provenance a2bbfa25375fe432b6a289bc6b6cd05ecd0c4c32 UnknownUnknown
actions/actions/checkout de0fac2e4500dabe0009e67214ff5f5447ce83dd 🟢 5.7
Details
CheckScoreReason
Maintained⚠️ 01 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
Binary-Artifacts🟢 10no binaries found in the repo
Code-Review🟢 10all changesets reviewed
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies🟢 3dependency not pinned by hash detected -- score normalized to 3
Packaging⚠️ -1packaging workflow not detected
License🟢 10license file detected
Fuzzing⚠️ 0project is not fuzzed
Signed-Releases⚠️ -1no releases found
Security-Policy🟢 9security policy file detected
Branch-Protection🟢 5branch protection is not maximal on development and all release branches
SAST🟢 8SAST tool detected but not run on all commits
actions/actions/setup-go 4a3601121dd01d1626a1e23e37211e3254c1c06c 🟢 5.6
Details
CheckScoreReason
Code-Review🟢 10all changesets reviewed
Maintained🟢 56 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 5
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Binary-Artifacts🟢 10no binaries found in the repo
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Packaging⚠️ -1packaging workflow not detected
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies⚠️ 0dependency not pinned by hash detected -- score normalized to 0
Fuzzing⚠️ 0project is not fuzzed
License🟢 10license file detected
Signed-Releases⚠️ -1no releases found
Security-Policy🟢 9security policy file detected
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
SAST🟢 10SAST tool is run on all commits
actions/anchore/sbom-action/download-syft e22c389904149dbc22b58101806040fa8d37a610 🟢 6.9
Details
CheckScoreReason
Code-Review🟢 4Found 6/13 approved changesets -- score normalized to 4
Maintained🟢 1020 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 10
Binary-Artifacts🟢 9binaries present in source code
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Token-Permissions🟢 10GitHub workflow tokens follow principle of least privilege
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Fuzzing⚠️ 0project is not fuzzed
Pinned-Dependencies🟢 7dependency not pinned by hash detected -- score normalized to 7
License🟢 10license file detected
Security-Policy🟢 10security policy file detected
Signed-Releases⚠️ -1no releases found
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
Packaging🟢 10packaging workflow detected
Branch-Protection🟢 4branch protection is not maximal on development and all release branches
actions/goreleaser/goreleaser-action 1a80836c5c9d9e5755a25cb59ec6f45a3b5f41a8 🟢 4.9
Details
CheckScoreReason
Code-Review⚠️ 0Found 1/22 approved changesets -- score normalized to 0
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Packaging⚠️ -1packaging workflow not detected
Maintained🟢 1027 commit(s) and 3 issue activity found in the last 90 days -- score normalized to 10
Binary-Artifacts🟢 10no binaries found in the repo
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies🟢 9dependency not pinned by hash detected -- score normalized to 9
Fuzzing⚠️ 0project is not fuzzed
License🟢 10license file detected
Signed-Releases⚠️ -1no releases found
Branch-Protection⚠️ -1internal error: error during branchesHandler.setup: internal error: some github tokens can't read classic branch protection rules: https://github.com/ossf/scorecard-action/blob/main/docs/authentication/fine-grained-auth-token.md
Security-Policy⚠️ 0security policy file not detected
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
actions/actions/checkout de0fac2e4500dabe0009e67214ff5f5447ce83dd 🟢 5.7
Details
CheckScoreReason
Maintained⚠️ 01 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
Binary-Artifacts🟢 10no binaries found in the repo
Code-Review🟢 10all changesets reviewed
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies🟢 3dependency not pinned by hash detected -- score normalized to 3
Packaging⚠️ -1packaging workflow not detected
License🟢 10license file detected
Fuzzing⚠️ 0project is not fuzzed
Signed-Releases⚠️ -1no releases found
Security-Policy🟢 9security policy file detected
Branch-Protection🟢 5branch protection is not maximal on development and all release branches
SAST🟢 8SAST tool detected but not run on all commits

Scanned Files

  • .github/workflows/release.yml
  • .github/workflows/sync-dev.yml

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented May 11, 2026

Not up to standards ⛔

🔴 Issues 3 critical · 6 high · 4 medium

Alerts:
⚠ 13 issues (≤ 0 issues of at least minor severity)

Results:
13 new issues

Category Results
Security 3 critical
6 high
Complexity 4 medium

View in Codacy

🟢 Metrics 1026 complexity · 94 duplication

Metric Results
Complexity 1026
Duplication 94

View in Codacy

AI Reviewer: first review requested successfully. AI can make mistakes. Always validate suggestions.

Run reviewer

TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown

@codacy-production codacy-production Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR includes a significant refactoring of the orchestrator and a version upgrade to Go 1.25. While it successfully addresses several critical bugs—including nil pointer dereferences and zombie process filtering—the analysis indicates that the code is currently not up to standards due to high-severity security and quality issues.

Key concerns that should be addressed before merging include insecure file permissions (0600/0700) for backup artifacts and storage directories, and a potential path traversal vulnerability in the PBS inspection logic. Furthermore, the modern Go 1.25 feature errors.Join should be utilized in error-handling helpers to prevent masking critical I/O errors during file closure.

About this PR

  • Across the codebase, particularly in 'internal/closeerr/closeerr.go' and 'internal/orchestrator/backup_safety.go', the pattern of handling errors during 'Close()' should be modernized to use 'errors.Join'. In the context of a backup tool, discarding errors from 'Close' when another error is present can mask critical failure signals such as 'Disk Full' or data-flushing issues.

Test suggestions

  • Zombie Proxmox Backup process filtering in security check
  • Nil candidate check in raw artifact staging
  • Skipping staged apply when extraction fails
  • Best-effort hardware probes handle command failures without warning
  • Propagate context cancellation during best-effort probes
  • Partial lock file cleanup after write/close failure
  • Restore metadata entry accepts NUL (TypeRegA) type flag

TIP Improve review quality by adding custom instructions
TIP How was this review? Give us feedback


// createPVEStorageStructure creates the directory structure for a PVE storage
func createPVEStorageStructure(basePath, storageType string, logger *logging.Logger) error {
if err := os.MkdirAll(basePath, 0750); err != nil {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 HIGH RISK

Use more restrictive permissions (0700) for storage directories to prevent unauthorized local access to Proxmox configurations.

if n > 0 {
if _, wErr := out.Write(buf[:n]); wErr != nil {
if out == nil {
out, err = os.OpenFile(chunkPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, defaultChunkFilePerm)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 HIGH RISK

Restrict chunk file permissions to 0600 to ensure backup data privacy by limiting access to the process owner.

}

func dirHasAnyEntry(path string) (hasEntry bool, err error) {
f, err := os.Open(path)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 HIGH RISK

Sanitize the directory path before opening it to prevent potential path traversal vulnerabilities using 'filepath.Clean(path)'.

}

func createPBSDatastoreLockFile(lockPath string) error {
file, err := os.OpenFile(lockPath, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o640)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 HIGH RISK

Reduce lock file permissions to 0600 to follow the principle of least privilege. Additionally, consider implementing a mechanism to ensure the '.lock' file is cleaned up if the process crashes, and add the PID to the file to allow for stale lock detection, similar to existing checks in 'internal/checks/checks.go'.


// Export writes the given metrics snapshot to proxmox_backup.prom in textfileDir.
func (pe *PrometheusExporter) Export(m *BackupMetrics) error {
func (pe *PrometheusExporter) Export(m *BackupMetrics) (err error) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MEDIUM RISK

The 'Export' function is highly complex (cyclomatic complexity 33) and repetitive. Refactor it to use a data-driven approach for writing metrics to improve maintainability.

Comment on lines +16 to +18
if closeErr := closer.Close(); closeErr != nil && !errors.Is(closeErr, os.ErrClosed) && *errp == nil {
*errp = fmt.Errorf("%s: %w", operation, closeErr)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MEDIUM RISK

Suggestion: Since the project is using Go 1.25, use 'errors.Join' to accumulate the close error instead of discarding it when an earlier error is present. This ensures critical signals like flush failures are preserved.


// backupFile adds a single file to the tar archive
func backupFile(tw *tar.Writer, sourcePath, archivePath string, result *SafetyBackupResult, logger *logging.Logger) error {
func backupFile(tw *tar.Writer, sourcePath, archivePath string, result *SafetyBackupResult, logger *logging.Logger) (err error) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚪ LOW RISK

Suggestion: Consider using 'errors.Join' within the 'closeIntoErr' helper to ensure that if a file write fails AND the subsequent close fails, both errors are reported to the user.

return err
}
if !success {
w.logger.Warning("Skipping apply due to staged extraction errors")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚪ LOW RISK

Suggestion: Verify that this 'return nil' correctly implements the 'Skip staged apply after staging errors' requirement by ensuring no further apply logic runs for this candidate.

ctx = context.Background()
}
if cand == nil {
return stagedFiles{}, fmt.Errorf("candidate is nil")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚪ LOW RISK

Suggestion: This null check addresses the 'Fix nil candidate panic in raw artifact staging' requirement.

@tis24dev
Copy link
Copy Markdown
Owner Author

@coderabbitai help

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 11, 2026

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

CodeRabbit commands

These commands are invoked using PR/Issue comments.

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai rate limit to show your current review rate limit status.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai evaluate custom pre-merge check --instructions <custom-checks-instructions> --name <custom-checks-title> [--mode <error|warning>] to validate the custom pre-merge checks instructions. Defaults to error when --mode is omitted.
  • @coderabbitai ignore pre-merge checks to override pre-merge checks and get an approval on PR.
  • @coderabbitai run pre-merge checks to run pre-merge checks on the pull request.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve merge conflict to automatically resolve merge conflicts.
  • @coderabbitai autofix to automatically fix issues identified in unresolved review comments.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai generate configuration to create a PR that adds the current resolved configuration as .coderabbit.yaml (or show it if already present).
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit configuration file (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, support, documentation and community

  • Visit our status page to check the current availability of CodeRabbit.
  • Create a ticket on our support page for assistance with any issues or questions.
  • Visit our documentation site for detailed information on how to use CodeRabbit.
  • Join our Discord community to connect with other users and get help from the community.
  • Follow us on X/Twitter for updates and announcements.

@tis24dev
Copy link
Copy Markdown
Owner Author

@coderabbitai summary

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 11, 2026

✅ Actions performed

Summary regeneration triggered.

tis24dev added 2 commits May 11, 2026 18:41
Fix the TUI simulation helper so each new SimulationScreen gets its own key injector instead of reusing a closure tied to the first screen.

Add a regression test covering the nested existing-path -> new-path prompt flow.

Verified with:
- GOTOOLCHAIN=go1.25.10+auto go test ./internal/orchestrator -run 'TestPrompt.*TUI|TestTUIWorkflowUIPromptDecryptSecret|TestPromptYesNoTUI|TestShowRestorePlanTUI|TestSelectCategoriesTUI' -count=1
- GOTOOLCHAIN=go1.25.10+auto go test ./internal/orchestrator -count=1
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
internal/notify/templates.go (1)

114-177: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Escape the interpolated strings in the HTML body.

These writes still inject raw values like Hostname, storage summaries, usage strings, and ScriptVersion straight into HTML, while later rows correctly use escapeHTML. Any &, <, or " in those fields will break the markup, and config-derived values become an HTML-injection surface.

💡 Suggested pattern
- fmt.Fprintf(&html, "            <p>%s - %s</p>\n", data.Hostname, data.BackupDate.Format("2006-01-02 15:04:05"))
+ fmt.Fprintf(&html, "            <p>%s - %s</p>\n", escapeHTML(data.Hostname), data.BackupDate.Format("2006-01-02 15:04:05"))

- fmt.Fprintf(&html, "                        <span class=\"emoji\">%s</span> %s backups\n", GetStorageEmoji(data.LocalStatus), data.LocalStatusSummary)
+ fmt.Fprintf(&html, "                        <span class=\"emoji\">%s</span> %s backups\n", GetStorageEmoji(data.LocalStatus), escapeHTML(data.LocalStatusSummary))

- fmt.Fprintf(&html, "            <p>Generated on %s by backup script v%s</p>\n", data.BackupDate.Format("2006-01-02 15:04:05"), data.ScriptVersion)
+ fmt.Fprintf(&html, "            <p>Generated on %s by backup script v%s</p>\n", data.BackupDate.Format("2006-01-02 15:04:05"), escapeHTML(data.ScriptVersion))

Apply the same treatment to the other raw string interpolations in this block.

Also applies to: 262-262

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/notify/templates.go` around lines 114 - 177, Several HTML writes in
the template (the fmt.Fprintf and html.WriteString calls that interpolate
data.Hostname, data.LocalStatusSummary, data.LocalUsed/Free/Percent,
data.SecondaryStatusSummary, data.SecondaryUsed/Free/Percent,
data.CloudStatusSummary, ScriptVersion, etc.) inject raw strings; update each
interpolated value to be passed through the existing escapeHTML helper before
insertion (e.g., replace direct uses of data.Hostname, data.LocalStatusSummary,
GetStorageEmoji(...) outputs and all data.* used in fmt.Fprintf/html.WriteString
here and the similar occurrence around the ScriptVersion usage) so every user-
or config-derived string is escaped prior to writing into the HTML buffer.
internal/tui/components/form_test.go (1)

74-79: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Inconsistent wrapper usage.

Line 74-75 correctly use form.GetFormItemCount(), but line 77 falls back to form.Form.GetFormItem(0) instead of using the wrapper method form.GetFormItem(0).

♻️ Proposed fix
 	if form.GetFormItemCount() != 1 {
 		t.Fatalf("form item count=%d; want 1", form.GetFormItemCount())
 	}
-	if got := form.Form.GetFormItem(0).(*tview.InputField).GetLabel(); got != "Password" {
+	if got := form.GetFormItem(0).(*tview.InputField).GetLabel(); got != "Password" {
 		t.Fatalf("label=%q; want %q", got, "Password")
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/tui/components/form_test.go` around lines 74 - 79, Test mixes direct
access to the embedded Form with the wrapper: replace the direct call
form.Form.GetFormItem(0) with the wrapper method form.GetFormItem(0) so the test
consistently uses form.GetFormItemCount() and form.GetFormItem(0), then assert
the returned item is a *tview.InputField and call GetLabel() on it to check it
equals "Password".
internal/backup/collector_pbs_datastore.go (1)

540-593: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Cancel sibling PXAR workers when the first datastore scan fails.

firstErr is captured here, but nothing tells the other goroutines to stop. A single failing datastore can now leave this step waiting for every other PXAR scan to finish, which is especially painful on large datastores. Reintroducing a child context and canceling it when firstErr is set preserves fast-fail behavior.

Suggested fix
 func (c *Collector) runPBSPXARStep(ctx context.Context, state *pbsPxarState, fn func(context.Context, pbsDatastore, *pbsPxarState) error) error {
 	if err := ctx.Err(); err != nil {
 		return err
 	}
 	if state == nil || len(state.eligible) == 0 {
 		return nil
 	}
+
+	stepCtx, cancel := context.WithCancel(ctx)
+	defer cancel()
 
 	dsWorkers := c.config.PxarDatastoreConcurrency
 	if dsWorkers <= 0 {
 		dsWorkers = 1
 	}
@@
-			if err := fn(ctx, ds, state); err != nil {
+			if err := fn(stepCtx, ds, state); err != nil {
 				if errors.Is(err, context.Canceled) {
 					return
 				}
 				errMu.Lock()
 				if firstErr == nil {
 					firstErr = err
+					cancel()
 				}
 				errMu.Unlock()
 			}
 		}()
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/backup/collector_pbs_datastore.go` around lines 540 - 593, Wrap the
worker goroutines in a child cancellable context created with
context.WithCancel(ctx) inside runPBSPXARStep, pass that child context into fn
(replace uses of ctx in the goroutine with childCtx), and call cancel() as soon
as you record firstErr so sibling PXAR workers stop; also listen on
childCtx.Done() in the semaphore-acquire select to abort acquisition when
cancellation happens. Ensure you still return the recorded firstErr after
wg.Wait and respect the original ctx.Err() if set.
internal/backup/archiver.go (1)

613-618: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve the compressor start error instead of returning a pipe error.

When cmd.Start() fails, closing pw causes the tar goroutine to report ErrClosedPipe, and these branches can return that secondary error instead of the real failed to start xz/zstd/... cause. Close the pipe with the start error, drain errChan, then return the start error.

Suggested fix pattern
 	if err := cmd.Start(); err != nil {
-		_ = pw.Close()
-		if startErr := <-errChan; startErr != nil {
-			return startErr
-		}
+		_ = pw.CloseWithError(err)
+		<-errChan
 		return fmt.Errorf("failed to start xz: %w", err)
 	}

Apply the same pattern to the zstd and generic pipeTarThroughCommand start-failure branches.

Also applies to: 686-691, 769-774

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/backup/archiver.go` around lines 613 - 618, When cmd.Start() fails
in the compressor start-failure branches (the xz branch around cmd.Start(), the
zstd branch, and the generic pipeTarThroughCommand start-failure code path),
preserve and return that start error instead of the subsequent ErrClosedPipe:
call pw.CloseWithError(startErr) (using the actual start error), then drain
errChan (receive the tar goroutine error but ignore a closed-pipe result), and
finally return the original startErr (not the pipe error); update all three
branches to follow this pattern so the real compressor start failure is
returned.
🧹 Nitpick comments (11)
.github/instructions/codacy.instructions.md (1)

27-28: ⚡ Quick win

Consider breaking line 28 into sub-bullets for readability.

Line 28 contains multiple URLs with inconsistent formatting styles (one with full description in parentheses, one with a template placeholder). Breaking this into sub-bullets would improve scannability:

♻️ Suggested formatting improvement
-- Try to reset the MCP on the extension
-- If the user is using VSCode, suggest them to review their Copilot > MCP settings in Github, under their organization or personal account. Refer them to Settings > Copilot > Enable MCP servers in Copilot. Suggested URL (https://github.com/settings/copilot/features) or https://github.com/organizations/{organization-name}/settings/copilot/features (This can only be done by their organization admins / owners)
+- Try to reset the MCP on the extension
+- If the user is using VSCode, suggest them to review their Copilot > MCP settings in Github:
+  - Navigate to Settings > Copilot > Enable MCP servers in Copilot
+  - For personal accounts: https://github.com/settings/copilot/features
+  - For organizations: https://github.com/organizations/{organization-name}/settings/copilot/features
+  - Note: Organization settings can only be modified by admins/owners
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/instructions/codacy.instructions.md around lines 27 - 28, The
paragraph starting with "Try to reset the MCP on the extension" bundles multiple
actions and URLs; split that long line into sub-bullets for clarity: one bullet
for resetting the MCP, one for VSCode-specific instructions referencing "Copilot
> MCP settings", and separate bullets for each URL
(https://github.com/settings/copilot/features and the template
https://github.com/organizations/{organization-name}/settings/copilot/features)
with consistent formatting and a short note that the org URL requires admin
access; keep the original phrasing ("Try to reset the MCP on the extension",
"Copilot > MCP settings") to make the change easy to locate.
internal/checks/checks_test.go (1)

1180-1208: 💤 Low value

Nice coverage for the close-failure → partial-lock-removal path.

Hijacking syncFile to close the underlying *os.File is a neat way to force the production f.Close() on line 393 of checks.go to return os.ErrClosed, deterministically exercising the new close-failure branch. Assertions on both the wrapped error text ("failed to close lock file") and the absence of the lock file pin down the contract well.

One small consideration: this test depends on syncFile being called before the production close. If the future implementation ever reorders or removes the explicit syncFile call, the close-failure path would silently stop being covered. Adding a brief inline comment to syncFile = func(...) { return f.Close() } explaining the intent (closing here causes the subsequent production Close() to fail) would help future maintainers. Optional.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/checks/checks_test.go` around lines 1180 - 1208,
TestCheckLockFile_CloseFailsRemovesPartialLock is relying on overriding the
package-level syncFile to force an early f.Close() so the subsequent production
Close() in checks.go fails; add a brief inline comment next to the assignment
syncFile = func(f *os.File) error { return f.Close() } that states "close here
causes the subsequent production Close() to return os.ErrClosed and exercise the
close-failure branch" so future maintainers understand the intent and the
dependency on invocation order.
internal/checks/checks.go (3)

441-464: 💤 Low value

Close error is now surfaced, but the EIO retry loop won't retry on Close failure.

Capturing closeErr into lastErr so it flows into the existing error-classification switch is the right call. Just flagging that the retry-on-EIO logic (lines 455–460) only applies when createTestFile itself returns the error — a Close-time EIO will not be retried because the break on line 449 exits the attempt loop unconditionally. For a permission probe this is almost certainly fine (Close errors here are rare and usually non-transient), so noting it for awareness rather than asking for a change.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/checks/checks.go` around lines 441 - 464, The loop captures closeErr
into lastErr but always breaks immediately after a Close failure, so Close-time
EIOs aren't retried; update the attempt loop around createTestFile/Close (the
code using createTestFile, closeErr, lastErr, syscall.EIO, retryDelay,
maxAttempts and c.logger.Warning) to treat a Close error the same as a
createTestFile EIO: after setting lastErr = closeErr, check if
errors.Is(closeErr, syscall.EIO) && attempt < maxAttempts, log a warning via
c.logger.Warning, sleep retryDelay and continue to retry; otherwise proceed to
break so non-transient close errors still stop retrying.

583-647: 💤 Low value

Error message normalization and cleanup hardening — LGTM.

A few nice properties of this block:

  • All wrapped errors now consistently use lowercased temp … prefixes and include the offending path, matching the casing change in backup_run_phases.go.
  • The defer func() { _ = osRemove(testFile) }() on line 629 guarantees the permission-test file is cleaned up regardless of whether the subsequent symlink test fails or succeeds — addresses the prior leak window.
  • Symlink-test cleanup (640–642) is now warn-only and ignores ENOENT, so a flaky removal won't mask an otherwise successful check.

One minor asymmetry: testFile is removed via defer while testSymlink is removed inline. Not a correctness issue (the symlink path only runs after the writability check passes), but moving symlink cleanup to a defer as well would make the cleanup pattern uniform. Optional.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/checks/checks.go` around lines 583 - 647, Move the symlink cleanup
to a deferred removal to match the permission-test pattern: after creating
testSymlink with osSymlink, add a defer that calls osRemove(testSymlink)
(ignoring errors similarly to the existing defer for testFile) so both testFile
and testSymlink are guaranteed to be cleaned up; keep the existing inline
warning-only removal logic if you want to log failures but ensure the defer runs
regardless of subsequent errors.

381-399: 💤 Low value

Robust partial-lock cleanup on write/close failure — looks good.

The flow correctly handles all three failure modes:

  • WriteString fails → close (warn on close error), remove partial lock, return wrapped write error.
  • syncFile fails → warn only (best-effort durability), continue.
  • Close fails → warn, remove partial lock, return wrapped close error.

One small redundancy worth considering: on the close-failure branch (lines 393–399) the close error is both Warning-logged and returned as result.Error. Since callers will log/handle the returned error, the inline Warning is mostly duplicate noise. Dropping it (or demoting to Debug) would keep the log cleaner without losing information — but this is purely cosmetic.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/checks/checks.go` around lines 381 - 399, The close-failure branch
currently both calls c.logger.Warning(...) and returns the wrapped error via
result.Error, creating duplicate logging; update the branch that handles the
f.Close() error (the block that calls f.Close(),
c.removePartialLockFile(lockPath), and sets result.Error/result.Message) to stop
emitting the Warning log — either remove the c.logger.Warning(...) call or
change it to c.logger.Debug(...) so only the returned result.Error is used for
higher-level logging; keep the call to c.removePartialLockFile(lockPath),
wrapping the close error into result.Error, and leave syncFile(...) and
WriteString handling unchanged.
.github/workflows/release.yml (1)

73-77: ⚡ Quick win

Pin an explicit GoReleaser version instead of version: latest.

Line 75 uses version: latest, which keeps the binary mutable and prevents reproducible releases. Use a semver range like version: "~> v2" (the recommended approach per the goreleaser-action documentation) to lock the GoReleaser version while allowing patch updates.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/release.yml around lines 73 - 77, The workflow pins the
goreleaser action with "version: latest" making releases non-reproducible;
update the action input in the release job (the goreleaser/goreleaser-action
usage block and its "version" key) to a semver range such as "~> v2" (or another
specific supported semver range per goreleaser-action docs) so the action binary
is fixed while still allowing patch updates.
internal/safeexec/safeexec.go (1)

21-163: 💤 Low value

Optional: Standardize factory formatting for consistency.

The factory functions mix single-line and multi-line formatting. While purely cosmetic, standardizing to one style would improve visual consistency.

For example, either:

"blkid": func(ctx context.Context, args ...string) *exec.Cmd { 
    return exec.CommandContext(ctx, "blkid", args...) 
},

Or:

"blkid": func(ctx context.Context, args ...string) *exec.Cmd { return exec.CommandContext(ctx, "blkid", args...) },

This is entirely optional and does not affect functionality.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/safeexec/safeexec.go` around lines 21 - 163, The map
allowedCommandFactories contains a mix of single-line and multi-line anonymous
factory functions (e.g., the "blkid" entry and many others); choose a consistent
formatting style (either single-line one-liner or multi-line with explicit
return) and apply it across all entries in allowedCommandFactories, updating
each anonymous func(ctx context.Context, args ...string) *exec.Cmd (for example
"blkid", "chattr", "mountpoint", etc.) so the entire map uses the same style for
visual consistency.
internal/identity/identity_test.go (1)

1140-1197: 🏗️ Heavy lift

These subtests stop asserting anything on Linux.

On the platform that uses the sysfs path, both tests now become log-only and can no longer catch regressions in bridge/wireless detection. Please keep a deterministic Linux assertion path here—e.g. by injecting the sysfs readers or by splitting the name-fallback logic into a pure helper and testing that directly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/identity/identity_test.go` around lines 1140 - 1197, The tests
currently bail out on linux because sysfs-based detection in isBridgeInterface
and isWirelessInterface makes the results non-deterministic; refactor by
extracting the name-only logic into pure helpers (e.g., isBridgeInterfaceByName
and isWirelessInterfaceByName) or add an injectable sysfs reader used by
isBridgeInterface/isWirelessInterface, then update the tests to call these pure
name-based helpers (or pass a mocked sysfs reader) so assertions run
deterministically on Linux; locate and change the logic inside isBridgeInterface
and isWirelessInterface to delegate to the new helpers or injectable reader and
update TestIsBridgeInterfaceByName/TestIsWirelessInterfaceByName to assert
against the pure helpers instead of skipping on linux.
internal/orchestrator/restore_workflow_ui_fstab.go (1)

19-20: 💤 Low value

Document the timeout rationale.

The 90-second timeout for fstab merge confirmation is hardcoded. Consider documenting why this specific duration was chosen (e.g., sufficient for operator review of mount points and remapping).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/restore_workflow_ui_fstab.go` around lines 19 - 20, The
constant fstabMergeTimeout (currently set to 90 * time.Second) lacks rationale;
update the code by adding a brief comment/docstring above fstabMergeTimeout that
explains why 90s was chosen (e.g., allows operator to review mount points and
remapping in the UI, expected human confirmation time, and safety margin) and
note whether this value is adjustable (e.g., via config/flag) or why it should
remain fixed; reference the symbol fstabMergeTimeout and the surrounding restore
workflow UI merge logic so reviewers can see the explanation next to the
constant.
internal/orchestrator/network_apply_workflow_ui_prompt.go (1)

151-165: 💤 Low value

Consider extracting the prompt timeout to a named constant.

The 90*time.Second confirmation timeout at line 162 is a magic number that sits next to defaultNetworkRollbackTimeout in the same message. A named constant (e.g., defaultNetworkApplyConfirmTimeout) would make it easier to keep these two timeouts coherent and to tune the prompt wait independently of the rollback window.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/network_apply_workflow_ui_prompt.go` around lines 151 -
165, Extract the magic timeout 90*time.Second used in confirmApplyNow when
calling f.ui.ConfirmAction into a named package-level constant (e.g.,
defaultNetworkApplyConfirmTimeout) and replace the literal with that constant;
update the declaration near defaultNetworkRollbackTimeout so both timeouts are
easy to see and tune together, and keep the change localized to the
confirmApplyNow function and the new constant to avoid altering ConfirmAction's
signature.
internal/orchestrator/network_apply_workflow_ui_rollback.go (1)

505-509: 💤 Low value

Consider renaming the error method to avoid shadowing the built-in type.

error is the name of Go's built-in interface, and a method named error on this flow type is also used just once (line 275). Either inlining f.logger.Error(...) at the single call site or renaming to errorf/logError would avoid the visual collision and keep symmetry with debug/info/warning clearer. Note: this is only a readability nit since methods can share names with built-in types without issue.

♻️ Proposed rename
-func (f *networkRollbackUIApplyFlow) error(format string, args ...interface{}) {
+func (f *networkRollbackUIApplyFlow) errorf(format string, args ...interface{}) {
 	if f.logger != nil {
 		f.logger.Error(format, args...)
 	}
 }

And update the call site at line 275 accordingly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/network_apply_workflow_ui_rollback.go` around lines 505
- 509, The method named error on the networkRollbackUIApplyFlow conflicts
visually with Go's built-in error type; rename the method (e.g., to errorf or
logError) and update its single call site (the usage of f.error) to the new
name, or alternatively inline the f.logger.Error(...) at that call site; ensure
the renamed method still checks f.logger != nil and forwards to
f.logger.Error(format, args...).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/logging/logger.go`:
- Around line 75-79: The close path currently returns on Close error before
clearing the logger state, leaving l.logFile as a stale handle; change the logic
in the close method that calls l.logFile.Close() so you call Close, capture its
error, then set l.logFile = nil (and any related state) before returning the
error (e.g., call l.logFile.Close(), err := ..., then l.logFile = nil, then if
err != nil return fmt.Errorf("failed to close existing log file: %w", err)).
Ensure you reference and clear l.logFile to avoid silent drop of future file
writes.

In `@internal/notify/email.go`:
- Around line 1154-1156: Headers are written verbatim and allow CR/LF injection
via toHeader and e.config.From; before writing in the email builder (the block
that calls fmt.Fprintf(&email, "To: %s\n", toHeader), fmt.Fprintf(&email, "From:
%s\n", e.config.From)), validate and sanitize those values by rejecting or
removing any '\r' or '\n' (or using strict parsing like net/mail.Address
parsing) and normalize/escape or return an error for invalid input; implement a
small helper (e.g., sanitizeHeaderValue or parseAndFormatAddress) and use it for
toHeader and e.config.From so no raw CR/LF can be emitted into the MIME header.

In `@internal/orchestrator/backup_safety.go`:
- Around line 447-457: When a write (io.Copy from tarReader into outFile) or the
subsequent outFile.Close() fails in the restore loop, remove the partially
written target file before continuing; update the failure branches inside the
block that calls io.Copy(...) and the block that calls outFile.Close() to call
os.Remove(target) (handling/remove and logging any Remove error) after
attempting to close the file and before continuing, referencing the existing
variables outFile, target, tarReader and logger.Warning for locating and logging
the removal step.

In `@internal/orchestrator/directory_recreation_config.go`:
- Around line 155-163: In parseConfigPath, stop using strings.Fields which
splits paths with spaces; instead detect the "path " prefix, slice off that
prefix from the original line and return strings.TrimSpace of the remainder so
values like "path /mnt/my store" return the full "/mnt/my store"; update
parseConfigPath to use that slicing and trimming logic while preserving the same
boolean success return.

In `@internal/orchestrator/directory_recreation_ownership.go`:
- Around line 20-27: The current flow returns early when
lookupBackupOwnership(path, logger) yields found==false, skipping
ensureDatastoreDirectoryMode; change it so only a real error from lookup causes
an immediate return (keep "if err != nil { return err }"), but if found==false
do not return—skip calling chownDatastorePath and proceed to call
ensureDatastoreDirectoryMode(path, logger). If found==true, call
chownDatastorePath(uid,gid,logger) and handle its error as before, then always
call ensureDatastoreDirectoryMode and return its result; reference
lookupBackupOwnership, chownDatastorePath, and ensureDatastoreDirectoryMode to
locate and adjust the logic.

In `@internal/orchestrator/directory_recreation_paths.go`:
- Around line 150-154: The current isCommonZFSMountPath function is too
permissive because strings.Contains("backup") and "datastore" match unrelated
paths; update isCommonZFSMountPath to only treat strong, mount-like path
prefixes as ZFS (e.g., keep strings.HasPrefix(pathLower, "/mnt/") and replace
the broad Contains checks with prefix/segment-aware checks such as
strings.HasPrefix(pathLower, "/backup/") or strings.HasPrefix(pathLower,
"/datastore/") or checks that look for "/backup/" and "/datastore/" with path
separators), or ideally consult mount information if available; make the change
in isCommonZFSMountPath so only true mount-style paths are classified as ZFS.

In `@internal/orchestrator/directory_recreation_pbs_config.go`:
- Around line 77-102: The function writePBSDatastoreCfgAtomically currently
writes and renames a temp file without syncing; modify it to call tmpFile.Sync()
(handle and return any error) after writing and before closing/renaming to
ensure data is flushed to stable storage, and after a successful
os.Rename(tmpPath, path) call also fsync the containing directory (open
filepath.Dir(path) and call Sync on that dir file) to ensure the directory entry
is persisted; keep the existing defer cleanup behavior and propagate any
sync/fsync errors back to the caller.

In `@internal/orchestrator/directory_recreation_pbs.go`:
- Around line 190-200: The ensurePBSDatastoreSubdirs function currently swallows
os.MkdirAll failures and only logs a warning, allowing initializePBSDatastore to
treat a failed subdir creation as success; change ensurePBSDatastoreSubdirs to
propagate the first os.MkdirAll error instead of continuing (e.g., change
signature to return (bool, []string, error) or otherwise return an error), stop
the loop on that error, and update the caller initializePBSDatastore to handle
and return that error so datastore initialization fails when creating
.chunks/.index (refer to ensurePBSDatastoreSubdirs, pbsDatastoreSubdirs and
initializePBSDatastore).

In `@internal/orchestrator/directory_recreation_pve.go`:
- Around line 30-31: The call to createStorageSubdirs currently logs failures
but always returns nil; change the code to surface errors instead: modify
createStorageSubdirs to return an error (or propagate its error) and in
directory_recreation_pve.go replace the fire-and-forget call with error handling
that returns the error upstream (e.g., if err := createStorageSubdirs(basePath,
subdirs, logger); err != nil { logger.Error(...); return err }). Ensure all
callers of createStorageSubdirs (including the block around the current lines
34–40) are updated to check and return the error so partial directory creation
doesn't go unnoticed.

In `@internal/orchestrator/mount_guard.go`:
- Around line 32-40: guardMountPoint currently probes mount state twice
(ensureGuardTargetUnmounted then isAlreadyMounted) which can race; change it to
perform a single probe and act on that result. Modify ensureGuardTargetUnmounted
(or add a new helper like probeMountState) to return both an error and a boolean
"isMounted" (or return a typed status) so guardMountPoint can call
normalizeGuardMountRequest, call the single probe
(ensureGuardTargetUnmounted/probeMountState) once, and then decide: if probe
returns an error, wrap and return it; if probe reports already mounted, return
nil; otherwise proceed to bind-mount. Update references to
ensureGuardTargetUnmounted and isAlreadyMounted accordingly in guardMountPoint
(and the similar block at lines 64-78) so no second probe is done.

In `@internal/orchestrator/pbs_notifications_api_apply.go`:
- Around line 81-104: The warning/error logging calls in
buildPBSNotificationDesiredState (switch default and other branches) and in
buildPBSNotificationEndpoint are dereferencing logger without a nil-check;
update both functions to guard logger usage by either early-setting a no-op
logger when logger == nil or wrapping each logger.Warning / logger.Error call in
an if logger != nil condition so that missing logger won't panic when
encountering unknown sections or malformed endpoints; reference the functions
buildPBSNotificationDesiredState and buildPBSNotificationEndpoint and ensure all
places that call logger.* within these functions (including the default case and
any endpoint-validation branches) are guarded.

In `@internal/orchestrator/restore_workflow_ui_backups_services.go`:
- Around line 131-147: prepareRestoreServices currently accumulates cleanup
funcs from preparePVEClusterRestore and preparePBSServices but if a later
prepare call returns an error (including ErrRestoreAborted) it returns early
without running the already-registered cleanups, leaving services stopped; fix
by ensuring any previously appended cleanup functions are executed before
returning an error from prepareRestoreServices (e.g., run cleanups in reverse
order immediately when a subsequent prepare returns non-nil error), referencing
prepareRestoreServices, preparePVEClusterRestore, preparePBSServices and
ErrRestoreAborted to locate the relevant logic.

In `@internal/orchestrator/restore_workflow_ui_run.go`:
- Around line 55-63: After calling prepareBundleAndPlan() in
restoreUIWorkflowRun.run(), register the defer to call w.prepared.Cleanup()
immediately after w.prepared may be set (i.e., right after the
prepareBundleAndPlan() call) so that cleanup runs even if prepareBundleAndPlan()
later returns an error; ensure you still check err and return it, but move the
defer placement to occur before the err check and guard it with if w.prepared !=
nil to avoid nil dereference.

In `@internal/orchestrator/restore_workflow_ui_tfa.go`:
- Around line 25-27: The prompt for TFA compatibility should not be skipped when
logger is nil; ensure the decision to ask the user depends only on addNow (and
addNames) not on logger. Modify the code around addNow/logger: move the user
prompt (the logic that mutates selected) out of any logger != nil guard so that
the prompt runs whenever addNow is false, and restrict logger usage to only
emitting the Warning about re-enrollment (use logger.Warning(...,
strings.Join(addNames, ", ")) only if logger != nil). Keep references to addNow,
addNames, logger and selected so you update the correct block in
restore_workflow_ui_tfa.go.

In `@internal/orchestrator/restore_workflow_ui.go`:
- Around line 65-77: Update normalizeRestoreWorkflowUIError to detect EOF using
errors.Is(err, io.EOF) instead of the direct equality check; specifically change
the condition that currently does "err == io.EOF" so wrapped EOFs (e.g.,
fmt.Errorf(...%w...)) trigger the logger.Warning call and return
ErrRestoreAborted. Keep the existing logger.Warning message and the subsequent
call to restoreWorkflowInputAborted(ctx, err) unchanged so behavior and other
sentinel handling remain consistent.

In `@internal/storage/filesystem.go`:
- Around line 205-209: The defer that removes the temporary test file should be
registered before the early return so it always runs; move the defer func() { _
= os.Remove(testFile) }() to immediately after the test file is created (or at
least before the f.Close() error check) so that regardless of f.Close() failing
the cleanup will execute, and preserve the existing d.logger.Debug call and
return false behavior when f.Close() returns an error.

---

Outside diff comments:
In `@internal/backup/archiver.go`:
- Around line 613-618: When cmd.Start() fails in the compressor start-failure
branches (the xz branch around cmd.Start(), the zstd branch, and the generic
pipeTarThroughCommand start-failure code path), preserve and return that start
error instead of the subsequent ErrClosedPipe: call pw.CloseWithError(startErr)
(using the actual start error), then drain errChan (receive the tar goroutine
error but ignore a closed-pipe result), and finally return the original startErr
(not the pipe error); update all three branches to follow this pattern so the
real compressor start failure is returned.

In `@internal/backup/collector_pbs_datastore.go`:
- Around line 540-593: Wrap the worker goroutines in a child cancellable context
created with context.WithCancel(ctx) inside runPBSPXARStep, pass that child
context into fn (replace uses of ctx in the goroutine with childCtx), and call
cancel() as soon as you record firstErr so sibling PXAR workers stop; also
listen on childCtx.Done() in the semaphore-acquire select to abort acquisition
when cancellation happens. Ensure you still return the recorded firstErr after
wg.Wait and respect the original ctx.Err() if set.

In `@internal/notify/templates.go`:
- Around line 114-177: Several HTML writes in the template (the fmt.Fprintf and
html.WriteString calls that interpolate data.Hostname, data.LocalStatusSummary,
data.LocalUsed/Free/Percent, data.SecondaryStatusSummary,
data.SecondaryUsed/Free/Percent, data.CloudStatusSummary, ScriptVersion, etc.)
inject raw strings; update each interpolated value to be passed through the
existing escapeHTML helper before insertion (e.g., replace direct uses of
data.Hostname, data.LocalStatusSummary, GetStorageEmoji(...) outputs and all
data.* used in fmt.Fprintf/html.WriteString here and the similar occurrence
around the ScriptVersion usage) so every user- or config-derived string is
escaped prior to writing into the HTML buffer.

In `@internal/tui/components/form_test.go`:
- Around line 74-79: Test mixes direct access to the embedded Form with the
wrapper: replace the direct call form.Form.GetFormItem(0) with the wrapper
method form.GetFormItem(0) so the test consistently uses form.GetFormItemCount()
and form.GetFormItem(0), then assert the returned item is a *tview.InputField
and call GetLabel() on it to check it equals "Password".

---

Nitpick comments:
In @.github/instructions/codacy.instructions.md:
- Around line 27-28: The paragraph starting with "Try to reset the MCP on the
extension" bundles multiple actions and URLs; split that long line into
sub-bullets for clarity: one bullet for resetting the MCP, one for
VSCode-specific instructions referencing "Copilot > MCP settings", and separate
bullets for each URL (https://github.com/settings/copilot/features and the
template
https://github.com/organizations/{organization-name}/settings/copilot/features)
with consistent formatting and a short note that the org URL requires admin
access; keep the original phrasing ("Try to reset the MCP on the extension",
"Copilot > MCP settings") to make the change easy to locate.

In @.github/workflows/release.yml:
- Around line 73-77: The workflow pins the goreleaser action with "version:
latest" making releases non-reproducible; update the action input in the release
job (the goreleaser/goreleaser-action usage block and its "version" key) to a
semver range such as "~> v2" (or another specific supported semver range per
goreleaser-action docs) so the action binary is fixed while still allowing patch
updates.

In `@internal/checks/checks_test.go`:
- Around line 1180-1208: TestCheckLockFile_CloseFailsRemovesPartialLock is
relying on overriding the package-level syncFile to force an early f.Close() so
the subsequent production Close() in checks.go fails; add a brief inline comment
next to the assignment syncFile = func(f *os.File) error { return f.Close() }
that states "close here causes the subsequent production Close() to return
os.ErrClosed and exercise the close-failure branch" so future maintainers
understand the intent and the dependency on invocation order.

In `@internal/checks/checks.go`:
- Around line 441-464: The loop captures closeErr into lastErr but always breaks
immediately after a Close failure, so Close-time EIOs aren't retried; update the
attempt loop around createTestFile/Close (the code using createTestFile,
closeErr, lastErr, syscall.EIO, retryDelay, maxAttempts and c.logger.Warning) to
treat a Close error the same as a createTestFile EIO: after setting lastErr =
closeErr, check if errors.Is(closeErr, syscall.EIO) && attempt < maxAttempts,
log a warning via c.logger.Warning, sleep retryDelay and continue to retry;
otherwise proceed to break so non-transient close errors still stop retrying.
- Around line 583-647: Move the symlink cleanup to a deferred removal to match
the permission-test pattern: after creating testSymlink with osSymlink, add a
defer that calls osRemove(testSymlink) (ignoring errors similarly to the
existing defer for testFile) so both testFile and testSymlink are guaranteed to
be cleaned up; keep the existing inline warning-only removal logic if you want
to log failures but ensure the defer runs regardless of subsequent errors.
- Around line 381-399: The close-failure branch currently both calls
c.logger.Warning(...) and returns the wrapped error via result.Error, creating
duplicate logging; update the branch that handles the f.Close() error (the block
that calls f.Close(), c.removePartialLockFile(lockPath), and sets
result.Error/result.Message) to stop emitting the Warning log — either remove
the c.logger.Warning(...) call or change it to c.logger.Debug(...) so only the
returned result.Error is used for higher-level logging; keep the call to
c.removePartialLockFile(lockPath), wrapping the close error into result.Error,
and leave syncFile(...) and WriteString handling unchanged.

In `@internal/identity/identity_test.go`:
- Around line 1140-1197: The tests currently bail out on linux because
sysfs-based detection in isBridgeInterface and isWirelessInterface makes the
results non-deterministic; refactor by extracting the name-only logic into pure
helpers (e.g., isBridgeInterfaceByName and isWirelessInterfaceByName) or add an
injectable sysfs reader used by isBridgeInterface/isWirelessInterface, then
update the tests to call these pure name-based helpers (or pass a mocked sysfs
reader) so assertions run deterministically on Linux; locate and change the
logic inside isBridgeInterface and isWirelessInterface to delegate to the new
helpers or injectable reader and update
TestIsBridgeInterfaceByName/TestIsWirelessInterfaceByName to assert against the
pure helpers instead of skipping on linux.

In `@internal/orchestrator/network_apply_workflow_ui_prompt.go`:
- Around line 151-165: Extract the magic timeout 90*time.Second used in
confirmApplyNow when calling f.ui.ConfirmAction into a named package-level
constant (e.g., defaultNetworkApplyConfirmTimeout) and replace the literal with
that constant; update the declaration near defaultNetworkRollbackTimeout so both
timeouts are easy to see and tune together, and keep the change localized to the
confirmApplyNow function and the new constant to avoid altering ConfirmAction's
signature.

In `@internal/orchestrator/network_apply_workflow_ui_rollback.go`:
- Around line 505-509: The method named error on the networkRollbackUIApplyFlow
conflicts visually with Go's built-in error type; rename the method (e.g., to
errorf or logError) and update its single call site (the usage of f.error) to
the new name, or alternatively inline the f.logger.Error(...) at that call site;
ensure the renamed method still checks f.logger != nil and forwards to
f.logger.Error(format, args...).

In `@internal/orchestrator/restore_workflow_ui_fstab.go`:
- Around line 19-20: The constant fstabMergeTimeout (currently set to 90 *
time.Second) lacks rationale; update the code by adding a brief
comment/docstring above fstabMergeTimeout that explains why 90s was chosen
(e.g., allows operator to review mount points and remapping in the UI, expected
human confirmation time, and safety margin) and note whether this value is
adjustable (e.g., via config/flag) or why it should remain fixed; reference the
symbol fstabMergeTimeout and the surrounding restore workflow UI merge logic so
reviewers can see the explanation next to the constant.

In `@internal/safeexec/safeexec.go`:
- Around line 21-163: The map allowedCommandFactories contains a mix of
single-line and multi-line anonymous factory functions (e.g., the "blkid" entry
and many others); choose a consistent formatting style (either single-line
one-liner or multi-line with explicit return) and apply it across all entries in
allowedCommandFactories, updating each anonymous func(ctx context.Context, args
...string) *exec.Cmd (for example "blkid", "chattr", "mountpoint", etc.) so the
entire map uses the same style for visual consistency.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 38c31e85-e07c-4500-869c-80c1c8bdac48

📥 Commits

Reviewing files that changed from the base of the PR and between c7f28e4 and 5e2d95e.

📒 Files selected for processing (150)
  • .github/instructions/codacy.instructions.md
  • .github/workflows/autotag.yml
  • .github/workflows/codecov.yml
  • .github/workflows/dependency-review.yml
  • .github/workflows/race.yml
  • .github/workflows/release.yml
  • .github/workflows/security-ultimate.yml
  • .github/workflows/sync-dev.yml
  • cmd/proxsave/main_defers.go
  • cmd/proxsave/runtime_helpers.go
  • cmd/proxsave/upgrade.go
  • embed.go
  • go.mod
  • internal/backup/archiver.go
  • internal/backup/archiver_test.go
  • internal/backup/archiver_verification_test.go
  • internal/backup/checksum.go
  • internal/backup/collector.go
  • internal/backup/collector_bricks_pve.go
  • internal/backup/collector_pbs.go
  • internal/backup/collector_pbs_datastore.go
  • internal/backup/collector_pve.go
  • internal/backup/collector_pve_additional_test.go
  • internal/backup/collector_pve_util_test.go
  • internal/backup/collector_system.go
  • internal/backup/collector_system_test.go
  • internal/backup/collector_test.go
  • internal/backup/optimizations.go
  • internal/backup/optimizations_bench_test.go
  • internal/checks/checks.go
  • internal/checks/checks_test.go
  • internal/cli/args.go
  • internal/closeerr/closeerr.go
  • internal/config/config.go
  • internal/config/upgrade.go
  • internal/environment/detect.go
  • internal/environment/unprivileged.go
  • internal/identity/identity.go
  • internal/identity/identity_test.go
  • internal/input/input_test.go
  • internal/logging/logger.go
  • internal/logging/session_test.go
  • internal/metrics/prometheus.go
  • internal/notify/email.go
  • internal/notify/email_delivery_methods_test.go
  • internal/notify/email_relay.go
  • internal/notify/email_relay_test.go
  • internal/notify/gotify.go
  • internal/notify/gotify_test.go
  • internal/notify/telegram.go
  • internal/notify/telegram_registration.go
  • internal/notify/telegram_registration_test.go
  • internal/notify/templates.go
  • internal/notify/webhook.go
  • internal/notify/webhook_test.go
  • internal/orchestrator/additional_helpers_test.go
  • internal/orchestrator/backup_run_phases.go
  • internal/orchestrator/backup_safety.go
  • internal/orchestrator/backup_safety_glob_test.go
  • internal/orchestrator/backup_safety_test.go
  • internal/orchestrator/backup_sources.go
  • internal/orchestrator/bundle_test.go
  • internal/orchestrator/close_error.go
  • internal/orchestrator/decompress_reader_test.go
  • internal/orchestrator/decrypt.go
  • internal/orchestrator/decrypt_test.go
  • internal/orchestrator/decrypt_tui_e2e_helpers_test.go
  • internal/orchestrator/deps.go
  • internal/orchestrator/deps_additional_test.go
  • internal/orchestrator/directory_recreation.go
  • internal/orchestrator/directory_recreation_config.go
  • internal/orchestrator/directory_recreation_ownership.go
  • internal/orchestrator/directory_recreation_paths.go
  • internal/orchestrator/directory_recreation_pbs.go
  • internal/orchestrator/directory_recreation_pbs_config.go
  • internal/orchestrator/directory_recreation_pbs_inspect.go
  • internal/orchestrator/directory_recreation_pbs_lock.go
  • internal/orchestrator/directory_recreation_pve.go
  • internal/orchestrator/directory_recreation_test.go
  • internal/orchestrator/encryption.go
  • internal/orchestrator/encryption_exported_test.go
  • internal/orchestrator/fs_atomic.go
  • internal/orchestrator/log_parser.go
  • internal/orchestrator/mount_guard.go
  • internal/orchestrator/mount_guard_apply.go
  • internal/orchestrator/mount_guard_more_test.go
  • internal/orchestrator/network_apply_countdown_test.go
  • internal/orchestrator/network_apply_preflight_rollback_test.go
  • internal/orchestrator/network_apply_workflow_ui.go
  • internal/orchestrator/network_apply_workflow_ui_prompt.go
  • internal/orchestrator/network_apply_workflow_ui_rollback.go
  • internal/orchestrator/network_diagnostics.go
  • internal/orchestrator/network_health.go
  • internal/orchestrator/network_plan.go
  • internal/orchestrator/network_staged_apply.go
  • internal/orchestrator/nic_mapping.go
  • internal/orchestrator/nic_mapping_additional_test.go
  • internal/orchestrator/orchestrator.go
  • internal/orchestrator/orchestrator_test.go
  • internal/orchestrator/pbs_notifications_api_apply.go
  • internal/orchestrator/pbs_staged_apply.go
  • internal/orchestrator/prompts_cli_test.go
  • internal/orchestrator/resolv_conf_repair.go
  • internal/orchestrator/restore_archive.go
  • internal/orchestrator/restore_archive_entries.go
  • internal/orchestrator/restore_archive_extract.go
  • internal/orchestrator/restore_coverage_extra_test.go
  • internal/orchestrator/restore_decision.go
  • internal/orchestrator/restore_decision_test.go
  • internal/orchestrator/restore_decompression.go
  • internal/orchestrator/restore_errors_test.go
  • internal/orchestrator/restore_firewall_additional_test.go
  • internal/orchestrator/restore_test.go
  • internal/orchestrator/restore_tui.go
  • internal/orchestrator/restore_tui_simulation_test.go
  • internal/orchestrator/restore_workflow_decision_test.go
  • internal/orchestrator/restore_workflow_test.go
  • internal/orchestrator/restore_workflow_ui.go
  • internal/orchestrator/restore_workflow_ui_apply.go
  • internal/orchestrator/restore_workflow_ui_backups_services.go
  • internal/orchestrator/restore_workflow_ui_cluster_apply.go
  • internal/orchestrator/restore_workflow_ui_extract.go
  • internal/orchestrator/restore_workflow_ui_extract_test.go
  • internal/orchestrator/restore_workflow_ui_fstab.go
  • internal/orchestrator/restore_workflow_ui_full.go
  • internal/orchestrator/restore_workflow_ui_plan.go
  • internal/orchestrator/restore_workflow_ui_run.go
  • internal/orchestrator/restore_workflow_ui_tfa.go
  • internal/orchestrator/selective_additional_test.go
  • internal/orchestrator/temp_registry.go
  • internal/orchestrator/tui_simulation_test.go
  • internal/orchestrator/workflow_ui_tui_decrypt.go
  • internal/orchestrator/workflow_ui_tui_restore.go
  • internal/orchestrator/workflow_ui_tui_shared.go
  • internal/pbs/namespaces_test.go
  • internal/safeexec/safeexec.go
  • internal/security/security.go
  • internal/security/security_test.go
  • internal/storage/cloud.go
  • internal/storage/filesystem.go
  • internal/storage/local.go
  • internal/storage/secondary.go
  • internal/tui/abort_context_test.go
  • internal/tui/components/form.go
  • internal/tui/components/form_test.go
  • internal/tui/components/list_form_item.go
  • internal/tui/components/panel.go
  • internal/tui/wizard/age.go
  • internal/tui/wizard/install.go
  • internal/tui/wizard/post_install_audit_tui.go
💤 Files with no reviewable changes (2)
  • internal/environment/unprivileged.go
  • internal/orchestrator/network_staged_apply.go

Comment on lines +75 to 79
if err := l.logFile.Close(); err != nil {
return fmt.Errorf("failed to close existing log file: %w", err)
}
l.logFile = nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clear l.logFile before returning on close failure.

At Line 75, an error return happens before resetting l.logFile. That can leave a stale handle in logger state, and later writes (which ignore write errors) can silently drop file logs.

💡 Proposed fix
 	// If a log file is already open, close it first.
 	if l.logFile != nil {
-		if err := l.logFile.Close(); err != nil {
+		prev := l.logFile
+		l.logFile = nil
+		if err := prev.Close(); err != nil {
 			return fmt.Errorf("failed to close existing log file: %w", err)
 		}
-		l.logFile = nil
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/logging/logger.go` around lines 75 - 79, The close path currently
returns on Close error before clearing the logger state, leaving l.logFile as a
stale handle; change the logic in the close method that calls l.logFile.Close()
so you call Close, capture its error, then set l.logFile = nil (and any related
state) before returning the error (e.g., call l.logFile.Close(), err := ...,
then l.logFile = nil, then if err != nil return fmt.Errorf("failed to close
existing log file: %w", err)). Ensure you reference and clear l.logFile to avoid
silent drop of future file writes.

Comment thread internal/notify/email.go
Comment on lines +1154 to +1156
fmt.Fprintf(&email, "To: %s\n", toHeader)
fmt.Fprintf(&email, "From: %s\n", e.config.From)
fmt.Fprintf(&email, "Subject: =?UTF-8?B?%s?=\n", encodedSubject)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject CR/LF before writing mail headers.

To and From are emitted verbatim into the MIME header block. A configured or auto-detected value containing \r/\n can inject extra headers or produce a malformed message. Please sanitize or strictly parse these fields before writing them.

💡 Suggested patch
 	var email strings.Builder
-	toHeader = strings.TrimSpace(recipient)
+	sanitizeHeader := func(v string) string {
+		return strings.NewReplacer("\r", "", "\n", "").Replace(strings.TrimSpace(v))
+	}
+	toHeader = sanitizeHeader(recipient)
 	if toHeader == "" {
 		toHeader = "root"
 	}
-	fmt.Fprintf(&email, "To: %s\n", toHeader)
-	fmt.Fprintf(&email, "From: %s\n", e.config.From)
+	fromHeader := sanitizeHeader(e.config.From)
+	fmt.Fprintf(&email, "To: %s\n", toHeader)
+	fmt.Fprintf(&email, "From: %s\n", fromHeader)
 	fmt.Fprintf(&email, "Subject: =?UTF-8?B?%s?=\n", encodedSubject)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/notify/email.go` around lines 1154 - 1156, Headers are written
verbatim and allow CR/LF injection via toHeader and e.config.From; before
writing in the email builder (the block that calls fmt.Fprintf(&email, "To:
%s\n", toHeader), fmt.Fprintf(&email, "From: %s\n", e.config.From)), validate
and sanitize those values by rejecting or removing any '\r' or '\n' (or using
strict parsing like net/mail.Address parsing) and normalize/escape or return an
error for invalid input; implement a small helper (e.g., sanitizeHeaderValue or
parseAndFormatAddress) and use it for toHeader and e.config.From so no raw CR/LF
can be emitted into the MIME header.

Comment on lines 447 to +457
if _, err := io.Copy(outFile, tarReader); err != nil {
outFile.Close()
if closeErr := outFile.Close(); closeErr != nil {
logger.Warning("Cannot close partially restored file %s: %v", target, closeErr)
}
logger.Warning("Cannot write file %s: %v", target, err)
continue
}
outFile.Close()
if err := outFile.Close(); err != nil {
logger.Warning("Cannot close restored file %s: %v", target, err)
continue
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove the target when a restore write or close fails.

This path logs and continues, but it leaves the partially restored file in place. For rollback code that's dangerous: a truncated config is usually worse than no restore at all. Delete target on both the io.Copy and Close failure branches.

Suggested fix
 		if _, err := io.Copy(outFile, tarReader); err != nil {
 			if closeErr := outFile.Close(); closeErr != nil {
 				logger.Warning("Cannot close partially restored file %s: %v", target, closeErr)
 			}
+			if removeErr := safetyFS.Remove(target); removeErr != nil && !os.IsNotExist(removeErr) {
+				logger.Warning("Cannot remove partially restored file %s: %v", target, removeErr)
+			}
 			logger.Warning("Cannot write file %s: %v", target, err)
 			continue
 		}
 		if err := outFile.Close(); err != nil {
+			if removeErr := safetyFS.Remove(target); removeErr != nil && !os.IsNotExist(removeErr) {
+				logger.Warning("Cannot remove incompletely restored file %s: %v", target, removeErr)
+			}
 			logger.Warning("Cannot close restored file %s: %v", target, err)
 			continue
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if _, err := io.Copy(outFile, tarReader); err != nil {
outFile.Close()
if closeErr := outFile.Close(); closeErr != nil {
logger.Warning("Cannot close partially restored file %s: %v", target, closeErr)
}
logger.Warning("Cannot write file %s: %v", target, err)
continue
}
outFile.Close()
if err := outFile.Close(); err != nil {
logger.Warning("Cannot close restored file %s: %v", target, err)
continue
}
if _, err := io.Copy(outFile, tarReader); err != nil {
if closeErr := outFile.Close(); closeErr != nil {
logger.Warning("Cannot close partially restored file %s: %v", target, closeErr)
}
if removeErr := safetyFS.Remove(target); removeErr != nil && !os.IsNotExist(removeErr) {
logger.Warning("Cannot remove partially restored file %s: %v", target, removeErr)
}
logger.Warning("Cannot write file %s: %v", target, err)
continue
}
if err := outFile.Close(); err != nil {
if removeErr := safetyFS.Remove(target); removeErr != nil && !os.IsNotExist(removeErr) {
logger.Warning("Cannot remove incompletely restored file %s: %v", target, removeErr)
}
logger.Warning("Cannot close restored file %s: %v", target, err)
continue
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/backup_safety.go` around lines 447 - 457, When a write
(io.Copy from tarReader into outFile) or the subsequent outFile.Close() fails in
the restore loop, remove the partially written target file before continuing;
update the failure branches inside the block that calls io.Copy(...) and the
block that calls outFile.Close() to call os.Remove(target) (handling/remove and
logging any Remove error) after attempting to close the file and before
continuing, referencing the existing variables outFile, target, tarReader and
logger.Warning for locating and logging the removal step.

Comment on lines +155 to +163
func parseConfigPath(line string) (string, bool) {
if !strings.HasPrefix(line, "path ") {
return "", false
}
parts := strings.Fields(line)
if len(parts) < 2 {
return "", false
}
return parts[1], true
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Parse the full path value instead of the second token.

strings.Fields truncates path /mnt/my store to /mnt/my, so directory recreation will target the wrong location for paths containing spaces. Slice off the "path " prefix and trim the remainder instead.

Suggested fix
 func parseConfigPath(line string) (string, bool) {
 	if !strings.HasPrefix(line, "path ") {
 		return "", false
 	}
-	parts := strings.Fields(line)
-	if len(parts) < 2 {
+	value := strings.TrimSpace(strings.TrimPrefix(line, "path "))
+	if value == "" {
 		return "", false
 	}
-	return parts[1], true
+	return value, true
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func parseConfigPath(line string) (string, bool) {
if !strings.HasPrefix(line, "path ") {
return "", false
}
parts := strings.Fields(line)
if len(parts) < 2 {
return "", false
}
return parts[1], true
func parseConfigPath(line string) (string, bool) {
if !strings.HasPrefix(line, "path ") {
return "", false
}
value := strings.TrimSpace(strings.TrimPrefix(line, "path "))
if value == "" {
return "", false
}
return value, true
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/directory_recreation_config.go` around lines 155 - 163,
In parseConfigPath, stop using strings.Fields which splits paths with spaces;
instead detect the "path " prefix, slice off that prefix from the original line
and return strings.TrimSpace of the remainder so values like "path /mnt/my
store" return the full "/mnt/my store"; update parseConfigPath to use that
slicing and trimming logic while preserving the same boolean success return.

Comment on lines +20 to +27
uid, gid, found, err := lookupBackupOwnership(path, logger)
if err != nil || !found {
return err
}
if err := chownDatastorePath(path, uid, gid, logger); err != nil {
return err
}
return ensureDatastoreDirectoryMode(path, logger)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t skip the mode fix when backup lookup is unavailable.

If user.Lookup("backup") returns found == false, this returns early and never calls ensureDatastoreDirectoryMode. That leaves recreated datastore directories with potentially wrong permissions even though the chmod step is still safe and independent of chown.

Suggested fix
 	uid, gid, found, err := lookupBackupOwnership(path, logger)
-	if err != nil || !found {
+	if err != nil {
 		return err
 	}
+	if !found {
+		return ensureDatastoreDirectoryMode(path, logger)
+	}
 	if err := chownDatastorePath(path, uid, gid, logger); err != nil {
 		return err
 	}
 	return ensureDatastoreDirectoryMode(path, logger)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
uid, gid, found, err := lookupBackupOwnership(path, logger)
if err != nil || !found {
return err
}
if err := chownDatastorePath(path, uid, gid, logger); err != nil {
return err
}
return ensureDatastoreDirectoryMode(path, logger)
uid, gid, found, err := lookupBackupOwnership(path, logger)
if err != nil {
return err
}
if !found {
return ensureDatastoreDirectoryMode(path, logger)
}
if err := chownDatastorePath(path, uid, gid, logger); err != nil {
return err
}
return ensureDatastoreDirectoryMode(path, logger)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/directory_recreation_ownership.go` around lines 20 -
27, The current flow returns early when lookupBackupOwnership(path, logger)
yields found==false, skipping ensureDatastoreDirectoryMode; change it so only a
real error from lookup causes an immediate return (keep "if err != nil { return
err }"), but if found==false do not return—skip calling chownDatastorePath and
proceed to call ensureDatastoreDirectoryMode(path, logger). If found==true, call
chownDatastorePath(uid,gid,logger) and handle its error as before, then always
call ensureDatastoreDirectoryMode and return its result; reference
lookupBackupOwnership, chownDatastorePath, and ensureDatastoreDirectoryMode to
locate and adjust the logic.

Comment on lines +131 to +147
func (w *restoreUIWorkflowRun) prepareRestoreServices() (func(), error) {
var cleanups []func()
if cleanup, err := w.preparePVEClusterRestore(); err != nil {
return nil, err
} else if cleanup != nil {
cleanups = append(cleanups, cleanup)
}
if cleanup, err := w.preparePBSServices(); err != nil {
return nil, err
} else if cleanup != nil {
cleanups = append(cleanups, cleanup)
}
return func() {
for i := len(cleanups) - 1; i >= 0; i-- {
cleanups[i]()
}
}, nil
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Run already-registered cleanups when a later prepare step fails.

If preparePVEClusterRestore() succeeds and preparePBSServices() then errors or returns ErrRestoreAborted, this function drops the accumulated cleanup and returns nil, leaving the earlier service stop in effect. That can strand the host in a degraded state before restore even starts.

Suggested fix
 func (w *restoreUIWorkflowRun) prepareRestoreServices() (func(), error) {
 	var cleanups []func()
+	runCleanups := func() {
+		for i := len(cleanups) - 1; i >= 0; i-- {
+			cleanups[i]()
+		}
+	}
 	if cleanup, err := w.preparePVEClusterRestore(); err != nil {
 		return nil, err
 	} else if cleanup != nil {
 		cleanups = append(cleanups, cleanup)
 	}
 	if cleanup, err := w.preparePBSServices(); err != nil {
+		runCleanups()
 		return nil, err
 	} else if cleanup != nil {
 		cleanups = append(cleanups, cleanup)
 	}
-	return func() {
-		for i := len(cleanups) - 1; i >= 0; i-- {
-			cleanups[i]()
-		}
-	}, nil
+	return runCleanups, nil
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (w *restoreUIWorkflowRun) prepareRestoreServices() (func(), error) {
var cleanups []func()
if cleanup, err := w.preparePVEClusterRestore(); err != nil {
return nil, err
} else if cleanup != nil {
cleanups = append(cleanups, cleanup)
}
if cleanup, err := w.preparePBSServices(); err != nil {
return nil, err
} else if cleanup != nil {
cleanups = append(cleanups, cleanup)
}
return func() {
for i := len(cleanups) - 1; i >= 0; i-- {
cleanups[i]()
}
}, nil
func (w *restoreUIWorkflowRun) prepareRestoreServices() (func(), error) {
var cleanups []func()
runCleanups := func() {
for i := len(cleanups) - 1; i >= 0; i-- {
cleanups[i]()
}
}
if cleanup, err := w.preparePVEClusterRestore(); err != nil {
return nil, err
} else if cleanup != nil {
cleanups = append(cleanups, cleanup)
}
if cleanup, err := w.preparePBSServices(); err != nil {
runCleanups()
return nil, err
} else if cleanup != nil {
cleanups = append(cleanups, cleanup)
}
return runCleanups, nil
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/restore_workflow_ui_backups_services.go` around lines
131 - 147, prepareRestoreServices currently accumulates cleanup funcs from
preparePVEClusterRestore and preparePBSServices but if a later prepare call
returns an error (including ErrRestoreAborted) it returns early without running
the already-registered cleanups, leaving services stopped; fix by ensuring any
previously appended cleanup functions are executed before returning an error
from prepareRestoreServices (e.g., run cleanups in reverse order immediately
when a subsequent prepare returns non-nil error), referencing
prepareRestoreServices, preparePVEClusterRestore, preparePBSServices and
ErrRestoreAborted to locate the relevant logic.

Comment on lines +55 to +63
func (w *restoreUIWorkflowRun) run() error {
fallbackToFullRestore, err := w.prepareBundleAndPlan()
if err != nil {
return err
}
if w.prepared != nil {
defer w.prepared.Cleanup()
}
if fallbackToFullRestore {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Defer prepared-bundle cleanup before the error return.

If prepareBundleAndPlan() allocates w.prepared and then fails, the early return skips Cleanup() and leaks the prepared bundle/tempdir. Register the defer immediately after the call, before checking err.

Suggested fix
 func (w *restoreUIWorkflowRun) run() error {
 	fallbackToFullRestore, err := w.prepareBundleAndPlan()
+	if w.prepared != nil {
+		defer w.prepared.Cleanup()
+	}
 	if err != nil {
 		return err
 	}
-	if w.prepared != nil {
-		defer w.prepared.Cleanup()
-	}
 	if fallbackToFullRestore {
 		return runFullRestoreWithUI(w.ctx, w.ui, w.candidate, w.prepared, w.destRoot, w.logger, w.cfg.DryRun)
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (w *restoreUIWorkflowRun) run() error {
fallbackToFullRestore, err := w.prepareBundleAndPlan()
if err != nil {
return err
}
if w.prepared != nil {
defer w.prepared.Cleanup()
}
if fallbackToFullRestore {
func (w *restoreUIWorkflowRun) run() error {
fallbackToFullRestore, err := w.prepareBundleAndPlan()
if w.prepared != nil {
defer w.prepared.Cleanup()
}
if err != nil {
return err
}
if fallbackToFullRestore {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/restore_workflow_ui_run.go` around lines 55 - 63, After
calling prepareBundleAndPlan() in restoreUIWorkflowRun.run(), register the defer
to call w.prepared.Cleanup() immediately after w.prepared may be set (i.e.,
right after the prepareBundleAndPlan() call) so that cleanup runs even if
prepareBundleAndPlan() later returns an error; ensure you still check err and
return it, but move the defer placement to occur before the err check and guard
it with if w.prepared != nil to avoid nil dereference.

Comment on lines +25 to +27
if !addNow {
logger.Warning("Access control selected without %s; WebAuthn users may require re-enrollment if the UI origin changes", strings.Join(addNames, ", "))
return selected, nil
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t gate the TFA compatibility prompt on logger.

logger is only needed for the decline warning, but Line 34 makes it a hard precondition for prompting. If a caller passes nil, this silently skips the new safeguard and returns the original category set unchanged.

Suggested fix
 func maybeAddRecommendedCategoriesForTFA(ctx context.Context, ui RestoreWorkflowUI, logger *logging.Logger, selected []Category, available []Category) ([]Category, error) {
 	if !shouldPromptForTFARecommendations(ui, logger, selected) {
 		return selected, nil
 	}
@@
 	if !addNow {
-		logger.Warning("Access control selected without %s; WebAuthn users may require re-enrollment if the UI origin changes", strings.Join(addNames, ", "))
+		if logger != nil {
+			logger.Warning("Access control selected without %s; WebAuthn users may require re-enrollment if the UI origin changes", strings.Join(addNames, ", "))
+		}
 		return selected, nil
 	}
 	return dedupeCategoriesByID(append(selected, addCategories...)), nil
 }
 
 func shouldPromptForTFARecommendations(ui RestoreWorkflowUI, logger *logging.Logger, selected []Category) bool {
-	return ui != nil &&
-		logger != nil &&
-		(hasCategoryID(selected, "pve_access_control") || hasCategoryID(selected, "pbs_access_control"))
+	return ui != nil &&
+		(hasCategoryID(selected, "pve_access_control") || hasCategoryID(selected, "pbs_access_control"))
 }

Also applies to: 32-35

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/restore_workflow_ui_tfa.go` around lines 25 - 27, The
prompt for TFA compatibility should not be skipped when logger is nil; ensure
the decision to ask the user depends only on addNow (and addNames) not on
logger. Modify the code around addNow/logger: move the user prompt (the logic
that mutates selected) out of any logger != nil guard so that the prompt runs
whenever addNow is false, and restrict logger usage to only emitting the Warning
about re-enrollment (use logger.Warning(..., strings.Join(addNames, ", ")) only
if logger != nil). Keep references to addNow, addNames, logger and selected so
you update the correct block in restore_workflow_ui_tfa.go.

Comment on lines +65 to 77
func normalizeRestoreWorkflowUIError(ctx context.Context, logger *logging.Logger, err error) error {
if err == nil {
return nil
}
logging.DebugStep(logger, "safe cluster apply (ui)", "pvesh=%s", pveshPath)

currentNode, _ := os.Hostname()
currentNode = shortHost(currentNode)
if strings.TrimSpace(currentNode) == "" {
currentNode = "localhost"
}
logging.DebugStep(logger, "safe cluster apply (ui)", "current_node=%s", currentNode)

logger.Info("")
logger.Info("SAFE cluster restore: applying configs via pvesh (node=%s)", currentNode)

// Datacenter-wide objects (SAFE apply):
// - resource mappings (used by VM configs via mapping=<id>)
// - resource pools (definitions + membership)
if mapErr := maybeApplyPVEClusterResourceMappingsWithUI(ctx, ui, logger, exportRoot); mapErr != nil {
logger.Warning("SAFE apply: resource mappings: %v", mapErr)
}

pools, poolsErr := readPVEPoolsFromExportUserCfg(exportRoot)
if poolsErr != nil {
logger.Warning("SAFE apply: failed to parse pools from export: %v", poolsErr)
pools = nil
}
applyPools := false
allowPoolMove := false
if len(pools) > 0 {
poolNames := summarizePoolIDs(pools, 10)
message := fmt.Sprintf("Found %d pool(s) in exported user.cfg.\n\nPools: %s\n\nApply pool definitions now? (Membership will be applied later in this SAFE apply flow.)", len(pools), poolNames)
ok, promptErr := ui.ConfirmAction(ctx, "Apply PVE resource pools (merge)", message, "Apply now", "Skip apply", 0, false)
if promptErr != nil {
return promptErr
}
applyPools = ok
logging.DebugStep(logger, "safe cluster apply (ui)", "User choice: apply_pools=%v (pools=%d)", applyPools, len(pools))
if applyPools {
if anyPoolHasVMs(pools) {
moveMsg := "Allow moving guests from other pools to match the backup? This may change the current pool assignment of existing VMs/CTs."
move, moveErr := ui.ConfirmAction(ctx, "Pools: allow move (VM/CT)", moveMsg, "Allow move", "Don't move", 0, false)
if moveErr != nil {
return moveErr
}
allowPoolMove = move
}

applied, failed, applyErr := applyPVEPoolsDefinitions(ctx, logger, pools)
if applyErr != nil {
logger.Warning("Pools apply (definitions) encountered errors: %v", applyErr)
}
logger.Info("Pools apply (definitions) completed: ok=%d failed=%d", applied, failed)
}
}

sourceNode := currentNode
logging.DebugStep(logger, "safe cluster apply (ui)", "List exported node directories under %s", filepath.Join(exportRoot, "etc/pve/nodes"))
exportNodes, nodesErr := listExportNodeDirs(exportRoot)
if nodesErr != nil {
logger.Warning("Failed to inspect exported node directories: %v", nodesErr)
} else if len(exportNodes) > 0 {
logging.DebugStep(logger, "safe cluster apply (ui)", "export_nodes=%s", strings.Join(exportNodes, ","))
} else {
logging.DebugStep(logger, "safe cluster apply (ui)", "No exported node directories found")
}

if len(exportNodes) > 0 && !stringSliceContains(exportNodes, sourceNode) {
logging.DebugStep(logger, "safe cluster apply (ui)", "Node mismatch: current_node=%s export_nodes=%s", currentNode, strings.Join(exportNodes, ","))
logger.Warning("SAFE cluster restore: VM/CT configs not found for current node %s in export; available nodes: %s", currentNode, strings.Join(exportNodes, ", "))
if len(exportNodes) == 1 {
sourceNode = exportNodes[0]
logging.DebugStep(logger, "safe cluster apply (ui)", "Auto-select source node: %s", sourceNode)
logger.Info("SAFE cluster restore: using exported node %s as VM/CT source, applying to current node %s", sourceNode, currentNode)
} else {
for _, node := range exportNodes {
qemuCount, lxcCount := countVMConfigsForNode(exportRoot, node)
logging.DebugStep(logger, "safe cluster apply (ui)", "Export node candidate: %s (qemu=%d, lxc=%d)", node, qemuCount, lxcCount)
}
selected, selErr := ui.SelectExportNode(ctx, exportRoot, currentNode, exportNodes)
if selErr != nil {
return selErr
}
if strings.TrimSpace(selected) == "" {
logging.DebugStep(logger, "safe cluster apply (ui)", "User selected: skip VM/CT apply (no source node)")
logger.Info("Skipping VM/CT apply (no source node selected)")
sourceNode = ""
} else {
sourceNode = selected
logging.DebugStep(logger, "safe cluster apply (ui)", "User selected source node: %s", sourceNode)
logger.Info("SAFE cluster restore: selected exported node %s as VM/CT source, applying to current node %s", sourceNode, currentNode)
}
}
}
logging.DebugStep(logger, "safe cluster apply (ui)", "Selected VM/CT source node: %q (current_node=%q)", sourceNode, currentNode)

var vmEntries []vmEntry
if strings.TrimSpace(sourceNode) != "" {
logging.DebugStep(logger, "safe cluster apply (ui)", "Scan VM/CT configs in export (source_node=%s)", sourceNode)
vmEntries, err = scanVMConfigs(exportRoot, sourceNode)
if err != nil {
logger.Warning("Failed to scan VM configs: %v", err)
vmEntries = nil
} else {
logging.DebugStep(logger, "safe cluster apply (ui)", "VM/CT configs found=%d (source_node=%s)", len(vmEntries), sourceNode)
}
}

if len(vmEntries) > 0 {
applyVMs, promptErr := ui.ConfirmApplyVMConfigs(ctx, sourceNode, currentNode, len(vmEntries))
if promptErr != nil {
return promptErr
}
logging.DebugStep(logger, "safe cluster apply (ui)", "User choice: apply_vms=%v (entries=%d)", applyVMs, len(vmEntries))
if applyVMs {
applied, failed := applyVMConfigs(ctx, vmEntries, logger)
logger.Info("VM/CT apply completed: ok=%d failed=%d", applied, failed)
} else {
logger.Info("Skipping VM/CT apply")
}
} else {
if strings.TrimSpace(sourceNode) == "" {
logger.Info("No VM/CT configs applied (no source node selected)")
} else {
logger.Info("No VM/CT configs found for node %s in export", sourceNode)
}
}

skipStorageDatacenter := plan != nil && plan.HasCategoryID("storage_pve")
if skipStorageDatacenter {
logging.DebugStep(logger, "safe cluster apply (ui)", "Skip storage/datacenter apply: handled by storage_pve staged restore")
logger.Info("Skipping storage/datacenter apply (handled by storage_pve staged restore)")
} else {
storageCfg := filepath.Join(exportRoot, "etc/pve/storage.cfg")
logging.DebugStep(logger, "safe cluster apply (ui)", "Check export: storage.cfg (%s)", storageCfg)
storageInfo, storageErr := restoreFS.Stat(storageCfg)
if storageErr == nil && !storageInfo.IsDir() {
logging.DebugStep(logger, "safe cluster apply (ui)", "storage.cfg found (size=%d)", storageInfo.Size())
applyStorage, promptErr := ui.ConfirmApplyStorageCfg(ctx, storageCfg)
if promptErr != nil {
return promptErr
}
logging.DebugStep(logger, "safe cluster apply (ui)", "User choice: apply_storage=%v", applyStorage)
if applyStorage {
applied, failed, err := applyStorageCfg(ctx, storageCfg, logger)
logging.DebugStep(logger, "safe cluster apply (ui)", "Storage apply result: ok=%d failed=%d err=%v", applied, failed, err)
if err != nil {
logger.Warning("Storage apply encountered errors: %v", err)
}
logger.Info("Storage apply completed: ok=%d failed=%d", applied, failed)
} else {
logger.Info("Skipping storage.cfg apply")
}
} else {
logging.DebugStep(logger, "safe cluster apply (ui)", "storage.cfg not found (err=%v)", storageErr)
logger.Info("No storage.cfg found in export")
}

dcCfg := filepath.Join(exportRoot, "etc/pve/datacenter.cfg")
logging.DebugStep(logger, "safe cluster apply (ui)", "Check export: datacenter.cfg (%s)", dcCfg)
dcInfo, dcErr := restoreFS.Stat(dcCfg)
if dcErr == nil && !dcInfo.IsDir() {
logging.DebugStep(logger, "safe cluster apply (ui)", "datacenter.cfg found (size=%d)", dcInfo.Size())
applyDC, promptErr := ui.ConfirmApplyDatacenterCfg(ctx, dcCfg)
if promptErr != nil {
return promptErr
}
logging.DebugStep(logger, "safe cluster apply (ui)", "User choice: apply_datacenter=%v", applyDC)
if applyDC {
logging.DebugStep(logger, "safe cluster apply (ui)", "Apply datacenter.cfg via pvesh")
if err := runPvesh(ctx, logger, []string{"set", "/cluster/config", "-conf", dcCfg}); err != nil {
logger.Warning("Failed to apply datacenter.cfg: %v", err)
} else {
logger.Info("datacenter.cfg applied successfully")
}
} else {
logger.Info("Skipping datacenter.cfg apply")
}
} else {
logging.DebugStep(logger, "safe cluster apply (ui)", "datacenter.cfg not found (err=%v)", dcErr)
logger.Info("No datacenter.cfg found in export")
}
if err == io.EOF {
logger.Warning("Restore input closed unexpectedly (EOF). This usually means the interactive UI lost access to stdin/TTY (e.g., SSH disconnect or non-interactive execution). Re-run with --restore --cli from an interactive shell.")
return ErrRestoreAborted
}

// Apply pool membership after VM configs and storage/datacenter apply.
if applyPools && len(pools) > 0 {
applied, failed, applyErr := applyPVEPoolsMembership(ctx, logger, pools, allowPoolMove)
if applyErr != nil {
logger.Warning("Pools apply (membership) encountered errors: %v", applyErr)
}
logger.Info("Pools apply (membership) completed: ok=%d failed=%d", applied, failed)
if restoreWorkflowInputAborted(ctx, err) {
return ErrRestoreAborted
}

return nil
return err
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use errors.Is(err, io.EOF) to handle wrapped EOFs.

err == io.EOF only matches a bare sentinel. Any caller in the workflow chain that wraps EOF (e.g. fmt.Errorf("read prompt: %w", io.EOF)) will bypass both the operator-facing warning and the ErrRestoreAborted normalization, leaving the original wrapped error to surface to the caller. The rest of this helper already uses errors.Is for its sentinels, so this is also a consistency win.

🛡️ Proposed fix
-	if err == io.EOF {
+	if errors.Is(err, io.EOF) {
 		logger.Warning("Restore input closed unexpectedly (EOF). This usually means the interactive UI lost access to stdin/TTY (e.g., SSH disconnect or non-interactive execution). Re-run with --restore --cli from an interactive shell.")
 		return ErrRestoreAborted
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func normalizeRestoreWorkflowUIError(ctx context.Context, logger *logging.Logger, err error) error {
if err == nil {
return nil
}
logging.DebugStep(logger, "safe cluster apply (ui)", "pvesh=%s", pveshPath)
currentNode, _ := os.Hostname()
currentNode = shortHost(currentNode)
if strings.TrimSpace(currentNode) == "" {
currentNode = "localhost"
}
logging.DebugStep(logger, "safe cluster apply (ui)", "current_node=%s", currentNode)
logger.Info("")
logger.Info("SAFE cluster restore: applying configs via pvesh (node=%s)", currentNode)
// Datacenter-wide objects (SAFE apply):
// - resource mappings (used by VM configs via mapping=<id>)
// - resource pools (definitions + membership)
if mapErr := maybeApplyPVEClusterResourceMappingsWithUI(ctx, ui, logger, exportRoot); mapErr != nil {
logger.Warning("SAFE apply: resource mappings: %v", mapErr)
}
pools, poolsErr := readPVEPoolsFromExportUserCfg(exportRoot)
if poolsErr != nil {
logger.Warning("SAFE apply: failed to parse pools from export: %v", poolsErr)
pools = nil
}
applyPools := false
allowPoolMove := false
if len(pools) > 0 {
poolNames := summarizePoolIDs(pools, 10)
message := fmt.Sprintf("Found %d pool(s) in exported user.cfg.\n\nPools: %s\n\nApply pool definitions now? (Membership will be applied later in this SAFE apply flow.)", len(pools), poolNames)
ok, promptErr := ui.ConfirmAction(ctx, "Apply PVE resource pools (merge)", message, "Apply now", "Skip apply", 0, false)
if promptErr != nil {
return promptErr
}
applyPools = ok
logging.DebugStep(logger, "safe cluster apply (ui)", "User choice: apply_pools=%v (pools=%d)", applyPools, len(pools))
if applyPools {
if anyPoolHasVMs(pools) {
moveMsg := "Allow moving guests from other pools to match the backup? This may change the current pool assignment of existing VMs/CTs."
move, moveErr := ui.ConfirmAction(ctx, "Pools: allow move (VM/CT)", moveMsg, "Allow move", "Don't move", 0, false)
if moveErr != nil {
return moveErr
}
allowPoolMove = move
}
applied, failed, applyErr := applyPVEPoolsDefinitions(ctx, logger, pools)
if applyErr != nil {
logger.Warning("Pools apply (definitions) encountered errors: %v", applyErr)
}
logger.Info("Pools apply (definitions) completed: ok=%d failed=%d", applied, failed)
}
}
sourceNode := currentNode
logging.DebugStep(logger, "safe cluster apply (ui)", "List exported node directories under %s", filepath.Join(exportRoot, "etc/pve/nodes"))
exportNodes, nodesErr := listExportNodeDirs(exportRoot)
if nodesErr != nil {
logger.Warning("Failed to inspect exported node directories: %v", nodesErr)
} else if len(exportNodes) > 0 {
logging.DebugStep(logger, "safe cluster apply (ui)", "export_nodes=%s", strings.Join(exportNodes, ","))
} else {
logging.DebugStep(logger, "safe cluster apply (ui)", "No exported node directories found")
}
if len(exportNodes) > 0 && !stringSliceContains(exportNodes, sourceNode) {
logging.DebugStep(logger, "safe cluster apply (ui)", "Node mismatch: current_node=%s export_nodes=%s", currentNode, strings.Join(exportNodes, ","))
logger.Warning("SAFE cluster restore: VM/CT configs not found for current node %s in export; available nodes: %s", currentNode, strings.Join(exportNodes, ", "))
if len(exportNodes) == 1 {
sourceNode = exportNodes[0]
logging.DebugStep(logger, "safe cluster apply (ui)", "Auto-select source node: %s", sourceNode)
logger.Info("SAFE cluster restore: using exported node %s as VM/CT source, applying to current node %s", sourceNode, currentNode)
} else {
for _, node := range exportNodes {
qemuCount, lxcCount := countVMConfigsForNode(exportRoot, node)
logging.DebugStep(logger, "safe cluster apply (ui)", "Export node candidate: %s (qemu=%d, lxc=%d)", node, qemuCount, lxcCount)
}
selected, selErr := ui.SelectExportNode(ctx, exportRoot, currentNode, exportNodes)
if selErr != nil {
return selErr
}
if strings.TrimSpace(selected) == "" {
logging.DebugStep(logger, "safe cluster apply (ui)", "User selected: skip VM/CT apply (no source node)")
logger.Info("Skipping VM/CT apply (no source node selected)")
sourceNode = ""
} else {
sourceNode = selected
logging.DebugStep(logger, "safe cluster apply (ui)", "User selected source node: %s", sourceNode)
logger.Info("SAFE cluster restore: selected exported node %s as VM/CT source, applying to current node %s", sourceNode, currentNode)
}
}
}
logging.DebugStep(logger, "safe cluster apply (ui)", "Selected VM/CT source node: %q (current_node=%q)", sourceNode, currentNode)
var vmEntries []vmEntry
if strings.TrimSpace(sourceNode) != "" {
logging.DebugStep(logger, "safe cluster apply (ui)", "Scan VM/CT configs in export (source_node=%s)", sourceNode)
vmEntries, err = scanVMConfigs(exportRoot, sourceNode)
if err != nil {
logger.Warning("Failed to scan VM configs: %v", err)
vmEntries = nil
} else {
logging.DebugStep(logger, "safe cluster apply (ui)", "VM/CT configs found=%d (source_node=%s)", len(vmEntries), sourceNode)
}
}
if len(vmEntries) > 0 {
applyVMs, promptErr := ui.ConfirmApplyVMConfigs(ctx, sourceNode, currentNode, len(vmEntries))
if promptErr != nil {
return promptErr
}
logging.DebugStep(logger, "safe cluster apply (ui)", "User choice: apply_vms=%v (entries=%d)", applyVMs, len(vmEntries))
if applyVMs {
applied, failed := applyVMConfigs(ctx, vmEntries, logger)
logger.Info("VM/CT apply completed: ok=%d failed=%d", applied, failed)
} else {
logger.Info("Skipping VM/CT apply")
}
} else {
if strings.TrimSpace(sourceNode) == "" {
logger.Info("No VM/CT configs applied (no source node selected)")
} else {
logger.Info("No VM/CT configs found for node %s in export", sourceNode)
}
}
skipStorageDatacenter := plan != nil && plan.HasCategoryID("storage_pve")
if skipStorageDatacenter {
logging.DebugStep(logger, "safe cluster apply (ui)", "Skip storage/datacenter apply: handled by storage_pve staged restore")
logger.Info("Skipping storage/datacenter apply (handled by storage_pve staged restore)")
} else {
storageCfg := filepath.Join(exportRoot, "etc/pve/storage.cfg")
logging.DebugStep(logger, "safe cluster apply (ui)", "Check export: storage.cfg (%s)", storageCfg)
storageInfo, storageErr := restoreFS.Stat(storageCfg)
if storageErr == nil && !storageInfo.IsDir() {
logging.DebugStep(logger, "safe cluster apply (ui)", "storage.cfg found (size=%d)", storageInfo.Size())
applyStorage, promptErr := ui.ConfirmApplyStorageCfg(ctx, storageCfg)
if promptErr != nil {
return promptErr
}
logging.DebugStep(logger, "safe cluster apply (ui)", "User choice: apply_storage=%v", applyStorage)
if applyStorage {
applied, failed, err := applyStorageCfg(ctx, storageCfg, logger)
logging.DebugStep(logger, "safe cluster apply (ui)", "Storage apply result: ok=%d failed=%d err=%v", applied, failed, err)
if err != nil {
logger.Warning("Storage apply encountered errors: %v", err)
}
logger.Info("Storage apply completed: ok=%d failed=%d", applied, failed)
} else {
logger.Info("Skipping storage.cfg apply")
}
} else {
logging.DebugStep(logger, "safe cluster apply (ui)", "storage.cfg not found (err=%v)", storageErr)
logger.Info("No storage.cfg found in export")
}
dcCfg := filepath.Join(exportRoot, "etc/pve/datacenter.cfg")
logging.DebugStep(logger, "safe cluster apply (ui)", "Check export: datacenter.cfg (%s)", dcCfg)
dcInfo, dcErr := restoreFS.Stat(dcCfg)
if dcErr == nil && !dcInfo.IsDir() {
logging.DebugStep(logger, "safe cluster apply (ui)", "datacenter.cfg found (size=%d)", dcInfo.Size())
applyDC, promptErr := ui.ConfirmApplyDatacenterCfg(ctx, dcCfg)
if promptErr != nil {
return promptErr
}
logging.DebugStep(logger, "safe cluster apply (ui)", "User choice: apply_datacenter=%v", applyDC)
if applyDC {
logging.DebugStep(logger, "safe cluster apply (ui)", "Apply datacenter.cfg via pvesh")
if err := runPvesh(ctx, logger, []string{"set", "/cluster/config", "-conf", dcCfg}); err != nil {
logger.Warning("Failed to apply datacenter.cfg: %v", err)
} else {
logger.Info("datacenter.cfg applied successfully")
}
} else {
logger.Info("Skipping datacenter.cfg apply")
}
} else {
logging.DebugStep(logger, "safe cluster apply (ui)", "datacenter.cfg not found (err=%v)", dcErr)
logger.Info("No datacenter.cfg found in export")
}
if err == io.EOF {
logger.Warning("Restore input closed unexpectedly (EOF). This usually means the interactive UI lost access to stdin/TTY (e.g., SSH disconnect or non-interactive execution). Re-run with --restore --cli from an interactive shell.")
return ErrRestoreAborted
}
// Apply pool membership after VM configs and storage/datacenter apply.
if applyPools && len(pools) > 0 {
applied, failed, applyErr := applyPVEPoolsMembership(ctx, logger, pools, allowPoolMove)
if applyErr != nil {
logger.Warning("Pools apply (membership) encountered errors: %v", applyErr)
}
logger.Info("Pools apply (membership) completed: ok=%d failed=%d", applied, failed)
if restoreWorkflowInputAborted(ctx, err) {
return ErrRestoreAborted
}
return nil
return err
}
func normalizeRestoreWorkflowUIError(ctx context.Context, logger *logging.Logger, err error) error {
if err == nil {
return nil
}
if errors.Is(err, io.EOF) {
logger.Warning("Restore input closed unexpectedly (EOF). This usually means the interactive UI lost access to stdin/TTY (e.g., SSH disconnect or non-interactive execution). Re-run with --restore --cli from an interactive shell.")
return ErrRestoreAborted
}
if restoreWorkflowInputAborted(ctx, err) {
return ErrRestoreAborted
}
return err
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/restore_workflow_ui.go` around lines 65 - 77, Update
normalizeRestoreWorkflowUIError to detect EOF using errors.Is(err, io.EOF)
instead of the direct equality check; specifically change the condition that
currently does "err == io.EOF" so wrapped EOFs (e.g., fmt.Errorf(...%w...))
trigger the logger.Warning call and return ErrRestoreAborted. Keep the existing
logger.Warning message and the subsequent call to
restoreWorkflowInputAborted(ctx, err) unchanged so behavior and other sentinel
handling remain consistent.

Comment on lines +205 to +209
if err := f.Close(); err != nil {
d.logger.Debug("Cannot close test file for ownership check: %v", err)
return false
}
defer func() { _ = os.Remove(testFile) }()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: Deferred cleanup won't execute when close fails.

The defer func() { _ = os.Remove(testFile) }() on line 209 is placed after the early return on line 207, so it will never be registered (and thus never execute) if f.Close() fails. This leaves the test file un-cleaned in that error path.

Move the defer before the close check:

 	f, err := os.Create(testFile)
 	if err != nil {
 		d.logger.Debug("Cannot create test file for ownership check: %v", err)
 		return false
 	}
+	defer func() { _ = os.Remove(testFile) }()
 	if err := f.Close(); err != nil {
 		d.logger.Debug("Cannot close test file for ownership check: %v", err)
 		return false
 	}
-	defer func() { _ = os.Remove(testFile) }()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/storage/filesystem.go` around lines 205 - 209, The defer that
removes the temporary test file should be registered before the early return so
it always runs; move the defer func() { _ = os.Remove(testFile) }() to
immediately after the test file is created (or at least before the f.Close()
error check) so that regardless of f.Close() failing the cleanup will execute,
and preserve the existing d.logger.Debug call and return false behavior when
f.Close() returns an error.

@tis24dev tis24dev merged commit b7d9b0a into main May 12, 2026
11 of 13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant