
feat: S3 object-storage command suite + serverless (container/batchjob, pre-release) #45

Merged
hi-lei merged 26 commits into main from feat/add-serverless-command
Jun 1, 2026



@hi-lei hi-lei commented Jun 1, 2026

Summary

Adds the S3-compatible object-storage command suite (verda s3 …, GA) and the
serverless commands (container/batchjob, gated pre-release behind a flag),
plus supporting deps and VM cleanup.

S3 / object storage (GA)

Full AWS-CLI-style surface over Verda's S3 endpoint, using a separate credential
set (keys prefixed verda_s3_):

  • Commands: configure, show, ls, cp, mv, rm, sync, mb, rb, presign, plus ls-uploads/abort-uploads for resume management.
  • Resumable large transfers: single-file cp uploads and downloads are multipart, parallel, and resumable from a local checkpoint (re-run the same command). RGW-safe checksums; same-host advisory lock prevents concurrent transfers of the same object. Live progress bar + transfer rate.
  • Interactive TUI (dual-mode): omit the target on a terminal to drive a TUI; flags/--agent/-o json stay fully non-interactive.
    • ls → folder browser (drill in, per-object actions, multi-download)
    • cp → upload wizard (source → bucket → folder → confirm)
    • mb → name prompt · rb → bucket picker + confirm
    • rm → folder browser with multi-select delete
    • mv → stepped S3→S3 move/rename wizard (Step N of M, Esc=back, Ctrl+C=exit)
  • Browser downloads default to the OS Downloads folder and never overwrite an unrelated local file, while still resuming an interrupted download of the same object.
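
A minimal sketch of the resume decision described above, assuming a checkpoint that records the object size, part size, ETag, and completed parts. The `checkpoint` type and `remainingParts` function are hypothetical names for illustration, not the code in this PR.

```go
package main

import "fmt"

// checkpoint is a hypothetical local record of an interrupted transfer.
type checkpoint struct {
	Size     int64        // object size when the transfer started
	PartSize int64        // part size used by the original run
	ETag     string       // object identity for downloads (empty for uploads)
	Done     map[int]bool // 1-based part numbers already transferred
}

// remainingParts lists the parts still to transfer. It resumes only when the
// checkpoint matches the current size, part size, and ETag; any mismatch
// restarts from part 1 instead of mixing incompatible parts.
func remainingParts(cp *checkpoint, size, partSize int64, etag string) ([]int, bool) {
	if partSize <= 0 {
		return nil, false
	}
	total := int((size + partSize - 1) / partSize) // ceiling division
	resumed := cp != nil && cp.Size == size && cp.PartSize == partSize && cp.ETag == etag
	parts := make([]int, 0, total)
	for n := 1; n <= total; n++ {
		if resumed && cp.Done[n] {
			continue // already transferred by an earlier run
		}
		parts = append(parts, n)
	}
	return parts, resumed
}

func main() {
	cp := &checkpoint{Size: 25, PartSize: 10, Done: map[int]bool{1: true, 2: true}}
	parts, resumed := remainingParts(cp, 25, 10, "")
	fmt.Println(parts, resumed) // same part size: only part 3 is left
	parts, resumed = remainingParts(cp, 25, 5, "")
	fmt.Println(parts, resumed) // part size changed: restart with all 5 parts
}
```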

Serverless (pre-release, gated)

container and batchjob commands with shared wizard steps, registered behind a
pre-release flag so they don't surface in the GA build.

Also

  • Deps: bump verdagostack to v1.4.1, drop the local replace.
  • VM/volume: remove the deprecated HDD storage option.
  • Register s3 unconditionally (GA); gate serverless.

Interactive UX contract

Standardized across all interactive commands: mandatory hint bar
(↑/↓ navigate · type to filter · enter select · esc back · ctrl+c exit),
Esc = soft back, Ctrl+C = hard exit, never a confirmation dialog on either.
Codified in .ai/skills/new-command.md.

Testing

make build, make lint (0 issues), and make test (all packages) pass.
New unit tests cover the resumable upload/download state machine (incl.
part-size-change restart and ETag-change restart), the same-host lock, the
rm folder-browser delete, the mv wizard + index navigation, local-overwrite
resolution, and the object/bucket pickers. Two rounds of adversarial review
were run and findings fixed (no data-loss or security issues outstanding).

Notes

  • S3 uses separate credentials (verda s3 configure); independent of auth login.
  • Destructive rm/rb require --yes in --agent mode; cp/mv/sync follow AWS convention (no prompt).

🤖 Generated with Claude Code

hi-lei and others added 26 commits April 24, 2026 21:34
`verda serverless container` and `verda serverless batchjob` manage the
two serverless deployment shapes on Verda Cloud: always-on HTTPS endpoints
and one-shot batch jobs. The web UI's "Deployment type" radio maps to the
subcommand choice; each subcommand calls its own SDK service
(/container-deployments vs /job-deployments).

- container: create (22-step wizard + flags), list/describe/delete,
  pause/resume/restart/purge-queue
- batchjob: create (flags only; wizard is a follow-up), list/describe/delete,
  pause/resume/purge-queue. Cannot use spot; deadline is required.
- Client-side validation: reject :latest tags, require RFC-1123 deployment names,
  enforce an env-var name pattern, require absolute mount paths
- Scaling preset maps Instant=1 / Balanced=3 / Cost saver=6 / Custom into
  ScalingTriggers.QueueLoad.Threshold
- Tests cover name validation, :latest rejection, env/mount parsing, full
  preset mapping, validation errors, request-payload shape for both
  subcommands
- CLAUDE.md and README.md per the per-command docs convention
- Design: docs/plans/2026-04-24-serverless-container-design.md
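
The validation rules and preset mapping listed above could look roughly like the sketch below. All function names and the exact regular expressions are assumptions; in particular, the env-var pattern and the handling of untagged images are guesses, since the commit message does not spell them out.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

var (
	dnsLabel = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`) // RFC-1123 label
	envName  = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)        // assumed pattern
)

func validateName(name string) error {
	if len(name) == 0 || len(name) > 63 || !dnsLabel.MatchString(name) {
		return errors.New("name must be an RFC-1123 label")
	}
	return nil
}

// validateImage rejects an explicit :latest tag. Untagged images pass here;
// how the real CLI treats them is not stated in the source.
func validateImage(image string) error {
	ref := image
	if i := strings.Index(ref, "@"); i >= 0 {
		ref = ref[:i] // drop a digest suffix
	}
	if i := strings.LastIndex(ref, "/"); i >= 0 {
		ref = ref[i+1:] // drop registry[:port]/path so a port is not read as a tag
	}
	if i := strings.LastIndex(ref, ":"); i >= 0 && ref[i+1:] == "latest" {
		return errors.New("the :latest tag is not allowed")
	}
	return nil
}

func validateEnvName(name string) error {
	if !envName.MatchString(name) {
		return errors.New("invalid env-var name")
	}
	return nil
}

func validateMountPath(p string) error {
	if !strings.HasPrefix(p, "/") {
		return errors.New("mount path must be absolute")
	}
	return nil
}

// presetThreshold maps a scaling preset to the queue-load threshold.
func presetThreshold(preset string, custom int) (int, error) {
	switch preset {
	case "Instant":
		return 1, nil
	case "Balanced":
		return 3, nil
	case "Cost saver":
		return 6, nil
	case "Custom":
		return custom, nil
	}
	return 0, fmt.Errorf("unknown preset %q", preset)
}

func main() {
	fmt.Println(validateName("my-app"), validateImage("nginx:latest"))
	fmt.Println(validateEnvName("API_KEY"), validateMountPath("data"))
	fmt.Println(presetThreshold("Balanced", 0))
}
```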

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deployment type (Continuous vs Job) is a subcommand-level decision in
the CLI, so the same step factories drive both create wizards:

- wizard_shared.go — 10 shared step builders taking `*T` pointers:
  stepName, stepImage, stepCompute, stepComputeSize, stepRegistryCreds,
  stepPort, stepEnvVars, stepMaxReplicas, stepRequestTTL,
  stepSecretMounts. Plus durationStep + int validators.
- wizard.go — container-only steps: compute-type (spot), healthcheck
  on/off/port/path, min-replicas, concurrency, queue-preset +
  queue-load custom, CPU/GPU util triggers, scale-up/down delays.
  Drops ~350 lines vs the old file since nothing is duplicated.
- wizard_batchjob.go — batchjob flow (11 steps: 10 shared + deadline).
  Reuses the full shared factory surface; the only batchjob-specific
  step is stepBatchjobDeadline (required, > 0).
- batchjob_create.go — launches runBatchjobWizard when any of
  --name/--image/--compute/--deadline is missing in an interactive session.
  Prints renderBatchjobSummary + Confirm before the API call, mirroring
  the container flow.
- CLAUDE.md + README.md updated to reflect the split.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Promote container and batchjob to top-level commands (drop `verda serverless` parent)
- Add `verda` banner + logo on no-args / help path
- Read the version-update hint from an on-disk cache instead of fetching from GitHub on the hot path
- HTTP debug transport: `--debug` now logs request/response (with auth/JSON-secret redaction) for all commands, including failed calls
- `container list` and `vm list`: keep details visible after selection with explicit "Back to list / Exit" gate
- `container list`: parallel per-deployment status fetch (5 concurrent) cached for 30s, `--status` substring filter, interactive selector with type-to-filter and describe drill-down loop
- Serverless wizard refactor: shared step factories used by both container and batchjob create flows; new wire-format tests guard the create-request JSON shape
- pre-commit: replace dnephin go-unit-tests (hardcoded -timeout=30s) with local hook running `make test`
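
The bounded parallel status fetch for `container list` can be sketched with a semaphore channel. This is an illustration only: `fetchAll` and the `fetch` callback are hypothetical, and the 30s cache from the commit message is not shown.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll calls fetch for every id with at most limit calls in flight.
func fetchAll(ids []string, limit int, fetch func(id string) string) map[string]string {
	if limit < 1 {
		limit = 1 // an unbuffered semaphore would block forever
	}
	out := make(map[string]string, len(ids))
	var mu sync.Mutex
	var wg sync.WaitGroup
	sem := make(chan struct{}, limit)
	for _, id := range ids {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit goroutines are running
		go func(id string) {
			defer wg.Done()
			defer func() { <-sem }()
			status := fetch(id)
			mu.Lock()
			out[id] = status
			mu.Unlock()
		}(id)
	}
	wg.Wait()
	return out
}

func main() {
	ids := []string{"a", "b", "c", "d", "e", "f", "g"}
	statuses := fetchAll(ids, 5, func(id string) string { return "running" })
	fmt.Println(len(statuses))
}
```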

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.4.1 ships the live-list panic guard + stale wizard hint-bar refresh, so the temporary ../verdagostack replace is no longer needed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- factory: propagate debug body-read errors; fix redaction regex for escaped quotes; standard early-return around retry middleware
- cmdutil: hoist shared PromptBackOrExit (used by vm/list + serverless container_list)
- serverless: bound best-effort describe status RPC with a sub-timeout under the spinner; cancel stale live-list status goroutines; collapse duplicate isPromptCancel; clarify env-var help

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
HDD is deprecated and no longer provisionable. Drop it from the vm and volume create wizards (default NVMe); --type/--storage-type still pass through. Display of existing HDD volumes unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- s3 always registered now (object storage shipped to prod); VERDA_S3_ENABLED gate removed
- container/batchjob gated behind VERDA_SERVERLESS_ENABLED + Hidden until GA (serverlessEnabled in cmd.go; Hidden on parents; gate_test asserts it)
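
A minimal sketch of an environment-variable gate like the one described for serverless. The function name matches the one mentioned above, but the signature and the accepted values (anything `strconv.ParseBool` treats as true) are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// serverlessEnabled reports whether the pre-release commands should be
// registered. getenv is injected so the gate is easy to test.
func serverlessEnabled(getenv func(string) string) bool {
	v, err := strconv.ParseBool(getenv("VERDA_SERVERLESS_ENABLED"))
	return err == nil && v
}

func main() {
	fmt.Println(serverlessEnabled(os.Getenv))
}
```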

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- resumable multipart upload with local checkpoints (~/.verda/s3-uploads), --part-size/--concurrency/--no-resume, ls-uploads/abort-uploads cleanup
- RGW checksum fix: disable aws-sdk default integrity trailers on client + manager + custom uploader (else 400 XAmzContentSHA256Mismatch)
- interactive bucket picker (selectBucket) wired into rb/ls-uploads/abort-uploads via dual-mode resolveBucketArg (omit target on a TTY -> picker)
- honor 'verda auth use' active profile in s3 (options.ActiveProfile); previously s3 ignored it and used [default]
- complete s3 GA (unhide); expanded test coverage (show, recursive cp/mv, interactive rm)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`verda s3 ls` with no arg on a terminal opens a navigable explorer: buckets -> folders -> objects (one delimiter level at a time), Esc=up / Ctrl+C=exit, per-object Download/Info/Delete, plus a 'Download files here…' multi-select. Explicit args, --agent, pipes and -o json keep the static, scriptable listing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- upload wizard: `verda s3 cp` (no/partial args on a TTY) guides source -> bucket (+create) -> folder (+new) -> confirm; auto --recursive for directories; Esc steps back, Ctrl+C exits; confirm defaults to Yes
- ls-uploads: pick an in-progress upload to resume — uses a matching local checkpoint or prompts for the file, infers the original part size, and adopts the existing UploadId (no new upload/orphan)
- same-host flock guard (~/.verda/s3-uploads/<id>.lock) so two cp/resume of the same object can't race the checkpoint
- part-level progress bar for large/resumable uploads
- bulk transfers run on cmd.Context(), not the 30s --timeout, so large up/downloads no longer fail with 'context deadline exceeded'

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Append '@ X/s' to the upload result line, computed over the bytes moved this run (a resume reports its true session throughput, not an inflated figure). A live in-bar rate isn't possible yet — tui.ProgressHandle exposes only SetPercent/Stop, no label update.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the tui progress bar (which can only show a percentage) with a self-rendered single line on an interactive terminal: 'Uploading <name>  ████░░░ 62%  9.1 MB/s', overwritten in place via \r. The rate is measured over bytes moved this session, so a resume shows its true throughput. Gated on table output + a stderr TTY (cmdutil.IsStderrTerminal); pipes/--agent/-o json print nothing live.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- downloads now show a live bar + percent + transfer rate (same renderer as uploads); total comes from HeadObject. Downloads were already 5-way concurrent via the SDK transfer manager.
- generalized the upload progress into a byte-based transferProgress shared by both directions; a concurrency-safe countingWriterAt feeds download bytes from the manager's parallel ranged GETs
- HeadObject is issued only when actually rendering, so --agent/pipes/sync pay no extra request; sync suppresses per-file progress (quietProgress)
- download result line now shows '@ X/s' too

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- single-object downloads use a custom concurrent downloader: N-way ranged GETs writing each chunk at its offset (replaces the transfer manager for this path; same 5-way default)
- resumable via a local checkpoint (~/.verda/s3-downloads) + a <dest>.part file. Re-running the same `verda s3 cp s3://… ./dst` resumes, fetching only the missing chunks
- If-Match (ETag) guards against the object changing mid-download; an ETag/size mismatch restarts cleanly
- live progress bar + transfer rate; same-host lock (acquireTransferLock, renamed from acquireUploadLock) blocks concurrent downloads of the same object
- recursive / sync / mv downloads still use the transfer manager (per-file resume is a follow-up)
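
The ranged-GET and If-Match mechanics can be shown with plain `net/http` standing in for the S3 SDK call. `newChunkRequest` and the example URL are hypothetical; the sketch only builds the request and does not send it.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// newChunkRequest builds the GET for chunk idx (0-based). Range selects the
// chunk's byte span; If-Match makes the server fail the request if the
// object's ETag changed since the download started.
func newChunkRequest(url, etag string, idx int, chunkSize, size int64) (*http.Request, error) {
	if idx < 0 || chunkSize <= 0 {
		return nil, errors.New("invalid chunk index or size")
	}
	start := int64(idx) * chunkSize
	if start >= size {
		return nil, errors.New("chunk starts past the end of the object")
	}
	end := start + chunkSize - 1
	if end > size-1 {
		end = size - 1 // the last chunk may be short
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
	req.Header.Set("If-Match", etag)
	return req, nil
}

func main() {
	req, err := newChunkRequest("https://example.invalid/bucket/key", `"abc"`, 2, 10, 25)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.Header.Get("Range"), req.Header.Get("If-Match"))
}
```

Each response body would then be written at the chunk's `start` offset in the `.part` file, which is what lets a resumed run fetch only the chunks missing from the checkpoint.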

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…help + README

- the ls browser's per-object Download and multi-select 'Download files here…' now go through the resumable downloader (shared downloadToLocal helper), so re-selecting Download on an interrupted object resumes from its .part. This gives downloads a discoverable interactive resume entry point (cp s3://… ./ is the argument-based equivalent)
- cp --help and the s3 README document resumable transfers, --no-resume/--concurrency/--part-size (which apply to uploads AND single-object downloads), and the two download entry points (cp + the browser)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Download result lines now print the absolute destination path (cp and the ls browser, single + multi) so it's clear where the file landed: '✓ downloaded s3://… -> /abs/path (size) @ rate'.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ads)

runLs wrapped the interactive browser in a 30s context, so a large browser download died with 'context deadline exceeded' (and long navigation could too). The browser now runs on cmd.Context() like the other bulk-transfer paths; only the static, scriptable listings keep the per-request --timeout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- downloads now print 'Resuming download of X (A of B, P% already on disk)' when continuing from a checkpoint, mirroring the upload 'Resuming upload' line (was silent before). Driven by a new OnResume callback on the resumable downloader.
- expanded 'verda s3 cp --help' examples: upload-into-folder, download-into-directory, recursive prefix download, exclude filter, content-type override (alongside the existing upload/download/resume/tuning/copy/dryrun examples)
- test: OnResume fires with the already-on-disk bytes on resume, and does NOT fire on a fresh download

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…w fixes

Extend the dual-mode interactive pattern (omit the target on a TTY) to the
remaining S3 subcommands, mirroring ls/cp:

- mb: prompt for the new bucket name
- rb: bucket picker + destructive confirm (extracted to confirmRbDeletion)
- rm: folder browser with multi-select delete (rm_browse.go), reusing the
  shared confirm + batch-delete path; deletes only the ticked keys
- mv: stepped S3->S3 move/rename wizard (move_wizard.go). Every prompt is an
  index-navigable step, so Esc steps back exactly one prompt and Ctrl+C exits,
  with a "Step N of M" header and a one-time intro banner

Browser downloads now default to the OS Downloads folder and never overwrite an
unrelated local file (resolveLocalDest), while still resuming an interrupted
download of the same object; each finishes on a Back/Exit summary.

Adversarial-review fixes:
- upload resume restarts (not corrupts) when --part-size changes on an
  unchanged file; download .part path keyed off the absolute destination
- gate every interactive entry on interactiveTTY so `-o json` on a TTY never
  launches a TUI; rm refuses --recursive/--dryrun/--yes/--include/--exclude
  without an explicit URI instead of silently dropping them
- emptyBucket fails fast on per-object DeleteObjects errors (accurate count);
  BucketNotEmpty maps to a friendly error
- wizard-facing pickers return the raw prompter error so Esc=back works;
  stale download checkpoints + lock files are GC'd from the download path
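
The interactive gate from the review fixes could be expressed as a single predicate checked at every TUI entry point. The `session` type and `shouldLaunchTUI` are hypothetical; only the `interactiveTTY` name and the conditions (TTY, no --agent, no -o json, target omitted) come from the text above.

```go
package main

import "fmt"

// session captures the inputs that decide between TUI and scripted mode.
type session struct {
	TTY    bool   // the relevant stream is a terminal
	Agent  bool   // --agent was passed
	Output string // value of -o; "" or "table" means human output
}

// interactiveTTY is true only for a human at a terminal using table output.
func interactiveTTY(s session) bool {
	return s.TTY && !s.Agent && (s.Output == "" || s.Output == "table")
}

// shouldLaunchTUI applies the dual-mode rule: the TUI opens only when the
// target is omitted and the session is interactive.
func shouldLaunchTUI(s session, args []string) bool {
	return len(args) == 0 && interactiveTTY(s)
}

func main() {
	fmt.Println(shouldLaunchTUI(session{TTY: true, Output: "table"}, nil))
	fmt.Println(shouldLaunchTUI(session{TTY: true, Output: "json"}, nil))
}
```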

make build / make lint (0 issues) / make test all pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- s3/README.md: rewrite the Interactive vs Non-Interactive section with the
  per-command trigger table, the ~/Downloads no-overwrite policy, resumable
  transfer notes, and the part-size-change-restarts behavior
- internal/skills (verda-cloud, verda-reference): add Object Storage / S3
  coverage for agents — intent mapping, per-command reference with verified
  flags and JSON field names, separate-credentials model, and the --yes /
  --dryrun rules for destructive ops
- .ai/skills/new-command.md: codify the interactive-command contract — the
  mandatory hint bar, Esc=back / Ctrl+C=exit, the implicit-TTY trigger, the
  stepped-wizard pattern, and the picker-swallows-Esc pitfall

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
verdagostack v1.4.1 raises the module's go directive to 1.25.10 (propagated
into verda-cli's go.mod), but the workflows pinned setup-go to 1.25.9, so CI
failed with "go.mod requires go >= 1.25.10 (GOTOOLCHAIN=local)". Switch every
setup-go step to go-version-file: go.mod so the CI toolchain always matches the
module directive and won't drift on future bumps.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
osv-scanner flagged GO-2026-5024 in the transitive golang.org/x/sys@v0.43.0.
Bump to v0.45.0 (past the fix). govulncheck confirms the advisory is cleared.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The security workflow runs gosec with --no-config, dropping the repo's
.golangci.yaml exclusions (which normally skip gosec on _test.go). That exposed
five int32/file-read conversions with no suppress directive — numParts (capped
at maxParts) and four test-only loop/index conversions. Annotate with
//nolint:gosec + reason; honored by both the strict scan and make lint.
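
A sketch of the kind of bounded int32 conversion that gets such an annotation. The value of `maxParts` here is an assumption based on S3's 10,000-part multipart limit, and the function body is illustrative rather than the code in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

const maxParts = 10000 // assumed cap, matching S3's multipart part limit

// numParts returns the number of parts needed for size bytes. The int32
// conversion is safe because the value is checked against maxParts first.
func numParts(size, partSize int64) (int32, error) {
	if size < 0 || partSize <= 0 {
		return 0, errors.New("invalid size or part size")
	}
	n := (size + partSize - 1) / partSize // ceiling division
	if n > maxParts {
		return 0, fmt.Errorf("%d parts exceeds the limit of %d", n, maxParts)
	}
	return int32(n), nil //nolint:gosec // n <= maxParts, so it fits in int32
}

func main() {
	fmt.Println(numParts(25, 10))
	fmt.Println(numParts(1<<40, 1))
}
```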

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
golangci-lint v2.5.0 (CI's pinned version) flags the conditional-append loops in
browseDownloadMulti and rmBrowseDeleteMulti under prealloc; newer local versions
don't. Preallocate objs/labels/keys with len(payload.Objects) capacity so both
versions are clean. Verified with golangci-lint v2.5.0.
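
The prealloc fix follows a common pattern: give a conditionally appended slice the input length as capacity. The `object` type and `selectedKeys` below are hypothetical stand-ins for the loops named above.

```go
package main

import "fmt"

type object struct {
	Key      string
	Selected bool
}

// selectedKeys collects the ticked keys. The capacity is an upper bound, so
// append never reallocates even though not every object is appended.
func selectedKeys(objs []object) []string {
	keys := make([]string, 0, len(objs))
	for _, o := range objs {
		if o.Selected {
			keys = append(keys, o.Key)
		}
	}
	return keys
}

func main() {
	objs := []object{{"a", true}, {"b", false}, {"c", true}}
	keys := selectedKeys(objs)
	fmt.Println(keys, len(keys), cap(keys))
}
```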

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hi-lei hi-lei merged commit 86f389a into main Jun 1, 2026
12 checks passed
@hi-lei hi-lei deleted the feat/add-serverless-command branch June 1, 2026 15:20