blob/sftpblob: add sftpblob driver for SFTP-based blob storage support#3682

Closed
TuSKan wants to merge 2 commits into google:master from TuSKan:sftpblob

@TuSKan TuSKan commented Mar 31, 2026

What does this PR do?

This PR introduces the sftpblob driver, allowing gocloud.dev/blob to interface with SFTP servers.

SFTP remains a critical protocol for legacy enterprise integrations and B2B data pipelines. This driver bridges the gap, allowing users to interact with SFTP endpoints using the standard Go CDK blob interface.

Implementation Details

  • Underlying Libraries: Utilizes github.com/pkg/sftp for the protocol implementation and golang.org/x/crypto/ssh for connection management.

  • Authentication: Supports password, private key (PEM), and SSH_AUTH_SOCK agent authentication via URL parameters.

  • Atomic Writes: To prevent partial uploads from corrupting data, writes are streamed to a temporary file (.tmp). The driver tracks Write state, and only executes a PosixRename to the final path upon a successful Close(). Failed writes automatically clean up the temporary file.

  • Context Management: Decouples the setup context used in OpenBucketURL from the ongoing connection lifecycle to prevent background goroutine panics on early context cancellation.
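The atomic-write behavior described above can be illustrated with a minimal sketch. This is not the PR's code: it uses the local filesystem (os.Create, os.Rename, os.Remove) as a stand-in for the SFTP client, and the names atomicWrite, failingReader, and demo are hypothetical. The driver itself is described as using PosixRename on the SFTP connection.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// atomicWrite streams src into path+".tmp" and renames it onto path only
// after both the copy and Close succeed. On any failure the temporary file
// is removed. Local os calls stand in for the SFTP client (an assumption).
func atomicWrite(path string, src io.Reader) (err error) {
	tmp := path + ".tmp"
	f, err := os.Create(tmp)
	if err != nil {
		return err
	}
	defer func() {
		if err != nil {
			os.Remove(tmp) // best-effort cleanup of the partial upload
		}
	}()
	if _, err = io.Copy(f, src); err != nil {
		f.Close()
		return err
	}
	if err = f.Close(); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// failingReader simulates an upload that breaks mid-stream.
type failingReader struct{}

func (failingReader) Read([]byte) (int, error) {
	return 0, errors.New("simulated upload failure")
}

// exists reports whether a file is present at path.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// demo performs one successful and one failed write in a temp directory and
// reports the good file's content and whether the failed write left anything.
func demo() (content string, badExists, tmpExists bool, err error) {
	dir, err := os.MkdirTemp("", "atomicwrite")
	if err != nil {
		return "", false, false, err
	}
	defer os.RemoveAll(dir)

	good := filepath.Join(dir, "good.txt")
	if err = atomicWrite(good, strings.NewReader("hello")); err != nil {
		return "", false, false, err
	}
	data, err := os.ReadFile(good)
	if err != nil {
		return "", false, false, err
	}

	bad := filepath.Join(dir, "bad.txt")
	if werr := atomicWrite(bad, failingReader{}); werr == nil {
		return "", false, false, errors.New("expected the failing write to return an error")
	}
	return string(data), exists(bad), exists(bad + ".tmp"), nil
}

func main() {
	content, badExists, tmpExists, err := demo()
	fmt.Println(content, badExists, tmpExists, err)
}
```

The point of the pattern is that readers of the final path never observe a half-written file: either the rename has happened and the content is complete, or the path is untouched.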

Known Limitations & Trade-offs

SFTP is not an object store, so mapping it to the blob interface requires a few compromises, which are documented in the package docstring:

  1. Pagination (ListPaged): SFTP lacks native server-side pagination or cursors, so ListPaged is implemented via physical directory walks. Users should be warned that this costs O(N) network calls and is not recommended for directories with very large file counts.

  2. Server-Side Copying: Native server-side copying depends on the copy-data SFTP extension. If the target server (like older OpenSSH versions) does not support this, Copy falls back to io.Copy, pulling the data to the client and pushing it back up.

  3. Metadata: Extended attributes (CacheControl, ContentType, etc.) are stored as JSON in an adjacent .attrs sidecar file. This can be disabled via ?metadata=skip in the connection URL.
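To make the pagination trade-off in item 1 concrete, here is a minimal sketch of emulating pages on top of a full listing. It is an illustration only: the PR does not say how page tokens are encoded, so using the last returned key as the token is an assumption, and listPage is a hypothetical helper. The keys slice stands in for the result of a directory walk over SFTP.

```go
package main

import (
	"fmt"
	"sort"
)

// listPage returns up to pageSize keys that sort strictly after token, plus
// the token for the next page ("" once the listing is exhausted).
// keys stands in for the output of a full directory walk; the token format
// (last key returned) is an assumption, not the driver's documented scheme.
func listPage(keys []string, token string, pageSize int) (page []string, next string) {
	if pageSize < 1 {
		pageSize = 1
	}
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted)

	start := 0
	if token != "" {
		// Index of the first key greater than the token.
		start = sort.Search(len(sorted), func(i int) bool { return sorted[i] > token })
	}
	end := start + pageSize
	if end >= len(sorted) {
		return sorted[start:], ""
	}
	return sorted[start:end], sorted[end-1]
}

func main() {
	keys := []string{"c.txt", "a.txt", "b.txt"}
	token := ""
	for {
		page, next := listPage(keys, token, 2)
		fmt.Println(page)
		if next == "" {
			break
		}
		token = next
	}
}
```

Because the server keeps no cursor, every page request in a scheme like this has to repeat the walk and discard the entries before the token, which is why the cost grows with the size of the directory rather than the size of the page.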

Testing Strategy

  • Passes the drivertest conformance suite.

Checklist

  • Added URL opener examples.

  • Ran go fmt and go vet.

  • Verified drivertest.RunConformanceTests passes.

google-cla bot commented Mar 31, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Contributor

vangent commented Apr 5, 2026

See comment on the Issue.

@vangent vangent closed this Apr 5, 2026