Skip to content

SOLR-15701: Complete configsets api#4264

Open
epugh wants to merge 4 commits intoapache:mainfrom
epugh:complete_configsets_api
Open

SOLR-15701: Complete configsets api#4264
epugh wants to merge 4 commits intoapache:mainfrom
epugh:complete_configsets_api

Conversation

@epugh
Copy link
Copy Markdown
Contributor

@epugh epugh commented Apr 4, 2026

https://issues.apache.org/jira/browse/SOLR-15701

Description

Add Download and GetFile to ConfigSets API and SolrJ.

Solution

This was extracted with copilots help from the SchemaDesigner code base. We are going to merge this as part of ConfigSets API. In a future PR, once this is done, we'll finish the currently started SchemaDesigner work to use these APIs.

Tests

Added some tests.

@epugh epugh requested a review from gerlowskija April 4, 2026 00:04
@github-actions github-actions bot added documentation Improvements or additions to documentation tests cat:api labels Apr 4, 2026
throw new SolrException(
SolrException.ErrorCode.NOT_FOUND, "ConfigSet '" + configSetName + "' not found");
}
byte[] data = downloadFileFromConfig(configSetName, filePath);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the data be immutable ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand a bit on what you mean?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable "data" holds the file contents hence it must be immutable. You can make the variable immutable by using "final" keyword.

Copy link
Copy Markdown
Contributor

@VishnuPriyaChandraSekar VishnuPriyaChandraSekar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just started reviewing the Solr code base.
I just left a minor comment other than that, PR seems to be fine.

Copy link
Copy Markdown
Contributor

@gerlowskija gerlowskija left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good overall. I left some inline comments, mostly small things.

I do have one larger question here though: I gather much of this functionality already existed under the aegis of the SchemaDesigner, but I don't see any deletions or modifications to those files in this PR. Shouldn't this PR be deleting the "download-configset" and "get-configset-file" APIs that exist over in schema-designer land?

/**
* V2 API definition for downloading an existing configset as a ZIP archive.
*
* <p>Equivalent to GET /api/configsets/{configSetName}/download
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[-0] Not sure what you're trying to say here, but this could use some word-smithing. There's no v1 counterpart to this API which is where I typically see the "equivalent to" language.

Alternately If you're trying to highlight what the API actually looks like, then maybe "Available at" works better? (Or omit altogether since the @Path annotation is a just a line or two below)

@Path("/configsets/{configSetName}")
interface Download {
@GET
@Path("/download")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[-1] /download isn't very REST-ful. The fact that we're fetching/downloading the resource is already kindof implied by the HTTP 'GET' verb/method.

Could we drop it and make this "just" GET /api/configsets/{configsetName}?

@Produces("application/zip")
Response downloadConfigSet(
@PathParam("configSetName") String configSetName,
@QueryParam("displayName") String displayName)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] From looking ahead at code in this PR, it looks like 'displayName' is used primarily (solely?) to inform the "attachment name" part of a "Content-Disposition" response header.

Do I have that right? Or am I missing another usage?

summary = "Get the contents of a file in a configset.",
tags = {"configsets"})
ConfigSetFileContentsResponse getConfigSetFile(
@PathParam("configSetName") String configSetName, @QueryParam("path") String filePath)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[-0] IMO "filePath" should be a path parameter rather than a query-parameter. That would allow this API to mirror the "update-configset-file" API really well, and also bring it into line with what we've done for other similar APIs in filestore and elsewhere.

To be explicit, I'm suggesting this API be: GET /api/configsets/<configSetName>/files/some/file/path.txt

@Operation(
summary = "Get the contents of a file in a configset.",
tags = {"configsets"})
ConfigSetFileContentsResponse getConfigSetFile(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] Can you add a bit of context around the decision to return a structured POJO here vs. "just" the verbatim file bytes approach we take in filestore, replication handler, ZooKeeperReadAPI, etc.

The structured POJO route seems to conflict a bit with the @Produces(TEXT_PLAIN) annotation above, unless I'm missing something?

How does this behave if a user puts a small binary file in their configset?

final String fileName = safeName + "_configset.zip";
return Response.ok((StreamingOutput) outputStream -> outputStream.write(zipBytes))
.type("application/zip")
.header("Content-Disposition", "attachment; filename=\"" + fileName + "\"")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] "Content-Disposition" is a browser-based header AFAICT - it's primarily a hint for browsers as to whether something's an attachment or not. Is it worth having here? Why would we include such a header here, but not on (e.g.) the filestore API or in ReplicationHandler, or somewhere similar that's downloading file content?

import org.junit.Test;

/** Unit tests for {@link UploadConfigSet}. */
public class UploadConfigSetAPITest extends SolrTestCase {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] Can you share a bit of context around this test file please? I'm not going to complain about new tests, but I'm surprised to find that a third of this PR is tests for an API that wasn't touched by the PR 😛

[[configsets-download]]
== Download a Configset

The `download` command downloads an entire configset as a zipped file.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0] The "command" language here is unfortunate. It fits the v1 APIs and their action=LIST|UPLOAD|CREATE|etc syntax well, but doesn't make sense for the v2 API IMO.

Maybe something like the following would be a bit clearer:

The v2 API allows configsets to be downloaded as a single zipped file. This is useful for backing up configsets, sharing ...
The download API takes the following parameters:

[[configsets-get-file]]
== Get a Single File from a Configset

This command retrieves the contents of a single file from an existing configset.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, re: replacing "command" with "API" or less v1 specific language

+
The path to the file within the configset (e.g., `solrconfig.xml` or `lang/stopwords_en.txt`).

The response will be a JSON object containing:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[-0] Why "JSON" here? v2 APIs in general support javabin, xml, etc. in addition to JSON. Is this API specifically JSON-only in some way? If not, then we might not want to be overly specific here. Maybe drop the word "JSON" and leave the rest as-is?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cat:api documentation Improvements or additions to documentation tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants