feat: provision per-host Postgres user for RAG service instances #299
tsivaprasad wants to merge 6 commits into main from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the
Use the following commands to manage reviews, or the checkboxes below for quick actions.
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough
Adds RAG service provisioning and tests; refactors service-instance resource generation to dispatch by service type and centralizes resource building; omits monitor dependency for RAG in workflows; strengthens Postgres role-existence checks and adds a reusable query; updates Makefile tag resolution.
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
57345d8 to 4392b87
@coderabbitai review
🧠 Learnings used
✅ Actions performed: Review triggered.
9fcf32a to bdc9b3d
Actionable comments posted: 2
🧹 Nitpick comments (2)
server/internal/orchestrator/swarm/orchestrator.go (2)
433-438: Remove unreachable service-type check in MCP generator.
generateMCPInstanceResources is only called from Line 406 (case "mcp"), so this branch is dead and adds noise.
Suggested cleanup
```diff
-	// Only MCP is fully implemented in the orchestrator for now.
-	// PostgREST provisioning (container spec, config delivery, service user) is
-	// implemented in follow-up tickets.
-	if spec.ServiceSpec.ServiceType != "mcp" {
-		return nil, fmt.Errorf("service type %q is not yet supported for provisioning", spec.ServiceSpec.ServiceType)
-	}
-
 	// Parse the MCP service config from the untyped config map
 	mcpConfig, errs := database.ParseMCPServiceConfig(spec.ServiceSpec.Config, false)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/orchestrator/swarm/orchestrator.go` around lines 433 - 438, The runtime check inside generateMCPInstanceResources that returns an error when spec.ServiceSpec.ServiceType != "mcp" is unreachable because generateMCPInstanceResources is only invoked from the "case \"mcp\"" branch; remove that dead branch to clean up noise. Edit the function generateMCPInstanceResources to delete the if block referencing spec.ServiceSpec.ServiceType (and the fmt.Errorf return), ensuring no other logic depends on that check and run tests/compile to confirm no references remain.
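The dead branch exists because the walkthrough's refactor already dispatches by service type before any per-type generator runs. A minimal sketch of that dispatch pattern, with simplified signatures that are assumptions rather than the actual orchestrator API:

```go
package main

import "fmt"

// Spec is a stripped-down stand-in for the orchestrator's instance spec.
type Spec struct{ ServiceType string }

func generateMCPInstanceResources(s Spec) (string, error) { return "mcp-resources", nil }
func generateRAGInstanceResources(s Spec) (string, error) { return "rag-resources", nil }

// generateServiceInstanceResources dispatches by service type, so the
// per-type generators never need to re-check the type themselves —
// which is why the check inside the MCP generator is unreachable.
func generateServiceInstanceResources(s Spec) (string, error) {
	switch s.ServiceType {
	case "mcp":
		return generateMCPInstanceResources(s)
	case "rag":
		return generateRAGInstanceResources(s)
	default:
		return "", fmt.Errorf("service type %q is not yet supported for provisioning", s.ServiceType)
	}
}

func main() {
	out, _ := generateServiceInstanceResources(Spec{ServiceType: "rag"})
	fmt.Println(out) // rag-resources
}
```

The unsupported-type error moves to the single `default` arm, so adding a new service type means adding one case rather than touching every generator.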
593-612: Use buildServiceInstanceResources in the RAG path too.
Lines 593-612 duplicate the conversion logic now centralized in buildServiceInstanceResources, increasing drift risk.
Suggested simplification
```diff
-	data := make([]*resource.ResourceData, len(orchestratorResources))
-	for i, res := range orchestratorResources {
-		d, err := resource.ToResourceData(res)
-		if err != nil {
-			return nil, fmt.Errorf("failed to convert resource to resource data: %w", err)
-		}
-		data[i] = d
-	}
-
-	return &database.ServiceInstanceResources{
-		ServiceInstance: &database.ServiceInstance{
-			ServiceInstanceID: spec.ServiceInstanceID,
-			ServiceID:         spec.ServiceSpec.ServiceID,
-			DatabaseID:        spec.DatabaseID,
-			HostID:            spec.HostID,
-			State:             database.ServiceInstanceStateCreating,
-		},
-		Resources: data,
-	}, nil
+	return o.buildServiceInstanceResources(spec, orchestratorResources)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/orchestrator/swarm/orchestrator.go` around lines 593 - 612, Replace the duplicated conversion loop that turns orchestratorResources into []*resource.ResourceData (the block using resource.ToResourceData and assembling database.ServiceInstanceResources with spec.ServiceInstanceID, ServiceSpec.ServiceID, DatabaseID, HostID and state) by calling the centralized helper buildServiceInstanceResources(spec, orchestratorResources), returning its result and propagating any error; remove the manual loop and construction and ensure error handling mirrors buildServiceInstanceResources' signature.
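The helper being recommended centralizes the resource-conversion loop so both generators share one code path. A self-contained sketch of that shape, using simplified stand-in types (the real `resource.ToResourceData` and `database.ServiceInstanceResources` signatures are not shown in this review, so the names below are assumptions):

```go
package main

import "fmt"

// Simplified stand-ins for the orchestrator's resource types.
type Resource struct{ Name string }
type ResourceData struct{ Payload string }

type ServiceInstanceResources struct {
	ServiceInstanceID string
	State             string
	Resources         []ResourceData
}

// toResourceData mimics resource.ToResourceData: convert one resource,
// returning an error on invalid input.
func toResourceData(r Resource) (ResourceData, error) {
	if r.Name == "" {
		return ResourceData{}, fmt.Errorf("resource has no name")
	}
	return ResourceData{Payload: r.Name}, nil
}

// buildServiceInstanceResources is the shared helper: one conversion loop
// plus one place that assembles the ServiceInstanceResources envelope,
// so the MCP and RAG paths cannot drift apart.
func buildServiceInstanceResources(instanceID string, resources []Resource) (*ServiceInstanceResources, error) {
	data := make([]ResourceData, len(resources))
	for i, r := range resources {
		d, err := toResourceData(r)
		if err != nil {
			return nil, fmt.Errorf("failed to convert resource to resource data: %w", err)
		}
		data[i] = d
	}
	return &ServiceInstanceResources{
		ServiceInstanceID: instanceID,
		State:             "creating",
		Resources:         data,
	}, nil
}

func main() {
	out, err := buildServiceInstanceResources("inst-1", []Resource{{Name: "role"}, {Name: "config"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.ServiceInstanceID, len(out.Resources)) // inst-1 2
}
```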
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/internal/orchestrator/swarm/orchestrator.go`:
- Around line 580-591: Don’t rely on the array position — iterate over
spec.DatabaseNodes and create the ServiceUserRole for every node except the
canonical one by explicitly comparing identifiers (e.g., skip when nodeInst.ROID
matches canonicalROID or when nodeInst.NodeName matches the canonical node name)
instead of using spec.DatabaseNodes[1:], then append roles with ServiceID from
spec.ServiceSpec.ServiceID, DatabaseID/DatabaseName from spec, NodeName from
nodeInst, Mode ServiceUserRoleRO and CredentialSource set to &canonicalROID.
- Around line 565-566: The role identity is being keyed by
spec.ServiceSpec.ServiceID (creating canonicalROID via
ServiceUserRoleIdentifier) which makes it shared across all instances; change
the keying to use the service-instance identifier (e.g., spec.ServiceInstanceID
or the appropriate ServiceInstance ID field) when constructing the role
identifier and when creating/looking up the role resource (replace
ServiceUserRoleIdentifier(spec.ServiceSpec.ServiceID, ...) with
ServiceUserRoleIdentifier(spec.ServiceInstanceID, ...) or equivalent). Also, if
RAGServiceUserRole is available in this PR, instantiate/lookup that resource
type instead of ServiceUserRole so the role is scoped per-instance; update the
other occurrences referenced (the blocks around canonicalROID and the later
calls at the 569–574 and 583–589 sites) to use the instance-scoped identifier
consistently.
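The first inline comment above warns against relying on slice position (`spec.DatabaseNodes[1:]`) and asks for an explicit identifier comparison. A sketch of that pattern, with hypothetical field names (ROID, NodeName, CredentialSource) inferred from the comment rather than taken from the actual types:

```go
package main

import "fmt"

// NodeInstance and Role are simplified stand-ins for the orchestrator types.
type NodeInstance struct {
	ROID     string
	NodeName string
}

type Role struct {
	NodeName         string
	Mode             string
	CredentialSource *string
}

// rolesForReplicaNodes creates a read-only role for every node except the
// canonical one, skipping it by explicit identifier comparison — correct
// even when the canonical node is not first in the slice.
func rolesForReplicaNodes(nodes []NodeInstance, canonicalROID, canonicalNode string) []Role {
	var roles []Role
	for _, n := range nodes {
		if n.ROID == canonicalROID || n.NodeName == canonicalNode {
			continue // skip the canonical node regardless of its position
		}
		roles = append(roles, Role{
			NodeName:         n.NodeName,
			Mode:             "ro",
			CredentialSource: &canonicalROID,
		})
	}
	return roles
}

func main() {
	nodes := []NodeInstance{
		{ROID: "ro-b", NodeName: "node-b"}, // canonical node is NOT first here
		{ROID: "ro-a", NodeName: "node-a"},
		{ROID: "ro-c", NodeName: "node-c"},
	}
	for _, r := range rolesForReplicaNodes(nodes, "ro-a", "node-a") {
		fmt.Println(r.NodeName)
	}
}
```

With `nodes[1:]` the example above would wrongly create a role for the canonical node-a and wrongly skip node-b; the explicit comparison handles any ordering.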
---
Nitpick comments:
In `@server/internal/orchestrator/swarm/orchestrator.go`:
- Around line 433-438: The runtime check inside generateMCPInstanceResources
that returns an error when spec.ServiceSpec.ServiceType != "mcp" is unreachable
because generateMCPInstanceResources is only invoked from the "case \"mcp\""
branch; remove that dead branch to clean up noise. Edit the function
generateMCPInstanceResources to delete the if block referencing
spec.ServiceSpec.ServiceType (and the fmt.Errorf return), ensuring no other
logic depends on that check and run tests/compile to confirm no references
remain.
- Around line 593-612: Replace the duplicated conversion loop that turns
orchestratorResources into []*resource.ResourceData (the block using
resource.ToResourceData and assembling database.ServiceInstanceResources with
spec.ServiceInstanceID, ServiceSpec.ServiceID, DatabaseID, HostID and state) by
calling the centralized helper buildServiceInstanceResources(spec,
orchestratorResources), returning its result and propagating any error; remove
the manual loop and construction and ensure error handling mirrors
buildServiceInstanceResources' signature.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 87ba96a6-319b-4081-aa9c-8afd6b95d1ec
📒 Files selected for processing (1)
server/internal/orchestrator/swarm/orchestrator.go
canonicalROID := ServiceUserRoleIdentifier(spec.ServiceSpec.ServiceID, ServiceUserRoleRO)
RAG role identity is keyed by service ID instead of service-instance ID.
At Line 565 and Line 569/Line 583, using spec.ServiceSpec.ServiceID makes role identity shared across all instances of the same service. That conflicts with per-host/per-instance role provisioning and can cause credential-source collisions between instances on different hosts.
Suggested direction

```diff
-canonicalROID := ServiceUserRoleIdentifier(spec.ServiceSpec.ServiceID, ServiceUserRoleRO)
+canonicalROID := ServiceUserRoleIdentifier(spec.ServiceInstanceID, ServiceUserRoleRO)
...
-ServiceID: spec.ServiceSpec.ServiceID,
+ServiceID: spec.ServiceInstanceID,
...
-ServiceID: spec.ServiceSpec.ServiceID,
+ServiceID: spec.ServiceInstanceID,
```

If RAGServiceUserRole exists in this PR, prefer instantiating that resource type here instead of ServiceUserRole.
Also applies to: 569-574, 583-589
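The collision the reviewer describes is easy to see in isolation: keyed by service ID, two instances of the same service on different hosts produce the same role identifier; keyed by instance ID, they stay distinct. The identifier format below is an assumption — the real ServiceUserRoleIdentifier implementation is not shown in the review:

```go
package main

import "fmt"

// serviceUserRoleIdentifier sketches a key-plus-mode identifier; the
// concatenation format is hypothetical.
func serviceUserRoleIdentifier(key, mode string) string {
	return fmt.Sprintf("%s-%s", key, mode)
}

func main() {
	serviceID := "svc-rag"                        // shared by all instances
	instanceA, instanceB := "inst-host1", "inst-host2" // one per host

	// Keyed by service ID: both instances collide on the same identifier.
	fmt.Println(serviceUserRoleIdentifier(serviceID, "ro") ==
		serviceUserRoleIdentifier(serviceID, "ro")) // true — collision

	// Keyed by instance ID: identifiers stay unique per host.
	fmt.Println(serviceUserRoleIdentifier(instanceA, "ro") !=
		serviceUserRoleIdentifier(instanceB, "ro")) // true — isolated
}
```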
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@server/internal/orchestrator/swarm/orchestrator.go` around lines 565 - 566,
The role identity is being keyed by spec.ServiceSpec.ServiceID (creating
canonicalROID via ServiceUserRoleIdentifier) which makes it shared across all
instances; change the keying to use the service-instance identifier (e.g.,
spec.ServiceInstanceID or the appropriate ServiceInstance ID field) when
constructing the role identifier and when creating/looking up the role resource
(replace ServiceUserRoleIdentifier(spec.ServiceSpec.ServiceID, ...) with
ServiceUserRoleIdentifier(spec.ServiceInstanceID, ...) or equivalent). Also, if
RAGServiceUserRole is available in this PR, instantiate/lookup that resource
type instead of ServiceUserRole so the role is scoped per-instance; update the
other occurrences referenced (the blocks around canonicalROID and the later
calls at the 569–574 and 583–589 sites) to use the instance-scoped identifier
consistently.
Actionable comments posted: 1
♻️ Duplicate comments (1)
server/internal/orchestrator/swarm/orchestrator.go (1)
565-574: ⚠️ Potential issue | 🟠 Major — RAG role identity is keyed by service ID instead of service-instance ID.
The PR objective states that RAGServiceUserRole should be "keyed by serviceInstanceID" for per-host scoping, but ServiceUserRoleIdentifier and ServiceUserRole are constructed using spec.ServiceSpec.ServiceID. This creates a shared role identity across all instances of the same service rather than per-instance isolation.
Suggested fix
```diff
-	canonicalROID := ServiceUserRoleIdentifier(spec.ServiceSpec.ServiceID, ServiceUserRoleRO)
+	canonicalROID := ServiceUserRoleIdentifier(spec.ServiceInstanceID, ServiceUserRoleRO)

 	// Canonical read-only role — runs on the node co-located with this instance.
 	canonicalRO := &ServiceUserRole{
-		ServiceID:    spec.ServiceSpec.ServiceID,
+		ServiceID:    spec.ServiceInstanceID,
 		DatabaseID:   spec.DatabaseID,
 		DatabaseName: spec.DatabaseName,
 		NodeName:     spec.NodeName,
 		Mode:         ServiceUserRoleRO,
 	}
```
Also update line 585 accordingly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/orchestrator/swarm/orchestrator.go` around lines 565 - 574, The role identity is being keyed by spec.ServiceSpec.ServiceID causing shared RAG roles across service instances; update the call to ServiceUserRoleIdentifier and the ServiceUserRole fields to use the per-instance identifier (spec.ServiceSpec.ServiceInstanceID) instead of ServiceID so the canonicalROID and canonicalRO are scoped to the service instance; also make the same replacement in the subsequent related construction referenced later (the other canonical role creation around the same block) so all lookups/creations consistently use ServiceInstanceID.
🧹 Nitpick comments (1)
server/internal/orchestrator/swarm/orchestrator.go (1)
594-612: Consolidate resource conversion by calling buildServiceInstanceResources.
Lines 594-612 duplicate the conversion logic that buildServiceInstanceResources already provides. This function was introduced specifically to share this code path.
♻️ Proposed fix
```diff
 	}
-	data := make([]*resource.ResourceData, len(orchestratorResources))
-	for i, res := range orchestratorResources {
-		d, err := resource.ToResourceData(res)
-		if err != nil {
-			return nil, fmt.Errorf("failed to convert resource to resource data: %w", err)
-		}
-		data[i] = d
-	}
-
-	return &database.ServiceInstanceResources{
-		ServiceInstance: &database.ServiceInstance{
-			ServiceInstanceID: spec.ServiceInstanceID,
-			ServiceID:         spec.ServiceSpec.ServiceID,
-			DatabaseID:        spec.DatabaseID,
-			HostID:            spec.HostID,
-			State:             database.ServiceInstanceStateCreating,
-		},
-		Resources: data,
-	}, nil
+	return o.buildServiceInstanceResources(spec, orchestratorResources)
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/orchestrator/swarm/orchestrator.go` around lines 594 - 612, The code duplicates resource conversion logic already implemented in buildServiceInstanceResources; remove the manual loop that calls resource.ToResourceData over orchestratorResources and instead call buildServiceInstanceResources(spec, orchestratorResources), propagate its returned (*database.ServiceInstanceResources, error), and handle the error as before (returning fmt.Errorf or the error directly). Ensure you reference orchestratorResources and spec when calling buildServiceInstanceResources and preserve the same error handling semantics as the surrounding function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@common.mk`:
- Line 18: The CONTROL_PLANE_VERSION assignment runs git describe with --match
'$(CHANGIE_LATEST)*' which becomes --match '*' when CHANGIE_LATEST is empty;
change the logic so you first test whether CHANGIE_LATEST is non-empty and only
then run the git fetch && git describe command (leaving CONTROL_PLANE_VERSION
empty when CHANGIE_LATEST is empty). Concretely, wrap the existing shell
invocation in a conditional that checks $(CHANGIE_LATEST) (or branch between a
no-op/empty echo and the git describe call) so --match is never called with '*'
while keeping the variable assignment behavior around CONTROL_PLANE_VERSION
intact.
---
Duplicate comments:
In `@server/internal/orchestrator/swarm/orchestrator.go`:
- Around line 565-574: The role identity is being keyed by
spec.ServiceSpec.ServiceID causing shared RAG roles across service instances;
update the call to ServiceUserRoleIdentifier and the ServiceUserRole fields to
use the per-instance identifier (spec.ServiceSpec.ServiceInstanceID) instead of
ServiceID so the canonicalROID and canonicalRO are scoped to the service
instance; also make the same replacement in the subsequent related construction
referenced later (the other canonical role creation around the same block) so
all lookups/creations consistently use ServiceInstanceID.
---
Nitpick comments:
In `@server/internal/orchestrator/swarm/orchestrator.go`:
- Around line 594-612: The code duplicates resource conversion logic already
implemented in buildServiceInstanceResources; remove the manual loop that calls
resource.ToResourceData over orchestratorResources and instead call
buildServiceInstanceResources(spec, orchestratorResources), propagate its
returned (*database.ServiceInstanceResources, error), and handle the error as
before (returning fmt.Errorf or the error directly). Ensure you reference
orchestratorResources and spec when calling buildServiceInstanceResources and
preserve the same error handling semantics as the surrounding function.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 439a9683-c4ea-4111-ad7e-fafa43ae3624
📒 Files selected for processing (3)
- common.mk
- server/internal/orchestrator/swarm/orchestrator.go
- server/internal/orchestrator/swarm/rag_service_user_role_test.go
✅ Files skipped from review due to trivial changes (1)
- server/internal/orchestrator/swarm/rag_service_user_role_test.go
Summary
This PR introduces RAGServiceUserRole, a new resource that creates a dedicated PostgreSQL database user for each RAG service instance running on its co-located host.
Changes
rag_service_user_role.go — New resource keyed by serviceInstanceID (not serviceID). Handles Create/Delete/Refresh lifecycle. Refresh queries pg_catalog.pg_roles to verify role existence. connectToColocatedPrimary filters instances by HostID before the Patroni primary lookup, ensuring the role lands on the correct node.
orchestrator.go — Refactored GenerateServiceInstanceResources to dispatch by ServiceType. MCP logic extracted into generateMCPInstanceResources. New generateRAGInstanceResources and shared buildServiceInstanceResources helper added.
resources.go — Registers ResourceTypeRAGServiceUserRole.
plan_update.go — Skips ServiceInstanceMonitorResource for RAG instances since no Docker container exists yet (the swarm.service_instance dependency would be unsatisfied).
Testing
Manual Verification:
Created a cluster
Created a database using the following command:
restish control-plane-local-1 create-database < ../demo/488/rag_create_db.json
rag_create_db.json
The database was created successfully.
Connected to the database and confirmed that the RAG service user was created.
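The existence check the summary describes (Refresh querying pg_catalog.pg_roles) can be sketched as follows. The function name and the exact query text are illustrative — the PR's implementation is not shown here — but pg_catalog.pg_roles and the rolname column are standard Postgres:

```go
package main

import "fmt"

// roleExistsQuery returns a role-existence check against the Postgres
// system catalog. Using a $1 placeholder instead of string interpolation
// keeps role names from being injectable.
func roleExistsQuery() string {
	return "SELECT EXISTS (SELECT 1 FROM pg_catalog.pg_roles WHERE rolname = $1)"
}

func main() {
	// In a real Refresh, this would run via database/sql, roughly:
	//   var exists bool
	//   err := db.QueryRowContext(ctx, roleExistsQuery(), roleName).Scan(&exists)
	fmt.Println(roleExistsQuery())
}
```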
Checklist
Notes for Reviewers
PLAT-489