
feat: add sentinel prom metrics#5283

Open
Flo4604 wants to merge 5 commits into main from 03-11-feat_cache_parsed_policies_extract_timer_pkg

Conversation

@Flo4604 (Member)

@Flo4604 Flo4604 commented Mar 11, 2026

What does this PR do?

Adds Prometheus metrics instrumentation to the Sentinel service to improve observability and monitoring capabilities.

New metrics added:

  • sentinel_engine_evaluations_total - Counts policy evaluations by type (keyauth) and result (success/denied/error/skipped)
  • sentinel_proxy_errors_total - Tracks proxy errors categorized by type (timeout, connection refused, DNS failure, etc.)
  • sentinel_upstream_response_total - Counts upstream responses by HTTP status class (2xx, 3xx, 4xx, 5xx)
  • sentinel_upstream_duration_seconds - Measures backend response latency excluding Sentinel overhead
  • sentinel_routing_instance_selection_total - Tracks instance selection outcomes
  • sentinel_routing_duration_seconds - Measures duration of routing operations

Key changes:

  • Instrumented the policy evaluation engine to track keyauth policy results with error classification
  • Added proxy error categorization for timeouts, connection failures, and DNS issues
  • Implemented upstream response tracking with latency measurements
  • Enhanced router service with policy caching and comprehensive routing metrics
  • Added timing instrumentation for deployment lookups and instance selection

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Chore (refactoring code, technical debt, workflow improvements)
  • Enhancement (small improvements)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How should this be tested?

  • Deploy Sentinel service and verify /metrics endpoint exposes new metrics
  • Send requests through Sentinel proxy and confirm metrics are incremented correctly
  • Test various error scenarios (timeouts, connection failures) to verify error classification
  • Check that policy evaluation metrics track success/denial/error states properly
  • Confirm upstream latency measurements exclude Sentinel processing overhead
  • Verify routing metrics track deployment lookups and instance selection outcomes

Checklist

Required

  • Filled out the "How to test" section in this PR
  • Read Contributing Guide
  • Self-reviewed my own code
  • Commented on my code in hard-to-understand areas
  • Ran pnpm build
  • Ran pnpm fmt
  • Ran make fmt on /go directory
  • Checked for warnings, there are none
  • Removed all console.logs
  • Merged the latest changes from main onto my branch with git pull origin main
  • My changes don't cause any responsiveness issues

Appreciated

  • If a UI change was made: Added a screen recording or screenshots to this PR
  • Updated the Unkey Docs if changes were necessary

@vercel

vercel bot commented Mar 11, 2026

The latest updates on your projects.

  • dashboard: Ready (Preview, Comment) - updated Mar 24, 2026 1:23pm (UTC)

@coderabbitai (Contributor)

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

This pull request adds comprehensive Prometheus metrics instrumentation across the Sentinel service, including engine policy evaluation tracking, middleware error categorization, proxy upstream response metrics, and router service instance selection tracking. Additionally, it introduces a new GetPolicies method in the router service with caching to optimize policy retrieval, and refactors observability metrics to use consistent namespacing while removing the environment_id label.

Changes

  • Build Configuration (svc/sentinel/engine/BUILD.bazel, svc/sentinel/middleware/BUILD.bazel, svc/sentinel/routes/proxy/BUILD.bazel, svc/sentinel/services/router/BUILD.bazel): Added metrics.go source files and Prometheus client library dependencies (prometheus, promauto) to enable metrics instrumentation across services.
  • Engine Metrics (svc/sentinel/engine/engine.go, svc/sentinel/engine/metrics.go): Instrumented keyauth policy evaluation with counters for skipped/success/error outcomes and a histogram for evaluation duration. Added classifyKeyauthError() to map keyauth-specific Sentinel auth URNs to metric result labels.
  • Middleware Error Tracking (svc/sentinel/middleware/error_handling.go, svc/sentinel/middleware/metrics.go): Introduced a proxy error counter with categorizeProxyErrorTypeForMetrics() to classify errors (timeout, connection refused/reset, DNS failure, client canceled, other) on each error occurrence.
  • Observability Refactoring (svc/sentinel/middleware/observability.go): Updated the metric namespace/subsystem to the unkey/sentinel format and removed the environment_id label from request counters, duration histograms, and active request gauges; retained the status_code, error_type, and region labels.
  • Proxy Upstream Metrics (svc/sentinel/routes/proxy/handler.go, svc/sentinel/routes/proxy/metrics.go): Added upstream response and duration metrics collection in the ModifyResponse hook using status class classification (2xx/3xx/4xx/5xx). Changed policy lookup to use RouterService.GetPolicies() instead of direct parsing.
  • Router Service Enhancement (svc/sentinel/services/router/interface.go, svc/sentinel/services/router/service.go, svc/sentinel/services/router/metrics.go): Extended router.Service with a new GetPolicies() method that caches parsed policies via an SWR pattern with a deployment-derived TTL. Instrumented instance selection and routing operations with counters (success, no_instances, deployment_not_found, error) and histograms (get_deployment, select_instance timing). Added policy cache prewarming during service initialization.

Sequence Diagram(s)

sequenceDiagram
    actor ProxyHandler as Proxy Handler
    participant RouterService as Router Service
    participant PolicyCache as Policy Cache
    participant Engine as Engine
    participant DB as Database
    
    ProxyHandler->>RouterService: GetPolicies(ctx, deployment)
    
    RouterService->>PolicyCache: SWR lookup (deploymentID)
    alt Cache Hit
        PolicyCache-->>RouterService: []*sentinelv1.Policy
    else Cache Miss
        RouterService->>DB: Get deployment details
        DB-->>RouterService: deployment config
        
        RouterService->>Engine: ParseMiddleware(SentinelConfig)
        Engine-->>RouterService: []*sentinelv1.Policy (or error)
        
        RouterService->>PolicyCache: Store policies with TTL
        PolicyCache-->>RouterService: cached
    end
    
    RouterService-->>ProxyHandler: []*sentinelv1.Policy
    ProxyHandler->>ProxyHandler: Evaluate policies using cached result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 50.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Title check (✅ Passed): The title accurately summarizes the main change: adding Prometheus metrics instrumentation to the Sentinel service. It is concise and clearly indicates the primary objective of the PR.
  • Description check (✅ Passed): The PR description is comprehensive and well-structured, covering what the PR does, the metrics added, key changes, type of change, testing approach, and checklist items. All required template sections are present and adequately filled.



Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Command failed



@Flo4604 Flo4604 force-pushed the frontline-prom-metrics branch from 61a8527 to e2de48e Compare March 12, 2026 09:53
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from 9b71f71 to 3a8cd43 Compare March 12, 2026 09:53
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from 3a8cd43 to e299fdf Compare March 12, 2026 17:14
@Flo4604 Flo4604 force-pushed the frontline-prom-metrics branch from e2de48e to 520e4c6 Compare March 12, 2026 17:14
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from e299fdf to c336f6d Compare March 13, 2026 14:34
@Flo4604 Flo4604 force-pushed the frontline-prom-metrics branch from 6ddfc5a to 71aaa96 Compare March 16, 2026 11:31
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from c336f6d to 0ba3c6e Compare March 16, 2026 11:31
@Flo4604 Flo4604 force-pushed the frontline-prom-metrics branch from 71aaa96 to d93b474 Compare March 16, 2026 11:52
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from 0ba3c6e to 7bcae14 Compare March 16, 2026 11:52
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from 7bcae14 to e4cf96a Compare March 18, 2026 14:21
@Flo4604 Flo4604 force-pushed the frontline-prom-metrics branch from d93b474 to da2d1e1 Compare March 18, 2026 14:21
@Flo4604 Flo4604 force-pushed the frontline-prom-metrics branch from da2d1e1 to 96a0215 Compare March 18, 2026 16:00
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from e4cf96a to 86ca217 Compare March 18, 2026 16:00
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from 86ca217 to f57a27c Compare March 19, 2026 11:14
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from 59db08a to bda6c5e Compare March 23, 2026 13:41
@Flo4604 Flo4604 force-pushed the frontline-prom-metrics branch from 979c6ae to f478540 Compare March 23, 2026 13:44
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from bda6c5e to 82121dd Compare March 23, 2026 13:44
@Flo4604 Flo4604 changed the title from "feat: add sentinel prometheus metrics, build info gauge, cache parsed policies, and extract shared timer" to "feat: add sentinel prom metrics" on Mar 23, 2026
@Flo4604 Flo4604 requested a review from chronark March 23, 2026 16:46
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from 82121dd to 9675019 Compare March 24, 2026 09:38
@Flo4604 Flo4604 force-pushed the frontline-prom-metrics branch from f478540 to 1226d14 Compare March 24, 2026 09:38
@Flo4604 Flo4604 force-pushed the frontline-prom-metrics branch from 1226d14 to a759368 Compare March 24, 2026 12:08
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from 9675019 to 79fdfc8 Compare March 24, 2026 12:08
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from 79fdfc8 to 14563fb Compare March 24, 2026 12:22
@Flo4604 Flo4604 force-pushed the frontline-prom-metrics branch from a759368 to ca3eab6 Compare March 24, 2026 12:22
Base automatically changed from frontline-prom-metrics to main March 24, 2026 12:31
Flo4604 added 5 commits March 24, 2026 14:20
…ract shared timer

- Add sentinel routing metrics: instance selection outcomes + duration histograms
- Add sentinel engine metrics: policy evaluation counts by type and result
- Add sentinel proxy error + upstream response metrics
- Cache parsed sentinel policies in router service to avoid proto unmarshalling on every request
- Extract shared timer package (pkg/prometheus/timer) to avoid copy-pasting duration helpers
@Flo4604 Flo4604 force-pushed the 03-11-feat_cache_parsed_policies_extract_timer_pkg branch from 14563fb to 108b602 Compare March 24, 2026 13:21
@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
svc/sentinel/routes/proxy/handler.go (1)

157-168: ⚠️ Potential issue | 🟡 Minor

Emit upstream metrics even when tracking is missing.

Handle already tolerates missing tracking context. Keeping upstreamResponseTotal and duration under if tracking != nil means those requests vanish from /metrics. Move the counter outside this block; if duration should survive too, keep a local upstream start timestamp.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@svc/sentinel/routes/proxy/handler.go` around lines 157 - 168, The metrics
emission is currently inside the tracking nil check so requests without tracking
are omitted from metrics; move the upstream metric logic out of the "if tracking
!= nil" block in the ModifyResponse function so
upstreamResponseTotal.WithLabelValues(upstreamStatusClass(resp.StatusCode)).Inc()
always runs, and for upstreamDuration use a local start timestamp (e.g. capture
a start := tracking.InstanceStart if tracking != nil else timeZero or time.Now()
at request start) so you can safely compute and call
upstreamDuration.WithLabelValues(...).Observe(...) without dereferencing
tracking; update references to
tracking.InstanceEnd/ResponseStatus/ResponseHeaders to remain guarded by the
existing tracking != nil check.
🧹 Nitpick comments (1)
svc/sentinel/middleware/observability.go (1)

21-49: Metric schema break; migrate queries in the same deploy.

These renames plus dropping environment_id will break existing PromQL/rules on these series. Roll dashboard/alert/recording-rule updates with the code, or keep a short compat window.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@svc/sentinel/middleware/observability.go` around lines 21 - 49, The metric
label changes in sentinelRequestsTotal, sentinelRequestDuration and
sentinelActiveRequests drop/rename the existing environment_id label which will
break PromQL, dashboards, alerts and recording rules; either restore the
previous label set (re-add "environment_id" to the label lists for
sentinelRequestsTotal, sentinelRequestDuration and sentinelActiveRequests) or
emit compatibility metrics alongside the new schema (duplicate the metrics with
the old label set) and coordinate deploying dashboard/alert/recording-rule
updates in the same release; update the metric definitions in observability.go
(symbols: sentinelRequestsTotal, sentinelRequestDuration,
sentinelActiveRequests) accordingly so queries continue to work during rollout.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6197a52e-8b06-4759-a28b-f23265f5aa48

📥 Commits

Reviewing files that changed from the base of the PR and between 06f906d and 108b602.

📒 Files selected for processing (14)
  • svc/sentinel/engine/BUILD.bazel
  • svc/sentinel/engine/engine.go
  • svc/sentinel/engine/metrics.go
  • svc/sentinel/middleware/BUILD.bazel
  • svc/sentinel/middleware/error_handling.go
  • svc/sentinel/middleware/metrics.go
  • svc/sentinel/middleware/observability.go
  • svc/sentinel/routes/proxy/BUILD.bazel
  • svc/sentinel/routes/proxy/handler.go
  • svc/sentinel/routes/proxy/metrics.go
  • svc/sentinel/services/router/BUILD.bazel
  • svc/sentinel/services/router/interface.go
  • svc/sentinel/services/router/metrics.go
  • svc/sentinel/services/router/service.go

Comment on lines +28 to +55
func categorizeProxyErrorTypeForMetrics(err error) string {
	if errors.Is(err, context.Canceled) {
		return "client_canceled"
	}
	if errors.Is(err, context.DeadlineExceeded) || os.IsTimeout(err) {
		return "timeout"
	}

	var netErr *net.OpError
	if errors.As(err, &netErr) {
		if netErr.Timeout() {
			return "timeout"
		}
		if errors.Is(netErr.Err, syscall.ECONNREFUSED) {
			return "conn_refused"
		}
		if errors.Is(netErr.Err, syscall.ECONNRESET) {
			return "conn_reset"
		}
	}

	var dnsErr *net.DNSError
	if errors.As(err, &dnsErr) {
		return "dns_failure"
	}

	return "other"
}

⚠️ Potential issue | 🟡 Minor

EHOSTUNREACH falls into "other".

categorizeProxyError treats host-unreachable as ServiceUnavailable, but this helper has no matching branch, so the new metric loses that signal. Better: share one typed classifier for URN/message + metric label.

As per coding guidelines, "Make illegal states unrepresentable by modeling domain with ADTs/discriminated unions and parsing inputs at boundaries into typed structures".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@svc/sentinel/middleware/metrics.go` around lines 28 - 55,
categorizeProxyErrorTypeForMetrics currently maps EHOSTUNREACH to "other",
losing the ServiceUnavailable signal; update it to detect host-unreachable
errors (check syscall.EHOSTUNREACH and inspect net.OpError.Err for EHOSTUNREACH)
and return the same metric label used by the higher-level classifier (e.g.,
"service_unavailable" or the label used by categorizeProxyError) so both metrics
and response classification stay consistent, and consider refactoring by
extracting a shared typed classifier used by both categorizeProxyError and
categorizeProxyErrorTypeForMetrics to avoid duplicated logic.

Comment on lines +31 to +42
func upstreamStatusClass(code int) string {
	switch {
	case code >= 500:
		return "5xx"
	case code >= 400:
		return "4xx"
	case code >= 300:
		return "3xx"
	default:
		return "2xx"
	}
}

⚠️ Potential issue | 🟡 Minor

upstreamStatusClass returns "2xx" for 1xx and invalid status codes.

Codes below 200 (including 1xx informational, 0, or negative values) fall through to "2xx". Consider explicit handling:

Proposed fix
 func upstreamStatusClass(code int) string {
 	switch {
 	case code >= 500:
 		return "5xx"
 	case code >= 400:
 		return "4xx"
 	case code >= 300:
 		return "3xx"
-	default:
+	case code >= 200:
 		return "2xx"
+	default:
+		return "other"
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@svc/sentinel/routes/proxy/metrics.go` around lines 31 - 42, The
upstreamStatusClass function currently treats any code <300 that isn't
3xx/4xx/5xx as "2xx", which mislabels 1xx and invalid codes; update
upstreamStatusClass to explicitly handle 100-199 as "1xx", 200-299 as "2xx" and
return a clear fallback (e.g., "unknown" or "invalid") for codes <=0 or outside
1xx-5xx, and adjust the switch/conditions accordingly in the upstreamStatusClass
function so callers receive correct status buckets.

Comment on lines +239 to +247
sentinelInstanceSelectionTotal.WithLabelValues("error").Inc()
return db.Deployment{}, fault.Wrap(err,
fault.Code(codes.Sentinel.Internal.InternalServerError.URN()),
fault.Internal("failed to get deployment"),
)
}

if hit == cache.Null || db.IsNotFound(err) {
sentinelInstanceSelectionTotal.WithLabelValues("deployment_not_found").Inc()

⚠️ Potential issue | 🟠 Major

deployment_not_found isn't an instance-selection outcome.

GetDeployment now increments sentinelInstanceSelectionTotal for lookup failures. That muddies the metric: dashboards can't tell whether routing failed before selection or during selection. Use a separate lookup counter, or add a stage label.

Also applies to: 262-262

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@svc/sentinel/services/router/service.go` around lines 239 - 247, The code
incorrectly uses sentinelInstanceSelectionTotal for deployment lookup failures;
create a separate metric (e.g., sentinelDeploymentLookupTotal) or add a stage
label to sentinelInstanceSelectionTotal and use that for lookup outcomes, then
replace the increments around GetDeployment failure branches (the blocks that
call sentinelInstanceSelectionTotal.WithLabelValues("error") and
WithLabelValues("deployment_not_found") near the cache.Null / db.IsNotFound
checks and the similar occurrence at the other location) so lookup failures
increment the new lookup metric (or use the "lookup" stage label) while
preserving sentinelInstanceSelectionTotal for true instance-selection outcomes
only.
