
feat: emit raw block numbers for lag tracking across all chains#59

Merged
smypmsa merged 13 commits into master from feat/block-lag-metric on Apr 16, 2026

Conversation


@smypmsa smypmsa commented Apr 12, 2026

Why

Grafana needs raw block numbers per provider to compute block lag (how far each provider is behind the tip).

What

Piggyback on existing latency calls to capture the block number and emit it as a new metric_type=block_number on the existing response_latency_seconds measurement.

| Chain | Method reused | Field captured |
| --- | --- | --- |
| EVM (Ethereum, Base, BSC, Arbitrum, Monad, Hyperliquid) | `eth_blockNumber` | hex `result` |
| Solana | `getLatestBlockhash` | `result.context.slot` |
| TON | `getMasterchainInfo` (new call) | `result.last.seqno` |

Zero extra RPC calls except TON's getMasterchainInfo — which is also a useful new TON latency metric.
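The three extraction paths from the table above can be sketched as a single standalone helper (hypothetical function; the real logic lives in per-class `_on_json_response` hooks in `common/metric_types.py`, `metrics/solana.py`, and `metrics/ton.py`):

```python
from typing import Any, Optional


def extract_block_number(chain: str, json_response: dict[str, Any]) -> Optional[int]:
    """Pull the block number / slot / seqno out of a JSON-RPC response.

    Hypothetical standalone sketch mirroring the per-chain hooks.
    """
    result = json_response.get("result")
    if chain == "evm":
        # eth_blockNumber returns a hex string like "0x14c9f64"
        if isinstance(result, str):
            try:
                return int(result, 16)
            except ValueError:
                return None
    elif chain == "solana":
        # getLatestBlockhash nests the slot under result.context.slot
        if isinstance(result, dict):
            slot = result.get("context", {}).get("slot")
            return slot if isinstance(slot, int) else None
    elif chain == "ton":
        # getMasterchainInfo nests the seqno under result.last.seqno
        if isinstance(result, dict):
            seqno = result.get("last", {}).get("seqno")
            return seqno if isinstance(seqno, int) else None
    return None
```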

Why raw, not pre-computed lag

Grafana computes max(block_number) - provider_value at query time, which keeps the tip formula (max, median-of-top-N, etc.) configurable without a redeploy.
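In Python terms, the query-time computation amounts to the following (illustrative data only; the real computation happens inside the Grafana query):

```python
# Latest block_number sample per provider, as Grafana would see them
samples = {"alchemy": 21800100, "chainstack": 21800098, "quicknode": 21800100}

tip = max(samples.values())  # the configurable "tip" formula, here plain max
lag = {provider: tip - value for provider, value in samples.items()}
# lag == {"alchemy": 0, "chainstack": 2, "quicknode": 0}
```

Swapping `max` for a median-of-top-N only changes the `tip` expression, never the stored data.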

Output

dev_response_latency_seconds,blockchain=ethereum,provider=alchemy,api_method=eth_blockNumber,metric_type=block_number value=21800100
dev_response_latency_seconds,blockchain=ton,provider=chainstack,api_method=getMasterchainInfo,metric_type=block_number value=48200000

Also bundled

Unrelated repo-hygiene changes piggybacked on the feature commits:

  • uv migration — root pyproject.toml gets a [project] block with deps, requirements.txt deleted, dashboards/ split into its own uv subproject (dashboards/pyproject.toml).
  • Docs/ignores — readme.md renamed to README.md; CLAUDE.md commands switched to uvx / uv run; .vercelignore casing fix plus new entries.

Summary by CodeRabbit

  • New Features

    • Expanded metrics: new latency metric for TON plus expanded block-number capture for EVM chains to surface block-lag metrics.
  • Bug Fixes / Reliability

    • More robust JSON-RPC/HTTP handling and clearer failure/reset behavior to improve metric accuracy and stability.
  • Documentation

    • Widespread docstring additions and clarifications across the codebase for easier maintenance and observability.
  • Chores

    • Updated aiohttp pin and lint config tweaks.

smypmsa and others added 2 commits April 12, 2026 23:31
Adds block lag tracking — how many blocks/slots each provider is behind
the consensus tip (max block seen across all providers in the same invocation).
No new RPC calls; reuses existing eth_blockNumber, getLatestBlockhash, and
adds getMasterchainInfo for TON.

Architecture:
- BaseMetric: adds _captured_block_number Optional[int] field
- HttpCallLatencyMetricBase: adds _on_json_response() hook called after each
  successful JSON-RPC response; subclasses override to extract block numbers
- EVMBlockNumberLatencyMetric (new, common/metric_types.py): shared eth_blockNumber
  metric for all EVM chains — captures hex block number; all 6 chain-specific
  HTTPBlockNumberLatencyMetric classes now inherit from this
- HTTPGetRecentBlockhashLatencyMetric (Solana): captures context.slot from
  getLatestBlockhash response (no extra call)
- HTTPGetMasterchainInfoLatencyMetric (TON, new): calls getMasterchainInfo,
  captures last.seqno; also a useful new latency metric for TON
- MetricsHandler._compute_block_lag(): post-gather step that groups instances
  by blockchain, computes tip = max block number, and stores
  (tip - provider_block) as metric_type="block_lag" on each instance

Output: Influx tag metric_type=block_lag on the existing
response_latency_seconds measurement for api_method=eth_blockNumber
(EVM/HL), getLatestBlockhash (Solana), getMasterchainInfo (TON).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai bot commented Apr 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Note

.coderabbit.yaml has unrecognized properties

CodeRabbit is using all valid settings from your configuration. Unrecognized properties (listed below) have been ignored and may indicate typos or deprecated fields that can be removed.

⚠️ Parsing warnings (1)
Validation error: Unrecognized key(s) in object: 'instructions', 'path_instructions'
⚙️ Configuration instructions
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
📝 Walkthrough

Walkthrough

Capture and emit block numbers/slots from JSON-RPC responses: introduced an overridable _on_json_response hook and centralized response processing, added EVM shared block-number metric class, created TON masterchain metric, and added logic to emit captured block numbers during metrics collection.

Changes

Cohort / File(s) Summary
Core metric infra
common/base_metric.py, common/metric_types.py, common/metrics_handler.py
Centralize JSON-RPC response processing with HttpCallLatencyMetricBase._process_response and _on_json_response hook; tighten typing; accept list params; add EVMBlockNumberLatencyMetric; add _captured_block_number lifecycle and MetricsHandler._emit_block_numbers() to include captured block numbers in emitted metrics.
EVM chain metrics
metrics/base.py, metrics/arbitrum.py, metrics/bnbsc.py, metrics/ethereum.py, metrics/hyperliquid.py, metrics/monad.py
Replace per-chain eth_blockNumber duplicates by inheriting EVMBlockNumberLatencyMetric; minor docstring and type-annotation tightening (notably Ethereum WS class signatures/timeout handling).
Non-EVM chain metrics
metrics/solana.py, metrics/ton.py, metrics/solana_landing_rate.py
Add _on_json_response to capture Solana slot and TON masterchain seqno; introduce HTTPGetMasterchainInfoLatencyMetric for TON; tighten Solana landing metric kwargs typing and add docstrings.
API handler docstrings
api/read/ton.py, api/read/arbitrum.py, api/read/base.py, api/read/bnbsc.py, api/read/ethereum.py, api/read/hyperliquid.py, api/read/monad.py, api/read/solana.py, api/write/solana.py, api/support/update_state.py
Added or updated module/class/docstrings for multiple Vercel handler modules; no functional changes.
Dashboards & tests
dashboards/grafana_sync.py, dashboards/tests/test_grafana_sync.py
Docstring additions and formatting changes in Grafana sync CLI and tests; reformatting only, behavior unchanged.
Common & config tweaks
common/__init__.py, common/factory.py, common/metric_config.py, common/state/..., config/defaults.py, pyproject.toml
Mostly docstrings, a ClassVar annotation added to factory/defaults, small typing tweaks, and Ruff config updates.
Requirements & misc
requirements.txt, .gitignore
Bump aiohttp 3.13.3 → 3.13.4; add uv.lock to .gitignore.

Sequence Diagram

sequenceDiagram
    participant Collector as MetricsCollector
    participant Metric as HttpCallLatencyMetricBase
    participant Handler as MetricsHandler
    participant Push as GrafanaPusher

    Collector->>Metric: perform HTTP JSON-RPC call
    Metric->>Metric: _process_response(json_response)
    alt json contains block/slot
        Metric->>Metric: _on_json_response(json_response)
        Metric->>Metric: set _captured_block_number
    end
    Collector->>Handler: return metric values
    Handler->>Handler: asyncio.gather(...) completes
    Handler->>Handler: _emit_block_numbers()
    Handler->>Push: update_metric_value(..., "block_number")

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through JSON, parsed seqno and slot,

I tucked them in metrics, a tidy small plot.
One shared base now, no duplicate chase,
Block numbers travel to their rightful place.
🥕 Hooray for observability hops!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check — ✅ Passed: The PR title directly and clearly summarizes the main objective: emitting raw block numbers across all chains for lag tracking, which is the core feature of this changeset.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 80.46%, which is sufficient. The required threshold is 80.00%.
  • Description Check — ✅ Passed: Check skipped — CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@common/metric_types.py`:
- Around line 306-327: The EVMBlockNumberLatencyMetric currently introduces an
inheritance layer to parse a single response field; instead, add a composable
response parser callback to HttpCallLatencyMetricBase (e.g., an optional
constructor arg or protected attribute like response_parser) that can be invoked
from HttpCallLatencyMetricBase._on_json_response, move the block-number parsing
logic (the int(hex,16) and ValueError guard that sets _captured_block_number)
into that parser, and change EVMBlockNumberLatencyMetric to register/assign that
parser while keeping its method() and get_params_from_state() behavior; ensure
you reference and preserve the existing symbols _on_json_response,
_captured_block_number, method, and get_params_from_state so per-chain
HTTPBlockNumberLatencyMetric classes no longer require extra inheritance.
- Around line 267-268: Wrap the call to self._on_json_response(json_response) in
a try/except so any exception from an override does not propagate and mark the
optional block capture as failed by setting self._captured_block_number = None;
specifically, inside fetch_data (or the method that currently calls
_on_json_response) catch Exception around the _on_json_response(json_response)
invocation, set self._captured_block_number = None in the except block, and
optionally log the error, but do not re-raise so the latency/sample path
continues.
- Around line 317-319: The override get_params_from_state currently returns
list[Any] which conflicts with the base get_params_from_state (and the instance
variable self.method_params typed as dict[str, Any]); change the type contract
to allow both JSON-RPC param shapes by updating get_params_from_state return
annotations (in base and overrides) to dict[str, Any] | list[Any] (or
Mapping[str, Any] | Sequence[Any]) and update self.method_params' type from
dict[str, Any] to the same union so mypy no longer flags an LSP-violating
override.

In `@common/metrics_handler.py`:
- Around line 48-67: In _compute_block_lag(), skip instances that indicate a
failed response when grouping and computing the tip: only include instances with
a non-null _captured_block_number AND a successful response status (e.g., check
the instance's response/status flag such as instance.response_status or
instance.status/is_successful()) when building groups and when creating
block_numbers; then compute tip = max(block_numbers) and call
instance.update_metric_value(lag, "block_lag") only for those same successful
instances. Ensure both the grouping loop and the inner lag loop apply the status
gate so failed instances never contribute to or receive block_lag.

In `@common/state/blockchain_fetcher.py`:
- Around line 127-130: The code in _fetch_evm_data constructs tx_hash by
indexing transactions[0] before checking if transactions is non-empty, causing
IndexError for empty blocks; change the logic to first check truthiness/length
of transactions (the transactions variable) and only then access transactions[0]
or call .get on it—e.g., evaluate if transactions (or len(transactions) > 0)
before attempting transactions[0] so empty lists fall through correctly to
BlockchainData.empty() without raising an exception.

In `@dashboards/tests/test_grafana_sync.py`:
- Around line 83-89: The test uses nested context managers which triggers Ruff
SIM117; combine them into a single with statement by merging the patch.object
mock and pytest.raises(SystemExit) into one line so that the call to
m.resolve_folder_uid(m.load_config()) runs inside both contexts; update the
block around the api_get patch (patch.object on m.api_get) and the pytest.raises
usage to a single with to satisfy the linter.

In `@metrics/solana.py`:
- Around line 40-47: Add a Google-style docstring to the _on_json_response
method describing its purpose (extracting the slot from a JSON RPC response and
storing it to self._captured_block_number), document the parameter json_response
with its type dict[str, Any], and state that it returns None; place the
docstring immediately under the def _on_json_response(...) signature and follow
repo PEP 8 / Google-style formatting expected for functions.

In `@metrics/ton.py`:
- Around line 15-22: Add a Google-style docstring to the _on_json_response
method describing its purpose (extracting the block seqno from a JSON-RPC
response and storing it to self._captured_block_number), documenting the
parameter json_response: dict[str, Any] and the return type None, and noting the
behavior when expected keys are missing; place the docstring immediately below
the def _on_json_response(...) line and ensure it follows PEP 8 formatting for
docstrings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4c74c9bc-2468-432b-81b0-fa3fa5256af9

📥 Commits

Reviewing files that changed from the base of the PR and between e3e0a5c and b9cfd78.

📒 Files selected for processing (16)
  • api/read/ton.py
  • api/support/update_state.py
  • common/base_metric.py
  • common/metric_types.py
  • common/metrics_handler.py
  • common/state/blockchain_fetcher.py
  • dashboards/grafana_sync.py
  • dashboards/tests/test_grafana_sync.py
  • metrics/arbitrum.py
  • metrics/base.py
  • metrics/bnbsc.py
  • metrics/ethereum.py
  • metrics/hyperliquid.py
  • metrics/monad.py
  • metrics/solana.py
  • metrics/ton.py

Comment thread common/metric_types.py Outdated
Comment on lines +306 to +327
class EVMBlockNumberLatencyMetric(HttpCallLatencyMetricBase):
    """Shared eth_blockNumber metric used by all EVM chains.

    Captures the returned block number for cross-provider lag computation.
    """

    @property
    def method(self) -> str:
        return "eth_blockNumber"

    @staticmethod
    def get_params_from_state(state_data: dict[str, Any]) -> list[Any]:
        """Return empty params list for eth_blockNumber."""
        return []

    def _on_json_response(self, json_response: dict[str, Any]) -> None:
        result = json_response.get("result")
        if isinstance(result, str):
            try:
                self._captured_block_number = int(result, 16)
            except ValueError:
                pass

@coderabbitai coderabbitai bot Apr 12, 2026


🛠️ Refactor suggestion | 🟠 Major

Prefer composition for the shared block-capture behavior.

EVMBlockNumberLatencyMetric pushes every chain-specific HTTPBlockNumberLatencyMetric through an extra inheritance layer just to parse one response field. A small parser/callback composed into HttpCallLatencyMetricBase would keep the per-chain metric classes flat and avoid growing the type hierarchy for shared capture logic.

As per coding guidelines, "Use composition over inheritance for metric classes".

🧰 Tools
🪛 Ruff (0.15.9)

[warning] 317-317: Unused static method argument: state_data

(ARG004)


[warning] 324-327: Use contextlib.suppress(ValueError) instead of try-except-pass

Replace try-except-pass with with contextlib.suppress(ValueError): ...

(SIM105)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@common/metric_types.py` around lines 306 - 327, The
EVMBlockNumberLatencyMetric currently introduces an inheritance layer to parse a
single response field; instead, add a composable response parser callback to
HttpCallLatencyMetricBase (e.g., an optional constructor arg or protected
attribute like response_parser) that can be invoked from
HttpCallLatencyMetricBase._on_json_response, move the block-number parsing logic
(the int(hex,16) and ValueError guard that sets _captured_block_number) into
that parser, and change EVMBlockNumberLatencyMetric to register/assign that
parser while keeping its method() and get_params_from_state() behavior; ensure
you reference and preserve the existing symbols _on_json_response,
_captured_block_number, method, and get_params_from_state so per-chain
HTTPBlockNumberLatencyMetric classes no longer require extra inheritance.

Member Author


Intentionally kept as inheritance: the hook pattern means each chain class only needs a one-liner subclass, while composition would require threading a parser callback through every constructor call across all chain files.

Contributor


@smypmsa, understood — the hook/template-method pattern does keep each chain's HTTPBlockNumberLatencyMetric to a single-line subclass body, which is a reasonable trade-off vs. wiring a parser callback through every constructor. I'll record this rationale.

Two minor Ruff issues from the original diff are still worth checking:

  • ARG004: state_data is unused in EVMBlockNumberLatencyMetric.get_params_from_state — you can suppress it with _state_data or a # noqa: ARG004 comment.
  • SIM105: Already addressed in the final file (uses contextlib.suppress(ValueError)). ✅

(˵ •̀ ᴗ - ˵) ✧


✏️ Learnings added
Learnt from: smypmsa
Repo: chainstacklabs/compare-dashboard-functions PR: 59
File: common/metric_types.py:0-0
Timestamp: 2026-04-13T14:43:40.770Z
Learning: In chainstacklabs/compare-dashboard-functions, EVMBlockNumberLatencyMetric intentionally uses inheritance (template method / hook pattern) over composition. Each per-chain HTTPBlockNumberLatencyMetric is a one-liner subclass; threading a parser callback through every constructor across all chain files was considered worse. The CLAUDE.md "composition over inheritance" guideline does not apply here.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: CR
Repo: chainstacklabs/compare-dashboard-functions PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:55:12.205Z
Learning: Applies to metrics/*.py : Use composition over inheritance for metric classes

Learnt from: CR
Repo: chainstacklabs/compare-dashboard-functions PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:55:12.205Z
Learning: Applies to metrics/*.py : New chain metrics must subclass BaseMetric and implement collect_metric() and process_data() methods in metrics/<chain>.py

Comment thread common/metric_types.py
Comment thread common/metrics_handler.py Outdated
Comment thread common/state/blockchain_fetcher.py Outdated
Comment thread dashboards/tests/test_grafana_sync.py Outdated
Comment thread metrics/solana.py
Comment thread metrics/ton.py
dependabot bot and others added 2 commits April 13, 2026 08:39
---
updated-dependencies:
- dependency-name: aiohttp
  dependency-version: 3.13.4
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Shift lag computation from server to Grafana query time.

Before: _compute_block_lag() grouped providers, found tip=max(block_numbers),
stored (tip - provider_block) as metric_type=block_lag. Pre-computed diff
meant losing raw data and requiring redeploy to change the tip formula.

After: _emit_block_numbers() stores the raw block number as
metric_type=block_number per provider. Grafana computes the diff at query
time using max(block_number) - provider_block_number, making the tip
formula configurable without code changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
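The after-state described in this commit can be sketched as follows (hypothetical minimal class shape; the real method lives in common/metrics_handler.py):

```python
from typing import Optional


class InstanceSketch:
    """Stand-in for a metric instance that may have captured a block number."""

    def __init__(self, blockchain: str, captured: Optional[int]) -> None:
        self.blockchain = blockchain
        self._captured_block_number = captured
        self.emitted: list[tuple[int, str]] = []

    def update_metric_value(self, value: int, metric_type: str) -> None:
        self.emitted.append((value, metric_type))


def emit_block_numbers(instances: list[InstanceSketch]) -> None:
    """Emit each raw block number as metric_type='block_number'.

    Instances that captured nothing (failed calls, non-block metrics)
    are skipped, so no grouping or tip math happens server-side.
    """
    for inst in instances:
        if inst._captured_block_number is not None:
            inst.update_metric_value(inst._captured_block_number, "block_number")
```

Unlike the earlier `_compute_block_lag()` shape, no cross-provider math happens here; the diff is left entirely to the Grafana query.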

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@common/metrics_handler.py`:
- Around line 41-46: The docstring for the helper that "Emit[s] raw block
numbers" (the function that stores metric_type="block_number") should be
converted to a Google-style docstring: add an "Args:" section describing each
parameter with types, a "Returns:" section stating the return type and meaning
(or "None" if it returns nothing), and a "Raises:" section if it can raise
exceptions; also ensure the function signature includes explicit Python type
hints for parameters and return value to match repository guidelines. Replace
the current free-form description with these sections while keeping the existing
explanation about Grafana-relative lag and skipping instances that didn't
capture a block.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b6c6c3d3-cfe4-47dd-a13c-103ba0615255

📥 Commits

Reviewing files that changed from the base of the PR and between b9cfd78 and 1de3c7d.

📒 Files selected for processing (1)
  • common/metrics_handler.py

Comment thread common/metrics_handler.py
Keep _emit_block_numbers() from feature branch over _compute_block_lag()
from master — raw block numbers emitted for Grafana-side lag computation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@smypmsa smypmsa changed the title feat: collect block lag metric across all chains feat: emit raw block numbers for lag tracking across all chains Apr 13, 2026
- Widen get_params_from_state return type to Union[dict, list] in base
  class so EVMBlockNumberLatencyMetric's list return is type-correct
  (JSON-RPC params can be positional list or named dict)
- Isolate _on_json_response hook in try/except so a hook failure cannot
  break the latency metric; logs at DEBUG and clears captured block number
- Clear _captured_block_number in mark_failure() so a failed metric
  cannot emit a stale block number via _emit_block_numbers()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
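The hook isolation and failure reset described in this commit can be sketched as (hypothetical minimal class; method names taken from the commit message):

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


class MetricSketch:
    """Minimal sketch of the failure-isolation behavior described above."""

    def __init__(self) -> None:
        self._captured_block_number: Optional[int] = None

    def _on_json_response(self, json_response: dict[str, Any]) -> None:
        # Subclasses override this; a raising override must not
        # break the latency sample itself.
        raise RuntimeError("subclass hook blew up")

    def _process_response(self, json_response: dict[str, Any]) -> None:
        try:
            self._on_json_response(json_response)
        except Exception:
            # Log at DEBUG and clear the capture; never re-raise.
            logger.debug("block-number hook failed", exc_info=True)
            self._captured_block_number = None

    def mark_failure(self) -> None:
        # A failed metric must not emit a stale block number later.
        self._captured_block_number = None
```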

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@common/metric_types.py`:
- Around line 301-307: Remove the redundant `pass` from the empty hook method
`_on_json_response(self, json_response: dict[str, Any]) -> None` in
common/metric_types.py so the method contains only its docstring; locate the
`_on_json_response` definition and delete the `pass` line, leaving the docstring
as the sole body of the method.
- Around line 329-335: In _on_json_response, replace the try/except/pass around
int(result, 16) with contextlib.suppress(ValueError) to make the failure case
explicit and the code cleaner: import contextlib.suppress, wrap the assignment
self._captured_block_number = int(result, 16) in a with suppress(ValueError):
block inside the _on_json_response method so invalid hex strings are quietly
ignored without a bare try/except.
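The suggested `contextlib.suppress` form, as a standalone sketch (hypothetical helper name):

```python
import contextlib
from typing import Optional


def parse_hex_block(result: str) -> Optional[int]:
    """Parse an eth_blockNumber hex result, quietly ignoring malformed strings."""
    captured: Optional[int] = None
    with contextlib.suppress(ValueError):
        captured = int(result, 16)
    return captured
```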

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1a26386e-ba95-438d-b871-089fbb901c81

📥 Commits

Reviewing files that changed from the base of the PR and between 1de3c7d and c24527a.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • common/base_metric.py
  • common/metric_types.py
  • requirements.txt

Comment thread common/metric_types.py Outdated
Comment thread common/metric_types.py Outdated
smypmsa and others added 2 commits April 13, 2026 16:06
- Migrate ruff config from top-level to lint. section (removes deprecation warnings)
- Exclude .claude worktree directory from linting
- Shorten HTTPBlockNumberLatencyMetric docstrings across all EVM chain files (E501)
- Shorten HTTPGetMasterchainInfoLatencyMetric docstring in ton.py (E501)
- Move inline address/topic comments to their own lines in EVM metrics (E501)
- Add noqa: E501 to long hex data strings that cannot be split
- Add missing docstrings to SolanaLandingMetric class, __init__, fetch_data, process_data

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
metrics/ethereum.py (1)

157-181: ⚠️ Potential issue | 🟠 Major

Make ws_endpoint an explicit typed parameter.

With **kwargs: object, kwargs.pop("ws_endpoint", None) is inferred as object | None, which violates strict mypy type checking. The parent class WebSocketMetric.__init__() expects ws_endpoint: Optional[str]. Promote ws_endpoint to an explicit keyword-only parameter.

Type-safe shape
     def __init__(
         self,
         handler: "MetricsHandler",  # type: ignore  # noqa: F821
         metric_name: str,
         labels: MetricLabels,
         config: MetricConfig,
-        **kwargs: object,
+        *,
+        ws_endpoint: Optional[str] = None,
     ) -> None:
@@
-        ws_endpoint = kwargs.pop("ws_endpoint", None)
         super().__init__(
             handler=handler,
             metric_name=metric_name,
             labels=labels,
             config=config,
             ws_endpoint=ws_endpoint,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics/ethereum.py` around lines 157 - 181, The __init__ method currently
pulls ws_endpoint from **kwargs which types to object and breaks mypy; change
the signature of the constructor in the class (the __init__ that takes handler:
"MetricsHandler", metric_name: str, labels: MetricLabels, config: MetricConfig)
to accept an explicit keyword-only parameter ws_endpoint: Optional[str] = None
(import Optional from typing), remove kwargs.pop("ws_endpoint", None) and
**kwargs usage, and pass ws_endpoint directly into super().__init__
(WebSocketMetric.__init__) so the parameter types match the parent's
ws_endpoint: Optional[str].
♻️ Duplicate comments (2)
dashboards/tests/test_grafana_sync.py (1)

85-91: 🧹 Nitpick | 🔵 Trivial

Combine nested context managers into one with statement.

This still triggers Ruff SIM117 and should be flattened.

Proposed fix
-    with patch.object(
-        m,
-        "api_get",
-        return_value=[{"uid": "d1", "folderTitle": "Other", "folderUid": "abc"}],
-    ):
-        with pytest.raises(SystemExit):
-            m.resolve_folder_uid(m.load_config())
+    with patch.object(
+        m,
+        "api_get",
+        return_value=[{"uid": "d1", "folderTitle": "Other", "folderUid": "abc"}],
+    ), pytest.raises(SystemExit):
+        m.resolve_folder_uid(m.load_config())

As per coding guidelines, **/*.py: "Run ruff check and ruff check --fix before completing any task."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dashboards/tests/test_grafana_sync.py` around lines 85 - 91, Flatten the
nested context managers into a single with statement to satisfy Ruff SIM117:
instead of nesting with patch.object(...) and with pytest.raises(SystemExit):,
combine them into one statement so the patch and the exception expectation are
applied in the same with block when calling
m.resolve_folder_uid(m.load_config()); update the test that patches m.api_get
and asserts a SystemExit to use a single with that includes both patch.object(m,
"api_get", return_value=[{"uid":"d1","folderTitle":"Other","folderUid":"abc"}])
and pytest.raises(SystemExit).
common/state/blockchain_fetcher.py (1)

129-135: ⚠️ Potential issue | 🟠 Major

Guard transaction list access before indexing.

At Line 130, transactions[0] is evaluated before confirming the list is non-empty, so empty blocks can raise IndexError and incorrectly fall back to empty state.

Proposed fix
             tx_hash = (
-                transactions[0].get("hash", "")
-                if isinstance(transactions[0], dict)
-                else transactions[0]
-                if transactions
-                else ""
+                transactions[0].get("hash", "")
+                if transactions and isinstance(transactions[0], dict)
+                else transactions[0] if transactions else ""
             )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@common/state/blockchain_fetcher.py` around lines 129 - 135, The current
computation of tx_hash accesses transactions[0] before confirming the list is
non-empty, which can raise IndexError for empty blocks; update the logic around
tx_hash to first check transactions truthiness (e.g., if transactions) before
indexing, then determine whether that first element is a dict and extract
.get("hash","") accordingly, ensuring no direct access to transactions[0] unless
transactions is non-empty.
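The guarded pattern generalizes to a small standalone helper. A minimal sketch (the helper name `first_tx_hash` is illustrative, not from the codebase):

```python
from typing import Any, List, Union

def first_tx_hash(transactions: List[Union[dict, str]]) -> str:
    """Return the hash of the first transaction, or "" for empty blocks."""
    # Check emptiness before any indexing so empty blocks cannot raise IndexError.
    if not transactions:
        return ""
    first: Any = transactions[0]
    # Dict entries carry a "hash" key; other entries are assumed to be bare hash strings.
    return first.get("hash", "") if isinstance(first, dict) else first
```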
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@common/metric_types.py`:
- Around line 223-230: Update the HTTP error handling so that when
response.status is one of 401, 403, 404, or 429 you clear the instance variable
_captured_block_number and return a zero-value sample instead of raising; for
all other non-200 statuses keep the existing aiohttp.ClientResponseError raise
path. Locate the branch around the response.status check in
common/metric_types.py (the block that currently raises
aiohttp.ClientResponseError), add a conditional for statuses {401,403,404,429}
to clear _captured_block_number and return the zero sample, and leave the
existing raise for other statuses so mark_failure() will still see genuine
errors.
- Around line 249-250: Remove the erroneous await around response.release()
calls in common/metric_types.py: locate both occurrences where the code does
"await response.release()" (the finally block around the request/response
handling and the retry path) and change them to synchronous calls
"response.release()" since aiohttp.ClientResponse.release() is not a coroutine;
ensure no other surrounding logic expects a coroutine (no await) and keep the
release calls inside their existing finally/retry blocks.

In `@dashboards/grafana_sync.py`:
- Line 27: Replace the one-line docstrings for each function in
dashboards/grafana_sync.py (the functions whose docstrings appear at the lines
noted in the review) with Google-style docstrings: add concise description plus
Args, Returns and Raises sections where applicable, and ensure every function
signature includes Python type hints for parameters and return types; update the
specific short docstrings at the listed locations to follow this format and keep
content equivalent to the original one-liner while adding types and sections.

In `@metrics/ethereum.py`:
- Around line 198-216: recv_with_timeout assumes websocket.recv() yields only
str; handle binary frames (bytes) by checking isinstance(message, bytes),
compute message_size = len(message) for logging, decode bytes to a str using
message.decode("utf-8", errors="replace") before returning, and keep/adjust the
return type annotation accordingly (update any typing to reflect that the
function will return a decoded str). Make the changes in the recv_with_timeout
coroutine where websocket.recv() is awaited.
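The decode step described above can be sketched as a small normalizer (a hypothetical helper, not the repo's actual `recv_with_timeout`):

```python
from typing import Union

def normalize_ws_frame(message: Union[str, bytes]) -> str:
    """Coerce a WebSocket frame to str so downstream parsing is uniform."""
    if isinstance(message, bytes):
        # Binary frames are decoded with replacement so a malformed frame
        # cannot raise UnicodeDecodeError mid-collection.
        return message.decode("utf-8", errors="replace")
    return message
```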

In `@pyproject.toml`:
- Line 9: Replace the explicit Ruff "exclude = [\".claude\"]" setting with
"extend-exclude" so you append ".claude" to Ruff's default exclude patterns
rather than overwriting them; locate the pyproject.toml entry that defines
exclude and change it to use extend-exclude = [".claude"] so default excludes
(e.g., .git, __pycache__, venv, node_modules, build/, dist/) are preserved.
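The resulting pyproject.toml entry would look roughly like this (sketch; only the relevant key shown):

```toml
[tool.ruff]
# extend-exclude appends to Ruff's built-in excludes (.git, __pycache__,
# node_modules, build, dist, ...) instead of replacing them.
extend-exclude = [".claude"]
```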

---

Outside diff comments:
In `@metrics/ethereum.py`:
- Around line 157-181: The __init__ method currently pulls ws_endpoint from
**kwargs which types to object and breaks mypy; change the signature of the
constructor in the class (the __init__ that takes handler: "MetricsHandler",
metric_name: str, labels: MetricLabels, config: MetricConfig) to accept an
explicit keyword-only parameter ws_endpoint: Optional[str] = None (import
Optional from typing), remove kwargs.pop("ws_endpoint", None) and **kwargs
usage, and pass ws_endpoint directly into super().__init__
(WebSocketMetric.__init__) so the parameter types match the parent's
ws_endpoint: Optional[str].

---

Duplicate comments:
In `@common/state/blockchain_fetcher.py`:
- Around line 129-135: The current computation of tx_hash accesses
transactions[0] before confirming the list is non-empty, which can raise
IndexError for empty blocks; update the logic around tx_hash to first check
transactions truthiness (e.g., if transactions) before indexing, then determine
whether that first element is a dict and extract .get("hash","") accordingly,
ensuring no direct access to transactions[0] unless transactions is non-empty.

In `@dashboards/tests/test_grafana_sync.py`:
- Around line 85-91: Flatten the nested context managers into a single with
statement to satisfy Ruff SIM117: instead of nesting with patch.object(...) and
with pytest.raises(SystemExit):, combine them into one statement so the patch
and the exception expectation are applied in the same with block when calling
m.resolve_folder_uid(m.load_config()); update the test that patches m.api_get
and asserts a SystemExit to use a single with that includes both patch.object(m,
"api_get", return_value=[{"uid":"d1","folderTitle":"Other","folderUid":"abc"}])
and pytest.raises(SystemExit).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: bdfa2b04-19eb-4311-8056-4364ad73f83a

📥 Commits

Reviewing files that changed from the base of the PR and between c24527a and a0e4899.

📒 Files selected for processing (34)
  • api/read/arbitrum.py
  • api/read/base.py
  • api/read/bnbsc.py
  • api/read/ethereum.py
  • api/read/hyperliquid.py
  • api/read/monad.py
  • api/read/solana.py
  • api/read/ton.py
  • api/support/update_state.py
  • api/write/solana.py
  • common/__init__.py
  • common/base_metric.py
  • common/factory.py
  • common/metric_config.py
  • common/metric_types.py
  • common/metrics_handler.py
  • common/state/__init__.py
  • common/state/blob_storage.py
  • common/state/blockchain_fetcher.py
  • config/defaults.py
  • dashboards/grafana_sync.py
  • dashboards/tests/__init__.py
  • dashboards/tests/test_grafana_sync.py
  • metrics/__init__.py
  • metrics/arbitrum.py
  • metrics/base.py
  • metrics/bnbsc.py
  • metrics/ethereum.py
  • metrics/hyperliquid.py
  • metrics/monad.py
  • metrics/solana.py
  • metrics/solana_landing_rate.py
  • metrics/ton.py
  • pyproject.toml

Comment thread common/metric_types.py
Comment on lines +223 to +230
if response.status != 200:
raise aiohttp.ClientResponseError(
request_info=response.request_info,
history=(),
status=response.status,
message=f"Status code: {response.status}",
headers=response.headers,
)
Contributor

@coderabbitai coderabbitai bot Apr 13, 2026


⚠️ Potential issue | 🟠 Major

Handle 401/403/404/429 as silent zero-value results.

This new centralized branch still raises for plan-limit / restriction statuses, so those responses will bubble into mark_failure() instead of being ignored. Return a zero sample here and clear _captured_block_number for 401/403/404/429 before raising for other HTTP errors.

Suggested direction
-            if response.status != 200:
+            if response.status in {401, 403, 404, 429}:
+                self._captured_block_number = None
+                return 0.0
+            if response.status != 200:
                 raise aiohttp.ClientResponseError(
                     request_info=response.request_info,
                     history=(),
                     status=response.status,

As per coding guidelines, "Silently ignore HTTP 401/403/404/429 errors (plan restrictions/rate limits) in error handling; zero metrics and log other errors with mark_failure()".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@common/metric_types.py` around lines 223 - 230, Update the HTTP error
handling so that when response.status is one of 401, 403, 404, or 429 you clear
the instance variable _captured_block_number and return a zero-value sample
instead of raising; for all other non-200 statuses keep the existing
aiohttp.ClientResponseError raise path. Locate the branch around the
response.status check in common/metric_types.py (the block that currently raises
aiohttp.ClientResponseError), add a conditional for statuses {401,403,404,429}
to clear _captured_block_number and return the zero sample, and leave the
existing raise for other statuses so mark_failure() will still see genuine
errors.

Member Author


The existing handle_error already gates on IGNORED_HTTP_ERRORS (401/403/404/429) and returns early; mark_failure also clears _captured_block_number so no stale block number is emitted.

Contributor


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check handle_error and mark_failure implementations in base_metric.py
ast-grep --pattern 'def handle_error($$$)'
echo "---"
ast-grep --pattern 'def mark_failure($$$)'
echo "---"
rg -n "IGNORED_HTTP_ERRORS|handle_error|mark_failure|_captured_block_number" common/base_metric.py -A5

Length of output: 4256


@smypmsa, you're right — I traced the full flow and the existing infrastructure handles this correctly:

  1. _process_response raises aiohttp.ClientResponseError for any non-200 status (including 401/403/404/429).
  2. collect_metric catches it and calls mark_failure() first — which explicitly sets self._captured_block_number = None, preventing any stale block number from being emitted.
  3. handle_error() then checks isinstance(error, aiohttp.ClientResponseError) and, if the status is in MetricsServiceConfig.IGNORED_HTTP_ERRORS, logs a warning and returns early — skipping metric submission entirely.

My original concern was unfounded. The guard is already in place. Apologies for the noise.


✏️ Learnings added
Learnt from: smypmsa
Repo: chainstacklabs/compare-dashboard-functions PR: 59
File: common/metric_types.py:224-231
Timestamp: 2026-04-13T14:44:37.911Z
Learning: In chainstacklabs/compare-dashboard-functions, HTTP 401/403/404/429 errors are silently handled by the existing BaseMetric infrastructure: `mark_failure()` clears `_captured_block_number`, and `handle_error()` gates on `MetricsServiceConfig.IGNORED_HTTP_ERRORS` returning early without submitting a metric. No special-casing is needed at the `_process_response` level in `common/metric_types.py`.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: CR
Repo: chainstacklabs/compare-dashboard-functions PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:55:12.205Z
Learning: Applies to {metrics,common}/*.py : Silently ignore HTTP 401/403/404/429 errors (plan restrictions/rate limits) in error handling; zero metrics and log other errors with mark_failure()

Learnt from: smypmsa
Repo: chainstacklabs/compare-dashboard-functions PR: 59
File: common/metric_types.py:0-0
Timestamp: 2026-04-13T14:43:40.770Z
Learning: In chainstacklabs/compare-dashboard-functions, EVMBlockNumberLatencyMetric intentionally uses inheritance (template method / hook pattern) over composition. Each per-chain HTTPBlockNumberLatencyMetric is a one-liner subclass; threading a parser callback through every constructor across all chain files was considered worse. The CLAUDE.md "composition over inheritance" guideline does not apply here.

Member Author


The existing handle_error() method already returns early for 401/403/404/429 before setting any metric value. Since mark_failure() only zeros existing keys in self.values (which is {} on a fresh instance), these errors result in no values being pushed to Grafana, matching the silent-ignore spec.

Member Author


Noted — keeping this out of scope for the block-number PR. The change flips emitted response_status from failed (value=0) to success (value=0) for 401/403/404/429 responses, and that behavior shift applies to every HttpCallLatencyMetricBase subclass — not just the new block-number capture path. Dashboards that distinguish response_status=success vs failed would see a real change. Worth its own PR with a dashboard review.

Contributor


@smypmsa, that's a well-reasoned scoping decision. The failed → success flip for response_status on 401/403/404/429 is a dashboard-observable behavior change that cuts across all HttpCallLatencyMetricBase subclasses — isolating it to its own PR with an explicit dashboard review is the right call.

Would you like me to open a follow-up GitHub issue to track this? I can include the context about the response_status label flip and the dashboard review requirement.


✏️ Learnings added
Learnt from: smypmsa
Repo: chainstacklabs/compare-dashboard-functions PR: 59
File: common/metric_types.py:230-237
Timestamp: 2026-04-15T08:56:43.964Z
Learning: In chainstacklabs/compare-dashboard-functions, changing `_process_response` in `common/metric_types.py` to return `0.0` early for HTTP 401/403/404/429 (instead of raising `aiohttp.ClientResponseError`) would flip the emitted `response_status` tag from `failed` to `success` for those responses. This affects all `HttpCallLatencyMetricBase` subclasses and any Grafana dashboards that distinguish `response_status=success` vs `response_status=failed`. Such a change requires its own PR with a dashboard review, separate from the block-number capture work.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Member Author


Okay

Contributor


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
gh issue create \
  --repo chainstacklabs/compare-dashboard-functions \
  --title "feat: return 0.0 early for HTTP 401/403/404/429 in _process_response (response_status label change)" \
  --assignee smypmsa \
  --body "## Summary

In \`common/metric_types.py\`, \`HttpCallLatencyMetricBase._process_response\` currently raises \`aiohttp.ClientResponseError\` for **all** non-200 HTTP statuses, including 401, 403, 404, and 429. The existing \`BaseMetric.handle_error()\` then gates on \`MetricsServiceConfig.IGNORED_HTTP_ERRORS\` and returns early, so no metric value is pushed to Grafana — matching the silent-ignore spec.

An alternative would be to return \`0.0\` early (and clear \`_captured_block_number\`) for those statuses directly inside \`_process_response\`. However, this would flip the emitted \`response_status\` tag from \`failed\` to \`success\` for 401/403/404/429 responses, which is a dashboard-observable behavior change.

## Scope of impact

- Affects **all** \`HttpCallLatencyMetricBase\` subclasses across every chain.
- Any Grafana dashboard panel that filters or aggregates on \`response_status=failed\` vs \`response_status=success\` would see different results.
- Requires an explicit dashboard review before merging.

## Action items

- [ ] Implement early \`return 0.0\` path for \`{401, 403, 404, 429}\` in \`_process_response\`
- [ ] Clear \`_captured_block_number\` on the early return path
- [ ] Audit all Grafana dashboards for panels using \`response_status\` label
- [ ] Update dashboard queries / alerts as needed
- [ ] Add or update tests covering the new early-return path

## References

- PR `#59` (block-number capture work): https://github.com/chainstacklabs/compare-dashboard-functions/pull/59
- Discussion comment: https://github.com/chainstacklabs/compare-dashboard-functions/pull/59#discussion_r3073618830

Requested by `@smypmsa`."

Length of output: 71


@smypmsa, I've opened a follow-up issue to track this: #61 – feat: return 0.0 early for HTTP 401/403/404/429 in _process_response (response_status label change).

It captures the scope of impact (all HttpCallLatencyMetricBase subclasses, Grafana response_status label flip) and the action items (implementation, dashboard audit, test coverage) so it's ready to pick up in a dedicated PR with a dashboard review.


🧠 Learnings used
Learnt from: smypmsa
Repo: chainstacklabs/compare-dashboard-functions PR: 59
File: common/metric_types.py:230-237
Timestamp: 2026-04-15T08:56:43.964Z
Learning: In chainstacklabs/compare-dashboard-functions, changing `_process_response` in `common/metric_types.py` to return `0.0` early for HTTP 401/403/404/429 (instead of raising `aiohttp.ClientResponseError`) would flip the emitted `response_status` tag from `failed` to `success` for those responses. This affects all `HttpCallLatencyMetricBase` subclasses and any Grafana dashboards that distinguish `response_status=success` vs `response_status=failed`. Such a change requires its own PR with a dashboard review, separate from the block-number capture work.

Learnt from: smypmsa
Repo: chainstacklabs/compare-dashboard-functions PR: 59
File: dashboards/grafana_sync.py:27-27
Timestamp: 2026-04-15T08:56:43.269Z
Learning: When reviewing Python changes in this repo, avoid asking for broad docstring expansions (e.g., adding multi-section "Args/Returns/Raises" blocks) in files that are unrelated to the PR’s feature scope. Keep one-line docstrings that already satisfy pydocstyle with the "google" convention unchanged. If wider docstring-section coverage is desired as a baseline, defer it to a dedicated docstring-sweep PR rather than expanding docstrings opportunistically during feature PR review.

Comment thread dashboards/grafana_sync.py Outdated


def load_config() -> dict:
"""Load and validate required Grafana config from environment variables."""
Contributor

@coderabbitai coderabbitai bot Apr 13, 2026


🛠️ Refactor suggestion | 🟠 Major

Convert new function docstrings to Google-style format.

The added docstrings are plain one-liners; they should follow the repository’s Google-style docstring requirement for Python functions.

As per coding guidelines, **/*.py: "Follow PEP 8 style guide and use Google-style docstrings with type hints on all functions".

Also applies to: 41-41, 49-49, 54-54, 67-67, 76-76, 85-85, 95-95, 103-103, 139-139, 176-176, 222-222, 250-250

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dashboards/grafana_sync.py` at line 27, Replace the one-line docstrings for
each function in dashboards/grafana_sync.py (the functions whose docstrings
appear at the lines noted in the review) with Google-style docstrings: add
concise description plus Args, Returns and Raises sections where applicable, and
ensure every function signature includes Python type hints for parameters and
return types; update the specific short docstrings at the listed locations to
follow this format and keep content equivalent to the original one-liner while
adding types and sections.

Member Author


One-liner docstrings are valid Google style for simple functions with no parameters to document; no multi-section format is required when there are no Args or Returns worth describing.

Contributor


@smypmsa, you're absolutely right. The Google Style Guide explicitly states that Args/Returns/Raises sections "can be omitted in cases where the function's name and signature are informative enough that it can be aptly described using a one-line docstring." The functions in grafana_sync.py clearly fall into this category. I'll update my understanding for future reviews.


✏️ Learnings added
Learnt from: smypmsa
Repo: chainstacklabs/compare-dashboard-functions PR: 59
File: dashboards/grafana_sync.py:27-27
Timestamp: 2026-04-13T14:44:24.866Z
Learning: In chainstacklabs/compare-dashboard-functions (and generally for **/*.py files), one-liner Google-style docstrings are valid and sufficient for simple, well-named functions whose name and signature are self-explanatory. Args/Returns/Raises sections are only required when parameters, return values, or exceptions need additional explanation beyond what the signature conveys. Do not flag one-liner docstrings as non-compliant with Google style when the function is simple and self-descriptive.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: CR
Repo: chainstacklabs/compare-dashboard-functions PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:55:12.205Z
Learning: Applies to **/*.py : Follow PEP 8 style guide and use Google-style docstrings with type hints on all functions

Learnt from: smypmsa
Repo: chainstacklabs/compare-dashboard-functions PR: 59
File: common/metric_types.py:0-0
Timestamp: 2026-04-13T14:43:40.770Z
Learning: In chainstacklabs/compare-dashboard-functions, EVMBlockNumberLatencyMetric intentionally uses inheritance (template method / hook pattern) over composition. Each per-chain HTTPBlockNumberLatencyMetric is a one-liner subclass; threading a parser callback through every constructor across all chain files was considered worse. The CLAUDE.md "composition over inheritance" guideline does not apply here.

Member Author


One-liner docstrings are valid Google style for trivial functions with no non-obvious parameters or return values; expanding them all to multi-section form would add boilerplate without improving readability.

Member Author


Declining for this PR — these one-liner docstrings already pass pydocstyle under convention = "google", and grafana_sync.py is unrelated to the block-number capture feature. Expanding to full Args/Returns/Raises sections here would be cosmetic noise in a feature PR. Better handled in a dedicated docstring-sweep PR across the codebase if the team wants that baseline.

Comment thread metrics/ethereum.py
Comment thread pyproject.toml Outdated
smypmsa and others added 3 commits April 13, 2026 16:23
- Remove redundant `pass` from _on_json_response base hook (docstring suffices)
- Use contextlib.suppress(ValueError) instead of try/except/pass in EVMBlockNumberLatencyMetric
- Add docstrings to _on_json_response in metrics/solana.py and metrics/ton.py
- Convert _emit_block_numbers docstring to Google-style with Returns section
- Guard transactions[0] access behind emptiness check in blockchain_fetcher.py
- Combine nested with statements in test_grafana_sync.py (SIM117)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
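The contextlib.suppress change in the first bullet follows this shape — a sketch with a hypothetical helper name; the real parsing lives in EVMBlockNumberLatencyMetric:

```python
import contextlib
from typing import Optional

def parse_hex_block_number(raw: str) -> Optional[int]:
    """Parse an eth_blockNumber hex result like "0x14c9a64" into an int."""
    block_number: Optional[int] = None
    # suppress(ValueError) replaces a try/except ValueError: pass block,
    # leaving block_number as None when the payload is not valid hex.
    with contextlib.suppress(ValueError):
        block_number = int(raw, 16)
    return block_number
```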
- Remove await from response.release() calls: aiohttp 3.13 release() is sync
- Handle binary frames in recv_with_timeout by decoding bytes to str
- Use extend-exclude instead of exclude in pyproject.toml to preserve ruff defaults

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Move _captured_block_number from BaseMetric to HttpCallLatencyMetricBase —
it's JSON-RPC-specific state and doesn't belong on WebSocket/Solana-landing
instances. MetricsHandler._emit_block_numbers now reads it via getattr so
non-HTTP instances are cleanly skipped without an isinstance check. Override
mark_failure in HttpCallLatencyMetricBase to preserve the stale-value clear.

Bump the hook-failure log from debug to warning so a broken capture is
visible in prod logs, not silent.

Drop uv.lock (was accidentally committed with requires-python=">=3.13",
contradicting the py39 target) and gitignore it to prevent recurrence.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
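The getattr-based skip described above works roughly like this (toy stand-in classes; the real types are HttpCallLatencyMetricBase and the WebSocket/Solana-landing metrics):

```python
from typing import Any, List, Optional

class HttpMetricStub:
    """Stands in for an HTTP metric that captured a block number."""
    def __init__(self, block_number: Optional[int]) -> None:
        self._captured_block_number = block_number

class WsMetricStub:
    """Stands in for a WebSocket metric with no block-number attribute."""

def collect_block_numbers(metrics: List[Any]) -> List[int]:
    collected: List[int] = []
    for metric in metrics:
        # getattr with a None default skips instances that never define the
        # attribute — no isinstance check needed.
        block_number = getattr(metric, "_captured_block_number", None)
        if block_number is not None:
            collected.append(block_number)
    return collected
```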
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
dashboards/tests/test_grafana_sync.py (1)

156-170: ⚠️ Potential issue | 🟠 Major

Add strict type annotations and fix unused parameter in fake_api_get using Python 3.9-compatible syntax.

Line 156 lacks return type annotation and has an unused cfg parameter, failing strict typing/lint gates. The proposed fix must use Union instead of | syntax to maintain Python 3.9 compatibility per coding guidelines.

Proposed fix
+from typing import Any, Union
+
-    def fake_api_get(cfg, path):
+    def fake_api_get(_cfg: Any, path: str) -> Union[list[dict[str, Any]], dict[str, Any]]:
         if "/api/search" in path and "type=dash-db&limit=200" in path:
             return [
                 {
                     "uid": "abc",
                     "title": "My Board",
                     "folderTitle": "MyFolder",
                     "folderUid": "xyz",
                 }
             ]
         if "/api/search" in path:
             return search_results
         if "/api/dashboards/uid/" in path:
             return detail
         return []
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dashboards/tests/test_grafana_sync.py` around lines 156 - 170, Annotate
fake_api_get with explicit typing compatible with Python 3.9: rename the unused
cfg parameter to _cfg (or prefix it with underscore) to mark it unused, add
typing imports (List, Dict, Any) and set the return annotation to List[Dict[str,
Any]] for the function signature def fake_api_get(_cfg: Any, path: str) ->
List[Dict[str, Any]]; do not use the | union operator (use typing.Union only if
you need multiple types elsewhere) and ensure the body still returns list of
dicts as before.
common/metric_types.py (1)

267-305: ⚠️ Potential issue | 🟠 Major

Reset connection timing state on each retry.

timing survives across retry attempts. After a 429, the next request can reuse the existing connection and skip the trace callbacks, leaving conn_start/conn_end from the first attempt in place. That makes the final rpc_time too low and can even trip the negative-latency guard on a successful retry.

Suggested fix
         async with aiohttp.ClientSession(
             trace_configs=[trace_config],
         ) as session:
             response_time = 0.0
             response = None

             for retry_count in range(MAX_RETRIES):
+                timing.clear()
                 start_time: float = time.monotonic()
                 response = await self._send_request(session, endpoint)
                 response_time = time.monotonic() - start_time

                 if response.status == 429 and retry_count < MAX_RETRIES - 1:
                     wait_time = int(response.headers.get("Retry-After", 3))
                     response.release()
                     await asyncio.sleep(wait_time)
                     continue

                 break

             conn_time = (
                 timing["conn_end"] - timing["conn_start"]
                 if "conn_start" in timing and "conn_end" in timing
-                else 0
+                else 0.0
             )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@common/metric_types.py` around lines 267 - 305, The timing dict used by
on_connection_create_start/on_connection_create_end persists across retries
causing stale conn_start/conn_end values; to fix, clear or reinitialize timing
at the start of each retry iteration (inside the for retry_count in
range(MAX_RETRIES) loop, before calling self._send_request) so each attempt gets
fresh connection timing. Update the code around timing, the
on_connection_create_start/on_connection_create_end callbacks, and the loop that
calls _send_request and computes conn_time to ensure timing is reset per
attempt.
common/base_metric.py (1)

116-132: ⚠️ Potential issue | 🟠 Major

Make ignored 401/403/404/429 paths truly silent.

These branches still emit warning logs for the exact status codes that are supposed to be treated as expected plan-limit / rate-limit cases. That will generate noisy logs for normal, non-actionable scrapes even though the metric is intentionally skipped.

Suggested fix
         if isinstance(error, aiohttp.ClientResponseError):
             status_code: int = error.status
             if status_code in MetricsServiceConfig.IGNORED_HTTP_ERRORS:
-                logging.warning(
-                    f"Ignoring HTTP error [{status_code}] for "
-                    f"{self.labels.get_prometheus_labels()}"
-                )
                 return  # Skip metric submission for ignored errors

         if isinstance(error, websockets.exceptions.InvalidStatusCode):
             status_code = error.status_code
             if status_code in MetricsServiceConfig.IGNORED_HTTP_ERRORS:
-                logging.warning(
-                    f"Ignoring WebSocket connection error [{status_code}] for "
-                    f"{self.labels.get_prometheus_labels()}"
-                )
                 return  # Skip metric submission for ignored errors

As per coding guidelines, {metrics,common}/*.py: "Silently ignore HTTP 401/403/404/429 errors (plan restrictions/rate limits) in error handling; zero metrics and log other errors with mark_failure()".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@common/base_metric.py` around lines 116 - 132, The branches that check
MetricsServiceConfig.IGNORED_HTTP_ERRORS currently still call logging.warning
and therefore produce noise; update the aiohttp.ClientResponseError and
websockets.exceptions.InvalidStatusCode branches to be truly silent for ignored
status codes by removing the logging.warning calls and returning immediately
(preserving the existing return behavior), and ensure other errors still call
mark_failure() and log as before; reference the symbols
MetricsServiceConfig.IGNORED_HTTP_ERRORS, labels.get_prometheus_labels(),
aiohttp.ClientResponseError, and websockets.exceptions.InvalidStatusCode to
locate and modify the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dashboards/tests/test_grafana_sync.py`:
- Around line 293-315: The test helpers declare unused parameters causing
ARG001; update the function signatures to mark unused args with leading
underscores so lint ignores them: change fake_api_get(cfg, path) to
fake_api_get(_cfg, path) and change fake_api_post(cfg, path, payload) to
fake_api_post(_cfg, _path, payload), leaving all internal logic (abc_call_count,
posted append, return values, and use of path/payload) unchanged; reference
functions fake_api_get and fake_api_post in your edits.
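The renames above can be sketched as follows. The bodies are hypothetical placeholders (the real fakes live in `test_grafana_sync.py`); the point is only the underscore-prefixed parameters, which satisfy Ruff's ARG001 without changing call sites.

```python
# Hypothetical sketch: unused parameters get a leading underscore so
# ARG001 (unused function argument) passes; callers are unaffected.

posted: list[dict] = []


def fake_api_get(_cfg, path: str) -> list[dict]:
    # _cfg matches the real API's signature but is never read
    return [{"uid": "abc", "path": path}]


def fake_api_post(_cfg, _path, payload: dict) -> dict:
    posted.append(payload)  # only the payload is inspected by the test
    return {"status": "ok"}


fake_api_get(None, "/dashboards")
fake_api_post(None, "/dashboards/db", {"title": "lag"})
```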

In `@pyproject.toml`:
- Line 25: The project currently suppresses ANN401 by listing it in the ignore
setting (ignore = ["D203", "D213", "ANN401"]) which permits typing.Any in
annotations and weakens type safety; remove "ANN401" from that ignore list in
pyproject.toml so static checks will enforce more precise annotations, or if
keeping it is intentional add a short justification comment in pyproject.toml
explaining why typing.Any is required for specific modules (e.g.,
dynamic/external API handling) and add targeted # type: ignore comments instead
of a global suppression.
- Line 11: The Ruff configuration uses the newer [tool.ruff.lint] namespace and
nested keys (isort, mccabe, pydocstyle, per-file-ignores) which require Ruff >=
0.4.0, so pin the ruff dev dependency to ">=0.4.0" (or a specific 0.4.x+/later
version) in your project’s dev dependencies (where dev tools are declared) so
these settings are respected; update the dependency entry that provides the ruff
package to include the version constraint and run dependency/lockfile update.
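The ANN401 trade-off mentioned in the first bullet can be illustrated with a hypothetical parser (the function and payload shape are invented for this example, not taken from the repo):

```python
from typing import Any

# `Any` in a public signature is what ANN401 flags; a precise annotation
# documents the expected payload shape and lets type checkers verify callers.


def parse_block_number_loose(payload: Any) -> int:  # flagged by ANN401
    return int(payload["result"], 16)


def parse_block_number_strict(payload: dict[str, str]) -> int:  # passes
    # EVM nodes return the block number as a hex string, e.g. "0x14c9a64"
    return int(payload["result"], 16)


print(parse_block_number_strict({"result": "0x10"}))  # 16
```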

---

Outside diff comments:
In `@common/base_metric.py`:
- Around line 116-132: The branches that check
MetricsServiceConfig.IGNORED_HTTP_ERRORS currently still call logging.warning
and therefore produce noise; update the aiohttp.ClientResponseError and
websockets.exceptions.InvalidStatusCode branches to be truly silent for ignored
status codes by removing the logging.warning calls and returning immediately
(preserving the existing return behavior), and ensure other errors still call
mark_failure() and log as before; reference the symbols
MetricsServiceConfig.IGNORED_HTTP_ERRORS, labels.get_prometheus_labels(),
aiohttp.ClientResponseError, and websockets.exceptions.InvalidStatusCode to
locate and modify the code.

In `@common/metric_types.py`:
- Around line 267-305: The timing dict used by
on_connection_create_start/on_connection_create_end persists across retries
causing stale conn_start/conn_end values; to fix, clear or reinitialize timing
at the start of each retry iteration (inside the for retry_count in
range(MAX_RETRIES) loop, before calling self._send_request) so each attempt gets
fresh connection timing. Update the code around timing, the
on_connection_create_start/on_connection_create_end callbacks, and the loop that
calls _send_request and computes conn_time to ensure timing is reset per
attempt.
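The per-attempt reset described above can be sketched as below. This is a simplified stand-in: the callbacks are invoked directly rather than wired into aiohttp's trace machinery, and `MAX_RETRIES` is a placeholder value.

```python
import time

MAX_RETRIES = 3  # placeholder for the real config value


def fetch_with_fresh_timing() -> list[dict]:
    """Reinitialize the timing dict on every attempt so a failed attempt's
    conn_start/conn_end never leaks into the next one."""
    attempts = []
    for retry_count in range(MAX_RETRIES):
        timing: dict[str, float] = {}  # fresh dict per attempt, not shared

        # Stand-ins for aiohttp's on_connection_create_start/_end callbacks.
        def on_connection_create_start() -> None:
            timing["conn_start"] = time.monotonic()

        def on_connection_create_end() -> None:
            timing["conn_end"] = time.monotonic()

        # In real code _send_request triggers these via the trace config.
        on_connection_create_start()
        on_connection_create_end()

        conn_time = timing["conn_end"] - timing["conn_start"]
        attempts.append({"retry": retry_count, "conn_time": conn_time})
    return attempts


print(len(fetch_with_fresh_timing()))  # 3
```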

In `@dashboards/tests/test_grafana_sync.py`:
- Around line 156-170: Annotate fake_api_get with explicit typing compatible
with Python 3.9: rename the unused cfg parameter to _cfg (or prefix it with
underscore) to mark it unused, add typing imports (List, Dict, Any) and set the
return annotation to List[Dict[str, Any]] for the function signature def
fake_api_get(_cfg: Any, path: str) -> List[Dict[str, Any]]; do not use the |
union operator (use typing.Union only if you need multiple types elsewhere) and
ensure the body still returns list of dicts as before.
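The requested signature can be sketched as follows (the return value is a placeholder fixture, not the test's real data):

```python
from typing import Any, Dict, List

# 3.9-compatible annotations: typing.List/Dict rather than the `|` union
# operator (3.10+); _cfg is underscore-prefixed because it is unused.


def fake_api_get(_cfg: Any, path: str) -> List[Dict[str, Any]]:
    return [{"uid": "dash-1", "path": path}]


print(fake_api_get(None, "/api/search"))
```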
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f7bfdb1e-3f08-4dcf-a7e1-c99168c60ba1

📥 Commits

Reviewing files that changed from the base of the PR and between a0e4899 and bec9564.

📒 Files selected for processing (10)
  • .gitignore
  • common/base_metric.py
  • common/metric_types.py
  • common/metrics_handler.py
  • common/state/blockchain_fetcher.py
  • dashboards/tests/test_grafana_sync.py
  • metrics/ethereum.py
  • metrics/solana.py
  • metrics/ton.py
  • pyproject.toml

Comment thread dashboards/tests/test_grafana_sync.py Outdated
Comment thread pyproject.toml
target-version = "py39"
extend-exclude = [".claude"]

[tool.ruff.lint]
Contributor

@coderabbitai coderabbitai bot Apr 15, 2026


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

What Ruff version introduced the [tool.ruff.lint] configuration namespace? When should I use [tool.ruff.isort] vs [tool.ruff.lint.isort]?

💡 Result:

The [tool.ruff.lint] configuration namespace was introduced in Ruff 0.4.0. Use [tool.ruff.lint.isort] instead of the deprecated [tool.ruff.isort] for isort-specific linting configuration. The top-level [tool.ruff.isort] still works but triggers a deprecation warning recommending migration to the lint namespace.


🏁 Script executed:

#!/bin/bash
# Description: Check Ruff version constraints

# Check pyproject.toml dependencies
rg -A 5 -B 5 "ruff" pyproject.toml

# Check if there's a requirements file or lock file
fd -e txt -e lock | xargs rg "ruff" 2>/dev/null || echo "No requirements/lock files found with ruff"

Repository: chainstacklabs/compare-dashboard-functions

Length of output: 887


Ensure Ruff version is pinned to 0.4.0 or later.

The [tool.ruff.lint] configuration namespace was introduced in Ruff 0.4.0. Your pyproject.toml uses the nested [tool.ruff.lint.*] structure (isort, mccabe, pydocstyle, per-file-ignores), which requires Ruff >= 0.4.0. Without pinning the Ruff version in your dependencies, older versions will silently ignore these lint settings.

Also applies to: 27-27, 30-30, 33-33, 36-36
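The deprecated versus current namespaces look like this (illustrative fragment only; the section contents are placeholders, not this repo's actual Ruff settings):

```toml
# Deprecated top-level form — still works on Ruff >= 0.4.0 but warns:
# [tool.ruff.isort]
# known-first-party = ["common", "metrics"]

# Current form, requiring Ruff >= 0.4.0:
[tool.ruff.lint]
select = ["E", "F", "I"]

[tool.ruff.lint.isort]
known-first-party = ["common", "metrics"]

[tool.ruff.lint.per-file-ignores]
"tests/*" = ["S101"]
```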

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` at line 11, The Ruff configuration uses the newer
[tool.ruff.lint] namespace and nested keys (isort, mccabe, pydocstyle,
per-file-ignores) which require Ruff >= 0.4.0, so pin the ruff dev dependency to
">=0.4.0" (or a specific 0.4.x+/later version) in your project’s dev
dependencies (where dev tools are declared) so these settings are respected;
update the dependency entry that provides the ruff package to include the
version constraint and run dependency/lockfile update.

Member Author


Declining — this repo doesn't manage ruff as a project dependency (no [project.dependencies] or dev-deps group for tooling). Ruff is run externally via uvx ruff / team tooling, which handles version selection outside the project's pyproject.toml. Introducing a dev-deps block just to pin ruff is scope creep for a feature PR and a broader repo-wide decision.

Contributor


@smypmsa, understood — that makes sense. Since Ruff is managed externally via uvx ruff and version selection is handled outside pyproject.toml, pinning it here would be unnecessary overhead. I'll drop this suggestion.

(。•́︿•̀。)


✏️ Learnings added
Learnt from: smypmsa
Repo: chainstacklabs/compare-dashboard-functions PR: 59
File: pyproject.toml:11-11
Timestamp: 2026-04-15T09:16:37.524Z
Learning: In chainstacklabs/compare-dashboard-functions, Ruff is not managed as a project dependency (no `[project.dependencies]` or dev-deps group in pyproject.toml). It is run externally via `uvx ruff` / team tooling, so version pinning inside pyproject.toml is not applicable. Do not suggest adding a dev-deps block or pinning ruff in pyproject.toml for this repository.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: smypmsa
Repo: chainstacklabs/compare-dashboard-functions PR: 59
File: common/metric_types.py:230-237
Timestamp: 2026-04-15T08:56:43.964Z
Learning: In chainstacklabs/compare-dashboard-functions, changing `_process_response` in `common/metric_types.py` to return `0.0` early for HTTP 401/403/404/429 (instead of raising `aiohttp.ClientResponseError`) would flip the emitted `response_status` tag from `failed` to `success` for those responses. This affects all `HttpCallLatencyMetricBase` subclasses and any Grafana dashboards that distinguish `response_status=success` vs `response_status=failed`. Such a change requires its own PR with a dashboard review, separate from the block-number capture work.

Comment thread pyproject.toml
Ruff ARG001 on fake_api_get (cfg) and fake_api_post (cfg, path) inside
test_cmd_push_uploads_changed_skips_conflicts. Underscore prefix matches
the convention for intentionally-unused arguments and satisfies the lint
rule without changing call semantics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@smypmsa smypmsa force-pushed the feat/block-lag-metric branch from e0fc51c to 294231f Compare April 15, 2026 14:10
@smypmsa smypmsa self-assigned this Apr 16, 2026
@smypmsa smypmsa merged commit 710c99c into master Apr 16, 2026
1 check passed
@smypmsa smypmsa deleted the feat/block-lag-metric branch April 16, 2026 08:39
smypmsa added a commit that referenced this pull request Apr 16, 2026
…tion

The block number feature (#59) emits metric_type=block_number on the same
response_latency_seconds measurement. Panels without a metric_type filter
pick up raw block numbers (~280M) in latency aggregations, breaking
provider scores, p95 charts, and per-method panels.

Add metric_type="response_time" to every response_latency_seconds selector
across all 31 dashboard files (8 blockchains, global + regional).
Solana transaction_landing_latency panels already had metric_type filters
and are untouched.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
smypmsa added a commit that referenced this pull request Apr 16, 2026
…tion (#62)

The block number feature (#59) emits metric_type=block_number on the same
response_latency_seconds measurement. Panels without a metric_type filter
pick up raw block numbers (~280M) in latency aggregations, breaking
provider scores, p95 charts, and per-method panels.

Add metric_type="response_time" to every response_latency_seconds selector
across all 31 dashboard files (8 blockchains, global + regional).
Solana transaction_landing_latency panels already had metric_type filters
and are untouched.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
