[RPC][Tracker] Reject invalid tracker message sizes and consume frame header#19591

Closed
cchung100m wants to merge 2 commits into apache:main from cchung100m:issue-19585
Conversation

@cchung100m
Contributor

Hi Committers,

This PR fixes #19585.

Root Cause

  • TCPEventHandler.on_message parsed the 4-byte int32 length header directly from the accumulated buffer without any size limit, and left the header bytes in the buffer until the full payload arrived.
  • If the header decoded to 0 or to an extremely large (e.g., 0x7FFFFFFF) value, self._data could grow without bound (or the header would be repeatedly re-read without being consumed), leading to OOM or denial-of-service.

Solution

  • Introduce MAX_TRACKER_MSG_BYTES = 1 << 20 (1 MiB).
  • After detecting at least 4 bytes in the buffer, read and immediately delete the 4-byte header.
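For readers who want to see the shape of the fix, here is a minimal sketch of the two steps listed above. It is not the actual patch to python/tvm/rpc/tracker.py. The FrameDecoder class, its feed method and its closed flag are hypothetical names, and the little-endian "<i" header format and the decision to drop the connection on a non-positive or oversized length are assumptions made for illustration.

```python
import struct

MAX_TRACKER_MSG_BYTES = 1 << 20  # 1 MiB, the limit named in the PR description


class FrameDecoder:
    """Hypothetical stand-in for the tracker's per-connection buffer logic."""

    def __init__(self):
        self._data = bytearray()
        self._msg_size = None  # None means no header has been consumed yet
        self.closed = False

    def feed(self, chunk):
        """Append bytes and return the complete payloads decoded so far."""
        out = []
        if self.closed:
            return out
        self._data += chunk
        while True:
            if self._msg_size is None:
                if len(self._data) < 4:
                    break
                # Assumption: 4-byte little-endian signed length header.
                size = struct.unpack("<i", self._data[:4])[0]
                # Consume the header immediately so it is never re-read.
                del self._data[:4]
                if size <= 0 or size > MAX_TRACKER_MSG_BYTES:
                    # Reject invalid sizes instead of buffering more data.
                    self.closed = True
                    self._data.clear()
                    break
                self._msg_size = size
            if len(self._data) < self._msg_size:
                break
            out.append(bytes(self._data[: self._msg_size]))
            del self._data[: self._msg_size]
            self._msg_size = None
        return out
```

With this shape, a header of 0 or 0x7FFFFFFF closes the decoder right away, and the buffer can never hold more than one bounded payload plus whatever arrived in the same chunk.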

@gemini-code-assist Bot left a comment

Code Review

This pull request enhances the RPC tracker's robustness by introducing a maximum message size limit (MAX_TRACKER_MSG_BYTES) and refactoring the message processing loop in python/tvm/rpc/tracker.py to validate incoming data. The review feedback suggests improving the loop termination condition by checking the socket state and recommends wrapping the JSON parsing and handler execution in a try...except block to prevent crashes from malformed inputs.
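The try...except recommendation in the review summary could look roughly like the sketch below. The dispatch function and its handler and close parameters are hypothetical names used only for illustration; the review does not say which exception types to catch, so the tuple here is an assumption.

```python
import json


def dispatch(raw, handler, close):
    """Decode one payload and hand it to handler; close on malformed input.

    Returns True when the handler ran, False when the payload was rejected.
    """
    try:
        handler(json.loads(raw.decode("utf-8")))
    except (ValueError, KeyError, TypeError):
        # json.JSONDecodeError and UnicodeDecodeError are ValueError subclasses.
        # Assumption: a bad payload should drop the connection, not the process.
        close()
        return False
    return True
```

Catching a narrow set of exceptions keeps genuine programming errors visible while still preventing a single malformed message from crashing the tracker loop.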

Comment thread python/tvm/rpc/tracker.py
@@ -222,18 +224,21 @@ def on_message(self, message):

while True:

medium

The while True loop continues processing messages from the internal buffer even if the connection is closed during an iteration (e.g., by self.close() in the previous iteration or inside call_handler). Changing this to while self._sock: ensures that the loop terminates immediately if the connection is no longer active.

Suggested change
- while True:
+ while self._sock:
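To illustrate why the loop condition matters, here is a small self-contained sketch. The Conn class, its drain method and the pre-split list of frames are hypothetical; only the _sock attribute, close() and call_handler names come from the review comment, and their behavior here is assumed.

```python
class Conn:
    """Hypothetical connection that processes already-decoded frames."""

    def __init__(self, frames):
        self._sock = object()  # placeholder for a live socket
        self._frames = list(frames)
        self.handled = []

    def close(self):
        # Assumption: closing clears the socket reference.
        self._sock = None

    def call_handler(self, frame):
        self.handled.append(frame)
        if frame == "close":
            self.close()

    def drain(self):
        # Stop as soon as a handler has closed the connection.
        while self._sock and self._frames:
            self.call_handler(self._frames.pop(0))


conn = Conn(["a", "close", "b"])
conn.drain()
```

Under these assumptions, the frame queued after "close" is never handled, whereas a while True loop would keep dispatching buffered frames on a dead connection.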

Comment thread python/tvm/rpc/tracker.py Outdated
@cchung100m cchung100m closed this May 21, 2026
@cchung100m cchung100m deleted the issue-19585 branch May 21, 2026 06:33
@cchung100m
Contributor Author

Closing this PR because the original issue is tracked by #19586.

Development

Successfully merging this pull request may close these issues.

[Bug][RPC] RPC tracker buffers unbounded data on a single TCP connection
