Skip to content

WIP: Massive pacemaker-based refactors#4011

Draft
nrwahl2 wants to merge 202 commits intoClusterLabs:mainfrom
nrwahl2:nrwahl2-based
Draft

WIP: Massive pacemaker-based refactors#4011
nrwahl2 wants to merge 202 commits intoClusterLabs:mainfrom
nrwahl2:nrwahl2-based

Conversation

@nrwahl2
Copy link
Copy Markdown
Contributor

@nrwahl2 nrwahl2 commented Dec 24, 2025

No description provided.

@nrwahl2 nrwahl2 force-pushed the nrwahl2-based branch 3 times, most recently from deaa1a7 to bc1dd5d Compare December 31, 2025 07:31
@nrwahl2 nrwahl2 force-pushed the nrwahl2-based branch 3 times, most recently from 852e402 to 5241abc Compare January 3, 2026 06:14
@nrwahl2 nrwahl2 mentioned this pull request Jan 3, 2026
@nrwahl2 nrwahl2 force-pushed the nrwahl2-based branch 10 times, most recently from 3416148 to 2dd98b1 Compare January 8, 2026 20:01
Comment thread daemons/based/based_remote.c Outdated
* reference to the client's \c GSource has been removed. There is likely
* only one reference when we call this function, and thus the client is
* likely to be freed before we return. The current GLib code looks as if
* this is always the case. However, that is a GLib implementation detail.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

g_main_context_dispatch() dispatches all pending sources in a batch before moving to the next main loop iteration. If we're running this function, then we're running it as a result of a particular source being dispatched. So I think there may be another reference to the source we're removing if it's also being dispatched as part of this batch.

So I'll weaken this assumption. But I still don't think we need to wait for remote clients to be destroyed by the callback before dropping clients. (See a couple of commits ahead of here.)

nrwahl2 added 29 commits May 2, 2026 23:54
Replace it with a boolean "modifies_cib".

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Using include-what-you-use.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
We return early if cib_discard_reply is set.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Use cib_api_operations_t:query() instead.

A CIB query should always be processed locally. Except for brief
intervals when changes are being processed, the CIB contents should be
the same on all nodes within a cluster partition.

As of this commit, a query is in fact always processed locally. The host
argument of query_from is ignored.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
parse_peer_options() is the only place where needs_reply can get set to
true. If we're in stand-alone mode, we can't receive a message from a
cluster peer.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
...meaning after the "if (!process)" guard.

This is an incremental change to make other changes easier. This changes
behavior in a small way -- if cib_status is not OK, we will return an
error instead of calling process_ping_reply(), which would sync our CIB.
However, returning an error seems more correct.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This doesn't change behavior, but it facilitates an upcoming change.
There are only two situations in which parse_peer_options() returns
false. In both cases, we're supposed to ignore the request. An upcoming
commit will make parse_peer_options() return void, and we'll make
decisions about how to proceed based on the values of process,
needs_reply, and local_notify. At that point, we don't want to check
cib_status for the two cases that we're supposed to ignore.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The meaning of the return code was never documented, and I've found it
confusing. It meant "if false, don't do anything else -- don't check the
CIB status, don't call a processor function, don't send a reply, and
don't notify a local client."

IMO it's simpler to let the existing logic in based_handle_request()
handle these cases. We'll end up returning without doing anything if
appropriate.

The "return true" cases don't need any modification other than changing
to "return". The "return false" cases need to negate the default values
by setting process and needs_reply to false.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This is intended to be equivalent to the existing code.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The leak was introduced by 2c4737b.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
It always returned true.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Drop the NULL check since we have only a single caller that guarantees
non-NULL. Use data->len rather than calling strlen().

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The struct was created using calloc() (via pcmk__assert_alloc()), so we
can assume these arrays are already zeroed.

I've kept the line that sets msg->sender.id = 0, just so that its value
is explicitly documented.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Also, it should be functionally impossible for local_name_len to
overflow a uint32_t, but add an assertion due to type differences.

The memcpy() call is safe if local_name_len is 0; it will do nothing.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The only caller outside of a unit test passes 0.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Pull it into pcmk__cpg_send_xml().

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
There are nine calls and none of them check the return value.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Of the 36 calls, only 6 checked the return value. We could aim to check
the return value in more places, but currently it's impossible for it to
return false unless Pacemaker was built without Corosync support (that
is, without support for any cluster layer), which in turn is currently
not allowed.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
We modify and reuse request XML elsewhere. We might as well do it here,
as it makes the function much shorter.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The messages at the beginning of based_handle_request() give us
basically the same info as the messages we're dropping from
log_local_options() and parse_peer_options(). Those seem more useful
anyway, because they provide context for any other messages we log in
based_handle_request().

Also add a brief message for forwarding a request, since we return
immediately after forwarding.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Send a reply even if the cib_discard_reply flag is set. The client will
still discard the reply. (TODO: The client discards the reply for sync
requests but still needs to discard it for async.)

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status: in progress PRs that aren't ready yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants