Fix: libcib: Prevent based or cibadmin from crashing when handling an XPath query for an attribute#3980
Conversation
|
@gao-yan Sorry for the inactivity on this PR. There have been a lot of XML-related refactorings lately, but I see that the bug is still present. Would you be willing to update your patches against main? I am hoping to get this into 3.0.2-final. |
|
@clumens, thanks for paying attention. I'm revisiting this right away. |
… XPath query for an attribute With 5fe98f4 in 3.0.1, if an XPath query for an attribute resulted in a single match, for example: $ cibadmin --query --xpath "/cib/@crm_feature_set" , based would encounter a fatal assertion in pcmk__xml_copy(): error: pcmk__xml_copy: Triggered fatal assertion at xml.c:844 : src->type == XML_ELEMENT_NODE The assertion wouldn't be triggered if there were multiple matches for an attribute, despite the fact that only the last match would be displayed: $ cibadmin --query --xpath "//@uname" <xpath-query uname="node2"/> But in case --node-path option was anyhow specified, cibadmin would encounter a fatal assertion in output_xml_id(): error: output_xml_id: Triggered fatal assertion at cibadmin.c:865 : id != NULL If --no-children option was anyhow specified, the attribute name would be displayed as an element name and the value wouldn't be shown: $ cibadmin --query --xpath "/cib/@crm_feature_set" --no-children <crm_feature_set/> This commit fixes the issues by handling non-element matches separately. By creating an "xpath-query" XML element for a single match, it's displayed in the same format as for the last match in case of multiple matches so far: $ cibadmin --query --xpath "/cib/@crm_feature_set" <xpath-query crm_feature_set="3.20.5"/>
9efad10 to
686bf5f
Compare
|
Updated the patches based on main now. |
| pcmk__debug("Processing %s op for %s with %s", op, xpath, path); | ||
| free(path); | ||
|
|
||
| if (match->type != XML_ELEMENT_NODE) { |
There was a problem hiding this comment.
The assertion wouldn't be triggered if there were multiple matches for an
attribute, despite the fact that only the last match would be displayed:
This behavior is pretty strange to me. But it's because *answer != NULL, so pcmk__xml_copy() doesn't hit the assertion:
if (parent == NULL) {
xmlDoc *doc = NULL;
// The copy will be the root element of a new document
pcmk__assert(src->type == XML_ELEMENT_NODE);
doc = pcmk__xml_new_doc();
copy = xmlDocCopyNode(src, doc, 1);
pcmk__mem_assert(copy);
xmlDocSetRootElement(doc, copy);
} else {
copy = xmlDocCopyNode(src, parent->doc, 1);
pcmk__mem_assert(copy);
xmlAddChild(parent, copy);
}
There was a problem hiding this comment.
Now a matching attribute is displayed with the corresponding element name
and the ID if present:$ cibadmin --query --xpath "/cib/@crm_feature_set" <cib crm_feature_set="3.20.5"/>Multiple matches are all displayed:
$ cibadmin --query --xpath "//@uname" <xpath-query> <node id="1" uname="node1"/> <node id="2" uname="node2"/> <node_state id="1" uname="node1"/> <node_state id="2" uname="node2"/> </xpath-query>
I don't like that this "matched attributes" output looks the same as "matched elements" output :( But I don't have any great ideas currently. It is useful to show what element a matched attribute came from, as you have done, especially if there are multiple matches.
One possible alternative would be some other element type like <xpath-attr-match> with attributes to indicate the element name and ID, as shown below.
<xpath-query>
<xpath-attr-match element-name="node" element-id="1" uname="node1"/>
...
</xpath-query>
This unfortunately assumes that the matched attribute isn't named element-name or element-id.
(I had another idea, but I forgot it before I could write it down :). I don't think it was a great idea either.)
I also wish that a single match were displayed in the same way as multiple matches (inside an <xpath-query> wrapper). But your proposed approach is consistent with the way we already handle single-vs-multiple element matches, so that's okay.
Note that your proposed approach can produce duplicate match entries if we match both an element and an attribute:
$ CIB_file=cts/cli/crm_mon.xml cibadmin --query --xpath "//@uname[. = 'cluster02']|//node[@uname = 'cluster02']"
<xpath-query>
<node id="2" uname="cluster02"/>
<node id="2" uname="cluster02"/>
<node_state id="2" uname="cluster02"/>
</xpath-query>
In the above example, we matched the <node id="2" uname="cluster02"/> element. We also matched the uname="cluster02" attribute belonging to that element. But we output the attribute with the element name and ID, so we get duplicate lines. And the <node> element has no attributes besides id and uname. Again, the matched element is indistinguishable from the matched attribute.
@clumens Do you have any thoughts here?
There was a problem hiding this comment.
Just a few quick thoughts...
- I've never used
cibadmin --xpathbefore, so I have no expectations on how it's supposed to work. - We do not have any cts regression tests anywhere that use
cibadmin --xpath, which seems like an oversight. - The
crm_failcounttool processes output, but I haven't dug into the details of what it's looking for. Additionally,pcsappears to process the output as well. It stands to reason that there are other tools out there relying on the output as well, so we need to be careful not to break them.
One possible alternative would be some other element type like
<xpath-attr-match>with attributes to indicate the element name and ID, as shown below.<xpath-query> <xpath-attr-match element-name="node" element-id="1" uname="node1"/> ... </xpath-query>This unfortunately assumes that the matched attribute isn't named
element-nameorelement-id.
I don't especially like this because it feels like we're processing the output and imposing additional structure on it, which seems like a business we don't want to be in.
I also wish that a single match were displayed in the same way as multiple matches (inside an
<xpath-query>wrapper). But your proposed approach is consistent with the way we already handle single-vs-multiple element matches, so that's okay.
I do like this. It's consistent and it makes parsing easier for whoever is consuming this output.
Note that your proposed approach can produce duplicate match entries if we match both an element and an attribute:
$ CIB_file=cts/cli/crm_mon.xml cibadmin --query --xpath "//@uname[. = 'cluster02']|//node[@uname = 'cluster02']" <xpath-query> <node id="2" uname="cluster02"/> <node id="2" uname="cluster02"/> <node_state id="2" uname="cluster02"/> </xpath-query>In the above example, we matched the
<node id="2" uname="cluster02"/>element. We also matched theuname="cluster02"attribute belonging to that element. But we output the attribute with the element name and ID, so we get duplicate lines. And the<node>element has no attributes besidesidanduname. Again, the matched element is indistinguishable from the matched attribute.
This feels like user error to me - they specified a confusing or incorrect query, so they get confusing or incorrect results.
There was a problem hiding this comment.
- The
crm_failcounttool processes output, but I haven't dug into the details of what it's looking for.
It's querying element XPaths and grabbing the nvpair values where the name matches a particular prefix.
# Build xpath to match all transient node attributes with prefix
QAS_XPATH="/cib/status/node_state[@uname='${QAS_TARGET}']"
QAS_XPATH="${QAS_XPATH}/transient_attributes/instance_attributes"
QAS_XPATH="${QAS_XPATH}/nvpair[starts-with(@name,'$QAS_PREFIX')]"
...
QAS_ALL=$(cibadmin --query --xpath="$QAS_XPATH" 2>/dev/null)
...
# Extract the attribute values (one per line) from the output
QAS_VALUE=$(echo "$QAS_ALL" | sed -n -e \
's/.*<nvpair.*value="\([0-9][0-9]*\|INFINITY\)".*>.*/\1/p')
The rabbitmq-cluster resource agent -- the only one that uses cibadmin with the --xpath/-A option -- also queries for an element. pcs also only queries for elements.
I don't especially like this because it feels like we're processing the output and imposing additional structure on it, which seems like a business we don't want to be in.
We're also doing that if we display the original element. In that case, we're imposing additional structure on the output in a hidden or misleading way. What we matched is the attribute, not the element.
We don't actually need to show the element name or element ID at all. We could just have elements like <attr-match name="some_name" value="some_value"/>. Showing the element name/ID seems to be just a stylistic choice that gao-yan was proposing.
I do like this. It's consistent and it makes parsing easier for whoever is consuming this output.
I don't think this would break anything in pcs, resource-agents, or crm_failcount... so it is a possibility, if we're brave.
This feels like user error to me - they specified a confusing or incorrect query, so they get confusing or incorrect results.
It doesn't feel that way to me. It's perfectly reasonable to specify an XPath that matches both elements and attributes. We could say that it's a somewhat odd thing to do, but IMO it's not confusing or incorrect.
If an XPath expression expr1|expr2 matches the same element twice, the results will NOT be duplicated. If an XPath expression expr1|expr2 matches an element and one of its attributes, then under the proposed scheme the element will be displayed twice. And there's no way to distinguish between the two matches.
This would be uncommon and it wouldn't be catastrophically bad, but it doesn't feel right to me :/
There was a problem hiding this comment.
- The
crm_failcounttool processes output, but I haven't dug into the details of what it's looking for.It's querying element XPaths and grabbing the nvpair values where the name matches a particular prefix.
# Build xpath to match all transient node attributes with prefix QAS_XPATH="/cib/status/node_state[@uname='${QAS_TARGET}']" QAS_XPATH="${QAS_XPATH}/transient_attributes/instance_attributes" QAS_XPATH="${QAS_XPATH}/nvpair[starts-with(@name,'$QAS_PREFIX')]" ... QAS_ALL=$(cibadmin --query --xpath="$QAS_XPATH" 2>/dev/null) ... # Extract the attribute values (one per line) from the output QAS_VALUE=$(echo "$QAS_ALL" | sed -n -e \ 's/.*<nvpair.*value="\([0-9][0-9]*\|INFINITY\)".*>.*/\1/p')The
rabbitmq-clusterresource agent -- the only one that usescibadminwith the--xpath/-Aoption -- also queries for an element.pcsalso only queries for elements.
Right, the idea of this PR is basically to resolve the issues by handling non-element matches separately from the element ones.
I don't especially like this because it feels like we're processing the output and imposing additional structure on it, which seems like a business we don't want to be in.
We're also doing that if we display the original element. In that case, we're imposing additional structure on the output in a hidden or misleading way. What we matched is the attribute, not the element.
We don't actually need to show the element name or element ID at all. We could just have elements like
<attr-match name="some_name" value="some_value"/>. Showing the element name/ID seems to be just a stylistic choice that gao-yan was proposing.
It was more than a stylistic choice :-) The idea was also to make the results actually meaningful/informative. I can imagine how a brief "attr-match" without the context, i.e. the element name and the element ID, could be meaningless/clueless to users. It is useful to show what element a matched attribute came from, especially if there are multiple matches as you mentioned too :-)
I do like this. It's consistent and it makes parsing easier for whoever is consuming this output.
I don't think this would break anything in pcs, resource-agents, or
crm_failcount... so it is a possibility, if we're brave.This feels like user error to me - they specified a confusing or incorrect query, so they get confusing or incorrect results.
It doesn't feel that way to me. It's perfectly reasonable to specify an XPath that matches both elements and attributes. We could say that it's a somewhat odd thing to do, but IMO it's not confusing or incorrect.
If an XPath expression
expr1|expr2matches the same element twice, the results will NOT be duplicated. If an XPath expressionexpr1|expr2matches an element and one of its attributes, then under the proposed scheme the element will be displayed twice. And there's no way to distinguish between the two matches.This would be uncommon and it wouldn't be catastrophically bad, but it doesn't feel right to me :/
Well, the use case is hard to be justified but not technically incorrect either :-) How about we leave it like this until we feel the needs and have a good idea how it should be handled?
| continue; | ||
| } | ||
|
|
||
| if (pcmk__is_set(options, cib_no_children)) { |
There was a problem hiding this comment.
This patch is reasonable, but I don't think it's necessary. This behavior has been present since cib_xpath_address was added in commit dd94186, and cib_xpath_address is now deprecated. So this would be a behavioral change for a feature that's going away soon.
There was a problem hiding this comment.
It's basically fixing a buggy behavior even though the feature is deprecated. But I'm also a bit hesitant about this.
| if (match->type != XML_ELEMENT_NODE) { | ||
| if (match->type != XML_ELEMENT_NODE | ||
| && pcmk__is_set(options, cib_xpath_address)) { | ||
| /* @COMPAT cib_xpath_address is deprecated since 3.0.2 |
There was a problem hiding this comment.
IMO it would make more sense to display the matching node path of the attribute, instead of the node path of the attribute's element. So, for example,
/cib/configuration/nodes/node[@id='1']/@uname
But this matches the behavior prior to 5fe98f4. So that's fine, since this is deprecated anyway.
There was a problem hiding this comment.
Right, that was the idea :-)
gao-yan
left a comment
There was a problem hiding this comment.
Thanks for your thoughts!
| if (match->type != XML_ELEMENT_NODE) { | ||
| if (match->type != XML_ELEMENT_NODE | ||
| && pcmk__is_set(options, cib_xpath_address)) { | ||
| /* @COMPAT cib_xpath_address is deprecated since 3.0.2 |
There was a problem hiding this comment.
Right, that was the idea :-)
| continue; | ||
| } | ||
|
|
||
| if (pcmk__is_set(options, cib_no_children)) { |
There was a problem hiding this comment.
It's basically fixing a buggy behavior even though the feature is deprecated. But I'm also a bit hesitant about this.
| pcmk__debug("Processing %s op for %s with %s", op, xpath, path); | ||
| free(path); | ||
|
|
||
| if (match->type != XML_ELEMENT_NODE) { |
There was a problem hiding this comment.
- The
crm_failcounttool processes output, but I haven't dug into the details of what it's looking for.It's querying element XPaths and grabbing the nvpair values where the name matches a particular prefix.
# Build xpath to match all transient node attributes with prefix QAS_XPATH="/cib/status/node_state[@uname='${QAS_TARGET}']" QAS_XPATH="${QAS_XPATH}/transient_attributes/instance_attributes" QAS_XPATH="${QAS_XPATH}/nvpair[starts-with(@name,'$QAS_PREFIX')]" ... QAS_ALL=$(cibadmin --query --xpath="$QAS_XPATH" 2>/dev/null) ... # Extract the attribute values (one per line) from the output QAS_VALUE=$(echo "$QAS_ALL" | sed -n -e \ 's/.*<nvpair.*value="\([0-9][0-9]*\|INFINITY\)".*>.*/\1/p')The
rabbitmq-clusterresource agent -- the only one that usescibadminwith the--xpath/-Aoption -- also queries for an element.pcsalso only queries for elements.
Right, the idea of this PR is basically to resolve the issues by handling non-element matches separately from the element ones.
I don't especially like this because it feels like we're processing the output and imposing additional structure on it, which seems like a business we don't want to be in.
We're also doing that if we display the original element. In that case, we're imposing additional structure on the output in a hidden or misleading way. What we matched is the attribute, not the element.
We don't actually need to show the element name or element ID at all. We could just have elements like
<attr-match name="some_name" value="some_value"/>. Showing the element name/ID seems to be just a stylistic choice that gao-yan was proposing.
It was more than a stylistic choice :-) The idea was also to make the results actually meaningful/informative. I can imagine how a brief "attr-match" without the context, i.e. the element name and the element ID, could be meaningless/clueless to users. It is useful to show what element a matched attribute came from, especially if there are multiple matches as you mentioned too :-)
I do like this. It's consistent and it makes parsing easier for whoever is consuming this output.
I don't think this would break anything in pcs, resource-agents, or
crm_failcount... so it is a possibility, if we're brave.This feels like user error to me - they specified a confusing or incorrect query, so they get confusing or incorrect results.
It doesn't feel that way to me. It's perfectly reasonable to specify an XPath that matches both elements and attributes. We could say that it's a somewhat odd thing to do, but IMO it's not confusing or incorrect.
If an XPath expression
expr1|expr2matches the same element twice, the results will NOT be duplicated. If an XPath expressionexpr1|expr2matches an element and one of its attributes, then under the proposed scheme the element will be displayed twice. And there's no way to distinguish between the two matches.This would be uncommon and it wouldn't be catastrophically bad, but it doesn't feel right to me :/
Well, the use case is hard to be justified but not technically incorrect either :-) How about we leave it like this until we feel the needs and have a good idea how it should be handled?
|
Is there perhaps a minimum fix we could do now to fix the segfault, and then a more comprehensive fix we could do later? I'd love to get this in for 3.0.2, which I need to be doing the -rc2 for in the next day or two. |
…bute Although cibadmin --node-path option is deprecated, the previous behavior is restored by handling the corresponding element of a non-element match instead of returning nothing: $ cibadmin --query --xpath "//@uname" --node-path /cib/configuration/nodes/node[@id='1'] /cib/configuration/nodes/node[@id='2'] /cib/status/node_state[@id='1'] /cib/status/node_state[@id='2']
686bf5f to
1514ed2
Compare
|
Sure thing. There are only the necessary fixes now. |
With 5fe98f4 in 3.0.1, if an XPath query for an attribute resulted in a
single match, for example:
, based would encounter a fatal assertion in pcmk__xml_copy():
The assertion wouldn't be triggered if there were multiple matches for an
attribute, despite the fact that only the last match would be displayed:
But in case --node-path option was anyhow specified, cibadmin would
encounter a fatal assertion in output_xml_id():
If --no-children option was anyhow specified, the attribute name would be
displayed as an element name and the value wouldn't be shown:
This commit fixes the issues by handling non-element matches separately.
Besides, instead of being displayed with a vague element name "xpath-query",
now a matching attribute is displayed with the corresponding element name
and the ID if present, so that the output is meaningful:
Multiple matches are all displayed:
Also, although cibadmin --node-path option is deprecated, the previous
behavior is restored by handling the corresponding element of a
non-element match instead of returning nothing:
And --node-path option is respected even if --no-children is also specified.