saloonphp/xml-wrangler

Xpath doesn't work when `xmlns` is used

ruudk opened this issue · 5 comments

ruudk commented

Not sure if it's related to this library, but I'm trying to parse a Symfony DI Container XML file, and it works fine.

But I cannot parse it using Xpath.

Example:

$example = \Saloon\XmlWrangler\XmlReader::fromString(<<<XML
<container xmlns="http://symfony.com/schema/dic/services" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://symfony.com/schema/dic/services https://symfony.com/schema/dic/services/services-1.0.xsd">
  <services>
    <service id="service_container" class="Symfony\Component\DependencyInjection\ContainerInterface" public="true" synthetic="true"/>
    <service id="kernel" class="TicketSwap\Kernel" public="true" synthetic="true" autoconfigure="true">
      <tag name="controller.service_arguments"/>
      <tag name="routing.route_loader"/>
    </service>
  </services>
</container>
XML
);

dump($example->xpathElement('/container/services/service[@id="service_container"]')->get()); 
// empty array

When I remove xmlns="http://symfony.com/schema/dic/services" from the <container> root element, it works.

veewee commented

FYI @Sammyjo20 :

A default namespace (an xmlns declaration without a prefix) requires a prefix to be registered manually on the XPath object.
See https://github.com/veewee/xml/blob/main/docs/dom.md#namespaces
The query will need to use that prefix. For example: /prefixed:container/prefixed:services/prefixed:service[@id="service_container"]

However, if the XML declares its own prefix using xmlns:prefixed="https://", there is no need to register that namespace. The prefix can be used directly in the query. (See the bool $registerNodeNS = true flag in DOMXPath.)
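To illustrate the prefixed case, here is a minimal sketch in plain ext-dom. The `dic` prefix and the shortened document are made up for the example; the point is only that a prefix declared in the XML itself can be used in the query without calling registerNamespace.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(<<<'EOXML'
<dic:container xmlns:dic="http://symfony.com/schema/dic/services">
  <dic:services>
    <dic:service id="service_container"/>
  </dic:services>
</dic:container>
EOXML
);

// registerNodeNS defaults to true, so the in-scope namespaces of the
// context node (here the root element) are registered automatically.
$xpath = new DOMXPath($doc);
$prefixedMatches = $xpath->query('/dic:container/dic:services/dic:service[@id="service_container"]');

var_dump($prefixedMatches->length);
```

This should report one match, with no manual namespace registration involved.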

In ext-dom code:

$doc = new DOMDocument();
$doc->loadXML(<<<'EOXML'
<container xmlns="http://symfony.com/schema/dic/services" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://symfony.com/schema/dic/services https://symfony.com/schema/dic/services/services-1.0.xsd">
  <services>
    <service id="service_container" class="Symfony\Component\DependencyInjection\ContainerInterface" public="true" synthetic="true"/>
    <service id="kernel" class="TicketSwap\Kernel" public="true" synthetic="true" autoconfigure="true">
      <tag name="controller.service_arguments"/>
      <tag name="routing.route_loader"/>
    </service>
  </services>
</container>
EOXML
);

$xpath = new DOMXPath($doc, registerNodeNS: true);
$xpath->registerNamespace('prefixed', 'http://symfony.com/schema/dic/services');

var_dump($xpath->query('/prefixed:container/prefixed:services/prefixed:service[@id="service_container"]'));

There's no way around this as far as I know, which makes it annoying when having to deal with one or more unprefixed xmlns namespaces.
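One partial workaround that is sometimes used, and that nobody in this thread proposed, is to match on local-name() so the query ignores namespaces entirely. The sketch below uses a shortened version of the document and plain ext-dom; treat it as an illustration of the XPath 1.0 function, not as something xml-wrangler does.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(<<<'EOXML'
<container xmlns="http://symfony.com/schema/dic/services">
  <services>
    <service id="service_container"/>
    <service id="kernel"/>
  </services>
</container>
EOXML
);

$xpath = new DOMXPath($doc);

// Unprefixed names in XPath 1.0 only match elements in no namespace,
// so this finds nothing in a document with a default namespace.
$plain = $xpath->query('/container/services/service[@id="service_container"]');

// local-name() compares the element name without its namespace.
$byLocalName = $xpath->query(
    '/*[local-name()="container"]/*[local-name()="services"]'
    . '/*[local-name()="service"][@id="service_container"]'
);

var_dump($plain->length, $byLocalName->length);
```

The trade-off is that the query becomes verbose and will also match elements with the same local name in any other namespace, so registering a prefix remains the more precise option.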

Thanks for reporting this issue and your help @veewee - I’ll take a look on Sunday! Toon, could I maybe spoof the reader by setting a default in the reader like ['xmlns' => null]?

veewee commented

Not in the reader. But you can do whatever you want when using xml_decode.
I've written a traverser that removes all namespace information before decoding it:

use VeeWee\Xml\Dom\Traverser\Visitor\RemoveNamespaces;
use function VeeWee\Xml\Dom\Configurator\traverse;
use function VeeWee\Xml\Encoding\xml_decode;


$data = xml_decode($xml, traverse(new RemoveNamespaces()));

If you only want to remove unprefixed namespaces, I might need to introduce another option or a completely new traverser altogether.
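For anyone wondering what removing only the unprefixed namespace could look like, here is a rough sketch in plain ext-dom. It is not how RemoveNamespaces in veewee/xml is implemented, and the helper name copyWithoutDefaultNamespace is hypothetical. It copies the tree into a new document, dropping the namespace of unprefixed elements and keeping prefixed elements and attributes as they are.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, not part of veewee/xml or xml-wrangler.
function copyWithoutDefaultNamespace(DOMNode $source, DOMDocument $target): DOMNode
{
    if (!$source instanceof DOMElement) {
        // Text, comments, CDATA and similar nodes are copied unchanged.
        return $target->importNode($source, true);
    }

    // Unprefixed elements lose their namespace; prefixed ones keep it.
    $copy = (string) $source->prefix === ''
        ? $target->createElement($source->localName)
        : $target->createElementNS($source->namespaceURI, $source->nodeName);

    // xmlns declarations are not exposed as attributes, so only real attributes are copied.
    foreach ($source->attributes as $attribute) {
        if ($attribute->namespaceURI === null) {
            $copy->setAttribute($attribute->nodeName, $attribute->value);
        } else {
            $copy->setAttributeNS($attribute->namespaceURI, $attribute->nodeName, $attribute->value);
        }
    }

    foreach ($source->childNodes as $child) {
        $copy->appendChild(copyWithoutDefaultNamespace($child, $target));
    }

    return $copy;
}

$doc = new DOMDocument();
$doc->loadXML(<<<'EOXML'
<container xmlns="http://symfony.com/schema/dic/services">
  <services>
    <service id="service_container"/>
  </services>
</container>
EOXML
);

$clean = new DOMDocument();
$clean->appendChild(copyWithoutDefaultNamespace($doc->documentElement, $clean));

// The original, unprefixed query now matches against the cleaned copy.
$xpath = new DOMXPath($clean);
$matches = $xpath->query('/container/services/service[@id="service_container"]');

var_dump($matches->length);
```

Copying into a fresh document avoids mutating the original tree, at the cost of walking every node once.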

More info:
https://github.com/veewee/xml/blob/main/docs/dom.md#removenamespaces

Hey @ruudk, this should now be fixed in v0.2.0 - let me know if this fixes your problems, and thanks again for reporting this!

ruudk commented

Thank you!