Replace `Reader#attribute_nodes` with a rough equivalent
Opened this issue · 3 comments
Context
Issue #2598 describes at length an incompatibility between Ruby's compacting garbage collector and libxml2's xmlTextReaderExpand
which re-uses C structs previously returned. The solution chosen and implemented in #2599 was to deprecate Reader#attribute_nodes
. This method was removed entirely in Nokogiri v1.16.0.
In #2599 I acknowledged that there may be use cases the needed the semantics of #attribute_nodes
(e.g., returning all attributes) and for which #attribute_hash
may not suffice (because the non-namespaced hash keys used may collide). I intentionally deferred introducing a replacement until/unless we had a real-world use case that could drive out the API design.
In #2599 (comment), @ccutrer brought a real-world use case: https://github.com/instructure/ratom/blob/v0.10.11/lib/atom/xml/parser.rb#L103
So let's do this!
API design
I would like to play with a few options for retrieving the namespaced attributes from a Reader cursor.
Some ideas:
- Mimic
Node#attribute_nodes
with a newReader#attribute_nodes
and return anArray<XML::Attr>
. This might be a challenging implementation because we'll be re-complicating the XML::Attr lifecycle again, but it's worth attempting to see if something simple like an ephemeral XML::Document would work. - Mirror
Reader#attribute_hash
withReader#namespaced_attribute_hash
which returns aHash<String => String>
in which the hash key is the attribute name including the namespace prefix if any, and the hash value is the attribute value.
@ccutrer I'm curious what you think the API might look like?
Cleanup
While we're here, we should make sure to delete the additional GC lifecycle handling in xml_node.c
for attributes. I deleted the #attribute_nodes
method but didn't go further than that.
#3 is fine with me, as long as I also have any easy way to get from the namespace prefix to the namespace's href. Though maybe the key should be a pair of strings? So I don't have to split on ":" immediately to get back to the prefix itself? Or maybe not. Gotta consider attributes without a namespace. The ratom code actually re-combines it and joins with ":" for one of its branches.
@ccutrer I'm not sure I understand - you say #3
but I only listed two ideas?
It was meant as a "or an alternative you didn't mention." Sorry for any confusion