simplesamlphp/saml2

Make it possible to add custom extensions

jaimeperez opened this issue · 5 comments

This is a followup from #187. It's a valid point that we should allow extensions to be implemented on top of the library, without the need to implement them directly here, in the library. It's more or less the same argument for modules: we have more flexibility, people don't need to depend on us updating or releasing new modules or versions of the existing ones, and they can be better maintained.

Now, in order to allow for custom extensions to be used, we need to provide a framework developers can use for that. We have two different situations where we will handle extensions in the library: creating an object with extensions on it and marshalling into XML, and the other way around, creating an object with extensions from an existing XML document.

In the first case, we just need extensions to implement some given interface that provides the methods we need. The most important is toXML(), although possibly others in the current AbstractXMLElement are good candidates as well. I believe though we should avoid forcing people to extend our AbstractXMLElement. In the end, all we need is a guarantee that a class implements a certain set of methods, but how they do that is irrelevant for us. Considering inheritance in PHP is quite limited (e.g. a class can only extend from one other class), we shouldn't impose restrictions on the way developers design their class hierarchy.

For the second scenario, though, we need some bigger infrastructure. The problem is that when we receive an XML, we only have three pieces of information: the XML namespace of the element, the namespace prefix and the local name of the element. From that (a DOMElement) we need to resolve a class name where we can call fromXML() passing the DOMElement. Unfortunately, there's no way to resolve from the information available into a qualified class name.

I'd say we need a way to register extensions in the library. Since we already have a rudimentary dependency injection mechanism (sort of) in the Container, I think we can reuse that. Basically, any software using our SAML2 library and wanting to use third-party extensions, would need to register those extensions programmatically in the Container, so that the library can use it to resolve the tuple <NamespaceURI, ElementName> into a qualified class name.

Let me suggest the following actions:

  • Create an interface that extensions must implement. AbstractXMLElement should implement this interface for our convenience, but external extensions should be able to just implement the interface without the need to inherit from our classes. fromXML() and toXML() are the most obvious candidates for that interface, although probably the getAttribute(), getIntegerAttribute() and getBooleanAttribute() methods are useful as well. In this context, we should probably have methods as well to get the name and namespace URI for the element a given object is modeling.
  • Add methods to the ContainerInterface to register extensions and resolve from the aforementioned tuple into a qualified class name.
  • Implement those methods in our container, so that software using the library can register extensions at will and they can then be processed seamlessly by our code. I think the way extensions are registered is out of the scope for this library, but probably we should support that in SimpleSAMLphp as well.
  • Use the new extension resolution method from the Container to try to identify extensions in the SAML2\XML\md\Extensions class, check if those resolve and implement our extension interface, and then call fromXML() on the corresponding classes.
  • Eventually get rid of the list of supported extensions in the Extensionsclass and make it generic: "If we can resolve your class and it implements our interface, we'll use that, otherwise you'll be given a Chunk and it's up to you to process it".

How does this sound? Comments? Objections? Things I have missed?

Sounds great!

Great writeup, thank you @jaimeperez . I closed the original ticket, as this is roadmap we all can follow and help make real.

few questions, that I came up with thinking about implementation

  1. Should the extension be able to override handling of element provided by library itself? Ie. are the built-in classes fallback if no extension handling such data is found?
  2. Registration tuple <NamespaceURI, ElementName> is meant to be unique? (can single xml element trigger two or more extensions?)
  3. will serialization of extensions be able to overwrite/update/remove existing elements outside of self boundary (ie. touch elements in different part of DOMDocument than self and successors)
  4. Container could probably maintain unique combinations of <NamespaceURI, NamespacePrefix>, so extensions do not overwrite default prefixes (saml samlp saml2p ...) unintentionally, and register those on top-level of serialized document, instead of requiring each element to contain namespace declaration. Maybe even serving as registry, that will provide implementing extensions with generated prefixes, so there are no conflicts and/or duplicates.

Let see if I can answer your questions properly 😄

  1. The built-in classes would handle extensions we already support, but I haven't considered being able to extend those themselves. In any case, I don't see any reason why we shouldn't avoid overriding the default extensions available. Also, if no class suitable for handling a given extension is found, then it would be handled as a SAML2\XML\Chunk object.
  2. Yes. An XML element with a given name in a given namespace should always be unique. But that's of course from the XML perspective, not from the framework's, so it's a good point. For the shake of simplicity, I'd say we should assume one class per tuple. If you add more, then the last one added is used. In any case, the SAML2\XML\md\Extensions class shouldn't have to choose. If there is an element we recognise in the DOM, we should process it into one single object.
  3. I'm not sure I follow on this one, but if I understood correctly, I'd say no. This should affect only extensions added to a SAML2\XML\md\Extensions object.
  4. The prefix is irrelevant, but this raises the question of what happens if you try to reuse the same prefix for two different namespaces. This one would be hard to avoid, because then we would need to keep another registry for <namespace, prefix> tuples. The other way around, having multiple prefixes for the same namespace, I think would be legal.
    If we reuse the same extension registry to make it general and keep track of namespaces and prefixes, that would indeed tackle both problems at the same time, but then we would need to register manually all the existing combinations, plus make it impossible to redefine an entry (e.g. adding a tuple <namespace, localName, class> would have to fail).

I'm sorry for not responding, thank you for explanations.
Specific notes
3. There are two possible models of serialization, either you run toXML() on extension, and append returned data to Extensions element, or you run something as toXML($domDocument) and then the extension could probably modify whatever data in the document it wants to. Given you don't want to implement all the use-cases within library, second approach seems more dynamic and less work for you to maintain the whole thing
4. I'd vote for keeping registry of <namespace, class> with possible prefix declaration, which the extension wants to use, but if the requested prefix is already assigned, then the framework would generate/assign different one, to prevent conflicts.

I believe all of these points have been addressed in master