tDOM/tdom

Why is the maximum number of created namespaces limited to 254?


I have some SOAP documents that use a large number of namespaces, but tDOM limits the maximum number of new namespaces to 254. I'd like to know the purpose of this limit. Thank you!

It does not look like there is any reason for the restriction inside tDOM: the namespace list is grown dynamically. You could try changing the check in domNewNamespace (generic/dom.c L802) to a bigger number and see if anything breaks.

It is right that the namespace list is grown dynamically. There is nevertheless a reason for the limit. The proposed modification won't break things in the sense that tDOM crashes, but it still breaks things in a fundamental way: it just won't work correctly (with documents with more than 255 namespaces).

The reason is that the namespace 'nr' is stored in a field that is just 8 bits wide in the node and attribute structures. (Along the lines of: 255 namespaces are 'enough for everyone'.)

It's not difficult to modify tDOM to relax that limit, but this definitely needs more than changing just one line. And it will increase the memory needs of a tDOM DOM tree by a few percent.
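To make the memory remark concrete, here is a minimal sketch in C that compares two hypothetical node headers. The struct names `narrowHeader` and `wideHeader`, the field name `nsIndex`, and the helper `extraBytesPerNode` are invented for illustration and are not tDOM code. Giving the namespace index a full `unsigned int` is only one possible way to widen it, not necessarily the one tDOM would use.

```c
#include <stddef.h>

/* Hypothetical header resembling the start of a node structure:
   four 8-bit fields share a single unsigned int storage unit. */
typedef struct narrowHeader {
    unsigned int  nodeType  : 8;
    unsigned int  nodeFlags : 8;
    unsigned int  nsIndex   : 8;   /* namespace index, 8 bits wide */
    unsigned int  info      : 8;
    unsigned int  nodeNumber;
    void         *ownerDocument;
} narrowHeader;

/* Same header, but the namespace index gets a full unsigned int
   (an assumption for this sketch, one of several ways to widen it). */
typedef struct wideHeader {
    unsigned int  nodeType  : 8;
    unsigned int  nodeFlags : 8;
    unsigned int  info      : 8;
    unsigned int  nsIndex;         /* namespace index, no longer a bit-field */
    unsigned int  nodeNumber;
    void         *ownerDocument;
} wideHeader;

/* Extra bytes every node would cost with the wider layout.
   The exact value depends on the compiler's padding and alignment rules. */
size_t extraBytesPerNode(void)
{
    return sizeof(wideHeader) - sizeof(narrowHeader);
}
```

Printing `extraBytesPerNode()` on the target platform gives the per-node overhead of this particular layout. Because of alignment padding it can be larger than the four bytes the field itself needs.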

Rolf, is the problem this bit in expat.h, where they are using an unsigned int for the number of children? Or is it more than that?

typedef struct XML_cp XML_Content;

struct XML_cp {
  enum XML_Content_Type         type;
  enum XML_Content_Quant        quant;
  XML_Char *                    name;
  unsigned int                  numchildren;
  XML_Content *                 children;
};

No, not at all. It is

typedef struct domNode {

    domNodeType         nodeType  : 8;
    domNodeFlags        nodeFlags : 8;
    domNameSpaceIndex   namespace : 8;
    unsigned int        info      : 8;
    unsigned int        nodeNumber;
    domDocument        *ownerDocument;
    struct domNode     *parentNode;
    struct domNode     *previousSibling;
    struct domNode     *nextSibling;

    domString           nodeName;  /* now the element node specific fields */
    struct domNode     *firstChild;
    struct domNode     *lastChild;
#ifdef TCL_THREADS
    struct domNode     *nextDeleted;
#endif
    struct domAttrNode *firstAttr;

} domNode;

and

typedef struct domAttrNode {

    domNodeType         nodeType  : 8;
    domAttrFlags        nodeFlags : 8;
    domNameSpaceIndex   namespace : 8;
    unsigned int        info      : 8;
    domString           nodeName;
    domString           nodeValue;
    int                 valueLength;
    struct domNode     *parentNode;
    struct domAttrNode *nextSibling;

} domAttrNode;

in dom.h.

In both structures the

domNameSpaceIndex   namespace : 8;

declaration limits the number of possible different namespaces to what fits into 8 bits.
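Why merely raising the check in domNewNamespace would silently misbehave can be sketched with a stand-alone C example. The struct `demoNode`, the field name `nsIndex`, and the function `storeNamespaceIndex` are hypothetical stand-ins, not tDOM code. Only the 8-bit bit-field mirrors the declaration above.

```c
/* Hypothetical stand-in for the first storage unit of a node:
   only the bit-field widths mirror the declarations shown above. */
typedef struct demoNode {
    unsigned int nodeType  : 8;
    unsigned int nodeFlags : 8;
    unsigned int nsIndex   : 8;   /* can represent 0..255 only */
    unsigned int info      : 8;
} demoNode;

/* Store a namespace index in the node and return the value the node
   actually holds afterwards. */
unsigned int storeNamespaceIndex(demoNode *node, unsigned int index)
{
    /* Assigning to an unsigned 8-bit bit-field keeps only the low 8 bits,
       so any index above 255 is reduced modulo 256 without any error. */
    node->nsIndex = index;
    return node->nsIndex;
}
```

With this layout, storing index 300 reads back as 44 (300 modulo 256), so a node would end up pointing at a different entry in the namespace list. Nothing crashes, but the document is no longer represented correctly.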

The solution is obvious.

I have raised the limit on the number of different namespaces in trunk.