Support for comments
timnieder opened this issue · 0 comments
We encountered an issue where the library fails to read the file if the document contains a comment:
TypeError: Cannot read properties of undefined (reading 'filter')
at easy-template-x.js:1907:47
at Generator.next (<anonymous>)
at asyncGeneratorStep (asyncToGenerator.js:3:1)
at _next (asyncToGenerator.js:22:1)
at _ZoneDelegate.invoke (zone.js:368:26)
at Object.onInvoke (core.mjs:11083:33)
at _ZoneDelegate.invoke (zone.js:367:52)
at Zone.run (zone.js:129:43)
at zone.js:1257:36
at _ZoneDelegate.invokeTask (zone.js:402:31)
Here it fails to find the body node (or the children of the body node):
async getHeaderOrFooter(type) {
    var _sectionProps$childNo, _attributes;
    const nodeName = this.headerFooterNodeName(type);
    const nodeTypeAttribute = this.headerFooterType(type);
    // find the last section properties
    // see: http://officeopenxml.com/WPsection.php
    const docRoot = await this.mainDocument.xmlRoot();
    const body = docRoot.childNodes[0];
    const sectionProps = last(body.childNodes.filter(node => node.nodeType === XmlNodeType.General));
    if (sectionProps.nodeName != 'w:sectPr') return null;
A look into the document.xml shows the issue:
<w:document ...
mc:Ignorable="w14 w15 wp14"><!-- Generated by Aspose.Words for .NET 23.5.0 -->
<w:body>
<w:p w:rsidR="001F47F4" w:rsidP="001F47F4" w14:paraId="425762CC" w14:textId="77777777">
....
The library takes the first child node of the document root as the body, but the body is not always the first child (e.g. there could be a comment before it or, as in #103, a w:background tag).
A file generated with the aspose.words online editor has the same problem:
test.docx
I've tried to fix it by changing the code:
private async getHeaderOrFooter(type: ContentPartType): Promise<XmlPart> {
    const nodeName = this.headerFooterNodeName(type);
    const nodeTypeAttribute = this.headerFooterType(type);
    // find the last section properties
    // see: http://officeopenxml.com/WPsection.php
    const docRoot = await this.mainDocument.xmlRoot();
    const body = docRoot.childNodes.find(node => node.nodeName == 'w:body');
    if (body == null)
        return null;
    const sectionProps = last(body.childNodes.filter(node => node.nodeType === XmlNodeType.General));
    if (sectionProps.nodeName != 'w:sectPr')
        return null;
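The name-based lookup can be tried in isolation on a tiny in-memory tree. This is only a sketch: `SketchNode`, `findBody` and `findLastSectionProps` are hypothetical names made up for illustration, not part of easy-template-x, and the node shape is much simpler than the library's real one. It also guards against a body with no element children, which the change above does not.

```typescript
// Hypothetical, simplified node shape (NOT the library's XmlNode type).
interface SketchNode {
    nodeName: string;
    kind: 'element' | 'text' | 'comment';
    childNodes?: SketchNode[];
}

// Look up <w:body> by name instead of assuming it is the first child.
function findBody(docRoot: SketchNode): SketchNode | undefined {
    return (docRoot.childNodes ?? []).find(node => node.nodeName === 'w:body');
}

// Return the last element child of the body if it is a <w:sectPr>,
// otherwise undefined (also when the body is missing or empty).
function findLastSectionProps(docRoot: SketchNode): SketchNode | undefined {
    const body = findBody(docRoot);
    const elements = (body?.childNodes ?? []).filter(node => node.kind === 'element');
    const lastElement = elements[elements.length - 1];
    return lastElement?.nodeName === 'w:sectPr' ? lastElement : undefined;
}

// A document root whose first child is a comment, as in the file above.
const sampleRoot: SketchNode = {
    nodeName: 'w:document',
    kind: 'element',
    childNodes: [
        { nodeName: '#comment', kind: 'comment' },
        {
            nodeName: 'w:body',
            kind: 'element',
            childNodes: [
                { nodeName: 'w:p', kind: 'element' },
                { nodeName: 'w:sectPr', kind: 'element' },
            ],
        },
    ],
};
```

With this tree, `findBody(sampleRoot)` returns the `w:body` node even though a comment precedes it, and `findLastSectionProps` returns `undefined` instead of throwing when no `w:sectPr` is present.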
This allows the library to parse documents that contain comments without failing, but now the comments are detected as text nodes (I believe) and written into the final document like this:
<w:document ...
xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape"
mc:Ignorable="w14 w15 wp14">
<#comment/>
<w:body>
<w:p w:rsidR="001F47F4" w:rsidP="001F47F4" w14:paraId="425762CC" w14:textId="77777777">
...
Word fails to open the result because it is an invalid document.
To make a proper fix, one would probably have to add full support for comments.
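As a rough illustration of what comment support could look like, the sketch below gives comments their own node kind and serializes them as `<!--...-->` instead of as an element named `#comment`. Everything here (`SketchXmlNode`, `serialize`, the escape helpers) is a hypothetical model written for this issue and makes no claim about how easy-template-x represents or serializes nodes internally.

```typescript
// Hypothetical node model with an explicit comment variant
// (an assumption for illustration, not the library's actual types).
type SketchXmlNode =
    | { kind: 'element'; nodeName: string; attributes?: Record<string, string>; childNodes?: SketchXmlNode[] }
    | { kind: 'text'; textContent: string }
    | { kind: 'comment'; commentContent: string };

// Escape characters that are not allowed verbatim in XML text.
function escapeText(value: string): string {
    return value.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Attribute values additionally need double quotes escaped.
function escapeAttribute(value: string): string {
    return escapeText(value).replace(/"/g, '&quot;');
}

function serialize(node: SketchXmlNode): string {
    switch (node.kind) {
        case 'text':
            return escapeText(node.textContent);
        case 'comment':
            // Emit a real XML comment rather than an element named "#comment".
            return `<!--${node.commentContent}-->`;
        case 'element': {
            const attributes = node.attributes ?? {};
            const attrs = Object.keys(attributes)
                .map(name => ` ${name}="${escapeAttribute(attributes[name])}"`)
                .join('');
            const children = node.childNodes ?? [];
            if (children.length === 0) {
                return `<${node.nodeName}${attrs}/>`;
            }
            return `<${node.nodeName}${attrs}>${children.map(serialize).join('')}</${node.nodeName}>`;
        }
    }
}
```

A parser feeding this model would need to map comment nodes to the `comment` kind rather than treating them as text or general nodes. A simpler alternative would be to drop comments while parsing, at the cost of not preserving them in the output.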