aerkalov/ebooklib

[element "ol" incomplete; missing required element "li"] in [nav.xhtml] when creating epub with nested directories

lsyanling opened this issue · 2 comments

I want to create an EPUB with multiple volumes, each containing multiple chapters, as follows:

# 6 chapters in total, 2 chapters in each volume
chapters = ['1', '2', 
            '1', '2',
            '1', '2']
volumes = ['Volume1',
           'Volume2',
           'Volume3']

Then I create the eBook. The real code is somewhat more complex; I have simplified it as much as possible.

from ebooklib import epub

book = epub.EpubBook()
chapter_index = 0
for volume_title in volumes:
    volume_chapters = []

    for i in range(chapter_index, chapter_index + 2):
        chapter_title = chapters[i]
        chapter = epub.EpubHtml(file_name=f'{volume_title}_{chapter_title}.xhtml',
                                media_type='application/xhtml+xml',
                                title=chapter_title,
                                content = '<p>Content</p>')
        book.add_item(chapter)
        volume_chapters.append((chapter, chapter.file_name))
        chapter_index += 1

    book.spine.extend(volume_chapters)
    toc_section = (epub.Section(volume_title), [(chap, file_name) for chap, file_name in volume_chapters])
    book.toc.append(toc_section)

book.add_item(epub.EpubNcx())
book.add_item(epub.EpubNav())
epub.write_epub('test.epub', book)

However, my reader does not recognize the resulting book. I ran epubcheck on it, and it reported the following errors.

ERROR(RSC-005): ./test.epub/EPUB/nav.xhtml(16,20): Error while parsing file: element "ol" incomplete; missing required element "li"
ERROR(RSC-005): ./test.epub/EPUB/nav.xhtml(20,20): Error while parsing file: element "ol" incomplete; missing required element "li"
ERROR(RSC-005): ./test.epub/EPUB/nav.xhtml(29,20): Error while parsing file: element "ol" incomplete; missing required element "li"
ERROR(RSC-005): ./test.epub/EPUB/nav.xhtml(33,20): Error while parsing file: element "ol" incomplete; missing required element "li"
ERROR(RSC-005): ./test.epub/EPUB/nav.xhtml(42,20): Error while parsing file: element "ol" incomplete; missing required element "li"
ERROR(RSC-005): ./test.epub/EPUB/nav.xhtml(46,20): Error while parsing file: element "ol" incomplete; missing required element "li"
Check finished with errors

I have 6 chapters and there are 6 errors, so each chapter corresponds to one error. The generated nav.xhtml looks like this:

    <ol>
      <li>
        <span>Volume1</span>
        <ol>
          <li>
            <a href="Volume1_1.xhtml">1</a>
            <ol />        <!-- issue here -->
          </li>
          <li>
            <a href="Volume1_2.xhtml">2</a>
            <ol />
          </li>
        </ol>
      </li>
      <li>
        <span>Volume2</span>
        <ol>
          <li>
            <a href="Volume2_1.xhtml">1</a>
            <ol />
          </li>
          <li>
            <a href="Volume2_2.xhtml">2</a>
            <ol />
          </li>
        </ol>
      </li>
      <li>
        <span>Volume3</span>
        <ol>
          <li>
            <a href="Volume3_1.xhtml">1</a>
            <ol />
          </li>
          <li>
            <a href="Volume3_2.xhtml">2</a>
            <ol />
          </li>
        </ol>
      </li>
    </ol>

The problem is the <ol/> after the <a> in each chapter.
In my tests, the error does not occur when the table of contents has no nesting (i.e. no volumes).
I'm not sure if my code is wrong, but ChatGPT tells me my code is fine.
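To confirm which entries carry the empty <ol />, a small standard-library check can help. This is only a sketch: find_empty_ol is a hypothetical helper name, not part of ebooklib or epubcheck, and it assumes the navigation markup is well-formed XML like the snippet above.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    # Strip an XML namespace prefix such as "{http://www.w3.org/1999/xhtml}".
    return tag.rsplit("}", 1)[-1]


def find_empty_ol(nav_xhtml):
    """Return the label text of every <li> that holds an <ol> with no <li> inside.

    Hypothetical helper for illustration; accepts str or bytes.
    """
    root = ET.fromstring(nav_xhtml)
    offenders = []
    for li in root.iter():
        if _local(li.tag) != "li":
            continue
        for child in li:
            is_ol = _local(child.tag) == "ol"
            has_li = any(_local(g.tag) == "li" for g in child)
            if is_ol and not has_li:
                # Use the entry's <a> or <span> text as a readable label.
                label = next(
                    (c for c in li if _local(c.tag) in ("a", "span")), None
                )
                offenders.append(label.text if label is not None else "")
    return offenders


# Reduced sample modelled on the nav.xhtml above: chapter 1 has the stray <ol />.
sample = """<ol xmlns="http://www.w3.org/1999/xhtml">
  <li><span>Volume1</span>
    <ol>
      <li><a href="Volume1_1.xhtml">1</a><ol /></li>
      <li><a href="Volume1_2.xhtml">2</a></li>
    </ol>
  </li>
</ol>"""

print(find_empty_ol(sample))  # prints ['1']
```

For the real file, the bytes can be read with zipfile.ZipFile('test.epub').read('EPUB/nav.xhtml'), using the path shown in the epubcheck output.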

Thanks for sending this, it looks like an interesting issue. I will need some time to figure out what is going on here, but try the code below; I think it is what you need.

Looking quickly, I would say the issue is here: volume_chapters.append((chapter, chapter.file_name)). The second value in the tuple should be a tuple or list of sub-chapters. When I do that, I get the same issue: an empty <ol />.

    for volume_title in volumes:
        volume_chapters = []
        for i in range(chapter_index, chapter_index + 2):
            chapter_title = chapters[i]
            chapter = epub.EpubHtml(file_name=f'{volume_title}_{chapter_title}.xhtml',
                                    media_type='application/xhtml+xml',
                                    title=chapter_title,
                                    content = '<p>Content</p>')
            book.add_item(chapter)
            volume_chapters.append(chapter)
            
            chapter_index += 1
    
        book.spine.extend(volume_chapters)
        toc_section = (epub.Section(volume_title), volume_chapters)
    
        book.toc.append(toc_section)
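As a rough illustration of the point about tuple shape, the sketch below flags table-of-contents entries whose second tuple value is not a list or tuple of sub-chapters. Everything here is an assumption made for illustration: find_flat_children is a hypothetical helper, and Item is a stand-in for epub.EpubHtml or epub.Section so that the example runs without ebooklib installed.

```python
class Item:
    """Hypothetical stand-in for epub.EpubHtml / epub.Section (only needs a title)."""

    def __init__(self, title):
        self.title = title


def find_flat_children(toc):
    """Return titles of (item, children) pairs whose children is not a list or tuple."""
    bad = []
    for entry in toc:
        if isinstance(entry, (tuple, list)) and len(entry) == 2:
            head, children = entry
            if isinstance(children, (tuple, list)):
                # Properly nested: keep checking the sub-entries.
                bad.extend(find_flat_children(children))
            else:
                # e.g. a file name string where a list of sub-chapters is expected.
                bad.append(getattr(head, "title", repr(head)))
    return bad


# Shape used in the original report: each chapter is wrapped as (chapter, file_name).
toc_bad = [
    (Item("Volume1"), [(Item("1"), "Volume1_1.xhtml"), (Item("2"), "Volume1_2.xhtml")]),
]

# Shape used in the suggested code: chapters are passed directly.
toc_good = [
    (Item("Volume1"), [Item("1"), Item("2")]),
]

print(find_flat_children(toc_bad))   # prints ['1', '2']
print(find_flat_children(toc_good))  # prints []
```

Running a check like this on book.toc before calling epub.write_epub would point at the same six chapters that epubcheck complains about, assuming the real objects expose a title attribute as the stand-in does.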
    

There is no doubt that you're right. Thanks for your help!

By the way, my EPUB reader may be too strict. I tried other EPUB readers, and they open the book correctly even though it contains the redundant <ol/>.

Have a nice day at work!
