Parse for pdf
nikophil opened this issue · 5 comments
Provide a way to parse for PDF output, and to parse only some parts of the docs.
Almost done
- There are some prev / next links hardcoded in the RST of the best practices. I think these might disappear (especially since I now handle these links in the JSON, so they are really useless).
- If a link exists between different pages of the same "book" we want to print as PDF, the parser renders it as an actual link, href="page.html", and not as an anchor, href="#page". This behavior exists at least in the toc, but it could be true for any cross link inside the same book. I don't see another solution than parsing all the HTML and replacing it.
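For illustration, the replacement described above could look roughly like this. It is only a sketch in Python, not code from the existing builder: the helper name `rewrite_internal_links`, the `book_pages` set, and the page names in the usage note are assumptions, and a regex is used purely for brevity (a real HTML parser would likely be more robust).

```python
import re

# Hypothetical helper, not part of any existing builder.
# Matches href="something.html" with an optional "#fragment" suffix.
HREF_RE = re.compile(r'href="([^"#]+)\.html(#[^"]*)?"')


def rewrite_internal_links(html, book_pages):
    """Turn href="page.html" into href="#page" for pages of the same book."""

    def replace(match):
        page, fragment = match.group(1), match.group(2)
        if page not in book_pages:
            # The link leaves the book (or is external): keep it untouched.
            return match.group(0)
        # Keep a more specific in-page anchor when one is present.
        if fragment and len(fragment) > 1:
            return 'href="%s"' % fragment
        return 'href="#%s"' % page

    return HREF_RE.sub(replace, html)
```

Calling `rewrite_internal_links('<a href="creating-the-project.html">Next</a>', {"introduction", "creating-the-project"})` would rewrite the link to `href="#creating-the-project"`, while links to pages outside the set are left alone.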
> There are some prev / next links hardcoded in the RST of the best practices. I think these might disappear (especially since I now handle these links in the JSON, so they are really useless).
You mean, for example, the "Next: Creating a project" at the bottom of https://symfony.com/doc/current/best_practices/introduction.html right?
In a perfect world, we would remove these and the auto-generated next/prev would handle this in HTML automatically. Let's just keep this on the "list" for now - we can see how the next/prev links look, and then hopefully remove these manual ones later.
> If a link exists between different pages of the same "book" we want to print as PDF, the parser renders it as an actual link, href="page.html", and not as an anchor, href="#page". This behavior exists at least in the toc, but it could be true for any cross link inside the same book. I don't see another solution than parsing all the HTML and replacing it.
Technically, this is ok! The current PDF-generating code actually already contains a bunch of code (regex, etc) to find and fix the links. However, as this code is very coupled to Sphinx, I think we should re-implement it ourselves - basically have an option that will dump one "section" into a single, final HTML file (or maybe JSON file... so it can be more easily parsed... but containing HTML) with all the links already fixed.
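As a rough sketch of that "dump one section" option (in Python for illustration only; the function name `dump_section`, the JSON keys `title` and `html`, and the `<div id="...">` wrapper are all assumptions, not an existing format):

```python
import json


def dump_section(title, pages):
    """Merge the pages of one "section" into a single HTML string, as JSON.

    `pages` maps a page name (without ".html") to its HTML fragment,
    in reading order. Both the name and the layout are hypothetical.
    """
    parts = []
    for name, html in pages.items():
        # Point links to sibling pages at the wrapper anchors created below.
        # Plain string replacement: links carrying a "#fragment" are not handled.
        for target in pages:
            html = html.replace('href="%s.html"' % target, 'href="#%s"' % target)
        parts.append('<div id="%s">%s</div>' % (name, html))
    return json.dumps({"title": title, "html": "\n".join(parts)})
```

The PDF step would then only need to read the `html` value of this one file, since every cross link inside the section already points at an anchor in the same document.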
> You mean, for example, the "Next: Creating a project" at the bottom of https://symfony.com/doc/current/best_practices/introduction.html right?
Yep, I was talking about that.
OK, let's keep that, but I'm pretty sure we'll soon get rid of it.
> Technically, this is ok! The current PDF-generating code actually already contains a bunch of code (regex, etc) to find and fix the links. However, as this code is very coupled to Sphinx, I think we should re-implement it ourselves - basically have an option that will dump one "section" into a single, final HTML file (or maybe JSON file... so it can be more easily parsed... but containing HTML) with all the links already fixed.
What do you mean? I thought we were using PrinceXML to generate the PDF?