basxsoftwareassociation/bread

Contrib module for document templates

saemideluxe opened this issue · 4 comments

For the ponds project we will need a way to let users create a document template which has placeholders that can be populated with values from the database. The output should then be a PDF or similar.

The following is needed to implement this:

  • a new contrib module inside bread
  • Model(s) and views/UI to create and edit document templates
  • Conceptually I would keep it similar to the reports module, meaning that a base model can be specified and field definitions can be saved to "extract values" from a given instance of the corresponding model. So the placeholders in the template document should likely be specified the same way as report columns, or similarly.
  • The model class should have a method which can map a list of objects to a list of output documents
  • I recommend we start with docx/dotx -> PDF or docx/dotx -> docx as the supported input -> output file types. I would not write a big plugin structure or anything; keep it as simple as possible, and when we need other formats we can reconsider how to implement them.
  • We should explore both options: taking a Word template (dotx) or a docx as input. I think dotx would be cleaner, and I think there is some Python module which is able to fill it in correctly. docx would also work if we use some special marker strings. However, I am less confident in the reliability of search-and-replace if we do it that way, as a docx file is not plain text at all. (See links below; also search for "mailmerge" in the context of Word document processing.)
  • How exactly we will call the rendering should be left open for now: via a URL, inside model save methods to assign rendered documents to a model file field, or in other ways. What matters is that we have reliable building blocks which we can easily reuse in different contexts.
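
The concern about search-and-replace on docx files can be illustrated with a small sketch. A docx file is a zip archive containing XML, and Word can split a marker string across several runs (`<w:r>` elements), so a plain string replace may silently miss it. The XML below is a hand-written stand-in for `word/document.xml`, not a complete valid docx, and the `{{name}}` marker syntax is only an assumption for illustration:

```python
import io
import zipfile

# Hand-written stand-in for word/document.xml (assumption, not a full docx).
# The marker "{{name}}" is split across two runs, as Word may do after edits.
DOCUMENT_XML = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/'
    'wordprocessingml/2006/main">'
    "<w:body><w:p>"
    "<w:r><w:t>Dear {{na</w:t></w:r>"
    "<w:r><w:t>me}}</w:t></w:r>"
    "</w:p></w:body></w:document>"
)


def make_docx_like_archive(document_xml: str) -> bytes:
    """Pack the XML into an in-memory zip, mimicking the docx container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document_xml)
    return buffer.getvalue()


def naive_replace(archive_bytes: bytes, marker: str, value: str) -> str:
    """Read the main document part and apply a plain string replace."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        xml = archive.read("word/document.xml").decode("utf-8")
    return xml.replace(marker, value)


archive_bytes = make_docx_like_archive(DOCUMENT_XML)
result = naive_replace(archive_bytes, "{{name}}", "Alice")

# The marker never appears contiguously in the XML, so nothing was replaced.
marker_was_replaced = "Alice" in result
print(marker_was_replaced)
```

Any marker-based approach would therefore need to work on the parsed document structure rather than on raw text, which is what dedicated templating libraries take care of.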

The docx handling and the PDF generation are likely the more difficult parts. The important constraint here: the process needs to run on a headless Linux system (no Word installed). It would be nice to have no non-pip dependencies (e.g. LibreOffice or pandoc binaries). I think reading and filling out dotx templates is not much of a problem, but finding something good for rendering to PDF (in the same process) might be more challenging. Pandoc would be great, but it does not seem to have an official (native) API for non-command-line usage, which makes it much less ideal.

Also, we should keep an eye on which dependencies we have to add for this, especially if they are not maintained. Instead of adding an unmaintained wrapper pip package, we should rather rewrite it inside bread, or copy and adjust it with a reference to the original code.

Some links:

Okay, after some more research, dotx and mailmerge do not seem to be the best options; going with docxtpl might be the best choice. I haven't touched Word for many years now, so input from an experienced user would be helpful as well...
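
A minimal sketch of how the "list of objects -> list of output documents" mapping could look with docxtpl, assuming the template uses Jinja-style `{{ field }}` placeholders. The function names, the `field_names` parameter, and the output file naming are hypothetical and not part of bread; PDF conversion is not covered here:

```python
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List, Sequence


def build_context(obj: Any, field_names: Sequence[str]) -> Dict[str, Any]:
    """Collect the named attributes of obj; missing ones become ''."""
    return {name: getattr(obj, name, "") for name in field_names}


def render_documents(
    template_path: str,
    objects: Iterable[Any],
    field_names: Sequence[str],
    output_dir: str,
) -> List[Path]:
    """Render one docx per object (hypothetical helper, not bread API)."""
    # Imported lazily so the pure helper above works without docxtpl.
    from docxtpl import DocxTemplate  # third-party: pip install docxtpl

    target_dir = Path(output_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for index, obj in enumerate(objects):
        # Load a fresh template per object so renders do not interfere.
        template = DocxTemplate(str(template_path))
        template.render(build_context(obj, field_names))
        target = target_dir / f"document_{index}.docx"
        template.save(str(target))
        outputs.append(target)
    return outputs


# The context-building step can be checked without any docx file:
example_context = build_context(
    SimpleNamespace(name="Alice", city="Basel"), ["name", "city", "missing"]
)
print(example_context)
```

In a Django setting, `objects` would presumably be a queryset of the base model, and the rendered files could be assigned to a file field or returned from a view.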

A nice thing for UX might be an in-place preview when editing a template, where the user can also quickly page through different objects to see how things would get rendered.

Another thought here: We could either let the user define fields with "variable names" which can then be used inside the template, or we can assume that the values inside the document template are valid field values. After giving it some thought I think the second option is the better one, as the template document will only need to be uploaded once in order to change how a value is displayed.
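
The second option could be sketched roughly as follows: each placeholder in the template is treated directly as a field path and resolved against the object. The dotted-path syntax, the `{{ ... }}` delimiters, and the helper names are assumptions for illustration; the reports module may specify fields differently, and this operates on plain text rather than on a docx file:

```python
import re
from types import SimpleNamespace
from typing import Any

# Assumed placeholder syntax: {{ field }} or {{ relation.field }}
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def resolve_path(obj: Any, path: str) -> Any:
    """Follow a dotted attribute path; return '' if any step is missing."""
    value = obj
    for part in path.split("."):
        value = getattr(value, part, None)
        if value is None:
            return ""
    return value


def fill_placeholders(text: str, obj: Any) -> str:
    """Replace every placeholder with the value found on obj."""
    return PLACEHOLDER.sub(
        lambda match: str(resolve_path(obj, match.group(1))), text
    )


# Stand-in objects instead of real model instances.
pond = SimpleNamespace(name="North Pond", owner=SimpleNamespace(name="Alice"))
filled = fill_placeholders("{{ name }} belongs to {{ owner.name }}.", pond)
print(filled)
```

With this approach no separate list of "variable names" has to be maintained next to the template; the template itself is the only place where fields are named.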

Another note: This could be the occasion to think about a simple "formatting DSL" which gives some access to the ObjectFieldValue and/or bread.formatters.format_value features. Would also be usable in bread.contrib.reports. But low prio.

Closing this for now, will need to test with real-use cases in ponds.