xou/elixlsx

Add formula <f> tag support

Closed this issue · 15 comments

<c r="M4" s="1">
  <f>E4*D4-C4</f>
  <v>312.39750000000004</v>
</c>

Some idea how to look this in code?
We need some tag/function/symbol to different fix values from formulars.

Variant 1 - Tuple in content parameter: (my favorite)
Sheet.set_cell(sheet, "A1", [{formular: "E4*D4-C4"}, {value: 312.3975}]) --> formular and value
Sheet.set_cell(sheet, "A1", {formular: "E4*D4-C4"}) --> formular (not sure if this be useful)
Sheet.set_cell(sheet, "A1", {value: 312.3975}) --> value
Sheet.set_cell(sheet, "A1", 312.3975) --> value

Variant 2 - New function:
Sheet.set_cell_formular(sheet, "A1", "E4*D4-C4", 312.3975) --> formular and value
Sheet.set_cell_formular(sheet, "A1", "1+2*3") --> formular (not sure if this be useful)
Sheet.set_cell(sheet, "A1", 312.3975) --> value

Some other ideas?

xou commented

Generally, I would prefer the first version (or a variant of it), as we do a similar thing with {:excelts, num} already, but I am honestly a bit reluctant about introducing this.

I just did a quick test run whether a <v> is required with <f>. For reference, the file is here:
https://drive.google.com/file/d/0B6L7trRG-SSCcnZvTWRwOWZhTVE/view?usp=sharing

  • In Libreoffice, a missing <v> is recalculated on load; <v> overrides the <f> value (until you force it to reload the formula - which basically involves editing the cell, the "recalculate" button didn't do the trick).
  • Google Drive's "Preview-mode" (preview-lightbox?), probably equivalent to the gmail one, relies on the <v> entirely, omitting it results in an empty cell and just as libreoffice the <v> value overrides the <f>,
  • Google Sheets re-calculates all fields on load, so a wrong <v> value is fixed on load.

If people specify both value and formula, they need to make sure that the formula in the excel language matches the internal calculation. Add a bit of copy&paste here, a rounding- or floating-point error there, and the inconsistency in how the various applications handle discrepancies, and you get a great way of shooting yourself into the foot and/or hours of debugging inconsistent results.

On the other hand, we can force people to specify only a formula, but then the GMail-Preview (and potentially other previewers or applications) won't display anything in that field, another inconsistency.

I'm also a bit worried about the level of abstraction here, and how "clean" the API is. I'm assuming that in near-100% of cases, people will want to have the cell display the "correct" value as calculated by the formula. That sort-of implies that the API should "do the job" of calculating <v> itself, but implementing the whole excel-formula language seems to be a bit out of scope ;-). And as I implied above, it just feels a bit wrong calculating the value twice in two different languages.

All of that being said, it's still a resonable feature to have. I could imagine use cases for this (interactive spreadsheet with default values, maybe?), so it might be worth the tradeoff. (I do think it would require a huge ⚠️ sign in the documentation and API though, this might actually a case for moving it into a different function (Sheet.set_cell_formular version)).

@developerworks could you point out your use case, so we could evaluate this better?

xou commented

Dug a bit to see how other projects are handling this:

  • xslxwriter has an optional value parameter: http://xlsxwriter.readthedocs.io/working_with_formulas.html. The documentation does talk about "setting a global flag forcing a re-calculation on load", but mentions that this doesn't work with previewers or pdf converters
  • openpyxl does not seem to have a value parameter: https://openpyxl.readthedocs.io/en/default/usage.html#using-formulae (doesn't display anything in drive preview)
  • libxlsxwriter provides a method for optionally specifying a user result. Again:

    Libxlsxwriter doesn't calculate the value of a formula and instead stores the value 0 as the formula result. It then sets a global flag in the XLSX file to say that all formulas and functions should be recalculated when the file is opened.
    This is the method recommended in the Excel documentation and in general it works fine with spreadsheet applications.

The flag the docs are talking about seems to be the fullCalcOnLoad="1" attribute in <calcPr> (in workbook.xml) (in xlsxwriter, libxlsxwriter)

Oh, that's very tricky. There a lot of unclear things :-(

Based on your search and arguments I think you are right and we should create a new function for formulars.

The fullCalcOnLoad option should enabled by default, so the most applications will calculate this fields.

Bad thing is that some preview tools don't calculate the values and show nothing.
To show an empty cell is in my opinion a bad case. Also a default value like not calculated or 0 should set by default. Additional the user have an option to overwrite the fallback value with own content.

In short: I think the solution of xslxwriter are the best way.

In any case wee need some details and a ⚠️ about this topic in documentation.

Maybe, we can borrow the implementation of xlsxwriter?

By the way, calculation is realized in PHPExcel, but I don't think we need it at this stage

xou commented

The fullCalcOnLoad option should enabled by default, so the most applications will calculate this fields.

Agreed, other implementations seem to do that as well, doesn't seem to hurt. I did a quick test run though, and LibreOffice's behaviour didn't seem to change (i.e., was still displaying the <v> value instead of recalculating it), so I'd definitely add an explicit warning.

Also a default value like not calculated or 0 should set by default. Additional the user have an option to overwrite the fallback value with own content.

I think this is a separate issue. Adding formula support might be as simple as adding another case to get_content_type_value and extending xl_sheet_rows a bit (and probably a good portion of unit tests, I'm a bit worried about breaking backwards compatibility at this point).

Adding a global default comes with the problem that applications like LibreOffice, potentially others, will use the default instead of recalculating on load - even if they could recalculate. I think it's best to have the library user decide which behavior to use for each cell - specify value: -> applications will use that (even when it's wrong), don't specify value: -> applications will recalculate if they can.
Doesn't mean a global default value is a bad thing, but I would prefer leaving that as a separate issue for now.

@developerworks I don't think we can easily port xlsxwriter's implementation, conceptually their implementation is very different. Again, it would be good if you could point out how you are planning to use this - esp. regarding global default values.

@unti1x Sounds like a fun thing to implement ;) Might even make sense as a separate external library, but definitely out of scope at this stage here.

Adding a global default comes with the problem that applications like LibreOffice, potentially others, will use the default instead of recalculating on load - even if they could recalculate.

Uh.., ok. Then I agree that the user should decide for set a value or not.

xou commented

Sorry, I currently don't have any time to work on this. If anyone wants to, feel free to assign the issue to yourself :)

I have already started to work on it (my time is currently limited too, so it's developing slowly), I create a PR if finshed.

I don't know how I can assign the issue to me and if I need permissions for that ?! 😳

xou commented

Sorry for the late reply. I've added you as a collaborator and once you accept I should be able to assign the issue to you and/or you should be able to assign the issue to yourself.

xou commented

@developerworks can you check out the master branch version and verify that this works for you?

xou commented

@Fabi755 what do you think would be a good way of adding a default value? {:formula, "SUM()", value: 123} or something like this?

Yes, I think {:formula, "FUNC", opts = [value: "default_value"]} is perfect
I see that I forgot the fullCalcOnLoad="1" parameter.

I will add this next week.

xou commented

Pushed to hex in 0.2.0.