kipcole9/money

Not an issue, but a question

Closed this issue · 13 comments

Hi Kip,

Have you thought through how to benefit from the Money library in a use case where keeping Money instances even in memory is redundant, because every one of the potentially tens of thousands or more money values in the data model shares the same currency and identical formatting?

As I haven't yet had time to delve deeper into the Money source code: do you find it feasible, performance-wise, to store only the Decimals (in memory and in the DB alike) and create Money instances on demand purely for computation, or do you consider the Money library too heavyweight in general for such a use case?

Thanks,

Damir

Damir, it's a very reasonable question. The intent of ex_money is primarily guarantees rather than absolute performance. The fact that it uses decimals under the hood demonstrates that.

On the other hand, the format in the database adds only 3 bytes per column.

There is no doubt that if you have performance-critical code and only one currency, then storing the amount as an integer will probably be the fastest approach. And then, of course, you take responsibility for the guarantees. It might be the right trade-off.

That is clear, but Money does keep in memory both the currency atom and the formatting for each of its instances (in addition to the Decimal value), and I am not using a relational DB but am storing the whole data model in a JSON document instead. I had no intention of storing integers only, in either memory or the DB (although I must say now that it may suffice, since both the precision and the format stay fixed over the entire sequence of amounts, so it's not a bad idea either, especially given the scale).

Some quick investigation:

Memory summary

  • A Money.t will take about 24 words or 192 bytes. An integer will typically take 8 bytes.
  • A Decimal alone will take about 12 words or 96 bytes
  • A Money.Ecto.Composite.Type will take 30 bytes typically to store. A Postgres integer will take either 4 bytes or 8 bytes.

Memory size of a Money.t

A quick test shows:

iex> x = Money.new(:USD, 100)
#Money<:USD, 100>
iex> :erts_debug.size x
24

Meaning that a Money.t typically takes 24 words, or 192 bytes on a 64-bit machine.

Size of a Decimal

iex> :erts_debug.size Decimal.new(100) 
12

A Decimal takes 12 words, about half the size of the whole Money.t struct.

Storage size of a Money.Ecto.Composite.Type in Postgres

In Postgres, the documentation for the NUMERIC type says:

The actual storage requirement is two bytes for each group of four decimal digits, plus three to eight bytes overhead.

Using an example database with one row only:

money_dev=# select
    pg_size_pretty(sum(pg_column_size(payroll))) as total_size,
    pg_size_pretty(avg(pg_column_size(payroll))) as average_size,
    sum(pg_column_size(payroll)) * 100.0 / pg_total_relation_size('organizations') as percentage 
    from organizations;

 total_size |       average_size        |       percentage       
------------+---------------------------+------------------------
 30 bytes   | 30.0000000000000000 bytes | 0.09155273437500000000

It appears that the money composite type takes 30 bytes (variable, depending on the amount) to store. A Postgres integer takes either 4 or 8 bytes depending on the type selected. It's 8 bytes for most integers on the BEAM, but arbitrary-precision integers which overflow the native data type can take much more, although this is unlikely for money amounts.

Great answer, thanks!

One more thing, not a requirement, just food for thought.

Imagine the use-case I mentioned previously, with huge swaths of amounts all in the same currency and all with the same precision and formatting such as with an accounting software or a financial planning tool. The relevant conclusions that can be drawn from your last comment are as follows:

  1. It would take 24 or 48 (but let's stick to 24) times less memory (per concurrent web app user) to keep the integers in memory server-side than to keep the instances of Money;
  2. Each computation could assume the same "context", i.e. same currency, same precision and same formatting, and thus be potentially faster, skipping whatever validations would otherwise be performed;
  3. The context (just like with Decimal.t) could be mapped to the process or processes in charge of doing the computations.

All the algorithms would remain virtually the same. The only change to the Money module interface would be accepting integers in addition to Money.t instances (all integers or all Money.t, not a mix thereof), and raising an ArgumentError or similar if no context is mapped to the process in which the integer-taking functions are invoked.

Good thought experiments! In writing the library I had the following goals in mind:

  1. Correctness
  2. Formalised (ie uses formal currency data from ISO 4217 and CLDR)
  3. Fully localised

You've posed the question: can the implementation be more time- and space-efficient?
Which I reframe as: can the implementation be more time- and space-efficient and still meet the goals?

Option 1: Defer using Money.t until absolutely necessary

This option means the developer takes responsibility for the correctness and uses ex_money only for formatting. This option is available today by using Money.from_integer/3.
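
A minimal sketch of this deferral pattern, assuming amounts are stored in minor units (the names here are illustrative; only the commented-out conversion touches ex_money):

```elixir
# Amounts kept as plain integers in minor units (cents) while computing;
# this is a sketch, not ex_money code.
stored_cents = [100_00, 250_50, -3_99]

# Fast integer math; the developer owns the correctness guarantees here.
total = Enum.sum(stored_cents)
# => 34651, i.e. 346.51

# Only at the edge (display, serialisation) convert to Money.t, e.g.:
#   Money.from_integer(total, :USD)
# (requires ex_money; from_integer/3 interprets the integer in the
# currency's minor units.)
```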

Option 2: Use a more memory efficient structure

Given that a Money.t is 24 words (192 bytes), is there a more memory efficient approach?

  • {:USD, Decimal.new(100)} is 15 words
  • {:USD, 100} is 3 words + 1 word for the integer itself
  • {:USD, 10000, []} (ie with empty formatting) is 4 words + 1 for the integer

Do these structures provide the same guarantees as using Money.t? Not quite as rigorous, since there is no __struct__ type to match against. On the other hand, we can still validate the currency code and the integer in guards, so the guarantees might be enough.
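
As a sketch of that guard-based validation, assuming an application-level currency whitelist (CompactMoney and its contents are hypothetical, not ex_money API):

```elixir
defmodule CompactMoney do
  # Hypothetical whitelist; a real implementation would derive this
  # from ISO 4217 data.
  @currencies [:USD, :EUR, :GBP]

  defguard is_compact_money(currency, amount)
           when currency in @currencies and is_integer(amount)

  # Matching the same `cur` variable in both arguments enforces
  # same-currency arithmetic.
  def add({cur, a}, {cur, b}) when is_compact_money(cur, a) and is_compact_money(cur, b),
    do: {cur, a + b}
end

CompactMoney.add({:USD, 100}, {:USD, 250})
# => {:USD, 350}
```

Calling CompactMoney.add({:USD, 1}, {:EUR, 1}) raises FunctionClauseError, which is a weaker but still useful version of the struct guarantee.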

Option 3: Use an encoded integer

One interesting option might be to encode both the currency code and the amount into a single integer. ISO 4217 defines a 3-digit numeric code for currencies, so it would need 10 bits to store. That leaves 54 bits to store the currency amount. It would still need to be a signed integer in order to be a complete replacement for the Money.t struct.

An example would be <<978::10, 1000::signed-integer-54>>, where 978 is the ISO 4217 numeric code for EUR and 1000 is the amount. The bitstring is interpreted as EUR 10.00.

In this format, with 54 bits to work with as a signed integer, we can store +/- 9007199254740992 which, I suspect, will cater for most use cases.

Here we can still validate the currency code, and since the amount is an integer we can still interpret it correctly in the context of the currency.

This format would have some issues as a serialisation format since math operations in the database would not return correct results. But serialisation could be done as a composite type of two integers: currency numeric code (small int) and amount (large int).
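
A sketch of this packing and unpacking using Elixir's bit syntax (encode/decode are hypothetical helper names, not ex_money functions):

```elixir
# Pack an ISO 4217 numeric currency code (10 bits) and a signed amount
# (54 bits) into a single 64-bit value. Illustrative sketch only.
encode = fn numeric_code, amount ->
  <<numeric_code::10, amount::signed-integer-54>>
end

decode = fn <<numeric_code::10, amount::signed-integer-54>> ->
  {numeric_code, amount}
end

decode.(encode.(978, 1000))
# => {978, 1000}, i.e. EUR 10.00

decode.(encode.(978, -1000))
# => {978, -1000}; the signed field round-trips negative amounts
```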

Next Steps

It appears there are at least 2 different representations that can be much more space-efficient: {:USD, 1000} and an encoded integer.

  • The first, {:USD, 1000}, delivers almost the same level of guarantees as Money.t, and at 4 words instead of 24 it's 6 times less memory.

  • The second, an encoded integer, is 1 word instead of 24 and is therefore the most space efficient. But not all guarantees can be met.

These still don't answer some open questions:

  1. How much more time efficient are they?
  2. How would this affect serialisation and deserialisation to the database?
  3. Is the complexity too much of a compromise to maintainability?

Thoughts welcome. I'll definitely do some experimentation and see what might be possible, practical and sustainable.

Note that in any implementation using integers, precision has to be fixed so all integers can be interpreted correctly. We could also encode a precision in the integer, but that's likely too complex and the law of diminishing returns probably applies.

The practical implication is that for addition and subtraction there should be no issue. For multiplication I think it's still ok. Division is most definitely a problem - or at least would be incompatible with the current ex_money implementation. The current implementation defers rounding to the currency's digits until the last possible moment so precision can be preserved. That's not going to be possible in an integer implementation. My understanding is that financial institutions expect to retain at least 7 decimal digits of precision, and I don't believe that can be maintained in any of these proposals.
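
A small worked example, assuming a fixed precision of 2, shows where division breaks down:

```elixir
# Sketch: amounts as integers at a fixed precision of 2 (i.e. cents).
a = 10_00   # 10.00
b = 3_00    # 3.00

a + b       # => 1300, i.e. 13.00 - exact
a - b       # => 700,  i.e. 7.00  - exact
a * 3       # => 3000, i.e. 30.00 - exact for integer multipliers

# Division cannot defer rounding: 10.00 / 3 = 3.3333...,
# but at fixed precision the result must be rounded immediately:
div(a, 3)   # => 333, i.e. 3.33
rem(a, 3)   # => 1, one cent of precision lost at this step
```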

If the experiments prove positive, I'll probably implement them as a new but complementary library which, assuming the benchmarks prove out, I'll call "ex_fast_money". This will make clear that the guarantees are different.

Option 4 - encoded integer with precision

Following on from Option 3, we could encode the precision in 3 bits, allowing for 8 digits of precision. Also note that a small integer on the BEAM, on 64-bit systems, is actually 60 bits, since 4 bits are kept for type information.

<<978::10, 3::3, 1000::signed-integer-47>>

This would mean EUR 1.000, where the precision is set by the 3::3. That way we have useful arbitrary precision and can still do fast math (at least addition, subtraction and multiplication - division I'm still looking at).
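
A sketch of decoding this layout with Elixir's bit syntax (the decode function is illustrative only):

```elixir
# Decode <<currency::10, precision::3, amount::signed-47>>, 60 bits in
# total, matching a BEAM small integer. Illustrative sketch only.
decode = fn <<code::10, precision::3, amount::signed-integer-47>> ->
  {code, precision, amount}
end

decode.(<<978::10, 3::3, 1000::signed-integer-47>>)
# => {978, 3, 1000}, read as EUR 1.000 (1000 * 10^-3)
```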

I implemented two experimental versions of Decimal on the weekend:

  • One uses bit strings to store the number
  • The other uses a packed integer

Neither of these implementations is ready for production use at all.

Then I ran some very basic benchmarking. TL;DR: packed decimals are over 30% more space-efficient but also nearly 75% slower. The primary slowdown is packing the decimal into an integer.

The memory analysis below does not account for primitive (ie native integer) space. For a Decimal that's a further 3 words (24 bytes) and for PackedDecimal a further 1 word (8 bytes). So the memory difference is slightly more in favour of PackedDecimal than shown below.

However, this data does not, in my mind, create a compelling reason to consider alternative implementations for Money given that the :amount is the primary contributor to space and time. Nevertheless, suggestions and comments welcome!

mix run ./bench/new.exs

Name                        ips        average  deviation         median         99th %
Decimal                  6.33 M      157.85 ns ±30542.90%           0 ns        1000 ns
Bitstring Decimal        4.00 M      250.19 ns ±17352.21%           0 ns        1000 ns
Packed Decimal           3.62 M      276.03 ns ±25034.99%           0 ns        1000 ns

Comparison: 
Decimal                  6.33 M
Bitstring Decimal        4.00 M - 1.58x slower +92.34 ns
Packed Decimal           3.62 M - 1.75x slower +118.18 ns

Memory usage statistics:

Name                 Memory usage
Decimal                      96 B
Bitstring Decimal           128 B - 1.33x memory usage +32 B
Packed Decimal               64 B - 0.67x memory usage -32 B

The results above bugged me; I didn't think the results should vary so much. It's challenging with fast loops of course, and the median of 0 ns above with such a big deviation illustrates that. I decided to try running the benchmark over a longer period of time to see if the results stabilise and, being on OTP 24 with the JIT, to see if that produces a change over the longer run too.

It does look like all three implementations converge quite closely when run longer, at least when calling new/1, in which the packing of either the integer or the bitstring is the dominant cost. Memory utilisation remains lowest on the packed decimal implementation, as expected.

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 1 min
memory time: 4 s
parallel: 1
inputs: none specified
Estimated total run time: 3.30 min

Benchmarking Bitstring Decimal...
Benchmarking Decimal...
Benchmarking Packed Decimal...

Name                        ips        average  deviation         median         99th %
Decimal                  4.45 M      224.84 ns ±45237.16%           0 ns        1000 ns
Bitstring Decimal        4.13 M      242.28 ns ±40507.71%           0 ns        1000 ns
Packed Decimal           4.02 M      248.55 ns ±21506.50%           0 ns        1000 ns

Comparison: 
Decimal                  4.45 M
Bitstring Decimal        4.13 M - 1.08x slower +17.44 ns
Packed Decimal           4.02 M - 1.11x slower +23.70 ns

Memory usage statistics:

Name                 Memory usage
Decimal                      96 B
Bitstring Decimal           128 B - 1.33x memory usage +32 B
Packed Decimal               64 B - 0.67x memory usage -32 B

Very exhaustive, I may say. Much more than I expected.

One thing puzzles me, though. Why are you disregarding the possibility of segregating the formatting and precision data from the amounts altogether - e.g., as I mentioned previously, into a separate singleton-per-data-model context that must be mapped to the process prior to executing the computations in question?

For as long as the specification requires it, all of your original goals are still achieved:

  1. Correctness
  2. Formalised (ie uses formal currency data from ISO 4217 and CLDR)
  3. Fully localised

It's just that in this case the integral atomic piece of information is no longer the amount, but the model containing many amounts and the contextual (model-wide) formatting and precision.

Btw, there is no general industry-wide rule on the desired level of precision in financial calculations. There may be some "best" or most common practices, but one is discouraged from relying on those in favour of what's actually stipulated in each particular case. Each Credit Agreement (e.g. a Syndicated Loan Agreement), for instance, explicitly stipulates an exact decimal precision (no more, no less) to be used for each particular type of calculation so that all the parties come up with the same results.

Very helpful, thank you. I googled as much as I could to identify any standards or practices related to precision when I was writing the lib and couldn't find any, so your knowledgeable feedback is very helpful.

Why are you disregarding the possibility of segregating the formatting and precision data from the amounts altogether

Not disregarding, just wanted to see what might be possible in a more compact representation of the current implementation in order to see where the boundaries are for improvement. Just experimentation.

In part because, at its essence, a decimal is mostly what you suggest: an amount disconnected from the currency and formatting. The formatting field in Money.t "costs" only one word for the field name itself if no formatting data is provided, since an empty list [] occupies no additional space.

I suppose I have some hesitancy about amounts being interpreted as money when some additional external state is required to interpret them correctly. Worth experimenting with for sure, but it does feel uncomfortable.

In the scenario you describe, how do you think the following should be handled:

  1. Serialisation and deserialisation. What would be stored in a database? Sent in a JSON response?
  2. A multi-process architecture. Perhaps a money stored in GenServer state, or sent in a message to another process.

Yes, but as previously pointed out, it was you who brought the heavyweight nature of Decimal.t to my attention, so I figured, while optimizing, why not go all the way.

As for the answers to your questions:

  1. With the amounts already kept in memory as integers, and with their common descriptor in a separate structure within the same model, I expect they can be serialized and deserialized accordingly. As long as both the amounts and the descriptor are treated together as an integral model/document, there is no ambiguity regarding their meaning.
  2. Generally speaking, a format/precision descriptor should always accompany the integer values it complements. So, if the computation is multi-process (as in this XIRR library https://github.com/tubedude/finance-elixir) or, as you mention, GenServer-based, the descriptor is either kept in memory (in the process-accessible structures) along with the amounts to be computed, or passed to the process in charge of the computation. But I believe these are all user-level decisions depending heavily on how the user chooses to model their data.
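
The shared-descriptor idea can be sketched as follows; all names here are illustrative (not ex_money API), and the whole map is what would be serialized as a single JSON document:

```elixir
# Sketch: a model-wide descriptor shared by many integer amounts.
context = %{currency: :EUR, precision: 2}
amounts = [10_00, 25_50, -3_99]

# Interpret an integer amount in the shared context (hypothetical helper).
format = fn amount, %{precision: p} ->
  scale = Integer.pow(10, p)
  sign = if amount < 0, do: "-", else: ""
  units = div(abs(amount), scale)
  frac = abs(amount) |> rem(scale) |> Integer.to_string() |> String.pad_leading(p, "0")
  "#{sign}#{units}.#{frac}"
end

Enum.map(amounts, &format.(&1, context))
# => ["10.00", "25.50", "-3.99"]
```

The amounts alone are meaningless; only the model (amounts plus descriptor) is the integral unit of information, as described above.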