/pgdecimal

Decimal64 and decimal128 for PostgreSQL

Primary LanguageCOtherNOASSERTION

pgDecimal

PostgreSQL for decimal64 and decimal128. Decimal64 will have 16 precision digits and decimal128 has 34. 34 is huge. This work is inspired by the pgDecimal package by Pavel Stehule. See README.pavel for the documents from original author.
It went through significant changes.

We implemented decimal64 and decimal128. The backend is the decNumber library from http://speleotrove.com/decimal/. See README.decnumber for build instructions.

We did not use Decimal32/64 as the original pgDecimal, because we want to access some API in decDouble/Quad. Also, decimal32 is not implemented, because it only got 7 precision digits which is too small to be useful. (The real reason, is that decSingle does not implement arithmatics :-)

decimal64 is stored like INT64 and decimal128 is stored like interval type. There is no palloc in decimal64 arithmatics and exactly one palloc in decimal128 arithmatics.

TODO

The current implementation of conversion between differnt numeric types is not efficient. We first convert the input type to cstring then cstring to decimal64/128. Type conversion/cast is quite tricky. One need to be very careful, for example, it is easy to comeup with a query that numeric type is actually faster than decimal64 because the unexpect casting cost. Also watch out sometimes actual execution on numeric type actually casted to double precision. Anyway, measure, before deploy in real system.

The INTEL library has necessary converting functions for decimal64/128 from/to int64/float/double. This alone, could be the reason to use INTEL BID lib when possilbe. Will investigate/do, next.

Usage Example

The following was on an Intel NUC, core i5 5252U @ 1.6GHz, 16GB memory, SSD. Only CPU matters.

ftian=# set vitesse.enable = 0;
SET
ftian=# \timing
Timing is on.

ftian=# create table tt(ii bigint, d double precision, d64 decimal64, d128 decimal128, n numeric(15, 3));
CREATE TABLE
Time: 9.628 ms

ftian=# insert into tt select i, i + 0.123, i + 0.123, i + 0.123, i + 0.123 from generate_series(1, 1000000) i;
INSERT 0 1000000
Time: 2861.325 ms

ftian=# select * from tt limit 2;
 ii |   d   |  d64  | d128  |   n   
 ----+-------+-------+-------+-------
  1 | 1.123 | 1.123 | 1.123 | 1.123
  2 | 2.123 | 2.123 | 2.123 | 2.123
(2 rows)

Time: 0.288 ms

--
-- Simple predicates
--

ftian=# set vitesse.enable = 0;
SET
Time: 0.112 ms
ftian=# select count(*) from tt where (d + d*d + d*d*d + d*d*d*d) > 10000000;
 count  
--------
 999945
(1 row)

Time: 432.331 ms
ftian=# select count(*) from tt where (n + n*n + n*n*n + n*n*n*n) > 10000000;
 count  
--------
 999945
(1 row)

Time: 1763.774 ms
ftian=# select count(*) from tt where (d64 + d64*d64 + d64*d64*d64 + d64*d64*d64*d64) > 10000000;
 count  
--------
 999945
(1 row)

Time: 1090.795 ms
ftian=# select count(*) from tt where (d128 + d128*d128 + d128*d128*d128 + d128*d128*d128*d128) > 10000000;
 count  
--------
 999945
(1 row)

Time: 1387.014 ms


--
-- sort
-- 

ftian=# select * from tt order by d limit 2 offset 999000;
   ii   |     d      |    d64     |    d128    |     n      
--------+------------+------------+------------+------------
 999001 | 999001.123 | 999001.123 | 999001.123 | 999001.123
 999002 | 999002.123 | 999002.123 | 999002.123 | 999002.123
(2 rows)

Time: 749.732 ms
ftian=# select * from tt order by n limit 2 offset 999000;
   ii   |     d      |    d64     |    d128    |     n      
--------+------------+------------+------------+------------
 999001 | 999001.123 | 999001.123 | 999001.123 | 999001.123
 999002 | 999002.123 | 999002.123 | 999002.123 | 999002.123
(2 rows)

Time: 1944.858 ms
ftian=# select * from tt order by d64 limit 2 offset 999000;
   ii   |     d      |    d64     |    d128    |     n      
--------+------------+------------+------------+------------
 999001 | 999001.123 | 999001.123 | 999001.123 | 999001.123
 999002 | 999002.123 | 999002.123 | 999002.123 | 999002.123
(2 rows)

Time: 1579.657 ms
ftian=# select * from tt order by d128 limit 2 offset 999000;
   ii   |     d      |    d64     |    d128    |     n      
--------+------------+------------+------------+------------
 999001 | 999001.123 | 999001.123 | 999001.123 | 999001.123
 999002 | 999002.123 | 999002.123 | 999002.123 | 999002.123
(2 rows)

Time: 1752.646 ms

--
-- Hash joinable
-- 

ftian=# explain select count(*) from tt t1 join tt t2 on t1.d64 * t1.d64 + t1.d64 = t2.d64 + t2.d64 * t2.d64;
                                    QUERY PLAN                                    
----------------------------------------------------------------------------------
 Aggregate  (cost=6875071228.00..6875071228.01 rows=1 width=0)
  ->  Hash Join  (cost=36707.00..5625071228.00 rows=500000000000 width=0)
        Hash Cond: (((t1.d64 * t1.d64) + t1.d64) = (t2.d64 + (t2.d64 * t2.d64)))
        ->  Seq Scan on tt t1  (cost=0.00..20300.00 rows=1000000 width=8)
        ->  Hash  (cost=20300.00..20300.00 rows=1000000 width=8)
              ->  Seq Scan on tt t2  (cost=0.00..20300.00 rows=1000000 width=8)
(6 rows)

Time: 0.754 ms

ftian=# select count(*) from tt t1 join tt t2 on t1.d64 * t1.d64 + t1.d64 = t2.d64 + t2.d64 * t2.d64;
 count  
---------
 1000000
(1 row)

Time: 1659.609 ms
ftian=# select count(*) from tt t1 join tt t2 on t1.n * t1.n + t1.n = t2.n + t2.n * t2.n;
 count  
---------
 1000000
(1 row)

Time: 2212.401 ms

-- 
-- nestedloop, just for fun
--
ftian=# select count(*) from tt t1, tt t2 where t1.ii < 10000 and t2.ii < 10000 and t1.d * t1.d + t1.d > t2.d + t2.d * t2.d;
 count   
----------
 49985001
(1 row)

Time: 20558.837 ms

ftian=# select count(*) from tt t1, tt t2 where t1.ii < 10000 and t2.ii < 10000 and t1.n * t1.n + t1.n > t2.n + t2.n * t2.n;
 count   
----------
 49985001
(1 row)

Time: 70237.267 ms
ftian=# select count(*) from tt t1, tt t2 where t1.ii < 10000 and t2.ii < 10000 and t1.d64 * t1.d64 + t1.d64 > t2.d64 + t2.d64 * t2.d64;
 count   
----------
 49985001
(1 row)

Time: 48016.060 ms
ftian=# select count(*) from tt t1, tt t2 where t1.ii < 10000 and t2.ii < 10000 and t1.d128 * t1.d128 + t1.d128 > t2.d128 + t2.d128 * t2.d128;
 count   
----------
 49985001
(1 row)

Time: 60443.629 ms