sequel_pg overwrites the inner loop of the Sequel postgres adapter row fetching code with a C version. The C version is significantly faster (2-6x) than the pure ruby version that Sequel uses by default.
The speed up that sequel_pg gives you depends on what you are selecting, but it should be noticeable whenever many rows are selected. Here’s an example that shows the difference it makes on a couple of models:
$ irb -r model -r benchmark irb(main):001:0> Track.count => 140854 irb(main):002:0> Album.count => 5579 irb(main):003:0> puts Benchmark.measure{Track.each{}} 10.740000 0.190000 10.930000 ( 11.875343) => nil irb(main):004:0> puts Benchmark.measure{10.times{Album.each{}}} 7.920000 0.070000 7.990000 ( 8.482130) => nil irb(main):005:0> require '/data/code/sequel_pg/ext/sequel_pg/sequel_pg' => true irb(main):006:0> puts Benchmark.measure{Track.each{}} 2.360000 0.400000 2.760000 ( 3.723098) => nil irb(main):007:0> puts Benchmark.measure{10.times{Album.each{}}} 1.300000 0.190000 1.490000 ( 2.001393) => nil
Here’s an example that uses a modified version of swift’s benchmarks (github.com/shanna/swift/tree/master/benchmarks/):
benchmark sys user total real rss sequel #select 0.090000 2.020000 2.110000 2.246688 46.54m sequel_pg #select 0.000000 0.250000 0.250000 0.361999 7.33m
sequel_pg also has code to speed up the map, to_hash, to_hash_groups, select_hash, select_hash_groups, select_map, and select_order_map Dataset methods, which is on by default. It also has code to speed up the loading of model objects, which is off by default as it isn’t fully compatible. It doesn’t handle overriding Model.call, Model#set_values, or Model#after_initialize, which may cause problems with the following plugins that ship with Sequel:
-
class_table_inheritance
-
force_encoding
-
serialization
-
single_table_inheritance
-
typecast_on_load
-
update_primary_key
If you want to extract that last ounce of performance when loading model objects and you can live with the limitations, you can enable the model optimization via:
# All datasets DB.optimize_model_load = true # Specific dataset Artist.dataset.optimize_model_load = true
If you are using PostgreSQL 9.2beta3 or higher on the client, then sequel_pg should enable streaming support. This allows you to stream returned rows one at a time, instead of collecting the entire result set in memory (which is how PostgreSQL works by default). You can check if streaming is supported by:
Sequel::Postgres.supports_streaming?
If streaming is supported, you can load the streaming support into the database:
DB.extension(:pg_streaming)
Then you can call the Dataset#stream method to have the dataset use the streaming support:
DB[:table].stream.each{|row| ...}
If you want to enable streaming for all of a database’s datasets, you can do the following:
DB.stream_all_queries = true
Note that pg 0.14.1+ is required for streaming to work. This is not required by the gem, as it is only a requirement for streaming, not for general use.
gem install sequel_pg
Note that by default sequel_pg only supports result sets with up to 256 columns. If you will have a result set with more than 256 columns, you should modify the maximum supported number of columns via:
gem install sequel_pg -- --with-cflags=\"-DSPG_MAX_FIELDS=512\"
Make sure the pg_config binary is in your PATH so the installation can find the PostgreSQL shared library and header files. Alternatively, you can use the POSTGRES_LIB and POSTGRES_INCLUDE environment variables to specify the shared library and header directories.
While previous versions of this gem supported Windows, the current version does not, due to the need to call C functions defined in the pg gem.
sequel_pg doesn’t ship with it’s own specs. It’s designed to replace a part of Sequel, so it just uses Sequel’s specs. Specifically, the spec_postgres rake task from Sequel.
sequel_pg uses GitHub Issues for tracking issues/bugs:
http://github.com/jeremyevans/sequel_pg/issues
The source code is on GitHub:
http://github.com/jeremyevans/sequel_pg
To get a copy:
git clone git://github.com/jeremyevans/sequel_pg.git
There are only a few requirements, which you should probably have before considering use of the library:
-
Rake
-
Sequel
-
pg
-
libpq headers and library
To build the library from a git checkout, after installing the requirements:
rake build
sequel_pg has been tested on the following:
-
ruby 1.8.7
-
ruby 1.9.3
-
ruby 2.0.0
-
ruby 2.1.4
-
rbx 2.2.9
-
You must be using the ISO PostgreSQL date format (which is the default). Using the SQL, POSTGRESQL, or GERMAN date formats will result in incorrect date/timestamp handling. In addition to PostgreSQL defaulting to ISO, Sequel also manually sets the date format to ISO by default, so unless you are overriding that setting (via Sequel::Postgres.use_iso_date_format = false), you should be OK.
-
Adding your own type conversion procs only has an effect if those types are not handled by default.
-
You do not need to require the library, the sequel postgres adapter will require it automatically. If you are using bundler, you should add it to your Gemfile like so:
gem 'sequel_pg', :require=>'sequel'
-
sequel_pg currently calls functions defined in the pg gem, which does not work on Windows and does not work in some unix-like operating systems that disallow undefined functions in shared libraries. If
RbConfig::CONFIG['LDFLAGS']
contains-Wl,--no-undefined
, you’ll probably have issues installing sequel_pg. You should probably fixRbConfig::CONFIG['LDFLAGS']
in that case.
Jeremy Evans <code@jeremyevans.net>