
Ruby gem for parsing EDN data, written as a C extension using Ragel.


edn_turbo 0.5.5

Fast Ragel-based EDN parser for Ruby.

edn_turbo is a parser plugin for edn. With a few exceptions, edn_turbo provides the same functionality as the edn gem, but because its parser is implemented as a native extension, it is roughly an order of magnitude faster.

Some quick sample runs comparing the time taken to parse the same input with edn and with edn_turbo (see issue 12):

    irb(main):001:0> require 'benchmark'
    => true
    irb(main):002:0> require 'edn'
    => true
    irb(main):003:0> s = "[{\"x\" {\"id\" \"/model/952\", \"model_name\" \"person\", \"ancestors\" [\"record\" \"asset\"], \"format\" \"edn\"}, \"id\" 952, \"name\" nil, \"model_name\" \"person\", \"rel\" {}, \"description\" nil, \"age\" nil, \"updated_at\" nil, \"created_at\" nil, \"anniversary\" nil, \"job\" nil, \"start_date\" nil, \"username\" nil, \"vacation_start\" nil, \"vacation_end\" nil, \"expenses\" nil, \"rate\" nil, \"display_name\" nil, \"gross_profit_per_month\" nil}]"
    => "[{\"x\" {\"id\" \"/model/952\", \"model_name\" \"person\", \"ancestors\" [\"record\" \"asset\"], \"format\" \"edn\"}, \"id\" 952, \"name\" nil, \"model_name\" \"person\", \"rel\" {}, \"description\" nil, \"age\" nil, \"updated_at\" nil, \"created_at\" nil, \"anniversary\" nil, \"job\" nil, \"start_date\" nil, \"username\" nil, \"vacation_start\" nil, \"vacation_end\" nil, \"expenses\" nil, \"rate\" nil, \"display_name\" nil, \"gross_profit_per_month\" nil}]"
    irb(main):004:0> Benchmark.realtime { 100.times { EDN::read(s) } }
    => 0.083543
    irb(main):005:0> Benchmark.realtime { 100000.times { EDN::read(s) } }
    => 73.901049
    irb(main):006:0> require 'edn_turbo'
    => true
    irb(main):007:0> Benchmark.realtime { 100.times { EDN::read(s) } }
    => 0.007321
    irb(main):008:0> Benchmark.realtime { 100000.times { EDN::read(s) } }
    => 2.866411
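
To put the timings above in terms of a speedup factor, here is a small sketch in plain Ruby. The helper names `time_reads` and `speedup` are made up for this illustration and are not part of edn or edn_turbo; the figures are copied from the irb session.

```ruby
require 'benchmark'

# Hypothetical helper: run the block `iterations` times and return the
# elapsed wall-clock time in seconds.
def time_reads(iterations)
  Benchmark.realtime { iterations.times { yield } }
end

# Hypothetical helper: how many times faster the candidate is than the baseline.
def speedup(baseline_seconds, candidate_seconds)
  baseline_seconds / candidate_seconds
end

# Timings taken from the sample session (edn first, edn_turbo second).
puts speedup(0.083543, 0.007321).round(1)   # 100 reads      -> prints 11.4
puts speedup(73.901049, 2.866411).round(1)  # 100,000 reads  -> prints 25.8
```

With either gem loaded, a call such as `time_reads(100) { EDN.read(s) }` should give a figure comparable to the `Benchmark.realtime` lines in the session, though the numbers will vary by machine.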

Dependencies

Notes:

  • edn_turbo uses a Ragel-based parser, but the generated .cc file is bundled with the gem, so Ragel should not need to be installed.

  • If the gem fails to install because of a compilation error, make sure icu4c is installed. The error reported by gem install does not make it clear that a missing icu4c is the cause.

Usage

Simply require 'edn_turbo' instead of 'edn'. Otherwise (with the exceptions noted below) the API is the same as the edn gem.

    require 'edn_turbo'

    File.open(filename) do |file|
      output = EDN.read(file)
      pp output if output != EOF
    end

    # also accepts a string
    pp EDN.read("[ 1 2 3 abc ]")

    # metadata
    e = EDN.read('^String ^:foo ^{:foo false :tag Boolean :bar 2} [1 2]')
    pp e          # -> [1, 2]
    pp e.metadata # -> {:foo=>true, :tag=>#<EDN::Type::Symbol:0x007fdbea8a29b0 @symbol=:String>, :bar=>2}
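
The metadata result above can look surprising at first: `:foo` ends up `true` and `:tag` ends up `String` even though the map literal says otherwise. The sketch below models that outcome in plain Ruby, assuming the entries are applied right to left so that entries written further left override those to their right. It is an illustration of the observed result, not edn_turbo's implementation; `merge_metadata` is a hypothetical name, the `^String` and `^:foo` shorthands are expanded by hand, and plain Ruby symbols stand in for EDN symbols.

```ruby
# Hypothetical helper: combine metadata entries given in the order they
# were written. The rightmost entry is applied first, so entries further
# left override it.
def merge_metadata(entries)
  entries.reverse.reduce({}) { |merged, entry| merged.merge(entry) }
end

entries = [
  { tag: :String },                          # ^String
  { foo: true },                             # ^:foo
  { foo: false, tag: :Boolean, bar: 2 }      # ^{:foo false :tag Boolean :bar 2}
]

result = merge_metadata(entries)
puts result == { foo: true, tag: :String, bar: 2 }  # prints true
```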

Or instantiate and reuse an instance of a parser:

    require 'edn_turbo'

    # named `parser` rather than `p` to avoid shadowing Kernel#p
    parser = EDN::new_parser
    File.open(filename) do |file|
      output = parser.parse(file)
      pp output if output != EOF
    end

    # with a string
    pp parser.parse("[ 1 2 3 abc ]")

    # set new input
    s = "(1) :abc { 1 2 }"
    parser.set_input(s)

    # parse token by token
    loop do
      t = parser.read
      break if t == EOF

      pp t
    end
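
The token-by-token loop above can be wrapped in a small helper if the tokens are wanted as an enumerable. This is only a sketch: `each_token` and `StubParser` are hypothetical names, the stub exists so the example runs without the gem installed, and the `:eof` sentinel is a placeholder rather than edn_turbo's actual EOF value.

```ruby
# Hypothetical helper: yield each value returned by `parser.read` until
# the `eof` sentinel comes back. Returns an Enumerator without a block.
def each_token(parser, eof)
  return enum_for(:each_token, parser, eof) unless block_given?

  loop do
    token = parser.read
    break if token == eof

    yield token
  end
end

# Hypothetical stand-in for a parser object; it only mimics `read`.
class StubParser
  def initialize(tokens, eof)
    @tokens = tokens.dup
    @eof = eof
  end

  def read
    @tokens.empty? ? @eof : @tokens.shift
  end
end

eof = :eof # placeholder sentinel for this sketch
# Plain Ruby values standing in for the parsed forms of "(1) :abc { 1 2 }".
stub = StubParser.new([[1], :abc, { 1 => 2 }], eof)
tokens = each_token(stub, eof).to_a
p tokens
```

With edn_turbo loaded, the same helper would presumably be called with a parser from `EDN::new_parser` (after `set_input`) and the same `EOF` value the loop above compares against.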

Differences with edn gem

edn_turbo reads String and core IO inputs directly through C API calls. Data from StringIO sources, however, is pulled in by calling the object's read method back on the Ruby side.
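
If the data is already held in a StringIO, one option is to hand the parser the underlying String instead, so that it takes the String path described above. The sketch below shows that unwrapping step only; `unwrap_string_io` is a hypothetical helper name, and whether this makes a measurable difference is an assumption that has not been benchmarked here. Note that `StringIO#string` returns the whole buffer regardless of the current read position.

```ruby
require 'stringio'

# Hypothetical helper: return the backing String of a StringIO, or the
# input itself for anything else (String, File, ...).
def unwrap_string_io(source)
  source.is_a?(StringIO) ? source.string : source
end

buffered = StringIO.new('[1 2 3]')
input = unwrap_string_io(buffered)
puts input.class  # prints String
```

The result can then be passed to `EDN.read` in place of the StringIO object.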

Known problems

v0.3.2:

  • Some corner cases involving operators and spacing remain unhandled. edn_turbo parses 1 / 12 and 1/ 12, but 1/12 and 1 /12 cause parse errors because /12 is treated as an invalid symbol.