Chomp

Chomp is a fast monadic-style parser combinator library designed to work on stable Rust. It was written as the culmination of experiments detailed in a series of blog posts.

For its current capabilities, you will find that Chomp consistently performs as well as, if not better than, optimized C parsers, while being vastly more expressive. For an example that builds a performant HTTP parser out of smaller parsers, see http_parser.rs.

Installation

Add the following line to the dependencies section of your Cargo.toml:

[dependencies]
chomp = "0.3.1"

Usage

Parsers are functions from a slice over an input type Input<I> to a ParseResult<I, T, E>, which may be thought of as one of three outcomes: a success resulting in type T, an error of type E, or a partially completed result which may still consume more input of type I.
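As a rough illustration of those three outcomes, the sketch below models them with a plain enum using only the standard library. The names Outcome and digit are hypothetical stand-ins chosen for this example; this is not how Chomp defines ParseResult, only a way to picture the states a parser can end in.

```rust
/// Hypothetical stand-in for the three outcomes described above.
/// This is NOT Chomp's actual `ParseResult` definition.
#[derive(Debug, PartialEq)]
enum Outcome<'a, T, E> {
    /// Parsing succeeded: the unconsumed remainder plus the parsed value.
    Success(&'a [u8], T),
    /// Parsing failed with an error of type `E`.
    Failure(E),
    /// The input ended before the parser could decide either way.
    Incomplete,
}

/// Parse a single ASCII digit from the front of `input`.
fn digit(input: &[u8]) -> Outcome<'_, u8, &'static str> {
    match input.split_first() {
        // No bytes left: more input might still turn this into a success.
        None => Outcome::Incomplete,
        // A digit: consume it and return its numeric value.
        Some((&c, rest)) if c.is_ascii_digit() => Outcome::Success(rest, c - b'0'),
        // Anything else is a definite failure.
        Some(_) => Outcome::Failure("expected digit"),
    }
}
```

With this model, digit(b"7x") succeeds with the value 7 and the remainder b"x", digit(b"x7") fails, and digit(b"") reports that more input is needed.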

The input type is almost never manipulated manually. Rather, one uses parsers from Chomp by invoking the parse! macro. This macro was intentionally designed to be as close as possible to Haskell's do-syntax or F#'s "computation expressions", both of which are used to sequence monadic computations. At a very high level, this macro allows one to declaratively:

  • Sequence parsers, while short circuiting the rest of the parser if any step fails.
  • Bind previous successful results to be used later in the computation.
  • Return a composite data structure using the previous results at the end of the computation.

In other words, just as a normal Rust function usually looks something like this:

fn f() -> (u8, u8, u8) {
    let a = read_digit();
    let b = read_digit();
    launch_missiles();
    return (a, b, a + b);
}

A Chomp parser with a similar structure looks like this:

fn f<I: U8Input>(i: I) -> SimpleResult<I, (u8, u8, u8)> {
    parse!{i;
        let a = digit();
        let b = digit();
                string(b"missiles");
        ret (a, b, a + b)
    }
}
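To picture what this kind of sequencing amounts to, here is a standard-library-only analogy that chains the same three steps by hand with Result::and_then. It does not use Chomp and should not be read as the expansion of parse!; Parsed, digit, and string below are hypothetical stand-ins, and the incomplete state is left out for brevity.

```rust
/// Hypothetical mini-parser result: on success, the remaining input and a value.
/// Not Chomp's `SimpleResult`; the incomplete state is not modelled here.
type Parsed<'a, T> = Result<(&'a [u8], T), &'static str>;

/// Consume one ASCII digit and return its numeric value.
fn digit(i: &[u8]) -> Parsed<'_, u8> {
    match i.split_first() {
        Some((&c, rest)) if c.is_ascii_digit() => Ok((rest, c - b'0')),
        _ => Err("expected digit"),
    }
}

/// Consume the exact byte string `expected`, discarding it.
fn string<'a>(i: &'a [u8], expected: &[u8]) -> Parsed<'a, ()> {
    if i.starts_with(expected) {
        Ok((&i[expected.len()..], ()))
    } else {
        Err("expected string")
    }
}

/// Hand-written sequencing: each step runs only if the previous one succeeded,
/// and earlier results (`a`, `b`) stay in scope for the final value.
fn f(i: &[u8]) -> Parsed<'_, (u8, u8, u8)> {
    digit(i).and_then(|(i, a)| {
        digit(i).and_then(|(i, b)| {
            string(i, b"missiles").map(|(i, _)| (i, (a, b, a + b)))
        })
    })
}
```

In this sketch, f(b"12missiles!") yields the tuple (1, 2, 3) with b"!" left over, while f(b"1xmissiles") stops at the second digit and never attempts the string match.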

And to implement read_digit we can utilize the map function to manipulate any success value while preserving any error or incomplete state:

// Standard Rust, no error handling:
fn read_digit() -> u8 {
    let mut s = String::new();
    std::io::stdin().read_line(&mut s).unwrap();
    s.trim().parse().unwrap()
}

// Chomp, error handling built in, and we make sure we only get a number:
fn read_digit<I: U8Input>(i: I) -> SimpleResult<I, u8> {
    satisfy(i, |c| b'0' <= c && c <= b'9').map(|c| c - b'0')
}
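The idea that map only touches the success value can be sketched without Chomp as well. The example below is a minimal standard-library model, assuming a hypothetical Parsed type and a hypothetical satisfy_byte helper; neither is part of Chomp's API, and the incomplete state is again omitted.

```rust
/// Hypothetical result type for this sketch (not Chomp's `SimpleResult`).
type Parsed<'a, T> = Result<(&'a [u8], T), &'static str>;

/// Stand-in for a satisfy-style parser: accept one byte for which `pred` holds.
fn satisfy_byte(i: &[u8], pred: impl Fn(u8) -> bool) -> Parsed<'_, u8> {
    match i.split_first() {
        Some((&c, rest)) if pred(c) => Ok((rest, c)),
        _ => Err("unexpected input"),
    }
}

/// Convert the matched ASCII digit to its numeric value.
/// `map` runs only on success; an `Err` passes through unchanged.
fn read_digit(i: &[u8]) -> Parsed<'_, u8> {
    satisfy_byte(i, |c| c.is_ascii_digit()).map(|(rest, c)| (rest, c - b'0'))
}
```

Here read_digit(b"42") produces the value 4 with b"2" remaining, and read_digit(b"x") returns the error from satisfy_byte untouched.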

For more documentation, see the rust-doc output.

Example

#[macro_use]
extern crate chomp;

use chomp::prelude::*;

#[derive(Debug, Eq, PartialEq)]
struct Name<B: Buffer> {
    first: B,
    last:  B,
}

fn name<I: U8Input>(i: I) -> SimpleResult<I, Name<I::Buffer>> {
    parse!{i;
        let first = take_while1(|c| c != b' ');
                    token(b' ');  // skipping this char
        let last  = take_while1(|c| c != b'\n');

        ret Name{
            first: first,
            last:  last,
        }
    }
}

fn main() {
    // parse_only runs the parser on a complete byte slice.
    assert_eq!(parse_only(name, "Martin Wernstål\n".as_bytes()), Ok(Name{
        first: &b"Martin"[..],
        last: "Wernstål".as_bytes()
    }));
}

License

Licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Contact

File an issue here on GitHub or visit gitter.im/m4rw3r/chomp.