/parsley

A fast and modern parser combinator library for Scala

Primary language: Scala. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

What is Parsley?

Parsley is a fast and modern parser combinator library for Scala based loosely on a Haskell-style parsec API.

How do I use it?

Parsley is distributed on Maven Central, and can be added to your project via:

// SBT
libraryDependencies += "com.github.j-mie6" %% "parsley" % "4.5.2"

// scala-cli
--dependency com.github.j-mie6::parsley:4.5.2
// or in file
//> using dep com.github.j-mie6::parsley:4.5.2

// mill
ivy"com.github.j-mie6::parsley:4.5.2"

Documentation can be found here

If you're a cats user, you may also be interested in using parsley-cats to augment parsley with instances for various cats typeclasses:

libraryDependencies += "com.github.j-mie6" %% "parsley-cats" % "1.3.0"

Examples

scala> import parsley.Parsley
scala> import parsley.syntax.character.{charLift, stringLift}

scala> val hello: Parsley[Unit] = ('h' ~> ("ello" | "i") ~> " world!").void
scala> hello.parse("hello world!")
val res0: parsley.Result[String,Unit] = Success(())
scala> hello.parse("hi world!")
val res1: parsley.Result[String,Unit] = Success(())
scala> hello.parse("hey world!")
val res2: parsley.Result[String,Unit] =
Failure((line 1, column 2):
  unexpected "ey"
  expected "ello"
  >hey world!
    ^^)

scala> import parsley.character.digit
scala> val natural: Parsley[Int] = digit.foldLeft1(0)((n, d) => n * 10 + d.asDigit)
scala> natural.parse("0")
val res3: parsley.Result[String,Int] = Success(0)
scala> natural.parse("123")
val res4: parsley.Result[String,Int] = Success(123)

For more see the Wiki!

What are the differences to Haskell's parsec?

For the most part, this library is quite similar to parsec. However, because Scala allows different operator characters than Haskell, a few operators have been renamed:

  • (<$>) is known as map
  • try is known as attempt
  • ($>) is known as either #> or as

In addition, lift2 and lift3 are uncurried in this library: this is to provide better performance and easier usage with Scala's traditionally uncurried functions. There are also a few new operators in general to be found here!
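
As a rough illustration of the renamed operators and the uncurried lift2, here is a minimal sketch. The object name ParsecStyle and the parsers inside it are made up for this example; it assumes parsley 4.5.2, that lift2 is imported from parsley.lift, and that it takes the function first, followed by the two parsers.

```scala
//> using dep com.github.j-mie6::parsley:4.5.2
import parsley.Parsley
import parsley.character.{digit, string}
import parsley.lift.lift2

// Hypothetical container for the example parsers (not part of the library)
object ParsecStyle {
  // parsec's (<$>): transform the result of a parser with map
  val digitValue: Parsley[Int] = digit.map(_.asDigit)

  // parsec's ($>): replace the result of a parser with a constant using as
  val yes: Parsley[Boolean] = string("yes").as(true)

  // lift2 takes an ordinary two-argument (uncurried) function and two parsers
  val twoDigits: Parsley[Int] =
    lift2[Int, Int, Int]((tens, units) => tens * 10 + units, digitValue, digitValue)
}
```

Here ParsecStyle.twoDigits runs digitValue twice in sequence and combines both results with the supplied function, which is the role liftA2 plays in Haskell.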

Library Evolution

Parsley is a modern parser combinator library, which strives to be on the bleeding-edge of parser combinator library design. This means that improvements will come naturally over time. Feel free to suggest improvements for consideration, as well as high-level problems you commonly encounter that we may be able to find a way to mitigate (see the Design Patterns for Parser Combinators paper for example!).

Frequency of Major Changes

Part of innovation is being willing to admit design mistakes and rectify them: when a binary-breaking release is made, the opportunity may be taken to polish parts of the library's API that are clunky, or could be better organised or improved. For example, see the differences between parsley-3.3.10 and parsley-4.0.0! However, constant breaking changes are not a good way to encourage the use of a library, as users often want stability: to that end, annoyances and bugbears with the API are only addressed approximately yearly, and the frequency of these will decrease over time. For future major releases, care will be taken, wherever possible, to publish all patch-level changes in a final version of the previous major.minor line, and then all minor-level changes as a final major.(minor+1).0 version, before releasing the major-level changes as (major+1).0.0: this will allow users stuck on the old version to benefit as much as possible from the fixes and new functionality.

Versioning Policy

As of 4.0.0, parsley is strictly committed to early-semver, which means that the version numbers are significant:

  • Two versions x._._ and y._._ with x != y are incompatible with each other at a binary level: having x._._ on the classpath with code compiled against y._._ will most likely result in a linkage error at runtime.
  • Two versions a.x._ and a.y._ with x <= y are binary compatible, which means that code compiled against a.x._ will still work with a.y._ on the classpath. A "source" component y > x indicates that a.y._ has added or deprecated functionality since a.x._.
  • Two versions a.b.x and a.b.y are binary and source compatible, which means there are no compatibility concerns between the two versions. Code compiled against a.b.x will run with a.b.y on the classpath and vice-versa. A "patch" component y > x indicates that a.b.y fixes issues (bugs or poor performance) with a.b.x.

In short, if you are on version a.x.y, you can upgrade to version a.x.z with z > y without worry, and upgrade to a.z._ with z > x with a possible (but rare) need to make minor updates to your code. Occasionally, a "source" component bump may deprecate functionality, but it will provide a migration to tell you how to avoid the deprecation warning. Altered or deprecated functionality may be hidden from the public API in a binary backwards compatible way in a "source" bump, and therefore may require updating when recompiled; this will be done sparingly and with minimal disruption so as not to discourage updating the library, and any immediate migration changes to user code from a.x._ to any a.y._ with y > x will be documented in a.y._'s release.
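
The rules above can be sketched as a small classification function. This is only an illustration of the policy as stated here, for versions from 4.0.0 onwards: EarlySemVer, Version, Compatibility and classify are hypothetical names and are not part of parsley or of any build tool.

```scala
// Hypothetical helper illustrating the policy; not part of parsley
object EarlySemVer {
  final case class Version(major: Int, minor: Int, patch: Int)

  sealed trait Compatibility
  case object BinaryAndSource extends Compatibility // same a.b: safe in both directions
  case object BinaryOnly extends Compatibility      // newer minor: old binaries still link
  case object Incompatible extends Compatibility    // no guarantee under the policy

  // Classifies moving from the version the code was compiled against (from)
  // to the version that is found on the classpath (to).
  def classify(from: Version, to: Version): Compatibility =
    if (from.major != to.major) Incompatible
    else if (from.minor == to.minor) BinaryAndSource
    else if (from.minor < to.minor) BinaryOnly
    else Incompatible // a minor downgrade may be missing newly added functionality
}
```

For instance, classifying a move from 4.4.1 to 4.5.2 gives BinaryOnly, whereas a move from 3.3.10 to 4.0.0 gives Incompatible.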

Note: all functionality marked as private[parsley] or within the parsley.internal package does not adhere to early-semver and may be removed or changed at will, with no impact on regular/intended use of the library.

Release Candidates and Milestones

Occasionally, a minor (source) release will contain either a significant body of new work, or a significant rework of some internal machinery. In these cases additional versioning may be employed:

  • Experimental (and volatile) new functionality may be iterated with a.b.0-Mn versions: these are (hopefully) working pre-release versions of the functionality, subject to even binary incompatible changes between M versions. When the new API and behaviour becomes stable, the release graduates to the a.b.0-RC1 release candidate.
  • Release candidates are used to iron out any lingering issues with a minor release and potentially alter the finer points of the new functionality's behaviour. Binary compatibility will be preserved between RCx and RCy with y > x except in truly exceptional circumstances.
  • Finally, the release makes it to a.b.0 and is hopefully truly stable.

Version EoL (End of Life) Policy

Old versions of the library may still be given important bug-fixes after they have been made obsolete by a new release. In exceptional circumstances, performance problems may be addressed for old versions. The lifetime policy is as follows:

  • A major (binary) version reaches EoL a minimum of 6 months after its successor is released, unless an extension to its life is requested via an issue.
  • A minor (source) version reaches EoL immediately on the release of its successor, unless its successor issued deprecations, in which case it will reach EoL after a minimum of 3 months.

Some more minor bugfixes may not be ported to a previous version if (a) the bug does not appear in that version or (b) the code has changed too much internally to make porting feasible.

An exception to this policy is made for any version 3.x.y, which reaches EoL effective immediately (December 2022) excluding exceptional circumstances.

Version   Released On           EoL Status
3.3.0     7th January 2022      EoL reached (3.3.10)
4.0.0     30th November 2022    EoL reached (4.0.4)
4.1.0     18th January 2023     EoL reached (4.1.8)
4.2.0     22nd January 2023     EoL reached (4.2.14)
4.3.0     8th July 2023         EoL reached (4.3.1)
4.4.0     6th October 2023      EoL reached (4.4.1)
4.5.0     6th January 2024      Enjoying indefinite support

Bug Reports

If you encounter a bug when using Parsley, try to minimise the parser (and the input) that triggers it. If possible, make a self-contained example: this will help to identify the problem without too much difficulty.

How does it work?

Parsley represents parsers as an abstract syntax tree (AST), which is constructed lazily. As a result, Parsley is able to perform analysis and optimisations on your parsers, which helps reduce the burden on you, the programmer. This representation is then compiled into a light-weight stack-based instruction set designed to run fast on the JVM. This is what gives Parsley its competitive performance, but for best effect a parser should be compiled once and used many times (so-called hot execution).
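
One way to follow the "compile once, use many times" advice is to keep each parser in a val and reuse that same value for every input, rather than rebuilding it in a def on each call. The sketch below assumes parsley 4.5.2; HotParser and sumAll are hypothetical names chosen for the example.

```scala
//> using dep com.github.j-mie6::parsley:4.5.2
import parsley.Parsley
import parsley.character.digit

// Hypothetical container for the example (not part of the library)
object HotParser {
  // Stored in a val: the parser is built once and the same value is reused,
  // so it does not need to be reconstructed for each input
  val natural: Parsley[Int] = digit.foldLeft1(0)((n, d) => n * 10 + d.asDigit)

  // Runs the shared parser over many inputs, skipping any input that fails to parse
  def sumAll(inputs: List[String]): Int =
    inputs.flatMap(input => natural.parse(input).toOption).sum
}
```

Had natural been written as a def, each call to sumAll would construct a fresh parser for every input, which works against the hot-execution advice.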

To make recursive parsers work in this AST format, you must ensure that recursion is done by knot-tying: you should define all recursive parsers with val and introduce lazy val where necessary for the compiler to accept the definition.
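
A minimal sketch of such a knot-tied definition is shown below, assuming parsley 4.5.2. The parser nested is a made-up example that accepts a natural number wrapped in any number of balanced parentheses; it refers to itself, so it is given an explicit type and declared as a lazy val.

```scala
//> using dep com.github.j-mie6::parsley:4.5.2
import parsley.Parsley
import parsley.character.digit
import parsley.syntax.character.charLift

// Hypothetical container for the example (not part of the library)
object Nested {
  val natural: Parsley[Int] = digit.foldLeft1(0)((n, d) => n * 10 + d.asDigit)

  // The recursion is expressed by the value referring to itself (knot-tying),
  // not by a def that would build a new parser on every call
  lazy val nested: Parsley[Int] = natural | ('(' ~> nested <~ ')')
}
```

With this definition, Nested.nested.parse("((42))") succeeds with 42, while an input with an unclosed parenthesis such as "((42)" fails.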

References