Pyrsistent is a number of persistent collections (by some referred to as functional data structures). Persistent in the sense that they are immutable.
All methods on a data structure that would normally mutate it instead return a new copy of the structure containing the requested updates. The original structure is left untouched.
This will simplify the reasoning about what a program does since no hidden side effects ever can take place to these data structures. You can rest assured that the object you hold a reference to will remain the same throughout its lifetime and need not worry that somewhere five stack levels below you in the darkest corner of your application someone has decided to remove that element that you expected to be there.
Pyrsistent is influenced by persistent data structures such as those found in the standard library of Clojure. The data structures are designed to share common elements through path copying. It aims at taking these concepts and make them as pythonic as possible so that they can be easily integrated into any python program without hassle.
If you want use literal syntax to define them in your code rather than function calls check out Pyrthon. Be aware, that one is experimental, unmaintained and alpha software.
If you cannot find the persistent data structure you're looking for here you may want to take a look at Pyrsistent_extras which is maintained by @mingmingrr. If you still don't find what you're looking for please open an issue for discussion. If we agree that functionality is missing you may want to go ahead and create a Pull Request implement the missing functionality.
The collection types and key features currently implemented are:
- PVector, similar to a python list
- PMap, similar to dict
- PSet, similar to set
- PRecord, a PMap on steroids with fixed fields, optional type and invariant checking and much more
- PClass, a Python class fixed fields, optional type and invariant checking and much more
- Checked collections, PVector, PMap and PSet with optional type and invariance checks and more
- PBag, similar to collections.Counter
- PList, a classic singly linked list
- PDeque, similar to collections.deque
- Immutable object type (immutable) built on the named tuple
- freeze and thaw functions to convert between pythons standard collections and pyrsistent collections.
- Flexible transformations of arbitrarily complex structures built from PMaps and PVectors.
Below are examples of common usage patterns for some of the structures and features. More information and full documentation for all data structures is available in the documentation.
With full support for the Sequence protocol PVector is meant as a drop in replacement to the built in list from a readers point of view. Write operations of course differ since no in place mutation is done but naming should be in line with corresponding operations on the built in list.
Support for the Hashable protocol also means that it can be used as key in Mappings.
Appends are amortized O(1). Random access and insert is log32(n) where n is the size of the vector.
>>> from pyrsistent import v, pvector
# No mutation of vectors once created, instead they
# are "evolved" leaving the original untouched
>>> v1 = v(1, 2, 3)
>>> v2 = v1.append(4)
>>> v3 = v2.set(1, 5)
>>> v1
pvector([1, 2, 3])
>>> v2
pvector([1, 2, 3, 4])
>>> v3
pvector([1, 5, 3, 4])
# Random access and slicing
>>> v3[1]
5
>>> v3[1:3]
pvector([5, 3])
# Iteration
>>> list(x + 1 for x in v3)
[2, 6, 4, 5]
>>> pvector(2 * x for x in range(3))
pvector([0, 2, 4])
With full support for the Mapping protocol PMap is meant as a drop in replacement to the built in dict from a readers point of view. Support for the Hashable protocol also means that it can be used as key in other Mappings.
Random access and insert is log32(n) where n is the size of the map.
>>> from pyrsistent import m, pmap, v
# No mutation of maps once created, instead they are
# "evolved" leaving the original untouched
>>> m1 = m(a=1, b=2)
>>> m2 = m1.set('c', 3)
>>> m3 = m2.set('a', 5)
>>> m1
pmap({'a': 1, 'b': 2})
>>> m2
pmap({'a': 1, 'c': 3, 'b': 2})
>>> m3
pmap({'a': 5, 'c': 3, 'b': 2})
>>> m3['a']
5
# Evolution of nested persistent structures
>>> m4 = m(a=5, b=6, c=v(1, 2))
>>> m4.transform(('c', 1), 17)
pmap({'a': 5, 'c': pvector([1, 17]), 'b': 6})
>>> m5 = m(a=1, b=2)
# Evolve by merging with other mappings
>>> m5.update(m(a=2, c=3), {'a': 17, 'd': 35})
pmap({'a': 17, 'c': 3, 'b': 2, 'd': 35})
>>> pmap({'x': 1, 'y': 2}) + pmap({'y': 3, 'z': 4})
pmap({'y': 3, 'x': 1, 'z': 4})
# Dict-like methods to convert to list and iterate
>>> m3.items()
pvector([('a', 5), ('c', 3), ('b', 2)])
>>> list(m3)
['a', 'c', 'b']
With full support for the Set protocol PSet is meant as a drop in replacement to the built in set from a readers point of view. Support for the Hashable protocol also means that it can be used as key in Mappings.
Random access and insert is log32(n) where n is the size of the set.
>>> from pyrsistent import s
# No mutation of sets once created, you know the story...
>>> s1 = s(1, 2, 3, 2)
>>> s2 = s1.add(4)
>>> s3 = s1.remove(1)
>>> s1
pset([1, 2, 3])
>>> s2
pset([1, 2, 3, 4])
>>> s3
pset([2, 3])
# Full support for set operations
>>> s1 | s(3, 4, 5)
pset([1, 2, 3, 4, 5])
>>> s1 & s(3, 4, 5)
pset([3])
>>> s1 < s2
True
>>> s1 < s(3, 4, 5)
False
A PRecord is a PMap with a fixed set of specified fields. Records are declared as python classes inheriting from PRecord. Because it is a PMap it has full support for all Mapping methods such as iteration and element access using subscript notation.
It is possible to add type information to the record to enforce type checks. Multiple allowed types can be specified by providing an iterable of types.
Custom types (classes) that are iterable should be wrapped in a tuple to prevent their members being added to the set of valid types. Although Enums in particular are now supported without wrapping, see #83 for more information.
Fields are not mandatory by default but can be specified as such. If fields are missing an InvariantException will be thrown which contains information about the missing fields.
It is possible to add invariants that must hold when evolving the record. Invariants can be specified on both field and record level. If invariants fail an InvariantException will be thrown which contains information about the failing invariants. An invariant function should return a tuple consisting of a boolean that tells if the invariant holds or not and an object describing the invariant. This object can later be used to identify which invariant that failed.
The global invariant function is only executed if all field invariants hold.
Global invariants are inherited to subclasses.
>>> class RestrictedVector(PRecord):
... __invariant__ = lambda r: (r.y >= r.x, 'x larger than y')
... x = field(invariant=lambda x: (x > 0, 'x negative'))
... y = field(invariant=lambda y: (y > 0, 'y negative'))
...
>>> r = RestrictedVector(y=3, x=2)
>>> try:
... r.set(x=-1, y=-2)
... except InvariantException as e:
... print(e.invariant_errors)
...
('y negative', 'x negative')
>>> try:
... r.set(x=2, y=1)
... except InvariantException as e:
... print(e.invariant_errors)
...
('x larger than y',)
Invariants may also contain multiple assertions. For those cases the invariant function should return a tuple of invariant tuples as described above. This structure is reflected in the invariant_errors attribute of the exception which will contain tuples with data from all failed invariants. Eg:
It's possible to specify factory functions for fields. The factory function receives whatever is supplied as field value and the actual returned by the factory is assigned to the field given that any type and invariant checks hold. PRecords have a default factory specified as a static function on the class, create(). It takes a Mapping as argument and returns an instance of the specific record. If a record has fields of type PRecord the create() method of that record will be called to create the "sub record" if no factory has explicitly been specified to override this behaviour.
It is also possible to have fields with pyrsistent
collections.
PRecords support serialization back to dicts. Default serialization will take keys and values "as is" and output them into a dict. It is possible to specify custom serialization functions to take care of fields that require special treatment.
>>> from datetime import date
>>> class Person(PRecord):
... name = field(type=unicode)
... birth_date = field(type=date,
... serializer=lambda format, d: d.strftime(format['date']))
...
>>> john = Person(name=u'John', birth_date=date(1985, 10, 21))
>>> john.serialize({'date': '%Y-%m-%d'})
{'birth_date': '1985-10-21', 'name': u'John'}
A PClass is a python class with a fixed set of specified fields. PClasses are declared as python classes inheriting from PClass. It is defined the same way that PRecords are and behaves like a PRecord in all aspects except that it is not a PMap and hence not a collection but rather a plain Python object.
Checked collections currently come in three flavors: CheckedPVector, CheckedPMap and CheckedPSet.
>>> from pyrsistent import CheckedPVector, CheckedPMap, CheckedPSet, thaw
>>> class Positives(CheckedPSet):
... __type__ = (long, int)
... __invariant__ = lambda n: (n >= 0, 'Negative')
...
>>> class Lottery(PRecord):
... name = field(type=str)
... numbers = field(type=Positives, invariant=lambda p: (len(p) > 0, 'No numbers'))
...
>>> class Lotteries(CheckedPVector):
... __type__ = Lottery
...
>>> class LotteriesByDate(CheckedPMap):
... __key_type__ = date
... __value_type__ = Lotteries
...
>>> lotteries = LotteriesByDate.create({date(2015, 2, 15): [{'name': 'SuperLotto', 'numbers': {1, 2, 3}},
... {'name': 'MegaLotto', 'numbers': {4, 5, 6}}],
... date(2015, 2, 16): [{'name': 'SuperLotto', 'numbers': {3, 2, 1}},
... {'name': 'MegaLotto', 'numbers': {6, 5, 4}}]})
>>> lotteries
LotteriesByDate({datetime.date(2015, 2, 15): Lotteries([Lottery(numbers=Positives([1, 2, 3]), name='SuperLotto'), Lottery(numbers=Positives([4, 5, 6]), name='MegaLotto')]), datetime.date(2015, 2, 16): Lotteries([Lottery(numbers=Positives([1, 2, 3]), name='SuperLotto'), Lottery(numbers=Positives([4, 5, 6]), name='MegaLotto')])})
# The checked versions support all operations that the corresponding
# unchecked types do
>>> lottery_0215 = lotteries[date(2015, 2, 15)]
>>> lottery_0215.transform([0, 'name'], 'SuperDuperLotto')
Lotteries([Lottery(numbers=Positives([1, 2, 3]), name='SuperDuperLotto'), Lottery(numbers=Positives([4, 5, 6]), name='MegaLotto')])
# But also makes asserts that types and invariants hold
>>> lottery_0215.transform([0, 'name'], 999)
Traceback (most recent call last):
PTypeError: Invalid type for field Lottery.name, was int
>>> lottery_0215.transform([0, 'numbers'], set())
Traceback (most recent call last):
InvariantException: Field invariant failed
# They can be converted back to python built ins with either thaw()
# or serialize() (which provides possibilities to customize serialization)
>>> thaw(lottery_0215)
[{'numbers': set([1, 2, 3]), 'name': 'SuperLotto'}, {'numbers': set([4, 5, 6]), 'name': 'MegaLotto'}]
>>> lottery_0215.serialize()
[{'numbers': set([1, 2, 3]), 'name': 'SuperLotto'}, {'numbers': set([4, 5, 6]), 'name': 'MegaLotto'}]
Transformations are inspired by the cool library instar for Clojure. They let you evolve PMaps and PVectors with arbitrarily deep/complex nesting using simple syntax and flexible matching syntax.
The first argument to transformation is the path that points out the value to transform. The second is the transformation to perform. If the transformation is callable it will be applied to the value(s) matching the path. The path may also contain callables. In that case they are treated as matchers. If the matcher returns True for a specific key it is considered for transformation.
# Basic examples
>>> from pyrsistent import inc, freeze, thaw, rex, ny, discard
>>> v1 = freeze([1, 2, 3, 4, 5])
>>> v1.transform([2], inc)
pvector([1, 2, 4, 4, 5])
>>> v1.transform([lambda ix: 0 < ix < 4], 8)
pvector([1, 8, 8, 8, 5])
>>> v1.transform([lambda ix, v: ix == 0 or v == 5], 0)
pvector([0, 2, 3, 4, 0])
# The (a)ny matcher can be used to match anything
>>> v1.transform([ny], 8)
pvector([8, 8, 8, 8, 8])
# Regular expressions can be used for matching
>>> scores = freeze({'John': 12, 'Joseph': 34, 'Sara': 23})
>>> scores.transform([rex('^Jo')], 0)
pmap({'Joseph': 0, 'Sara': 23, 'John': 0})
# Transformations can be done on arbitrarily deep structures
>>> news_paper = freeze({'articles': [{'author': 'Sara', 'content': 'A short article'},
... {'author': 'Steve', 'content': 'A slightly longer article'}],
... 'weather': {'temperature': '11C', 'wind': '5m/s'}})
>>> short_news = news_paper.transform(['articles', ny, 'content'], lambda c: c[:25] + '...' if len(c) > 25 else c)
>>> very_short_news = news_paper.transform(['articles', ny, 'content'], lambda c: c[:15] + '...' if len(c) > 15 else c)
>>> very_short_news.articles[0].content
'A short article'
>>> very_short_news.articles[1].content
'A slightly long...'
# When nothing has been transformed the original data structure is kept
>>> short_news is news_paper
True
>>> very_short_news is news_paper
False
>>> very_short_news.articles[0] is news_paper.articles[0]
True
# There is a special transformation that can be used to discard elements. Also
# multiple transformations can be applied in one call
>>> thaw(news_paper.transform(['weather'], discard, ['articles', ny, 'content'], discard))
{'articles': [{'author': 'Sara'}, {'author': 'Steve'}]}
PVector, PMap and PSet all have support for a concept dubbed evolvers. An evolver acts like a mutable view of the underlying persistent data structure with "transaction like" semantics. No updates of the original data structure is ever performed, it is still fully immutable.
The evolvers have a very limited API by design to discourage excessive, and inappropriate, usage as that would take us down the mutable road. In principle only basic mutation and element access functions are supported. Check out the documentation of each data structure for specific examples.
Examples of when you may want to use an evolver instead of working directly with the data structure include:
- Multiple updates are done to the same data structure and the intermediate results are of no interest. In this case using an evolver may be a more efficient and easier to work with.
- You need to pass a vector into a legacy function or a function that you have no control over which performs in place mutations. In this case pass an evolver instance instead and then create a new pvector from the evolver once the function returns.
>>> from pyrsistent import v
# In place mutation as when working with the built in counterpart
>>> v1 = v(1, 2, 3)
>>> e = v1.evolver()
>>> e[1] = 22
>>> e = e.append(4)
>>> e = e.extend([5, 6])
>>> e[5] += 1
>>> len(e)
6
# The evolver is considered *dirty* when it contains changes compared to the underlying vector
>>> e.is_dirty()
True
# But the underlying pvector still remains untouched
>>> v1
pvector([1, 2, 3])
# Once satisfied with the updates you can produce a new pvector containing the updates.
# The new pvector will share data with the original pvector in the same way that would have
# been done if only using operations on the pvector.
>>> v2 = e.persistent()
>>> v2
pvector([1, 22, 3, 4, 5, 7])
# The evolver is now no longer considered *dirty* as it contains no differences compared to the
# pvector just produced.
>>> e.is_dirty()
False
# You may continue to work with the same evolver without affecting the content of v2
>>> e[0] = 11
# Or create a new evolver from v2. The two evolvers can be updated independently but will both
# share data with v2 where possible.
>>> e2 = v2.evolver()
>>> e2[0] = 1111
>>> e.persistent()
pvector([11, 22, 3, 4, 5, 7])
>>> e2.persistent()
pvector([1111, 22, 3, 4, 5, 7])
These functions are great when your cozy immutable world has to interact with the evil mutable world outside.
By default, freeze will also recursively convert values inside PVectors and PMaps. This behaviour can be changed by providing freeze with the flag strict=False.
>>> from pyrsistent import freeze, v, m
>>> freeze(v(1, v(2, [3])))
pvector([1, pvector([2, pvector([3])])])
>>> freeze(v(1, v(2, [3])), strict=False)
pvector([1, pvector([2, [3]])])
>>> freeze(m(a=m(b={'c': 1})))
pmap({'a': pmap({'b': pmap({'c': 1})})})
>>> freeze(m(a=m(b={'c': 1})), strict=False)
pmap({'a': pmap({'b': {'c': 1}})})
In this regard, thaw operates as the inverse of freeze so will thaw values inside native data structures unless passed the strict=False flag.
Pyrsistent is developed and tested on Python 3.8+ and PyPy3.
Pyrsistent is developed with performance in mind. Still, while some operations are nearly on par with their built in, mutable, counterparts in terms of speed, other operations are slower. In the cases where attempts at optimizations have been done, speed has generally been valued over space.
Pyrsistent comes with two API compatible flavors of PVector (on which PMap and PSet are based), one pure Python implementation and one implemented as a C extension. The latter generally being 2 - 20 times faster than the former. The C extension will be used automatically when possible.
The pure python implementation is fully PyPy compatible. Running it under PyPy speeds operations up considerably if the structures are used heavily (if JITed), for some cases the performance is almost on par with the built in counterparts.
PEP 561 style type hints for use with mypy and various editors are available for most types and functions in pyrsistent.
Type classes for annotating your own code with pyrsistent types are also available under pyrsistent.typing.
pip install pyrsistent
Available at http://pyrsistent.readthedocs.org/
Brief presentation available at http://slides.com/tobiasgustafsson/immutability-and-python/
Tobias Gustafsson https://github.com/tobgu
Christopher Armstrong https://github.com/radix
Anders Hovmöller https://github.com/boxed
Itamar Turner-Trauring https://github.com/itamarst
Jonathan Lange https://github.com/jml
Richard Futrell https://github.com/Futrell
Jakob Hollenstein https://github.com/jkbjh
David Honour https://github.com/foolswood
David R. MacIver https://github.com/DRMacIver
Marcus Ewert https://github.com/sarum90
Jean-Paul Calderone https://github.com/exarkun
Douglas Treadwell https://github.com/douglas-treadwell
Travis Parker https://github.com/teepark
Julian Berman https://github.com/Julian
Dennis Tomas https://github.com/dtomas
Neil Vyas https://github.com/neilvyas
doozr https://github.com/doozr
Kamil Galuszka https://github.com/galuszkak
Tsuyoshi Hombashi https://github.com/thombashi
nattofriends https://github.com/nattofriends
agberk https://github.com/agberk
Waleed Khan https://github.com/arxanas
Jean-Louis Fuchs https://github.com/ganwell
Carlos Corbacho https://github.com/ccorbacho
Felix Yan https://github.com/felixonmars
benrg https://github.com/benrg
Jere Lahelma https://github.com/je-l
Max Taggart https://github.com/MaxTaggart
Vincent Philippon https://github.com/vphilippon
Semen Zhydenko https://github.com/ss18
Till Varoquaux https://github.com/till-varoquaux
Michal Kowalik https://github.com/michalvi
ossdev07 https://github.com/ossdev07
Kerry Olesen https://github.com/qhesz
johnthagen https://github.com/johnthagen
Bastien Vallet https://github.com/djailla
Ram Rachum https://github.com/cool-RR
Vincent Philippon https://github.com/vphilippon
Andrey Bienkowski https://github.com/hexagonrecursion
Ethan McCue https://github.com/bowbahdoe
Jason R. Coombs https://github.com/jaraco
Nathan https://github.com/ndowens
Geert Barentsen https://github.com/barentsen
phil-arh https://github.com/phil-arh
Tamás Nepusz https://github.com/ntamas
Hugo van Kemenade https://github.com/hugovk
Ben Beasley https://github.com/musicinmybrain
Noah C. Benson https://github.com/noahbenson
dscrofts https://github.com/dscrofts
Andy Reagan https://github.com/andyreagan
Aaron Durant https://github.com/Aaron-Durant
Joshua Munn https://github.com/jams2
Lukas https://github.com/lukasK9999
Arshad https://github.com/arshad-ml
Want to contribute? That's great! If you experience problems please log them on GitHub. If you want to contribute code, please fork the repository and submit a pull request.
Tests can be executed using tox.
Install tox: pip install tox
Run test for Python 3.8: tox -e py38
- pip install -r requirements.txt
- Update CHANGES.txt
- Update README.rst with any new contributors and potential info needed.
- Update _pyrsistent_version.py
- Commit and tag with new version: git add -u . && git commit -m 'Prepare version vX.Y.Z' && git tag -a vX.Y.Z -m 'vX.Y.Z'
- Push commit and tags: git push --follow-tags
- Build new release using Github actions
Pyrsistent can be considered stable and mature (who knows, there may even be a 1.0 some day :-)). The project is maintained, bugs fixed, PRs reviewed and merged and new releases made. I currently do not have time for development of new features or functionality which I don't have use for myself. I'm more than happy to take PRs for new functionality though!
There are a bunch of issues marked with enhancement
and help wanted
that contain requests for new functionality that would be nice to include. The level of difficulty and extend of the issues varies, please reach out to me if you're interested in working on any of them.
If you feel that you have a grand master plan for where you would like Pyrsistent to go and have the time to put into it please don't hesitate to discuss this with me and submit PRs for it. If all goes well I'd be more than happy to add additional maintainers to the project!