A small Python library that can clump lists of data together.
Part of a video series on calmcode.io.
Clumper allows you to quickly parse through a list of json-like data.
Here's an example of such a dataset.
pokemon = [
    {'name': 'Bulbasaur', 'type': ['Grass', 'Poison'], 'hp': 45, 'attack': 49},
    {'name': 'Charmander', 'type': ['Fire'], 'hp': 39, 'attack': 52},
    ...
]
Given this list of dictionaries we can write the following query:
from clumper import Clumper
clump = Clumper.read_json('https://calmcode.io/datasets/pokemon.json')
(clump
  .keep(lambda d: len(d['type']) == 1)
  .mutate(type=lambda d: d['type'][0],
          ratio=lambda d: d['attack']/d['hp'])
  .select('name', 'type', 'ratio')
  .sort(lambda d: d['ratio'], reverse=True)
  .head(5)
  .collect())
What this code does line-by-line.
This code will perform the following steps.
- It imports Clumper.
- It fetches a list of json-blobs about pokemon from the internet.
- It removes all the pokemon that have more than 1 type.
- The dictionaries that are left will have their type now as a string instead of a list of strings.
- The dictionaries that are left will also have a property called ratio, which is attack divided by hp.
- All the keys besides name, type and ratio are removed.
- The collection is sorted by ratio, from high to low.
- We grab the top 5 after sorting.
- The results are returned as a list of dictionaries.
This is what we get back:
[{'name': 'Diglett', 'type': 'Ground', 'ratio': 5.5},
{'name': 'DeoxysAttack Forme', 'type': 'Psychic', 'ratio': 3.6},
{'name': 'Krabby', 'type': 'Water', 'ratio': 3.5},
{'name': 'DeoxysNormal Forme', 'type': 'Psychic', 'ratio': 3.0},
{'name': 'BanetteMega Banette', 'type': 'Ghost', 'ratio': 2.578125}]
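To make the verbs concrete, here is a rough plain-Python sketch of the same steps, written without the library. It is not how Clumper is implemented; the function name `top_ratio` is hypothetical, and the Diglett and Krabby records are illustrative values chosen to be consistent with the ratios shown above.

```python
# Sample records in the same shape as the dataset; the last two rows are
# illustrative assumptions, not copied from the real file.
pokemon = [
    {'name': 'Bulbasaur', 'type': ['Grass', 'Poison'], 'hp': 45, 'attack': 49},
    {'name': 'Charmander', 'type': ['Fire'], 'hp': 39, 'attack': 52},
    {'name': 'Diglett', 'type': ['Ground'], 'hp': 10, 'attack': 55},
    {'name': 'Krabby', 'type': ['Water'], 'hp': 30, 'attack': 105},
]


def top_ratio(records, n=5):
    """Hypothetical helper mirroring keep -> mutate -> select -> sort -> head."""
    # keep: only records with exactly one type
    kept = [d for d in records if len(d['type']) == 1]
    # mutate: unwrap the single type and add the attack/hp ratio
    mutated = [
        {**d, 'type': d['type'][0], 'ratio': d['attack'] / d['hp']}
        for d in kept
    ]
    # select: drop every key except name, type and ratio
    selected = [{k: d[k] for k in ('name', 'type', 'ratio')} for d in mutated]
    # sort: highest ratio first
    ordered = sorted(selected, key=lambda d: d['ratio'], reverse=True)
    # head: take the first n records
    return ordered[:n]


result = top_ratio(pokemon)
print(result)
```

Each intermediate list here corresponds to one verb in the chain, which is why the Clumper version reads top to bottom without temporary variables.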
We've got a lovely documentation page that explains how the library works.
- This library has no dependencies besides a modern version of Python.
- The library offers a pattern of verbs that are very expressive.
- You can write code from top to bottom, left to right.
- You can read in many json/yaml/csv files by using a wildcard *.
- MIT License
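As an illustration of what reading many files through a wildcard amounts to, the sketch below combines several JSON files into one list using only the standard library. The helper `read_json_glob` is hypothetical and is not Clumper's API or implementation; the file names are made up for the demo.

```python
import glob
import json
import os
import tempfile


def read_json_glob(pattern):
    """Hypothetical helper: load every JSON file matching `pattern` into one list."""
    records = []
    # Sort the matches so the result order is deterministic.
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            data = json.load(f)
        # A file may hold a single record or a list of records.
        records.extend(data if isinstance(data, list) else [data])
    return records


# Demo: write two small files to a temporary folder, then read them back
# with a single wildcard pattern.
with tempfile.TemporaryDirectory() as tmp:
    for i, rows in enumerate([[{'a': 1}], [{'a': 2}, {'a': 3}]]):
        with open(os.path.join(tmp, f'part{i}.json'), 'w') as f:
            json.dump(rows, f)
    combined = read_json_glob(os.path.join(tmp, '*.json'))

print(combined)
```

The combined list has the same list-of-dictionaries shape as the pokemon example, so the same chain of verbs would apply to it.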
You can install this package via pip.
pip install clumper
It may be safer, however, to install via:
python -m pip install clumper
For details on why, check out this resource.
There are some optional dependencies that you might want to install as well.
python -m pip install clumper[yaml]
This package is new and not a whole lot of users have been able to find all the edge cases yet. We unit test every method but feel free to notify us of edge cases.
Make sure you check out the issue list beforehand in order to prevent double work before you make a pull request. To get started locally, you can clone the repo and quickly get started using the Makefile.
git clone git@github.com:koaning/clumper.git
cd clumper
make install-dev