# Benchmark for Cartesian coordinates feature extraction
Description • Install • Usage • Documentation • Contribute
## Description

This repository contains data for training and benchmarking neural networks on various tasks, with the goal of evaluating the feature-extraction capabilities of the benchmarked models.
Extracting features from 2D polygons is not a trivial task. Many models can be applied to it, and many approaches exist (learning from raw coordinates, learning from a raster image, etc.). A benchmark is therefore needed to quantify how well each model and approach performs, and to determine which is best.
## Install

Install `cartesius` by running:

```shell
pip install spwk-cartesius
```
## Usage

In `cartesius`, the training data is randomly generated polygons.

Let's have a look. First, initialize the training set:
```python
from cartesius.data import PolygonDataset

train_data = PolygonDataset(
    x_range=[-50, 50],         # Range for the center of the polygon (x)
    y_range=[-50, 50],         # Range for the center of the polygon (y)
    avg_radius_range=[1, 10],  # Average radius of the generated polygons: here, either 1 or 10
    n_range=[6, 8, 11],        # Number of points in the polygon: here, either 6, 8 or 11
)
```
Then, let's take a look at a generated polygon:
```python
import matplotlib.pyplot as plt

from cartesius.utils import print_polygon

def disp(*polygons):
    plt.clf()
    for p in polygons:
        print_polygon(p)
    plt.gca().set_aspect(1)
    plt.axis("off")
    plt.show()

polygon, labels = train_data[0]

disp(polygon)
print(labels)
```
The benchmark relies on various tasks: predicting the area of a polygon, its perimeter, its centroid, etc. (see the documentation for more details).
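To make these tasks concrete, here is a sketch of how labels like area, perimeter, and centroid can be computed from raw coordinates, using the standard shoelace formula. The helper names below are illustrative only, not part of the `cartesius` API:

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def polygon_perimeter(points):
    """Sum of edge lengths, closing the polygon back to its first point."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
    return total

def polygon_centroid(points):
    """Centroid of a simple polygon (not just the mean of the vertices)."""
    n = len(points)
    a = cx = cy = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        a += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    a /= 2
    return cx / (6 * a), cy / (6 * a)

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(polygon_area(square))       # 4.0
print(polygon_perimeter(square))  # 8.0
print(polygon_centroid(square))   # (1.0, 1.0)
```

These are the kinds of quantities the task heads must predict from the encoder's feature vector alone.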
The goal of the benchmark is to write an encoder: a model that encodes a polygon's features into a vector.

After the feature vector is extracted from the polygon using the encoder, several heads (one per task) predict the labels. If the polygon is well represented by the extracted features, the task heads should have no problem predicting the labels.
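To illustrate the encoder interface (coordinates in, fixed-size feature vector out), here is a deliberately trivial, non-neural baseline. Real encoders in `cartesius` are neural networks; this toy function and its name are assumptions for illustration only:

```python
def toy_encoder(points):
    """Toy encoder: pools per-point statistics into a fixed-size vector,
    regardless of how many points the polygon has."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Max distance from the vertex mean: a crude proxy for the polygon's size
    spread = max(((x - mean_x) ** 2 + (y - mean_y) ** 2) ** 0.5
                 for x, y in points)
    return [mean_x, mean_y, spread, float(n)]

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(toy_encoder(square))  # [1.0, 1.0, 1.4142135623730951, 4.0]
```

A task head would then map this vector to a label (e.g. a linear layer predicting the perimeter). The point of the benchmark is to replace `toy_encoder` with a learned model whose features let all the heads succeed at once.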
The `notebooks/` folder contains a notebook that implements a Transformer model, trains it on `cartesius` data, and evaluates it. You can use this notebook as a template for further research.

Note: at the end of the notebook, a file `submission.csv` is saved; you can use it for the Kaggle competition.
## Contribute

To contribute, install the package locally, create your own branch, add your code/tests/documentation, and open a PR!

When you add a feature, you should add tests for it and ensure the existing tests still pass:

```shell
python -m pytest -W ignore::DeprecationWarning
```
Your code should be linted and properly formatted:

```shell
isort . && yapf -ri . && pylint cartesius && pylint tests --disable=redefined-outer-name
```
The documentation should be kept up-to-date. You can preview it locally by running:

```shell
mkdocs serve
```