Machine learning (ML) techniques present new ways of approaching archaeological research questions, and interest in applying these methods continues to grow. This repository documents the application of ML techniques to archaeological data, aiming to assist those working in the field by:
- providing an overview of the ways ML is being applied in archaeology
- sparking new ideas whilst reducing duplication of work
- encouraging the sharing of code, data, and other resources
- making resources more FAIR (Findable, Accessible, Interoperable, and Reusable)
By doing this, we hope to help practitioners learn about, critically apply, or contribute to conversations about machine learning techniques for archaeology.
This project is open for contributions!
Check out our 🗺️ roadmap to get an overview of the current milestones we're working towards and find out how to participate.
Archived releases of this repository with a citeable DOI will be made at regular intervals.
This project was kicked off as part of Open Seeds cohort 8, and was inspired by these great projects: satellite-image-deep-learning, Rchaeology, open-phytoliths, AncientMetagenomeDir, and open-archaeo.
Machine learning techniques can be described and categorised in a number of different ways, which can make the field confusing to navigate. The data structure of this repository aims to simplify things. It's based on a hierarchy of information which goes from the most general way of describing a technique to the most specific, e.g.:
application area → task → model/algorithm
For contributors, guidance on how to use this hierarchy to structure your contributions can be found in the repo style guide.
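The hierarchy described above can be sketched as a nested mapping that is walked from the most general level to the most specific. This is only an illustrative sketch, not the repository's actual data format: the area label `"example application area"` and the function name `flatten_hierarchy` are hypothetical, while the task and technique names are taken from the tables in this README.

```python
# Hypothetical area label; task and technique names come from the tables in this README.
hierarchy = {
    "example application area": {
        "object detection for multiple classes": ["Faster R-CNN"],
        "segmentation for field systems": ["U-Net"],
    }
}


def flatten_hierarchy(tree):
    """Yield (application area, task, model/algorithm) triples, general to specific."""
    for area, tasks in tree.items():
        for task, techniques in tasks.items():
            for technique in techniques:
                yield area, task, technique


# Render each triple in the same "area -> task -> model" form used above.
paths = [" -> ".join(triple) for triple in flatten_hierarchy(hierarchy)]
for path in paths:
    print(path)
```

Keeping the three levels separate means an entry can be found either by what it is for (the task) or by how it was done (the model or algorithm).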
task | authors | year | data type | technique | paper | code | data |
---|---|---|---|---|---|---|---|
regression for stable isotope analysis | Bataille et al | 2020 | strontium | RF | paper | code | data |
regression for stable isotope analysis | Funck et al | 2020 | strontium | RF | paper | nan | data |
regression for stable isotope analysis | Bataille et al | 2018 | strontium | RF | paper | code | nan |
classification for elemental analysis | Charalambous et al | 2016 | ceramics ED-XRF | kNN, C4.5, LVQ | paper | nan | nan |
task | authors | year | data type | technique | paper | code | data |
---|---|---|---|---|---|---|---|
masked language modelling for archaeological text | Brandsen | 2023 | english language | ArchaeoBERT | paper | model | nan |
named entity recognition for archaeological text | Brandsen | 2023 | english language | ArchaeoBERT-NER | paper | model | nan |
masked language modelling for archaeological text | Brandsen | 2023 | dutch language | ArcheoBERTje | paper | model | nan |
named entity recognition for archaeological text | Brandsen | 2023 | dutch language | ArcheoBERTje-NER | paper | model | data |
masked language modelling for archaeological text | Brandsen | 2023 | german language | bert-base-german-cased-archaeo | paper | model | nan |
named entity recognition for archaeological text | Brandsen | 2023 | german language | bert-base-german-cased-archaeo-NER | paper | model | nan |
dataset for named entity recognition | Brandsen et al | 2020 | dutch language | CoNLL | paper | nan | data |
task | authors | year | data type | technique | paper | code | data |
---|---|---|---|---|---|---|---|
dataset for maya archaeology | Kokalj et al | 2023 | lidar visualisation, lidar canopy height, SAR, optical satellite | object recognition, object detection, semantic segmentation | paper | nan | data |
segmentation for field systems | Küçükdemirci et al | 2022 | lidar DTMs | U-Net | paper | nan | nan |
image classification for hollow roads | Verschoof-van der Vaart and Landauer | 2021 | lidar visualisation | ResNet-34 CNN | paper | nan | nan |
object detection for mining pits | Gallwey et al | 2019 | lidar DSM | U-Net | paper | model | nan |
object detection for multiple classes | Verschoof-van der Vaart and Lambers | 2019 | lidar visualisation | Faster R-CNN | paper | nan | nan |
task | authors | year | data type | technique | paper | code | data |
---|---|---|---|---|---|---|---|
regression for roman sites | Castiello and Tonini | 2021 | soil, topography | RF | paper | nan | nan |
regression for formative period sites | Yaworsky et al | 2020 | environmental, topography | MaxEnt, RF | paper | code | data |
classification for habitat suitability | Jones et al | 2019 | climate, topography | RF | paper | nan | nan |
classification for soil geochemistry | Oonk and Spijker | 2015 | soil geochemistry | kNN, SVM, NN | paper | nan | nan |
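Because every table shares the same columns, the entries can be filtered programmatically, for example to find studies that share both code and data. The sketch below is a minimal illustration that parses a few rows copied from the first table. It assumes that `nan` in the `code` or `data` column means no resource is linked. The helper names `split_row`, `parse_table` and `has_resource` are hypothetical and not part of this repository.

```python
HEADER = "task | authors | year | data type | technique | paper | code | data |"
ROWS = [
    "regression for stable isotope analysis | Bataille et al | 2020 | strontium | RF | paper | code | data |",
    "regression for stable isotope analysis | Funck et al | 2020 | strontium | RF | paper | nan | data |",
    "classification for elemental analysis | Charalambous et al | 2016 | ceramics ED-XRF | kNN, C4.5, LVQ | paper | nan | nan |",
]


def split_row(line):
    """Split one pipe-delimited table line into stripped cell values."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_table(header, rows):
    """Turn a header line and row lines into a list of dicts keyed by column name."""
    columns = split_row(header)
    return [dict(zip(columns, split_row(row))) for row in rows]


def has_resource(entry, column):
    """Assumption: a cell reading 'nan' means no resource is linked for that column."""
    return entry[column].lower() != "nan"


entries = parse_table(HEADER, ROWS)
fully_shared = [
    entry["authors"]
    for entry in entries
    if has_resource(entry, "code") and has_resource(entry, "data")
]
print(fully_shared)
```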