
Primary language: Python. License: MIT.

Geocoding the Fliegel Index

This repository contains the source, interim, and final data, together with the source code used to geocode the geographic names in two documents: Records of the Moravian Mission Among the Indians of North America: Geographic Names Index and Records of the Moravian Mission Among the Indians of North America: White Persons Index.

To reproduce the results, run the numbered scripts in ascending order, from 01_xml2csv.py to 12_geojsons_from_white_ppl_index.py. The GeoNames data must first be downloaded manually from https://download.geonames.org/export/dump/.
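
Running the scripts in order can be automated with a small driver. The following is only a sketch and not part of the repository: the helper names `numbered_scripts` and `run_pipeline` are hypothetical, and it assumes that every script can be started without arguments from the repository root and that the GeoNames data is already in place.

```python
import re
import subprocess
import sys
from pathlib import Path


def numbered_scripts(directory):
    """Return scripts named like '01_name.py', sorted by their numeric prefix."""
    pattern = re.compile(r"^(\d+)_.+\.py$")
    found = []
    for path in Path(directory).iterdir():
        match = pattern.match(path.name)
        if match and path.is_file():
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


def run_pipeline(directory="."):
    """Run each numbered script in turn and stop at the first failure."""
    for script in numbered_scripts(directory):
        print(f"running {script.name}")
        # Assumption: the scripts take no command-line arguments.
        subprocess.run([sys.executable, script.name], cwd=directory, check=True)


# Example call from the repository root (assumption):
# run_pipeline(".")
```

Because `check=True` raises on a non-zero exit code, a failing step stops the run before later scripts read incomplete interim files.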

The authors used Python 3.11.0 with the most recent versions of the required libraries available on January 24, 2023. The GeoNames data was downloaded on January 17, 2023.

Provenance Graph

flowchart TD
classDef entity fill:#fffedf
classDef entity color:#000000
classDef entity stroke:#a4a4a4
classDef entity stroke-width:1px
classDef activity fill:#cfceff
classDef activity color:#000000
classDef activity stroke:#a4a4a4
classDef activity stroke-width:1px
classDef agent fill:#ffebc3
classDef agent color:#000000
classDef agent stroke:#a4a4a4
classDef agent stroke-width:1px
https://github.com/rue-a/fliegel/tree/master/fliegel_geogNames.xml([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/fliegel_geogNames.xml>fliegel_geogNames.xml</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_geog_names.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/interim/fliegel_geog_names.csv>interim/fliegel_geog_names.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_geog_names.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/01_xml2csv.py
https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/countries.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/countries.csv>countries</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/geonames_admin1CodesASCII.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/geonames_admin1CodesASCII.csv>geonames_admin1</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/geonames_admin2Codes.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/geonames_admin2Codes.csv>geonames_admin2</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/admin_codes/admin_levels.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/admin_codes/admin_levels.csv>admin_codes/admin_levels.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/admin_codes/admin_levels.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/02_makle_admin_levels_list.py
https://github.com/rue-a/fliegel/tree/master/misc/abbreviations.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/misc/abbreviations.csv>misc/abbreviations.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_schematized.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/interim/fliegel_schematized.csv>interim/fliegel_schematized.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_schematized.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/03_headlines_into_schema.py
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_condensed.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/interim/fliegel_condensed.csv>interim/fliegel_condensed.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_condensed.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/04_condense_duplicates.py
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_typed.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/interim/fliegel_typed.csv>interim/fliegel_typed.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_typed.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/05_apply_heuristics.py
https://download.geonames.org/export/dump/48be2cf8-6c74-4f7f-898b-d8b98ebcf66e([<a style=color:inherit href=https://download.geonames.org/export/dump/48be2cf8-6c74-4f7f-898b-d8b98ebcf66e>geonames_dl</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/geonames/geonames_prepared.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/geonames/geonames_prepared.csv>geonames_prepared - not in repo, too large</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/geonames/geonames_prepared.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/06_prepare_geonames.py
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_prepared.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/interim/fliegel_prepared.csv>interim/fliegel_prepared.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_prepared.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/07_prepare_geocode.py
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_geocoded.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/interim/fliegel_geocoded.csv>interim/fliegel_geocoded.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/interim/fliegel_geocoded.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/08_geocode.py
https://github.com/rue-a/fliegel/tree/master/fliegel_gazetteer.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/fliegel_gazetteer.csv>fliegel_gazetteer.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/fliegel_gazetteer.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/09_post_processing.py
https://github.com/rue-a/fliegel/tree/master/Fliegel_WhitePeople_sentiment.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/Fliegel_WhitePeople_sentiment.csv>Fliegel_WhitePeople_sentiment.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/interim/white_ppl_index_geocoded.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/interim/white_ppl_index_geocoded.csv>interim/white_ppl_index_geocoded.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/interim/white_ppl_index_geocoded.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/10_geocode_white_ppl_index.py
https://github.com/rue-a/fliegel/tree/master/fliegel_geoperson.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/fliegel_geoperson.csv>fliegel_geoperson.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/fliegel_geoperson.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/11_post_processing_white_ppl_index.py
https://github.com/rue-a/fliegel/tree/master/fliegel_geofactoid.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/fliegel_geofactoid.csv>fliegel_geofactoid.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/fliegel_geofactoid.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/11_post_processing_white_ppl_index.py
https://github.com/rue-a/fliegel/tree/master/fliegel_geoperson_and_geofactoid.csv([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/fliegel_geoperson_and_geofactoid.csv>fliegel_geoperson_and_geofactoid.csv</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/fliegel_geoperson_and_geofactoid.csv-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/11_post_processing_white_ppl_index.py
https://github.com/rue-a/fliegel/tree/master/geojsons/([<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/geojsons/>geojsons/</a>]):::entity
https://github.com/rue-a/fliegel/tree/master/geojsons/-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#wasGeneratedBy>was generated by</a> -->https://github.com/rue-a/fliegel/tree/master/12_geojsons_from_white_ppl_index.py
https://github.com/rue-a/fliegel/tree/master/01_xml2csv.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/01_xml2csv.py>01_xml2csv.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/01_xml2csv.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/fliegel_geogNames.xml
https://github.com/rue-a/fliegel/tree/master/02_makle_admin_levels_list.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/02_makle_admin_levels_list.py>02_makle_admin_levels_list.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/02_makle_admin_levels_list.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/countries.csv
https://github.com/rue-a/fliegel/tree/master/02_makle_admin_levels_list.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/geonames_admin1CodesASCII.csv
https://github.com/rue-a/fliegel/tree/master/02_makle_admin_levels_list.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/geonames_admin2Codes.csv
https://github.com/rue-a/fliegel/tree/master/03_headlines_into_schema.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/03_headlines_into_schema.py>03_headlines_into_schema.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/03_headlines_into_schema.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/misc/abbreviations.csv
https://github.com/rue-a/fliegel/tree/master/03_headlines_into_schema.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/admin_codes/admin_levels.csv
https://github.com/rue-a/fliegel/tree/master/03_headlines_into_schema.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/countries.csv
https://github.com/rue-a/fliegel/tree/master/03_headlines_into_schema.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/interim/fliegel_geog_names.csv
https://github.com/rue-a/fliegel/tree/master/04_condense_duplicates.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/04_condense_duplicates.py>04_condense_duplicates.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/04_condense_duplicates.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/interim/fliegel_schematized.csv
https://github.com/rue-a/fliegel/tree/master/05_apply_heuristics.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/05_apply_heuristics.py>05_apply_heuristics.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/05_apply_heuristics.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/interim/fliegel_condensed.csv
https://github.com/rue-a/fliegel/tree/master/05_apply_heuristics.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/countries.csv
https://github.com/rue-a/fliegel/tree/master/05_apply_heuristics.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/admin_codes/admin_levels.csv
https://github.com/rue-a/fliegel/tree/master/06_prepare_geonames.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/06_prepare_geonames.py>06_prepare_geonames.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/06_prepare_geonames.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://download.geonames.org/export/dump/48be2cf8-6c74-4f7f-898b-d8b98ebcf66e
https://github.com/rue-a/fliegel/tree/master/07_prepare_geocode.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/07_prepare_geocode.py>07_prepare_geocode.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/07_prepare_geocode.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/interim/fliegel_typed.csv
https://github.com/rue-a/fliegel/tree/master/07_prepare_geocode.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/geonames_admin1CodesASCII.csv
https://github.com/rue-a/fliegel/tree/master/07_prepare_geocode.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/geonames_admin2Codes.csv
https://github.com/rue-a/fliegel/tree/master/07_prepare_geocode.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/admin_codes/downloads/countries.csv
https://github.com/rue-a/fliegel/tree/master/08_geocode.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/08_geocode.py>08_geocode.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/08_geocode.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/interim/fliegel_prepared.csv
https://github.com/rue-a/fliegel/tree/master/08_geocode.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/geonames/geonames_prepared.csv
https://github.com/rue-a/fliegel/tree/master/09_post_processing.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/09_post_processing.py>09_post_processing.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/09_post_processing.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/interim/fliegel_geocoded.csv
https://github.com/rue-a/fliegel/tree/master/09_post_processing.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/interim/fliegel_condensed.csv
https://github.com/rue-a/fliegel/tree/master/10_geocode_white_ppl_index.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/10_geocode_white_ppl_index.py>10_geocode_white_ppl_index.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/10_geocode_white_ppl_index.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/fliegel_gazetteer.csv
https://github.com/rue-a/fliegel/tree/master/10_geocode_white_ppl_index.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/Fliegel_WhitePeople_sentiment.csv
https://github.com/rue-a/fliegel/tree/master/11_post_processing_white_ppl_index.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/11_post_processing_white_ppl_index.py>11_post_processing_white_ppl_index.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/11_post_processing_white_ppl_index.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/interim/white_ppl_index_geocoded.csv
https://github.com/rue-a/fliegel/tree/master/12_geojsons_from_white_ppl_index.py[[<a style=color:inherit href=https://github.com/rue-a/fliegel/tree/master/12_geojsons_from_white_ppl_index.py>12_geojsons_from_white_ppl_index.py</a>]]:::activity
https://github.com/rue-a/fliegel/tree/master/12_geojsons_from_white_ppl_index.py-- <a style=color:inherit href=https://www.w3.org/TR/prov-o/#used>used</a> -->https://github.com/rue-a/fliegel/tree/master/fliegel_geofactoid.csv
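
The "used" and "was generated by" edges of the graph above also determine a valid execution order. The sketch below is an illustration, not code from the repository: it transcribes the edges into two dictionaries, with paths shortened to be relative to the repository root, and uses the standard library's `graphlib.TopologicalSorter` to check that the numerical order of the scripts respects every dependency. The helper names `script_dependencies` and `respects_dependencies` are hypothetical.

```python
from graphlib import TopologicalSorter

DL = "admin_codes/downloads/"

# Script -> files it used (transcribed from the provenance graph).
USED = {
    "01_xml2csv.py": ["fliegel_geogNames.xml"],
    "02_makle_admin_levels_list.py": [
        DL + "countries.csv",
        DL + "geonames_admin1CodesASCII.csv",
        DL + "geonames_admin2Codes.csv",
    ],
    "03_headlines_into_schema.py": [
        "misc/abbreviations.csv",
        "admin_codes/admin_levels.csv",
        DL + "countries.csv",
        "interim/fliegel_geog_names.csv",
    ],
    "04_condense_duplicates.py": ["interim/fliegel_schematized.csv"],
    "05_apply_heuristics.py": [
        "interim/fliegel_condensed.csv",
        DL + "countries.csv",
        "admin_codes/admin_levels.csv",
    ],
    # "geonames_dl" is the graph's label for the manual GeoNames download.
    "06_prepare_geonames.py": ["geonames_dl"],
    "07_prepare_geocode.py": [
        "interim/fliegel_typed.csv",
        DL + "geonames_admin1CodesASCII.csv",
        DL + "geonames_admin2Codes.csv",
        DL + "countries.csv",
    ],
    "08_geocode.py": [
        "interim/fliegel_prepared.csv",
        "geonames/geonames_prepared.csv",
    ],
    "09_post_processing.py": [
        "interim/fliegel_geocoded.csv",
        "interim/fliegel_condensed.csv",
    ],
    "10_geocode_white_ppl_index.py": [
        "fliegel_gazetteer.csv",
        "Fliegel_WhitePeople_sentiment.csv",
    ],
    "11_post_processing_white_ppl_index.py": [
        "interim/white_ppl_index_geocoded.csv",
    ],
    "12_geojsons_from_white_ppl_index.py": ["fliegel_geofactoid.csv"],
}

# File -> script that generated it (transcribed from the provenance graph).
GENERATED_BY = {
    "interim/fliegel_geog_names.csv": "01_xml2csv.py",
    "admin_codes/admin_levels.csv": "02_makle_admin_levels_list.py",
    "interim/fliegel_schematized.csv": "03_headlines_into_schema.py",
    "interim/fliegel_condensed.csv": "04_condense_duplicates.py",
    "interim/fliegel_typed.csv": "05_apply_heuristics.py",
    "geonames/geonames_prepared.csv": "06_prepare_geonames.py",
    "interim/fliegel_prepared.csv": "07_prepare_geocode.py",
    "interim/fliegel_geocoded.csv": "08_geocode.py",
    "fliegel_gazetteer.csv": "09_post_processing.py",
    "interim/white_ppl_index_geocoded.csv": "10_geocode_white_ppl_index.py",
    "fliegel_geoperson.csv": "11_post_processing_white_ppl_index.py",
    "fliegel_geofactoid.csv": "11_post_processing_white_ppl_index.py",
    "fliegel_geoperson_and_geofactoid.csv": "11_post_processing_white_ppl_index.py",
    "geojsons/": "12_geojsons_from_white_ppl_index.py",
}


def script_dependencies(used, generated_by):
    """Map each script to the scripts that generate its input files."""
    return {
        script: {generated_by[f] for f in inputs if f in generated_by}
        for script, inputs in used.items()
    }


def respects_dependencies(order, deps):
    """True if every script appears after all scripts it depends on."""
    position = {script: i for i, script in enumerate(order)}
    return all(position[d] < position[s] for s, ds in deps.items() for d in ds)


deps = script_dependencies(USED, GENERATED_BY)
numeric_order = sorted(USED)  # two-digit prefixes sort numerically
one_valid_order = list(TopologicalSorter(deps).static_order())
print(respects_dependencies(numeric_order, deps))
```

The topological order is not unique. For example, `06_prepare_geonames.py` depends only on the manual download, so it may run at any point before `08_geocode.py`.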