docker-compose up -d
docker exec -i pg-github-analysis psql -U github -d github < schema.sql
- install Go
go get github.com/benfred/github-analysis
cd $GOPATH/src/github.com/benfred/github-analysis
go install
- fix int64 errors => TODO: create a fork with the fix
go install
./build.sh
to download in parallel:
- adjust months in source
- ./build.sh
$GOPATH/bin/gha-download-files
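For reference, a sketch of what a parallel download could look like without the Go tool. It assumes GH Archive's hourly file naming (https://data.gharchive.org/YYYY-MM-DD-H.json.gz, hour not zero-padded); gha_urls is a hypothetical helper, not part of github-analysis, and the actual download line is left commented out:

```shell
# Hypothetical helper: print the 24 hourly GH Archive URLs for one day (YYYY-MM-DD).
# The URL pattern is an assumption about GH Archive, not taken from gha-download-files.
gha_urls() {
  hour=0
  while [ "$hour" -le 23 ]; do
    echo "https://data.gharchive.org/$1-$hour.json.gz"
    hour=$((hour + 1))
  done
}

# Show the first two URLs for one day.
gha_urls 2015-01-01 | head -n 2

# To fetch with 4 downloads in parallel (assumes curl and an xargs that supports -P):
# gha_urls 2015-01-01 | xargs -n 1 -P 4 curl -sSfO
```

Looping over the days of a month and calling gha_urls for each would cover the same range that "adjust months in source" controls in the Go tool.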
$GOPATH/bin/gha-parse-githubarchive -path /Users/ueli/repos/gh-geo-activity/data/gh-archive
docker exec -i gh-analysis-db psql -U github -d github -c "DROP TABLE events;"
docker exec -i gh-analysis-db psql -U github -d github < data/schema.sql
./import-gh-archive-tsv.sh
REINDEX TABLE events;
REINDEX TABLE locations;
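The REINDEX lines above are SQL statements, not shell commands. A minimal sketch of running them from the host, assuming the container, user and database names from the docker exec lines above; run_sql and the EXECUTE switch are hypothetical, and by default the helper only prints the command so it can be reviewed first:

```shell
# Hypothetical helper: run one SQL statement inside the database container.
# Prints the command unless EXECUTE=1 is set in the environment.
run_sql() {
  if [ "${EXECUTE:-0}" = "1" ]; then
    docker exec -i gh-analysis-db psql -U github -d github -c "$1"
  else
    echo "docker exec -i gh-analysis-db psql -U github -d github -c \"$1\""
  fi
}

run_sql "REINDEX TABLE events;"
run_sql "REINDEX TABLE locations;"
```

The same helper would work for the other SQL statements in these notes, for example run_sql "REFRESH MATERIALIZED VIEW cities_in_profiles;".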
--script github-user-location-scraper
REINDEX TABLE users;
# go run cmd/gha-location-scraper/main.go
$GOPATH/bin/gha-location-scraper
=> should use a third-party Nominatim provider, see https://wiki.openstreetmap.org/wiki/Nominatim#Alternatives_.2F_Third-party_providers
REFRESH MATERIALIZED VIEW cities_in_profiles;
REFRESH MATERIALIZED VIEW countries_in_profiles;
--script noaa-location-scraper
NOAA cities and countries are stored as JSON files in the data directory. Refresh them by running this project's main class.
docker exec -i gh-analysis-db psql -U github -d github < data/import-countries-json.sql
docker exec -i gh-analysis-db psql -U github -d github < data/import-cities-json.sql
docker exec -i gh-analysis-db psql -U github -d github < data/views.sql
--script event-weather-scraper